Helm

Contents
Workbook 1: Basic Algebra
1.1 Mathematical Notation and Symbols
1.2 Indices
1.3 Simplification and Factorisation
1.4 Arithmetic of Algebraic Fractions
1.5 Formulae and Transposition
Learning outcomes
In this Workbook you will learn about some of the basic building blocks of mathematics.
As well as becoming familiar with the notation and symbols used in mathematics you
will learn the fundamental rules of algebra upon which much of mathematics is based.
In particular you will learn about indices and how to simplify algebraic expressions,
using a variety of approaches: collecting like terms, removing brackets and factorisation.
Finally, you will learn how to transpose formulae.
1.1 Mathematical Notation and Symbols
Introduction
This introductory Section reminds you of important notations and conventions used throughout
engineering mathematics. We discuss the arithmetic of numbers, the plus or minus sign, ±, the
modulus notation | |, and the factorial notation !. We examine the order in which arithmetical
operations are carried out. Symbols are introduced to represent physical quantities in formulae and
equations. The topic of algebra deals with the manipulation of these symbols. The Section closes
with an introduction to algebraic conventions. In what follows a working knowledge of the addition,
subtraction, multiplication and division of numerical fractions is essential.
Prerequisites
Before starting this Section you should . . .
• be able to add, subtract, multiply and divide fractions
• be able to express fractions in equivalent forms

Learning Outcomes
On completion you should be able to . . .
• recognise and use a wide range of common mathematical symbols and notations

HELM (2006): Workbook 1: Basic Algebra
1. Numbers, operations and common notations

A knowledge of the properties of numbers is fundamental to the study of engineering mathematics.
Students who possess this knowledge will be well-prepared for the study of algebra. Much of the
terminology used throughout the rest of this Section can be most easily illustrated by applying it to
numbers. For this reason we strongly recommend that you work through this Section even if the
material is familiar.
The number line

A useful way of picturing numbers is to use a number line. Figure 1 shows part of this line. Positive
numbers are represented on the right-hand side of this line, negative numbers on the left-hand side.
Any whole or fractional number can be represented by a point on this line which is also called the
real number line, or simply the real line. Study Figure 1 and note that a minus sign is always
used to indicate that a number is negative, whereas the use of a plus sign is optional when describing
positive numbers.
The line extends indefinitely both to the left and to the right. Mathematically we say that the line
extends from minus infinity to plus infinity. The symbol for infinity is ∞.
[Number line marked from −5 to 8, with the points $-\frac{3}{2}$, 2.5 and π labelled]
Figure 1: Numbers can be represented on a number line

The symbol > means ‘greater than’; for example 6 > 4. Given any number, all numbers to the right
of it on the number line are greater than the given number. The symbol < means ‘less than’; for
example −3 < 19. We also use the symbols ≥ meaning ‘greater than or equal to’ and ≤ meaning
‘less than or equal to’. For example, 7 ≤ 10 and 7 ≤ 7 are both true statements.
Sometimes we are interested in only a small section, or interval, of the real line. We write [1, 3] to
denote all the real numbers between 1 and 3 inclusive, that is 1 and 3 are included in the interval.
Therefore the interval [1, 3] consists of all real numbers x, such that 1 ≤ x ≤ 3. The square brackets,
[, ] mean that the end-points are included in the interval and such an interval is said to be closed.
We write (1, 3) to represent all real numbers between 1 and 3, but not including the end-points. Thus
(1, 3) means all real numbers x such that 1 < x < 3, and such an interval is said to be open. An
interval may be closed at one end and open at the other. For example, (1, 3] consists of all numbers
x such that 1 < x ≤ 3. Intervals can be represented on a number line. A closed end-point is
denoted by •; an open end-point is denoted by ◦. The intervals (−6, −4), [−1, 2] and (3, 4] are
illustrated in Figure 2.
[Number line marked from −6 to 7 showing the three intervals, with ◦ at each open end-point and • at each closed end-point]
Figure 2: The intervals (−6, −4), [−1, 2] and (3, 4] depicted on the real line
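The interval notation above can be checked numerically. The following is a minimal Python sketch, not part of the Workbook; the function name in_interval and its parameters closed_low and closed_high are illustrative assumptions.

```python
def in_interval(x, low, high, closed_low=True, closed_high=True):
    """Return True if x lies in the interval from low to high.

    closed_low / closed_high say whether each end-point is included,
    i.e. whether a square bracket is written at that end.
    """
    above = x >= low if closed_low else x > low
    below = x <= high if closed_high else x < high
    return above and below


# [1, 3] is closed: both end-points are included
print(in_interval(3, 1, 3))
# (1, 3) is open: neither end-point is included
print(in_interval(3, 1, 3, closed_low=False, closed_high=False))
# (1, 3] is open at 1 and closed at 3
print(in_interval(1, 1, 3, closed_low=False))
print(in_interval(3, 1, 3, closed_low=False))
```

The four calls print True, False, False and True respectively, matching the definitions of closed, open and half-open intervals given above.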
2. Calculation with numbers
To perform calculations with numbers we use the operations, +, −, × and ÷.
Addition (+)
We say that 4 + 5 is the sum of 4 and 5. Note that 4 + 5 is equal to 5 + 4 so that the order in which
we write down the numbers does not matter when we are adding them. Because the order does not
matter, addition is said to be commutative. This first property is called commutativity.
When more than two numbers are to be added, as in 4 + 8 + 9, it makes no difference whether we
add the 4 and 8 first to get 12 + 9, or whether we add the 8 and 9 first to get 4 + 17. Whichever
way we work we will obtain the same result, 21. Addition is said to be associative. This second
property is called associativity.
Subtraction (−)
We say that 8 − 3 is the difference of 8 and 3. Note that 8 − 3 is not the same as 3 − 8 and
so the order in which we write down the numbers is important when we are subtracting them i.e.
subtraction is not commutative. Subtracting a negative number is equivalent to adding a positive
number, thus 7 − (−3) = 7 + 3 = 10.
The plus or minus sign (±)

In engineering calculations we often use the notation plus or minus, ±. For example, we write
12 ± 8 as shorthand for the two numbers 12 + 8 and 12 − 8, that is 20 and 4. If we say a number
lies in the range 12 ± 8 we mean that the number can lie between 4 and 20 inclusive.
Multiplication (×)
The instruction to multiply, or obtain the product of, the numbers 6 and 7 is written 6×7. Sometimes
the multiplication sign is missed out altogether and we write (6)(7).
Note that (6)(7) is the same as (7)(6) so multiplication of numbers is commutative. If we are
multiplying three numbers, as in 2 × 3 × 4, we obtain the same result whether we multiply the 2 and
3 first to obtain 6 × 4, or whether we multiply the 3 and 4 first to obtain 2 × 12. Either way the
result is 24. Multiplication of numbers is associative.
Recall that when multiplying positive and negative numbers the sign of the result is given by the
rules given in Key Point 1.
Key Point 1
Multiplication
When multiplying numbers:
positive × positive = positive negative × negative = positive
positive × negative = negative negative × positive = negative
For example, (−4) × 5 = −20, and (−3) × (−6) = 18.

When dealing with fractions we sometimes use the word ‘of’ as in ‘find $\frac{1}{2}$ of 36’. In this context ‘of’
is equivalent to multiply, that is
$\frac{1}{2}$ of 36 is equivalent to $\frac{1}{2} \times 36 = 18$
Division (÷) or (/)

The quantity 8 ÷ 4 means 8 divided by 4. This is also written as 8/4 or $\frac{8}{4}$ and is known as the
quotient of 8 and 4. In the fraction $\frac{8}{4}$ the top line is called the numerator and the bottom line is
called the denominator. Note that 8/4 is not the same as 4/8 and so the order in which we write
down the numbers is important. Division is not commutative.
When dividing positive and negative numbers, recall the following rules in Key Point 2 for determining
the sign of the result:
Key Point 2
Division
When dividing numbers:
positive ÷ positive = positive positive ÷ negative = negative
negative ÷ positive = negative negative ÷ negative = positive
The reciprocal of a number

The reciprocal of a number is found by inverting it. If the number $\frac{2}{3}$ is inverted we get $\frac{3}{2}$. So the
reciprocal of $\frac{2}{3}$ is $\frac{3}{2}$. Because we can write 4 as $\frac{4}{1}$, the reciprocal of 4 is $\frac{1}{4}$.
Task
State the reciprocal of (a) $\frac{6}{11}$, (b) $\frac{1}{5}$, (c) −7.
Your solution
(a) (b) (c)
Answer
(a) $\frac{11}{6}$ (b) $\frac{5}{1}$ (c) $-\frac{1}{7}$
The modulus notation (| | )

We shall make frequent use of the modulus notation | |. The modulus of a number is the size of
that number regardless of its sign. For example |4| is equal to 4, and | − 3| is equal to 3. The
modulus of a number is thus never negative.
Task
State the modulus of (a) −17, (b) $\frac{1}{5}$, (c) $-\frac{1}{7}$, (d) 0.
Your solution
(a) (b) (c) (d)
Answer
The modulus of a number is found by ignoring its sign. (a) 17 (b) $\frac{1}{5}$ (c) $\frac{1}{7}$ (d) 0
The factorial symbol (!)

Another commonly used notation is the factorial, denoted by the exclamation mark ‘!’. The number
5!, read ‘five factorial’, or ‘factorial five’, is a shorthand notation for the expression 5 × 4 × 3 × 2 × 1,
and the number 7! is shorthand for 7 × 6 × 5 × 4 × 3 × 2 × 1. Note that 1! equals 1, and by
convention 0! is defined as 1 also. Your scientific calculator is probably able to evaluate factorials of
small integers. It is important to note that factorials only apply to positive integers and, by the convention above, to 0.
Key Point 3
Factorial notation
If n is a positive integer then n! = n × (n − 1) × (n − 2) . . . 5 × 4 × 3 × 2 × 1
Example 1
(a) Evaluate 4! and 5! without using a calculator.
(b) Use your calculator to find 10!.
Solution
(a) 4! = 4 × 3 × 2 × 1 = 24. Similarly, 5! = 5 × 4 × 3 × 2 × 1 = 120. Note that

5! = 5 × 4! = 5 × 24 = 120.
(b) 10! = 3,628,800.
Task
Find the factorial button on your calculator and hence compute 11!.
(The button may be marked ! or n!). Check that 11! = 11 × 10!
Your solution
11! = 11 × 10! =
Answer
11! = 39916800
11 × 10! = 11 × 3628800 = 39916800
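The definition in Key Point 3 and the check 11! = 11 × 10! can also be reproduced in a few lines of Python. This is only an illustrative sketch: the function name factorial is our own, while math.factorial is the standard library routine used as a cross-check.

```python
import math


def factorial(n):
    """Return n! = n × (n − 1) × ... × 2 × 1, with 0! taken to be 1."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


print(factorial(4), factorial(5))            # 24 120
print(factorial(10))                         # 3628800
print(factorial(11) == 11 * factorial(10))   # True
print(math.factorial(11))                    # 39916800
```

Starting the running product at 1 means that factorial(0) and factorial(1) both return 1, in line with the convention stated above.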
3. Rounding to n decimal places

In general, a calculator or computer is unable to store every decimal place of a real number. Real
numbers are rounded. To round a number to n decimal places we look at the (n + 1)th digit in the
decimal expansion of the number.
• If the (n + 1)th digit is 0, 1, 2, 3 or 4 then we round down: that is, we simply chop to n
places. (In other words we neglect the (n + 1)th digit and any digits to its right.)
• If the (n + 1)th digit is 5, 6, 7, 8 or 9 then we round up: we add 1 to the nth decimal place
and then chop to n places.
For example
$\frac{1}{3}$ = 0.3333 rounded to 4 decimal places
$\frac{8}{3}$ = 2.66667 rounded to 5 decimal places
π = 3.142 rounded to 3 decimal places
2.3403 = 2.340 rounded to 3 decimal places
Sometimes the phrase ‘decimal places’ is abbreviated to ‘d.p.’ or ‘dec.pl.’.
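The rounding rule can be sketched in Python as follows. This is a hedged illustration rather than the Workbook's own method: the function name round_dp is an assumption, and the decimal module is used because Python's built-in round() works on binary floating-point values and rounds exact halves to the nearest even digit, which is not the rule described above. Numbers are passed as strings so that their decimal digits are taken exactly as written.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_dp(value, n):
    """Round the decimal string `value` to n decimal places.

    A following digit of 5 or more rounds the nth place up in size;
    a following digit of 4 or less is simply chopped off.
    """
    step = Decimal(1).scaleb(-n)   # for n = 3 this is 0.001
    return Decimal(value).quantize(step, rounding=ROUND_HALF_UP)


print(round_dp("2.3403", 3))    # 2.340
print(round_dp("3.14159", 3))   # 3.142
print(round_dp("2.666666", 5))  # 2.66667
```

Note that trailing zeros are kept, so 2.3403 is shown as 2.340, making the number of decimal places visible in the result.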
Example 2
Write down each of these numbers rounded to 4 decimal places:
0.12345, −0.44444, 0.5555555, 0.000127351, 0.000005, 123.456789
Solution
0.1235, −0.4444, 0.5556, 0.0001, 0.0000, 123.4568
Task
Write down each of these numbers, rounded to 3 decimal places:
0.87264, 0.1543, 0.889412, −0.5555, 45.6789, 6.0003
Your solution
Answer
0.873, 0.154, 0.889, −0.556, 45.679, 6.000
4. Rounding to n significant figures

This process is similar to rounding to decimal places but there are some subtle differences.
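To round a number to n significant figures we look at the (n + 1)th digit in the decimal expansion
of the number, counting from the first non-zero digit.
• If the (n + 1)th digit is 0, 1, 2, 3 or 4 then we round down: that is, we simply chop to n
places, inserting zeros if necessary before the decimal point. (In other words we neglect the
(n + 1)th digit and any digits to its right.)
• If the (n + 1)th digit is 5, 6, 7, 8 or 9 then we round up: we add 1 to the nth digit
and then chop to n places, inserting zeros if necessary before the decimal point.
Examples are given below.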
$\frac{1}{3}$ = 0.3333 rounded to 4 significant figures
$\frac{8}{3}$ = 2.66667 rounded to 6 significant figures
π = 3.142 rounded to 4 significant figures
2136 = 2000 rounded to 1 significant figure
36.78 = 37 rounded to 2 significant figures
6.2399 = 6.240 rounded to 4 significant figures

Sometimes the phrase “significant figures” is abbreviated as “s.f.” or “sig.fig.”
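Rounding to significant figures can be sketched in the same way. The function name round_sf is an illustrative assumption, and the approach (locating the leading non-zero digit with Decimal.adjusted() and then rounding at the appropriate power of ten) is one possible implementation, not the Workbook's own.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_sf(value, n):
    """Round the decimal string `value` to n significant figures."""
    d = Decimal(value)
    if d == 0:
        return d
    # d.adjusted() is the power of ten of the leading non-zero digit,
    # e.g. 3 for 2136 and -4 for 0.000127351
    step = Decimal(1).scaleb(d.adjusted() - n + 1)
    return d.quantize(step, rounding=ROUND_HALF_UP)


# format(..., "f") writes the result without an exponent,
# inserting zeros before the decimal point where necessary
print(format(round_sf("2136", 1), "f"))     # 2000
print(format(round_sf("36.78", 2), "f"))    # 37
print(format(round_sf("6.2399", 4), "f"))   # 6.240
```

The three calls reproduce the last three examples listed above.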
Example 3
Write down each of these numbers, rounding them to 4 significant figures:
0.12345, −0.44444, 0.5555555, 0.000127351, 25679, 123.456789, 3456543
Solution
0.1235, −0.4444, 0.5556, 0.0001274, 25680, 123.5, 3457000
Task
Write down each of these numbers rounded to 3 significant figures:
0.87264, 0.1543, 0.889412, −0.5555, 2.346, 12343.21, 4245321
Your solution
Answer
0.873, 0.154, 0.889, −0.556, 2.35, 12300, 4250000
Arithmetical expressions
A quantity made up of numbers and one or more of the operations +, −, × and / is called an
arithmetical expression. Frequent use is also made of brackets, or parentheses, ( ), to separate
different parts of an expression. When evaluating an expression it is conventional to evaluate
quantities within brackets first. Often a division line implies bracketed quantities. For example in the
expression $\frac{3+4}{7+9}$ there is implied bracketing of the numerator and denominator i.e. the expression
is $\frac{(3+4)}{(7+9)}$ and the bracketed quantities would be evaluated first resulting in the number $\frac{7}{16}$.
The BODMAS rule
When several arithmetical operations are combined in one expression we need to know in which order
to perform the calculation. This order is found by applying rules known as precedence rules which
specify which operation has priority. The convention is that bracketed expressions are evaluated first.
Any multiplications and divisions are then performed, and finally any additions and subtractions. For
short, this is called the BODMAS rule.
Key Point 4
The BODMAS rule
Brackets, ( ): First priority: evaluate terms within brackets
Of, ×; Division, ÷; Multiplication, ×: Second priority: carry out all multiplications and divisions
Addition, +; Subtraction, −: Third priority: carry out all additions and subtractions
If an expression contains only multiplication and division we evaluate by working from left to right.
Similarly, if an expression contains only addition and subtraction we evaluate by working from left to
right. In Section 1.2 we will meet another operation called exponentiation, or raising to a power. We
shall see that, in the simplest case, this operation is repeated multiplication and it is usually carried
out once any brackets have been evaluated.
Example 4
Evaluate 4 − 3 + 7 × 2
Solution
The BODMAS rule tells us to perform the multiplication before the addition and subtraction. Thus
4 − 3 + 7 × 2 = 4 − 3 + 14
Finally, because the resulting expression contains just addition and subtraction we work from the
left to the right, that is
4 − 3 + 14 = 1 + 14 = 15
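Most programming languages apply the same precedence rules. As a small illustration (the variable names are our own, and Python is chosen only as an example), Example 4 can be checked as follows:

```python
# Example 4: the multiplication 7 * 2 is carried out first,
# then the addition and subtraction are worked from left to right.
example_4 = 4 - 3 + 7 * 2
step_by_step = (4 - 3) + (7 * 2)

# Brackets change the order: here the addition and subtraction come first.
with_brackets = (4 - 3 + 7) * 2

print(example_4)       # 15
print(step_by_step)    # 15
print(with_brackets)   # 16
```

The last line shows that inserting brackets gives a different result, which is why the order of operations has to be agreed in advance.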
Task
Evaluate 4 + 3 × 7 using the BODMAS rule to decide which operation to carry
out first.
Your solution
4+3×7=
Answer
25 (Multiplication has a higher priority than addition.)
Task
Evaluate (4 − 2) × 5.
Your solution
(4 − 2) × 5 =
Answer
2 × 5 = 10. (The bracketed quantity must be evaluated first.)
Example 5
Evaluate 8 ÷ 2 − (4 − 5)
Solution
The bracketed expression is evaluated first:

8 ÷ 2 − (4 − 5) = 8 ÷ 2 − (−1)
Division has higher priority than subtraction and so this is carried out next giving
8 ÷ 2 − (−1) = 4 − (−1)
Subtracting a negative number is equivalent to adding a positive number. Thus
4 − (−1) = 4 + 1 = 5
Task
Evaluate $\frac{9-4}{25-5}$.
(Remember that the dividing line implies that brackets are present around the
numerator and around the denominator.)
Your solution
Answer
$\frac{9-4}{25-5} = \frac{(9-4)}{(25-5)} = \frac{5}{20} = \frac{1}{4}$
Exercises
1. Draw a number line and on it label points to represent −5, −3.8, −π, $-\frac{5}{6}$, $-\frac{1}{2}$, 0, $\sqrt{2}$, π, 5.
2. Simplify without using a calculator (a) −5 × −3, (b) −5 × 3, (c) 5 × −3, (d) 15 × −4,
(e) −14 × −3, (f) $\frac{18}{-3}$, (g) $\frac{-21}{7}$, (h) $\frac{-36}{-12}$.
3. Evaluate (a) 3 + 2 × 6, (b) 3 − 2 − 6, (c) 3 + 2 − 6, (d) 15 − 3 × 2, (e) 15 × 3 − 2,
(f) (15 ÷ 3) + 2, (g) 15 ÷ 3 + 2, (h) 7 + 4 − 11 − 2, (i) 7 × 4 + 11 × 2, (j) −(−9),
(k) 7 − (−9), (l) −19 − (−7), (m) −19 + (−7).
4. Evaluate (a) |−18|, (b) |4|, (c) |−0.001|, (d) |0.25|, (e) |0.01 − 0.001|, (f) 2!,
(g) 8! − 3!, (h) $\frac{9!}{8!}$.
5. Evaluate (a) 8 + (−9), (b) 18 − (−8), (c) −18 + (−2), (d) −11 − (−3)
6. State the reciprocal of (a) 8, (b) $\frac{9}{13}$.
7. Evaluate (a) 7 ± 3, (b) 16 ± 7, (c) $-15 \pm \frac{1}{2}$, (d) −16 ± 0.05, (e) |−8| ± 13,
(f) |−2| ± 8.
8. Which of the following statements are true ?

(a) −8 ≤ 8, (b) −8 ≤ −8, (c) −8 ≤ |8|, (d) | − 8| < 8, (e) | − 8| ≤ −8,
(f) 9! ≤ 8!, (g) 8! ≤ 10!.
9. Explain what is meant by saying that addition of numbers is (a) associative, (b) commutative.
Give examples.
10. Explain what is meant by saying that multiplication of numbers is (a) associative, (b) commu-
tative. Give examples.
Answers
1. [Number line marked from −5 to 8, with the points −5, −3.8, −π, $-\frac{5}{6}$, $-\frac{1}{2}$, 0, $\sqrt{2}$, π and 5 labelled]
2. (a) 15, (b) −15, (c) −15, (d) −60, (e) 42, (f) −6, (g) −3, (h) 3.
3. (a) 15, (b) −5, (c) −1, (d) 9, (e) 43, (f) 7, (g) 7, (h) −2, (i) 50, (j) 9, (k) 16, (l) −12,
(m) −26
4. (a) 18, (b) 4, (c) 0.001, (d) 0.25, (e) 0.009, (f) 2, (g) 40314, (h) 9.
5. (a) −1, (b) 26, (c) −20, (d) −8
6. (a) $\frac{1}{8}$, (b) $\frac{13}{9}$.
7. (a) 4, 10, (b) 9, 23, (c) $-15\frac{1}{2}$, $-14\frac{1}{2}$, (d) −16.05, −15.95, (e) −5, 21, (f) −6, 10
8. (a), (b), (c), (g) are true.
9. For example (a) (1 + 2) + 3 = 1 + (2 + 3), and both are equal to 6. (b) 8 + 2 = 2 + 8.
10. For example (a) (2 × 6) × 8 = 2 × (6 × 8), and both are equal to 96. (b) 7 × 5 = 5 × 7.
5. Using symbols
Mathematics provides a very rich language for the communication of engineering concepts and ideas,
and a set of powerful tools for the solution of engineering problems. In order to use this language it
is essential to appreciate how symbols are used to represent physical quantities, and to understand
the rules and conventions which have been developed to manipulate these symbols.
The choice of which letters or other symbols to use is largely up to the user although it is helpful to
choose letters which have some meaning in any particular context. For instance if we wish to choose
a symbol to represent the temperature in a room we might use the capital letter T . Similarly the
lower case letter t is often used to represent time. Because both time and temperature can vary we
refer to T and t as variables.
In a particular calculation some symbols represent fixed and unchanging quantities and we call these
constants. Often we reserve the letters x, y and z to stand for variables and use the earlier letters
of the alphabet, such as a, b and c, to represent constants. The Greek letter pi, written π, is used to
represent the constant 3.14159.... which appears for example in the formula for the area of a circle.
Other Greek letters are frequently used as symbols, and for reference, the Greek alphabet is given in
Table 1.
Table 1: The Greek alphabet
A α alpha      I ι iota       P ρ rho
B β beta       K κ kappa      Σ σ sigma
Γ γ gamma      Λ λ lambda     T τ tau
Δ δ delta      M μ mu         Υ υ upsilon
E ε epsilon    N ν nu         Φ φ phi
Z ζ zeta       Ξ ξ xi         X χ chi
H η eta        O o omicron    Ψ ψ psi
Θ θ theta      Π π pi         Ω ω omega
Mathematics is a very precise language and care must be taken to note the exact position of any
symbol in relation to any other. If x and y are two symbols, then the quantities $xy$, $x^y$, $x_y$ can all
mean different things. In the expression $x^y$ you will note that the symbol y is placed to the right of
and slightly higher than the symbol x. In this context y is called a superscript. In the expression
$x_y$, y is placed lower than and to the right of x, and is called a subscript.
Example The temperature in a room is measured at four points as shown in Figure 3.
[Diagram of a room with the four measurement points labelled $T_1$, $T_2$, $T_3$ and $T_4$]
Figure 3: The temperature is measured at four points

Rather than use different letters to represent the four measurements we can use one symbol, T,
together with four subscripts to represent the temperature. Thus the four measurements are denoted
by $T_1$, $T_2$, $T_3$ and $T_4$.
6. Combining numbers together using +, −, ×, ÷

Addition (+)
If the letters x and y represent two numbers, then their sum is written as x + y. Note that x + y is
the same as y + x just as 4 + 7 is equal to 7 + 4.
Subtraction (−)
Subtracting y from x yields x − y. Note that x − y is not the same as y − x just as 11 − 7 is not
the same as 7 − 11; however, in both cases the difference is said to be 4.
Multiplication (×)
The instruction to multiply x and y together is written as x × y. Usually the multiplication sign is
omitted and we write simply xy. An alternative notation is to use a dot to represent multiplication
and so we could write x.y. The quantity xy is called the product of x and y. As discussed earlier
multiplication is both commutative and associative:
i.e. x × y = y × x and (x × y) × z = x × (y × z)
This last expression can thus be written x × y × z without ambiguity. When mixing numbers and
symbols it is usual to write the numbers first. Thus 3 × x × y × 4 = 3 × 4 × x × y = 12xy.
Example 6
Simplify (a) 9(2y), (b) −3(5z), (c) 4(2a), (d) 2x × (2y).
Solution
(a) Note that 9(2y) means 9×(2×y). Because of the associativity of multiplication 9×(2×y)
means the same as (9 × 2) × y, that is 18y.
(b) −3(5z) means −3 × (5 × z). Because of associativity this is the same as (−3 × 5) × z,
that is −15z.
(c) 4(2a) means 4 × (2 × a). We can write this as (4 × 2) × a, that is 8a.
(d) Because of the associativity of multiplication, the brackets are not needed and we can
write 2x × (2y) = 2x × 2y which equals
2 × x × 2 × y = 2 × 2 × x × y = 4xy.
Example 7
What is the distinction between 9(−2y) and 9 − 2y ?
Solution
The expression 9(−2y) means 9 × (−2y). Because of associativity of multiplication we can write
this as 9 × (−2) × y which equals −18y.
On the other hand 9 − 2y means subtract 2y from 9. This cannot be simplified.
Division (÷)
The quantity x ÷ y means x divided by y. This is also written as x/y or $\frac{x}{y}$ and is known as the
quotient of x and y. In the expression $\frac{x}{y}$ the symbol x is called the numerator and the symbol y
is called the denominator. Note that x/y is not the same as y/x. Division by 1 leaves a quantity
unchanged so that $\frac{x}{1}$ is simply x.
Algebraic expressions
A quantity made up of symbols and the operations +, −, × and / is called an algebraic expression.
One algebraic expression divided by another is called an algebraic fraction. Thus
$\frac{x+7}{x-3}$ and $\frac{3x-y}{2x+z}$
are algebraic fractions. The reciprocal of an algebraic fraction is found by inverting it. Thus the
reciprocal of $\frac{2}{x}$ is $\frac{x}{2}$. The reciprocal of $\frac{x+7}{x-3}$ is $\frac{x-3}{x+7}$.
Example 8
State the reciprocal of each of the following expressions:
(a) $\frac{y}{z}$, (b) $\frac{x+z}{a-b}$, (c) 3y, (d) $\frac{1}{a+2b}$, (e) $-\frac{1}{y}$
Solution
(a) $\frac{z}{y}$.
(b) $\frac{a-b}{x+z}$.
(c) 3y is the same as $\frac{3y}{1}$ so the reciprocal of 3y is $\frac{1}{3y}$.
(d) The reciprocal of $\frac{1}{a+2b}$ is $\frac{a+2b}{1}$ or simply a + 2b.
(e) The reciprocal of $-\frac{1}{y}$ is $-\frac{y}{1}$ or simply −y.
Finding the reciprocal of complicated expressions can cause confusion. Study the following Example
carefully.
Example 9
Obtain the reciprocal of:
(a) p + q, (b) $\frac{1}{R_1} + \frac{1}{R_2}$
Solution
(a) Because p + q can be thought of as $\frac{p+q}{1}$ its reciprocal is $\frac{1}{p+q}$. Note in particular
that the reciprocal of p + q is not $\frac{1}{p} + \frac{1}{q}$. This distinction is important and a common
cause of error. To avoid an error carefully identify the numerator and denominator in the
original expression before inverting.
(b) The reciprocal of $\frac{1}{R_1} + \frac{1}{R_2}$ is $\frac{1}{\frac{1}{R_1} + \frac{1}{R_2}}$. To simplify this further requires knowledge of
the addition of algebraic fractions which is dealt with in Section 1.4. It is important to
note that the reciprocal of $\frac{1}{R_1} + \frac{1}{R_2}$ is not $R_1 + R_2$.
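A quick numerical check makes the two warnings in Example 9 concrete. The sketch below uses Python's fractions module for exact arithmetic; the sample values 2 and 3 are arbitrary choices made only for illustration.

```python
from fractions import Fraction

# Arbitrary sample values, chosen only for illustration
p, q = Fraction(2), Fraction(3)
R1, R2 = Fraction(2), Fraction(3)

# (a) the reciprocal of p + q is not 1/p + 1/q
reciprocal_of_sum = 1 / (p + q)
sum_of_reciprocals = 1 / p + 1 / q
print(reciprocal_of_sum, sum_of_reciprocals)   # 1/5 5/6

# (b) the reciprocal of 1/R1 + 1/R2 is not R1 + R2
reciprocal_b = 1 / (1 / R1 + 1 / R2)
print(reciprocal_b, R1 + R2)                   # 6/5 5
```

A single pair of values for which the two expressions differ is enough to show that they are not equal in general.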
The equals sign (=)

The equals sign, =, is used in several different ways.
Firstly, an equals sign is used in equations. The left-hand side and right-hand side of an equation
are equal only when the variable involved takes specific values known as solutions of the equation.
For example, in the equation x − 8 = 0, the variable is x. The left-hand side and right-hand side are
only equal when x has the value 8. If x has any other value the two sides are not equal.
Secondly, the equals sign is used in formulae. Physical quantities are often related through a formula.
For example, the formula for the length, C, of the circumference of a circle expresses the relationship
between the circumference of the circle and its radius, r. This formula states C = 2πr. When used
in this way the equals sign expresses the fact that the quantity on the left is found by evaluating the
expression on the right.
Thirdly, an equals sign is used in identities. An identity looks just like an equation, but it is true
for all values of the variable. We shall see shortly that $(x - 1)(x + 1) = x^2 - 1$ for any value of x
whatsoever. This means that the quantity on the left means exactly the same as that on the right
whatever the value of x. To distinguish this usage from other uses of the equals symbol it is more
correct to write $(x - 1)(x + 1) \equiv x^2 - 1$, where ≡ means ‘is identically equal to’. However, in
practice, the equals sign is often used. We will only use ≡ where it is particularly important to do
so.
The ‘not equals’ sign (≠)
The sign ≠ means ‘is not equal to’. For example, 5 ≠ 6, 7 ≠ −7.
The notation for the change in a variable (δ )

The change in the value of a quantity is found by subtracting its initial value from its final value.
For example, if the temperature of a mixture is initially 13 °C and at a later time is found to be 17 °C,
the change in temperature is 17 − 13 = 4 °C. The Greek letter δ is often used to indicate such a
change. If x is a variable we write δx to stand for a change in the value of x. We sometimes refer
to δx as an increment in x. For example if the value of x changes from 3 to 3.01 we could write
δx = 3.01 − 3 = 0.01. It is important to note that this is not the product of δ and x, rather the
whole symbol ‘δx’ means ‘the increment in x’.
Sigma (or summation) notation (Σ)

This provides a concise and convenient way of writing long sums.

The sum
$x_1 + x_2 + x_3 + x_4 + \ldots + x_{11} + x_{12}$
is written using the capital Greek letter sigma, Σ, as
$\sum_{k=1}^{12} x_k$
The symbol Σ stands for the sum of all the values of $x_k$ as k ranges from 1 to 12. Note that
the lower-most and upper-most values of k are written at the bottom and top of the sigma sign
respectively.
Example 10
Write out explicitly what is meant by $\sum_{k=1}^{5} k^3$.
Solution
We must let k range from 1 to 5. $\sum_{k=1}^{5} k^3 = 1^3 + 2^3 + 3^3 + 4^3 + 5^3$
Task
Express $\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}$ concisely using sigma notation.
Each term has the form $\frac{1}{k}$ where k varies from 1 to 4. Write down the sum using the sigma notation:
Your solution
$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} =$
Answer
$\sum_{k=1}^{4} \frac{1}{k}$
Example 11
Write out explicitly (a) $\sum_{k=1}^{3} 1$, (b) $\sum_{k=0}^{4} 2$.
Solution
(a) Here k does not appear explicitly in the terms to be added. This means add the constant 1,
three times.
$\sum_{k=1}^{3} 1 = 1 + 1 + 1 = 3$
In general $\sum_{k=1}^{n} 1 = n$.
(b) Here k starts at zero so there are n + 1 terms where n = 4:

$\sum_{k=0}^{4} 2 = 2 + 2 + 2 + 2 + 2 = 10$
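Sigma notation translates directly into a loop over the index k. The following Python sketch is illustrative only; the helper name sigma and its arguments are assumptions, not notation from the Workbook.

```python
def sigma(term, lower, upper):
    """Add up term(k) for k = lower, lower + 1, ..., upper (inclusive)."""
    return sum(term(k) for k in range(lower, upper + 1))


print(sigma(lambda k: k**3, 1, 5))   # 225, the sum in Example 10
print(sigma(lambda k: 1, 1, 3))      # 3, Example 11(a)
print(sigma(lambda k: 2, 0, 4))      # 10, Example 11(b)
```

Because range() excludes its upper limit, the helper passes upper + 1 so that both the lowest and the highest values of k are included, as the notation requires.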
Exercises
1. State the reciprocal of (a) x, (b) $\frac{1}{z}$, (c) xy, (d) $\frac{1}{xy}$, (e) a + b, (f) $\frac{2}{a+b}$
2. The pressure p in a reaction vessel changes from 35 pascals to 38 pascals. Write down the
value of δp.
3. Express as simply as possible (a) (−3) × x × (−2) × y, (b) 9 × x × z × (−5).
4. Simplify (a) 8(2y), (b) 17x(−2y), (c) 5x(8y), (d) 5x(−8y)
5. What is the distinction between 5x(2y) and 5x − 2y ?
6. The value of x is 100 ± 3. The value of y is 120 ± 5. Find the maximum and minimum values
of
(a) x + y, (b) xy, (c) $\frac{x}{y}$, (d) $\frac{y}{x}$.
7. Write out explicitly (a) $\sum_{i=1}^{n} f_i$, (b) $\sum_{i=1}^{n} f_i x_i$.
8. By writing out the terms explicitly show that $\sum_{k=1}^{5} 3k = 3 \sum_{k=1}^{5} k$
9. Write out explicitly $\sum_{k=1}^{3} y(x_k)\,\delta x_k$.
Answers
1. (a) $\frac{1}{x}$, (b) z, (c) $\frac{1}{xy}$, (d) xy, (e) $\frac{1}{a+b}$, (f) $\frac{a+b}{2}$.
2. δp = 3 pascals.
3. (a) 6xy, (b) −45xz
4. (a) 16y, (b) −34xy, (c) 40xy, (d) −40xy
5. 5x(2y) = 10xy, 5x − 2y cannot be simplified.
6. (a) max 228, min 212, (b) 12875, 11155, (c) 0.8957, 0.7760, (d) 1.2887, 1.1165
7. (a) $\sum_{i=1}^{n} f_i = f_1 + f_2 + \ldots + f_{n-1} + f_n$,
(b) $\sum_{i=1}^{n} f_i x_i = f_1 x_1 + f_2 x_2 + \ldots + f_{n-1} x_{n-1} + f_n x_n$.
9. $y(x_1)\delta x_1 + y(x_2)\delta x_2 + y(x_3)\delta x_3$.

1.2 Indices
Introduction
Indices, or powers, provide a convenient notation when we need to multiply a number by itself several
times. In this Section we explain how indices are written, and state the rules which are used for
manipulating them.
Expressions built up using non-negative whole number powers of a variable, known as polynomials,
occur frequently in engineering mathematics. We introduce some common polynomials in this
Section.
Finally, scientific notation is used to express very large or very small numbers concisely. This requires
use of indices. We explain how to use scientific notation towards the end of the Section.

Prerequisites
Before starting this Section you should . . .
• be familiar with algebraic notation and symbols

Learning Outcomes
On completion you should be able to . . .
• perform calculations using indices
• state and use the laws of indices
• use scientific notation

1. Index notation
The number 4 × 4 × 4 is written, for short, as $4^3$ and read ‘4 raised to the power 3’ or ‘4 cubed’.
Note that the number of times ‘4’ occurs in the product is written as a superscript. In this context
we call the superscript 3 an index or power. Similarly we could write
5 × 5 = $5^2$, read ‘5 to the power 2’ or ‘5 squared’
and
7 × 7 × 7 × 7 × 7 = $7^5$, a × a × a = $a^3$, m × m × m × m = $m^4$
More generally, in the expression $x^y$, x is called the base and y is called the index or power. The
plural of index is indices. The process of raising to a power is also known as exponentiation
because yet another name for a power is an exponent. When dealing with numbers your calculator
is able to evaluate expressions involving powers, probably using the $x^y$ button.
Example 12
Use a calculator to evaluate $3^{12}$.
Solution
Using the $x^y$ button on the calculator check that you obtain $3^{12}$ = 531441.
Example 13
Identify the index and base in the following expressions. (a) $8^{11}$, (b) $(-2)^5$,
(c) $p^{-q}$
Solution
(a) In the expression $8^{11}$, 8 is the base and 11 is the index.

(b) In the expression $(-2)^5$, −2 is the base and 5 is the index.
(c) In the expression $p^{-q}$, p is the base and −q is the index. The interpretation of a negative index
will be given in sub-section 4 which starts on page 31.
Recall from Section 1.1 that when several operations are involved we can make use of the BODMAS
rule for deciding the order in which operations must be carried out. The BODMAS rule makes no
mention of exponentiation. Exponentiation should be carried out immediately after any brackets have
been dealt with and before multiplication and division. Consider the following examples.
Example 14
Evaluate $7 \times 3^2$.
Solution
There are two operations involved here, exponentiation and multiplication. The exponentiation
should be carried out before the multiplication. So $7 \times 3^2 = 7 \times 9 = 63$.
Example 15
Write out fully (a) $3m^4$, (b) $(3m)^4$.
Solution
(a) In the expression $3m^4$ the exponentiation is carried out before the multiplication by 3. So
$3m^4$ means 3 × (m × m × m × m) that is 3 × m × m × m × m
(b) Here the bracketed expression is raised to the power 4 and so should be multiplied by itself
four times:
$(3m)^4$ = (3m) × (3m) × (3m) × (3m)
Because multiplication is both commutative and associative we can write this as
3 × 3 × 3 × 3 × m × m × m × m or simply $81m^4$.
Note the important distinction between $(3m)^4$ and $3m^4$.
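The same order of operations applies in Python, where exponentiation is written with the ** operator and is carried out before multiplication. In the sketch below the value m = 2 is an arbitrary choice made only so that the expressions can be evaluated.

```python
m = 2   # arbitrary sample value, chosen only for illustration

print(7 * 3**2)      # 63: the power is evaluated first, then 7 * 9
print(3 * m**4)      # 48: 3 * (2 * 2 * 2 * 2)
print((3 * m)**4)    # 1296: 6 * 6 * 6 * 6, which equals 81 * m**4
```

The second and third lines give different results, which illustrates numerically the distinction drawn in Example 15.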
Exercises
1. Evaluate, without using a calculator, (a) $3^3$, (b) $3^5$, (c) $2^5$, (d) $0.2^2$, (e) $15^2$.
2. Evaluate using a calculator (a) $7^3$, (b) $(14)^{3.2}$.
3. Write each of the following using index notation:

(a) 7 × 7 × 7 × 7 × 7, (b) t × t × t × t, (c) $\frac{1}{2} \times \frac{1}{2} \times \frac{1}{7} \times \frac{1}{7} \times \frac{1}{7}$.
4. Evaluate without using a calculator. Leave any fractions in fractional form.

(a) $\left(\frac{2}{3}\right)^2$, (b) $\left(\frac{2}{5}\right)^3$, (c) $\left(\frac{1}{2}\right)^2$, (d) $\left(\frac{1}{2}\right)^3$, (e) $0.1^3$.
Answers
1. (a) 27, (b) 243, (c) 32, (d) 0.04, (e) 225
2. (a) 343, (b) 4651.7 (1 d.p.).
3. (a) $7^5$, (b) $t^4$, (c) $\left(\frac{1}{2}\right)^2 \left(\frac{1}{7}\right)^3$
4. (a) $\frac{4}{9}$, (b) $\frac{8}{125}$, (c) $\frac{1}{4}$, (d) $\frac{1}{8}$, (e) $0.1^3$ means (0.1) × (0.1) × (0.1) = 0.001
2. Laws of indices
There is a set of rules which enables us to manipulate expressions involving indices. These rules are
known as the laws of indices, and they occur so commonly that it is worthwhile to memorise them.
Key Point 5
Laws of Indices
The laws of indices state:
First law: $a^m \times a^n = a^{m+n}$ add indices when multiplying numbers with the same base
Second law: $\frac{a^m}{a^n} = a^{m-n}$ subtract indices when dividing numbers with the same base
Third law: $(a^m)^n = a^{mn}$ multiply indices together when raising a number to a power
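The three laws can be spot-checked numerically. The sketch below uses Python's `fractions.Fraction` for exact arithmetic. The base 3 and the indices 5 and 2 are sample values assumed for illustration, and a check at one set of values is not a proof.

```python
from fractions import Fraction

a = Fraction(3)  # sample non-zero base (an assumption for illustration)
m, n = 5, 2      # sample whole-number indices

first_law = a**m * a**n == a**(m + n)   # add indices when multiplying
second_law = a**m / a**n == a**(m - n)  # subtract indices when dividing
third_law = (a**m)**n == a**(m * n)     # multiply indices for a power of a power

print(first_law, second_law, third_law)
```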
Example 16
Simplify (a) $a^5 \times a^4$, (b) $2x^5(x^3)$.
Solution
In each case we are required to multiply expressions involving indices. The bases are the same and
we use the first law of indices.
(a) The indices must be added, thus $a^5 \times a^4 = a^{5+4} = a^9$.
(b) Because of the associativity of multiplication we can write
$2x^5(x^3) = 2(x^5 x^3) = 2x^{5+3} = 2x^8$
The first law of indices (Key Point 5) extends in an obvious way when more terms are involved:
Example 17
Simplify $b^5 \times b^4 \times b^7$.
Solution
The indices are added. Thus $b^5 \times b^4 \times b^7 = b^{5+4+7} = b^{16}$.
Task
Simplify $y^4 y^2 y^3$.
Your solution
$y^4 y^2 y^3 =$
Answer
All quantities have the same base. To multiply the quantities together, the indices are added: $y^9$
Example 18
Simplify (a) $\frac{8^4}{8^2}$, (b) $x^{18} \div x^7$.
Solution
In each case we are required to divide expressions involving indices. The bases are the same and we
use the second law of indices (Key Point 5).
(a) The indices must be subtracted, thus $\frac{8^4}{8^2} = 8^{4-2} = 8^2 = 64$.
(b) Again the indices are subtracted, and so $x^{18} \div x^7 = x^{18-7} = x^{11}$.
Task
Simplify $\frac{5^9}{5^7}$.
Your solution
$\frac{5^9}{5^7} =$
Answer
The bases are the same, and the division is carried out by subtracting the indices: $5^{9-7} = 5^2 = 25$
Task
Simplify $\frac{y^5}{y^2}$
Your solution
$\frac{y^5}{y^2} =$
Answer
$y^{5-2} = y^3$
Example 19
Simplify (a) $(8^2)^3$, (b) $(z^3)^4$.
Solution
We use the third law of indices (Key Point 5).
(a) $(8^2)^3 = 8^{2 \times 3} = 8^6$
(b) $(z^3)^4 = z^{3 \times 4} = z^{12}$.
Task
Simplify $(x^2)^5$.
Your solution
$(x^2)^5 =$
Answer
$x^{2 \times 5} = x^{10}$
Task
Simplify $(e^x)^y$
Your solution
$(e^x)^y =$
Answer
Again, using the third law of indices, the two powers are multiplied: $e^{x \times y} = e^{xy}$
Two important results which can be derived from the laws of indices state:
Key Point 6
Any non-zero number raised to the power 0 has the value 1, that is $a^0 = 1$
Any number raised to the power 1 is itself, that is $a^1 = a$
A generalisation of the third law of indices states:
Key Point 7
$(a^m b^n)^k = a^{mk} b^{nk}$
Example 20
Remove the brackets from (a) $(3x)^2$, (b) $(x^3 y^7)^4$.
Solution
(a) Noting that $3 = 3^1$ and $x = x^1$ then $(3x)^2 = (3^1 x^1)^2 = 3^2 x^2 = 9x^2$
or, alternatively $(3x)^2 = (3x) \times (3x) = 9x^2$
(b) $(x^3 y^7)^4 = x^{3 \times 4} y^{7 \times 4} = x^{12} y^{28}$
Exercises
1. Show that $(-xy)^2$ is equivalent to $x^2 y^2$ whereas $(-xy)^3$ is equivalent to $-x^3 y^3$.
2. Write each of the following expressions with a single index:
(a) $6^7 6^9$, (b) $\frac{6^7}{6^{19}}$, (c) $(x^4)^3$
3. Remove the brackets from (a) $(8a)^2$, (b) $(7ab)^3$, (c) $7(ab)^3$, (d) $(6xy)^4$.
4. Simplify (a) $15x^2(x^3)$, (b) $3x^2(5x)$, (c) $18x^{-1}(3x^4)$.
5. Simplify (a) $5x(x^3)$, (b) $4x^2(x^3)$, (c) $3x^7(x^4)$, (d) $2x^8(x^{11})$, (e) $5x^2(3x^9)$
Answers
2. (a) $6^{16}$, (b) $6^{-12}$, (c) $x^{12}$
3. (a) $64a^2$, (b) $343a^3 b^3$, (c) $7a^3 b^3$, (d) $1296x^4 y^4$
4. (a) $15x^5$, (b) $15x^3$, (c) $54x^3$
5. (a) $5x^4$, (b) $4x^5$, (c) $3x^{11}$, (d) $2x^{19}$, (e) $15x^{11}$
3. Polynomial expressions
An important group of mathematical expressions which use indices are known as polynomials.
Examples of polynomials are
$4x^3 + 2x^2 + 3x - 7$, $x^2 + x$, $17 - 2t + 7t^4$, $z - z^3$
Notice that they are all constructed using non-negative whole number powers of the variable. Recall
that $x^0 = 1$ and so the number $-7$ appearing in the first expression can be thought of as $-7x^0$.
Similarly the 17 appearing in the third expression can be read as $17t^0$.
Key Point 8
Polynomials
A polynomial expression takes the form
$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots + a_n x^n$
where $a_0, a_1, a_2, a_3, \ldots, a_n$ are all constants called the coefficients of the polynomial. The number
$a_0$ is also called the constant term. The highest power in a polynomial is called the degree of the
polynomial.
Polynomials with low degrees have special names and subscript notation is often not needed:
Polynomial               Degree   Name
$ax^3 + bx^2 + cx + d$   3        cubic
$ax^2 + bx + c$          2        quadratic
$ax + b$                 1        linear
$a$                      0        constant
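The general form in Key Point 8 suggests a simple way to store a polynomial: as its list of coefficients. The following is a minimal sketch under that assumption. The function names and the lowest-power-first ordering are choices made here and are not part of the Workbook.

```python
def evaluate_polynomial(coefficients, x):
    """Evaluate a0 + a1*x + ... + an*x**n for coefficients [a0, a1, ..., an]."""
    return sum(a * x**k for k, a in enumerate(coefficients))

def degree(coefficients):
    """Highest power with a non-zero coefficient (assumes one exists)."""
    return max(k for k, a in enumerate(coefficients) if a != 0)

cubic = [-7, 3, 2, 4]  # represents 4x^3 + 2x^2 + 3x - 7 from the text
print(degree(cubic))                  # 3
print(evaluate_polynomial(cubic, 2))  # 32 + 8 + 6 - 7 = 39
```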
Task
Which of the following expressions are polynomials? Give the degree of those
which are.
(a) $3x^2 + 4x + 2$, (b) $\frac{1}{x+1}$, (c) $\sqrt{x}$, (d) $2t + 4$, (e) $3x^2 + \frac{4}{x} + 2$.
Recall that a polynomial expression must contain only terms involving non-negative
whole number powers of the variable.
Give your answers by ringing the correct word (yes/no) and stating the degree if
it is a polynomial.
Your solution
                                polynomial   degree
(a) $3x^2 + 4x + 2$             yes  no
(b) $\frac{1}{x+1}$             yes  no
(c) $\sqrt{x}$                  yes  no
(d) $2t + 4$                    yes  no
(e) $3x^2 + \frac{4}{x} + 2$    yes  no
Answer
(a) yes: polynomial of degree 2, called quadratic (b) no (c) no
(d) yes: polynomial of degree 1, called linear (e) no
Exercises
1. State which of the following are linear polynomials, which are quadratic polynomials, and which
are constants.
(a) $x$, (b) $x^2 + x + 3$, (c) $x^2 - 1$, (d) $3 - x$, (e) $7x - 2$, (f) $\frac{1}{2}$,
(g) $\frac{1}{2}x + \frac{3}{4}$, (h) $3 - \frac{1}{2}x^2$.
2. State which of the following are polynomials.
(a) $-\alpha^2 - \alpha - 1$, (b) $x^{1/2} - 7x^2$, (c) $\frac{1}{x}$, (d) 19.
3. Which of the following are polynomials?
(a) $4t + 17$, (b) $\frac{1}{2} - \frac{1}{2}t$, (c) 15, (d) $t^2 - 3t + 7$, (e) $\frac{1}{t^2} + \frac{1}{t} + 7$
4. State the degree of each of the following polynomials. For those of low degree, give their name.
(a) $2t^3 + 7t^2$, (b) $7t^7 + 14t^3 - 2t^2$, (c) $7x + 2$,
(d) $x^2 + 3x + 2$, (e) $2 - 3x - x^2$, (f) 42
Answers
1. (a), (d), (e) and (g) are linear. (b), (c) and (h) are quadratic. (f) is a constant.
2. (a) is a polynomial, (d) is a polynomial of degree 0. (b) and (c) are not polynomials.
3. (a) (b) (c) and (d) are polynomials.
4. (a) 3, cubic, (b) 7, (c) 1, linear, (d) 2, quadratic, (e) 2, quadratic, (f) 0, constant.
4. Negative indices
Sometimes a number is raised to a negative power. This is interpreted as follows:
Key Point 9
Negative Powers
$a^{-m} = \frac{1}{a^m}$, $a^m = \frac{1}{a^{-m}}$
Thus a negative index can be used to indicate a reciprocal.
Example 21
Write each of the following expressions using a positive index and simplify if possible.
(a) $2^{-3}$, (b) $\frac{1}{4^{-3}}$, (c) $x^{-1}$, (d) $x^{-2}$, (e) $10^{-1}$
Solution
(a) $2^{-3} = \frac{1}{2^3} = \frac{1}{8}$, (b) $\frac{1}{4^{-3}} = 4^3 = 64$, (c) $x^{-1} = \frac{1}{x^1} = \frac{1}{x}$, (d) $x^{-2} = \frac{1}{x^2}$,
(e) $10^{-1} = \frac{1}{10^1} = \frac{1}{10}$ or 0.1.
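The numerical parts of Example 21 can be reproduced exactly with Python's `fractions.Fraction`, which keeps results as fractions rather than decimals. This is only an illustrative sketch, and the variable names are chosen here.

```python
from fractions import Fraction

# A negative index gives the reciprocal of the corresponding positive power.
two_to_minus_three = Fraction(2) ** -3                     # 1/2^3
reciprocal_of_four_to_minus_three = 1 / Fraction(4) ** -3  # 1/4^(-3) = 4^3
ten_to_minus_one = Fraction(10) ** -1                      # 1/10

print(two_to_minus_three)                 # 1/8
print(reciprocal_of_four_to_minus_three)  # 64
print(float(ten_to_minus_one))            # 0.1
```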
Task
Write each of the following using a positive index. Use Key Point 9.
(a) $\frac{1}{t^{-4}}$, (b) $17^{-3}$, (c) $y^{-1}$, (d) $10^{-2}$
Your solution
(a) $\frac{1}{t^{-4}} =$
Answer
$t^4$
Your solution
(b) $17^{-3} =$
Answer
$\frac{1}{17^3}$
Your solution
(c) $y^{-1} =$
Answer
$\frac{1}{y}$
Your solution
(d) $10^{-2} =$
Answer
$\frac{1}{10^2}$ which equals $\frac{1}{100}$ or 0.01
Task
Simplify $\frac{a^8 \times a^7}{a^4}$
Use the first law of indices to simplify the numerator:
Your solution
$\frac{a^8 \times a^7}{a^4} =$
Answer
$\frac{a^{15}}{a^4}$
Now use the second law to simplify the result:
Your solution
Answer
$a^{11}$
Task
Simplify $\frac{m^9 \times m^{-2}}{m^{-3}}$
First simplify the numerator using the first law of indices:
Your solution
$\frac{m^9 \times m^{-2}}{m^{-3}} =$
Answer
$\frac{m^7}{m^{-3}}$
Then use the second law to simplify the result:
Your solution
Answer
$m^{7-(-3)} = m^{10}$
Exercises
1. Write the following numbers using a positive index and also express your answers as decimal
fractions:
(a) $10^{-1}$, (b) $10^{-3}$, (c) $10^{-4}$
2. Simplify as much as possible:
(a) $x^3 x^{-2}$, (b) $\frac{t^4}{t^{-3}}$, (c) $\frac{y^{-2}}{y^{-6}}$.
Answers
1. (a) $\frac{1}{10} = 0.1$, (b) $\frac{1}{10^3} = 0.001$, (c) $\frac{1}{10^4} = 0.0001$.
2. (a) $x^1 = x$, (b) $t^{4+3} = t^7$, (c) $y^{-2+6} = y^4$.
5. Fractional indices
So far we have used indices that are whole numbers. We now consider fractional powers. Consider
the expression $(16^{1/2})^2$. Using the third law of indices, $(a^m)^n = a^{mn}$, we can write
$(16^{1/2})^2 = 16^{\frac{1}{2} \times 2} = 16^1 = 16$
So $16^{1/2}$ is a number which when squared equals 16, that is 4 or $-4$. In other words $16^{1/2}$ is a square
root of 16. There are always two square roots of a non-zero positive number, and we write $16^{1/2} = \pm 4$
Key Point 10
In general $a^{1/2}$ is a square root of $a$, $a \geq 0$
Similarly
$(8^{1/3})^3 = 8^{\frac{1}{3} \times 3} = 8^1 = 8$
so that $8^{1/3}$ is a number which when cubed equals 8. Thus $8^{1/3}$ is the cube root of 8, that is $\sqrt[3]{8}$,
namely 2. Each real number has only one real cube root, and so
$8^{1/3} = 2$
In general
Key Point 11
$a^{1/3}$ is the cube root of $a$
More generally we have
Key Point 12
The $n$th root of $a$ is denoted by $a^{1/n}$.
When $a < 0$ the $n$th root only exists if $n$ is odd.
If $a > 0$ the positive $n$th root is denoted by $\sqrt[n]{a}$
If $a < 0$ the negative $n$th root is $-\sqrt[n]{|a|}$
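Key Point 12 can be turned into a small Python sketch. The helper `real_nth_root` is a hypothetical name introduced here. It is needed because, in Python 3, raising a negative number to a fractional power with `**` gives a complex result rather than the real root. Like a calculator, the function returns only one root.

```python
def real_nth_root(a, n):
    """Real nth root of a, following Key Point 12 (positive root when a > 0)."""
    if a >= 0:
        return a ** (1 / n)
    if n % 2 == 1:
        return -((-a) ** (1 / n))  # negative root: -(|a|)^(1/n)
    raise ValueError("a negative number has no real nth root when n is even")

print(real_nth_root(16, 2))  # the positive square root of 16
print(real_nth_root(-8, 3))  # the (negative) cube root of -8
```

Results are floating-point values, so they may differ from the exact root by a tiny rounding error.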
Your calculator will be able to evaluate fractional powers, and roots of numbers. Check that you
can obtain the results of the following Examples on your calculator, but be aware that calculators
normally give only one root when there may be others.
Example 22
Evaluate (a) $144^{1/2}$, (b) $125^{1/3}$
Solution
(a) $144^{1/2}$ is a square root of 144, that is $\pm 12$.
(b) Noting that $5^3 = 125$, we see that $125^{1/3} = \sqrt[3]{125} = 5$
Example 23
Evaluate (a) $32^{1/5}$, (b) $32^{2/5}$, (c) $8^{2/3}$.
Solution
(a) $32^{1/5}$ is the 5th root of 32, that is $\sqrt[5]{32}$. Now $2^5 = 32$ and so $\sqrt[5]{32} = 2$.
(b) Using the third law of indices we can write $32^{2/5} = 32^{2 \times \frac{1}{5}} = (32^{1/5})^2$. Thus
$32^{2/5} = ((32)^{1/5})^2 = 2^2 = 4$
(c) Note that $8^{1/3} = 2$. Then
$8^{2/3} = 8^{2 \times \frac{1}{3}} = (8^{1/3})^2 = 2^2 = 4$
Note the following alternatives:
$8^{2/3} = (8^{1/3})^2 = (8^2)^{1/3}$
Example 24
Write the following as a simple power with a single index:
(a) $\sqrt{x^5}$, (b) $\sqrt[4]{x^3}$.
Solution
(a) $\sqrt{x^5} = (x^5)^{1/2}$. Then using the third law of indices we can write this as $x^{5 \times \frac{1}{2}} = x^{5/2}$.
(b) $\sqrt[4]{x^3} = (x^3)^{1/4}$. Using the third law we can write this as $x^{3 \times \frac{1}{4}} = x^{3/4}$.
Example 25
Show that $z^{-1/2} = \frac{1}{\sqrt{z}}$.
Solution
$z^{-1/2} = \frac{1}{z^{1/2}} = \frac{1}{\sqrt{z}}$
Task
Simplify $\frac{\sqrt{z}}{z^3 z^{-1/2}}$
First, rewrite $\sqrt{z}$ using an index and simplify the denominator using the first law of indices:
Your solution
$\frac{\sqrt{z}}{z^3 z^{-1/2}} =$
Answer
$\frac{z^{1/2}}{z^{5/2}}$
Finally, use the second law to simplify the result:
Your solution
Answer
$z^{\frac{1}{2} - \frac{5}{2}} = z^{-2}$ or $\frac{1}{z^2}$
Example 26
The generalisation of the third law of indices states that $(a^m b^n)^k = a^{mk} b^{nk}$. By
taking $m = 1$, $n = 1$ and $k = \frac{1}{2}$ show that $\sqrt{ab} = \sqrt{a}\sqrt{b}$.
Solution
Taking $m = 1$, $n = 1$ and $k = \frac{1}{2}$ gives $(ab)^{1/2} = a^{1/2} b^{1/2}$.
Taking the case when all these roots are positive, we have $\sqrt{ab} = \sqrt{a}\sqrt{b}$.
Key Point 13
$\sqrt{ab} = \sqrt{a}\sqrt{b}$, $a \geq 0$, $b \geq 0$
This result often allows answers to be written in alternative forms. For example, we may write $\sqrt{48}$
as $\sqrt{3 \times 16} = \sqrt{3}\sqrt{16} = 4\sqrt{3}$.
Although this rule works for multiplication we should be aware that it does not work for addition or
subtraction so that
$\sqrt{a \pm b} \neq \sqrt{a} \pm \sqrt{b}$
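A quick numerical check makes the contrast between multiplication and addition concrete. The sketch below uses `math.sqrt` and `math.isclose` from Python's standard library. The values 16 and 9 are sample values assumed for illustration.

```python
import math

a, b = 16, 9  # sample non-negative values (an assumption for illustration)

product_rule_holds = math.isclose(math.sqrt(a * b), math.sqrt(a) * math.sqrt(b))
sum_rule_holds = math.isclose(math.sqrt(a + b), math.sqrt(a) + math.sqrt(b))
root_48_check = math.isclose(math.sqrt(48), 4 * math.sqrt(3))

print(product_rule_holds)  # True: sqrt(144) = 12 = 4 * 3
print(sum_rule_holds)      # False: sqrt(25) = 5 but 4 + 3 = 7
print(root_48_check)       # True: sqrt(48) = 4 * sqrt(3)
```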
Exercises
1. Evaluate using a calculator (a) $3^{1/2}$, (b) $15^{-1/3}$, (c) $85^3$, (d) $81^{1/4}$
2. Evaluate using a calculator (a) $15^{-5}$, (b) $15^{-2/7}$
3. Simplify (a) $\frac{a^{11} a^{3/4}}{a^{-1/2}}$, (b) $\frac{\sqrt{z}}{z^{3/2}}$, (c) $\frac{z^{-5/2}}{\sqrt{z}}$, (d) $\frac{\sqrt[3]{a}}{\sqrt{a}}$, (e) $\frac{\sqrt[5]{z}}{z^{1/2}}$.
4. Write each of the following expressions with a single index:
(a) $(x^{-4})^3$, (b) $x^{1/2} x^{1/4}$, (c) $\frac{x^{1/2}}{x^{1/4}}$
Answers
1. (a) 1.7321, (b) 0.4055, (c) 614125, (d) 3
2. (a) 0.000001317 (4 s.f.), (b) 0.4613 (4 s.f.)
3. (a) $a^{12.25}$, (b) $z^{-1}$, (c) $z^{-3}$, (d) $a^{-1/6}$, (e) $z^{-3/10}$
4. (a) $x^{-12}$, (b) $x^{3/4}$, (c) $x^{1/4}$
6. Scientific notation
It is often necessary to use very large or very small numbers such as 78000000 and 0.00000034.
Scientific notation can be used to express such numbers in a more concise form. Each number is
written in the form
$a \times 10^n$
where $a$ is a number with $1 \leq a < 10$ and $n$ is an integer. We can make use of the following facts:
$10 = 10^1$, $100 = 10^2$, $1000 = 10^3$ and so on
and
$0.1 = 10^{-1}$, $0.01 = 10^{-2}$, $0.001 = 10^{-3}$ and so on.
For example,
• the number 5000 can be written $5 \times 1000 = 5 \times 10^3$
• the number 403 can be written $4.03 \times 100 = 4.03 \times 10^2$
• the number 0.009 can be written $9 \times 0.001 = 9 \times 10^{-3}$
Furthermore, to multiply a number by $10^n$ the decimal point is moved $n$ places to the right if $n$ is a
positive integer, and $|n|$ places to the left if $n$ is a negative integer. (If necessary additional zeros are
inserted to make up the required number of digits.)
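The conversion to scientific notation can be sketched in Python by using the built-in `e` format, which already writes a number as a mantissa and a power of 10. The helper name `to_scientific` and the default of six decimal places in the mantissa are assumptions made for this illustration.

```python
def to_scientific(x, digits=6):
    """Return (a, n) such that x is approximately a * 10**n with 1 <= |a| < 10.

    The mantissa is rounded to `digits` decimal places by the 'e' format.
    """
    mantissa, exponent = f"{x:.{digits}e}".split("e")
    return float(mantissa), int(exponent)

print(to_scientific(5000))      # (5.0, 3)
print(to_scientific(403))       # (4.03, 2)
print(to_scientific(0.009))     # (9.0, -3)
print(to_scientific(0.00678))   # (6.78, -3)
print(to_scientific(123456.7))  # (1.234567, 5)
```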
Task
Write the numbers 0.00678 and 123456.7 in scientific notation.
Your solution
Answer
$0.00678 = 6.78 \times 10^{-3}$, $123456.7 = 1.234567 \times 10^5$
Engineering constants
Many constants appearing in engineering calculations are expressed in scientific notation. For example
the charge on an electron has magnitude $1.6 \times 10^{-19}$ coulomb and the speed of light is $3 \times 10^8$ m s$^{-1}$.
Avogadro's constant is equal to $6.022 \times 10^{26}$ and is the number of atoms in one kilomole of an
element. Clearly the use of scientific notation avoids writing lengthy strings of zeros.
Your scientific calculator will be able to accept numbers in scientific notation. Often the E button
is used and a number like $4.2 \times 10^7$ will be entered as 4.2E7. Note that 10E4 means $10 \times 10^4$, that
is $10^5$. To enter the number $10^3$ say, you would key in 1E3. Entering powers of 10 incorrectly is a
common cause of error. You must check how your particular calculator accepts numbers in scientific
notation.
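Python's number literals use the same E convention as many calculators, so the points above can be tried directly. This is a small sketch, and the variable names are chosen here for illustration.

```python
entered = 4.2E7     # 4.2 x 10^7
ten_e_four = 10E4   # 10 x 10^4, that is 10^5 (not 10^4)
one_e_three = 1E3   # 10^3 is keyed in as 1E3

product = 4.2E-3 * 3.6E-4  # product of two numbers in scientific notation

print(entered)            # 42000000.0
print(ten_e_four)         # 100000.0
print(one_e_three)        # 1000.0
print(f"{product:.3e}")   # 1.512e-06
```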
The following Task is designed to check that you can enter numbers given in scientific notation into
your calculator.
Task
Use your calculator to find $4.2 \times 10^{-3} \times 3.6 \times 10^{-4}$.
Your solution
$4.2 \times 10^{-3} \times 3.6 \times 10^{-4} =$
Answer
$1.512 \times 10^{-6}$
Exercises
1. Express each of the following numbers in scientific notation:
(a) 45, (b) 456, (c) 2079, (d) 7000000, (e) 0.1, (f) 0.034,
(g) 0.09856
2. Simplify $6 \times 10^{24} \times 1.3 \times 10^{-16}$
Answers
1. (a) $4.5 \times 10^1$, (b) $4.56 \times 10^2$, (c) $2.079 \times 10^3$, (d) $7 \times 10^6$, (e) $1 \times 10^{-1}$,
(f) $3.4 \times 10^{-2}$, (g) $9.856 \times 10^{-2}$
2. $7.8 \times 10^8$
Simplification
and Factorisation 1.3
Introduction
In this Section we explain what is meant by the phrase ‘like terms’ and show how like terms are
collected together and simplified.
Next we consider removing brackets. In order to simplify an expression which contains brackets it
is often necessary to rewrite the expression in an equivalent form but without any brackets. This
process of removing brackets must be carried out according to particular rules which are described in
this Section.
Finally, factorisation, which can be considered as the reverse of this process, is dealt with. It is
essential that you have had plenty of practice in removing brackets before you study factorisation.

Prerequisites
• be familiar with algebraic notation
• have competence in removing brackets

Learning Outcomes
• use the laws of indices
• simplify expressions by collecting like terms
• identify common factors in an expression
• factorise simple expressions
• factorise quadratic expressions

1. Addition and subtraction of like terms

Like terms are multiples of the same quantity. For example $5y$, $17y$ and $\frac{1}{2}y$ are all multiples of $y$
and so are like terms. Similarly, $3x^2$, $-5x^2$ and $\frac{1}{4}x^2$ are all multiples of $x^2$ and so are like terms.
Further examples of like terms are:
$kx$ and $\ell x$ which are both multiples of $x$,
$x^2 y$, $6x^2 y$, $-13x^2 y$, $-2yx^2$, which are all multiples of $x^2 y$
$abc^2$, $-7abc^2$, $kabc^2$, are all multiples of $abc^2$
Like terms can be added or subtracted in order to simplify expressions.
Example 27
Simplify 5x − 13x + 22x.
Solution
All three terms are multiples of x and so are like terms. The expression can be simplified to 14x.
Example 28
Simplify 5z + 2x.
Solution
5z and 2x are not like terms. They are not multiples of the same quantity. This expression cannot
be simplified.
Task
Simplify 5a + 2b − 7a − 9b.
Your solution
5a + 2b − 7a − 9b =
Answer
−2a − 7b
Example 29
Simplify $2x^2 - 7x + 11x^2 + x$.
Solution
$2x^2$ and $11x^2$, both being multiples of $x^2$, can be collected together and added to give $13x^2$.
Similarly, $-7x$ and $x$ can be added to give $-6x$.
We get $2x^2 - 7x + 11x^2 + x = 13x^2 - 6x$ which cannot be simplified further.
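Collecting like terms amounts to adding the coefficients of terms that share the same symbolic part. The sketch below illustrates this idea for Example 29. Representing each term as a (coefficient, symbolic part) pair is an assumption made here for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

def collect_like_terms(terms):
    """Add the coefficients of terms that share the same symbolic part.

    Each term is a (coefficient, symbolic_part) pair, e.g. (-7, "x") for -7x.
    """
    totals = defaultdict(int)
    for coefficient, symbolic_part in terms:
        totals[symbolic_part] += coefficient
    return dict(totals)

# 2x^2 - 7x + 11x^2 + x
example_29 = [(2, "x^2"), (-7, "x"), (11, "x^2"), (1, "x")]
print(collect_like_terms(example_29))  # {'x^2': 13, 'x': -6}
```

Terms whose symbolic parts differ stay separate, which mirrors the fact that unlike terms cannot be combined.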
Task
Simplify $\frac{1}{2}x + \frac{3}{4}x - 2y$.
Your solution
$\frac{1}{2}x + \frac{3}{4}x - 2y =$
Answer
$\frac{5}{4}x - 2y$
Example 30
Simplify $3a^2 b - 7a^2 b - 2b^2 + a^2$.
Solution
Note that $3a^2 b$ and $7a^2 b$ are both multiples of $a^2 b$ and so are like terms. There are no other like
terms. Therefore
$3a^2 b - 7a^2 b - 2b^2 + a^2 = -4a^2 b - 2b^2 + a^2$
Exercises
1. Simplify, if possible,
(a) $5x + 2x + 3x$, (b) $3q - 2q + 11q$, (c) $7x^2 + 11x^2$, (d) $-11v^2 + 2v^2$, (e) $5p + 3q$
2. Simplify, if possible, (a) $5w + 3r - 2w + r$, (b) $5w^2 + w + 1$, (c) $6w^2 + w^2 - 3w^2$
3. Simplify, if possible,
(a) $7x + 2 + 3x + 8x - 11$, (b) $2x^2 - 3x + 6x - 2$, (c) $-5x^2 - 3x^2 + 11x + 11$,
(d) $4q^2 - 4r^2 + 11r + 6q$, (e) $a^2 + ba + ab + b^2$, (f) $3x^2 + 4x + 6x + 8$,
(g) $s^3 + 3s^2 + 2s^2 + 6s + 4s + 12$.
4. Explain the distinction, if any, between each of the following expressions, and simplify if possible.
(a) $18x - 9x$, (b) $18x(9x)$, (c) $18x(-9x)$, (d) $-18x - 9x$, (e) $-18x(9x)$
5. Explain the distinction, if any, between each of the following expressions, and simplify if possible.
(a) $4x - 2x$, (b) $4x(-2x)$, (c) $4x(2x)$, (d) $-4x(2x)$, (e) $-4x - 2x$, (f) $(4x)(2x)$
6. Simplify, if possible,
(a) $\frac{2}{3}x^2 + \frac{x^2}{3}$, (b) $0.5x^2 + \frac{3}{4}x^2 - \frac{11}{2}x$, (c) $3x^3 - 11x + 3yx + 11$,
(d) $-4\alpha x^2 + \beta x^2$ where $\alpha$ and $\beta$ are constants.
Answers
1. (a) $10x$, (b) $12q$, (c) $18x^2$, (d) $-9v^2$, (e) cannot be simplified.
2. (a) $3w + 4r$, (b) cannot be simplified, (c) $4w^2$
3. (a) $18x - 9$, (b) $2x^2 + 3x - 2$, (c) $-8x^2 + 11x + 11$, (d) cannot be simplified,
(e) $a^2 + 2ab + b^2$, (f) $3x^2 + 10x + 8$, (g) $s^3 + 5s^2 + 10s + 12$
4. (a) $9x$, (b) $162x^2$, (c) $-162x^2$, (d) $-27x$, (e) $-162x^2$
5. (a) $4x - 2x = 2x$, (b) $4x(-2x) = -8x^2$, (c) $4x(2x) = 8x^2$, (d) $-4x(2x) = -8x^2$,
(e) $-4x - 2x = -6x$, (f) $(4x)(2x) = 8x^2$
6. (a) $x^2$, (b) $1.25x^2 - \frac{11}{2}x$, (c) cannot be simplified, (d) $(\beta - 4\alpha)x^2$
2. Removing brackets from expressions a(b + c) and a(b − c)
Removing brackets means multiplying out. For example 5(2 + 4) = 5 × 2 + 5 × 4 = 10 + 20 = 30.
In this simple example we could alternatively get the same result as follows: 5(2 + 4) = 5 × 6 = 30.
That is:
5(2 + 4) = 5 × 2 + 5 × 4
In an expression such as 5(x + y) it is intended that the 5 multiplies both x and y to produce 5x + 5y.
Thus the expressions 5(x + y) and 5x + 5y are equivalent. In general we have the following rules
known as distributive laws:
Key Point 14
a(b + c) = ab + ac
a(b − c) = ab − ac
Note that when the brackets are removed both terms in the brackets are multiplied by a.
As we have noted above, if you insert numbers instead of letters into these expressions you will see
that both left and right hand sides are equivalent. For example
4(3 + 5) has the same value as 4(3) + 4(5), that is 32
and
7(8 − 3) has the same value as 7(8) − 7(3), that is 35
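The two numerical checks above can be repeated for other values with a few lines of Python. This is only a sketch: the function name `expand` is chosen here, and subtraction is handled by passing a negative value for c.

```python
def expand(a, b, c):
    """Right-hand side of the distributive law a(b + c) = ab + ac."""
    return a * b + a * c

print(4 * (3 + 5), expand(4, 3, 5))   # 32 32
print(7 * (8 - 3), expand(7, 8, -3))  # 35 35
```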
Example 31
Remove the brackets from (a) 9(2 + y), (b) 9(2y).
Solution
(a) In the expression 9(2 + y) the 9 must multiply both terms in the brackets:
9(2 + y) = 9(2) + 9(y)

= 18 + 9y
(b) Recall that 9(2y) means 9 × (2 × y) and that when multiplying numbers together the presence
of brackets is irrelevant. Thus 9(2y) = 9 × 2 × y = 18y
The crucial distinction between the role of the factor 9 in the two expressions 9(2 + y) and 9(2y) in
Example 31 should be noted.
Example 32
Remove the brackets from 9(x + 2y).
Solution
In the expression 9(x + 2y) the 9 must multiply both the x and the 2y in the brackets. Thus
9(x + 2y) = 9x + 9(2y)

= 9x + 18y
Task
Remove the brackets from 9(2x + 3y).
Remember that the 9 must multiply both the term 2x and the term 3y:
Your solution
9(2x + 3y) =
Answer
18x + 27y
Example 33
Remove the brackets from −3(5x − z).
Solution
The number −3 must multiply both the 5x and the z.
−3(5x − z) = (−3)(5x) − (−3)(z)

= −15x + 3z
Task
Remove the brackets from 6x(3x − 2y).
Your solution
Answer
$6x(3x - 2y) = 6x(3x) - 6x(2y) = 18x^2 - 12xy$
Example 34
Remove the brackets from −(3x + 1).
Solution
Although the 1 is unwritten, the minus sign outside the brackets stands for −1. We must therefore
consider the expression −1(3x + 1).
−1(3x + 1) = (−1)(3x) + (−1)(1)

= −3x + (−1)
= −3x − 1
Task
Remove the brackets from −(5x − 3y).
Your solution
Answer
−(5x − 3y) means −1(5x − 3y).
−1(5x − 3y) = (−1)(5x) − (−1)(3y) = −5x + 3y
Task
Remove the brackets from m(m − n).
In the expression m(m − n) the first m must multiply both terms in the brackets:
Your solution
m(m − n) =
Answer
$m^2 - mn$
Example 35
Remove the brackets from the expression 5x − (3x + 1) and simplify the result by
collecting like terms.
Solution
The brackets in −(3x + 1) were removed in Example 34.
5x − (3x + 1) = 5x − 1(3x + 1)
= 5x − 3x − 1
= 2x − 1
Example 36
Show that $\frac{-x - 1}{4}$, $\frac{-(x + 1)}{4}$ and $-\frac{x + 1}{4}$ are all equivalent expressions.
Solution
Consider $-(x + 1)$. Removing the brackets we obtain $-x - 1$ and so
$\frac{-x - 1}{4}$ is equivalent to $\frac{-(x + 1)}{4}$
A negative quantity divided by a positive quantity will be negative. Hence
$\frac{-(x + 1)}{4}$ is equivalent to $-\frac{x + 1}{4}$
You should study all three expressions carefully to recognise the variety of equivalent ways in which
we can write an algebraic expression.
Sometimes the bracketed expression can appear on the left, as in (a + b)c. To remove the brackets
here we use the following rules:
Key Point 15
(a + b)c = ac + bc
(a − b)c = ac − bc
Note that when the brackets are removed both the terms in the brackets multiply c.
Example 37
Remove the brackets from (2x + 3y)x.
Solution
Both terms in the brackets multiply the x outside. Thus
$(2x + 3y)x = 2x(x) + 3y(x)$
$= 2x^2 + 3yx$
Task
Remove the brackets from (a) (x + 3)(−2), (b) (x − 3)(−2).
Your solution
(a) (x + 3)(−2) =
Answer
Both terms in the bracket must multiply the −2, giving −2x − 6
Your solution
(b) (x − 3)(−2) =
Answer
−2x + 6
3. Removing brackets from expressions of the form (a + b)(c + d)

Sometimes it is necessary to consider two bracketed terms multiplied together. In the expression
(a + b)(c + d), by regarding the first bracket as a single term we can use the result in Key Point
14 to write it as (a + b)c + (a + b)d. Removing the brackets from each of these terms produces
ac + bc + ad + bd. More concisely:
Key Point 16
(a + b)(c + d) = (a + b)c + (a + b)d = ac + bc + ad + bd
We see that each term in the first bracketed expression multiplies each term in the second bracketed
expression.
Example 38
Remove the brackets from (3 + x)(2 + y)
Solution
We find (3 + x)(2 + y) = (3 + x)(2) + (3 + x)y
= (3)(2) + (x)(2) + (3)(y) + (x)(y) = 6 + 2x + 3y + xy
Example 39
Remove the brackets from (3x + 4)(x + 2) and simplify your result.
Solution
$(3x + 4)(x + 2) = (3x + 4)(x) + (3x + 4)(2)$
$= 3x^2 + 4x + 6x + 8 = 3x^2 + 10x + 8$
Example 40
Remove the brackets from $(a + b)^2$ and simplify your result.
Solution
When a quantity is squared it is multiplied by itself. Thus
$(a + b)^2 = (a + b)(a + b) = (a + b)a + (a + b)b$
$= a^2 + ba + ab + b^2 = a^2 + 2ab + b^2$
Key Point 17
$(a + b)^2 = a^2 + 2ab + b^2$
$(a - b)^2 = a^2 - 2ab + b^2$
Task
Remove the brackets from the following expressions and simplify the results.
(a) $(x + 7)(x + 3)$, (b) $(x + 3)(x - 2)$
Your solution
(a) $(x + 7)(x + 3) =$
Answer
$x^2 + 7x + 3x + 21 = x^2 + 10x + 21$
Your solution
(b) $(x + 3)(x - 2) =$
Answer
$x^2 + 3x - 2x - 6 = x^2 + x - 6$
Example 41
Explain the distinction between $(x + 3)(x + 2)$ and $x + 3(x + 2)$.
Solution
In the first expression removing the brackets we find
$(x + 3)(x + 2) = x^2 + 3x + 2x + 6$
$= x^2 + 5x + 6$
In the second expression we have
$x + 3(x + 2) = x + 3x + 6 = 4x + 6$
Note that in the second expression the term $(x + 2)$ is only multiplied by 3 and not by $x$.
Example 42
Remove the brackets from $(s^2 + 2s + 4)(s + 3)$.
Solution
Each term in the first bracket must multiply each term in the second. Working through all combinations
systematically we have
$(s^2 + 2s + 4)(s + 3) = (s^2 + 2s + 4)(s) + (s^2 + 2s + 4)(3)$
$= s^3 + 2s^2 + 4s + 3s^2 + 6s + 12$
$= s^3 + 5s^2 + 10s + 12$
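The rule that each term in the first bracket multiplies each term in the second can be sketched in Python, with each bracket stored as a list of coefficients. The lowest-power-first ordering and the function name are assumptions made for this illustration.

```python
def multiply_polynomials(p, q):
    """Multiply polynomials given as coefficient lists [a0, a1, ...] (lowest power first)."""
    product = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b  # each term of p multiplies each term of q
    return product

# (s^2 + 2s + 4)(s + 3) from Example 42
print(multiply_polynomials([4, 2, 1], [3, 1]))  # [12, 10, 5, 1]
# (3x + 4)(x + 2) from Example 39
print(multiply_polynomials([4, 3], [2, 1]))     # [8, 10, 3]
```

Read from the highest power down, the first result is $s^3 + 5s^2 + 10s + 12$ and the second is $3x^2 + 10x + 8$, in agreement with the worked solutions.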
Engineering Example 1
Reliability in a communication network
Introduction
The reliability of a communication network depends on the reliability of its component parts. The
reliability of a component can be represented by a number between 0 and 1 which represents the
probability that it will function over a given period of time.
A very simple system with only two components C1 and C2 can be configured in series or in parallel.
If the components are in series then the system will fail if either component fails (see Figure 4).
Figure 4: Both components 1 and 2 must function for the system to function
If the components are in parallel then only one component need function properly (see Figure 5)
and we have built-in redundancy.
Figure 5: Either component 1 or 2 must function for the system to function

The reliability of a system with two units in parallel is given by $1 - (1 - R_1)(1 - R_2)$ which is the
same as $R_1 + R_2 - R_1 R_2$, where $R_i$ is the reliability of component $C_i$. The reliability of a system
with 3 units in parallel, as in Figure 6, is given by
$1 - (1 - R_1)(1 - R_2)(1 - R_3)$
Figure 6: At least one of the three components must function for the system to function
Problem in words
(a) Show that the expression for the system reliability for three components in parallel is equal
to $R_1 + R_2 + R_3 - R_1 R_2 - R_1 R_3 - R_2 R_3 + R_1 R_2 R_3$
(b) Find an expression for the reliability of the system when the reliability of each of the
components is the same i.e. $R_1 = R_2 = R_3 = R$
(c) Find the system reliability when $R = 0.75$
(d) Find the system reliability when there are two parallel components each with reliability
$R = 0.75$.
Mathematical statement of the problem
(a) Show that $1 - (1 - R_1)(1 - R_2)(1 - R_3) \equiv R_1 + R_2 + R_3 - R_1 R_2 - R_1 R_3 - R_2 R_3 + R_1 R_2 R_3$
(b) Find $1 - (1 - R_1)(1 - R_2)(1 - R_3)$ in terms of $R$ when $R_1 = R_2 = R_3 = R$
(c) Find the value of (b) when $R = 0.75$
(d) Find $1 - (1 - R_1)(1 - R_2)$ when $R_1 = R_2 = 0.75$.
Mathematical analysis
(a) $1 - (1 - R_1)(1 - R_2)(1 - R_3) \equiv 1 - (1 - R_1 - R_2 + R_1 R_2)(1 - R_3)$
$= 1 - ((1 - R_1 - R_2 + R_1 R_2) \times 1 - (1 - R_1 - R_2 + R_1 R_2) \times R_3)$
$= 1 - (1 - R_1 - R_2 + R_1 R_2 - (R_3 - R_1 R_3 - R_2 R_3 + R_1 R_2 R_3))$
$= 1 - (1 - R_1 - R_2 + R_1 R_2 - R_3 + R_1 R_3 + R_2 R_3 - R_1 R_2 R_3)$
$= 1 - 1 + R_1 + R_2 - R_1 R_2 + R_3 - R_1 R_3 - R_2 R_3 + R_1 R_2 R_3$
$= R_1 + R_2 + R_3 - R_1 R_2 - R_1 R_3 - R_2 R_3 + R_1 R_2 R_3$
(b) When $R_1 = R_2 = R_3 = R$ the reliability is
$1 - (1 - R)^3$ which is equivalent to $3R - 3R^2 + R^3$
(c) When $R_1 = R_2 = R_3 = 0.75$ we get
$1 - (1 - 0.75)^3 = 1 - 0.25^3 = 1 - 0.015625 = 0.984375$
(d) $1 - (0.25)^2 = 0.9375$
Interpretation
The mathematical analysis confirms the expectation that the more components there are in parallel
then the more reliable the system becomes (1 component: 0.75; 2 components: 0.9375; 3
components: 0.984375). With three components in parallel, as in part (c), although each individual
component is relatively unreliable (R = 0.75 implies a one in four chance of failure of an individual
component) the system as a whole has an over 98% probability of functioning (under 1 in 50 chance
of failure).
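The figures quoted in the interpretation can be reproduced with a short Python sketch of the parallel-reliability formula. The function name is hypothetical, and the formula is taken here in the generalised form 1 − (1 − R1)(1 − R2)...(1 − Rn), which is our extension of the two- and three-component cases given above.

```python
def parallel_reliability(reliabilities):
    """Reliability of components in parallel: 1 - (1 - R1)(1 - R2)...(1 - Rn)."""
    probability_all_fail = 1.0
    for r in reliabilities:
        probability_all_fail *= (1 - r)  # every component must fail for the system to fail
    return 1 - probability_all_fail

print(parallel_reliability([0.75]))              # 0.75
print(parallel_reliability([0.75, 0.75]))        # 0.9375
print(parallel_reliability([0.75, 0.75, 0.75]))  # 0.984375
```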
Exercises
1. Remove the brackets from each of the following expressions:
(a) 2(mn), (b) 2(m + n), (c) a(mn), (d) a(m + n), (e) a(m − n),
(f) (am)n, (g) (a + m)n, (h) (a − m)n, (i) 5(pq), (j) 5(p + q),
(k) 5(p − q), (l) 7(xy), (m) 7(x + y), (n) 7(x − y), (o) 8(2p + q),
(p) 8(2pq), (q) 8(2p − q), (r) 5(p − 3q), (s) 5(p + 3q) (t) 5(3pq).

2. Remove the brackets from each of the following expressions:
(a) $4(a + b)$, (b) $2(m - n)$, (c) $9(x - y)$
3. Remove the brackets from each of the following expressions and simplify where possible:
(a) $(2 + a)(3 + b)$, (b) $(x + 1)(x + 2)$, (c) $(x + 3)(x + 3)$, (d) $(x + 5)(x - 3)$
4. Remove the brackets from each of the following expressions:
(a) $(7 + x)(2 + x)$, (b) $(9 + x)(2 + x)$, (c) $(x + 9)(x - 2)$, (d) $(x + 11)(x - 7)$,
(e) $(x + 2)x$, (f) $(3x + 1)x$, (g) $(3x + 1)(x + 1)$, (h) $(3x + 1)(2x + 1)$,
(i) $(3x + 5)(2x + 7)$, (j) $(3x + 5)(2x - 1)$, (k) $(5 - 3x)(x + 1)$, (l) $(2 - x)(1 - x)$.
5. Remove the brackets from $(s + 1)(s + 5)(s - 3)$.
Answers
1. (a) 2mn, (b) 2m + 2n, (c) amn, (d) am + an, (e) am − an, (f) amn, (g) an + mn,
(h) an − mn, (i) 5pq, (j) 5p + 5q, (k) 5p − 5q, (l) 7xy, (m) 7x + 7y, (n) 7x − 7y,
(o) 16p + 8q, (p) 16pq, (q) 16p − 8q, (r) 5p − 15q, (s) 5p + 15q, (t) 15pq
2. (a) 4a + 4b, (b) 2m − 2n, (c) 9x − 9y
3. (a) $6 + 3a + 2b + ab$, (b) $x^2 + 3x + 2$, (c) $x^2 + 6x + 9$, (d) $x^2 + 2x - 15$
4. On removing brackets we obtain:
(a) $14 + 9x + x^2$, (b) $18 + 11x + x^2$, (c) $x^2 + 7x - 18$, (d) $x^2 + 4x - 77$
(e) $x^2 + 2x$, (f) $3x^2 + x$, (g) $3x^2 + 4x + 1$, (h) $6x^2 + 5x + 1$
(i) $6x^2 + 31x + 35$, (j) $6x^2 + 7x - 5$, (k) $-3x^2 + 2x + 5$, (l) $x^2 - 3x + 2$
5. $s^3 + 3s^2 - 13s - 15$
4. Factorisation
A number is said to be factorised when it is written as a product. For example, 21 can be factorised
into 7 × 3. We say that 7 and 3 are factors of 21.
Algebraic expressions can also be factorised. Consider the expression 7(2x + 1). Removing the
brackets we can rewrite this as
7(2x + 1) = 7(2x) + (7)(1) = 14x + 7.
Thus 14x + 7 is equivalent to 7(2x + 1). We see that 14x + 7 has factors 7 and (2x + 1). The
factors 7 and (2x + 1) multiply together to give 14x + 7. The process of writing an expression as a
product of its factors is called factorisation. When asked to factorise 14x + 7 we write
14x + 7 = 7(2x + 1)
and so we see that factorisation can be regarded as reversing the process of removing brackets.
Always remember that the factors of an algebraic expression are multiplied together.
Example 43
Factorise the expression 4x + 20.
Solution
Both terms in the expression 4x + 20 are examined to see if they have any factors in common.
Clearly 20 can be factorised as (4)(5) and so we can write
4x + 20 = 4x + (4)(5)
The factor 4 is common to both terms on the right; it is called a common factor and is placed at
the front and outside brackets to give
4x + 20 = 4(x + 5)
Note that the factorised form can be checked by removing the brackets again.
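For a numerical common factor, the search can be automated with the greatest common divisor. The sketch below uses `math.gcd` and `functools.reduce` from Python's standard library. The function name is hypothetical, and the sketch handles only integer coefficients, not common factors involving letters.

```python
from functools import reduce
from math import gcd

def extract_common_factor(coefficients):
    """Return (common_factor, reduced_coefficients) for integer coefficients, not all zero."""
    common = reduce(gcd, coefficients)  # gcd ignores signs, so the factor is non-negative
    return common, [c // common for c in coefficients]

print(extract_common_factor([4, 20]))   # (4, [1, 5])   i.e. 4x + 20 = 4(x + 5)
print(extract_common_factor([6, -9]))   # (3, [2, -3])  i.e. 6x - 9y = 3(2x - 3y)
print(extract_common_factor([14, 21]))  # (7, [2, 3])   i.e. 14z + 21w = 7(2z + 3w)
```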
Example 44
Factorise $z^2 - 5z$.
Solution
Note that since $z^2 = z \times z$ we can write
$z^2 - 5z = z(z) - 5z$
so that there is a common factor of $z$. Hence
$z^2 - 5z = z(z) - 5z = z(z - 5)$
Example 45
Factorise 6x − 9y.
Solution
By observation, we see that there is a common factor of 3. Thus 6x − 9y = 3(2x − 3y)
Task
Factorise 14z + 21w.
(a) Find the factor common to both 14z and 21w:

Your solution
Answer
7
(b) Now factorise 14z + 21w:
Your solution
14z + 21w =
Answer
7(2z + 3w)
Note: If you have any doubt, you can check your answer by removing the brackets again.
Task
Factorise 6x − 12xy.
First identify the two common factors:

Your solution
Answer
6 and x
Now factorise 6x − 12xy:
Your solution
6x − 12xy =
Answer
6x(1 − 2y)
Exercises
1. Factorise
(a) $5x + 15y$, (b) $3x - 9y$, (c) $2x + 12y$, (d) $4x + 32z + 16y$, (e) $\frac{1}{2}x + \frac{1}{4}y$.
In each case check your answer by removing the brackets again.
2. Factorise
(a) $a^2 + 3ab$, (b) $xy + xyz$, (c) $9x^2 - 12x$
3. Explain why $a$ is a factor of $a + ab$ but $b$ is not. Factorise $a + ab$.
4. Explain why $x^2$ is a factor of $4x^2 + 3yx^3 + 5yx^4$ but $y$ is not.
Factorise $4x^2 + 3yx^3 + 5yx^4$.
Answers
1. (a) $5(x + 3y)$, (b) $3(x - 3y)$, (c) $2(x + 6y)$, (d) $4(x + 8z + 4y)$, (e) $\frac{1}{2}(x + \frac{1}{2}y)$
2. (a) $a(a + 3b)$, (b) $xy(1 + z)$, (c) $3x(3x - 4)$.
3. $a(1 + b)$.
4. $x^2(4 + 3yx + 5yx^2)$.
5. Factorising quadratic expressions

Quadratic expressions commonly occur in many areas of mathematics, physics and engineering. Many
quadratic expressions can be written as the product of two linear factors and, in this Section, we
examine how these factors can be easily found.
Key Point 18
An expression of the form
$ax^2 + bx + c$, $a \neq 0$
where $a$, $b$ and $c$ are numbers is called a quadratic expression (in the variable $x$).
The numbers $b$ and $c$ may be zero but $a$ must not be zero (for, then, the quadratic reduces to a
linear expression or constant). The number $a$ is called the coefficient of $x^2$, $b$ is the coefficient of
$x$ and $c$ is called the constant term.
Case 1
Consider the product $(x + 1)(x + 2)$. Removing brackets yields $x^2 + 3x + 2$. Conversely, we see that
the factors of $x^2 + 3x + 2$ are $(x + 1)$ and $(x + 2)$. However, if we were given the quadratic expression
first, how would we factorise it? The following examples show how to do this but note that not all
quadratic expressions can be easily factorised.
To enable us to factorise a quadratic expression in which the coefficient of $x^2$ equals 1, we note the
following expansion:
$(x + m)(x + n) = x^2 + mx + nx + mn = x^2 + (m + n)x + mn$
So, given a quadratic expression we can think of the coefficient of $x$ as $m + n$ and the constant term
as $mn$. Once the values of $m$ and $n$ have been found the factors can be easily obtained.
Example 46
Factorise $x^2 + 4x - 5$.
Solution
Writing $x^2 + 4x - 5 = (x + m)(x + n) = x^2 + (m + n)x + mn$ we seek numbers $m$ and $n$ such
that $m + n = 4$ and $mn = -5$. By trial and error it is not difficult to find that $m = 5$ and $n = -1$
(or, the other way round, $m = -1$ and $n = 5$). So we can write
$x^2 + 4x - 5 = (x + 5)(x - 1)$
The answer can be checked easily by removing brackets.
Task
Factorise $x^2 + 6x + 8$.
As the coefficient of $x^2$ is 1, we can write
$x^2 + 6x + 8 = (x + m)(x + n) = x^2 + (m + n)x + mn$
so that m + n = 6 and mn = 8.
First, find suitable values for m and n:
Your solution
Answer
m = 4, n = 2 or, the other way round, m = 2, n = 4
Finally factorise the quadratic:
Your solution
$x^2 + 6x + 8 =$
Answer
(x + 4)(x + 2)
Case 2
When the coefficient of $x^2$ is not equal to 1 it may be possible to extract a numerical factor. For
example, note that $3x^2 + 18x + 24$ can be written as $3(x^2 + 6x + 8)$ and then factorised as in
the previous Task in Case 1. Sometimes no numerical factor can be found and a slightly different
approach may be taken. We will demonstrate a technique which can always be used to transform
the given expression into one in which the coefficient of the squared variable equals 1.
Example 47
Factorise $2x^2 + 5x + 3$.
Solution
First note the coefficient of $x^2$; in this case 2. Multiply the whole expression by this number and
rearrange as follows:
$2(2x^2 + 5x + 3) = 2(2x^2) + 2(5x) + 2(3) = (2x)^2 + 5(2x) + 6$.
We now introduce a new variable z such that z = 2x. Thus we can write
$(2x)^2 + 5(2x) + 6$ as $z^2 + 5z + 6$
This can be factorised to give $(z + 3)(z + 2)$. Returning to the original variable by replacing z by
2x we find
$2(2x^2 + 5x + 3) = (2x + 3)(2x + 2)$
A factor of 2 can be extracted from the second bracket on the right so that
$2(2x^2 + 5x + 3) = 2(2x + 3)(x + 1)$
so that
$2x^2 + 5x + 3 = (2x + 3)(x + 1)$
As an alternative to the technique of Example 47, experience and practice can often help us to
identify factors. For example suppose we wish to factorise $3x^2 + 7x + 2$. We write
$3x^2 + 7x + 2 = (\quad)(\quad)$
In order to obtain the term $3x^2$ we can place terms 3x and x in the brackets to give
$3x^2 + 7x + 2 = (3x + ?)(x + ?)$
In order to obtain the constant 2, we consider the factors of 2. These are 1, 2 or −1, −2. By placing
these factors in the brackets we can factorise the quadratic expression. Various possibilities exist: we
could write $(3x + 2)(x + 1)$ or $(3x + 1)(x + 2)$ or $(3x - 2)(x - 1)$ or $(3x - 1)(x - 2)$, only one of which
is correct. By removing brackets from each in turn we look for the factorisation which produces the
correct middle term, 7x. The correct factorisation is found to be
$3x^2 + 7x + 2 = (3x + 1)(x + 2)$
With practice you will be able to carry out this process quite easily.
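The process of trying each pair of factors in turn can also be written as a short Python search. The sketch below assumes integer coefficients with a and c non-zero; the function name `factorise_quadratic` and its return format are choices made here for illustration only.

```python
def factorise_quadratic(a, b, c):
    """Look for integers p, r, q, s with (p*x + r)(q*x + s) == a*x^2 + b*x + c.

    Assumes a != 0 and c != 0. Returns ((p, r), (q, s)) or None.
    """
    def divisors(n):
        n = abs(n)
        return [d for d in range(1, n + 1) if n % d == 0]

    for p in divisors(a):            # p*q must equal a
        q = a // p
        for r_abs in divisors(c):    # r*s must equal c
            for r in (r_abs, -r_abs):
                s = c // r
                if p * s + q * r == b:   # middle term must match
                    return (p, r), (q, s)
    return None


# 3x^2 + 7x + 2: the result encodes the two brackets (p*x + r) and (q*x + s)
print(factorise_quadratic(3, 7, 2))
```

The order of the two brackets in the result depends on the search order, so it is safest to check a result by removing brackets rather than by comparing it with a particular arrangement.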
Task
Factorise the quadratic expression $5x^2 - 7x - 6$.
Write $5x^2 - 7x - 6 = (\quad)(\quad)$
To obtain the quadratic term $5x^2$, insert 5x and x in the brackets:
$5x^2 - 7x - 6 = (5x + ?)(x + ?)$
Now find the factors of −6:
Your solution
Answer
3, −2 or −3, 2 or −6, 1 or 6, −1
Use these factors in turn to find which pair, if any, gives rise to the middle term, −7x, and complete
the factorisation:
Your solution
$5x^2 - 7x - 6 = (5x + \quad)(x + \quad) =$
Answer
$(5x + 3)(x - 2)$
On occasions you will meet expressions of the form $x^2 - y^2$ known as the difference of two squares.
It is easy to verify by removing brackets that this factorises as
$x^2 - y^2 = (x + y)(x - y)$
So, if you can learn to recognise such expressions it is an easy matter to factorise them.
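A quick numerical spot check of this identity can be made in Python. The helper name `difference_of_squares` is hypothetical, and checking a few values illustrates the identity but does not prove it.

```python
def difference_of_squares(x, y):
    """Return both sides of x^2 - y^2 = (x + y)(x - y) for given numbers."""
    left = x**2 - y**2
    right = (x + y) * (x - y)
    return left, right


# Both entries of each printed pair should agree
for x, y in [(7, 3), (-2, 5), (10, 10)]:
    print(x, y, difference_of_squares(x, y))
```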
Example 48
Factorise
(a) $x^2 - 36z^2$, (b) $25x^2 - 9z^2$, (c) $\alpha^2 - 1$
Solution
In each case the expression is a difference of two squared terms.
(a) Note that $x^2 - 36z^2 = x^2 - (6z)^2$. This factorises as $(x + 6z)(x - 6z)$.
(b) Here $25x^2 - 9z^2 = (5x)^2 - (3z)^2$. This factorises as $(5x + 3z)(5x - 3z)$.
(c) $\alpha^2 - 1 = (\alpha + 1)(\alpha - 1)$.
Exercises
1. Factorise
(a) $x^2 + 8x + 7$, (b) $x^2 + 6x - 7$, (c) $x^2 + 7x + 10$, (d) $x^2 - 6x + 9$.
2. Factorise
(a) $2x^2 + 3x + 1$, (b) $2x^2 + 4x + 2$, (c) $3x^2 - 3x - 6$, (d) $5x^2 - 4x - 1$, (e) $16x^2 - 1$,
(f) $-x^2 + 1$, (g) $-2x^2 + x + 3$.
3. Factorise
(a) $x^2 + 9x + 14$, (b) $x^2 + 11x + 18$, (c) $x^2 + 7x - 18$, (d) $x^2 + 4x - 77$,
(e) $x^2 + 2x$, (f) $3x^2 + x$, (g) $3x^2 + 4x + 1$, (h) $6x^2 + 5x + 1$,
(i) $6x^2 + 31x + 35$, (j) $6x^2 + 7x - 5$, (k) $-3x^2 + 2x + 5$, (l) $x^2 - 3x + 2$.
4. Factorise (a) $z^2 - 144$, (b) $z^2 - \frac{1}{4}$, (c) $s^2 - \frac{1}{9}$
Answers
1. (a) $(x + 7)(x + 1)$, (b) $(x + 7)(x - 1)$, (c) $(x + 2)(x + 5)$, (d) $(x - 3)(x - 3)$
2. (a) $(2x + 1)(x + 1)$, (b) $2(x + 1)^2$, (c) $3(x + 1)(x - 2)$, (d) $(5x + 1)(x - 1)$,
(e) $(4x + 1)(4x - 1)$, (f) $(x + 1)(1 - x)$, (g) $(x + 1)(3 - 2x)$
3. The factors are:
(a) $(7 + x)(2 + x)$, (b) $(9 + x)(2 + x)$, (c) $(x + 9)(x - 2)$, (d) $(x + 11)(x - 7)$,
(e) $(x + 2)x$, (f) $(3x + 1)x$, (g) $(3x + 1)(x + 1)$, (h) $(3x + 1)(2x + 1)$,
(i) $(3x + 5)(2x + 7)$, (j) $(3x + 5)(2x - 1)$, (k) $(5 - 3x)(x + 1)$, (l) $(2 - x)(1 - x)$.
4. (a) $(z + 12)(z - 12)$, (b) $(z + \frac{1}{2})(z - \frac{1}{2})$, (c) $(s + \frac{1}{3})(s - \frac{1}{3})$
Arithmetic of
Algebraic Fractions 1.4

Introduction
Just as one whole number divided by another is called a numerical fraction, so one algebraic expression
divided by another is known as an algebraic fraction. Examples are
$\frac{x}{y}$, $\frac{3x + 2y}{x - y}$, and $\frac{x^2 + 3x + 1}{x - 4}$
In this Section we explain how algebraic fractions can be simplified, added, subtracted, multiplied
and divided.

Prerequisites
• be familiar with the arithmetic of numerical fractions

Learning Outcomes
• add, subtract, multiply and divide algebraic fractions

1. Cancelling common factors

Consider the fraction $\frac{10}{35}$. To simplify it we can factorise the numerator and the denominator and then
cancel any common factors. Common factors are those factors which occur in both the numerator
and the denominator. Thus
$\frac{10}{35} = \frac{\cancel{5} \times 2}{7 \times \cancel{5}} = \frac{2}{7}$
Note that the common factor 5 has been cancelled. It is important to remember that only common
factors can be cancelled. The fractions $\frac{10}{35}$ and $\frac{2}{7}$ have identical values (they are equivalent fractions)
but $\frac{2}{7}$ is in a simpler form than $\frac{10}{35}$.
We apply the same process when simplifying algebraic fractions.
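For numerical fractions, the cancelling step can be sketched in Python using the greatest common divisor. The function name `cancel_common_factor` is an illustrative choice; the standard-library `Fraction` type performs the same reduction automatically.

```python
from fractions import Fraction
from math import gcd


def cancel_common_factor(numerator, denominator):
    """Divide numerator and denominator by their greatest common factor."""
    common = gcd(numerator, denominator)
    return numerator // common, denominator // common


print(cancel_common_factor(10, 35))  # (2, 7)
print(Fraction(10, 35))              # 2/7
```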
Example 49
Simplify, if possible,
(a) $\frac{yx}{2x}$, (b) $\frac{x}{xy}$, (c) $\frac{x}{x + y}$
Solution
(a) In the expression $\frac{yx}{2x}$, x is a factor common to both numerator and denominator. This
common factor can be cancelled to give
$\frac{y\cancel{x}}{2\cancel{x}} = \frac{y}{2}$
(b) Note that $\frac{x}{xy}$ can be written $\frac{1x}{xy}$. The common factor of x can be cancelled to give
$\frac{1\cancel{x}}{\cancel{x}y} = \frac{1}{y}$
(c) In the expression $\frac{x}{x + y}$ notice that an x appears in both numerator and denominator.
However x is not a common factor. Recall that factors of an expression are multiplied
together whereas in the denominator x is added to y. This expression cannot be
simplified.
Task
Simplify, if possible, (a) $\frac{abc}{3ac}$, (b) $\frac{3ab}{b + a}$
When simplifying remember only common factors can be cancelled.
Your solution
(a) $\frac{abc}{3ac} =$ (b) $\frac{3ab}{b + a} =$
Answer
(a) $\frac{b}{3}$ (b) This cannot be simplified.
Task
Simplify $\frac{21x^3}{14x}$.
Your solution
Answer
Factorising and cancelling common factors gives:
$\frac{21x^3}{14x} = \frac{\cancel{7} \times 3 \times \cancel{x} \times x^2}{\cancel{7} \times 2 \times \cancel{x}} = \frac{3x^2}{2}$
Task
Simplify $\frac{36x}{12x^3}$.
Your solution
Answer
Factorising and cancelling common factors gives:
$\frac{36x}{12x^3} = \frac{12 \times 3 \times x}{12 \times x \times x^2} = \frac{3}{x^2}$
Example 50
Simplify $\frac{3x + 6}{6x + 12}$.
Solution
First we factorise the numerator and the denominator to see if there are any common factors.
$\frac{3x + 6}{6x + 12} = \frac{3(x + 2)}{6(x + 2)} = \frac{3}{6} = \frac{1}{2}$
The factors x + 2 and 3 have been cancelled.
Task
Simplify $\frac{12}{2x + 8}$.
Your solution
$\frac{12}{2x + 8} =$
Answer
Factorise the numerator and denominator, and cancel any common factors. $\frac{6 \times 2}{2(x + 4)} = \frac{6}{x + 4}$
Example 51
Show that the algebraic fractions $\frac{3}{x + 1}$ and $\frac{3(x + 4)}{x^2 + 5x + 4}$ are equivalent.
Solution
The denominator, $x^2 + 5x + 4$, can be factorised as $(x + 1)(x + 4)$ so that
$\frac{3(x + 4)}{x^2 + 5x + 4} = \frac{3(x + 4)}{(x + 1)(x + 4)}$
Note that (x + 4) is a factor common to both the numerator and the denominator and can be
cancelled to leave $\frac{3}{x + 1}$. Thus $\frac{3}{x + 1}$ and $\frac{3(x + 4)}{x^2 + 5x + 4}$ are equivalent fractions.
Task
Show that $\frac{x - 1}{x^2 - 3x + 2}$ is equivalent to $\frac{1}{x - 2}$.
First factorise the denominator:
Your solution
$x^2 - 3x + 2 =$
Answer
$(x - 1)(x - 2)$
Now identify the factor common to both numerator and denominator and cancel this common factor:
Your solution
$\frac{x - 1}{(x - 1)(x - 2)} =$
Answer
$\frac{1}{x - 2}$. Hence the two given fractions are equivalent.
Example 52
Simplify $\frac{6(4 - 8x)(x - 2)}{1 - 2x}$
Solution
The factor 4 − 8x can be factorised to 4(1 − 2x). Thus
$\frac{6(4 - 8x)(x - 2)}{1 - 2x} = \frac{(6)(4)(1 - 2x)(x - 2)}{(1 - 2x)} = 24(x - 2)$
Task
Simplify $\frac{x^2 + 2x - 15}{2x^2 - 5x - 3}$
First factorise the numerator and factorise the denominator:
Your solution
$\frac{x^2 + 2x - 15}{2x^2 - 5x - 3} =$
Answer
$\frac{(x + 5)(x - 3)}{(2x + 1)(x - 3)}$
Then cancel any common factors:
Your solution
$\frac{(x + 5)(x - 3)}{(2x + 1)(x - 3)} =$
Answer
$\frac{x + 5}{2x + 1}$
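Exercises
1. Simplify, if possible,
(a) $\frac{19}{38}$, (b) $\frac{14}{28}$, (c) $\frac{35}{40}$, (d) $\frac{7}{11}$, (e) $\frac{14}{56}$
2. Simplify, if possible, (a) $\frac{14}{21}$, (b) $\frac{36}{96}$, (c) $\frac{13}{52}$, (d) $\frac{52}{13}$
3. Simplify (a) $\frac{5z}{z}$, (b) $\frac{25z}{5z}$, (c) $\frac{5}{25z^2}$, (d) $\frac{5z}{25z^2}$
4. Simplify
(a) $\frac{4x}{3x}$, (b) $\frac{15x}{x^2}$, (c) $\frac{4s}{s^3}$, (d) $\frac{21x^4}{7x^3}$
5. Simplify, if possible,
(a) $\frac{x + 1}{2(x + 1)}$, (b) $\frac{x + 1}{2x + 2}$, (c) $\frac{2(x + 1)}{x + 1}$, (d) $\frac{3x + 3}{x + 1}$, (e) $\frac{5x - 15}{5}$, (f) $\frac{5x - 15}{x - 3}$.
6. Simplify, if possible,
(a) $\frac{5x + 15}{25x + 5}$, (b) $\frac{5x + 15}{25x}$, (c) $\frac{5x + 15}{25}$, (d) $\frac{5x + 15}{25x + 1}$
7. Simplify (a) $\frac{x^2 + 10x + 9}{x^2 + 8x - 9}$, (b) $\frac{x^2 - 9}{x^2 + 4x - 21}$, (c) $\frac{2x^2 - x - 1}{2x^2 + 5x + 2}$,
(d) $\frac{3x^2 - 4x + 1}{x^2 - x}$, (e) $\frac{5z^2 - 20z}{2z - 8}$
8. Simplify (a) $\frac{6}{3x + 9}$, (b) $\frac{2x}{4x^2 + 2x}$, (c) $\frac{3x^2}{15x^3 + 10x^2}$
9. Simplify (a) $\frac{x^2 - 1}{x^2 + 5x + 4}$, (b) $\frac{x^2 + 5x + 6}{x^2 + x - 6}$.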
Answers
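1. (a) $\frac{1}{2}$, (b) $\frac{1}{2}$, (c) $\frac{7}{8}$, (d) $\frac{7}{11}$, (e) $\frac{1}{4}$.
2. (a) $\frac{2}{3}$, (b) $\frac{3}{8}$, (c) $\frac{1}{4}$, (d) 4
3. (a) 5, (b) 5, (c) $\frac{1}{5z^2}$, (d) $\frac{1}{5z}$.
4. (a) $\frac{4}{3}$, (b) $\frac{15}{x}$, (c) $\frac{4}{s^2}$, (d) $3x$
5. (a) $\frac{1}{2}$, (b) $\frac{1}{2}$, (c) 2, (d) 3, (e) $x - 3$, (f) 5
6. (a) $\frac{x + 3}{5x + 1}$, (b) $\frac{x + 3}{5x}$, (c) $\frac{x + 3}{5}$, (d) $\frac{5(x + 3)}{25x + 1}$
7. (a) $\frac{x + 1}{x - 1}$, (b) $\frac{x + 3}{x + 7}$, (c) $\frac{x - 1}{x + 2}$, (d) $\frac{3x - 1}{x}$, (e) $\frac{5z}{2}$
8. (a) $\frac{2}{x + 3}$, (b) $\frac{1}{2x + 1}$, (c) $\frac{3}{5(3x + 2)}$.
9. (a) $\frac{x - 1}{x + 4}$, (b) $\frac{x + 2}{x - 2}$.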
2. Multiplication and division of algebraic fractions

To multiply together two fractions (numerical or algebraic) we multiply their numerators together
and then multiply their denominators together. That is
Key Point 19
Multiplication of fractions
$\frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd}$
Any factors common to both numerator and denominator can be cancelled. This cancellation can be
performed before or after the multiplication.
To divide one fraction by another (numerical or algebraic) we invert the second fraction and then
multiply.
Key Point 20
Division of fractions
$\frac{a}{b} \div \frac{c}{d} = \frac{a}{b} \times \frac{d}{c} = \frac{ad}{bc}, \quad b \neq 0,\ c \neq 0,\ d \neq 0$
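The two rules can be tried out on numerical fractions with a short Python sketch. The function names `multiply` and `divide` are assumptions for this illustration; Python's `Fraction` type cancels common factors automatically.

```python
from fractions import Fraction


def multiply(a, b, c, d):
    """(a/b) x (c/d) = ac/bd."""
    return Fraction(a * c, b * d)


def divide(a, b, c, d):
    """(a/b) / (c/d) = (a/b) x (d/c) = ad/bc, with b, c and d non-zero."""
    return Fraction(a * d, b * c)


print(multiply(5, 9, 3, 2))  # 5/6
print(divide(5, 9, 3, 2))    # 10/27
```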
Example 53
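Simplify (a) $\frac{2a}{c} \times \frac{4}{c}$, (b) $\frac{2a}{c} \times \frac{c}{4}$, (c) $\frac{2a}{c} \div \frac{4}{c}$
Solution
(a) $\frac{2a}{c} \times \frac{4}{c} = \frac{8a}{c^2}$
(b) $\frac{2a}{c} \times \frac{c}{4} = \frac{2ac}{4c} = \frac{2a}{4} = \frac{a}{2}$
(c) Division is performed by inverting the second fraction and then multiplying.
$\frac{2a}{c} \div \frac{4}{c} = \frac{2a}{c} \times \frac{c}{4} = \frac{a}{2}$ (from the result in (b))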
Example 54
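Simplify (a) $\frac{1}{5x} \times 3x$, (b) $\frac{1}{x} \times x$.
Solution
(a) Note that $3x = \frac{3x}{1}$. Then $\frac{1}{5x} \times 3x = \frac{1}{5x} \times \frac{3x}{1} = \frac{3x}{5x} = \frac{3}{5}$
(b) x can be written as $\frac{x}{1}$. Then $\frac{1}{x} \times x = \frac{1}{x} \times \frac{x}{1} = \frac{x}{x} = 1$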
Task
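Simplify (a) $\frac{1}{y} \times x$, (b) $\frac{y}{x} \times x$.
Your solution
Answer
(a) $\frac{1}{y} \times x = \frac{1}{y} \times \frac{x}{1} = \frac{x}{y}$
(b) $\frac{y}{x} \times x = \frac{y}{x} \times \frac{x}{1} = \frac{yx}{x} = y$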
Example 55
Simplify $\dfrac{\;\frac{2x}{y}\;}{\;\frac{3x}{2y}\;}$
Solution
We can write the fraction as $\frac{2x}{y} \div \frac{3x}{2y}$.
Inverting the second fraction and multiplying we find
$\frac{2x}{y} \times \frac{2y}{3x} = \frac{4xy}{3xy} = \frac{4}{3}$
Example 56
Simplify $\frac{4x + 2}{x^2 + 4x + 3} \times \frac{x + 3}{7x + 5}$
Solution
Factorising the numerator and denominator we find
$\frac{4x + 2}{x^2 + 4x + 3} \times \frac{x + 3}{7x + 5} = \frac{2(2x + 1)}{(x + 1)(x + 3)} \times \frac{x + 3}{7x + 5} = \frac{2(2x + 1)(x + 3)}{(x + 1)(x + 3)(7x + 5)}$
$= \frac{2(2x + 1)}{(x + 1)(7x + 5)}$
It is usually better to factorise first and cancel any common factors before multiplying. Don’t remove
any brackets unnecessarily, otherwise common factors will be difficult to spot.
Task
Simplify
$\frac{15}{3x - 1} \div \frac{3}{2x + 1}$
Your solution
Answer
To divide we invert the second fraction and multiply:
$\frac{15}{3x - 1} \div \frac{3}{2x + 1} = \frac{15}{3x - 1} \times \frac{2x + 1}{3} = \frac{(5)(3)(2x + 1)}{3(3x - 1)} = \frac{5(2x + 1)}{3x - 1}$
Exercises
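1. Simplify (a) $\frac{5}{9} \times \frac{3}{2}$, (b) $\frac{14}{3} \times \frac{3}{9}$, (c) $\frac{6}{11} \times \frac{3}{4}$, (d) $\frac{4}{7} \times \frac{28}{3}$
2. Simplify (a) $\frac{5}{9} \div \frac{3}{2}$, (b) $\frac{14}{3} \div \frac{3}{9}$, (c) $\frac{6}{11} \div \frac{3}{4}$, (d) $\frac{4}{7} \div \frac{28}{3}$
3. Simplify
(a) $2 \times \frac{x + y}{3}$, (b) $\frac{1}{3} \times 2(x + y)$, (c) $\frac{2}{3} \times (x + y)$
4. Simplify
(a) $3 \times \frac{x + 4}{7}$, (b) $\frac{1}{7} \times 3(x + 4)$, (c) $\frac{3}{7} \times (x + 4)$, (d) $\frac{x}{y} \times \frac{x + 1}{y + 1}$, (e) $\frac{1}{y} \times \frac{x^2 + x}{y + 1}$,
(f) $\frac{\pi d^2}{4} \times \frac{Q}{\pi d^2}$, (g) $\frac{Q}{\pi d^2/4}$
5. Simplify $\frac{6/7}{s + 3}$
6. Simplify $\frac{3}{x + 2} \div \frac{x}{2x + 4}$
7. Simplify $\frac{5}{2x + 1} \div \frac{x}{3x - 1}$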
Answers
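1. (a) $\frac{5}{6}$, (b) $\frac{14}{9}$, (c) $\frac{9}{22}$, (d) $\frac{16}{3}$
2. (a) $\frac{10}{27}$, (b) 14, (c) $\frac{8}{11}$, (d) $\frac{3}{49}$
3. (a) $\frac{2(x + y)}{3}$, (b) $\frac{2(x + y)}{3}$, (c) $\frac{2(x + y)}{3}$
4. (a) $\frac{3(x + 4)}{7}$, (b) $\frac{3(x + 4)}{7}$, (c) $\frac{3(x + 4)}{7}$, (d) $\frac{x(x + 1)}{y(y + 1)}$, (e) $\frac{x(x + 1)}{y(y + 1)}$, (f) $Q/4$,
(g) $\frac{4Q}{\pi d^2}$
5. $\frac{6}{7(s + 3)}$
6. $\frac{6}{x}$
7. $\frac{5(3x - 1)}{x(2x + 1)}$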
3. Addition and subtraction of algebraic fractions

To add two algebraic fractions the lowest common denominator must be found first. This is the
simplest algebraic expression that has the given denominators as its factors. All fractions must be
written with this lowest common denominator. Their sum is found by adding the numerators and
dividing the result by the lowest common denominator.
To subtract two fractions the process is similar. The fractions are written with the lowest common
denominator. The difference is found by subtracting the numerators and dividing the result by the
lowest common denominator.
Example 57
State the simplest expression which has x + 1 and x + 4 as its factors.
Solution
The simplest expression is (x + 1)(x + 4). Note that both x + 1 and x + 4 are factors.
Example 58
State the simplest expression which has $x - 1$ and $(x - 1)^2$ as its factors.
Solution
The simplest expression is $(x - 1)^2$. Clearly $(x - 1)^2$ must be a factor of this expression. Also,
because we can write $(x - 1)^2 = (x - 1)(x - 1)$ it follows that $x - 1$ is a factor too.
Example 59
Express as a single fraction $\frac{3}{x + 1} + \frac{2}{x + 4}$
Solution
The simplest expression which has both denominators as its factors is $(x + 1)(x + 4)$. This is the
lowest common denominator. Both fractions must be written using this denominator. Note that
$\frac{3}{x + 1}$ is equivalent to $\frac{3(x + 4)}{(x + 1)(x + 4)}$ and $\frac{2}{x + 4}$ is equivalent to $\frac{2(x + 1)}{(x + 1)(x + 4)}$. Thus writing
both fractions with the same denominator we have
$\frac{3}{x + 1} + \frac{2}{x + 4} = \frac{3(x + 4)}{(x + 1)(x + 4)} + \frac{2(x + 1)}{(x + 1)(x + 4)}$
The sum is found by adding the numerators and dividing the result by the lowest common denominator.
$\frac{3(x + 4)}{(x + 1)(x + 4)} + \frac{2(x + 1)}{(x + 1)(x + 4)} = \frac{3(x + 4) + 2(x + 1)}{(x + 1)(x + 4)} = \frac{5x + 14}{(x + 1)(x + 4)}$
Key Point 21
Addition of two algebraic fractions
Step 1: Find the lowest common denominator
Step 2: Express each fraction with this denominator
Step 3: Add the numerators and divide the result by the lowest common denominator
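The three steps can be illustrated for numerical fractions with a small Python sketch. The function name `add_fractions` is a hypothetical choice, and positive integer denominators are assumed. Substituting a few values of x then gives a spot check (not a proof) of the result of Example 59.

```python
from fractions import Fraction
from math import gcd


def add_fractions(n1, d1, n2, d2):
    """Add n1/d1 + n2/d2 following the three steps (positive denominators assumed)."""
    lcd = d1 * d2 // gcd(d1, d2)            # Step 1: lowest common denominator
    new_n1 = n1 * (lcd // d1)               # Step 2: rewrite each fraction
    new_n2 = n2 * (lcd // d2)
    return Fraction(new_n1 + new_n2, lcd)   # Step 3: add the numerators


# Spot check of 3/(x+1) + 2/(x+4) = (5x+14)/((x+1)(x+4)) at x = 0, 1, 2, 3
for x in range(4):
    left = add_fractions(3, x + 1, 2, x + 4)
    right = Fraction(5 * x + 14, (x + 1) * (x + 4))
    print(x, left, right)
```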
Example 60
Express $\frac{1}{x - 1} + \frac{5}{(x - 1)^2}$ as a single fraction.
Solution
The simplest expression having both denominators as its factors is $(x - 1)^2$. We write both fractions
with this denominator.
$\frac{1}{x - 1} + \frac{5}{(x - 1)^2} = \frac{x - 1}{(x - 1)^2} + \frac{5}{(x - 1)^2} = \frac{x - 1 + 5}{(x - 1)^2} = \frac{x + 4}{(x - 1)^2}$
Task
Express $\frac{3}{x + 7} + \frac{5}{x + 2}$ as a single fraction.
First find the lowest common denominator:
Your solution
Answer
$(x + 7)(x + 2)$
Re-write both fractions using this lowest common denominator:
Your solution
$\frac{3}{x + 7} + \frac{5}{x + 2} =$
Answer
$\frac{3(x + 2)}{(x + 7)(x + 2)} + \frac{5(x + 7)}{(x + 7)(x + 2)}$
Finally, add the numerators and simplify:
Your solution
$\frac{3}{x + 7} + \frac{5}{x + 2} =$
Answer
$\frac{8x + 41}{(x + 7)(x + 2)}$
Example 61
Express $\frac{5x}{7} - \frac{3x - 4}{2}$ as a single fraction.
Solution
In this example both denominators are simply numbers. The lowest common denominator is 14, and
both fractions are re-written with this denominator. Thus
$\frac{5x}{7} - \frac{3x - 4}{2} = \frac{10x}{14} - \frac{7(3x - 4)}{14} = \frac{10x - 7(3x - 4)}{14} = \frac{28 - 11x}{14}$
Task
Express $\frac{1}{x} + \frac{1}{y}$ as a single fraction.
Your solution
Answer
The simplest expression which has x and y as its factors is xy. This is the lowest common
denominator. Both fractions are written using this denominator. Noting that $\frac{1}{x} = \frac{y}{xy}$ and that $\frac{1}{y} = \frac{x}{xy}$
we find
$\frac{1}{x} + \frac{1}{y} = \frac{y}{xy} + \frac{x}{xy} = \frac{y + x}{xy}$
No cancellation is now possible because neither x nor y is a factor of the numerator.
Exercises
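1. Simplify (a) $\frac{x}{4} + \frac{x}{7}$, (b) $\frac{2x}{5} + \frac{x}{9}$, (c) $\frac{2x}{3} - \frac{3x}{4}$, (d) $\frac{x}{x + 1} - \frac{2}{x + 2}$, (e) $\frac{x + 1}{x} + \frac{3}{x + 2}$,
(f) $\frac{2x + 1}{3} - \frac{x}{2}$, (g) $\frac{x + 3}{2x + 1} - \frac{x}{3}$, (h) $\frac{x}{4} - \frac{x}{5}$
2. Find
(a) $\frac{1}{x + 2} + \frac{2}{x + 3}$, (b) $\frac{2}{x + 3} + \frac{5}{x + 1}$, (c) $\frac{2}{2x + 1} - \frac{3}{3x + 2}$, (d) $\frac{x + 1}{x + 3} + \frac{x + 4}{x + 2}$,
(e) $\frac{x - 1}{x - 3} + \frac{x - 1}{(x - 3)^2}$.
3. Find $\frac{5}{2x + 3} + \frac{4}{(2x + 3)^2}$.
4. Find $\frac{1}{7}s + \frac{11}{21}$
5. Express $\frac{A}{2x + 3} + \frac{B}{x + 1}$ as a single fraction.
6. Express $\frac{A}{2x + 5} + \frac{B}{(x - 1)} + \frac{C}{(x - 1)^2}$ as a single fraction.
7. Express $\frac{A}{x + 1} + \frac{B}{(x + 1)^2}$ as a single fraction.
8. Express $\frac{Ax + B}{x^2 + x + 10} + \frac{C}{x - 1}$ as a single fraction.
9. Express $Ax + B + \frac{C}{x + 1}$ as a single fraction.
10. Show that $\dfrac{x_1}{\frac{1}{x_3} - \frac{1}{x_2}}$ is equal to $\frac{x_1 x_2 x_3}{x_2 - x_3}$.
11. Find (a) $\frac{3x}{4} - \frac{x}{5} + \frac{x}{3}$, (b) $\frac{3x}{4} - \left(\frac{x}{5} + \frac{x}{3}\right)$.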
Answers
1. (a) $\frac{11x}{28}$, (b) $\frac{23x}{45}$, (c) $-\frac{x}{12}$, (d) $\frac{x^2 - 2}{(x + 1)(x + 2)}$, (e) $\frac{x^2 + 6x + 2}{x(x + 2)}$,
(f) $\frac{x + 2}{6}$, (g) $\frac{9 + 2x - 2x^2}{3(2x + 1)}$, (h) $\frac{x}{20}$
2. (a) $\frac{3x + 7}{(x + 2)(x + 3)}$, (b) $\frac{7x + 17}{(x + 3)(x + 1)}$, (c) $\frac{1}{(2x + 1)(3x + 2)}$,
(d) $\frac{2x^2 + 10x + 14}{(x + 3)(x + 2)}$, (e) $\frac{x^2 - 3x + 2}{(x - 3)^2}$
3. $\frac{10x + 19}{(2x + 3)^2}$
4. $\frac{3s + 11}{21}$
5. $\frac{A(x + 1) + B(2x + 3)}{(2x + 3)(x + 1)}$
6. $\frac{A(x - 1)^2 + B(x - 1)(2x + 5) + C(2x + 5)}{(2x + 5)(x - 1)^2}$
7. $\frac{A(x + 1) + B}{(x + 1)^2}$
8. $\frac{(Ax + B)(x - 1) + C(x^2 + x + 10)}{(x - 1)(x^2 + x + 10)}$
9. $\frac{(Ax + B)(x + 1) + C}{x + 1}$
11. (a) $\frac{53x}{60}$, (b) $\frac{13x}{60}$
Formulae and
Transposition 1.5
Introduction
Formulae are used frequently in almost all aspects of engineering in order to relate a physical quantity
to one or more others. Many well-known physical laws are described using formulae. For example,
you may have already seen Ohm’s law, v = iR, or Newton’s second law of motion, F = ma.
In this Section we describe the process of evaluating a formula, explain what is meant by the subject
of a formula, and show how a formula is rearranged or transposed. These are basic skills required in
all aspects of engineering.

Prerequisites
• be able to add, subtract, multiply and divide algebraic fractions

Learning Outcomes
• transpose a formula

1. Using formulae and substitution

In the study of engineering, physical quantities can be related to each other using a formula. The
formula will contain variables and constants which represent the physical quantities. To evaluate a
formula we must substitute numbers in place of the variables.
For example, Ohm’s law provides a formula for relating the voltage, v, across a resistor with resistance
value, R, to the current through it, i. The formula states
v = iR
We can use this formula to calculate v if we know values for i and R. For example, if i = 13 A, and
R = 5 Ω, then
v = iR
= (13)(5)
= 65
The voltage is 65 V.
Note that it is important to pay attention to the units of any physical quantities involved. Unless a
consistent set of units is used a formula is not valid.
Example 62
The kinetic energy, K, of an object of mass M moving with speed v can be
calculated from the formula, $K = \frac{1}{2}Mv^2$.
Calculate the kinetic energy of an object of mass 5 kg moving with a speed of 2 m s$^{-1}$.
Solution
In this example M = 5 and v = 2. Substituting these values into the formula we find
$K = \frac{1}{2}Mv^2 = \frac{1}{2}(5)(2^2) = 10$
In the SI system the unit of energy is the joule. Hence the kinetic energy of the object is 10 joules.
Task
The area, A, of the circle of radius r can be calculated from the formula $A = \pi r^2$.
If we know the diameter of the circle, d, we can use the equivalent formula $A = \frac{\pi d^2}{4}$.
Find the area of a circle having diameter 0.1 m. Your calculator will be
preprogrammed with the value of π.
Your solution
A =
Answer
$A = \frac{\pi(0.1)^2}{4} = 0.0079$ m$^2$
Example 63
The volume, V , of a circular cylinder is equal to its cross-sectional area, A, times
its length, h.
Find the volume of a cylinder having diameter 0.1 m and length 0.3 m.
Solution
We can use the result of the previous Task to obtain the cross-sectional area $A = \frac{\pi d^2}{4}$. Then
$V = Ah = \frac{\pi(0.1)^2}{4} \times 0.3 = 0.0024$
The volume is 0.0024 m$^3$.
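Evaluating a formula by substitution maps directly onto a small function. The sketch below repeats the calculation of Example 63 in Python; the function names are illustrative assumptions, and lengths are taken to be in metres.

```python
from math import pi


def circle_area(d):
    """Area of a circle of diameter d, A = pi * d^2 / 4."""
    return pi * d**2 / 4


def cylinder_volume(d, h):
    """Volume of a circular cylinder, V = A * h."""
    return circle_area(d) * h


print(round(circle_area(0.1), 4))           # 0.0079 (m^2)
print(round(cylinder_volume(0.1, 0.3), 4))  # 0.0024 (m^3)
```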
2. Rearranging a formula
In the formula for the area of a circle, $A = \pi r^2$, we say that A is the subject of the formula. A
variable is the subject of the formula if it appears by itself on one side of the formula, usually the
left-hand side, and nowhere else in the formula. If we are asked to transpose the formula for
r, or solve for r, then we have to make r the subject of the formula. When transposing a formula
whatever is done to one side is done to the other. There are five rules that must be adhered to.
Key Point 22
Rearranging a formula
You may carry out the following operations
• add the same quantity to both sides of the formula
• subtract the same quantity from both sides of the formula
• multiply both sides of the formula by the same quantity
• divide both sides of the formula by the same quantity
• take a ‘function’ of both sides of the formula: for example,
find the reciprocal of both sides (i.e. invert).
Example 64
Transpose the formula p = 5t − 17 for t.
Solution
We must obtain t on its own on the left-hand side. We do this in stages by using one or more of
the five rules in Key Point 22. For example, by adding 17 to both sides of p = 5t − 17 we find
p + 17 = 5t − 17 + 17
so that p + 17 = 5t
Dividing both sides by 5 we obtain t on its own:
$\frac{p + 17}{5} = t$
so that $t = \frac{p + 17}{5}$.
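One way to check a transposition is to confirm that the original and rearranged formulae undo each other. The following Python sketch does this for Example 64; the function names `p_from_t` and `t_from_p` are chosen here for illustration.

```python
def p_from_t(t):
    """Original formula: p = 5t - 17."""
    return 5 * t - 17


def t_from_p(p):
    """Transposed formula: t = (p + 17) / 5."""
    return (p + 17) / 5


# Applying one formula and then the other should return the starting value
for t in [-2, 0, 4]:
    print(t, t_from_p(p_from_t(t)))
```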
Example 65
Transpose the formula $\sqrt{2q} = p$ for q.
Solution
First we square both sides to remove the square root. Note that $(\sqrt{2q})^2 = 2q$. This gives
$2q = p^2$
Second we divide both sides by 2 to get $q = \frac{p^2}{2}$.
Note that, in general, squaring both sides of an equation may introduce extra solutions not valid
for the original equation. In Example 65 if p = 2 then q = 2 is the only solution. However, if we
transform to $q = \frac{p^2}{2}$, then if q = 2, p can be +2 or −2.
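The extra solution introduced by squaring can be seen numerically. In the Python sketch below (function names are assumptions for this illustration), both p = 2 and p = −2 give q = 2 in the transposed formula, but the original formula returns only p = 2 from q = 2.

```python
from math import sqrt


def p_from_q(q):
    """Original formula: p = sqrt(2q), which is never negative."""
    return sqrt(2 * q)


def q_from_p(p):
    """Transposed formula: q = p^2 / 2."""
    return p**2 / 2


print(q_from_p(2), q_from_p(-2))  # both values of p give the same q
print(p_from_q(2))                # the original formula gives only the positive p
```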
Task
Transpose the formula $v = \sqrt{t^2 + w}$ for w.
You must obtain w on its own on the left-hand side. Do this in several stages.
First square both sides to remove the square root:
Your solution
Answer
$v^2 = t^2 + w$
Then, subtract $t^2$ from both sides to obtain an expression for w:
Your solution
Answer
$v^2 - t^2 = w$
Finally, write down the formula for w:
Your solution
Answer
$w = v^2 - t^2$
Example 66
Transpose $x = \frac{1}{y}$ for y.
Solution
We must try to obtain an expression for y. Multiplying both sides by y has the effect of removing
this fraction:
Multiply both sides of $x = \frac{1}{y}$ by y to get
$yx = y \times \frac{1}{y}$
so that $yx = 1$
Divide both sides by x to leave y on its own, $y = \frac{1}{x}$.
Alternatively: simply invert both sides of the equation $x = \frac{1}{y}$ to get $\frac{1}{x} = y$.
Example 67
Make R the subject of the formula
$\frac{2}{R} = \frac{3}{x + y}$
Solution
In the given form R appears in a fraction. Inverting both sides gives
$\frac{R}{2} = \frac{x + y}{3}$
Thus multiplying both sides by 2 gives
$R = \frac{2(x + y)}{3}$
Task
Make R the subject of the formula $\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}$.
(a) Add the two terms on the right:
Your solution
Answer
$\frac{1}{R_1} + \frac{1}{R_2} = \frac{R_2 + R_1}{R_1 R_2}$
(b) Write down the complete formula:
Your solution
Answer
$\frac{1}{R} = \frac{R_2 + R_1}{R_1 R_2}$
(c) Now invert both sides:
Your solution
Answer
$R = \frac{R_1 R_2}{R_2 + R_1}$
Heat flow in an insulated metal plate
Introduction
Thermal insulation is important in many domestic (e.g. central heating) and industrial (e.g. cooling
and heating) situations. Although many real situations involve heat flow in more than one dimension,
we consider only a one dimensional case here. The flow of heat is determined by temperature and
thermal conductivity. It is possible to model the amount of heat Q (J) crossing point x in one
dimension (the heat flow in the x direction) from temperature $T_2$ (K) to temperature $T_1$ (K) (in
which $T_2 > T_1$) in time t s by
$\frac{Q}{t} = \lambda A \frac{T_2 - T_1}{x}$,
where λ is the thermal conductivity in W m$^{-1}$ K$^{-1}$.
Problem in words
Suppose that the upper and lower sides of a metal plate connecting two containers are insulated and
one end is maintained at a temperature $T_2$ (K) (see Figure 7).
The plate is assumed to be infinite in the direction perpendicular to the sheet of paper.
[Figure 7: A laterally insulated metal plate. Heat flows along the metal plate from Container 2 (temperature $T_2$) to Container 1 (temperature $T_1$), with an insulator above and below the plate.]
(a) Find a formula for $T_2$.
(b) If λ = 205 (W m$^{-1}$ K$^{-1}$), $T_1$ = 300 (K), A = 0.004 (m$^2$), x = 0.5 (m), calculate the
value of $T_2$ required to achieve a heat flow of 100 J s$^{-1}$.

(a) Given $\frac{Q}{t} = \lambda A \frac{T_2 - T_1}{x}$ express $T_2$ as the subject of the formula.
(b) In the formula found in part (a) substitute λ = 205, $T_1$ = 300, A = 0.004, x = 0.5 and
$\frac{Q}{t} = 100$ to find $T_2$.

(a) $\frac{Q}{t} = \lambda A \frac{T_2 - T_1}{x}$
Divide both sides by λA
$\frac{Q}{t\lambda A} = \frac{T_2 - T_1}{x}$
Multiply both sides by x
$\frac{Qx}{t\lambda A} = T_2 - T_1$
Add $T_1$ to both sides
$\frac{Qx}{t\lambda A} + T_1 = T_2$
which is equivalent to
$T_2 = \frac{Qx}{t\lambda A} + T_1$
(b) Substitute λ = 205, $T_1$ = 300, A = 0.004, x = 0.5 and $\frac{Q}{t} = 100$ to find $T_2$:
$T_2 = \frac{100 \times 0.5}{205 \times 0.004} + 300 \approx 61.0 + 300 = 361.0$
So the temperature in container 2 is 361 K to 3 sig.fig.
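The substitution in part (b) can be reproduced with a short Python sketch. The function name `required_t2` and its parameter names are assumptions made for this illustration; the quantities are taken in the units given in the problem.

```python
def required_t2(heat_rate, x, conductivity, area, t1):
    """T2 = (Q/t) * x / (lambda * A) + T1, from transposing Q/t = lambda*A*(T2 - T1)/x."""
    return heat_rate * x / (conductivity * area) + t1


# Q/t = 100 J/s, x = 0.5 m, lambda = 205, A = 0.004 m^2, T1 = 300 K
t2 = required_t2(heat_rate=100, x=0.5, conductivity=205, area=0.004, t1=300)
print(round(t2, 1))  # about 361.0 (K)
```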
Interpretation
The formula $T_2 = \frac{Qx}{t\lambda A} + T_1$ can be used to find a value for $T_2$ that would achieve any desired heat
flow. In the example given $T_2$ would need to be about 361 K (≈ 88 °C) to produce a heat flow of
100 J s$^{-1}$.
Exercises
1. The formula for the volume of a cylinder is $V = \pi r^2 h$. Find V when r = 5 cm and h = 15
cm.
2. If $R = 5p^2$, find R when (a) p = 10, (b) p = 16.
3. For the following formulae, find y at the given values of x.
(a) y = 3x + 2, x = −1, x = 0, x = 1.
(b) y = −4x + 7, x = −2, x = 0, x = 1.
(c) $y = x^2$, x = −2, x = −1, x = 0, x = 1, x = 2.
4. If $P = \frac{3}{QR}$ find P if Q = 15 and R = 0.300.
5. If $y = \sqrt{\frac{x}{z}}$ find y if x = 13.200 and z = 15.600.
6. Evaluate $M = \frac{\pi}{2r + s}$ when r = 23.700 and s = −0.2.
7. To convert a length measured in metres to one measured in centimetres, the length in metres
is multiplied by 100. Convert the following lengths to cm. (a) 5 m, (b) 0.5 m, (c) 56.2 m.
8. To convert an area measured in m$^2$ to one measured in cm$^2$, the area in m$^2$ is multiplied by
$10^4$. Convert the following areas to cm$^2$. (a) 5 m$^2$, (b) 0.33 m$^2$, (c) 6.2 m$^2$.
9. To convert a volume measured in m$^3$ to one measured in cm$^3$, the volume in m$^3$ is multiplied
by $10^6$. Convert the following volumes to cm$^3$. (a) 15 m$^3$, (b) 0.25 m$^3$, (c) 8.2 m$^3$.
10. If $\eta = \frac{4Q_P}{\pi d^2 L n}$ evaluate η when $Q_P$ = 0.0003, d = 0.05, L = 0.1 and n = 2.
11. The moment of inertia of an object is a measure of its resistance to rotation. It depends upon
both the mass of the object and the distribution of mass about the axis of rotation. It can be
shown that the moment of inertia, J, of a solid disc rotating about an axis through its centre
and perpendicular to the plane of the disc, is given by the formula
$J = \frac{1}{2}Ma^2$
where M is the mass of the disc and a is its radius. Find the moment of inertia of a disc of
mass 12 kg and diameter 10 m. The SI unit of moment of inertia is kg m$^2$.
12. Transpose the given formulae to make the given variable the subject.
(a) y = 3x − 7, for x, (b) 8y + 3x = 4, for x, (c) 8x + 3y = 4 for y,
(d) 13 − 2x − 7y = 0 for x.
13. Transpose the formula P V = RT for (a) V , (b) P , (c) R, (d) T .
14. Transpose $v = \sqrt{x + 2y}$, (a) for x, (b) for y.
15. Transpose 8u + 4v − 3w = 17 for each of u, v and w.
16. When a ball is dropped from rest onto a horizontal surface it will bounce before eventually
coming to rest after a time T where
$T = \frac{2v}{g}\left(\frac{1}{1 - e}\right)$
where v is the speed immediately after the first impact, and g is a constant called the
acceleration due to gravity. Transpose this formula to make e, the coefficient of restitution, the
subject.
17. Transpose $q = A_1\sqrt{\frac{2gh}{(A_1/A_2)^2 - 1}}$ for $A_2$.
18. Make x the subject of (a) $y = \frac{r + x}{1 - rx}$, (b) $y = \sqrt{\frac{x - 1}{x + 1}}$.
19. In the design of orifice plate flowmeters, the volumetric flowrate, Q (m$^3$ s$^{-1}$), is given by
$Q = C_d A_o \sqrt{\frac{2g\Delta h}{1 - A_o^2/A_p^2}}$
where $C_d$ is a dimensionless discharge coefficient, ∆h (m) is the head difference across the
orifice plate and $A_o$ (m$^2$) is the area of the orifice and $A_p$ (m$^2$) is the area of the pipe.
(a) Rearrange the equation to solve for the area of the orifice, $A_o$, in terms of the other
variables.
(b) A volumetric flowrate of 100 cm$^3$ s$^{-1}$ passes through a 10 cm inside diameter pipe.
Assuming a discharge coefficient of 0.6, calculate the required orifice diameter, so that
the head difference across the orifice plate is 200 mm.
[Hint: be very careful with the units!]
Answers
1. 1178.1 cm$^3$
2. (a) 500, (b) 1280
3. (a) −1, 2, 5, (b) 15, 7, 3, (c) 4, 1, 0, 1, 4
4. P = 0.667
5. y = 0.920
6. M = 0.067
7. (a) 500 cm, (b) 50 cm, (c) 5620 cm.
8. (a) 50000 cm$^2$, (b) 3300 cm$^2$, (c) 62000 cm$^2$.
9. (a) 15000000 cm$^3$, (b) 250000 cm$^3$, (c) 8200000 cm$^3$.
10. η = 0.764.
11. 150 kg m$^2$
12. (a) $x = \frac{y + 7}{3}$, (b) $x = \frac{4 - 8y}{3}$, (c) $y = \frac{4 - 8x}{3}$, (d) $x = \frac{13 - 7y}{2}$
13. (a) $V = \frac{RT}{P}$, (b) $P = \frac{RT}{V}$, (c) $R = \frac{PV}{T}$, (d) $T = \frac{PV}{R}$
14. (a) $x = v^2 - 2y$, (b) $y = \frac{v^2 - x}{2}$
15. $u = \frac{17 - 4v + 3w}{8}$, $v = \frac{17 - 8u + 3w}{4}$, $w = \frac{8u + 4v - 17}{3}$
16. $e = 1 - \frac{2v}{gT}$
17. $A_2 = \pm\sqrt{\frac{A_1^2 q^2}{2A_1^2 gh + q^2}}$
18. (a) $x = \frac{y - r}{1 + yr}$, (b) $x = \frac{1 + y^2}{1 - y^2}$
19.
(a) $A_o = \frac{Q A_p}{\sqrt{Q^2 + 2g\Delta h A_p^2 C_d^2}}$
(b) $Q = 100$ cm$^3$ s$^{-1}$ $= 10^{-4}$ m$^3$ s$^{-1}$
$A_p = \pi \frac{0.1^2}{4} = 0.007854$ m$^2$
$C_d = 0.6$
$\Delta h = 0.2$ m
$g = 9.81$ m s$^{-2}$
Substituting in answer (a) gives
$A_o = 8.4132 \times 10^{-5}$ m$^2$
so diameter $= \sqrt{\frac{4A_o}{\pi}} = 0.01035$ m = 1.035 cm
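The numerical part of answer 19 can be checked with a few lines of Python. This is a sketch only: the function name `orifice_area` and its parameter names are assumptions for illustration, and all quantities are converted to SI units before substitution, as the hint in the exercise advises.

```python
from math import pi, sqrt


def orifice_area(q, a_pipe, c_d, delta_h, g=9.81):
    """A_o = Q * A_p / sqrt(Q^2 + 2 g dh A_p^2 C_d^2), all quantities in SI units."""
    return q * a_pipe / sqrt(q**2 + 2 * g * delta_h * a_pipe**2 * c_d**2)


q = 100e-6                 # 100 cm^3/s expressed in m^3/s
a_pipe = pi * 0.1**2 / 4   # pipe of inside diameter 10 cm = 0.1 m
a_orifice = orifice_area(q, a_pipe, c_d=0.6, delta_h=0.2)
diameter = sqrt(4 * a_orifice / pi)

print(a_orifice)  # orifice area in m^2
print(diameter)   # orifice diameter in m
```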
Contents 2
Basic Functions
1. Basic Concepts of Functions 2
2. Graphs of Functions and Parametric Form 11
3. One-to-One and Inverse Functions 20
4. Characterising Functions 26
5. The Straight Line 36
6. The Circle 46
7. Some Common Functions 62
Learning outcomes
You will gain familiarity with functions and variables. You will learn how to graph a
function and what is meant by an inverse function. You will learn how to use a parametric
approach to describe a function. Finally, you will meet some of the functions which occur
in engineering and science: polynomials, rational functions, the modulus function and
the unit step function.
Basic Concepts of
Functions 2.1
Introduction
In engineering there are many quantities that change their value as time changes. For example, the
temperature of a furnace may change with time as it is heated. Similarly, there are many quantities
that change their value as the location of a point of interest changes. For example, the shear stress
in a bridge girder will vary from point to point across the bridge. A quantity whose value can change
is known as a variable. We use functions to describe how one variable changes as a consequence
of another variable changing. There are many different types of function that are used by engineers.
We will be examining some of these in later Sections. The purpose of this Section is to look at the
basic concepts associated with all functions.

Prerequisites
• have a thorough understanding of basic algebraic notation and techniques

Learning Outcomes
On completion you should be able to . . .
• explain what is meant by a function
• use common notations for functions
• explain what is meant by the argument of a function
2 HELM (2006):
Workbook 2: Basic Functions
1. The function rule
A function can be thought of as a rule which operates on an input and produces an output. This
is often illustrated pictorially in two ways as shown in Figure 1. The first way is by using a block
diagram which consists of a box showing the input, the output and the rule. We often write the rule
inside the box. The second way is to use two sets, one to represent the input and one to represent
the output with an arrow showing the relationship between them.
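Figure 1: A general function (shown as a block diagram, in which a box labelled ‘function’ holds a specific rule which transforms the input into the output, and as two sets, input and output, linked by an arrow labelled ‘function’)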
More precisely, a rule is a function if it produces only a single output for any given input. The
function with the rule ‘treble the input’ is shown in Figure 2.
Figure 2: The function with the rule ‘treble the input’ (shown as a block diagram and as a two-set diagram for the function f: the input 4 gives the output 12, and the input x gives the output 3x)
Note that with an input of 4 the function will produce an output of 12. With a more general input,
x say, the output will be 3x. It is usual to assign a letter or other symbol to a function in order to
label it. The trebling function in Figure 2 has been given the symbol f .
Key Point 1
A function is a rule which operates on an input
and produces a single output from that input.
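As an informal illustration, which is not part of the original Workbook, the trebling function of Figure 2 can be sketched in Python. The name f follows the text; the choice of Python is an assumption made purely for illustration.

```python
def f(x):
    """Treble the input: the rule of the function f in Figure 2."""
    return 3 * x


# A given input always produces one, and only one, output.
print(f(4))  # 12
```

Calling f with the same input always returns the same single output, which is the property that makes the rule a function.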
Task
Write down the output from the function shown in Figure 3 when the input is
(a) 4, (b) −3, (c) x (d) t.
Figure 3: Block diagram of a function with the rule ‘multiply the input by 7 and then subtract 2’
Your solution
In each case the function rule instructs you to multiply the input by 7 and then subtract 2. Evaluate
the corresponding outputs.
Answer
(a) When the input is 4 the output is 26
(b) When the input is −3 the output is −23
(c) When the input is x the output is 7x − 2
(d) When the input is t the output is 7t − 2.
Several different notations are used by engineers to describe functions. For the trebling function in
Figure 2 it is common to write
f (x) = 3x
This indicates that with an input x, the function, f , produces an output of 3x. The input to the
function is placed in the brackets after the ‘f ’. f (x) is read as ‘f is a function of x’, or simply ‘f of
x’, meaning that the value of the output from the function depends upon the value of the input x.
The value of the output is often called the ‘value of the function’.
Example 1
State in words the rule defined by each of the following functions:
(a) f (x) = 6x
(b) f (t) = 6t − 1
(c) g(x) = x^2 − 7
(d) h(t) = t^3 + 5
(e) p(x) = x^3 + 5
Solution
(a) The rule for f is ‘multiply the input by 6’.
(b) Here the input has been labelled t. The rule for f is ‘multiply the input by 6 and subtract 1’.
(c) Here the function has been labelled g. The rule for g is ‘square the input and subtract 7’.
(d) The rule for h is ‘cube the input and add 5’.
(e) The rule for p is ‘cube the input and add 5’.
Note from Example 1, parts (d) and (e), that it is the rule that is important when describing a
function and not the letters used. Both h(t) and p(x) instruct us to ‘cube the input and add 5’.
Task
Write down a mathematical function which can be used to describe the following
rules:
(a) ‘square the input and divide the result by 2’. Use the letter x for input and
the letter f to represent the function.
(b) ‘divide the input by 3 and add 7’. Call the function g and call the input t.
Your solution
Answer
(a) f(x) = x^2/2, (b) g(t) = t/3 + 7
Exercise
State the rule of each of the following functions:
(a) f(x) = 5x, (b) f(t) = 5t, (c) f(x) = 8x + 10, (d) f(t) = 7t − 27, (e) f(t) = 1 − t,
(f) h(t) = t/3 + 2/3, (g) f(x) = 1/(1 + x)
Answers
(a) multiply the input by 5. (b) same as (a). (c) multiply the input by 8 and then add 10. (d)
multiply the input by 7 and then subtract 27. (e) subtract the input from 1. (f) divide the input
by 3 and then add 2/3. (g) add 1 to the input and then find the reciprocal of the result.
2. The argument of a function

The input to a function is sometimes called its argument. It is frequently necessary to obtain the
output from a function if we are given its argument. For example, given the function g(t) = 3t + 2
we may require the value of the output when the argument is 4. We write this as g(t = 4) or more
usually and compactly as g(4). In this case the value of g(4) is 3 × 4 + 2 = 14.
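A minimal sketch of the same evaluation in Python (the language is an assumption made for illustration; the name g and the rule come from the text):

```python
def g(t):
    """Rule: multiply the argument by 3 and then add 2."""
    return 3 * t + 2


# Supplying the argument 4 corresponds to writing g(4).
print(g(4))  # 14
```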
Example 2
Given the function f (x) = 3x + 1 find
(a) f (2)
(b) f (−1)
(c) f (6)
Solution
(a) The output from the function needs to be found when the argument or input is 2. We
need to replace x by 2 in the expression for the function. We find
f (2) = 3 × 2 + 1 = 7
(b) Here the argument is −1. We find
f (−1) = 3 × (−1) + 1 = −2
(c) f (6) = 3 × 6 + 1 = 19.
Task
Given the function g(t) = 6t + 4 find (a) g(3), (b) g(6), (c) g(−2)
Your solution
Answer
(a) g(3) = 6 × 3 + 4 = 22, (b) g(6) = 40, (c) g(−2) = −8
It is possible to obtain the value of a function when the argument is an algebraic expression. Consider
the following Example.
Example 3
Given the function y(x) = 3x + 2 find
(a) y(t)
(b) y(2t)
(c) y(z + 2)
(d) y(5x)
Solution
The rule for this function is ‘multiply the input by 3 and then add 2’. We can apply this rule
whatever the argument.
(a) In this case the argument is t. Multiplying this by 3 and adding 2 we find y(t) = 3t + 2.
Equivalently we can replace x by t in the expression for the function, so, y(t) = 3t + 2.
(b) In this case the argument is 2t. We need to replace x by 2t in the expression for the
function. So y(2t) = 3(2t) + 2 = 6t + 2
(c) In this case the argument is z + 2. We find y(z + 2) = 3(z + 2) + 2 = 3z + 8. It is
important to note that y(z + 2) is not y × (z + 2) = yz + 2y but instead reads ‘y of
(z + 2)’ where ‘of’ means ‘take the function of’.
(d) Here we have a complication. The argument is 5x and so there appears to be a clash
of notation with the original expression for the function. There is no problem if we
remember that the rule is to multiply the input by 3 and then add 2. The input now is
5x. So y(5x) = 3(5x) + 2 = 15x + 2.
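The results of Example 3 can be spot-checked numerically. The sketch below is an illustration only: it applies the rule to a handful of integer values, an arbitrary choice, and compares each result with the simplified expression. Agreement at sample values supports the algebra but does not replace it.

```python
def y(x):
    """Rule: multiply the input by 3 and then add 2."""
    return 3 * x + 2


# Compare the rule applied to each argument with the simplified form.
for value in range(-3, 4):
    assert y(2 * value) == 6 * value + 2     # part (b): y(2t) = 6t + 2
    assert y(value + 2) == 3 * value + 8     # part (c): y(z + 2) = 3z + 8
    assert y(5 * value) == 15 * value + 2    # part (d): y(5x) = 15x + 2
```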
Task
Given the function g(x) = 8 − 2x find (a) g(4), (b) g(4t), (c) g(2x − 3)
Your solution
(a)
(b)
(c)
Answer
(a) g(4) = 8 − 2 × 4 = 0
(b) g(4t) = 8 − 2 × 4t = 8 − 8t
(c) g(2x − 3) = 8 − 2(2x − 3) = 14 − 4x
Exercises
1. Explain what is meant by the ‘argument’ of a function.
2. Given the function g(t) = 8t + 3 find (a) g(7), (b) g(2), (c) g(−0.5), (d) g(−0.11)
3. Given the function f(t) = 2t^2 + 4 find: (a) f(x) (b) f(2x) (c) f(−x) (d) f(4x + 2)
(e) f(3t + 5) (f) f(λ) (g) f(t − λ) (h) f(t/α)
4. Given g(x) = 3x^2 − 7 find (a) g(3t), (b) g(t + 5), (c) g(6t − 4), (d) g(4x + 9)
5. Calculate f(x + h) when (a) f(x) = x^2, (b) f(x) = x^3, (c) f(x) = 1/x. In each case write
down the corresponding expression for f(x + h) − f(x).
6. If f(x) = 1/(1 − x)^2 find f(x/ℓ).
Answers
1. The argument is the input.
2. (a) 59, (b) 19, (c) −1, (d) 2.12.
3. (a) 2x^2 + 4, (b) 8x^2 + 4, (c) 2x^2 + 4, (d) 32x^2 + 32x + 12, (e) 18t^2 + 60t + 54,
(f) 2λ^2 + 4, (g) 2(t − λ)^2 + 4, (h) 2t^2/α^2 + 4.
4. (a) 27t^2 − 7, (b) 3t^2 + 30t + 68, (c) 108t^2 − 144t + 41, (d) 48x^2 + 216x + 236.
5. (a) x^2 + 2xh + h^2, (b) x^3 + 3x^2 h + 3xh^2 + h^3, (c) 1/(x + h).
The corresponding expressions are (a) 2xh + h^2, (b) 3x^2 h + 3xh^2 + h^3,
(c) 1/(x + h) − 1/x = −h/(x(x + h)).
6. 1/(1 − x/ℓ)^2.
3. Composition of functions
Consider the two functions g(x) = x^2, and h(x) = 3x + 5. Block diagrams showing the rules for
these functions are shown in Figure 4.
Figure 4: Block diagrams of two functions g and h (g: ‘square the input’, taking x to x^2; h: ‘treble the input and add 5’, taking x to 3x + 5)
Suppose we place these Block diagrams together in series as shown in Figure 5, so that the output
from function g is used as the input to function h.
Figure 5: The composition of the two functions to give h(g(x)) (the input x passes through g, ‘square the input’, to give x^2, which then passes through h, ‘treble the input and add 5’, to give 3x^2 + 5)
Study Figure 5 carefully and deduce that when the input to g is x the output from the two functions
in series is 3x^2 + 5. Since the output from g is used as input to h we write
h(g(x)) = h(x^2) = 3x^2 + 5
The form h(g(x)) is known as the composition of the functions g and h.
Suppose we interchange the two functions so that h is applied first as shown in Figure 6.
Figure 6: The composition of the two functions to give g(h(x)) (the input x passes through h, ‘treble the input and add 5’, and then through g, ‘square the input’, to give (3x + 5)^2)
Study Figure 6 and note that when the input to h is x the final output is (3x + 5)^2. We write
g(h(x)) = (3x + 5)^2
Note that the function h(g(x)) is different from g(h(x)).
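The effect of the order of composition can be sketched in Python. The functions g and h are those of the text; the helper compose is a hypothetical name introduced here for illustration.

```python
def g(x):
    """Square the input."""
    return x ** 2


def h(x):
    """Treble the input and add 5."""
    return 3 * x + 5


def compose(outer, inner):
    """Return the function that applies inner first and then outer."""
    return lambda x: outer(inner(x))


h_of_g = compose(h, g)  # h(g(x)) = 3x^2 + 5
g_of_h = compose(g, h)  # g(h(x)) = (3x + 5)^2

print(h_of_g(2))  # 17
print(g_of_h(2))  # 121
```

With the input 2 the two compositions give 17 and 121 respectively, so the order in which the functions are applied matters.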
Example 4
Given two functions g(t) = 3t + 2 and h(t) = t + 3 obtain an expression for the
composition g(h(t)).
Solution
We have g(h(t)) = g(t + 3). Now the rule for g is ‘triple the input and add 2’, and so we can
write g(t + 3) = 3(t + 3) + 2 = 3t + 11 so, g(h(t)) = 3t + 11.
Task
Given the two functions g(t) = 3t + 2 and h(t) = t + 3 as in Example 4 above,
obtain an expression for the composition h(g(t)).
Your solution
We have
h(g(t)) = h(3t + 2)
State the rule for h and write down h(g(t)).
Answer
‘add 3 to the input’, h(3t + 2) = 3t + 5. Note that h(g(t)) ≠ g(h(t)).
Exercises
1. Find f(g(x)) when f(x) = x − 7 and g(x) = x^2.
2. If f(x) = 8x + 2 find f(f(x)).
3. If f(x) = x + 6 and g(x) = x^2 − 5 find (a) f(g(0)), (b) g(f(0)), (c) g(g(2)), (d) f(g(7)).
4. If f(x) = (x − 3)/(x + 1) and g(x) = 1/x find g(f(x)).
Answers
1. x^2 − 7.
2. 8(8x + 2) + 2 = 64x + 18.
3. (a) 1, (b) 31, (c) −4, (d) 50.
4. (x + 1)/(x − 3).
Graphs of Functions
and Parametric Form 2.2
Introduction
Engineers often find mathematical ideas easier to understand when these are portrayed visually as
opposed to algebraically. Graphs are a convenient and widely-used way of portraying functions. By
inspecting a graph it is easy to describe a number of properties of a function. For example, where
is the function positive, and where is it negative? Where is it increasing and where is it decreasing?
Do function values repeat? Questions like these can be answered once the graph of a function has
been drawn. In this Section we will describe how the graph of a function is obtained and introduce
various terminology associated with graphs.
We have seen in Section 2.1 that it is possible to represent a function using the form y = f (x). An
alternative representation is to write expressions for both y and x in terms of a third variable known
as a parameter. The variables t or θ are normally used to denote the parameter.
For example, when a projectile such as a ball or rocket is thrown or launched, the x and y coordinates
of its path can be described by a function in the form y = f (x). However, it is often useful to also
give its x coordinate as a function of the time after launch, that is x(t), and the y coordinate similarly
as y(t). Here time t is the parameter.

Prerequisites
Before starting this Section you should . . .
• understand what is meant by a function

Learning Outcomes
On completion you should be able to . . .
• draw the graphs of a variety of functions
• explain what is meant by the domain and range of a function

1. The graph of a function
Consider the function f (x) = 2x. The output is obtained by multiplying the input by 2. We can
choose several values for the input to this function and calculate the corresponding outputs. We
have done this for integer values of x between −2 and 2 and the results are shown in Table 1.
Table 1
input, x −2 −1 0 1 2
output, f (x) −4 −2 0 2 4
To construct the graph of this function we first draw a pair of axes - a vertical axis and a horizontal
axis. These are drawn at right-angles to each other and intersect at the origin as shown in Figure 7.
Figure 7: The two axes intersect at the origin (graph of y = 2x drawn on a horizontal (x) axis and a vertical (y) axis, with the tabulated points and the point at x = 1.5 marked)
Each pair of input and output values can be represented on a graph by a single point. The input
values are measured along the horizontal axis and the output values are measured along the vertical
axis. The horizontal axis is often called the x axis. The vertical axis is commonly referred to as the
y axis so that we often write the function as
y = f (x) = 2x
or simply
y = 2x
Each pair of x and y values in the table is plotted as a single point, shown as • in Figure 7. A general
point is often labelled as (x, y). The values x and y are said to be the coordinates of the point.
The points are then joined with a smooth curve to produce the required graph as shown in Figure
7. Note that in this case the graph is a straight line. The graph can then be used to find function
values other than those given in the table. For example, directly from the graph we can see that
when x = 1.5, the value of y is 3.
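Drawing up the table of values is a mechanical step, and a short Python sketch (an illustration only, using the inputs of Table 1) shows how each input is paired with its output to give the points to be plotted:

```python
def f(x):
    """Rule: multiply the input by 2."""
    return 2 * x


inputs = [-2, -1, 0, 1, 2]

# Each (x, y) pair is one point on the graph of y = 2x.
table = [(x, f(x)) for x in inputs]

for x, y in table:
    print(x, y)

# A value not in the table, corresponding to the reading taken from the graph.
print(f(1.5))  # 3.0
```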
Task
Draw up a table of values of the function f(x) = x^3 for x between −3 and 3. Use
the table to plot a graph of this function.
Complete the following table:

Your solution
input, x −3 −2 −1 0 1 2 3
output, f (x) −27 −8 27
Answer
input, x −3 −2 −1 0 1 2 3
output, f (x) −27 −8 −1 0 1 8 27
Now add your points to the graph of f(x) = x^3 and draw a smooth curve through them:
Your solution
[Blank axes for the graph of f(x) = x^3, with x from −3 to 3 and y from −30 to 30]
Dependent and independent variables

Since x and y can have a number of different values they are variables. Here x is called the
independent variable and y is called the dependent variable. Knowing or choosing a value of
the independent variable x, the function rule enables us to calculate the corresponding value of the
dependent variable y. To show this dependence we often write y(x). This is read as ‘y is a function
of x’ or ‘y depends upon x’, or simply ‘y of x’. Note that it is the independent variable which is the
input to the function and the dependent variable which is the output.
The domain and range of a function
The set of values which we allow the independent variable to take is called the domain of the
function. A domain is often an interval on the x axis. For example, the function
y = g(x) = 5x + 2, −5 ≤ x ≤ 20
has any value of x between −5 and 20 inclusive as its domain because it has been stated as this. If
the domain of a function is not stated then it is taken to be the largest set possible. For example
h(t) = t^2 + 1
has domain −∞ < t < ∞ since h is defined for every value of t and the domain has not been stated
otherwise.
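A stated domain can be made explicit in code. The following is a minimal sketch, assuming we choose to reject any input outside the stated interval with an error; that choice, and the use of Python, are illustrative and not part of the Workbook.

```python
def g(x):
    """g(x) = 5x + 2 with the stated domain -5 <= x <= 20."""
    if not -5 <= x <= 20:
        raise ValueError("x is outside the domain of g")
    return 5 * x + 2


# The end points of the domain give the smallest and largest outputs.
print(g(-5))  # -23
print(g(20))  # 102
```

A call such as g(21) raises a ValueError, reflecting the fact that 21 is not in the domain.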
Later, you will meet some functions for which certain values of the independent variable must be
excluded from the domain because at these values the function would be undefined. One such
example is f(x) = 1/x for which we must exclude the value x = 0, since 1/0 is a meaningless quantity.
Similarly, we must exclude the value x = 2 from the domain of f(x) = 1/(x − 2).
The set of values of the function for a given domain, that is, the set of y values, is called the
range of the function. The range of g(x) (above) is −23 ≤ g(x) ≤ 102 and the range of h(t) is
1 ≤ h(t) < ∞, although this may not be apparent to you at this stage. Usually the range of a
function can be identified quite easily by inspecting its graph.
Example 5
Consider the function given by g(t) = 2t^2 + 1, −2 ≤ t ≤ 2.
(a) State the domain of the function.
(b) Plot a graph of the function.
(c) Deduce the range of the function from the graph.
Solution
(a) The domain is given as the interval −2 ≤ t ≤ 2, that is any value of t between −2 and
2 inclusive.
(b) To draw the graph a table of input and output values must be constructed first. See
Table 2.
Table 2
t −2 −1 0 1 2
y = g(t) 9 3 1 3 9
Each pair of t and y values in the table is plotted as a single point shown as • in Figure
8. The points are then joined with a smooth curve to produce the required graph.
Figure 8: Graph of g(t) = 2t^2 + 1 (plotted for −2 ≤ t ≤ 2, with the value 9 marked on the y axis)

(c) The range is the set of values which the function takes. By inspecting the graph we see
that the range of g is the interval 1 ≤ g(t) ≤ 9.
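The range found in Example 5 can be estimated numerically by sampling the domain. The sketch below is an illustration under stated assumptions: the step of 0.01 is an arbitrary choice, and sampling only suggests the range, since it finds the exact end points only when the smallest and largest values happen to fall on sample points, as they do here.

```python
def g(t):
    """g(t) = 2t^2 + 1."""
    return 2 * t ** 2 + 1


# Sample the domain -2 <= t <= 2 in steps of 0.01.
samples = [k / 100 for k in range(-200, 201)]
values = [g(t) for t in samples]

lowest, highest = min(values), max(values)
print(lowest, highest)  # 1.0 9.0
```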
Task
Consider the function given by f(x) = x^2 + 2, −3 ≤ x ≤ 3
(a) State the domain of the function:
Your solution
Recall that the domain of a function f (x) is the set of values that x is allowed to take. Write
down this set of values:
Answer
−3 ≤ x ≤ 3
(b) Draw up a table of input and output values for this function:
Your solution
The table of values has been partially calculated. Complete this now:
input, x −3 −2 −1 0 1 2 3
output, x^2 + 2 6 2
Answer
x −3 −2 −1 0 1 2 3
x^2 + 2 11 6 3 2 3 6 11
(c) Plot a graph of the function:
Your solution
Part of the graph f(x) = x^2 + 2 is shown in the figure. Complete it.
[Figure: partial graph of f(x) = x^2 + 2, with x from −3 to 3 and the value 10 marked on the vertical axis]
(d) Deduce the range of the function by inspecting the graph:
Your solution
Recall that the range of the function is the set of values that the function takes as x is varied. It is
possible to deduce this from the graph. Write this set as an interval.
Answer
(d) [2, 11]
Exercises
1. Explain the meaning of the terms ‘dependent variable’ and ‘independent variable’. When
plotting a graph, which variables are plotted on which axes ?
2. When stating the coordinates of a point, which coordinate is given first ?
3. Explain the meaning of an expression such as y(x) in the context of functions. What is the
interpretation of x(t) ?
4. Explain the meaning of the terms ‘domain’ and ‘range’ when applied to functions.
5. Plot a graph of the following functions. In each case state the domain and the range of the
function.
(a) f (x) = 3x + 2, −2 ≤ x ≤ 5
(b) g(x) = x^2 + 4, −2 ≤ x ≤ 3
(c) p(t) = 2t^2 + 8, −2 ≤ t ≤ 4
(d) f(t) = 6 − t^2, 1 ≤ t ≤ 5
6. Explain why the value x = −7 should be excluded from the domain of f(x) = 5/(x + 7).
7. What value(s) should be excluded from the domain of f(t) = 1/t^2?
Answers
1. The independent variable is plotted on the horizontal axis.
2. The independent variable is given first, as in (x, y).
3. x(t) means that the dependent variable x is a function of the independent variable t.
5. (a) domain [−2, 5], range [−4, 17], (b) [−2, 3], [4, 13], (c) [−2, 4], [8, 40], (d) [1, 5], [−19, 5].
6. f is undefined when x = −7.
7. t = 0.
2. Parametric representation of a function
Suppose we write x and y in terms of t in the form
x = 4t, y = 2t^2, for −1 ≤ t ≤ 1 (1)
For different values of t between −1 and 1, we can calculate pairs of values of x and y. For example
when t = 1 we see that x = 4(1) = 4 and y = 2 × 1^2 = 2. That is, t = 1 corresponds to the point
with (x, y) coordinates (4, 2).
A table of values is given in Table 3.
Table 3
t −1 −0.5 0 0.5 1
x −4 −2 0 2 4
y 2 0.5 0 0.5 2
If the resulting points are plotted on a graph then different values of t correspond to different points
on the graph. The graph of (1) is plotted in Figure 9.
Figure 9: Graph of the function defined parametrically by x = 4t, y = 2t^2, −1 ≤ t ≤ 1 (the points corresponding to t = −1, −0.5, 0, 0.5 and 1 are marked on the curve)
It is often possible to convert a parametric representation of a function into the more usual form by
combining the two expressions to eliminate the parameter. Thus if x = 4t we can write t = x/4 and
so
y = 2t^2 = 2(x/4)^2
= 2x^2/16
= x^2/8
Using y = x^2/8 we can, by giving x values, find corresponding values of y. Plotting these (x, y) values
gives, of course, exactly the same curve as in Figure 9.
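As a check on the elimination, the following Python sketch (an illustration only; the helper name point is hypothetical) generates the points of Table 3 from the parametric form and confirms that each one also satisfies y = x^2/8.

```python
import math


def point(t):
    """Return the point (x, y) given by x = 4t, y = 2t^2."""
    return 4 * t, 2 * t ** 2


parameter_values = [-1, -0.5, 0, 0.5, 1]  # the values of t used in Table 3
points = [point(t) for t in parameter_values]

for x, y in points:
    # Every parametric point should lie on the curve y = x^2 / 8.
    assert math.isclose(y, x ** 2 / 8)
    print(x, y)
```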
Task
Consider the function x = (1/2)(t + 1/t), y = (1/2)(t − 1/t), 1 ≤ t ≤ 8.

(a) Draw up a table of values of this function.

(b) Plot a graph of the function
Your solution
(a) A partially completed table of values has been prepared. Complete the table.
t 1 2 3 4 5 6 7 8
x 1 1.25 1.67 4.06
y 0 0.75 3.94
Answer
t 1 2 3 4 5 6 7 8
x 1 1.25 1.67 2.13 2.60 3.08 3.57 4.06
y 0 0.75 1.33 1.88 2.40 2.92 3.43 3.94
Your solution
(b) The graph is shown in the figure. Add your points to those already marked on the graph.
[Figure: graph of the function on axes running from 1 to 4 in both x and y, with some points already marked]
It is possible to eliminate t between the two equations so that the original parametric form can be
expressed as x^2 − y^2 = 1.
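One way to carry out this elimination is sketched below. It assumes the parametric equations of this Task read x = (1/2)(t + 1/t) and y = (1/2)(t − 1/t), a reading which agrees with the tabulated values. Adding and subtracting the two equations isolates t and 1/t, and multiplying the results removes the parameter:

```latex
\begin{align*}
x + y &= \tfrac{1}{2}\left(t + \tfrac{1}{t}\right) + \tfrac{1}{2}\left(t - \tfrac{1}{t}\right) = t, \\
x - y &= \tfrac{1}{2}\left(t + \tfrac{1}{t}\right) - \tfrac{1}{2}\left(t - \tfrac{1}{t}\right) = \tfrac{1}{t}, \\
x^2 - y^2 &= (x + y)(x - y) = t \cdot \tfrac{1}{t} = 1 .
\end{align*}
```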
Task
A particle with mass m falls under gravity so that at time t its distance from the
y-axis is 2t and its distance from the x-axis is −mg(t^2/2) + 3 where g is a constant
(the acceleration due to gravity). Find the value of t when the particle crosses the
x-axis and, at this time, find the distance from the y-axis.
Begin by obtaining the parametric equations of the path of the particle:

Your solution
x= y=
Answer
x = 2t, y = −mg(t^2/2) + 3
Now find the value of t when y = 0:
Your solution
t=
Answer
t = √(6/(mg))
Finally, obtain the value of x at this value of t:

Your solution
x=
Answer
x = 2√(6/(mg))
Exercises
1. Explain what is meant by the term ‘parameter’.
2. Consider the parametric equations x = √t, y = t, for t ≥ 0.
(a) Draw up a table of values of t, x and y for values of t between 0 and 10.
(b) Plot a graph of this function.
(c) Obtain an explicit equation for y in terms of x.
Answers
2. (c) y = x^2, 0 ≤ x ≤ √10
One-to-One and
Inverse Functions 2.3
Introduction
In this Section we examine more terminology associated with functions. We explain one-to-one and
many-to-one functions and show how the rule associated with certain functions can be reversed to
give so-called inverse functions. These ideas will be needed when we deal with particular functions
in later Sections.

Prerequisites
Before starting this Section you should . . .
• understand what is meant by a function
• be able to sketch graphs of simple functions

Learning Outcomes
On completion you should be able to . . .
• explain what is meant by a one-to-one function
• explain what is meant by a many-to-one function
• explain what is meant by an inverse function, and determine when and how such a function can be found
1. One-to-many rules, many-to-one and one-to-one functions
One-to-many rules
Recall from Section 2.1 that a rule for a function must produce a single output for a given input.
Not all rules satisfy this criterion. For example, the rule ‘take the square root of the input’ cannot be
a rule for a function because for a given input there are two outputs; an input of 4 produces outputs
of 2 and −2. Figure 10 shows two ways in which we can picture this situation, the first being a block
diagram, and the second using two sets representing input and output values and the relationship
between them.
Figure 10: This rule cannot be a function - it is a one-to-many rule (block diagram and two-set diagram for the rule ‘take the square root of the input’: the input 4 gives the outputs 2 and −2, and the input x gives the outputs √x and −√x)
Such a rule is described as a one-to-many rule. This means that one input produces more than
one output. This is obvious from inspecting the sets in Figure 10.
The graph of the rule ‘take ±√x’ can be drawn by constructing a table of values:
Table 4
x 0 1 2 3 4
y = ±√x 0 ±1 ±√2 ±√3 ±2
The graph is shown in Figure 11(a). For each value of x there are two corresponding values of y.
Plotting a graph of a one-to-many rule will result in a curve through which a vertical line can be
drawn which cuts the curve more than once as you can see. The vertical line cuts the curve more
than once because there is more than one y value for each x value.
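Figure 11: (a) Graph of the one-to-many rule ‘take ±√x’ (b) Graph of the function ‘take the positive square root of the input’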
By describing a rule more carefully it is possible to make sure a single output results from a single
input, thereby defining a valid rule for a function. For example, the rule ‘take the positive square
root of the input’ is a valid function rule because a given input produces a single output. The graph
of this function is displayed in Figure 11(b).
Many-to-one and one-to-one functions

Consider the function y(x) = x^2. An input of x = 3 produces an output of 9. Similarly, an input of
−3 also produces an output of 9. In general, a function for which different inputs can produce the
same output is called a many-to-one function. This is represented pictorially in Figure 12 from
which it is clear why we call this a many-to-one function.
Figure 12: This represents a many-to-one function (under y = x^2 the inputs −3 and 3 are both mapped to the output 9)
Note that whilst this is many-to-one it is still a function since any chosen input value has only one
arrow emerging from it. Thus there is a single output for each input.
It is possible to decide if a function is many-to-one by examining its graph. Consider the graph of
y = x^2 shown in Figure 13.
Figure 13: The function y = x^2 is a many-to-one function (graph of y = x^2 with x = −3 and x = 3 marked on the x axis)
We see that a horizontal line drawn on the graph cuts it more than once. This means that two (or
more) different inputs have yielded the same output and so the function is many-to-one.
If a function is not many-to-one then it is said to be one-to-one. This means that each different
input to the function yields a different output.
Consider the function y(x) = x^3 which is shown in Figure 14. A horizontal line drawn on this graph
will intersect the curve only once. This means that each input value of x yields a different output
value for y.
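The horizontal-line idea has a simple numerical counterpart: over a chosen set of distinct inputs, a one-to-one function never repeats an output. The sketch below is an illustration only. The helper is_one_to_one_on is a hypothetical name, and a result of True on a finite sample suggests, but does not prove, that the function is one-to-one.

```python
def is_one_to_one_on(func, inputs):
    """Return True if no two of the given distinct inputs share an output."""
    outputs = [func(x) for x in inputs]
    return len(set(outputs)) == len(outputs)


def square(x):
    return x ** 2


def cube(x):
    return x ** 3


inputs = range(-5, 6)  # the integers from -5 to 5

print(is_one_to_one_on(square, inputs))  # False: -3 and 3 both give 9
print(is_one_to_one_on(cube, inputs))    # True on this sample
```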
Figure 14: The function y(x) = x^3 is a one-to-one function
Task
Study the graphs shown in Figure 15. Decide which, if any, are graphs of functions.
For those which are, state if the function is one-to-one or many-to-one.
Figure 15: Three graphs, labelled (a), (b) and (c)
Your solution
Answer
(a) not a function, (b) one-to-one function, (c) many-to-one function
2. Inverse of a function
We have seen that a function can be regarded as taking an input, x, and processing it in some way
to produce a single output f (x) as shown in Figure 16(a). A natural question to ask is whether we
can find another function that will reverse the process. In other words, can we find a function that
will start with f (x) and process it to produce x again? This idea is also shown in Figure 16(b). If we
can find such a function it is called the inverse function to f(x) and is given the symbol f^{-1}(x).
Do not confuse the ‘−1’ with an index, or power. Here the superscript is used purely as the notation
for the inverse function. Note that the composite function f^{-1}(f(x)) = x as shown in Figure 17.
Figure 16: The second block reverses the process in the first ((a) the block f processes x to give f(x); (b) the block f^{-1} reverses the process, taking f(x) back to x)
Figure 17: f^{-1} reverses the process in f (the blocks f and f^{-1} in series take x to f(x) and back to x)
Example 6
Find the inverse function to f (x) = 3x − 8.
Solution
The given function takes an input, x and produces an output 3x − 8. The inverse function, f^{-1},
must take an input 3x − 8 and give an output x. That is
f^{-1}(3x − 8) = x
If we introduce a new variable z = 3x − 8, and transpose this for x to give
x = (z + 8)/3 then f^{-1}(z) = (z + 8)/3
So the rule for f^{-1} is add 8 to the input and divide the result by 3. Writing f^{-1} with x as its
argument gives
f^{-1}(x) = (x + 8)/3
This is the inverse function.
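The result of Example 6 can be checked by composing the two rules. The sketch below is an illustration only; the name f_inverse and the use of exact fractions (to avoid rounding in the division by 3) are choices made here, not part of the Workbook.

```python
from fractions import Fraction


def f(x):
    """f(x) = 3x - 8."""
    return 3 * x - 8


def f_inverse(x):
    """Add 8 to the input and divide the result by 3."""
    return Fraction(x + 8, 3)


# Applying f and then its inverse should return the original input.
for x in range(-5, 6):
    assert f_inverse(f(x)) == x

print(f_inverse(f(5)))  # 5
```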
Not all functions possess an inverse function. In fact, only one-to-one functions do so. If a function
is many-to-one the process to reverse it would require many outputs from one input contradicting
the definition of a function.
Task
Find the inverse of the function f(x) = 7 − 3x, using the fact that the inverse
function must take an input 7 − 3x and produce an output x. So f^{-1}(7 − 3x) = x
Introduce a new variable z so that z = 7 − 3x and transpose this to find x. Hence write down the
inverse function:
Your solution
Answer
f^{-1}(z) = (7 − z)/3. With x as its argument the inverse function is f^{-1}(x) = (7 − x)/3.
Exercises
1. Explain why a one-to-many rule cannot be a function.
2. Illustrate why y = x^4 is a many-to-one function by providing a suitable example.
3. By sketching a graph of y = 3x − 1 show that this is a one-to-one function.
4. Explain why a many-to-one function does not have an inverse function. Give an example.
5. Find the inverse of each of the following functions:
(a) f(x) = 4x + 7, (b) f(x) = x, (c) f(x) = −23x, (d) f(x) = 1/(x + 1).
Answers
5. (a) f^{-1}(x) = (x − 7)/4, (b) f^{-1}(x) = x, (c) f^{-1}(x) = −x/23, (d) f^{-1}(x) = (1 − x)/x.
Characterising
Functions 2.4
Introduction
There are a number of different terms used to describe the ways in which functions behave. In this
Section we explain some of these terms and illustrate their use.


Prerequisites
Before starting this Section you should . . .
• be able to graph simple functions

Learning Outcomes
On completion you should be able to . . .
• explain the distinction between a continuous and discontinuous function
• find the limits of simple functions
• explain what is meant by a periodic function
• explain what is meant by an odd function and an even function
1. Continuous and discontinuous functions and limits
Look at the graph shown in Figure 18a. The curve can be traced out from left to right without
moving the pen from the paper. The function represented by this curve is said to be continuous at
every point. If we try to trace out the curve in Figure 18b, the presence of a jump in the graph (at
x = x1 ) means that the pen must be lifted from the paper and moved in order to trace the graph.
Such a function is said to be discontinuous at the point where the jump occurs. The jumps are
known as discontinuities.
Figure 18: (a) A continuous function (b) A discontinuous function, with a jump at x = x1
Task
Sketch a graph of a function which has two discontinuities.
Your solution
(Get your tutor to check it.)
When defining a discontinuous function algebraically it is often necessary to give different function
rules for different values of x. Consider, for example, the function defined as:

f(x) = 3, x < 0
f(x) = x^2, x ≥ 0
Notice that there is one rule for when x is less than 0 and another rule for when x is greater than or
equal to 0.
A graph of this function is shown in Figure 19.
Figure 19: An example of a discontinuous function (f(x) plotted for x from −3 to 3)

Suppose we ask ‘what value does y approach as x approaches 0?’. From the graph we see that
as x gets nearer and nearer to 0, the value of y gets nearer to 0, if we approach from the right-hand
side. We write this formally as
lim_{x→0+} f(x) = 0
and say ‘the limit of f(x) as x tends to 0 from above is 0.’

On the other hand if x gets closer to zero, from the left-hand side, the value of y remains at 3. In
this case we write
lim_{x→0−} f(x) = 3
and say ‘the limit of f (x) as x tends to 0 from below is 3.’

In this example the right-hand limit and the left-hand limit are not equal, and this is indicative of
the fact that the function is discontinuous.
In general a function is continuous at a point x = a if the left-hand and right-hand limits are the
same there and are finite, and if both of these are equal to the value of the function at that point.
That is
Key Point 2
A function f (x) is continuous at x = a if and only if:
lim_{x→a+} f(x) = lim_{x→a−} f(x) = f(a)
If the right-hand and left-hand limits are the same, we can simply describe this common limit as
lim_{x→a} f(x). If the limits are not the same we say the limit of the function does not exist at x = a.
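One-sided limits can be explored numerically by evaluating a function at points that approach the point of interest from each side. The sketch below is an illustration only: it uses the function of Figure 19 (3 for x < 0, x^2 for x ≥ 0), and the step sizes are an arbitrary choice. Such a table suggests the values of the limits but is not a proof.

```python
def f(x):
    """f(x) = 3 for x < 0 and f(x) = x^2 for x >= 0."""
    return 3 if x < 0 else x ** 2


# Approach x = 0 from below (-step) and from above (+step).
for k in range(1, 6):
    step = 10 ** -k
    print(step, f(-step), f(step))
```

The values from the left stay at 3 while those from the right shrink towards 0, so the two one-sided limits differ and the function is discontinuous at x = 0.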
Exercises
1. Explain the distinction between a continuous and a discontinuous function. Draw a graph
showing an example of each type of function.
2. Study graphs of the functions y = x^2 and y = −x^2. Are these continuous functions?
3. Study graphs of y = 3x − 2 and y = −7x + 1. Are these continuous functions?
4. Draw a graph of the function
f(x) = 2x + 1, x < 3
f(x) = 5, x = 3
f(x) = 6, x > 3
Find
(a) lim_{x→0+} f(x), (b) lim_{x→0−} f(x), (c) lim_{x→0} f(x), (d) lim_{x→3+} f(x), (e) lim_{x→3−} f(x),
(f) lim_{x→3} f(x)
Answers 2. Yes. 3. Yes. 4. (a) 1, (b) 1, (c) 1, (d) 6, (e) 7, (f) limit does not exist.
2. Periodic functions
Any function that has a definite pattern repeated at regular intervals is said to be periodic. The
interval over which the repetition takes place is called the period of the function, and is usually given
the symbol T . The period of a periodic function is usually obvious from its graph.
Figure 20 shows a graph of a periodic function with period T = 3. This function has
discontinuities at values of x which are divisible by 3.
Figure 20: Graph of a periodic function f(x) with period T = 3 (x axis marked at −6, −3, 0 and 3)
Figure 21 shows a graph of a periodic function with period T = 6. This function has no
discontinuities.
Figure 21: Graph of a periodic function f(x) with period T = 6 (x axis marked at −6, −3, 0 and 3)
If a function is a periodic function with period T then, for any value of the independent variable x,
the value of f (x + T ) is the same as the value of f (x).
Key Point 3
A function f (x) is periodic if we can find a non-zero number T such that
f (x + T ) = f (x) for all values of x
Often a periodic function will be defined by simply specifying the period of the function and by
stating the rule for the function within one period. This information alone is sufficient to draw the
graph for all values of the independent variable.
Figure 22 shows a graph of the periodic function defined by
f (x) = x, −π < x < π, period T = 2π
Figure 22 (graph of this periodic function against x)
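The idea of specifying a rule on one period and repeating it can be sketched in Python. The helper name periodic_extension is hypothetical, introduced only for this illustration. The rule f(x) = x is left undefined at the end points x = ±π in the definition above; the sketch simply assigns the value at the left-hand end point there, which is an assumption.

```python
import math

def periodic_extension(rule, start, period):
    """Return f with f(x + period) = f(x), equal to rule on [start, start + period)."""
    def f(x):
        # Shift x back into the base interval before applying the rule.
        return rule(start + (x - start) % period)
    return f

# f(x) = x for -pi < x < pi, period T = 2*pi (the function of Figure 22)
f = periodic_extension(lambda x: x, -math.pi, 2 * math.pi)
print(f(1.0), f(1.0 + 2 * math.pi), f(1.0 - 4 * math.pi))
```

All three printed values agree up to rounding error, as Key Point 3 requires.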
Exercises
1. Explain what is meant by a periodic function.
2. Sketch a graph of a periodic function which has no discontinuities.
3. Sketch a graph of a periodic function which has discontinuities.
4. A periodic function has period 0.01 seconds. How many times will the pattern in the graph repeat
over an interval of 10 seconds ?
Answer 4. 1000.
3. Odd and even functions
Example 7
Figure 23 shows graphs of several functions. They share a common property.
Study the graphs and comment on any symmetry.
Figure 23
Answer
The graphs are all symmetrical about the y axis.
Any function which is symmetrical about the y axis, i.e. where the graph of the right-hand part is the
mirror image of that on the left, is said to be an even function. Even functions have the following
property:
Key Point 4
Even Function
An even function is such that f (−x) = f (x) for all values of x.
Key Point 4 is saying that the function value at a negative value of x is the same as the function
value at the corresponding positive value of x.
Example 8
Show algebraically that $f(x) = x^4 + 5$ is an even function.
Solution
We must show that f (−x) = f (x).
$f(-x) = (-x)^4 + 5 = x^4 + 5$
Hence f (−x) = f (x) and so the function is even. Check for yourself that f (−3) = f (3).
Task
Extend the graph in the solution box in order to produce a graph of an even
function.
Your solution
Answer
Task
The following diagrams show graphs of several functions. They share a common
property. Study the graphs and comment on any symmetry.
Your solution
Answer
There is rotational symmetry about the origin. That is, each curve, when rotated through 180°,
transforms into itself.
Any function which possesses such symmetry (that is, the graph on the right can be obtained by
rotating the curve on the left through 180° about the origin) is said to be an odd function. Odd
functions have the following property:
Key Point 5
Odd Function
An odd function is such that f (−x) = −f (x) for all values of x.
Key Point 5 is saying that the function value at a negative value of x is minus the function value at
the corresponding positive value of x.
Example 9
Show that the function $f(x) = x^3 + 4x$ is odd.
Solution
We must show that f (−x) = −f (x).
$f(-x) = (-x)^3 + 4(-x)$
$= -x^3 - 4x$
$= -(x^3 + 4x)$
$= -f(x)$
and so this function is odd. Check for yourself that f (−2) = −f (2).
Task
Extend the graph in the solution box in order to produce a graph of an odd function.
Your solution
Answer
Note that some functions are neither odd nor even; for example $f(x) = x^3 + x^2$ is neither even
nor odd.
The reader should confirm (with simple examples) that ‘odd’ and ‘even’ functions have the following
properties:
odd + odd = odd even + even = even odd + even = neither

odd × odd = even even × even = even odd × even = odd
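One way to try out these properties with simple examples is a short numerical experiment. The Python sketch below tests f(−x) against f(x) and −f(x) at a few sample points; the function name parity, the sample points and the tolerance are illustrative assumptions. Agreement at a handful of points suggests, but does not prove, that a function is odd or even.

```python
def parity(f, samples=(0.5, 1.0, 2.0, 3.0), tol=1e-9):
    """Classify f as 'even', 'odd' or 'neither' from a few sample points."""
    is_even = all(abs(f(-x) - f(x)) <= tol for x in samples)
    is_odd = all(abs(f(-x) + f(x)) <= tol for x in samples)
    if is_even and is_odd:
        return "even and odd"   # f vanishes at every sample point
    if is_even:
        return "even"
    if is_odd:
        return "odd"
    return "neither"

even_f = lambda x: x**4 + 5       # Example 8
odd_f = lambda x: x**3 + 4 * x    # Example 9

print(parity(even_f))
print(parity(odd_f))
print(parity(lambda x: odd_f(x) + even_f(x)))   # odd + even
print(parity(lambda x: odd_f(x) * odd_f(x)))    # odd × odd
print(parity(lambda x: odd_f(x) * even_f(x)))   # odd × even
```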
Exercises
1. Classify the following functions as odd, even or neither. If necessary sketch a graph to help you
decide. (a) f (x) = 6, (b) $f(x) = x^2$, (c) f (x) = 2x + 1, (d) f (x) = x, (e) f (x) = 2x
2. The diagram below represents a heavy cable hanging under gravity from two points at the
same height. Such a curve (shown as a dashed line), known as a catenary, is described by a
mathematical function known as a hyperbolic cosine, f (x) = cosh x, discussed in Workbook 6.
[Diagram: a catenary, the curve y = cosh x]
(a) Comment upon any symmetry.
(b) Is this function one-to-one or many-to-one?
(c) Is this a continuous or discontinuous function?
(d) State $\lim_{x \to 0} \cosh x$.
Answers
1(a) even, (b) even, (c) neither, (d) odd, (e) odd
2(a) function is even, symmetric about the y-axis, (b) many-to-one, (c) continuous, (d) 1

The Straight Line 2.5
Introduction
Probably the most important function and graph that you will use are those associated with the
straight line. A large number of relationships between engineering variables can be described using a
straight line or linear graph. Even when this is not strictly the case it is often possible to approximate
a relationship by a straight line. In this Section we study the equation of a straight line, its properties
and graph.


Prerequisites
• be able to graph simple functions

Learning Outcomes
• recognise the equation of a straight line
• explain the significance of a and b in the equation of a line f (x) = ax + b
• find the gradient of a straight line given two points on the line
• find the equation of a straight line through two points
• find the distance between two points
1. Linear functions
Any function of the form y = f (x) = ax + b where a and b are constants is called a linear function.
The constant a is called the coefficient of x, and b is referred to as the constant term.
Key Point 6
All linear functions can be written in the form:
f (x) = ax + b
where a and b are constants.
For example, f (x) = 3x + 2, $g(x) = \frac{1}{2}x - 7$, $h(x) = -3x + \frac{2}{3}$ and k(x) = 2x are all linear functions.
The graph of a linear function is always a straight line. Such a graph can be plotted by finding just
two distinct points and joining these with a straight line.
Example 10
Plot the graph of the linear function y = f (x) = 4x + 3.
Solution
We start by finding two points. For example if we choose x = 0, then y = f (0) = 3, i.e. the first
point has coordinates (0, 3). Secondly, suppose we choose x = 5, then y = f (5) = 23. The second
point is (5, 23). These two points are then plotted and then joined by a straight line as shown in
the following diagram.
[Diagram: the line y = 4x + 3 drawn through the points (0, 3) and (5, 23)]
Example 11
Plot graphs of the three linear functions y = 4x − 3, y = 4x, and y = 4x + 5, for
−2 ≤ x ≤ 2.
Solution
For each function it is necessary to find two points on the line.
For y = 4x − 3, suppose for the first point we choose x = 0, so that y = −3. For the second point,
let x = 2 so that y = 5. So, the points (0, −3) and (2, 5) can be plotted and joined. This is shown
in the following diagram.
[Diagram: the lines y = 4x − 3, y = 4x and y = 4x + 5 for −2 ≤ x ≤ 2]
For y = 4x we find the points (0, 0) and (2, 8). Similarly for y = 4x + 5 we find points (0, 5) and
(2, 13). The corresponding lines are also shown in the figure.
Task
Refer to Example 11. Comment upon the effect of changing the value of the
constant term of the linear function.
Your solution
Answer
As the constant term is varied, the line moves up or down the page always remaining parallel to its
initial position.
The value of the constant term is also known as the vertical or y -axis intercept because this is the
value of y where the line cuts the y axis.
Task
State the vertical intercept of each of the following lines:
(a) y = 3x + 3, (b) $y = \frac{1}{2}x - \frac{1}{3}$, (c) y = 1 − 3x, (d) y = −5x.
In each case you need to identify the constant term:
Your solution
(a) (b) (c) (d)
Answer
(a) 3, (b) $-\frac{1}{3}$, (c) 1, (d) 0
Example 12
Plot graphs of the lines y = 3x + 3, y = 5x + 3 and y = −2x + 3.
Solution
Note that all three lines have the same constant term, that is 3. So all three lines pass through
(0, 3), the vertical intercept. A further point has been calculated for each of the lines and their
graphs are shown in the following diagram.
[Diagram: the lines y = 3x + 3, y = 5x + 3 and y = −2x + 3, all passing through (0, 3)]
Note from the graphs in Example 12 that as the coefficient of x is changed the gradient of the
graph changes. The coefficient of x gives the gradient or slope of the line. In general, for the
line y = ax + b a positive value of a produces a graph which slopes upwards from left to right. A
negative value of a produces a graph which slopes downwards from left to right. If a is zero the line
is horizontal, that is its gradient is zero. These properties are summarised in the next figure.
Figure 24: The gradient of a line y = ax + b depends upon the value of a. (Three panels: a is negative, a is zero, a is positive.)
Key Point 7
Linear Equation
In the linear function f (x) = ax + b, a is the gradient and b is the vertical intercept.
Task
State the gradients of the following lines:
(a) y = 7x + 2 (b) $y = -\frac{1}{3}x + 4$ (c) $y = \frac{x + 2}{3}$
In each case the coefficient of x must be examined:

Your solution
(a) (b) (c)
Answer
(a) 7, (b) −1/3, (c) 1/3
Task
Which of the following lines has the steepest gradient ?
(a) $y = \frac{17x + 4}{5}$, (b) y = 9x − 2, (c) $y = \frac{1}{3}x + 4$.
Your solution
Answer
(b) because the three gradients are (a) $\frac{17}{5}$ (b) 9 (c) $\frac{1}{3}$
Exercises
1. State the general form of the equation of a straight line explaining the role of each of the terms
in your answer.
2. State which of the following functions will have straight line graphs.
(a) f (x) = 3x − 3, (b) $f(x) = x^{1/2}$, (c) $f(x) = \frac{1}{x}$, (d) f (x) = 13, (e) f (x) = −2 − x.
3. For each of the following, identify the gradient and vertical intercept.
(a) f (x) = 2x + 1, (b) f (x) = 3, (c) f (x) = −2x, (d) f (x) = −7 − 17x,
(e) f (x) = mx + c.
Answers
1. e.g. y = ax + b. x is the independent variable, y is the dependent variable, a is the gradient

and b is the vertical intercept.
2. (a), (d) and (e) will have straight line graphs.
3. (a) gradient = 2, vertical intercept =1, (b) 0, 3, (c) −2, 0, (d) −17, −7, (e) m, c.
2. The gradient of a straight line through two points
A common requirement is to find the gradient of a line when we know the coordinates of two points
on it. Suppose the two points are A(x1 , y1 ), B(x2 , y2 ) as shown in the following figure.
Figure 25 (the points $A(x_1, y_1)$ and $B(x_2, y_2)$ and the line joining them)
The gradient of the line joining A and B can be calculated from the following formula.
Key Point 8
Gradient of Line Through Two Points
The gradient of the line joining $A(x_1, y_1)$ and $B(x_2, y_2)$ is given by
$\text{gradient} = \frac{y_2 - y_1}{x_2 - x_1}$
Example 13
Find the gradient of the line joining the points A(0, 3) and B(4, 5).
Solution
We calculate the gradient as follows:
$\text{gradient} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{5 - 3}{4 - 0} = \frac{1}{2}$
[Diagram: the line through A(0, 3) and B(4, 5)]
Thus the gradient of the line is $\frac{1}{2}$. Graphically, this means that when x increases by 1, the value of
y increases by $\frac{1}{2}$.
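The calculation in Key Point 8 can be sketched in Python. The function name gradient is an illustrative choice, and the sketch assumes integer coordinates so that exact fractions can be used; a vertical line, for which the formula would divide by zero, is rejected.

```python
from fractions import Fraction

def gradient(a, b):
    """Gradient of the line through points a = (x1, y1) and b = (x2, y2)."""
    (x1, y1), (x2, y2) = a, b
    if x1 == x2:
        raise ValueError("vertical line: gradient is undefined")
    return Fraction(y2 - y1, x2 - x1)

print(gradient((0, 3), (4, 5)))   # the points of Example 13
```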
Task
Find the gradient of the line joining the points A(−1, 4) and B(2, 1).
Your solution
$\text{gradient} = \frac{y_2 - y_1}{x_2 - x_1} =$
Answer
$\frac{1 - 4}{2 - (-1)} = -1$
Thus the gradient of the line is −1. Graphically, this means that when x increases by 1, the value of
y decreases by 1.
Exercises
1. Calculate the gradient of the line joining (1, 0) and (15, −3).
2. Calculate the gradient of the line joining (10, −3) and (15, −3).
Answers
1. −3/14. 2. 0
3. The equation of a straight line through two points
The equation of the line passing through the points with coordinates A(x1 , y1 ) and B(x2 , y2 ) is given
by the following formula.
Key Point 9
The line passing through points $A(x_1, y_1)$ and $B(x_2, y_2)$ is given by
$\frac{y - y_1}{y_2 - y_1} = \frac{x - x_1}{x_2 - x_1}$ or, equivalently, $y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1)$
Task
Find the equation of the line passing through A(−7, 11) and B(1, 3).
First apply the formula: $\frac{y - y_1}{y_2 - y_1} = \frac{x - x_1}{x_2 - x_1}$
Your solution
y−
=
Answer
$\frac{y - 11}{3 - 11} = \frac{x + 7}{1 + 7}$.
Simplify this to obtain the required equation:
Your solution
Answer
y = 4 − x
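The second form in Key Point 9 leads directly to the gradient and vertical intercept of the line. The Python sketch below is one possible way of carrying out the working; the function name line_through is hypothetical and integer coordinates are assumed so that the arithmetic stays exact.

```python
from fractions import Fraction

def line_through(a, b):
    """Return (gradient, vertical intercept) of the line through a and b."""
    (x1, y1), (x2, y2) = a, b
    if x1 == x2:
        raise ValueError("vertical line cannot be written as y = ax + b")
    grad = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - grad * x1   # from y - y1 = grad * (x - x1) with x = 0
    return grad, intercept

print(line_through((-7, 11), (1, 3)))   # the points A and B of the Task
```

For the points A(−7, 11) and B(1, 3) this returns gradient −1 and intercept 4, in agreement with y = 4 − x.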
Exercises
1. Find the equation of the line joining (1, 5) and (−9, 2).
2. Find the gradient and vertical intercept of the line joining (8, 1) and (−2, −3).
Answers 1. $y = \frac{3}{10}x + \frac{47}{10}$. 2. 0.4, −2.2.
4. The distance between two points
Referring again to Figure 25 in subsection 2, the distance between the points $A(x_1, y_1)$ and $B(x_2, y_2)$ is
given using Pythagoras’ theorem by the following formula.
Key Point 10
Distance Between Two Points
The distance between points $A(x_1, y_1)$ and $B(x_2, y_2)$ is $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
Task
Find the distance between A(−7, 11) and B(1, 3), using Key Point 10.
Your solution
Answer
$\sqrt{(1 - (-7))^2 + (3 - 11)^2} = \sqrt{64 + 64} = \sqrt{128}$
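Key Point 10 translates into a one-line calculation. The sketch below uses Python's standard math.hypot function, which returns the square root of the sum of the squares of its arguments; the function name distance is an illustrative choice.

```python
import math

def distance(a, b):
    """Distance between points a = (x1, y1) and b = (x2, y2)."""
    (x1, y1), (x2, y2) = a, b
    return math.hypot(x2 - x1, y2 - y1)

print(distance((-7, 11), (1, 3)))   # the points A and B of the Task
```

The result is a decimal approximation to the exact answer found above.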
Exercises
1. Find the distance between the points (4, 5) and (−17, 1).
2. Find the distance between the points (−4, −5) and (1, 7).
Answers
1. $\sqrt{457}$
2. 13

The Circle 2.6
Introduction
A circle is one of the most familiar geometrical figures and has been around a long time! In this
brief Section we discuss the basic coordinate geometry of a circle - in particular the basic equation
representing a circle in terms of its centre and radius.

Prerequisites
Before starting this Section you should . . .
• understand what is meant by a function and be able to use functional notation
• be able to plot graphs of functions

Learning Outcomes
On completion you should be able to . . .
• obtain the equation of any given circle
• obtain the centre and radius of a circle from its equation

1. Equations for circles in the Oxy plane
The obvious characteristic of a circle is that every point on its circumference is the same distance
from the centre. This fixed distance is called the radius of the circle and is generally denoted by R
or r or a.
In coordinate geometry terms suppose (x, y) denotes the coordinates of a point. For example, (4,2)
means x = 4, y = 2, (−1, 1) means x = −1, y = 1 and so on. See Figure 26.
Figure 26 (the points (4, 2) and (−1, 1) plotted in the Oxy plane)
Example 14
Write down the distances d1 and d2 from the origin of the points with coordinates
(4,2) and (−1, 1) respectively. Generalise the result to obtain the distance d from
the origin of any arbitrary point with coordinates (x, y).
Solution
Using Pythagoras’ Theorem:
$d_1 = \sqrt{4^2 + 2^2} = \sqrt{20}$ is the distance between the origin (0,0) and the point (4,2).
$d_2 = \sqrt{(-1)^2 + 1^2} = \sqrt{2}$ is the distance between the origin and (−1, 1).
$d = \sqrt{x^2 + y^2}$ is the distance from the origin to an arbitrary point (x, y). Note that the positive
square root is taken in each case.
Circles with centre at the origin
Suppose (x, y) is any point P on a circle of radius R whose centre is at the origin. See Figure 27.
Figure 27 (a point P (x, y) on a circle of radius R centred at the origin)
Task
Using the final result of Example 14, write down an equation relating x, y and
R.
Your solution
Answer
Since $\sqrt{x^2 + y^2}$ is the distance of any point (x, y) from the origin, then for any point P on the above
circle,
$\sqrt{x^2 + y^2} = R$ or $x^2 + y^2 = R^2$
As the point P in Figure 27 moves around the circle its x and y coordinates change. However P will
remain at the same distance R from the origin by the very definition of a circle.
Hence we say that
$\sqrt{x^2 + y^2} = R$ or, more usually,
$x^2 + y^2 = R^2$ (1)
is the equation of the circle radius R centre at the origin. What this means is that if a point (x, y)
satisfies (1) then it lies on the circumference of the circle radius R. If (x, y) does not satisfy (1) then
it does not lie on that circumference.
Note carefully that the right-hand side of the circle equation (1) is the square of the radius.
Task
Consider the circle centre at the origin and of radius 5.
(a) Write down the equation of this circle.

(b) For the following points determine which lie on the circumference of
this circle, which lie inside the circle and which lie outside the circle.
(5, 0) (0, −5) (4, 3) (−3, −4) $(2, \sqrt{21})$ $(-2\sqrt{6}, 1)$ (1, 4) (4, −4)
Your solution
(a)
(b)
(x, y)               $x^2 + y^2$    conclusion
(5, 0)
(0, −5)
(4, 3)
(−3, −4)
$(2, \sqrt{21})$
$(-2\sqrt{6}, 1)$
(1, 4)
(4, −4)
Answer
(a) $x^2 + y^2 = 5^2 = 25$ is the equation of the circle.

(b) For each point (x, y) we calculate $x^2 + y^2$. If this equals 25 the point lies on the circle;
if greater than 25 then outside and if less than 25 then inside.
(x, y)               $x^2 + y^2$    conclusion
(5, 0)               25             on circle
(0, −5)              25             on circle
(4, 3)               25             on circle
(−3, −4)             25             on circle
$(2, \sqrt{21})$     25             on circle
$(-2\sqrt{6}, 1)$    25             on circle
(1, 4)               17             inside circle
(4, −4)              32             outside circle
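The test used in part (b) can be sketched in Python. The function name classify is an illustrative choice. Because points such as (2, √21) involve square roots that are stored only approximately, the sketch compares x² + y² with R² using a small relative tolerance; the tolerance value is an assumption.

```python
import math

def classify(point, radius):
    """Locate a point relative to the circle x^2 + y^2 = radius^2."""
    x, y = point
    d2 = x * x + y * y
    r2 = radius * radius
    if math.isclose(d2, r2, rel_tol=1e-9):
        return "on circle"
    return "outside circle" if d2 > r2 else "inside circle"

points = [(5, 0), (0, -5), (4, 3), (-3, -4), (2, math.sqrt(21)),
          (-2 * math.sqrt(6), 1), (1, 4), (4, -4)]
for p in points:
    print(p, classify(p, 5))
```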
Figure 28 demonstrates some of the results of the previous Task.
Figure 28 (the circle $x^2 + y^2 = 25$ with the points (1, 4), (4, 3), (5, 0), (−3, −4), (4, −4) and (0, −5) marked)
Note that the circle centre at the origin and of radius 1 has a special name – the unit circle.
Task
Calculate the distance between the points P1 (−1, 1) and P2 (4, 2).
[Diagram: the points (−1, 1) and (4, 2) plotted in the Oxy plane]
Your solution
Answer
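[Diagram: right-angled triangle with vertices P1, A and P2; the hypotenuse P1 P2 has length d]
Using Pythagoras’ Theorem the distance between the two given points is
$d = \sqrt{(P_1A)^2 + (AP_2)^2}$
where $P_1A = 4 - (-1) = 5$, $AP_2 = 2 - 1 = 1$
$\therefore d = \sqrt{5^2 + 1^2} = \sqrt{26}$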
Task
Generalise your result to the previous Task to obtain the distance between any two
points whose coordinates are (x1 , y1 ) and (x2 , y2 ).
Your solution
Answer
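[Diagram: right-angled triangle with vertices $P_1(x_1, y_1)$, A and $P_2(x_2, y_2)$; the hypotenuse P1 P2 has length d]
Between the arbitrary points P1 and P2 the distance is
$d = \sqrt{(AP_2)^2 + (P_1A)^2}$
where $AP_2 = x_2 - x_1$, $P_1A = y_1 - y_2 = -(y_2 - y_1)$
so $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$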
Using the result of the last Task, we now consider a circle centre at the point $C(x_0, y_0)$ and of radius
R. Suppose P is an arbitrary point on this circle which has coordinates (x, y):
Figure 29 (the point P (x, y) on a circle of radius R with centre $C(x_0, y_0)$)
Clearly $R = CP = \sqrt{(x - x_0)^2 + (y - y_0)^2}$
Hence, squaring both sides,
$(x - x_0)^2 + (y - y_0)^2 = R^2$ (2)
which is said to be the equation of the circle centre (x0 , y0 ) radius R.
Note that if x0 = y0 = 0 (i.e. circle centre is at origin) then (2) reduces to (1) so the latter is simply
a special case.
The interpretation of (2) is similar to that of (1): any point (x, y) satisfying (2) lies on the circum-
ference of the circle.
Example 15
What does the equation $(x - 3)^2 + (y - 4)^2 = 4$ represent?
Solution
It represents a circle of radius 2 (the positive square root of 4) and has centre C (3, 4).
N.B. There is no need to expand the terms on the left-hand side of the equation here. The given
form reveals quite plainly the radius ($\sqrt{4}$) and centre (3, 4) of the circle.
Task
Write down the equations of each of the following circles for which the centre C
and radius R are given:
(a) C(0, 2), R = 2

(b) C(−2, 0), R = 3
(c) C(−3, 4), R = 5
(d) C(1, 1), $R = \sqrt{3}$
Your solution
(a)
(b)
(c)
(d)
Answer
(a) $x_0 = 0$, $y_0 = 2$, $R^2 = 4$ so by Equation (2) the circle’s equation is
$x^2 + (y - 2)^2 = 4$
(b) $x_0 = -2$, $y_0 = 0$, $R^2 = 9$ $\therefore (x + 2)^2 + y^2 = 9$
(c) $(x + 3)^2 + (y - 4)^2 = 25$
(d) $(x - 1)^2 + (y - 1)^2 = 3$
Again we emphasise that the right-hand side of each of these equations is the square of the radius.
Task
Write down the equations of each of the circles shown below:
[Diagram: four circles, labelled (a) to (d), drawn in the Oxy plane]
Your solution
Answer
(a) $x^2 + y^2 = 4$ (centre (0, 0) i.e. the origin, radius 2)
(b) $(x - 1)^2 + y^2 = 1$ (centre (1, 0), radius 1)
(c) $(x - 3)^2 + (y - 3)^2 = 9$ (centre (3, 3), radius 3)
(d) $(x + 1)^2 + (y + 1)^2 = 1$ (centre (−1, −1), radius 1)
Consider again the equation of the circle, centre (3,4) of radius 2:
$(x - 3)^2 + (y - 4)^2 = 4$ (3)
In this form of the equation the centre and radius of the circle can be clearly identified and, as we
said, there is no advantage in squaring out. However, if we did square out the equation would become
$x^2 - 6x + 9 + y^2 - 8y + 16 = 4$ or $x^2 - 6x + y^2 - 8y + 21 = 0$ (4)
Equation (4) is of course a valid equation for this circle but we cannot immediately obtain the centre
and radius from it.
Task
For the case of the general circle of radius R
$(x - x_0)^2 + (y - y_0)^2 = R^2$
expand out the square terms and simplify.
Your solution
Answer
We obtain
$x^2 - 2x_0 x + x_0^2 + y^2 - 2y_0 y + y_0^2 - R^2 = 0$
or
$x^2 + y^2 - 2x_0 x - 2y_0 y + c = 0$
where the constant $c = x_0^2 + y_0^2 - R^2$.
It follows from the above task that any equation of the form
$x^2 + y^2 - 2gx - 2fy + c = 0$ (5)
represents a circle with centre (g, f ) and a radius obtained by solving
$c = g^2 + f^2 - R^2$
for R.
Thus
$R = \sqrt{g^2 + f^2 - c}$ (6)
There is no need to remember Equation (6). In any specific problem the technique of completing
the square can be used to turn an equation of the form (5) into the form of Equation (2) (i.e.
$(x - x_0)^2 + (y - y_0)^2 = R^2$) and hence obtain the centre and radius of the circle.
NB. The key point about Equation (5) is that the coefficients of the terms $x^2$ and $y^2$ are the same,
i.e. 1. An equation in which the coefficients of $x^2$ and $y^2$ are identical with value $k \neq 1$ could be converted
into the form (5) by division of the whole equation by k.
Task
If
$x^2 + y^2 - 2x + 10y + 16 = 0$
obtain the centre and radius of the circle that this equation represents.
Begin by completing the square separately on the x−terms and the y−terms:
Your solution
Answer
$x^2 - 2x = (x - 1)^2 - 1$
$y^2 + 10y = (y + 5)^2 - 25$
Now complete the problem:
Your solution
Answer
The original equation
$x^2 + y^2 - 2x + 10y + 16 = 0$
becomes
$(x - 1)^2 - 1 + (y + 5)^2 - 25 + 16 = 0$
$\therefore (x - 1)^2 + (y + 5)^2 = 10$
which represents a circle with centre (1, −5) and radius $\sqrt{10}$.
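As a check on working of this kind, the centre and radius can also be read off from Equations (5) and (6). The following Python sketch is one way to do so; the function name circle_from_general is hypothetical. It takes g, f and c as they appear in $x^2 + y^2 - 2gx - 2fy + c = 0$, so the coefficient of x must be halved and its sign changed to give g, and similarly for f.

```python
import math

def circle_from_general(g, f, c):
    """Centre and radius of x^2 + y^2 - 2gx - 2fy + c = 0 (Equation (5))."""
    r_squared = g * g + f * f - c
    if r_squared <= 0:
        raise ValueError("no circle: g^2 + f^2 - c must be positive")
    return (g, f), math.sqrt(r_squared)

# x^2 + y^2 - 2x + 10y + 16 = 0 has -2g = -2 and -2f = 10, so g = 1 and f = -5
centre, radius = circle_from_general(1, -5, 16)
print(centre, radius)
```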
Circles and functions
Let us return to the equation of the unit circle
$x^2 + y^2 = 1$
Solving for y we obtain
$y = \pm\sqrt{1 - x^2}$.
This equation does not represent a function because of the two possible square roots which imply
that for any value of x between −1 and 1 there are two values of y. (You will recall from earlier in this Workbook that
a function requires only one value of the dependent variable y corresponding to each value of the
independent variable x.)
However two functions can be obtained in this case: $y = y_1 = +\sqrt{1 - x^2}$ and $y = y_2 = -\sqrt{1 - x^2}$,
whose graphs are the semicircles shown.
Figure 30 (two panels: the upper semicircle $y = +\sqrt{1 - x^2}$ and the lower semicircle $y = -\sqrt{1 - x^2}$, each for −1 ≤ x ≤ 1)
2. Annuli between circles

Equations in x and y, such as (1) i.e. $x^2 + y^2 = R^2$ and (2) i.e. $(x - x_0)^2 + (y - y_0)^2 = R^2$ for
circles, define curves in the Oxy plane. However, inequalities are necessary to define regions. For
example, the inequality
$x^2 + y^2 < 1$
is satisfied by all points inside the unit circle - for example (0, 0), $(0, \frac{1}{2})$, $(\frac{1}{4}, 0)$, $(\frac{1}{2}, \frac{1}{2})$.
Similarly $x^2 + y^2 > 1$ is satisfied by all points outside that circle such as (1, 1).
Figure 31 (the unit circle $x^2 + y^2 = 1$, with the region $x^2 + y^2 < 1$ inside it and the region $x^2 + y^2 > 1$ outside it)
Example 16
Sketch the regions in the Oxy plane defined by
(a) $(x - 1)^2 + y^2 < 1$ (b) $(x - 1)^2 + y^2 > 1$
Solution
The equality $(x - 1)^2 + y^2 = 1$ is satisfied by any point on the circumference of the circle centre
(1,0) radius 1. Then, remembering that $(x - 1)^2 + y^2$ is the square of the distance between any
point (x, y) and (1,0), it follows that
(a) $(x - 1)^2 + y^2 < 1$ is satisfied by any point inside this circle (region (A) in the diagram.)
(b) $(x - 1)^2 + y^2 > 1$ defines the region exterior to the circle since this inequality is satisfied
by every point outside. (Region (B) on the diagram.)
[Diagram: the circle centre (1, 0) radius 1, with region (A) inside it and region (B) outside it; the x axis is marked at 0, 1 and 2]
The region between two circles with the same centre (i.e. concentric circles) is called an annulus
or annular region. An annulus is defined by two inequalities. For example the inequality
$x^2 + y^2 > 1$ (7)
defines, as we saw, the region outside the unit circle.
The inequality
$x^2 + y^2 < 4$ (8)
defines the region inside the circle centre origin radius 2.
Hence points (x, y) which satisfy both the inequalities (7) and (8) lie in the annulus between the
two circles. The inequalities (7) and (8) are combined by writing
$1 < x^2 + y^2 < 4$
Figure 32 (the annulus $1 < x^2 + y^2 < 4$ between the circles of radius 1 and radius 2 centred at the origin)
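A double inequality of this kind is easy to test for a given point. The Python sketch below checks whether a point lies strictly between two concentric circles; the function name in_annulus and its default arguments (centre at the origin, radii 1 and 2, as in inequalities (7) and (8)) are illustrative assumptions.

```python
def in_annulus(x, y, centre=(0, 0), inner=1, outer=2):
    """True if (x, y) lies strictly between the two concentric circles."""
    x0, y0 = centre
    d2 = (x - x0) ** 2 + (y - y0) ** 2   # squared distance from the centre
    return inner ** 2 < d2 < outer ** 2

print(in_annulus(1.5, 0))   # between the two circles
print(in_annulus(0.5, 0))   # inside the unit circle
print(in_annulus(2, 2))     # outside the larger circle
```

Supplying centre=(1, 0), inner=1 and outer=3 gives the corresponding test for an annulus centred at (1, 0) with radii 1 and 3.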
Task
Sketch the annulus defined by the inequalities
$1 < (x - 1)^2 + y^2 < 9$
Your solution
Answer
The quantity $(x - 1)^2 + y^2$ is the square of the distance of a point (x, y) from the point (1,0).
Hence, as we saw earlier, the left-hand inequality
$1 < (x - 1)^2 + y^2$ which is the same as $(x - 1)^2 + y^2 > 1$
is the region exterior to the circle C1 centre (1, 0) radius 1.
Similarly the right-hand inequality
$(x - 1)^2 + y^2 < 9$
defines the interior of the circle C2 centre (1, 0) radius 3. Hence the double inequality holds for any
point in the annulus between C1 and C2 .
[Diagram: the concentric circles C1 and C2 centred at (1, 0); the x axis is marked at −2, 0, 1, 2 and 4]
Exercises
1. Write down the radius and the coordinates of the centre of the circle for each of the following
equations
(a) $x^2 + y^2 = 16$
(b) $(x - 4)^2 + (y - 3)^2 = 12$
(c) $(x + 3)^2 + (y - 1)^2 = 25$
(d) $x^2 + (y + 1)^2 - 4 = 0$
(e) $(x + 6)^2 + y^2 - 36 = 0$
2. Obtain in each case the equation of the given circle
(a) centre C (0, 0) radius 7

(b) centre C (0, 2) radius 2
(c) centre C (4, −4) radius 4
(d) centre C (−2, −2) radius 4
(e) centre C (−6, 0) radius 5
3. Obtain the radius and the coordinates of the centre for each of the following circles
(a) $x^2 + y^2 - 10x + 12y = 0$
(b) $x^2 + y^2 + 2x - 4y = 11$
(c) $x^2 + y^2 - 6x - 16 = 0$
4. Describe the regions defined by each of these inequalities
(a) $x^2 + y^2 > 4$
(b) $x^2 + y^2 < 16$
(c) the inequalities in (a) and (b) together
5. State an inequality that describes the points that lie outside the circle of radius 4 with centre
(−4, 2).
6. State an inequality that describes the points that lie inside the circle of radius $\sqrt{6}$ with centre
(−2, −1).
7. Obtain the equation of the circle which has centre (3, 4) and which passes through the point
(0, 5).
8. Show that if $A(x_1, y_1)$ and $B(x_2, y_2)$ are at opposite ends of a diameter of a circle then the
equation of the circle is $(x - x_1)(x - x_2) + (y - y_1)(y - y_2) = 0$.
(Hint: if P is any point on the circle obtain the slopes of the lines AP and BP and recall that
the angle in a semicircle must be a right-angle.)
9. State the equation of the unique circle which touches the x−axis at the point (2,0) and which
passes through the point (−1, 9).
Answers
1. (a) radius 4 centre (0, 0)
(b) radius $\sqrt{12}$ centre (4, 3)
(c) radius 5 centre (−3, 1)
(d) radius 2 centre (0, −1)
(e) radius 6 centre (−6, 0)
2. (a) $x^2 + y^2 = 49$
(b) $x^2 + (y - 2)^2 = 4$
(c) $(x - 4)^2 + (y + 4)^2 = 16$
(d) $(x + 2)^2 + (y + 2)^2 = 16$
(e) $(x + 6)^2 + y^2 = 25$
3. (a) centre (5, −6) radius $\sqrt{61}$
(b) centre (−1, 2) radius 4
(c) centre (3,0) radius 5
4. (a) the region outside the circumference of the circle centre the origin radius 2.
(b) the region inside the circle centre the origin radius 4 (often referred to as a circular disc)
(c) the annular ring between these two circles.
5. $(x + 4)^2 + (y - 2)^2 > 16$
6. $(x + 2)^2 + (y + 1)^2 < 6$
7. $(x - 3)^2 + (y - 4)^2 = 10$
8. $(x - x_1)(x - x_2) + (y - y_1)(y - y_2) = 0$.
9. $(x - 2)^2 + (y - 5)^2 = 25$ (Note: since we are told the circle touches the x−axis at (2,0) the
centre of the circle must be at the point $(2, y_0)$ where $y_0 = R$).
Some Common Functions 2.7
Introduction
This Section provides a catalogue of some common functions often used in Science and Engineering.
These include polynomials, rational functions, the modulus function and the unit step function.
Important properties and definitions are stated. This Section can be used as a reference when the
need arises. There are, of course, other types of function which arise in engineering applications,
such as trigonometric, exponential and logarithm functions. These others are dealt with in Workbooks 4
to 6.

Prerequisites
Before starting this Section you should . . .
• understand what is meant by a function and use functional notation
• be able to plot graphs of functions

Learning Outcomes
• state what is meant by a polynomial function, and a rational function
• use and graph the modulus function
• use and graph the unit step function
1. Polynomial functions
A very important type of function is the polynomial. Polynomial functions are made up of multiples
of non-negative whole number powers of a variable, such as $3x^2$, $-7x^3$ and so on. You are already
familiar with many such functions. Other examples include:
$P_0(t) = 6$
$P_1(t) = 3t + 9$ (The linear function you have already met).
$P_2(x) = 3x^2 - x + 2$
$P_4(z) = 7z^4 + z^2 - 1$
where t, x and z are independent variables.

Note that fractional and negative powers of the independent variable are not allowed so that $f(x) = x^{-1}$
and $g(x) = x^{3/2}$ are not polynomials. The function $P_0(t) = 6$ is a polynomial - we can regard
it as $6t^0$.
By convention a polynomial is written with the powers either increasing or decreasing. For example
the polynomial
$3x + 9x^2 - x^3 + 2$
would be written as
$-x^3 + 9x^2 + 3x + 2$ or $2 + 3x + 9x^2 - x^3$
In general we have the following definition:
Key Point 11
A polynomial expression has the form
$a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \ldots + a_2 x^2 + a_1 x + a_0$
where n is a non-negative integer, $a_n, a_{n-1}, \ldots, a_1, a_0$ are constants and x is a variable.
A polynomial function P (x) has the form
$P(x) = a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \ldots + a_2 x^2 + a_1 x + a_0$
The degree of a polynomial or polynomial function is the value of the highest power. Referring to
the examples listed above, polynomial $P_2$ has degree 2, because the term with the highest power
is $3x^2$, $P_4$ has degree 4, $P_1$ has degree 1 and $P_0$ has degree 0. Polynomials with low degrees have
special names given in Table 5.
Table 5
polynomial                          degree   name
$a$                                 0        constant
$ax + b$                            1        linear
$ax^2 + bx + c$                     2        quadratic
$ax^3 + bx^2 + cx + d$              3        cubic
$ax^4 + bx^3 + cx^2 + dx + e$       4        quartic
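A polynomial is determined by its list of coefficients, which makes the degree and the value at a point easy to compute. The Python sketch below stores the coefficients with the highest power first and evaluates the polynomial by Horner's scheme (a standard nested-multiplication method, not one described in this Workbook). The names NAMES, degree and evaluate are illustrative, and the sketch treats the zero polynomial as having degree 0, which is a simplifying assumption.

```python
NAMES = {0: "constant", 1: "linear", 2: "quadratic", 3: "cubic", 4: "quartic"}

def degree(coeffs):
    """Degree of a polynomial given as [a_n, ..., a_1, a_0] (highest power first)."""
    stripped = list(coeffs)
    while len(stripped) > 1 and stripped[0] == 0:
        stripped.pop(0)          # ignore leading zero coefficients
    return len(stripped) - 1

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0
    for a in coeffs:
        result = result * x + a
    return result

p2 = [3, -1, 2]          # P2(x) = 3x^2 - x + 2
p4 = [7, 0, 1, 0, -1]    # P4(z) = 7z^4 + z^2 - 1
print(degree(p2), NAMES[degree(p2)], evaluate(p2, 2))
print(degree(p4), NAMES[degree(p4)], evaluate(p4, 1))
```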
Typical graphs of some polynomial functions are shown in Figure 30. In particular, observe that the
graphs of the linear polynomials, $P_1$ and $Q_1$, are straight lines.
Figure 30: Graphs of some typical linear, quadratic and cubic polynomials (panels: $P_1(x) = 2x + 3$ and $Q_1(x) = -x + 4$; $P_2(x) = x^2 + 3$ and $Q_2(x) = -x^2 + 2x$; $P_3(x) = x^3$ and $Q_3(x) = -x^3 + 7x - 6$)
Task
Which of the polynomial graphs in Figure 30 are odd and which are even? Are
any periodic ?
Your solution
Answer
P2 is even. P3 is odd. None are periodic.
Task
State which of the following are polynomial functions. For those that are, state
the degree and name.
(a) $f(x) = 6x^2 + 7x^3 - 2x^4$ (b) $f(t) = t^3 - 3t^2 + 7$
(c) $g(x) = \frac{1}{x^2} + \frac{3}{x}$ (d) f (x) = 16 (e) $g(x) = \frac{1}{6}$
Your solution
Answer
(a) polynomial of degree 4 (quartic), (b) polynomial of degree 3 (cubic), (c) not a polynomial,
(d) polynomial of degree 0 (constant), (e) polynomial of degree 0 (constant)
Exercises
1. Write down a polynomial of degree 3 with independent variable t.
2. Write down a function which is not a polynomial.
3. Explain why $y = 1 + x + x^{1/2}$ is not a polynomial.
4. State the degree of the following polynomials: (a) $P(t) = t^4 + 7$, (b) $P(t) = -t^3 + 3$,
(c) P (t) = 11, (d) P (t) = t
5. Write down a polynomial of degree 0 with independent variable z.
6. Referring to Figure 30, state which functions are one-to-one and which are many-to-one.
Answers
1. For example $f(t) = 1 + t + 3t^2 - t^3$.
2. For example $y = \frac{1}{x}$.
3. A term such as $x^{1/2}$, with a fractional index, is not allowed in a polynomial.
4. (a) 4, (b) 3, (c) 0, (d) 1.
5. P (z) = 13, for example.
6. P1 , Q1 and P3 are one-to-one. The rest are many-to-one.
2. Rational functions
A rational function is formed by dividing one polynomial by another. Examples include
$R_1(x) = \frac{x + 6}{x^2 + 1}$, $R_2(t) = \frac{t^3 - 1}{2t + 3}$, $R_3(z) = \frac{2z^2 + z - 1}{z^2 + z - 2}$
For convenience we have labelled these rational functions R1 , R2 and R3 .
Key Point 12
A rational function has the form
$R(x) = \frac{P(x)}{Q(x)}$
where P and Q are polynomial functions.
P is called the numerator and Q is called the denominator.
The graphs of rational functions can take a variety of different forms and can be difficult to plot by
hand. Use of a graphics calculator or computer software can help. If you have access to a plotting
package or calculator it would be useful to obtain graphs of these functions for yourself. The next
Example and two Tasks allow you to explore some of the features of the graphs.
Example 17
Given the rational function $R_1(x) = \frac{x + 2}{x^2 + 1}$ and its graph shown in Figure 31
answer the following questions.
Figure 31: Graph of $R_1(x) = \frac{x + 2}{x^2 + 1}$
(a) For what values of x, if any, is the denominator zero?
(b) For what values of x, if any, is the denominator negative?
(c) For what values of x is the function negative?
(d) What is the value of the function when x is zero?
(e) What happens to the function as x gets larger and larger?
Solution
(a) $x^2 + 1$ is never zero
(b) $x^2 + 1$ is never negative, it is always positive
(c) only when the numerator x + 2 is negative which is when x is less than −2
(d) 2, because when x = 0 the numerator x + 2 equals 2 and the denominator $x^2 + 1$ equals 1
(e) $R_1$ approaches zero because the $x^2$ term in the denominator becomes very large. (This is seen
by substituting larger and larger values e.g. 10, 100, 1000 . . . )
Note that for large x values the graph gets closer and closer to the x axis. We say that the x axis is
a horizontal asymptote of this graph.
Answering questions such as (a) to (c) above will help you to sketch graphs of rational functions.
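The substitution suggested in part (e) can also be tried numerically. The following is a minimal Python sketch, not part of the Workbook; the function name `r1` is chosen here purely for illustration. It evaluates the function from Example 17 at a few values of x so that the shrinking values can be seen directly.

```python
def r1(x):
    """R1(x) = (x + 2) / (x^2 + 1), the function from Example 17."""
    return (x + 2) / (x**2 + 1)


# Tabulate R1 for larger and larger x; the values get closer to zero.
for x in [0, 10, 100, 1000]:
    print(x, r1(x))
```

The same function can be used to confirm parts (c) and (d): `r1(-3)` is negative and `r1(0)` equals 2.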
Task
Study the graph and the algebraic form of the function $R_2(t) = \frac{t^3-1}{2t+3}$ carefully
and answer the following questions. The following figure shows its graph (the solid
curve). The dotted line is an asymptote.
[Figure: Graph of $R_2(t) = \frac{t^3-1}{2t+3}$, with t from −10 to 10 on the horizontal axis and −10 to 10 on the vertical axis]
(a) What is the function value when t = 1?

(b) What is the value of the denominator when t = −3/2?
(c) What do you think happens to the graph of the function when t = −3/2?
Your solution
(a)
(b)
(c)
Answer
(a) 0,
(b) 0,
(c) The function value tends to infinity, the graph becomes infinite.
Note from the answers to parts (b) and (c) that we must exclude the value t = −3/2 from the
domain of this function because division by zero is not defined. At this point, as you can see, the
graph shoots off towards very large positive values (we say it tends to positive infinity) if the point is
approached from the left, and towards very large negative values (we say it tends to negative infinity)
if the point is approached from the right. The dotted line in the graph of $R_2(t)$ has equation $t = -\frac{3}{2}$.
It is approached by the curve as t approaches $-\frac{3}{2}$ and is known as a vertical asymptote.
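The behaviour on either side of the vertical asymptote can be checked numerically. The sketch below is illustrative only (the name `r2` and the step sizes are assumptions made here); it evaluates the function just to the left and just to the right of t = −3/2.

```python
def r2(t):
    """R2(t) = (t^3 - 1) / (2t + 3); not defined at t = -3/2."""
    return (t**3 - 1) / (2 * t + 3)


pole = -1.5
for h in [0.1, 0.01, 0.001]:
    left = r2(pole - h)    # a point just to the left of the asymptote
    right = r2(pole + h)   # a point just to the right of the asymptote
    print(h, left, right)
```

As h shrinks, the left-hand values grow large and positive while the right-hand values grow large and negative, in line with the description above.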
Task
Study the graph and the algebraic form of the function $R_3(z) = \frac{2z^2+z-1}{(z-1)(z+2)}$
carefully and try to answer the following questions. The graph of $R_3(z)$ is shown
in the following figure.
[Figure: Graph of $R_3(z) = \frac{2z^2+z-1}{(z-1)(z+2)}$, with z from −5 to 5 on the horizontal axis and −10 to 10 on the vertical axis]
(a) What is happening to the graph when z = −2 and when z = 1?

(b) Which values should be excluded from the domain of this function?
(c) Substitute some values for z (e.g. 10, 100 . . .). What happens to R3 as z gets large?
(d) Is there a horizontal asymptote?
(e) What is the name given to the vertical lines z = 1 and z = −2?
Your solution
Answer
(a) denominator is zero, R3 tends to infinity,
(b) z = −2 and z = 1,
(c) R3 approaches the value 2,
(d) y = 2 is a horizontal asymptote,
(e) vertical asymptotes
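Part (c) of this Task can be explored with a few lines of Python. This is a minimal sketch under the assumption that the function is exactly as printed in the Task; the name `r3` is hypothetical.

```python
def r3(z):
    """R3(z) = (2z^2 + z - 1) / ((z - 1)(z + 2)); not defined at z = 1 or z = -2."""
    return (2 * z**2 + z - 1) / ((z - 1) * (z + 2))


# Substitute larger and larger values of z, as suggested in part (c).
for z in [10, 100, 1000]:
    print(z, r3(z))
```

The printed values settle towards 2, which is consistent with the horizontal asymptote y = 2 stated in the answer to part (d).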
The previous Examples are intended to give you some guidance so that you will be able to sketch
rational functions yourself. Each function must be looked at individually but some general guidelines
are given in Key Point 13.
Key Point 13
Sketching rational functions
• Find the value of the function when the independent variable is zero. This is generally easy
to evaluate and gives you a point on the graph.
• Find values of the independent variable which make the denominator zero. These values must
be excluded from the domain of the function and give rise to vertical asymptotes.
• Find values of the independent variable which make the dependent variable zero. This gives
you points where the graph cuts the horizontal axis (if at all).
• Study the behaviour of the function when x is large and positive and when it is large and
negative.
• Are there any vertical or horizontal asymptotes? (Oblique asymptotes may also occur but
these are beyond the scope of this Workbook.)
It is particularly important for engineers to find values of the independent variable for which the
denominator is zero. These values are known as the poles of the rational function.
Task
State the poles of the following rational functions:
(a) $f(t) = \frac{t-3}{t+7}$ (b) $F(s) = \frac{s+7}{(s+3)(s-3)}$ (c) $r(x) = \frac{2x+5}{(x+1)(x+2)}$
(d) $f(x) = \frac{x-1}{x^2-1}$.
In each case locate the poles by finding values of the independent variable which make the denominator
zero:
Your solution
Answer
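(a) −7, (b) 3 or −3, (c) −1 or −2, (d) x = −1. In (d) the denominator is also zero at x = 1, but the
factor x − 1 cancels with the numerator, leaving 1/(x + 1) for x ≠ 1, so x = 1 is not a pole.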
If you have access to a plotting package, plot these functions now.
Exercises
1. Explain what is meant by a rational function.
2. State the degree of the numerator and the degree of the denominator of the rational function
$R(x) = \frac{3x^2 + x + 1}{x - 1}$.
3. Explain the term ‘pole’ of a rational function.
4. Referring to the graphs of R1 (x), R2 (t) and R3 (z) (on pages 66 - 68), state which functions,
if any, are one-to-one and which are many-to-one.
5. Without using a graphical calculator plot graphs of $y = \frac{1}{x}$ and $y = \frac{1}{x^2}$. Comment upon
whether these graphs are odd, even or neither, whether they are continuous or discontinuous,
and state the position of any poles.
Answers
1. R(x) = P (x)/Q(x) where P and Q are polynomials.
2. numerator: 2, denominator: 1
3. The pole is a value of the independent variable which makes the denominator zero.
4. All are many-to-one.
5. $\frac{1}{x}$ is odd, and discontinuous. Pole at x = 0. $\frac{1}{x^2}$ is even and discontinuous. Pole at x = 0.
3. The modulus function

The modulus of a number is the size of that number with no regard paid to its sign. For example the
modulus of −7 is 7. The modulus of +7 is also 7. We can write this concisely using the modulus
sign | |. So we can write | − 7| = 7 and | + 7| = 7. The modulus function is defined as follows:
Key Point 14
Modulus Function
The modulus function is defined as
\[ f(x) = |x| = \begin{cases} x & x \ge 0 \\ -x & x < 0 \end{cases} \]
The output from the function in Key Point 14 is simply the modulus of the input.
A graph of this function is shown in Figure 32.
Figure 32: Graph of the modulus function $f(x) = |x|$
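The two-part definition in Key Point 14 translates directly into code. The sketch below is illustrative (the function name `modulus` is an assumption made here); it also compares the result with Python's built-in `abs`, which computes the same quantity.

```python
def modulus(x):
    """Two-part definition of |x| as in Key Point 14."""
    if x >= 0:
        return x
    return -x


for x in [-7, 0, 7]:
    # The built-in abs() is printed alongside for comparison.
    print(x, modulus(x), abs(x))
```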
Task
Draw up a table of values of the function f (x) = |x − 2| for values of x between
−3 and 5. Sketch a graph of this function.
Your solution
The table has been started. Complete it for yourself.
x      −3   −2   −1   0   1   2   3   4   5
f(x)    5         3   2       0
Some points on the graph are shown in the figure. Plot your calculated points on the graph.
[Figure: axes for the graph of $f(x) = |x - 2|$, with x from −3 to 5 and f(x) up to 5]
Exercises
1. Sketch a graph of the following functions:
(a) f (x) = 3|x|, (b) f (x) = |x + 1|, (c) f (x) = 7|x − 3|.
2. Is the modulus function one-to-one or many-to-one?
Answers 2. Many-to-one
4. The unit step function

The unit step function is defined as follows:
Key Point 15
The unit step function u(t) is defined as:
\[ u(t) = \begin{cases} 1 & t \ge 0 \\ 0 & t < 0 \end{cases} \]
Study this definition carefully. You will see that it is defined in two parts, with one expression to be
used when t is greater than or equal to 0, and another expression to be used when t is less than 0.
The graph of this function is shown in Figure 33. Note that the part of u(t) for which t < 0 lies on
the t-axis but, for clarity, is shown as a distinct dashed line.
Figure 33: Graph of the unit step function u(t)

There is a jump, or discontinuity in the graph when t = 0. That is why we need to define the function
in two parts; one part for when t is negative, and one part for when t is non-negative. The point
with coordinates (0,1) is part of the function defined on t ≥ 0.
The position of the discontinuity may be shifted to the left or right. The graph of u(t − d) is shown
in Figure 34.
Figure 34: Graph of u(t − d).

In the previous two figures the function takes the value 0 or 1. We can adjust the value 1 by
multiplying the function by any other number we choose. The graph of 2u(t − 3) is shown in Figure
35.
Figure 35: Graph of 2u(t − 3)
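Shifting and scaling the unit step can be expressed compactly in code. The following is a minimal sketch; the helper `scaled_shifted_step` and its parameter names `d` and `k` are hypothetical and are introduced only to mirror the expressions u(t − d) and 2u(t − 3) above.

```python
def u(t):
    """Unit step: 1 when t >= 0, otherwise 0 (Key Point 15)."""
    return 1 if t >= 0 else 0


def scaled_shifted_step(t, d=0, k=1):
    """Evaluate k * u(t - d): a step of height k located at t = d."""
    return k * u(t - d)


# Sample 2u(t - 3) on either side of the jump at t = 3.
for t in [2, 2.9, 3, 4]:
    print(t, scaled_shifted_step(t, d=3, k=2))
```

Note that the value at the jump itself follows the definition: the point t = d belongs to the upper part of the step.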
Exercises
Sketch graphs of the following functions:
1. u(t),
2. −u(t),
3. u(t − 1),
4. u(t + 1),
5. u(t − 3) − u(t − 2),
6. 3u(t),
7. −2u(t − 3).
Answers
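(1) u(t): the graph lies on the t-axis for t < 0 and at height 1 for t ≥ 0.
(2) −u(t): the graph lies on the t-axis for t < 0 and at height −1 for t ≥ 0.
(3) u(t − 1): the graph lies on the t-axis for t < 1 and at height 1 for t ≥ 1.
(4) u(t + 1): the graph lies on the t-axis for t < −1 and at height 1 for t ≥ −1.
(5) u(t − 3) − u(t − 2): the graph lies on the t-axis for t < 2, at height −1 for 2 ≤ t < 3, and on the t-axis again for t ≥ 3.
(6) 3u(t): the graph lies on the t-axis for t < 0 and at height 3 for t ≥ 0.
(7) −2u(t − 3): the graph lies on the t-axis for t < 3 and at height −2 for t ≥ 3.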
Contents 3
Equations, Inequalities
& Partial Fractions
3.1 Solving Linear Equations 2
3.2 Solving Quadratic Equations 13
3.3 Solving Polynomial Equations 31
3.4 Solving Simultaneous Linear Equations 42
3.5 Solving Inequalities 50
3.6 Partial Fractions 60
Learning outcomes
In this Workbook you will learn about solving single equations, mainly linear and quadratic,
but also cubic and higher degree, and also simultaneous linear equations. Such equations
often arise as part of a more complicated problem. In order to gain confidence in
mathematics you will need to be thoroughly familiar with these basic topics.
You will also study how to manipulate inequalities. You will also be introduced to partial
fractions which will enable you to re-express an algebraic fraction in terms of simpler
fractions. This will prove to be extremely useful in later studies on integration.
Solving Linear
Equations 3.1
Introduction
Many problems in engineering reduce to the solution of an equation or a set of equations. An equation
is a type of mathematical expression which contains one or more unknown quantities which you will
be required to find. In this Section we consider a particular type of equation which contains a single
unknown quantity, and is known as a linear equation. Later Sections will describe techniques for
solving other types of equations.

Prerequisites
Before starting this Section you should . . .
• . . . fractions
• be able to transpose formulae
Learning Outcomes
On completion you should be able to . . .
• recognise and solve a linear equation


2 HELM (2006):
Workbook 3: Equations, Inequalities & Partial Fractions
1. Linear equations
Key Point 1
A linear equation is an equation of the form
\[ ax + b = 0, \qquad a \neq 0 \]
where a and b are known numbers and x represents an unknown quantity to be found.
In the equation ax + b = 0, the number a is called the coefficient of x, and the number b is called
the constant term.
The following are examples of linear equations
\[ 3x + 4 = 0, \qquad -2x + 3 = 0, \qquad -\tfrac{1}{2}x - 3 = 0 \]
Note that the unknown, x, appears only to the first power, that is as x, and not as $x^2$, $\sqrt{x}$, $x^{1/2}$ etc.
Linear equations often appear in a non-standard form, and also different letters are sometimes used
for the unknown quantity. For example
\[ 2x = x + 1, \qquad 3t - 7 = 17, \qquad 13 = 3z + 1, \qquad 1 - \tfrac{1}{2}y = 3, \qquad 2\alpha - 1.5 = 0 \]
are all examples of linear equations. Where necessary the equations can be rearranged and written
in the form ax + b = 0. We will explain how to do this later in this Section.
Task
Which of the following are linear equations and which are not linear?
(a) $3x + 7 = 0$, (b) $-3t + 17 = 0$, (c) $3x^2 + 7 = 0$, (d) $5p = 0$
The equations which can be written in the form ax + b = 0 are linear.

Your solution
(a) (b) (c) (d)
Answer
(a) linear in x (b) linear in t (c) non-linear - quadratic in x (d) linear in p, constant is zero
To solve a linear equation means to find the value of x that can be substituted into the equation so
that the left-hand side equals the right-hand side. Any such value obtained is known as a solution
or root of the equation and the value of x is said to satisfy the equation.
Example 1
Consider the linear equation 3x − 2 = 10.
(a) Check that x = 4 is a solution.

(b) Check that x = 2 is not a solution.
Solution
(a) To check that x = 4 is a solution we substitute the value for x and see if both sides of the
equation are equal. Evaluating the left-hand side we find 3(4) − 2 which equals 10, the same
as the right-hand side. So, x = 4 is a solution. We say that x = 4 satisfies the equation.
(b) Substituting x = 2 into the left-hand side we find 3(2) − 2 which equals 4. Clearly the
left-hand side is not equal to 10 and so x = 2 is not a solution. The number x = 2 does not
satisfy the equation.
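Checking a proposed solution by substitution is easy to automate. The sketch below is a minimal illustration using the equation of Example 1; the names `lhs`, `RHS` and `is_solution` are assumptions made here, not notation from the Workbook.

```python
def lhs(x):
    """Left-hand side of the equation 3x - 2 = 10 from Example 1."""
    return 3 * x - 2


RHS = 10  # right-hand side of the same equation


def is_solution(x):
    """True when substituting x makes both sides equal."""
    return lhs(x) == RHS


for candidate in (4, 2):
    print(candidate, lhs(candidate), is_solution(candidate))
```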
Task
Test which of the given values are solutions of the equation
18 − 4x = 26
(a) x = 2, (b) x = −2, (c) x = 8
(a) Substituting x = 2, the left-hand side equals

Your solution
Answer
18 − 4 × 2 = 10. But 10 ≠ 26 so x = 2 is not a solution.
(b) Substituting x = −2, the left-hand side equals:

Your solution
Answer
18 − 4(−2) = 26. This is the same as the right-hand side, so x = −2 is a solution.
(c) Substituting x = 8, the left-hand side equals:

Your solution
Answer
18 − 4(8) = −14. But −14 ≠ 26 and so x = 8 is not a solution.
Exercises
1. (a) Write down the general form of a linear equation.
(b) Explain what is meant by the root or solution of a linear equation.
In questions 2-8 verify that the given value is a solution of the given equation.
2. 3z − 7 = −28, z = −7
3. 8x − 3 = −11, x = −1
4. $2s + 3 = 4$, $s = \frac{1}{2}$
5. $\frac{1}{3}x + \frac{4}{3} = 2$, $x = 2$
6. 7t + 7 = 7, t = 0
7. 11x − 1 = 10, x = 1
8. 0.01t − 1 = 0, t = 100.
Answers
1. (a) The general form is ax + b = 0 where a and b are known numbers and x represents the
unknown quantity.
(b) A root is a value for the unknown which satisfies the equation.
2. Solving a linear equation

To solve a linear equation we make the unknown quantity the subject of the equation. We obtain
the unknown quantity on its own on the left-hand side. To do this we may apply the same rules used
for transposing formulae given in Workbook 1 Section 1.5. These are given again here.
Key Point 2
Operations which can be used in the process of solving a linear equation
• add the same quantity to both sides
• subtract the same quantity from both sides
• multiply both sides by the same quantity
• divide both sides by the same quantity
• take the reciprocal of both sides (invert)
• take functions of both sides; for example cube both sides.
A useful summary of the rules in Key Point 2 is ‘whatever we do to one side of an equation we must
also do to the other’.
Example 2
Solve the equation x + 14 = 5.
Solution
Note that by subtracting 14 from both sides, we leave x on its own on the left. Thus
x + 14 − 14 = 5 − 14
x = −9
Hence the solution of the equation is x = −9. It is easy to check that this solution is correct by
substituting x = −9 into the original equation and checking that both sides are indeed the same.
You should get into the habit of doing this.
Example 3
Solve the equation 19y = 38.
Solution
In order to make y the subject of the equation we can divide both sides by 19:
\[ 19y = 38 \]
\[ \frac{19y}{19} = \frac{38}{19} \]
cancelling 19's gives $y = \frac{38}{19}$
so $y = 2$
Hence the solution of the equation is y = 2.
Example 4
Solve the equation 4x + 12 = 0.
Solution
Starting from 4x + 12 = 0 we can subtract 12 from both sides to obtain
4x + 12 − 12 = 0 − 12
so that 4x = −12
If we now divide both sides by 4 we find

\[ \frac{4x}{4} = \frac{-12}{4} \]
cancelling 4's gives x = −3
So the solution is x = −3.
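The two steps used in Example 4 (subtract b from both sides, then divide by a) work for any linear equation in standard form. A minimal Python sketch follows; the function name `solve_linear` is hypothetical, and exact fractions are used so that answers such as 3/5 are not rounded.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve ax + b = 0 for x, returning an exact fraction."""
    if a == 0:
        raise ValueError("a must be non-zero for a linear equation")
    # Subtract b from both sides, then divide both sides by a.
    return Fraction(-b) / Fraction(a)


print(solve_linear(4, 12))    # the equation of Example 4: 4x + 12 = 0
print(solve_linear(14, -56))  # the equation 14t - 56 = 0
```

An equation in non-standard form must first be rearranged; for instance 5p + 2 = 5 becomes 5p − 3 = 0 before calling `solve_linear(5, -3)`.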
Task
Solve the linear equation 14t − 56 = 0.
Your solution
Answer
t=4
Example 5
Solve the following equations: (a) $x + 3 = \sqrt{7}$, (b) $x + 3 = -\sqrt{7}$.
Solution
(a) Subtracting 3 from both sides gives $x = \sqrt{7} - 3$.
(b) Subtracting 3 from both sides gives $x = -\sqrt{7} - 3$.
Note that when asked to solve $x + 3 = \pm\sqrt{7}$ we can write the two solutions as $x = -3 \pm \sqrt{7}$. It is
usually acceptable to leave the solutions in this form (i.e. with the $\sqrt{7}$ term) rather than calculate
decimal approximations. This form is known as the surd form.
Example 6
Solve the equation $\frac{2}{3}(t + 7) = 5$.
Solution
There are a number of ways in which the solution can be obtained. The idea is to gradually remove
unwanted terms on the left-hand side to leave t on its own. By multiplying both sides by $\frac{3}{2}$ we find
\[ \frac{3}{2} \times \frac{2}{3}(t + 7) = \frac{3}{2} \times 5 = \frac{3}{2} \times \frac{5}{1} \]
and after simplifying and cancelling, $t + 7 = \frac{15}{2}$
Finally, subtracting 7 from both sides gives
\[ t = \frac{15}{2} - 7 = \frac{15}{2} - \frac{14}{2} = \frac{1}{2} \]
So the solution is $t = \frac{1}{2}$.
Example 7
Solve the equation 3(p − 2) + 2(p + 4) = 5.
Solution
At first sight this may not appear to be in the form of a linear equation. Some preliminary work is
necessary. Removing the brackets and collecting like terms we find the left-hand side yields 5p + 2
so the equation is 5p + 2 = 5, that is 5p = 3, so that $p = \frac{3}{5}$.
Task
Solve the equation 2(x − 5) = 3 − (x + 6).
(a) First remove the brackets on both sides:
Your solution
Answer
2x − 10 = 3 − x − 6. We may write this as 2x − 10 = −x − 3.
(b) Rearrange the equation found in (a) so that terms involving x appear only on the left-hand side,
and constants on the right. Start by adding 10 to both sides:
Your solution
Answer
2x = −x + 7
(c) Now add x to both sides:

Your solution
Answer
3x = 7
(d) Finally solve this to find x:
Your solution
x=
Answer
$x = \frac{7}{3}$
Example 8
Solve the equation
\[ \frac{6}{1 - 2x} = \frac{7}{x - 2} \]
Solution
This equation appears in an unfamiliar form but it can be rearranged into the standard form of a
linear equation. By multiplying both sides by (1 − 2x) and (x − 2) we find
\[ (1 - 2x)(x - 2) \times \frac{6}{1 - 2x} = (1 - 2x)(x - 2) \times \frac{7}{x - 2} \]
Considering each side in turn and cancelling common factors:
\[ 6(x - 2) = 7(1 - 2x) \]
Removing the brackets and rearranging to find x we have
\[ 6x - 12 = 7 - 14x \]
Further rearrangement gives: 20x = 19
The solution is therefore $x = \frac{19}{20}$.
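As recommended earlier, the solution of Example 8 can be checked by substituting it back into the original equation. The sketch below does this with exact fractions; the variable names are chosen here for illustration.

```python
from fractions import Fraction

x = Fraction(19, 20)     # the solution found in Example 8

lhs = 6 / (1 - 2 * x)    # left-hand side 6 / (1 - 2x)
rhs = 7 / (x - 2)        # right-hand side 7 / (x - 2)

print(lhs, rhs, lhs == rhs)
```

Both sides evaluate to the same fraction, so x = 19/20 satisfies the equation. The check also confirms that neither denominator is zero at this value.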
Example 9
Figure 1 shows three branches of an electrical circuit which meet together at
x. Point x is known as a node. As shown in Figure 1 the current in each of the
branches is denoted by I, I1 and I2 . Kirchhoff’s current law states that the current
entering any node must equal the current leaving that node. Thus we have the
equation I = I1 + I2
[Figure 1: three branches meeting at node x, carrying currents I, I1 and I2]
(a) Given I2 = 10 A and I = 18 A calculate I1 .

(b) Suppose I = 36 A and it is known that current I2 is five times as great as
I1 . Find the branch currents.
Solution
(a) Substituting the given values into the equation we find 18 = I1 + 10.
Solving for I1 we find
I1 = 18 − 10 = 8
Thus I1 equals 8 A.
(b) From Kirchhoff’s law, I = I1 + I2 .
We are told that I2 is five times as great as I1 , and so we can write I2 = 5I1 .
Since I = 36 we have
36 = I1 + 5I1
Solving this linear equation 36 = 6I1 gives I1 = 6 A.
Finally, since I2 is five times as great as I1 , we have I2 = 5I1 = 30 A.
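Part (b) generalises to any known ratio between the two branch currents. The following is a small sketch; the function `branch_currents` and its parameter `ratio` are hypothetical names, and the code assumes only the relation I = I1 + I2 together with I2 = ratio × I1.

```python
def branch_currents(total, ratio):
    """Return (I1, I2) given I = I1 + I2 and I2 = ratio * I1."""
    i1 = total / (1 + ratio)   # from I = I1 + ratio * I1
    i2 = ratio * i1
    return i1, i2


print(branch_currents(36, 5))  # the data of Example 9(b)
```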
Exercises
In questions 1-24 solve each equation:
1. $7x = 14$   2. $-3x = 6$   3. $\frac{1}{2}x = 7$   4. $3x = \frac{1}{2}$
5. $4t = -2$   6. $2t = 4$   7. $4t = 2$   8. $2t = -4$
9. $\frac{x}{6} = 3$   10. $\frac{x}{6} = -3$   11. $7x + 2 = 9$   12. $7x + 2 = 23$
13. $-7x + 1 = -6$   14. $-7x + 1 = -13$   15. $\frac{17}{3}t = -2$   16. $3 - x = 2x + 8$
17. $x - 3 = 8 + 3x$   18. $\frac{x}{4} = 16$   19. $\frac{x}{9} = -2$   20. $-\frac{13}{2}x = 14$
21. $-2y = -6$   22. $-7y = 11$   23. $-69y = -690$   24. $-8 = -4\gamma$
25. $3y - 8 = \frac{1}{2}y$   26. $7t - 5 = 4t + 7$   27. $3x + 4 = 4x + 3$
28. $4 - 3x = 4x + 3$   29. $3x + 7 = 7x + 2$   30. $3(x + 7) = 7(x + 2)$
31. $2x - 1 = x - 3$   32. $2(x + 4) = 8$   33. $-2(x - 3) = 6$
34. $-2(x - 3) = -6$   35. $-3(3x - 1) = 2$
36. $2 - (2t + 1) = 4(t + 2)$   37. $5(m - 3) = 8$
38. $5m - 3 = 5(m - 3) + 2m$   39. $2(y + 1) = -8$
40. $17(x - 2) + 3(x - 1) = x$   41. $\frac{1}{3}(x + 3) = -9$   42. $\frac{3}{m} = 4$
43. $\frac{5}{m} = \frac{2}{m + 1}$   44. $-3x + 3 = 18$   45. $3x + 10 = 31$
46. $x + 4 = \sqrt{8}$   47. $x - 4 = \sqrt{23}$
48. If y = 2 find x if $4x + 3y = 9$   49. If y = −2 find x if $4x + 5y = 3$
50. If y = 0 find x if $-4x + 10y = -8$   51. If x = −3 find y if $2x + y = 8$
52. If y = 10 find x when $10x + 55y = 530$   53. If γ = 2 find β if $54 = \gamma - 4\beta$
54. $\frac{x - 5}{2} - \frac{2x - 1}{3} = 6$   55. $\frac{x}{4} + \frac{3x}{2} - \frac{x}{6} = 1$   56. $\frac{x}{2} + \frac{4x}{3} = 2x - 7$
57. $\frac{5}{3m + 2} = \frac{2}{m + 1}$   58. $\frac{2}{3x - 2} = \frac{5}{x - 1}$   59. $\frac{x - 3}{x + 1} = 4$
60. $\frac{x + 1}{x - 3} = 4$   61. $\frac{y - 3}{y + 3} = \frac{2}{3}$   62. $\frac{4x + 5}{6} - \frac{2x - 1}{3} = x$
63. $\frac{3}{2s - 1} + \frac{1}{s + 1} = 0$
64. Solve the linear equation $ax + b = 0$ to find x
65. Solve the linear equation $\frac{1}{ax + b} = \frac{1}{cx + d}$ ($a \neq c$) to find x
Answers
1. 2   2. −2   3. 14   4. 1/6   5. −1/2   6. 2
7. 1/2   8. −2   9. 18   10. −18   11. 1   12. 3
13. 1   14. 2   15. −6/17   16. −5/3   17. −11/2   18. 64
19. −18   20. −28/13   21. y = 3   22. −11/7   23. y = 10   24. 2
25. 16/5   26. 4   27. 1   28. 1/7   29. 5/4   30. 7/4
31. −2   32. 0   33. 0   34. 6   35. 1/9   36. −7/6
37. 23/5   38. 6   39. −5   40. 37/19   41. −30   42. 3/4
43. −5/3   44. −5   45. 7   46. $\sqrt{8} - 4$   47. $\sqrt{23} + 4$   48. 3/4
49. 13/4   50. 2   51. 14   52. −2   53. −13   54. −49
55. 12/19   56. 42   57. 1   58. 8/13   59. −7/3   60. 13/3
61. 15   62. 7/6   63. −2/5   64. −b/a   65. $\frac{d - b}{a - c}$
Solving Quadratic
Equations 3.2
Introduction
A quadratic equation is one which can be written in the form $ax^2 + bx + c = 0$ where a, b and
c are numbers, $a \neq 0$, and x is the unknown whose value(s) we wish to find. In this Section we
describe several ways in which quadratic equations can be solved.

Prerequisites
Before starting this Section you should . . .
• be able to solve linear equations
Learning Outcomes
On completion you should be able to . . .
• recognise a quadratic equation
• solve a quadratic equation by factorisation
• solve a quadratic equation using the standard formula
• solve a quadratic equation by completing the square
• interpret the solution of a quadratic equation graphically
1. Quadratic equations
Key Point 3
A quadratic equation is one which can be written in the form
\[ ax^2 + bx + c = 0, \qquad a \neq 0 \]
where a, b and c are given numbers and x is the unknown whose value(s) must be found.
For example
\[ 2x^2 + 7x - 3 = 0, \qquad x^2 + x + 1 = 0, \qquad 0.5x^2 + 3x + 9 = 0 \]
are all quadratic equations. To ensure the presence of the $x^2$ term, the number a, in the general
expression $ax^2 + bx + c$ cannot be zero. However b or c may be zero, so that
\[ 4x^2 + 3x = 0, \qquad 2x^2 - 3 = 0 \qquad \text{and} \qquad 6x^2 = 0 \]
are also quadratic equations. Frequently, quadratic equations occur in non-standard form but where
necessary they can be rearranged into standard form. For example
$3x^2 + 5x = 8$ can be re-written as $3x^2 + 5x - 8 = 0$
$2x^2 = 8x - 9$ can be re-written as $2x^2 - 8x + 9 = 0$
$1 + x = \frac{1}{x}$ can be re-written as $x^2 + x - 1 = 0$
To solve a quadratic equation we must find values of the unknown x which make the left-hand and
right-hand sides equal. Such values are known as solutions or roots of the quadratic equation.
Note the difference between solving quadratic equations in comparison to solving linear equations. A
quadratic equation will generally have two values of x (solutions) which satisfy it whereas a linear
equation only has one solution.
We shall now describe three techniques for solving quadratic equations:
• factorisation
• completing the square
• using the quadratic formula
Exercises
1. Verify that x = 2 and x = 3 are both solutions of $x^2 - 5x + 6 = 0$.
2. Verify that x = −2 and x = −3 are both solutions of $x^2 + 5x + 6 = 0$.
2. Solution by factorisation
It may be possible to solve a quadratic equation by factorisation using the method described for
factorising quadratic expressions in Workbook 1 Section 1.3, although you should be aware that not all quadratic
equations can be easily factorised.
Example 10
Solve the equation $x^2 + 5x = 0$.
Solution
Factorising and equating each factor to zero we find
$x^2 + 5x = 0$ is equivalent to $x(x + 5) = 0$
so that x = 0 and x = −5 are the two solutions.
Example 11
Solve the quadratic equation $x^2 + x - 6 = 0$.
Solution
Factorising the left-hand side we find $x^2 + x - 6 = (x + 3)(x - 2)$ so that
$x^2 + x - 6 = 0$ is equivalent to $(x + 3)(x - 2) = 0$
When the product of two quantities equals zero, at least one of the two must equal zero. In this
case either (x + 3) is zero or (x − 2) is zero. It follows that
x + 3 = 0, giving x = −3, or x − 2 = 0, giving x = 2
Here there are two solutions, x = −3 and x = 2.
These solutions can be checked quite easily by substitution back into the given equation.
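The substitution check mentioned above can be carried out mechanically. This is a minimal sketch; the helper `quadratic` is a hypothetical name that builds the expression ax² + bx + c for given coefficients.

```python
def quadratic(a, b, c):
    """Return a function that evaluates a*x^2 + b*x + c."""
    return lambda x: a * x * x + b * x + c


p = quadratic(1, 1, -6)   # the left-hand side of Example 11

for root in (-3, 2):
    # Each claimed solution should make the expression equal to zero.
    print(root, p(root))
```

Any value that is not a root, such as x = 1, gives a non-zero result when substituted in the same way.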
Example 12
Solve the quadratic equation $2x^2 - 7x - 4 = 0$ by factorising the left-hand side.
Solution
Factorising the left-hand side: $2x^2 - 7x - 4 = (2x + 1)(x - 4)$ so $2x^2 - 7x - 4 = 0$ is equivalent to
$(2x + 1)(x - 4) = 0$. In this case either (2x + 1) is zero or (x − 4) is zero. It follows that
2x + 1 = 0, giving $x = -\frac{1}{2}$, or x − 4 = 0, giving x = 4.
There are two solutions, $x = -\frac{1}{2}$ and x = 4.
Example 13
Solve the equation $4x^2 + 12x + 9 = 0$.
Solution
Factorising we find $4x^2 + 12x + 9 = (2x + 3)(2x + 3) = (2x + 3)^2$
This time the factor (2x + 3) occurs twice. The original equation $4x^2 + 12x + 9 = 0$ becomes
$(2x + 3)^2 = 0$ so that 2x + 3 = 0
and we obtain the solution $x = -\frac{3}{2}$. Because the factor 2x + 3 appears twice in the equation
$(2x + 3)^2 = 0$ we say that this root is a repeated solution or double root.
Task
Solve the quadratic equation $7x^2 - 20x - 3 = 0$.
First factorise the left-hand side:
Your solution
$7x^2 - 20x - 3 =$
Answer
(7x + 1)(x − 3)
Each factor is then equated to zero to obtain the two solutions:
Your solution
Solution 1: x = Solution 2: x =
Answer
$-\frac{1}{7}$ and 3
Exercises
Solve the following equations by factorisation:
1. $x^2 - 3x + 2 = 0$   2. $x^2 - x - 2 = 0$   3. $x^2 + x - 2 = 0$
4. $x^2 + 3x + 2 = 0$   5. $x^2 + 8x + 7 = 0$   6. $x^2 - 7x + 12 = 0$
7. $x^2 - x - 20 = 0$   8. $4x^2 - 4 = 0$   9. $-x^2 + 2x - 1 = 0$
10. $3x^2 + 6x + 3 = 0$   11. $x^2 + 11x = 0$   12. $2x^2 + 2x = 0$
Answers The solutions are found to be:
1. 1, 2   2. −1, 2   3. −2, 1   4. −1, −2   5. −7, −1
6. 4, 3   7. −4, 5   8. 1, −1   9. 1 twice   10. −1 twice
11. −11, 0   12. 0, −1
3. Completing the square

The technique known as completing the square can be used to solve quadratic equations although it
is applicable in many other circumstances too so it is well worth studying.
Example 14
(a) Show that $(x + 3)^2 = x^2 + 6x + 9$
(b) Hence show that $x^2 + 6x$ can be written as $(x + 3)^2 - 9$.
Solution
(a) Removing the brackets we find
\[ (x + 3)^2 = (x + 3)(x + 3) = x^2 + 3x + 3x + 9 = x^2 + 6x + 9 \]
(b) By subtracting 9 from both sides of the previous equation it follows that
\[ (x + 3)^2 - 9 = x^2 + 6x \]
Example 15
(a) Show that $(x - 4)^2 = x^2 - 8x + 16$
(b) Hence show that $x^2 - 8x$ can be written as $(x - 4)^2 - 16$.
Solution
(a) Removing the brackets we find
\[ (x - 4)^2 = (x - 4)(x - 4) = x^2 - 4x - 4x + 16 = x^2 - 8x + 16 \]
(b) Subtracting 16 from both sides we can write
\[ (x - 4)^2 - 16 = x^2 - 8x \]
We shall now generalise the results of Examples 14 and 15. Noting that
$(x + k)^2 = x^2 + 2kx + k^2$ we can write $x^2 + 2kx = (x + k)^2 - k^2$
Note that the constant term in the brackets on the right-hand side is always half the coefficient of x
on the left. This process is called completing the square.
Key Point 4
Completing the Square

The expression $x^2 + 2kx$ is equivalent to $(x + k)^2 - k^2$
Example 16
Complete the square for the expression $x^2 + 16x$.
Solution
Comparing $x^2 + 16x$ with the general form $x^2 + 2kx$ we see that k = 8. Hence
\[ x^2 + 16x = (x + 8)^2 - 8^2 = (x + 8)^2 - 64 \]
Note that the constant term in the brackets on the right, that is 8, is half the coefficient of x on
the left, which is 16.
Example 17
Complete the square for the expression $5x^2 + 4x$.
Solution
Consider $5x^2 + 4x$. First of all the coefficient 5 is removed outside a bracket as follows
\[ 5x^2 + 4x = 5\left(x^2 + \frac{4}{5}x\right) \]
We can now complete the square for the quadratic expression in the brackets:
\[ x^2 + \frac{4}{5}x = \left(x + \frac{2}{5}\right)^2 - \left(\frac{2}{5}\right)^2 = \left(x + \frac{2}{5}\right)^2 - \frac{4}{25} \]
Finally, multiplying both sides by 5 we find
\[ 5x^2 + 4x = 5\left[\left(x + \frac{2}{5}\right)^2 - \frac{4}{25}\right] \]
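The procedure of Examples 16 and 17 can be sketched in code. The function below is illustrative only: the name `complete_square` and the returned pair (h, k) are assumptions made here, chosen so that ax² + bx + c = a(x + h)² + k. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def complete_square(a, b, c=0):
    """Return (h, k) such that a*x^2 + b*x + c == a*(x + h)^2 + k."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("a must be non-zero")
    h = b / (2 * a)      # half the coefficient of x after taking out a
    k = c - a * h**2     # constant left over once the square is formed
    return h, k


print(complete_square(1, 16))  # Example 16: x^2 + 16x
print(complete_square(5, 4))   # Example 17: 5x^2 + 4x
```

For Example 17 the constant returned is −4/5, which is the same as the 5 × (−4/25) that appears when the final bracket in the Solution is multiplied out.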
Completing the square can be used to solve quadratic equations as shown in the following Examples.
Example 18
Solve the equation $x^2 + 6x + 2 = 0$ by completing the square.
Solution
First of all just consider $x^2 + 6x$, and note that we can write this as
\[ x^2 + 6x = (x + 3)^2 - 9 \]
Then the quadratic equation can be written as
$x^2 + 6x + 2 = (x + 3)^2 - 9 + 2 = 0$ that is $(x + 3)^2 = 7$
Taking the square root of both sides gives
$x + 3 = \pm\sqrt{7}$ so $x = -3 \pm \sqrt{7}$
The two solutions are $x = -3 + \sqrt{7} = -0.3542$ and $x = -3 - \sqrt{7} = -5.6458$, to 4 d.p.
Example 19
Solve the equation $x^2 - 8x + 5 = 0$
Solution
First consider $x^2 - 8x$ which we can write as $x^2 - 8x = (x - 4)^2 - 16$ so that the equation
becomes
\[ x^2 - 8x + 5 = (x - 4)^2 - 16 + 5 = 0 \]
i.e. $(x - 4)^2 = 11$
\[ x - 4 = \pm\sqrt{11} \]
\[ x = 4 \pm \sqrt{11} \]
So x = 7.3166 or x = 0.6834 (to 4 d.p.)
Task
Solve the equation $x^2 - 4x + 1 = 0$ by completing the square.
First examine the two left-most terms in the equation: $x^2 - 4x$. Complete the square for these terms:
Your solution
$x^2 - 4x =$
Answer
$(x - 2)^2 - 4$
Use the above result to rewrite the equation $x^2 - 4x + 1 = 0$ in the form $(x - ?)^2 + ? = 0$:
Your solution
$x^2 - 4x + 1 =$
Answer
$(x - 2)^2 - 4 + 1 = (x - 2)^2 - 3 = 0$
From this now obtain the roots:
Your solution
Answer
$(x - 2)^2 = 3$, so $x - 2 = \pm\sqrt{3}$. Therefore $x = 2 \pm \sqrt{3}$ so x = 3.7321 or 0.2679 to 4 d.p.
Exercises
1. Solve the following quadratic equations by completing the square.
(a) $x^2 - 3x = 0$
(b) $x^2 + 9x = 0$
(c) $2x^2 - 5x + 2 = 0$
(d) $6x^2 - x - 1 = 0$
(e) $-5x^2 + 6x - 1 = 0$
(f) $-x^2 + 4x - 3 = 0$
2. A chemical manufacturer found that the sales figures for a certain chemical $X_2O$ depended on
its selling price. At present, the company can sell all of its weekly production of 300 t at a
price of £600 / t. The company's market research department advised that the amount sold
would decrease by only 1 t per week for every £2 / t increase in the price of $X_2O$. If the total
production costs are made up of a fixed cost of £30000 per week, plus £400 per t of product,
show that the weekly profit is given by
\[ P = -\frac{x^2}{2} + 800x - 270000 \]
where x is the new price per t of $X_2O$. Complete the square for the above expression and hence
find
(a) the price which maximises the weekly profit on sales of $X_2O$

(b) the maximum weekly profit
(c) the weekly production rate
Answers
1. (a) 0, 3 (b) 0, −9 (c) 2, $\frac{1}{2}$ (d) $\frac{1}{2}$, $-\frac{1}{3}$ (e) $\frac{1}{5}$, 1 (f) 1, 3
2. (a) £800 / t, (b) £50000 /wk, (c) 200 t / wk
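As a check on the answers to question 2, one possible route for completing the square is sketched below. It uses only the profit expression given in the question and the rule stated in Key Point 4.

```latex
\[
P = -\frac{x^2}{2} + 800x - 270000
  = -\frac{1}{2}\left(x^2 - 1600x\right) - 270000
  = -\frac{1}{2}\left[(x - 800)^2 - 640000\right] - 270000
  = -\frac{1}{2}(x - 800)^2 + 50000
\]
```

Since the squared term can never make a positive contribution, P is largest when x = 800, where P = 50000. At that price the amount sold, by the market research rule in the question, is 300 − (800 − 600)/2 = 200 t per week.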
4. Solution by formula
When it is difficult to factorise a quadratic equation, it may be possible to solve it using a formula
which is used to calculate the roots. The formula is obtained by completing the square in the general
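quadratic $ax^2 + bx + c$. We proceed by taking out the factor a:
\[ ax^2 + bx + c = a\left\{x^2 + \frac{b}{a}x + \frac{c}{a}\right\} = a\left\{\left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} - \frac{b^2}{4a^2}\right\} \]
Thus the solution of $ax^2 + bx + c = 0$ is the same as the solution to
\[ \left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} - \frac{b^2}{4a^2} = 0 \]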
So, solving: $\left(x + \frac{b}{2a}\right)^2 = -\frac{c}{a} + \frac{b^2}{4a^2}$ which leads to $x = -\frac{b}{2a} \pm \sqrt{-\frac{c}{a} + \frac{b^2}{4a^2}}$
Simplifying this expression further we obtain the important result:
Key Point 5
Quadratic Formula
If $ax^2 + bx + c = 0$, $a \neq 0$ then the two solutions (roots) are
\[ x = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \qquad \text{and} \qquad x = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \]
To apply the formula to a specific quadratic equation it is necessary to identify carefully the values
of a, b and c, paying particular attention to the signs of these numbers. Substitution of these values
into the formula then gives the desired solutions.
Note that if the quantity $b^2 - 4ac$ (called the discriminant) is a positive number we can take its
square root and the formula will produce two values known as distinct real roots. If $b^2 - 4ac = 0$
there will be one value only known as a repeated root or double root. The value of this root
is $x = -\frac{b}{2a}$. Finally if $b^2 - 4ac$ is negative we say the equation possesses complex roots. These
require special treatment and are described in Workbook 10.
Key Point 6
When finding roots of the quadratic equation $ax^2 + bx + c = 0$ first calculate the discriminant
$b^2 - 4ac$
• If $b^2 - 4ac > 0$ the quadratic has two real distinct roots
• If $b^2 - 4ac = 0$ the quadratic has two real and equal roots
• If $b^2 - 4ac < 0$ the quadratic has no real roots: there are two complex roots
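The three cases of Key Point 6 map naturally onto a short function. This is a minimal sketch; the name `nature_of_roots` and the wording of the returned strings are choices made here for illustration.

```python
def nature_of_roots(a, b, c):
    """Classify the roots of a*x^2 + b*x + c = 0 using the discriminant."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    discriminant = b * b - 4 * a * c
    if discriminant > 0:
        return "two real distinct roots"
    if discriminant == 0:
        return "two real and equal roots"
    return "no real roots: two complex roots"


print(nature_of_roots(3, 2, -7))
print(nature_of_roots(1, -2, 1))
print(nature_of_roots(1, 1, 2))
```

With integer coefficients the comparison with zero is exact; with decimal coefficients a discriminant that should be zero may come out as a very small non-zero number because of rounding.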
Example 20
Compare each given equation with the standard form $ax^2 + bx + c = 0$ and identify
a, b and c. Calculate $b^2 - 4ac$ in each case and use this information to state the
nature of the roots.
(a) $3x^2 + 2x - 7 = 0$ (b) $3x^2 + 2x + 7 = 0$
(c) $3x^2 - 2x + 7 = 0$ (d) $x^2 + x + 2 = 0$
(e) $-x^2 + 3x - \frac{1}{2} = 0$ (f) $5x^2 - 3 = 0$
(g) $x^2 - 2x + 1 = 0$ (h) $2p^2 - 4p = 0$
(i) $-p^2 + 4p - 4 = 0$
Solution
(a) a = 3, b = 2, c = −7. So $b^2 - 4ac = (2)^2 - 4(3)(-7) = 88$.
The roots are real and distinct.
(b) a = 3, b = 2, c = 7. So $b^2 - 4ac = (2)^2 - 4(3)(7) = -80$.
The roots are complex.
(c) a = 3, b = −2, c = 7. So $b^2 - 4ac = (-2)^2 - 4(3)(7) = -80$.
The roots are complex.
(d) a = 1, b = 1, c = 2. So $b^2 - 4ac = 1^2 - 4(1)(2) = -7$.
The roots are complex.
(e) a = −1, b = 3, c = $-\frac{1}{2}$. So $b^2 - 4ac = 3^2 - 4(-1)(-\frac{1}{2}) = 7$.
The roots are real and distinct.
(f) a = 5, b = 0, c = −3. So $b^2 - 4ac = 0 - 4(5)(-3) = 60$.
The roots are real and distinct.
(g) a = 1, b = −2, c = 1. So $b^2 - 4ac = (-2)^2 - 4(1)(1) = 0$.
The roots are real and equal.
(h) a = 2, b = −4, c = 0. So $b^2 - 4ac = (-4)^2 - 4(2)(0) = 16$.
The roots are real and distinct.
(i) a = −1, b = 4, c = −4. So $b^2 - 4ac = (4)^2 - 4(-1)(-4) = 0$.
The roots are real and equal.
Example 21
Solve the quadratic equation 2x2 + 3x − 6 = 0 using the formula.
Solution
We compare the given equation with the standard form ax2 + bx + c = 0 in order to identify a, b
and c. We see that here a = 2, b = 3 and c = −6. Note particularly the sign of c. Substituting
these values into the formula we find
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-3 \pm \sqrt{3^2 - 4(2)(-6)}}{(2)(2)} = \frac{-3 \pm \sqrt{9 + 48}}{4} = \frac{-3 \pm 7.5498}{4}$$
Hence, to 4 d.p., the two roots are x = 1.1375, if the positive sign is taken and x = −2.6375 if
the negative sign is taken. However, it is often sufficient to leave the solution in the so-called surd
form $x = \dfrac{-3 \pm \sqrt{57}}{4}$, which is exact.
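As a numerical check of Example 21, the quadratic formula can be sketched in Python as follows. The function name `solve_quadratic` is an assumption made for illustration, and this sketch handles real roots only.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of ax^2 + bx + c = 0 via the quadratic formula.

    Returns (root using the + sign, root using the - sign).
    Raises ValueError if a is zero or the discriminant is negative.
    """
    if a == 0:
        raise ValueError("a must be non-zero")
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("complex roots: discriminant is negative")
    root = math.sqrt(disc)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

x_plus, x_minus = solve_quadratic(2, 3, -6)
print(round(x_plus, 4), round(x_minus, 4))  # 1.1375 -2.6375
```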
Task
Solve the equation 3x2 − x − 6 = 0 using the quadratic formula.
First identify a, b and c:

Your solution
a= b= c=
Answer
a = 3, b = −1, c = −6
Substitute these values into the formula and simplify:
Your solution
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ so x =
Answer
$$\frac{-(-1) \pm \sqrt{(-1)^2 - (4)(3)(-6)}}{(2)(3)} = \frac{1 \pm \sqrt{73}}{6}$$
Finally, calculate the values of x to 4 d.p.:

Your solution
x= or x=
Answer
1.5907, −1.2573
Undersea cable fault location

Introduction
The voltage (V ), current (I) and resistance (R) in an electrical circuit are related by Ohm’s law i.e.
V = IR. If there are two resistances (R1 and R2 ) in an electrical circuit, they may be in series, in
which case the total resistance (R) is given by R = R1 + R2 . Or they may be in parallel in which
case the total resistance is given by
$$\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}$$
In 1871 the telephone cable between England (A) and Denmark (B) developed a fault, due to a short
circuit under the sea (see Figure 2). Oliver Heaviside, an electrical engineer, came up with a very
simple method to find the location of the fault. He assumed that the cable had a uniform resistance
per unit length. Heaviside performed two tests:
(1) connecting a battery (voltage E) at A, with the circuit open at B, he measured the resulting
current I1 ,
(2) connecting the same battery at A, with the cable earthed at B, he measured the current I2 .
Figure 2: Schematic of the undersea cable (A: England, B: Denmark; resistance x from A to the short-circuit and r − x from the short-circuit to B)

In the first measurement the resistances up to the cable fault and between the fault and the short
circuit are in series and in the second experiment the resistances beyond the fault and between the
fault and the short circuit are in parallel.
Problem in words
Use the information from the measurements to deduce the location of the fault.
Mathematical statement of problem
(a) Denote the resistances of the various branches by the symbols shown in Figure 2.
(b) Use Ohm’s law to write down expressions that apply to each of the two measurements.
(c) Eliminate y from these expressions to obtain an expression for x.
(a) In the first experiment the total circuit resistance is x + y. In the second experiment, the total
circuit resistance is given by:
$$x + \left(\frac{1}{r - x} + \frac{1}{y}\right)^{-1}$$
So application of Ohm’s law to each experimental situation gives:
$$E = I_1 (x + y) \qquad (1)$$
$$E = I_2 \left(x + \left(\frac{1}{r - x} + \frac{1}{y}\right)^{-1}\right) \qquad (2)$$
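Rearrange Equation (1) to give $\dfrac{E}{I_1} - x = y$.
Substitute for y in Equation (2), divide both sides by $I_2$ and introduce $\dfrac{E}{I_1} = r_1$ and $\dfrac{E}{I_2} = r_2$, so that $y = r_1 - x$:
$$r_2 = x + \left(\frac{1}{r - x} + \frac{1}{r_1 - x}\right)^{-1}$$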
Use a common denominator for the fractions on the right-hand side:

$$r_2 = x + \frac{(r - x)(r_1 - x)}{r_1 - x + r - x} = \frac{x(r_1 + r - 2x) + (r - x)(r_1 - x)}{r_1 + r - 2x}$$
Multiply through by $(r_1 + r - 2x)$:
$$r_2 (r_1 + r - 2x) = x(r_1 + r - 2x) + (r - x)(r_1 - x)$$
Rearrange as a quadratic for x:
$$x^2 - 2r_2 x - r r_1 + r_2 r_1 + r r_2 = 0$$
Use the standard formula for solving quadratic equations
with $a = 1$, $b = -2r_2$ and $c = -r r_1 + r_2 r_1 + r r_2$:
$$x = \frac{2r_2 \pm \sqrt{4r_2^2 - 4(-r r_1 + r_2 r_1 + r r_2)}}{2} = r_2 \pm \sqrt{(r - r_2)(r_1 - r_2)}$$
Only positive solutions would be of interest.
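The final formula can be checked numerically with a short Python sketch. The cable values below (r = 10, fault at x = 3, short-circuit resistance y = 2) are hypothetical numbers chosen only for illustration, and the function name `fault_candidates` is an assumption. The two "measurements" are generated from Equations (1) and (2) and then fed back into the formula.

```python
import math

def fault_candidates(r, r1, r2):
    """Both roots x = r2 -/+ sqrt((r - r2)(r1 - r2)) of the cable quadratic."""
    disc = (r - r2) * (r1 - r2)
    if disc < 0:
        raise ValueError("no real solution for these measurements")
    s = math.sqrt(disc)
    return r2 - s, r2 + s

# Hypothetical cable: total resistance r, fault at x_true, short-circuit resistance y
r, x_true, y = 10.0, 3.0, 2.0
r1 = x_true + y                                      # E / I1, from Equation (1)
r2 = x_true + 1.0 / (1.0 / (r - x_true) + 1.0 / y)   # E / I2, from Equation (2)

smaller, larger = fault_candidates(r, r1, r2)
print(smaller, larger)
```

For these assumed values the smaller root recovers the fault position that was used to generate the data; the larger root exceeds $r_1$, which would require a negative short-circuit resistance y.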
Estimating the mass of a pipe
Introduction
Sometimes engineers have to estimate component weights from dimensions and material properties.
On some occasions, engineers prefer use of approximate formulae to exact ones as long as they are
sufficiently accurate for the purpose. This Example introduces both of these aspects.
Problem in words
(a) Find the mass of a given length of pipe in terms of its inner and outer diameters and the
density of the pipe material.
(b) Find the wall thickness of the pipe if the inner diameter is 0.15 m, the density is 7900 kg m$^{-3}$
and the mass per unit length of pipe is 40 kg m$^{-1}$.
(c) Find an approximate method for calculating the mass of a given length of a thin-walled
pipe and calculate the maximum ratio of inner and outer diameters that give an error of
less than 10% when using the approximate method.
(a) Denote the length of the pipe by L m and inside and outside diameters by $d_i$ m and $d_o$
m, respectively, and the density by ρ kg m$^{-3}$. Assume that the pipe is cylindrical so its
cross section corresponds to the gap between concentric circles (this is called an annulus
or annular region; see HELM 2.6). Calculate the difference in cross sectional areas by
using the formula for the area of a circle ($A = \pi r^2$ where r is the radius) and multiply
by the density and length to obtain mass (m).
(b) Rearrange the equation in terms of wall thickness (δ m) and inner diameter. Substitute
the given values to determine the wall thickness.
(c) Approximate the resulting expression for small values of $(d_o - d_i)$. Calculate the percentage
difference in predictions between the original and approximate formulae for various
numerical values of $d_i/d_o$.
(a) The cross section of a cylindrical pipe is a circular annulus. The area of a circle is given
by $\pi r^2 = \frac{\pi}{4} d^2$, since r = d/2 if d is the diameter. So the area of the outer circle is $\frac{\pi}{4} d_o^2$
and that of the inner circle is $\frac{\pi}{4} d_i^2$. This means that the mass m kg of length L m of the
pipe is given by
$$m = \frac{\pi}{4} (d_o^2 - d_i^2) L \rho$$
(b) Denote the pipe wall thickness by δ so $d_o = d_i + 2\delta$.
Use $(d_o^2 - d_i^2) = (d_o - d_i)(d_o + d_i) = 2\delta(2d_i + 2\delta)$. So $m = \pi\delta(d_i + \delta)L\rho$.
Given that m/L = 40, $d_i = 0.15$ and ρ = 7900,
$$40 = \pi\delta(0.15 + \delta)\,7900.$$
Rearrange this equation as a quadratic in δ,
$$\delta^2 + 0.15\delta - \frac{4}{790\pi} = 0$$
Solve this quadratic using the standard formula with a = 1, b = 0.15 and $c = -4/(790\pi) \approx -0.00161$.
Retain only the positive solution to give δ ≈ 0.010, i.e. the pipe wall thickness is about 10 mm.
(c) If δ is small then (do − di ) is small and di + δ ≈ di . So the expression for m in terms of
δ may be written
m ≈ πδdi Lρ
The graph in Figure 3 shows that the percentage error from using the approximate formula
for the mass of the pipe exceeds 10% only if the inner diameter is less than 82% of the
outer diameter.
The percentage error from using the approximate formula can be calculated from
(exact result − approximate result)/(exact result) × 100% for various values of the ratio of inner to
outer diameters. In the graph the error is plotted for diameter ratios between 0.75 and 1.
Figure 3: % error plotted against Inner diameter / Outer diameter (ratios from 0.75 to 1)
Comment
The graph shows also that the error is 1% or less for diameter ratios > 0.98.
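The percentage error described above can be reproduced with a small Python sketch. The function names are assumptions made for illustration; because only the ratio of the diameters matters, the outer diameter, length and density are all set to 1.

```python
import math

def exact_mass(d_i, d_o, length, rho):
    """Exact mass from m = (pi/4)(d_o^2 - d_i^2) L rho."""
    return math.pi / 4 * (d_o**2 - d_i**2) * length * rho

def approx_mass(d_i, d_o, length, rho):
    """Thin-wall approximation m = pi * delta * d_i * L * rho."""
    delta = (d_o - d_i) / 2          # wall thickness
    return math.pi * delta * d_i * length * rho

def percent_error(ratio):
    """Percentage error of the approximation for ratio = d_i / d_o (ratio < 1)."""
    d_o, d_i = 1.0, ratio            # only the ratio matters, so take d_o = 1
    exact = exact_mass(d_i, d_o, 1.0, 1.0)
    approx = approx_mass(d_i, d_o, 1.0, 1.0)
    return (exact - approx) / exact * 100

for ratio in (0.75, 0.82, 0.9, 0.98):
    print(ratio, round(percent_error(ratio), 2))
```

Working through the algebra, the error simplifies to $100(d_o - d_i)/(d_o + d_i)$ per cent, which equals 10% when $d_i/d_o = 0.9/1.1 \approx 0.82$.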
Exercises
Solve the following quadratic equations by using the formula. Give answers exactly (where possible)
or to 4 d.p.:
1. $x^2 + 8x + 1 = 0$ 2. $x^2 + 7x - 2 = 0$ 3. $x^2 + 6x - 2 = 0$
4. $-x^2 + 3x + 1 = 0$ 5. $-2x^2 - 3x + 1 = 0$ 6. $2x^2 + 5x - 3 = 0$
Answers
1. −0.1270, −7.8730 2. −7.2749, 0.2749 3. 0.3166, −6.3166
4. 3.3028, −0.3028 5. −1.7808, 0.2808 6. $\frac{1}{2}$, −3
5. Geometrical representation of quadratics

We can plot a graph of the function y = ax2 + bx + c (given the values of a, b and c). If the graph
crosses the horizontal axis it will do so when y = 0, and so the x coordinates at such points are
solutions of ax2 + bx + c = 0. Depending on the sign of a and of the nature of the solutions there
are essentially six different types of graph that can occur. These are displayed in Figure 4.
Figure 4: The possible graphs of a quadratic $y = ax^2 + bx + c$ (two rows, a > 0 and a < 0; three columns: real, distinct roots; real, equal roots; complex roots)
Sometimes a graph of the quadratic is used to locate the solutions; however, this approach is generally
inaccurate. This is illustrated in the following example.
Example 22
Solve the equation x2 − 4x + 1 = 0 by plotting a graph of the function:
y = x2 − 4x + 1
Solution
By constructing a table of function values we can plot the graph as shown in Figure 5.
x   0   1   2   3   4
y   1  −2  −3  −2   1
Figure 5: The graph of $y = x^2 - 4x + 1$ cuts the x axis at C and D

The solutions of the equation x2 − 4x + 1 = 0 are found by looking for points where the graph
crosses the horizontal axis. The two points are approximately x = 0.3 and x = 3.7 marked C and
D on the Figure.
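The table of values in Example 22 can be generated, and the intervals containing the crossing points picked out, with a short Python sketch. The variable names are illustrative assumptions; a change of sign between neighbouring table entries indicates that the graph crosses the axis somewhere in between.

```python
def y(x):
    """The quadratic from Example 22."""
    return x**2 - 4 * x + 1

# Table of values at x = 0, 1, 2, 3, 4
table = [(x, y(x)) for x in range(5)]
print(table)  # [(0, 1), (1, -2), (2, -3), (3, -2), (4, 1)]

# A change of sign between neighbouring x values brackets a crossing of the axis
brackets = [(x0, x1) for (x0, y0), (x1, y1) in zip(table, table[1:]) if y0 * y1 < 0]
print(brackets)  # [(0, 1), (3, 4)]
```

This agrees with the graphical estimates x = 0.3 and x = 3.7, which lie in the intervals (0, 1) and (3, 4) respectively.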
Exercises
1. Solve the following quadratic equations giving exact numeric solutions. Use whichever method
you prefer
(a) $x^2 - 9 = 0$ (b) $s^2 - 25 = 0$
(c) $3x^2 - 12 = 0$ (d) $x^2 - 5x + 6 = 0$
(e) $6s^2 + s - 15 = 0$ (f) $p^2 + 7p = 0$
2. Solve the equation $2x^2 - 3x - 7 = 0$ giving solutions rounded to 4 d.p.
3. Solve the equation $2t^2 + 3t - 4 = 0$ giving the solutions in surd form.

Answers
1(a) x = 3, −3, (b) s = 5, −5, (c) x = 2, −2, (d) x = 3, 2, (e) s = 3/2, −5/3,
(f) p = 0, −7. 2. 2.7656, −1.2656. 3. $\dfrac{-3 \pm \sqrt{41}}{4}$
Solving Polynomial
Equations 3.3
Introduction
Linear and quadratic equations, dealt with in Sections 3.1 and 3.2, are members of a class of equations,
called polynomial equations. These have the general form:
$$a_n x^n + a_{n-1} x^{n-1} + \ldots + a_2 x^2 + a_1 x + a_0 = 0$$
in which x is a variable and $a_n, a_{n-1}, \ldots, a_2, a_1, a_0$ are given constants. Also n must be a positive
integer and $a_n \neq 0$. Examples include $x^3 + 7x^2 + 3x - 2 = 0$, $5x^4 - 7x^2 = 0$ and $-x^6 + x^5 - x^4 = 0$.
In this Section you will learn how to factorise some polynomial expressions and solve some polynomial
equations.

Prerequisites
• be able to solve linear and quadratic equations

Learning Outcomes
• recognise and solve some polynomial equations

1. Multiplying polynomials together
Key Point 7
A polynomial expression is one of the form
$$a_n x^n + a_{n-1} x^{n-1} + \ldots + a_2 x^2 + a_1 x + a_0$$
where $a_0, a_1, \ldots, a_n$ are known coefficients (numbers), $a_n \neq 0$, and x is a variable.
n must be a positive integer.
For example x3 − 17x2 + 54x − 8 is a polynomial expression in x. The polynomial may be expressed
in terms of a variable other than x. So, the following are also polynomial expressions:
$t^3 - t^2 + t - 3$, $z^5 - 1$, $w^4 + 10w^2 - 12$, $s + 1$
Note that only non-negative whole number powers of the variable (usually x) are allowed in a poly-
nomial expression. In this Section you will learn how to factorise simple polynomial expressions and
how to solve some polynomial equations. You will also learn the technique of equating coefficients.
This process is very important when we need to perform calculations involving partial fractions which
will be considered in Section 3.6.
The degree of a polynomial is the highest power to which the variable is raised. Thus $x^3 + 6x + 2$
has degree 3, $t^6 - 6t^4 + 2t$ has degree 6, and 5x + 2 has degree 1.
Let us consider what happens when two polynomials are multiplied together. For example
(x + 1)(3x − 2)
is the product of two first degree polynomials. Expanding the brackets we obtain
$(x + 1)(3x - 2) = 3x^2 + x - 2$
which is a second degree polynomial.
In general we can regard a second degree polynomial, or quadratic, as the product of two first degree
polynomials, provided that the quadratic can be factorised. Similarly
$(x - 1)(x^2 + 3x - 7) = x^3 + 2x^2 - 10x + 7$
is a third degree, or cubic, polynomial which is thus the product of a linear polynomial and a quadratic
polynomial.
In general we can regard a cubic polynomial as the product of a linear polynomial and a quadratic
polynomial or the product of three linear polynomials. This fact will be important in the following
Section when we come to factorise cubics.
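The two products above can be checked with a short Python sketch that multiplies polynomials stored as lists of coefficients, highest power first. The representation and the function name `poly_multiply` are assumptions made for illustration.

```python
def poly_multiply(p, q):
    """Multiply polynomials given as coefficient lists, highest power first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            # the term from p[i] * q[j] lands i + j places from the leading term
            result[i + j] += a * b
    return result

print(poly_multiply([1, 1], [3, -2]))      # (x + 1)(3x - 2)       -> [3, 1, -2]
print(poly_multiply([1, -1], [1, 3, -7]))  # (x - 1)(x^2 + 3x - 7) -> [1, 2, -10, 7]
```

The length of the result shows how degrees combine: a list of length 2 (degree 1) times a list of length 3 (degree 2) gives a list of length 4 (degree 3).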
Key Point 8
A cubic expression can always be formulated as a linear expression times a quadratic expression.
Task
If $x^3 - 17x^2 + 54x - 8 = (x - 4) \times$ (a polynomial), state the degree of the
undefined polynomial.
Your solution
Answer
second.
Task
(a) If $3x^2 + 13x + 4 = (x + 4) \times$ (a polynomial), state the degree of the
undefined polynomial.
(b) What is the coefficient of x in this unknown polynomial ?
Your solution
(a) (b)
Answer
(a) First. (b) It must be 3 in order to generate the term $3x^2$ when the brackets are removed.
Task
If $2x^2 + 5x + 2 = (x + 2) \times$ (a polynomial), what must be the coefficient of x in
this unknown polynomial ?
Your solution
Answer
It must be 2 in order to generate the term $2x^2$ when the brackets are removed.
Task
Two quadratic polynomials are multiplied together. What is the degree of the
resulting polynomial?
Your solution
Answer
Fourth degree.
2. Factorising polynomials and equating coefficients

We will consider how we might find the solution to some simple polynomial equations. An important
part of this process is being able to express a complicated polynomial into a product of simpler
polynomials. This involves factorisation.
Factorisation of polynomial expressions can be achieved more easily if one or more of the factors
is already known. This requires a knowledge of the technique of ‘equating coefficients’ which is
illustrated in the following example.
Example 23
Factorise the expression $x^3 - 17x^2 + 54x - 8$ given that one of the factors is (x − 4).
Solution
Given that x − 4 is a factor we can write
$$x^3 - 17x^2 + 54x - 8 = (x - 4) \times (\text{a quadratic polynomial})$$
The polynomial must be quadratic because the expression on the left is cubic and x − 4 is linear.
Suppose we write this quadratic as $ax^2 + bx + c$ where a, b and c are unknown numbers which we
need to find. Then
$$x^3 - 17x^2 + 54x - 8 = (x - 4)(ax^2 + bx + c)$$
Removing the brackets on the right and collecting like terms together we have
$$x^3 - 17x^2 + 54x - 8 = ax^3 + (b - 4a)x^2 + (c - 4b)x - 4c$$
Like terms are those which involve the same power of the variable (x).
Equating coefficients means that we compare the coefficients of each term on the left with the
corresponding term on the right. Thus if we look at the $x^3$ terms on each side we see that $x^3 = ax^3$
which implies a must equal 1. Similarly by equating coefficients of $x^2$ we find −17 = b − 4a. With
a = 1 we have −17 = b − 4 so b must equal −13. Finally, equating constant terms we find
−8 = −4c so that c = 2.
As a check we look at the coefficient of x to ensure it is the same on both sides: c − 4b = 2 + 52 = 54,
as required. Now that we know a = 1, b = −13, c = 2 we can write the polynomial expression as
$$x^3 - 17x^2 + 54x - 8 = (x - 4)(x^2 - 13x + 2)$$
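The equating of coefficients in Example 23 can be automated. The sketch below uses synthetic division, a standard technique that is not described in this Section but which carries out the same coefficient comparisons from the highest power downwards (a = 1, then b = −17 + 4a, then c = 54 + 4b), with the final value acting as the check. The function name `divide_by_linear` is an assumption.

```python
def divide_by_linear(coeffs, root):
    """Divide a polynomial by (x - root) using synthetic division.

    coeffs lists the coefficients, highest power first.
    Returns (quotient coefficients, remainder).
    """
    values = [coeffs[0]]
    for c in coeffs[1:]:
        # each new value is the next coefficient plus root times the previous value
        values.append(c + root * values[-1])
    return values[:-1], values[-1]

quotient, remainder = divide_by_linear([1, -17, 54, -8], 4)
print(quotient, remainder)  # [1, -13, 2] 0
```

A remainder of zero confirms that x − 4 is a factor, and the quotient gives the quadratic $x^2 - 13x + 2$.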
Exercises
Factorise into a quadratic and linear product the given polynomial expressions
1. $x^3 - 6x^2 + 11x - 6$, given that x − 1 is a factor
2. $x^3 - 7x - 6$, given that x + 2 is a factor
3. $2x^3 + 7x^2 + 7x + 2$, given that x + 1 is a factor
4. $3x^3 + 7x^2 - 22x - 8$, given that x + 4 is a factor
Answers
1. $(x - 1)(x^2 - 5x + 6)$, 2. $(x + 2)(x^2 - 2x - 3)$, 3. $(x + 1)(2x^2 + 5x + 2)$,
4. $(x + 4)(3x^2 - 5x - 2)$.
3. Polynomial equations
When a polynomial expression is equated to zero, a polynomial equation is obtained. Linear and
quadratic equations, which you have already met, are particular types of polynomial equation.
Key Point 9
A polynomial equation has the form
$$a_n x^n + a_{n-1} x^{n-1} + \ldots + a_2 x^2 + a_1 x + a_0 = 0$$
where $a_0, a_1, \ldots, a_n$ are known coefficients, $a_n \neq 0$, and x represents an unknown whose value(s)
are to be found.
Polynomial equations of low degree have special names. A polynomial equation of degree 1 is a
linear equation and such equations have been solved in Section 3.1. Degree 2 equations are called
quadratics; degree 3 equations are called cubics; degree 4 equations are called quartics and so on.
The following are examples of polynomial equations:
$5x^6 - 3x^4 + x^2 + 7 = 0$, $-7x^4 + x^2 + 9 = 0$, $t^3 - t + 5 = 0$, $w^7 - 3w - 1 = 0$
Recall that the degree of the equation is the highest power of x occurring. The solutions or roots
of the equation are those values of x which satisfy the equation.
Key Point 10
A polynomial equation of degree n has n roots.
Some (possibly all) of the roots may be repeated.
Some (possibly all) of the roots may be complex.
Example 24
Verify that x = −1, x = 1 and x = 0 are solutions (roots) of the equation
$$x^3 - x = 0$$
Solution
We substitute each value in turn into $x^3 - x$.
$$(-1)^3 - (-1) = -1 + 1 = 0$$
so x = −1 is clearly a root.
It is easy to verify similarly that x = 1 and x = 0 are also solutions.
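Substituting candidate roots, as in Example 24, is easy to mechanise. The following Python sketch evaluates a polynomial from its coefficient list (highest power first) using nested multiplication; the function name `evaluate` and the list representation are assumptions made for illustration.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients highest power first) at x."""
    total = 0
    for c in coeffs:
        total = total * x + c   # nested multiplication, one coefficient at a time
    return total

cubic = [1, 0, -1, 0]  # x^3 - x
print([evaluate(cubic, x) for x in (-1, 0, 1)])  # [0, 0, 0]
```

Each value in the output is zero, so all three substitutions confirm a root.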
In the next subsection we will consider ways in which polynomial equations of higher degree than
quadratic can be solved.
Exercises
Verify that the given values are solutions of the given equations.
1. $x^2 - 5x + 6 = 0$, x = 3, x = 2
2. $2t^3 + t^2 - t = 0$, t = 0, t = −1, t = $\frac{1}{2}$.
4. Solving polynomial equations when one solution is known

In Section 3.2 we gave a formula which can be used to solve quadratic equations. Unfortunately
when dealing with equations of higher degree no simple formulae exist. If one of the roots can be
spotted or is known we can sometimes find the others by the method shown in the next Example.
Example 25
Let the polynomial expression $x^3 - 17x^2 + 54x - 8$ be denoted by P(x). Verify
that x = 4 is a solution of the equation P(x) = 0. Hence find the other solutions.
Solution
We substitute x = 4 into the polynomial expression P(x):
$$P(4) = 4^3 - 17(4^2) + 54(4) - 8 = 64 - 272 + 216 - 8 = 0$$
So, when x = 4 the left-hand side equals zero. Hence x = 4 is indeed a solution. Knowing that
x = 4 is a root we can state that (x − 4) must be a factor of P(x). Therefore P(x) can be re-written
as a product of a linear and a quadratic term:
$$P(x) = x^3 - 17x^2 + 54x - 8 = (x - 4) \times (\text{quadratic polynomial})$$
The quadratic polynomial has already been found in Example 23 so we deduce that the given
equation can be written
$$P(x) = x^3 - 17x^2 + 54x - 8 = (x - 4)(x^2 - 13x + 2) = 0$$
In this form we see that x − 4 = 0 or $x^2 - 13x + 2 = 0$.
The first equation gives x = 4 which we already knew.
The second equation must be solved using one of the methods for solving quadratic equations given
in Section 3.2. For example, using the formula we find
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \quad \text{with } a = 1,\ b = -13,\ c = 2$$
$$= \frac{13 \pm \sqrt{(-13)^2 - 4 \cdot 1 \cdot 2}}{2} = \frac{13 \pm \sqrt{161}}{2} = \frac{13 \pm 12.6886}{2}$$
So x = 12.8443 and x = 0.1557 are roots of $x^2 - 13x + 2$.
Hence the three solutions of P(x) = 0 are x = 4, x = 12.8443 and x = 0.1557, to 4 d.p.
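The whole method of this subsection (remove the known linear factor, then apply the quadratic formula) can be sketched in Python as below. The function name is an assumption, the sketch deals only with cubics whose remaining roots are real, and the tolerance used to test the known root is an arbitrary choice.

```python
import math

def solve_cubic_with_known_root(coeffs, root):
    """Solve a cubic with real coefficients given one known real root.

    coeffs = [a3, a2, a1, a0]. Returns the known root followed by the two
    roots of the quadratic factor (real roots only in this sketch).
    """
    # Coefficients a, b, c of the quadratic factor, found by equating coefficients
    a = coeffs[0]
    b = coeffs[1] + root * a
    c = coeffs[2] + root * b
    remainder = coeffs[3] + root * c
    if abs(remainder) > 1e-9:
        raise ValueError("the supplied value is not a root")
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("remaining roots are complex")
    s = math.sqrt(disc)
    return [root, (-b + s) / (2 * a), (-b - s) / (2 * a)]

print([round(x, 4) for x in solve_cubic_with_known_root([1, -17, 54, -8], 4)])
# [4, 12.8443, 0.1557]
```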
Task
Solve the equation $x^3 + 8x^2 + 16x + 3 = 0$ given that x = −3 is a root.
Consider the equation $x^3 + 8x^2 + 16x + 3 = 0$.
Given that x = −3 is a root state a linear factor of the cubic:
Your solution
Answer
x+3
The cubic can therefore be expressed as
$x^3 + 8x^2 + 16x + 3 = (x + 3)(ax^2 + bx + c)$
where a, b, and c are constants. These can be found by expanding the right-hand side.
Expand the right-hand side:
Your solution
Answer
$x^3 + 8x^2 + 16x + 3 = ax^3 + (3a + b)x^2 + (3b + c)x + 3c$
Equate coefficients of $x^3$ to find a:
Your solution
Answer
1
Equate constant terms to find c:
Your solution
Answer
3 = 3c so that c = 1
Equate coefficients of $x^2$ to find b:
Your solution
Answer
8 = 3a + b so b = 5
This enables us to write the equation as $(x + 3)(x^2 + 5x + 1) = 0$ so x + 3 = 0 or $x^2 + 5x + 1 = 0$.

Now solve the quadratic and state all three roots:
Your solution
Answer
The quadratic equation can be solved using the formula to obtain x = −4.7913 and x = −0.2087.
Thus the three roots of $x^3 + 8x^2 + 16x + 3 = 0$ are x = −3, x = −4.7913 and x = −0.2087.
Exercises
1. Verify that the given value is a solution of the equation and hence find all solutions:
(a) $x^3 + 7x^2 + 11x + 2 = 0$, x = −2 (b) $2x^3 + 11x^2 - 2x - 35 = 0$, x = −5
2. Verify that x = 1 and x = 2 are solutions of $x^4 + 4x^3 - 17x^2 + 8x + 4 = 0$ and hence find all solutions.
Answers
1(a) −2, −0.2087, −4.7913 1(b) −5, −2.1375, 1.6375
2. 1,2, −0.2984, −6.7016
5. Solving polynomial equations graphically

Polynomial equations, particularly of high degree, are difficult to solve unless they take a particularly
simple form. A useful guide to the approximate values of the solutions can be obtained by sketching
the polynomial, and discovering where the curve crosses the x-axis. The real roots of the polynomial
equation P(x) = 0 are given by the values of the intercepts of the function y = P(x) with the x-axis
because on the x-axis y = P(x) is zero. Computer software packages and graphics calculators exist
which can be used for plotting graphs and hence for solving polynomial equations approximately.
Suppose the graph of y = P (x) is plotted and takes a form similar to that shown in Figure 6.
Figure 6: A polynomial function which cuts the x axis at points $x_1$, $x_2$ and $x_3$.
The graph intersects the x axis at x = x1 , x = x2 and x = x3 and so the equation P (x) = 0 has
three roots x1 , x2 and x3 , because P (x1 ) = 0, P (x2 ) = 0 and P (x3 ) = 0.
Example 26
Plot a graph of the function $y = 4x^4 - 15x^2 + 5x + 6$ and hence approximately
solve the equation $4x^4 - 15x^2 + 5x + 6 = 0$.
Solution
The graph has been plotted here with the aid of a computer graph plotting package and is shown
in Figure 7. By hand, a less accurate result would be produced, of course.
Figure 7: Graph of $y = 4x^4 - 15x^2 + 5x + 6$ (x axis shown from −5 to 5)
The solutions of the equation are found by looking for where the graph crosses the horizontal axis.
Careful examination shows the solutions are at or close to x = 1, x = 1.5, x = −0.5, x = −2.
An important feature of the graph of a polynomial is that it is continuous. There are never any gaps
or jumps in the curve. Polynomial curves never turn back on themselves in the horizontal direction,
(unlike a circle). By studying the graph in Figure 6 you will see that if we choose any two values
of x, say a and b, such that y(a) and y(b) have opposite signs, then at least one root lies between
x = a and x = b.
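The sign-change observation can be turned into a simple numerical procedure. The sketch below uses the bisection method, which is not described in this Section and is offered only as one possible illustration: the interval is halved repeatedly, always keeping the half on which the sign change occurs. The function name `bisect`, the tolerance and the starting interval [1.2, 2.0] are assumptions; the polynomial is the one from Example 26.

```python
def bisect(f, a, b, tol=1e-10):
    """Locate a root of a continuous f in [a, b] (a < b) when f(a), f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        if fa * fm < 0:      # sign change lies in the left half
            b, fb = mid, fm
        else:                # sign change lies in the right half
            a, fa = mid, fm
    return (a + b) / 2

def p(x):
    """The polynomial from Example 26."""
    return 4 * x**4 - 15 * x**2 + 5 * x + 6

# p(1.2) is negative and p(2.0) is positive, so a root lies between them
root = bisect(p, 1.2, 2.0)
print(round(root, 6))
```

The value found agrees with the root near x = 1.5 read from the graph in Example 26.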
Exercises
1. Factorise $x^3 - x^2 - 65x - 63$ given that (x + 7) is a factor.
2. Show that x = −1 is a root of $x^3 + 11x^2 + 31x + 21 = 0$ and locate the other roots algebraically.
3. Show that x = 2 is a root of $x^3 - 3x - 2 = 0$ and locate the other roots.
4. Solve the equation $x^4 - 2x^2 + 1 = 0$.
5. Factorise $x^4 - 7x^3 + 3x^2 + 31x + 20$ given that (x + 1) is a factor.
6. Given that two of the roots of $x^4 + 3x^3 - 7x^2 - 27x - 18 = 0$ have the same modulus but
different sign, solve the equation.
(Hint - let two of the roots be α and −α and use the technique of equating coefficients).
7. Consider the polynomial $P(x) = 5x^3 - 47x^2 + 84x$. By evaluating P(2) and P(3) show that
at least one root of P(x) = 0 lies between x = 2 and x = 3.
8. Without solving the equation or using a graphical calculator, show that $x^4 + 4x - 1 = 0$ has a
root between x = 0 and x = 1.
Answers
1. (x + 7)(x + 1)(x − 9)
2. x = −1, −3, −7
3. x = 2, −1 (repeated)
4. x = −1, 1 (each root repeated)
5. $(x + 1)^2 (x - 4)(x - 5)$
6. (x + 3)(x − 3)(x + 1)(x + 2) = 0, so x = −3, 3, −1, −2
Solving Simultaneous
Linear Equations 3.4
Introduction
Equations often arise in which there is more than one unknown quantity. When this is the case there
will usually be more than one equation involved. For example in the two linear equations
7x + y = 9, −3x + 2y = 1
there are two unknowns: x and y. In order to solve the equations we must find values for x and
y that satisfy both of the equations simultaneously. The two equations are called simultaneous
equations. You should verify that the solution of these equations is x = 1, y = 2 because by
substituting these values into both equations, the left-hand and right-hand sides are equal.
In this Section we shall show how two simultaneous equations can be solved either by a method
known as elimination or by drawing graphs. In realistic problems which arise in mathematics and
in engineering there may be many equations with many unknowns. Such problems cannot be solved
using a graphical approach (we run out of dimensions in our 3-dimensional world!). Solving these
more general problems requires the use of more general elimination procedures or the use of matrix
algebra. Both of these topics are discussed in later Workbooks.

Prerequisites
• be able to solve linear equations

Learning Outcomes
• solve pairs of simultaneous linear equations

1. Solving simultaneous equations by elimination

One way of solving simultaneous equations is by elimination. As the name implies, elimination
involves removing one or more of the unknowns. Note that if both sides of an equation are multiplied
or divided by a non-zero number an exactly equivalent equation results. For example, if we are given
the equation
x + 4y = 5
then by multiplying both sides by 7 we find
7x + 28y = 35
and this modified equation is equivalent to the original one.
Given two simultaneous equations, elimination of one unknown can be achieved by modifying the
equations so that the coefficients of that unknown in each equation are the same and then subtracting
one modified equation from the other. Consider the following example.
Example 27
Solve the simultaneous equations
3x + 5y = 31 (1)
2x + 3y = 20 (2)
Solution
We first try to modify each equation so that the coefficient of x is the same in both equations. This
can be achieved if Equation (1) is multiplied by 2 and Equation (2) is multiplied by 3. This gives
6x + 10y = 62
6x + 9y = 60
Now the unknown x can be eliminated if the second equation is subtracted from the first:
6x + 10y = 62
subtract 6x + 9y = 60
0x + 1y = 2
The result implies that 1y = 2 and we see immediately that y must equal 2. To find x we substitute
the value found for y into either of the given Equations (1) or (2). For example, using Equation (1),
3x + 5(2) = 31
3x = 21
x = 7
Thus the solution of the simultaneous equations is x = 7, y = 2.

N.B. You should always check your solution by substituting back into both of the given equations.
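The elimination steps of Example 27 can be sketched in Python. The function name `solve_by_elimination` and the (a, b, c) tuple representation of ax + by = c are assumptions made for illustration; exact fractions are used so that no rounding occurs, and the sketch assumes the coefficient of x in the first equation is non-zero and that the pair has exactly one solution.

```python
from fractions import Fraction

def solve_by_elimination(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by eliminating x.

    Each equation is a tuple (a, b, c).
    """
    a1, b1, c1 = map(Fraction, eq1)
    a2, b2, c2 = map(Fraction, eq2)
    # Multiply eq1 by a2 and eq2 by a1 so the x coefficients match, then subtract
    y = (c1 * a2 - c2 * a1) / (b1 * a2 - b2 * a1)
    # Substitute y back into the first equation to find x
    x = (c1 - b1 * y) / a1
    return x, y

x, y = solve_by_elimination((3, 5, 31), (2, 3, 20))
print(x, y)  # 7 2

# Check by substituting back into both of the given equations
print(3 * x + 5 * y == 31, 2 * x + 3 * y == 20)  # True True
```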
Example 28
Solve the equations
−3x + y = 18 (3)
7x − 3y = −44 (4)
Solution
We modify the equations so that x can be eliminated. For example, by multiplying Equation (3) by
7 and Equation (4) by 3 we find
−21x + 7y = 126
21x − 9y = −132
If these equations are now added we can eliminate x. Therefore
−21x + 7y = 126
add 21x − 9y = −132
0x − 2y = −6
from which −2y = −6, so that y = 3. Substituting this value of y into Equation (3) we obtain:
−3x + 3 = 18 so that − 3x = 15 so x = −5.
The solution is x = −5, y = 3.
Example 29
Solve the equations
5x + 3y = −74 (5)
−2x − 3y = 26 (6)
Solution
Note that the coefficients of y differ here only in sign.
By adding Equation (5) and Equation (6) we find 3x = −48 so that x = −16.
It then follows that y = 2, and the solution is x = −16, y = 2.
Task
Solve the equations
5x − 7y = −80
2x + 11y = 106
The first step is to modify the equations so that the coefficient of x is the same in both.
If the first is multiplied by 2 then the second equation must be multiplied by what?
Your solution
Answer
5
Write down the resulting equations:
Your solution
Answer
10x − 14y = −160, 10x + 55y = 530
Subtract one equation from the other to eliminate x and hence find y:
Your solution
Answer
55y − (−14y) = 530 − (−160) so 69y = 690 so y = 10.
Now substitute back to find x:

Your solution
Answer
x = −2
2. Equations with no solution
On occasions we may encounter a pair of simultaneous equations which have no solution. Consider
the following example.
Example 30
Show that the following pair of simultaneous equations have no solution.
10x − 2y = −3 (7)
−5x + y = 1 (8)
Solution
Leaving Equation (7) unaltered and multiplying Equation (8) by 2 we find
10x − 2y = −3
−10x + 2y = 2
Adding these equations to eliminate x we find that y is eliminated as well:
10x − 2y = −3
add −10x + 2y = 2
0x + 0y = −1
The last line ‘0 = −1’ is clearly nonsense.
We say that Equations (7) and (8) are inconsistent and they have no solution.
3. Equations with an infinite number of solutions

Some pairs of simultaneous equations can possess an infinite number of solutions. Consider the
following example.
Example 31
Solve the equations
2x + y = 8 (9)
4x + 2y = 16 (10)
Solution
If Equation (9) is multiplied by 2 we find both equations are identical: 4x + 2y = 16. This means
that one of them is redundant and we need only consider the single equation
2x + y = 8
There are infinitely many pairs of values of x and y which satisfy this equation. For example, if
x = 0 then y = 8, if x = 1 then y = 6, and if x = −3 then y = 14. We could continue like this
producing more and more solutions. Suppose we choose a value, say λ, for x. We can then write
2λ + y = 8 so that y = 8 − 2λ
The solution is therefore x = λ, y = 8 − 2λ for any value of λ whatsoever. There are an infinite
number of such solutions.
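The three situations met so far (one solution, no solution, infinitely many solutions) can be told apart without solving the equations. The Python sketch below is one way of doing this, under the assumptions that each equation ax + by = c is stored as a tuple (a, b, c), that the coefficients are integers or exact fractions, and that each equation has at least one non-zero coefficient of x or y. The function name `classify_system` is hypothetical.

```python
def classify_system(eq1, eq2):
    """Classify a1*x + b1*y = c1, a2*x + b2*y = c2 by its number of solutions."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    if a1 * b2 - a2 * b1 != 0:
        return "unique solution"
    # The left-hand sides are proportional; compare the right-hand sides too
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions"
    return "no solution"

print(classify_system((3, 5, 31), (2, 3, 20)))    # Example 27: unique solution
print(classify_system((10, -2, -3), (-5, 1, 1)))  # Example 30: no solution
print(classify_system((2, 1, 8), (4, 2, 16)))     # Example 31: infinitely many solutions
```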
Exercises
Solve the given simultaneous equations by elimination:
1. (a) 5x + y = 8, −3x + 2y = −10, (b) 2x + 3y = −2, 5x − 5y = 20,
(c) 7x + 11y = −24, −9x + y = 46
2. A straight line has equation of the form y = ax + b. The line passes through the points with
coordinates (2, 4) and (−1, 3). Write down the simultaneous equations which must be satisfied
by a and b. Solve the equations and hence find the equation of the line.
3. A quadratic function $y = ax^2 + bx + c$ is used in signal processing to approximate a more
complicated signal. If this function must pass through the points with coordinates (0, 0), (1, 3)
and (5, −11) write down the simultaneous equations satisfied by a, b and c. Solve these to
find the quadratic function.
Answers
1.(a) x = 2, y = −2 (b) x = 2, y = −2 (c) x = −5, y = 1
2. $y = \frac{1}{3}x + \frac{10}{3}$ 3. $y = -\frac{13}{10}x^2 + \frac{43}{10}x$
4. The graphs of simultaneous linear equations
Each equation in a pair of simultaneous linear equations is, of course, a linear equation and plotting
its graph will produce a straight line. The coordinates (x, y) of the point of intersection of the two
lines represent the solution of the simultaneous equations because this pair of values satisfies both
equations simultaneously. If the two lines do not intersect then the equations have no solution
(this can only happen if they are distinct and parallel). If the two lines are identical, there are an
infinite number of solutions (all points on the line) because the two lines are one on top of the
other. Although not the most convenient (or accurate) approach it is possible to solve simultaneous
equations using this graphical approach. Consider the following examples.
Example 32
Solve the simultaneous equations
4x + y = 9 (11)
−x + y = −1 (12)
by plotting two straight line graphs.
Solution
Equation (11) is rearranged into the standard form for the equation of a straight line: y = −4x + 9.
By selecting two points on the line a graph can be drawn as shown in Figure 8. Similarly, Equation
(12) can be rearranged as y = x − 1 and its graph drawn. This is also shown in Figure 8.
Figure 8: The coordinates of the point of intersection give the required solution (line I: y = 9 − 4x; line II: y = x − 1)
The coordinates of any point on line I satisfy 4x + y = 9. The coordinates of any point on line
II satisfy −x + y = −1. At the point where the two lines intersect the x and y coordinates must
satisfy both equations simultaneously and so the point of intersection represents the solution. We
see from the graph that the point of intersection is (2, 1). The solution of the given equations is
therefore x = 2, y = 1.
Task
Find any solutions of the simultaneous equations: 10x − 2y = 4, 5x − y = −1 by
a graphical method.
Your solution
Answer
Re-writing the equations in standard form we find
y = 5x − 2, and y = 5x + 1
Graphs of these lines are shown below. Note that these distinct lines are parallel and so do not
intersect. This means that the given simultaneous equations do not have a solution; they are
inconsistent.
[Graph of the parallel lines y = 5x + 1 and y = 5x − 2]
Exercises
Solve the given equations graphically:
1. 5x − y = 7, 2x + y = 7,
2. 2x − 2y = −2, 5x + y = −9,
3. 7x + 3y = 25, −2x + y = 4,
4. 4x + 4y = −4, x + 7y = −19.
Answers
1. x = 2, y = 3 2. x = −5/3, y = −2/3 3. x = 1, y = 6 4. x = 2, y = −3

Solving Inequalities 3.5
Introduction
An inequality is an expression involving one of the symbols ≥, ≤, > or <. This Section will first
show how to manipulate inequalities correctly. Then algebraical and graphical methods of solving
inequalities will be described.

Prerequisites
Before starting this Section you should . . .
• be able to solve linear and quadratic equations

Learning Outcomes
On completion you should be able to . . .
• re-arrange expressions involving inequalities
• solve linear and quadratic inequalities

1. The inequality symbols

Recall the definitions of the inequality symbols in Key Point 11:
Key Point 11
The symbols >, <, ≥, ≤ are called inequalities
> means: ‘is greater than’, ≥ means: ‘is greater than or equal to’
< means: ‘is less than’, ≤ means: ‘is less than or equal to’
So for example,
8>7 9≥2 −2<3 7≤7
A number line is often a helpful way of picturing inequalities. Given two numbers a and b, if b > a
then b will be to the right of a on the number line as shown in Figure 9.
a b
Figure 9: When b > a, b is to the right of a on the number line.
Note from Figure 10 that −3 > −5, 4 > −2 and 8 > 5.
−5 −4 −3 −2 −1 0 1 2 3 4 5 6 7 8
Figure 10
Inequalities can always be written in two ways. For example in English we can state that 8 is greater
than 7, or equivalently, that 7 is less than 8. Mathematically we write 8 > 7 or 7 < 8. In general if
b > a then a < b. If a < b then a will be to the left of b on the number line.
Example 33
Rewrite the inequality −2/5 < x using only the ‘greater than’ sign, >.
Solution
−2/5 < x can be written as x > −2/5.
Example 34
Rewrite the inequality 5 > x using only the ‘less than’ sign, <.
Solution
5 > x can be written as x < 5.
Sometimes two inequalities are combined into a single statement. Consider for example the statement
3 < x < 6. This is a compact way of writing ‘3 < x and x < 6’. Now 3 < x is equivalent to
x > 3 and so 3 < x < 6 means x is greater than 3 but less than 6.
Inequalities obey simple rules when used in conjunction with arithmetical operations:
Key Point 12
1. Adding the same quantity to, or subtracting it from, both sides of an inequality leaves the inequality
symbol unchanged.
2. Multiplying or dividing both sides by a positive number leaves the inequality unchanged.
3. Multiplying or dividing both sides by a negative number reverses the inequality.
For example, since 8 > 5, by adding k to both sides we can state

8 + k > 5 + k
for any value of k. For example (with k = −3) 8 − 3 > 5 − 3. Further, by multiplying both sides of
8 > 5 by k we can state 8k > 5k provided k is positive. However, 8k < 5k if k is negative.
We emphasise that the inequality sign is reversed when multiplying both sides by a negative number.
A common mistake is to forget to reverse the inequality symbol. For example if 8 > 5, multiplying
both sides by −1 gives −8 < −5.
Task
Find the result of multiplying both sides of the inequality −18 < 9 by −3.
Your solution
Answer
54 > −27
The modulus or magnitude sign is sometimes used with inequalities. For example |x| < 1 represents
the set of all numbers whose actual size, irrespective of sign, is less than 1. This means any value
between −1 and 1. Thus
|x| < 1 means −1 < x < 1
Similarly |x| > 4 means all numbers whose size, irrespective of sign, is greater than 4. This means
any value greater than 4 or less than −4. Thus
|x| > 4 means x > 4 or x < −4
In general, if k is a positive number:
Key Point 13
|x| < k means −k < x < k

|x| > k means x > k or x < −k
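Key Point 13 can be spot-checked numerically. The sketch below is an illustration only, with hypothetical helper names `inside` and `outside`; it compares each modulus statement with its modulus-free form on a grid of sample values, assuming k is positive.

```python
def inside(x, k):
    """True when -k < x < k, the modulus-free form of |x| < k."""
    return -k < x < k

def outside(x, k):
    """True when x > k or x < -k, the modulus-free form of |x| > k."""
    return x > k or x < -k

# Compare both forms with abs() on sample values from -10 to 10 in steps of 0.25.
samples = [i / 4 for i in range(-40, 41)]
for k in (1, 4, 7.5):
    assert all((abs(x) < k) == inside(x, k) for x in samples)
    assert all((abs(x) > k) == outside(x, k) for x in samples)
```

A check of this kind does not prove the Key Point, but a failed assertion would reveal a mistake in rewriting the inequality.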
Exercises
1. State which of the following statements are true and which are false.
(a) 4 > 9, (b) 4 > 4, (c) 4 ≥ 4, (d) 0.001 < 10^−5, (e) |−19| < 100,
(f) |−19| > −20, (g) 0.001 ≤ 10^−3
In questions 2-9 rewrite each of the statements without using a modulus sign:
2. |x| < 2, 3. |x| < 5, 4. |x| ≤ 7.5, 5. |x − 3| < 2,
6. |x − a| < 1, 7. |x| > 2, 8. |x| > 7.5, 9. |x| ≥ 0.
Answers
1. (a) F (b) F (c) T (d) F (e) T (f) T (g) T
2. −2 < x < 2 3. −5 < x < 5 4. −7.5 ≤ x ≤ 7.5
5. −2 < x − 3 < 2 6. −1 < x − a < 1 7. x > 2 or x < −2
8. x > 7.5 or x < −7.5 9. x ≥ 0 or x ≤ 0, in fact any x.
2. Solving linear inequalities algebraically

When we are asked to solve an inequality, the inequality will contain an unknown variable, say x.
Solving means obtaining all values of x for which the inequality is true. In a linear inequality the
unknown appears only to the first power, that is as x, and not as x^2, x^3, x^(1/2) and so on.
Consider the following examples.
Example 35
Solve the inequality 4x + 3 > 0.
Solution
4x + 3 > 0
4x > −3, by subtracting 3 from both sides
x > −3/4, by dividing both sides by 4.
Hence all values of x greater than −3/4 satisfy 4x + 3 > 0.
Example 36
Solve the inequality −3x − 7 ≤ 0.
Solution
−3x − 7 ≤ 0
−3x ≤ 7, by adding 7 to both sides
x ≥ −7/3, by dividing both sides by −3 and reversing the inequality
Hence all values of x greater than or equal to −7/3 satisfy −3x − 7 ≤ 0.
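The two steps used in Examples 35 and 36 (move the constant, then divide by the coefficient of x, reversing the inequality if that coefficient is negative) can be sketched as a short function. This is an illustration under stated assumptions, not a method given in the workbook: the inequality is assumed to have the form ax + b (op) 0 with a non-zero, and the names `solve_linear` and `FLIP` are hypothetical.

```python
from fractions import Fraction

# How each symbol changes when both sides are divided by a negative number.
FLIP = {"<": ">", ">": "<", "<=": ">=", ">=": "<="}

def solve_linear(a, b, op):
    """Solve a*x + b (op) 0 for x, where op is '<', '>', '<=' or '>='.

    Returns (op, bound), meaning 'x op bound'. a must be non-zero.
    """
    if a == 0:
        raise ValueError("a must be non-zero for a linear inequality")
    bound = Fraction(-b, a)          # subtract b, then divide by a
    if a < 0:                        # dividing by a negative number
        op = FLIP[op]                # reverses the inequality
    return op, bound

print(solve_linear(4, 3, ">"))     # Example 35: 4x + 3 > 0
print(solve_linear(-3, -7, "<="))  # Example 36: -3x - 7 <= 0
```

The lookup table makes the reversal rule of Key Point 12 explicit rather than hiding it in the arithmetic.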
Task
Solve the inequality 17x + 2 < 4x + 1.
This is done by making x the subject, that is, by obtaining it on its own on the left-hand
side.
Start by subtracting 4x from both sides to remove quantities involving x from the right:
Your solution
Answer
13x + 2 < 1
Now subtract 2 from both sides to remove the 2 on the left:

Your solution
Answer
13x < −1. Finally, dividing both sides by 13, the range of values of x is x < −1/13.
Example 37
Solve the inequality |5x − 2| < 4 and depict the solution graphically.
Solution
|5x − 2| < 4 is equivalent to −4 < 5x − 2 < 4

We treat each part of the inequality separately:
−4 < 5x − 2
−2 < 5x, by adding 2 to both sides
−2/5 < x, by dividing both sides by 5
So x > −2/5. Now consider the second part: 5x − 2 < 4.
5x − 2 < 4
5x < 6, by adding 2 to both sides
x < 6/5, by dividing both sides by 5
So x < 6/5.
Putting both parts of the solution together we see that the inequality is satisfied when
−2/5 < x < 6/5. This range of values is shown in Figure 11.
[Number line with the interval from −2/5 to 6/5 marked]
Figure 11: |5x − 2| < 4 which is equivalent to −2/5 < x < 6/5
Task
Solve the inequality |1 − 2x| < 5.
First of all rewrite the inequality without using the modulus sign:
Your solution
|1 − 2x| < 5 is equivalent to:
Answer
−5 < 1 − 2x < 5
Then treat each part separately. First of all consider −5 < 1 − 2x. Solve this:
Your solution
Answer
x < 3
The second part is 1 − 2x < 5. Solve this.
Your solution
Answer
x > −2
Finally, give the solution as one statement:
Your solution
Answer
−2 < x < 3.
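Inequalities of the form |ax + b| < k can be handled in one step once the modulus has been removed. The following sketch assumes a is non-zero and k is positive; the function name `solve_abs_less` is illustrative only. It returns the two ends of the interval of solutions.

```python
from fractions import Fraction

def solve_abs_less(a, b, k):
    """Solve |a*x + b| < k (a != 0, k > 0); return (low, high) with low < x < high."""
    if a == 0 or k <= 0:
        raise ValueError("need a != 0 and k > 0")
    # |a*x + b| < k  is equivalent to  -k < a*x + b < k
    end1 = Fraction(-k - b, a)
    end2 = Fraction(k - b, a)
    # Dividing by a negative a swaps the two ends, so put them in order.
    return min(end1, end2), max(end1, end2)

print(solve_abs_less(5, -2, 4))   # Example 37: |5x - 2| < 4
print(solve_abs_less(-2, 1, 5))   # |1 - 2x| < 5
```

Ordering the two ends at the last step takes care of the reversal that occurs when the coefficient of x is negative, as in |1 − 2x| < 5.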
Exercises
In the following questions solve the given inequality algebraically.
1. 4x > 8 2. 5x > 8 3. 8x > 5 4. 8x ≤ 5
5. 2x > 1 6. 3x < −1 7. 5x > 2 8. 2x > 0
9. 8x < 0 10. 3x ≥ 0 11. 3x > 4 12. (3/4)x > 1
13. 4x ≤ −3 14. 3x ≤ −4 15. 5x ≥ 0 16. 4x ≤ 0
17. 5x + 1 < 8 18. 5x + 1 ≤ 8 19. 7x + 3 ≥ 0
20. 18x + 2 > 9 21. 14x + 11 > 22 22. 1 − 5x ≤ 0
23. 2 + 5x ≥ 1 24. 11 − 7x < 2 25. 5 + 4x > 2x + 1
26. |7x − 3| > 1 27. |2x + 1| ≥ 3 28. |5x| < 1
29. |5x| ≤ 0 30. |1 − 5x| > 2 31. |2 − 5x| ≥ 3
Answers
1. x > 2 2. x > 8/5 3. x > 5/8 4. x ≤ 5/8
5. x > 1/2 6. x < −1/3 7. x > 2/5 8. x > 0
9. x < 0 10. x ≥ 0 11. x > 4/3 12. x > 4/3
13. x ≤ −3/4 14. x ≤ −4/3 15. x ≥ 0 16. x ≤ 0
17. x < 7/5 18. x ≤ 7/5 19. x ≥ −3/7 20. x > 7/18
21. x > 11/14 22. x ≥ 1/5 23. x ≥ −1/5 24. x > 9/7
25. x > −2 26. x > 4/7 or x < 2/7 27. x ≥ 1 or x ≤ −2 28. −1/5 < x < 1/5
29. x = 0 30. x < −1/5 or x > 3/5 31. x ≤ −1/5 or x ≥ 1
3. Solving inequalities using graphs

Graphs can be used to help solve inequalities. This approach is particularly useful if the inequality is
not linear because, in these cases, solving the inequality algebraically can often be very tricky. Graphics
calculators or software can save a lot of time and effort here.
Example 38
Solve graphically the inequality 5x + 2 < 0.
Solution
[Graph: the line y = 5x + 2, crossing the x axis at x = −2/5]
Figure 12: Graph of y = 5x + 2.

We consider the function y = 5x + 2 whose graph is shown in Figure 12. The values of x which
make 5x + 2 negative are those for which y is negative. We see directly from the graph that y is
negative when x < −2/5.
Example 39
Find the range of values of x for which x^2 − x − 6 < 0.
Solution
We consider the graph of y = x^2 − x − 6 which is shown in Figure 13.
[Graph: the parabola y = x^2 − x − 6, crossing the x axis at x = −2 and x = 3]
Figure 13: Graph of y = x^2 − x − 6

Note that the graph crosses the x axis when x = −2 and when x = 3, and x^2 − x − 6 will be
negative when y is negative. Directly from the graph we see that y is negative when −2 < x < 3.
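The crossing points read from the graph can be confirmed with the quadratic formula. The sketch below swaps the graphical reading for a direct calculation; it assumes a positive coefficient of x^2 (so the parabola opens upwards and is negative only between its roots), and the name `quadratic_negative_interval` is hypothetical.

```python
import math

def quadratic_negative_interval(a, b, c):
    """For a > 0, return (r1, r2) such that a*x^2 + b*x + c < 0 exactly when r1 < x < r2.

    Returns None when the quadratic does not have two distinct real roots,
    in which case it is never negative.
    """
    if a <= 0:
        raise ValueError("this sketch only handles a > 0")
    disc = b * b - 4 * a * c
    if disc <= 0:
        return None
    root = math.sqrt(disc)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)

print(quadratic_negative_interval(1, -1, -6))   # x^2 - x - 6 < 0
```

For values of x outside the returned interval (and not equal to either root) the quadratic is positive.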
Task
Find the range of values of x for which x^2 − x − 6 > 0.
The graph of y = x^2 − x − 6 has been drawn in Figure 13. We require
y = x^2 − x − 6 to be positive.
Use the graph to solve the problem:
Your solution
Answer
x < −2 or x > 3
Example 40
By plotting a graph of y = 20x^4 − 4x^3 − 143x^2 + 46x + 165 find the range of
values of x for which
20x^4 − 4x^3 − 143x^2 + 46x + 165 < 0
Solution
A software package has been used to plot the graph which is shown in Figure 14. We see that y is
negative when −2.5 < x < −1 and is also negative when 1.5 < x < 2.2.
[Graph: y plotted against x for values of x from −5 to 5]
Figure 14: Graph of y = 20x^4 − 4x^3 − 143x^2 + 46x + 165
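Where no plotting package is to hand, the readings taken from Figure 14 can be checked by evaluating the polynomial directly. The following is a minimal sketch; the function name `poly` is illustrative, and Horner's method is used simply as a convenient way to evaluate the polynomial.

```python
def poly(x):
    """Evaluate 20x^4 - 4x^3 - 143x^2 + 46x + 165 by Horner's method."""
    result = 0.0
    for coeff in (20, -4, -143, 46, 165):
        result = result * x + coeff
    return result

# The graph suggests crossings of the x axis at x = -2.5, -1, 1.5 and 2.2.
for x in (-2.5, -1, 1.5, 2.2):
    print(x, round(poly(x), 6))

# Sample one point in each region between crossings to read off the sign of y.
for x in (-3, -2, 0, 2, 3):
    print(x, "negative" if poly(x) < 0 else "positive")
```

The values at the four crossing points should be zero up to rounding error, and the signs at the sample points should match the two intervals quoted in the Solution.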
Exercises
In questions 1-5 solve the given inequality graphically:
1. 3x + 1 < 0 2. 2x − 7 < 0 3. 6x + 9 > 0 4. 5x − 3 > 0 5. x^2 − x − 6 < 0
Answers
1. x < −1/3 2. x < 7/2, 3. x > −3/2 4. x > 3/5 5. −2 < x < 3

Partial Fractions 3.6
Introduction
It is often helpful to break down a complicated algebraic fraction into a sum of simpler fractions. For
example it can be shown that (4x + 7)/(x^2 + 3x + 2) has the same value as 1/(x + 2) + 3/(x + 1) for any value of x.
We say that
(4x + 7)/(x^2 + 3x + 2) is identically equal to 1/(x + 2) + 3/(x + 1)
and that the partial fractions of (4x + 7)/(x^2 + 3x + 2) are 1/(x + 2) and 3/(x + 1).
The ability to express a fraction as its partial fractions is particularly useful in the study of Laplace
transforms, Z-transforms, Control Theory and Integration. In this Section we explain how partial
fractions are found.

Prerequisites
Before starting this Section you should . . .
• be familiar with addition, subtraction, multiplication and division of algebraic fractions

Learning Outcomes
On completion you should be able to . . .
• distinguish between proper and improper fractions
• express an algebraic fraction as the sum of its partial fractions
1. Proper and improper fractions

Frequently we find that an algebraic fraction appears in the form
algebraic fraction = numerator/denominator
where both numerator and denominator are polynomials. For example
(x^3 + x^2 + 3x + 7)/(x^2 + 1), (3x^2 − 2x + 5)/(x^2 − 7x + 2), and x/(x^4 + 1).
The degree of the numerator, n say, is the highest power occurring in the numerator. The degree
of the denominator, d say, is the highest power occurring in the denominator. If d > n the fraction
is said to be proper; the third expression above is such an example. If d ≤ n the fraction is said to
be improper; the first and second expressions above are examples of this type. Before calculating
the partial fractions of an algebraic fraction it is important to decide whether the fraction is proper
or improper.
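The classification depends only on the two degrees, so it is easy to sketch in code. In the illustration below a polynomial is represented by a list of its coefficients, highest power first; this representation and the names `degree` and `classify` are assumptions made for the example.

```python
def degree(coeffs):
    """Degree of a polynomial given as coefficients, highest power first."""
    stripped = list(coeffs)
    while len(stripped) > 1 and stripped[0] == 0:
        stripped.pop(0)              # ignore leading zeros
    return len(stripped) - 1

def classify(numerator, denominator):
    """Return 'proper' if d > n, otherwise 'improper'."""
    n, d = degree(numerator), degree(denominator)
    return "proper" if d > n else "improper"

print(classify([1, 1, 3, 7], [1, 0, 1]))      # (x^3 + x^2 + 3x + 7)/(x^2 + 1)
print(classify([3, -2, 5], [1, -7, 2]))       # (3x^2 - 2x + 5)/(x^2 - 7x + 2)
print(classify([1, 0], [1, 0, 0, 0, 1]))      # x/(x^4 + 1)
```

A missing power is entered as a zero coefficient, so x^2 + 1 becomes `[1, 0, 1]`.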
Task
For each of the following fractions state the degree of the numerator (= n) and
the degree of the denominator (= d). Hence classify the fractions as proper or
improper.
(a) (x^3 + x^2 + 3x + 7)/(x^2 + 1), (b) (3x^2 − 2x + 5)/(x^2 − 7x + 2), (c) x/(x^4 + 1), (d) (s^2 + 4s + 5)/((s^2 + 2s + 4)(s + 3))
(a) Find the degree of denominator and numerator and hence classify (a):
Your solution
Answer
The degree of the numerator, n, is 3. The degree of the denominator, d, is 2.
Because d ≤ n the fraction is improper.
(b) Here n = 2 and d = 2. State whether (b) is proper or improper:

Your solution
Answer
d ≤ n; the fraction is improper.
(c) Noting that x = x^1, state whether (c) is proper or improper:

Your solution
Answer
d > n; the fraction is proper.
(d) Find the degree of the numerator and denominator of (d):
Your solution
Answer
Removing the brackets in the denominator we see that d = 3. As n = 2 this fraction is proper.
Exercise
For each fraction state the degrees of the numerator and denominator, and hence determine which
are proper and which are improper.
(a) (x + 1)/x, (b) x^2/(x^3 − x), (c) (x − 1)(x − 2)(x − 3)/(x − 5)
Answers (a) n = 1, d = 1, improper, (b) n = 2, d = 3, proper, (c) n = 3, d = 1, improper.
The denominator of an algebraic fraction can often be factorised into a product of linear and/or
quadratic factors. Before we can separate algebraic fractions into simpler (partial) fractions we need
to completely factorise the denominators into linear and quadratic factors. Linear factors are those
of the form ax + b; for example 2x + 7, 3x − 2 and 4 − x. Irreducible quadratic factors are those
of the form ax^2 + bx + c such as x^2 + x + 1, and 4x^2 − 2x + 3, which cannot be factorised into linear
factors (these are quadratics with complex roots).
2. Proper fractions with linear factors

Firstly we describe how to calculate partial fractions for proper fractions where the denominator may
be written as a product of linear factors. The steps are as follows:
• Factorise the denominator.

• Each factor will produce a partial fraction. A factor such as 3x + 2 will produce a partial
fraction of the form A/(3x + 2) where A is an unknown constant. In general a linear factor
ax + b will produce a partial fraction A/(ax + b). The unknown constants for each partial
fraction may be different and so we will call them A, B, C and so on.
• Evaluate the unknown constants by equating coefficients or using specific values of x.
The sum of the partial fractions is identical to the original algebraic fraction for all values of x.
Key Point 14
A linear factor ax + b in the denominator gives rise to a single partial fraction of the form
A/(ax + b)
The steps involved in expressing a proper fraction as partial fractions are illustrated in the following
Example.
Example 41
Express (7x + 10)/(2x^2 + 5x + 3) in terms of partial fractions.
Solution
Note that this fraction is proper. The denominator is factorised to give (2x + 3)(x + 1). Each of
the linear factors produces a partial fraction. The factor 2x + 3 produces a partial fraction of the
form A/(2x + 3) and the factor x + 1 produces a partial fraction B/(x + 1), where A and B are constants
which we need to find. We write
(7x + 10)/((2x + 3)(x + 1)) = A/(2x + 3) + B/(x + 1)
By multiplying both sides by (2x + 3)(x + 1) we obtain
7x + 10 = A(x + 1) + B(2x + 3) . . . (*)
We may now let x take any value we choose. By an appropriate choice we can simplify the
right-hand side. Let x = −1 because this choice eliminates A. We find
7(−1) + 10 = A(0) + B(−2 + 3)

3 = B
so that the constant B must equal 3. The constant A can be found either by substituting some
other value for x or alternatively by ‘equating coefficients’.
Observe that, by rearranging the right-hand side, Equation (*) can be written as
7x + 10 = (A + 2B)x + (A + 3B)
Comparing the coefficients of x on both sides we see that 7 = A + 2B. We already know B = 3
and so
7 = A + 2(3) = A + 6
from which A = 1. We can therefore write
(7x + 10)/(2x^2 + 5x + 3) = 1/(2x + 3) + 3/(x + 1)
We have succeeded in expressing the given fraction as the sum of its partial fractions. The result
can always be checked by adding the fractions on the right.
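The substitution step of Example 41 can be sketched for any proper fraction with a linear numerator and two distinct linear factors. This is an illustration only: the function name `two_linear_factors` and its argument order are assumptions, and both constants are found here by substitution, whereas the Example finds A by equating coefficients.

```python
from fractions import Fraction

def two_linear_factors(p, q, a1, b1, a2, b2):
    """Find A, B with (p*x + q)/((a1*x + b1)(a2*x + b2)) = A/(a1*x + b1) + B/(a2*x + b2).

    Uses p*x + q = A*(a2*x + b2) + B*(a1*x + b1) and substitutes the value
    of x that makes each bracket zero. The two factors must be distinct.
    """
    x1 = Fraction(-b1, a1)                    # makes a1*x + b1 zero, leaving A
    x2 = Fraction(-b2, a2)                    # makes a2*x + b2 zero, leaving B
    A = (p * x1 + q) / (a2 * x1 + b2)
    B = (p * x2 + q) / (a1 * x2 + b1)
    return A, B

A, B = two_linear_factors(7, 10, 2, 3, 1, 1)  # Example 41
print(A, B)

# Check the identity at a few values of x away from the roots of the denominator.
for x in (0, 1, 2, 5):
    lhs = Fraction(7 * x + 10, (2 * x + 3) * (x + 1))
    assert lhs == A / (2 * x + 3) + B / (x + 1)
```

The closing loop is the numerical counterpart of checking the result by adding the fractions on the right.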
Task
Express (9 − 4x)/(3x^2 − x − 2) in partial fractions.

Your solution
3x^2 − x − 2 =
Answer
(3x + 2)(x − 1)
Because there are two linear factors we write

(9 − 4x)/(3x^2 − x − 2) = A/(3x + 2) + B/(x − 1)
Multiply both sides by (3x + 2)(x − 1) to obtain the equation from which to find A and B:
Your solution
9 − 4x =
Answer
9 − 4x = A(x − 1) + B(3x + 2)
Substitute an appropriate value for x to obtain B:

Your solution
Answer
Substitute x = 1 and get B = 1
Equating coefficients of x to obtain the value of A:

Your solution
Answer
−4 = A + 3B, A = −7 since B = 1
Finally, write down the partial fractions:

Your solution
(9 − 4x)/(3x^2 − x − 2) =
Answer
−7/(3x + 2) + 1/(x − 1)
Exercises
1. Find the partial fractions of (a) (5x − 1)/((x + 1)(x − 2)), (b) (7x + 25)/((x + 4)(x + 3)), (c) (11x + 1)/((x − 1)(2x + 1)).
Check by adding the partial fractions together again.
2. Express each of the following as the sum of partial fractions:
(a) 3/((x + 1)(x + 2)), (b) 5/(x^2 + 7x + 12), (c) −3/((2x + 1)(x − 3))
Answers
1(a) 2/(x + 1) + 3/(x − 2), 1(b) 3/(x + 4) + 4/(x + 3), 1(c) 4/(x − 1) + 3/(2x + 1),
2(a) 3/(x + 1) − 3/(x + 2), 2(b) 5/(x + 3) − 5/(x + 4), 2(c) 6/(7(2x + 1)) − 3/(7(x − 3)).
3. Proper fractions with repeated linear factors

Sometimes a linear factor appears more than once. For example in
1/(x^2 + 2x + 1) = 1/((x + 1)(x + 1)) which equals 1/(x + 1)^2
the factor (x + 1) occurs twice. We call it a repeated linear factor. The repeated linear factor
(x + 1)^2 produces two partial fractions of the form A/(x + 1) + B/(x + 1)^2. In general, a repeated linear
factor of the form (ax + b)^2 generates two partial fractions of the form
A/(ax + b) + B/(ax + b)^2
This is reasonable since the sum of two such fractions always gives rise to a proper fraction:
A/(ax + b) + B/(ax + b)^2 = A(ax + b)/(ax + b)^2 + B/(ax + b)^2 = (x(Aa) + Ab + B)/(ax + b)^2
Key Point 15
A repeated linear factor (ax + b)^2 in the denominator produces two partial fractions:
A/(ax + b) + B/(ax + b)^2
Once again the unknown constants are found by equating coefficients and/or substituting
specific values for x.
Task
Express (10x + 18)/(4x^2 + 12x + 9) in partial fractions.
Your solution
4x^2 + 12x + 9 =
Answer
(2x + 3)(2x + 3) = (2x + 3)^2
There is a repeated linear factor (2x + 3) which gives rise to two partial fractions of the form
(10x + 18)/(2x + 3)^2 = A/(2x + 3) + B/(2x + 3)^2
Multiply both sides through by (2x + 3)^2 to obtain the equation to be solved to find A and B:
Your solution
Answer
10x + 18 = A(2x + 3) + B
Now evaluate the constants A and B by equating coefficients:

Your solution
Answer
Equating the x coefficients gives 10 = 2A so A = 5. Equating constant terms gives 18 = 3A + B
from which B = 3.
Finally express the answer in partial fractions:
Your solution
Answer
(10x + 18)/(2x + 3)^2 = 5/(2x + 3) + 3/(2x + 3)^2
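For a linear numerator over a single repeated linear factor, equating coefficients gives the constants directly. The sketch below covers only that special case; the function name `repeated_linear` is hypothetical.

```python
from fractions import Fraction

def repeated_linear(p, q, a, b):
    """Find A, B with (p*x + q)/(a*x + b)^2 = A/(a*x + b) + B/(a*x + b)^2.

    Equating coefficients in p*x + q = A*(a*x + b) + B gives
    p = A*a (x terms) and q = A*b + B (constant terms).
    """
    A = Fraction(p, a)
    B = q - A * b
    return A, B

print(repeated_linear(10, 18, 2, 3))   # (10x + 18)/(2x + 3)^2
```

When the denominator contains other factors as well, the same two partial fractions appear, but A and B must then be found together with the remaining constants.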
Exercises
Express the following in partial fractions.
(a) (3 − x)/(x^2 − 2x + 1), (b) −(7x − 15)/(x − 1)^2, (c) (3x + 14)/(x^2 + 8x + 16),
(d) (5x + 18)/(x + 4)^2, (e) (2x^2 − x + 1)/((x + 1)(x − 1)^2), (f) (5x^2 + 23x + 24)/((2x + 3)(x + 2)^2),
(g) (6x^2 − 30x + 25)/((3x − 2)^2 (x + 7)), (h) (s + 2)/(s + 1)^2, (i) (2s + 3)/s^2.
Answers
(a) −1/(x − 1) + 2/(x − 1)^2 (b) −7/(x − 1) + 8/(x − 1)^2 (c) 3/(x + 4) + 2/(x + 4)^2
(d) 5/(x + 4) − 2/(x + 4)^2 (e) 1/(x + 1) + 1/(x − 1) + 1/(x − 1)^2 (f) 3/(2x + 3) + 1/(x + 2) + 2/(x + 2)^2
(g) −1/(3x − 2) + 1/(3x − 2)^2 + 1/(x + 7) (h) 1/(s + 1) + 1/(s + 1)^2 (i) 2/s + 3/s^2.
4. Proper fractions with quadratic factors

Sometimes when a denominator is factorised it produces a quadratic term which cannot be factorised
into linear factors. One such quadratic factor is x^2 + x + 1. This factor produces a partial fraction
of the form (Ax + B)/(x^2 + x + 1). In general a quadratic factor of the form ax^2 + bx + c produces a single
partial fraction of the form (Ax + B)/(ax^2 + bx + c).
Key Point 16
A quadratic factor ax^2 + bx + c in the denominator produces a partial fraction of the form
(Ax + B)/(ax^2 + bx + c)
Task
Express (3x + 1)/((x^2 + x + 10)(x − 1)) as partial fractions
Note that the quadratic factor cannot be factorised further. We have
(3x + 1)/((x^2 + x + 10)(x − 1)) = (Ax + B)/(x^2 + x + 10) + C/(x − 1)
First multiply both sides by (x^2 + x + 10)(x − 1):
Your solution
3x + 1 =
Answer
(Ax + B)(x − 1) + C(x^2 + x + 10)
Evaluate C by letting x = 1:
Your solution
Answer
4 = 12C so that C = 1/3
Equate coefficients of x^2 and hence find A, and then substitute any other value for x (or equate
coefficients of x) to find B:
Your solution
A= B=
Answer
A = −1/3, B = 7/3.
Finally express in partial fractions:
Your solution
Answer
(3x + 1)/((x^2 + x + 10)(x − 1)) = (−(1/3)x + 7/3)/(x^2 + x + 10) + (1/3)/(x − 1) = (7 − x)/(3(x^2 + x + 10)) + 1/(3(x − 1))
Admittance
Admittance, Y , is a quantity which is used in analysing electronic circuits. A typical expression for
admittance is
Y(s) = (s^2 + 4s + 5)/((s^2 + 2s + 4)(s + 3))
where s can be thought of as representing frequency. To predict the behaviour of the circuit it is
often necessary to express the admittance as the sum of its partial fractions and find the effect of
each part separately. Express Y (s) in partial fractions.
The fraction is proper. The denominator contains an irreducible quadratic factor, which cannot be
factorised further, and also a linear factor. Thus
(s^2 + 4s + 5)/((s^2 + 2s + 4)(s + 3)) = (As + B)/(s^2 + 2s + 4) + C/(s + 3) (1)
Multiplying both sides of Equation (1) by (s^2 + 2s + 4)(s + 3) we obtain
s^2 + 4s + 5 = (As + B)(s + 3) + C(s^2 + 2s + 4) (2)
To find the constant C we let s = −3 in Equation (2) to eliminate A and B.
Thus
(−3)^2 + 4(−3) + 5 = C((−3)^2 + 2(−3) + 4)
so that
2 = 7C and so C = 2/7
Equating coefficients of s^2 in Equation (2) we find
1 = A + C
so that A = 1 − C = 1 − 2/7 = 5/7.
Equating constant terms in Equation (2) gives 5 = 3B + 4C
so that 3B = 5 − 4C = 5 − 4(2/7) = 27/7
so B = 9/7
Finally Y(s) = (s^2 + 4s + 5)/((s^2 + 2s + 4)(s + 3)) = ((5/7)s + 9/7)/(s^2 + 2s + 4) + (2/7)/(s + 3)
which can be written as Y(s) = (5s + 9)/(7(s^2 + 2s + 4)) + 2/(7(s + 3))
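A result of this kind is easily checked by evaluating both forms at a few values of s. The sketch below is a numerical spot check only; the function names `Y` and `Y_partial` are illustrative, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def Y(s):
    """Admittance in its original form."""
    return Fraction(s * s + 4 * s + 5, (s * s + 2 * s + 4) * (s + 3))

def Y_partial(s):
    """Admittance as the sum of the two partial fractions found above."""
    return (Fraction(5 * s + 9, 7 * (s * s + 2 * s + 4))
            + Fraction(2, 7 * (s + 3)))

# Spot-check a handful of integer values of s (avoiding s = -3, where
# the denominator is zero).
for s in range(0, 10):
    assert Y(s) == Y_partial(s)
print("partial fractions agree with Y(s) at s = 0, ..., 9")
```

A disagreement at any value of s would indicate an error in one of the constants A, B or C.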
Exercise
Express each of the following as the sum of its partial fractions.
(a) 3/((x^2 + x + 1)(x − 2)), (b) (27x^2 − 4x + 5)/((6x^2 + x + 2)(x − 3)), (c) (2x + 4)/(4x^2 + 12x + 9), (d) (6x^2 + 13x + 2)/((x^2 + 5x + 1)(x − 1))
Answers
(a) 3/(7(x − 2)) − 3(x + 3)/(7(x^2 + x + 1)) (b) (3x + 1)/(6x^2 + x + 2) + 4/(x − 3) (c) 1/(2x + 3) + 1/(2x + 3)^2
(d) (3x + 1)/(x^2 + 5x + 1) + 3/(x − 1).
5. Improper fractions
When calculating the partial fractions of improper fractions an extra polynomial is added to any
partial fractions that would normally arise. The added polynomial has degree n − d where n is the
degree of the numerator and d is the degree of the denominator. Recall that
a polynomial of degree 0 is a constant, A say,
a polynomial of degree 1 has the form Ax + B,
a polynomial of degree 2 has the form Ax^2 + Bx + C,
and so on.
If, for example, the improper fraction is such that the numerator has degree 5 and the denominator
has degree 3, then n − d = 2, and we need to add a polynomial of the form Ax^2 + Bx + C.
Key Point 17
If a fraction is improper an additional term is included taking the form of a polynomial of degree
n − d, where n is the degree of the numerator and d is the degree of the denominator.
Example 42
Express as partial fractions
(2x^2 − x − 2)/(x + 1)
Solution
The fraction is improper because n = 2, d = 1 and so d ≤ n. Here n − d = 1, so we need to include
as an extra term a polynomial of the form Bx + C, in addition to the usual partial fractions. The
linear term in the denominator gives rise to a partial fraction A/(x + 1). So altogether we have
(2x^2 − x − 2)/(x + 1) = A/(x + 1) + (Bx + C)
Multiplying both sides by x + 1 we find
2x^2 − x − 2 = A + (Bx + C)(x + 1) = Bx^2 + (C + B)x + (C + A)
Equating coefficients of x^2 gives B = 2.
Equating coefficients of x gives −1 = C + B and so C = −1 − B = −3.
Equating the constant terms gives −2 = C + A and so A = −2 − C = −2 − (−3) = 1.
Finally, we have
(2x^2 − x − 2)/(x + 1) = 1/(x + 1) + 2x − 3
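The extra polynomial can also be found by polynomial long division, which is a different technique from the equating of coefficients used in Example 42: the quotient is the added polynomial and the remainder, placed over the denominator, is a proper fraction. The sketch below assumes polynomials are given as coefficient lists, highest power first; the name `poly_divmod` is hypothetical.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) with degree(remainder) < degree(den).
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]              # next quotient coefficient
        quotient.append(factor)
        for i, d in enumerate(den):           # subtract factor * den
            num[i] -= factor * d
        num.pop(0)                            # leading term is now zero
    return quotient, num

q, r = poly_divmod([2, -1, -2], [1, 1])       # (2x^2 - x - 2)/(x + 1)
print(q, r)
```

For Example 42 the quotient has coefficients 2 and −3 and the remainder is 1, in agreement with 2x − 3 + 1/(x + 1). When the denominator has more than one factor, the remaining proper fraction still has to be split into partial fractions in the usual way.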
Exercise
Express each of the following improper fractions in terms of partial fractions.
(a) (x + 3)/(x + 2), (b) (3x − 7)/(x − 3), (c) (x^2 + 2x + 2)/(x + 1), (d) (2x^2 + 7x + 7)/(x + 2),
(e) (3x^5 + 4x^4 − 21x^3 − 40x^2 − 24x − 29)/((x + 2)^2 (x − 3)), (f) (4x^5 + 8x^4 + 23x^3 + 27x^2 + 25x + 9)/((x^2 + x + 1)(2x + 1))
Answers
(a) 1 + 1/(x + 2), (b) 3 + 2/(x − 3), (c) 1 + x + 1/(x + 1), (d) 2x + 3 + 1/(x + 2),
(e) 1/(x + 2)^2 + 1/(x + 2) + 1/(x − 3) + 3x^2 + x + 2, (f) 2x^2 + x + 7 + 1/(2x + 1) + 1/(x^2 + x + 1)
Contents 4
Trigonometry
4.1 Right-angled Triangles 2
4.2 Trigonometric Functions 19
4.3 Trigonometric Identities 36
4.4 Applications of Trigonometry to Triangles 53
4.5 Applications of Trigonometry to Waves 65
Learning outcomes
In this Workbook you will learn about the basic building blocks of trigonometry. You will
learn about the sine, cosine, tangent, cosecant, secant, cotangent functions and their
many important relationships. You will learn about their graphs and their periodic nature.
You will learn how to apply Pythagoras' theorem and the Sine and Cosine rules to find
lengths and angles of triangles.
Right-angled
Triangles 4.1
Introduction
Right-angled triangles (that is triangles where one of the angles is 90°) are the easiest topic for
introducing trigonometry. Since the sum of the three angles in a triangle is 180° it follows that in
a right-angled triangle there are no obtuse angles (i.e. angles greater than 90°). In this Section we
study many of the properties associated with right-angled triangles.

Prerequisites
Before starting this Section you should . . .
• have a basic knowledge of the geometry of triangles

Learning Outcomes
On completion you should be able to . . .
• define trigonometric functions both in right-angled triangles and more generally
• express angles in degrees
• calculate all the angles and sides in any right-angled triangle given certain information
2 HELM (2006):
Workbook 4: Trigonometry
®
1. Right-angled triangles
Look at Figure 1 which could, for example, be a profile of a hill with a constant gradient.
[Figure: two right-angled triangles AB1C1 and AB2C2 sharing the angle A, with right angles at C1 and C2]
Figure 1
The two right-angled triangles AB1C1 and AB2C2 are similar (because the three angles of triangle
AB1C1 are equal to the equivalent three angles of triangle AB2C2). From the basic properties of similar
triangles corresponding sides have the same ratio. Thus, for example,
B1C1/AB1 = B2C2/AB2 and AC1/AB1 = AC2/AB2 (1)
The values of the two ratios (1) will clearly depend on the angle A of inclination. These ratios are
called the sine and cosine of the angle A, these being abbreviated to sin A and cos A.
Key Point 1
sin A = BC/AB, cos A = AC/AB
[Figure: right-angled triangle ABC with the right angle at C]
Figure 2
AC is the side adjacent to angle A.
BC is the side opposite to angle A.
AB is the hypotenuse of the triangle (the longest side).
Task
Referring again to Figure 2 in Key Point 1, write down the ratios which give sin B
and cos B.
Your solution
Answer
sin B = AC/AB, cos B = BC/AB.
Note that sin B = cos A = cos(90° − B) and cos B = sin A = sin(90° − B)
A third result of importance from Figure 1 is
B1C1/AC1 = B2C2/AC2 (2)
This ratio is referred to as the tangent of the angle at A, written tan A.
Key Point 2
[Figure: right-angled triangle ABC with the right angle at C]
Figure 3
tan A = BC/AC = (length of opposite side)/(length of adjacent side)
For any right-angled triangle the values of sine, cosine and tangent are given in Key Point 3.
Key Point 3
[Figure: right-angled triangle ABC with the right angle at C]
Figure 4
We can write, therefore, for any right-angled triangle containing an angle θ (not the right-angle)
sin θ = (length of side opposite angle θ)/(length of hypotenuse) = Opp/Hyp
cos θ = (length of side adjacent to angle θ)/(length of hypotenuse) = Adj/Hyp
tan θ = (length of side opposite angle θ)/(length of side adjacent to angle θ) = Opp/Adj
These are sometimes memorised as SOH, CAH and TOA respectively.
These three ratios are called trigonometric ratios.
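The three ratios can be computed directly from the two shorter sides of a right-angled triangle. The following sketch is illustrative only; the function name `trig_ratios` is an assumption, and the hypotenuse is obtained from Pythagoras' theorem.

```python
import math

def trig_ratios(opp, adj):
    """Return (sin, cos, tan) of the angle whose opposite and adjacent sides are given."""
    hyp = math.hypot(opp, adj)       # Pythagoras: hyp^2 = opp^2 + adj^2
    return opp / hyp, adj / hyp, opp / adj

print(trig_ratios(3, 4))   # sides 3 and 4, hypotenuse 5
```

For sides 3 and 4 the hypotenuse is 5, so the ratios are 3/5, 4/5 and 3/4.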
Task
Write tan θ in terms of sin θ and cos θ.
Your solution
Answer
tan θ = Opp/Adj = (Opp/Adj)·(Hyp/Hyp) = (Opp/Hyp)·(Hyp/Adj) = (Opp/Hyp) ÷ (Adj/Hyp), i.e. tan θ = sin θ/cos θ
Key Point 4
Pythagoras’ Theorem
a^2 + b^2 = c^2
[Figure: right-angled triangle with hypotenuse c and shorter sides a and b]
Figure 5
Example 1
Use the isosceles triangle in Figure 6 to obtain the sine, cosine and tangent of 45°.
[Figure: right-angled isosceles triangle ABC with the right angle at C, angles of 45° at A and B, and AC = x]
Figure 6
Solution
By Pythagoras’ theorem (AB)^2 = x^2 + x^2 = 2x^2 so AB = x√2
Hence sin 45° = BC/AB = x/(x√2) = 1/√2, cos 45° = AC/AB = 1/√2, tan 45° = BC/AC = x/x = 1
Noise reduction by sound barriers
Introduction
Audible sound has much longer wavelengths than light. Consequently, sound travelling in the atmosphere
is able to bend around obstacles even when these obstacles cause sharp shadows for light.
This is the result of the wave phenomenon known as diffraction. It can be observed also with water
waves at the ends of breakwaters. The extent to which waves bend around obstacles depends upon
the wavelength and the source-receiver geometry. So the efficacy of purpose built noise barriers, such
as to be found alongside motorways in urban and suburban areas, depends on the frequencies in the
sound and the locations of the source and receiver (nearest noise-affected person or dwelling) relative
to the barrier. Specifically, the barrier performance depends on the difference in the lengths of the
hypothetical ray paths passing from source to receiver either directly or via the top of the barrier (see
Figure 7).
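[Figure: source S at height hs, a horizontal distance s from a barrier of height H whose top is T, and receiver R at height hr, a horizontal distance r beyond the barrier; U, V and W are auxiliary points used in the analysis]
Figure 7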
Problem in words
Find the difference in the path lengths from source to receiver either directly or via the top of the
barrier in terms of
(i) the source and receiver heights,
(ii) the horizontal distances from source and receiver to the barrier and
(iii) the height of the barrier.
Calculate the path length difference for a 1 m high source, 3 m from a 3 m high barrier when the
receiver is 30 m on the other side of the barrier and at a height of 1 m.
Find ST + T R − SR in terms of hs, hr, s, r and H.
Calculate this quantity for hs = 1, s = 3, H = 3, r = 30 and hr = 1.
Note the labels V, U, W on points that are useful for the analysis. Note that the length of RV =
hr − hs and that the horizontal separation between S and R is r + s. In the right-angled triangle
SRV, Pythagoras’ theorem gives
(SR)^2 = (r + s)^2 + (hr − hs)^2
So
SR = √((r + s)^2 + (hr − hs)^2) (3)
Note that the length of TU = H − hs and the length of TW = H − hr. In the right-angled triangle
STU,
(ST)^2 = s^2 + (H − hs)^2
In the right-angled triangle TWR,
(TR)^2 = r^2 + (H − hr)^2
So
ST + TR = √(s^2 + (H − hs)^2) + √(r^2 + (H − hr)^2) (4)
So using (3) and (4)
ST + TR − SR = √(s^2 + (H − hs)^2) + √(r^2 + (H − hr)^2) − √((r + s)^2 + (hr − hs)^2).
For hs = 1, s = 3, H = 3, r = 30 and hr = 1,
ST + TR − SR = √(3^2 + (3 − 1)^2) + √(30^2 + (3 − 1)^2) − √((30 + 3)^2 + (1 − 1)^2)
= √13 + √904 − 33
= 0.672
So the path length difference is 0.672 m.
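The path length difference is a natural quantity to compute for several geometries. The sketch below implements the three applications of Pythagoras' theorem used above; the function name `path_difference` is illustrative and all lengths are assumed to be in metres.

```python
import math

def path_difference(hs, s, H, r, hr):
    """Return ST + TR - SR for the barrier geometry of Figure 7 (all lengths in metres)."""
    ST = math.hypot(s, H - hs)            # source to top of barrier
    TR = math.hypot(r, H - hr)            # top of barrier to receiver
    SR = math.hypot(r + s, hr - hs)       # direct path from source to receiver
    return ST + TR - SR

print(round(path_difference(hs=1, s=3, H=3, r=30, hr=1), 3))
```

Calling the function with source and receiver at the same height as the barrier gives a path length difference of zero.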

Interpretation
Note that, for equal source and receiver heights, the further either receiver or source is from the
barrier, the smaller the path length difference. Moreover if source and receiver are at the same height
as the barrier, the path length difference is zero. In fact diffraction by the barrier still gives some
sound reduction for this case. The smaller the path length difference, the more accurately it has to
be calculated as part of predicting the barrier’s noise reduction.
Horizon distance
Problem in words
Looking from a height of 2 m above sea level, how far away is the horizon? State any assumptions
made.
Assume that the Earth is a sphere. Find the length D of the tangent to the Earth’s sphere from the
observation point O.
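[Figure: a circle of radius R representing the Earth, with the observation point O at height h above the surface and the tangent of length D drawn from O to the circle]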
Figure 8: The Earth’s sphere and the tangent from the observation point O
Using Pythagoras’ theorem in the triangle shown in Figure 8,
(R + h)^2 = D^2 + R^2
Hence
R^2 + 2Rh + h^2 = D^2 + R^2 → h(2R + h) = D^2 → D = √(h(2R + h))
If R = 6.373 × 10^6 m, then the variation of D with h is shown in Figure 9.
[Graph: horizon distance D (m), from 0 to 15000, plotted against height h (m), from 0 to 10]
Figure 9
At an observation height of 2 m, the formula predicts that the horizon is just over 5 km away. In
fact the variation of optical refractive index with height in the atmosphere means that the distance
to the horizon is approximately 9% greater than this.
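The horizon formula D = √(h(2R + h)) is simple to evaluate for a range of heights. The sketch below uses the value of R quoted in the text; the names `R_EARTH` and `horizon_distance` are illustrative, and no allowance is made for atmospheric refraction.

```python
import math

R_EARTH = 6.373e6   # Earth radius in metres, the value used in the text

def horizon_distance(h, R=R_EARTH):
    """Length D of the tangent from height h to a sphere of radius R: D = sqrt(h(2R + h))."""
    return math.sqrt(h * (2 * R + h))

for h in (2, 5, 10):
    print(h, round(horizon_distance(h)))
```

At h = 2 the result is a little over 5000 m, in agreement with the statement that the horizon is just over 5 km away.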
Task
Using the triangle ABC in Figure 10 which can be regarded as one half of the
equilateral triangle ABD, calculate sin, cos, tan for the angles 30◦ and 60◦ .
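[Figure: triangle ABC with the right angle at C, an angle of 60° at A and 30° at B, AB = x and AC = x/2; D lies on AC extended so that ABD is equilateral]
Figure 10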
Your solution
Answer
By Pythagoras’ theorem: (BC)^2 = (AB)^2 − (AC)^2 = x^2 − x^2/4 = 3x^2/4 so BC = (√3/2)x
Hence sin 60° = BC/AB = ((√3/2)x)/x = √3/2, sin 30° = AC/AB = (x/2)/x = 1/2, cos 60° = AC/AB = 1/2,
cos 30° = BC/AB = ((√3/2)x)/x = √3/2, tan 60° = (√3/2)/(1/2) = √3, tan 30° = (1/2)/(√3/2) = 1/√3
Values of sin θ, cos θ and tan θ can of course be obtained by calculator. When entering the angle
in degrees ( e.g. 30◦ ) the calculator must be in degree mode. (Typically this is ensured by pressing
the DRG button until ‘DEG’ is shown on the display). The keystrokes for sin 30◦ are usually simply
sin 30 or, on some calculators, 30 sin perhaps followed by = .
Task
(a) Use your calculator to check the values of sin 45°, cos 30° and tan 60° obtained
in the previous Task.
(b) Also obtain sin 3.2°, cos 86.8°, tan 28° 15′. (′ denotes a minute = 1°/60)
Your solution
(a)
(b)
Answer
(a) 0.7071, 0.8660, 1.7321 to 4 d.p.
(b) sin 3.2° = cos 86.8° = 0.0558 to 4 d.p., tan 28° 15′ = tan 28.25° = 0.5373 to 4 d.p.
Inverse trigonometric functions (a first look)

Consider, by way of example, a right-angled triangle with sides 3, 4 and 5, see Figure 11.
Figure 11: Right-angled triangle ABC with the right angle at C and sides BC = 3, CA = 4, AB = 5
Suppose we wish to find the angles at A and B. Clearly $\sin A = \frac{3}{5}$, $\cos A = \frac{4}{5}$, $\tan A = \frac{3}{4}$ so we
need to solve one of the above three equations to find A.
Using $\sin A = \frac{3}{5}$ we write $A = \sin^{-1}\left(\frac{3}{5}\right)$ (read as ‘A is the inverse sine of $\frac{3}{5}$’)
The value of A can be obtained by calculator using the ‘$\sin^{-1}$’ button (often a second function to
the sin function and accessed using a SHIFT or INV or SECOND FUNCTION key).
Thus to obtain $\sin^{-1}\left(\frac{3}{5}\right)$ we might use the following keystrokes:
INV SIN 0.6 = or 3 ÷ 5 INV SIN =
We find $\sin^{-1}\left(\frac{3}{5}\right) = 36.87^\circ$ (to 4 significant figures).
Key Point 5
Inverse Trigonometric Functions
$\sin\theta = x$ implies $\theta = \sin^{-1} x$
$\cos\theta = y$ implies $\theta = \cos^{-1} y$
$\tan\theta = z$ implies $\theta = \tan^{-1} z$
(The alternative notations arcsin, arccos, arctan are sometimes used for these inverse functions.)
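As a cross-check on the calculator keystrokes, the inverse functions can also be evaluated in software. This is a small Python sketch, offered as an illustration only: the standard `math` functions `asin`, `acos` and `atan` return angles in radians, so `math.degrees` is used to convert; the variable names are illustrative.

```python
import math

# Sides of the 3-4-5 triangle of Figure 11: opposite A = 3, adjacent to A = 4, hypotenuse = 5
angle_a_from_sin = math.degrees(math.asin(3 / 5))
angle_a_from_cos = math.degrees(math.acos(4 / 5))
angle_a_from_tan = math.degrees(math.atan(3 / 4))

# The two acute angles of a right-angled triangle sum to 90 degrees
angle_b = 90 - angle_a_from_sin

print(f"A = {angle_a_from_sin:.2f} degrees, B = {angle_b:.2f} degrees")
```

All three inverse functions should give the same value for A, since they describe the same triangle.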
Task
Check the values of the angles at A and B in Figure 11 above using the cos−1
functions on your calculator. Give your answers in degrees to 2 d.p.
Your solution
Answer
$A = \cos^{-1}\left(\frac{4}{5}\right) = 36.87^\circ$, $B = \cos^{-1}\left(\frac{3}{5}\right) = 53.13^\circ$
Task
Check the values of the angles at A and B in Figure 11 above using the tan−1
functions on your calculator. Give your answers in degrees to 2 d.p.
Your solution
Answer
$A = \tan^{-1}\left(\frac{3}{4}\right) = 36.87^\circ$, $B = \tan^{-1}\left(\frac{4}{3}\right) = 53.13^\circ$
You should note carefully that $\sin^{-1} x$ does not mean $\frac{1}{\sin x}$.
Indeed the function $\frac{1}{\sin x}$ has a special name – the cosecant of x, written cosec x. So
$\operatorname{cosec} x \equiv \frac{1}{\sin x}$ (the cosecant function).
Similarly
$\sec x \equiv \frac{1}{\cos x}$ (the secant function)
$\cot x \equiv \frac{1}{\tan x}$ (the cotangent function).
Task
Use your calculator to obtain to 3 d.p. cosec 38.5◦ , sec 22.6◦ , cot 88.32◦ (Use
the sin, cos or tan buttons unless your calculator has specific buttons.)
Your solution
Answer
$\operatorname{cosec} 38.5^\circ = \frac{1}{\sin 38.5^\circ} = 1.606$, $\sec 22.6^\circ = \frac{1}{\cos 22.6^\circ} = 1.083$
$\cot 88.32^\circ = \frac{1}{\tan 88.32^\circ} = 0.029$
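Most programming languages, like most calculators, have no built-in cosec, sec or cot. A minimal Python sketch of how they might be defined from sin, cos and tan is given below; the function names and the choice to take the angle in degrees are assumptions made for this illustration.

```python
import math

def cosec(angle_deg):
    """Cosecant of an angle given in degrees: 1 / sin."""
    return 1 / math.sin(math.radians(angle_deg))

def sec(angle_deg):
    """Secant of an angle given in degrees: 1 / cos."""
    return 1 / math.cos(math.radians(angle_deg))

def cot(angle_deg):
    """Cotangent of an angle given in degrees: 1 / tan."""
    return 1 / math.tan(math.radians(angle_deg))

for name, func, angle in [("cosec", cosec, 38.5), ("sec", sec, 22.6), ("cot", cot, 88.32)]:
    print(f"{name} {angle} degrees = {func(angle):.3f}")
```

Note that each function fails (division by zero or a very large result) at angles where the corresponding sin, cos or tan is zero.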
2. Solving right-angled triangles

Solving right-angled triangles means obtaining the values of all the angles and all the sides of a given
right-angled triangle using the trigonometric functions (and, if necessary, the inverse trigonometric
functions) and perhaps Pythagoras’ theorem.
There are three cases to be considered:
Case 1 Given the hypotenuse and an angle

We use sin or cos as appropriate:
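Figure 12: Right-angled triangle ABC with the right angle at C, hypotenuse AB = h, angle θ at A, adjacent side AC = x and opposite side BC = y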
Assuming h and θ in Figure 12 are given then
$\cos\theta = \frac{x}{h}$ which gives $x = h\cos\theta$
from which x can be calculated.
Also
$\sin\theta = \frac{y}{h}$ so $y = h\sin\theta$ which enables us to calculate y.
Clearly the third angle of this triangle (at B) is $90^\circ - \theta$.
Case 2 Given a side other than the hypotenuse and an angle.

We use tan:
(a) If x and θ are known then, in Figure 12, $\tan\theta = \frac{y}{x}$ so $y = x\tan\theta$
which enables us to calculate y.
(b) If y and θ are known then $\tan\theta = \frac{y}{x}$ gives $x = \frac{y}{\tan\theta}$ from which x can be calculated.
Then the hypotenuse can be calculated using Pythagoras’ theorem: $h = \sqrt{x^2 + y^2}$
Case 3 Given two of the sides

We use tan−1 or sin−1 or cos−1 :
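(a) Given x and y: $\tan\theta = \frac{y}{x}$ so $\theta = \tan^{-1}\left(\frac{y}{x}\right)$
Figure 13: Right-angled triangle with angle θ, adjacent side x and opposite side y
(b) Given h and y: $\sin\theta = \frac{y}{h}$ so $\theta = \sin^{-1}\left(\frac{y}{h}\right)$
Figure 14: Right-angled triangle with angle θ, hypotenuse h and opposite side y
(c) Given h and x: $\cos\theta = \frac{x}{h}$ so $\theta = \cos^{-1}\left(\frac{x}{h}\right)$
Figure 15: Right-angled triangle with angle θ, hypotenuse h and adjacent side x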
Note: since two sides are given we can use Pythagoras’ theorem to obtain the length of the third
side at the outset.
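The three cases can be sketched in code. The following Python functions are an illustration only (the names are hypothetical, not from the workbook); they use the notation of Figure 12, with x adjacent to θ, y opposite θ and h the hypotenuse, and take angles in degrees.

```python
import math

def from_hypotenuse_and_angle(h, theta_deg):
    """Case 1: given h and theta, return (x, y) using cos and sin."""
    t = math.radians(theta_deg)
    return h * math.cos(t), h * math.sin(t)

def from_adjacent_and_angle(x, theta_deg):
    """Case 2(a): given x and theta, return (y, h) using tan and Pythagoras."""
    y = x * math.tan(math.radians(theta_deg))
    return y, math.hypot(x, y)

def from_two_legs(x, y):
    """Case 3(a): given x and y, return (theta in degrees, h)."""
    return math.degrees(math.atan(y / x)), math.hypot(x, y)

theta, h = from_two_legs(5, 4)
print(f"theta = {theta:.2f} degrees, h = {h:.2f}")
```

In every case the remaining acute angle is 90° minus θ, so it does not need a separate function.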
Vintage car brake pedal mechanism
Introduction
Figure 16 shows the structure and some dimensions of a vintage car brake pedal arrangement as
far as the brake cable. The moment of a force about a point is the product of the force and the
perpendicular distance from the point to the line of action of the force. The pedal is pivoted about
the point A. The moments about A must be equal as the pedal is stationary.
Problem in words
If the driver supplies a force of 900 N, acting at point B, calculate the force (F) in the cable.

The perpendicular distance from the line of action of the force provided by the driver to the pivot
point A is denoted by x1 and the perpendicular distance from the line of action of force in the cable
to the pivot point A is denoted by x2 . Use trigonometry to relate x1 and x2 to the given dimensions.
Calculate clockwise and anticlockwise moments about the pivot and set them equal.
Figure 16: Structure and dimensions of vintage car brake pedal arrangement (driver force 900 N at B, cable force F, pivot A, lengths 210 mm and 75 mm, angles 40° and 15°, perpendicular distances x1 and x2)
Mathematical Analysis
The distance x1 is found by considering the right-angled triangle shown in Figure 17 and using the
definition of cosine.
$\cos(40^\circ) = \frac{x_1}{0.210}$ hence $x_1 = 161$ mm.
Figure 17: Right-angled triangle with hypotenuse 210 mm, angle 40° and adjacent side x1
The distance x2 is found by considering the right-angled triangle shown in Figure 18.
$\cos(15^\circ) = \frac{x_2}{0.075}$ hence $x_2 = 72$ mm.
Figure 18: Right-angled triangle with hypotenuse 75 mm, angle 15° and adjacent side x2
Equating moments about A:
900x1 = F x2 so F = 2013 N.
Interpretation
This means that the force exerted by the cable is 2013 N in the direction of the cable. This force is
more than twice that applied by the driver. In fact, whatever the force applied at the pedal the force
in the cable will be more than twice that force. The pedal structure is an example of a lever system
that offers a mechanical gain.
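The moment balance is easily reproduced in a few lines of Python. This sketch is illustrative only and the variable names are assumptions. It keeps x1 and x2 unrounded, which gives a cable force of roughly 2000 N, slightly below the 2013 N obtained above from x1 and x2 rounded to the nearest millimetre.

```python
import math

driver_force = 900.0                           # N, applied at B
x1 = 0.210 * math.cos(math.radians(40))        # m, perpendicular distance for the driver force
x2 = 0.075 * math.cos(math.radians(15))        # m, perpendicular distance for the cable force

# Equating moments about A: driver_force * x1 = cable_force * x2
cable_force = driver_force * x1 / x2
mechanical_gain = cable_force / driver_force   # independent of the applied force

print(f"x1 = {1000 * x1:.0f} mm, x2 = {1000 * x2:.0f} mm")
print(f"cable force = {cable_force:.0f} N, gain = {mechanical_gain:.2f}")
```

The gain x1/x2 depends only on the geometry, which is why the cable force is more than twice the pedal force whatever force is applied.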
Task
Obtain all the angles and the remaining side for the triangle shown:
(Figure: right-angled triangle ABC with the right angle at C, BC = 5, AC = 4 and hypotenuse AB = c)
Your solution
Answer
This is Case 3. To obtain the angle at B we use $\tan B = \frac{4}{5}$ so $B = \tan^{-1}(0.8) = 38.66^\circ$.
Then the angle at A is $180^\circ - (90^\circ + 38.66^\circ) = 51.34^\circ$.
By Pythagoras’ theorem $c = \sqrt{4^2 + 5^2} = \sqrt{41} \approx 6.40$.
Task
Obtain the remaining sides and angles for the triangle shown.
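(Figure: right-angled triangle ABC with the right angle at C, hypotenuse AB = 15, angle 31° 40′ at B, BC = a and AC = b)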
Your solution
Answer
This is Case 1. Since 31° 40′ = 31.67° then $\cos 31.67^\circ = \frac{a}{15}$ so $a = 15\cos 31.67^\circ = 12.77$.
The angle at A is $180^\circ - (90^\circ + 31.67^\circ) = 58.33^\circ$.
Finally $\sin 31.67^\circ = \frac{b}{15}$ ∴ $b = 15\sin 31.67^\circ = 7.88$.
(Alternatively, of course, Pythagoras’ theorem could be used to calculate the length b.)
Task
Obtain the remaining sides and angles of the following triangle.
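(Figure: right-angled triangle ABC with the right angle at C, AC = 8, angle 34° 20′ at B, BC = a and hypotenuse AB = c)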
Your solution
Answer
This is Case 2.
Here $\tan 34.33^\circ = \frac{8}{a}$ so $a = \frac{8}{\tan 34.33^\circ} = 11.7$
Also $c = \sqrt{8^2 + 11.7^2} = 14.18$ and the angle at A is $180^\circ - (90^\circ + 34.33^\circ) = 55.67^\circ$.
Exercises
1. Obtain cosec θ, sec θ, cot θ, θ in the following right-angled triangle.
A
C 15 θ B
2. Write down sin θ, cos θ, tan θ, cosec θ for each of the following triangles:
A A
(a) 2 5 (b) y
C θ B C x θ B
3. If θ is an acute angle such that sin θ = 2/7 obtain, without use of a calculator, cos θ and tan θ.
4. Use your calculator to obtain the acute angles θ satisfying
(a) sin θ = 0.5260, (b) tan θ = 2.4, (c) cos θ = 0.2
5. Solve the right-angled triangle shown:

A α
α = 57.5◦
b c
C 10 β B
6. A surveyor measures the angle of elevation between the top of a mountain and ground level at
two different points. The results are shown in the following figure. Use trigonometry to obtain
the distance z (which cannot be measured) and then obtain the height h of the mountain.
h
◦ ◦
37 41
0.5 km z
7. As shown below two tracking stations S1 and S2 sight a weather balloon (W B) between them
at elevation angles α and β respectively.
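(Figure: stations S1 and S2 a distance c apart, with the balloon WB at height h above the point P between them; elevation angle α at S1 and β at S2)
Show that the height h of the balloon is given by $h = \frac{c}{\cot\alpha + \cot\beta}$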
8. A vehicle entered in a ‘soap box derby’ rolls down a hill as shown in the figure. Find the total
distance (d1 + d2 ) that the soap box travels.
START
d1 200 metres
15◦ 28◦
FINISH
d2
Answers
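1. $h = \sqrt{15^2 + 8^2} = 17$, $\operatorname{cosec}\theta = \frac{1}{\sin\theta} = \frac{17}{8}$, $\sec\theta = \frac{1}{\cos\theta} = \frac{17}{15}$, $\cot\theta = \frac{1}{\tan\theta} = \frac{15}{8}$
$\theta = \sin^{-1}\left(\frac{8}{17}\right)$ (for example) ∴ $\theta = 28.07^\circ$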
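2. (a) $\sin\theta = \frac{2}{5}$, $\cos\theta = \frac{\sqrt{21}}{5}$, $\tan\theta = \frac{2\sqrt{21}}{21}$, $\operatorname{cosec}\theta = \frac{5}{2}$
(b) $\sin\theta = \frac{y}{\sqrt{x^2 + y^2}}$, $\cos\theta = \frac{x}{\sqrt{x^2 + y^2}}$, $\tan\theta = \frac{y}{x}$, $\operatorname{cosec}\theta = \frac{\sqrt{x^2 + y^2}}{y}$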
3. Referring to the following diagram

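(Diagram: right-angled triangle with hypotenuse 7, side 2 opposite the angle θ and remaining side ℓ)
$\ell = \sqrt{7^2 - 2^2} = \sqrt{45} = 3\sqrt{5}$
Hence $\cos\theta = \frac{3\sqrt{5}}{7}$, $\tan\theta = \frac{2}{3\sqrt{5}} = \frac{2\sqrt{5}}{15}$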
4. (a) θ = sin−1 0.5260 = 31.73◦ (b) θ = tan−1 2.4 = 67.38◦ (c) θ = cos−1 0.2 = 78.46◦
5. $\beta = 90^\circ - \alpha = 32.5^\circ$, $b = \frac{10}{\tan 57.5^\circ} \simeq 6.37$, $c = \frac{10}{\sin 57.5^\circ} \simeq 11.86$
6. $\tan 37^\circ = \frac{h}{z + 0.5}$, $\tan 41^\circ = \frac{h}{z}$ from which
$h = (z + 0.5)\tan 37^\circ = z\tan 41^\circ$, so $z\tan 37^\circ - z\tan 41^\circ = -0.5\tan 37^\circ$
∴ $z = \frac{-0.5\tan 37^\circ}{\tan 37^\circ - \tan 41^\circ} \simeq 3.2556$ km, so $h = z\tan 41^\circ = 3.2556\tan 41^\circ \simeq 2.83$ km
7. Since the required answer is in terms of cot α and cot β we proceed as follows:
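Using x to denote the distance $S_1P$: $\cot\alpha = \frac{1}{\tan\alpha} = \frac{x}{h}$, $\cot\beta = \frac{1}{\tan\beta} = \frac{c - x}{h}$
Adding: $\cot\alpha + \cot\beta = \frac{x}{h} + \frac{c - x}{h} = \frac{c}{h}$ ∴ $h = \frac{c}{\cot\alpha + \cot\beta}$ as required.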
8. From the smaller right-angled triangle $d_1 = \frac{200}{\sin 28^\circ} = 426.0$ m. The base of this triangle
then has length $\ell = 426\cos 28^\circ = 376.1$ m.
From the larger right-angled triangle the straight-line distance from START to FINISH is
$\frac{200}{\sin 15^\circ} = 772.7$ m. Then, using Pythagoras’ theorem $(d_2 + \ell) = \sqrt{772.7^2 - 200^2} = 746.4$ m
from which $d_2 = 370.3$ m ∴ $d_1 + d_2 = 796.3$ m
Trigonometric
Functions 4.2
Introduction
Our discussion so far has been limited to right-angled triangles where, apart from the right-angle
itself, all angles are necessarily less than 90◦ . We now extend the definitions of the trigonometric
functions to any size of angle, which greatly broadens the range of applications of trigonometry.


triangles

Learning Outcomes
On completion you should be able to . . .
• express angles in radians
• define trigonometric functions generally
• sketch the graphs of the three main trigonometric functions: sin, cos, tan
1. Trigonometric functions for any size angle
The radian
First we introduce an alternative to measuring angles in degrees. Look at the circle shown in Figure
19(a). It has radius r and we have shown an arc AB of length ℓ (measured in the same units as r).
As you can see the arc subtends an angle θ at the centre O of the circle.
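Figure 19: (a) an arc AB of length ℓ subtending an angle θ at the centre O of a circle of radius r; (b) an arc AB covering half the circle, subtending 180° at O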
The angle θ in radians is defined as
$\theta = \frac{\text{length of arc } AB}{\text{radius}} = \frac{\ell}{r}$
So, for example, if r = 10 cm, ℓ = 20 cm, the angle θ would be $\frac{20}{10} = 2$ radians.
The relation between the value of an angle in radians and its value in degrees is readily obtained
as follows. Referring to Figure 19(b) imagine that the arc AB extends to cover half the complete
perimeter of the circle. The arc length is now πr (half the circumference of the circle) so the angle
θ subtended by AB is now
$\theta = \frac{\pi r}{r} = \pi$ radians
But clearly this angle is 180°. Thus π radians is the same as 180°.
Note conversely that since π radians = 180° then 1 radian = $\frac{180}{\pi}$ degrees (about 57.3°).
Key Point 6
$180^\circ = \pi$ radians
$360^\circ = 2\pi$ radians, 1 radian = $\frac{180}{\pi}$ degrees (≈ 57.3°)
$1^\circ = \frac{\pi}{180}$ radians
$x^\circ = \frac{\pi x}{180}$ radians, y radians = $\frac{180y}{\pi}$ degrees
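The conversions in Key Point 6 translate directly into code. The two functions below are a minimal Python sketch with illustrative names; Python's standard library also provides `math.radians` and `math.degrees`, which perform the same conversions.

```python
import math

def degrees_to_radians(x):
    """x degrees = pi * x / 180 radians."""
    return math.pi * x / 180

def radians_to_degrees(y):
    """y radians = 180 * y / pi degrees."""
    return 180 * y / math.pi

for deg in (30, 45, 90, 135):
    # Express each angle as a multiple of pi for comparison with hand calculation
    print(f"{deg} degrees = {degrees_to_radians(deg) / math.pi:.4f} x pi radians")

print(f"1 radian = {radians_to_degrees(1):.1f} degrees")
```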
Task
Write down the values in radians of 30◦ , 45◦ , 90◦ , 135◦ . (Leave your answers as
multiples of π.)
Your solution
Answer
$30^\circ = \pi \times \frac{30}{180} = \frac{\pi}{6}$ radians, $45^\circ = \frac{\pi}{4}$ radians, $90^\circ = \frac{\pi}{2}$ radians, $135^\circ = \frac{3\pi}{4}$ radians
Task
Write in degrees the following angles given in radians
$\frac{\pi}{10}$, $\frac{\pi}{5}$, $\frac{7\pi}{10}$, $\frac{23\pi}{12}$
Your solution
Answer
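$\frac{\pi}{10}$ rad = $\frac{180}{\pi} \times \frac{\pi}{10} = 18^\circ$, $\frac{\pi}{5}$ rad = $\frac{180}{\pi} \times \frac{\pi}{5} = 36^\circ$, $\frac{7\pi}{10}$ rad = $\frac{180}{\pi} \times \frac{7\pi}{10} = 126^\circ$
$\frac{23\pi}{12}$ rad = $\frac{180}{\pi} \times \frac{23\pi}{12} = 345^\circ$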
Task
Put your calculator into radian mode (using the DRG button if necessary) for
this Task: Verify these facts by first converting the angles to radians:
$\sin 30^\circ = \frac{1}{2}$, $\cos 45^\circ = \frac{1}{\sqrt{2}}$, $\tan 60^\circ = \sqrt{3}$ (Use the π button to obtain π.)
Your solution
Answer
$\sin 30^\circ = \sin\frac{\pi}{6} = 0.5$, $\cos 45^\circ = \cos\frac{\pi}{4} = 0.7071 = \frac{1}{\sqrt{2}}$,
$\tan 60^\circ = \tan\frac{\pi}{3} = 1.7321 = \sqrt{3}$ (decimal values to 4 d.p.)
2. General definitions of trigonometric functions
We now define the trigonometric functions in a more general way than in terms of ratios of sides of
a right-angled triangle. To do this we consider a circle of unit radius whose centre is at the origin
of a Cartesian coordinate system and an arrow (or radius vector) OP from the centre to a point P
on the circumference of this circle. We are interested in the angle θ that the arrow makes with the
positive x-axis. See Figure 20.
Figure 20: Radius vector OP of a circle centred at the origin O, making an angle θ with the positive x-axis
Imagine that the vector OP rotates in anti-clockwise direction. With this sense of rotation the
angle θ is taken as positive whereas a clockwise rotation is taken as negative. See examples in
Figure 21.
Figure 21: Examples of the radius vector OP for $\theta = 90^\circ = \frac{\pi}{2}$ rad, $\theta = 315^\circ = \frac{7\pi}{4}$ rad and $\theta = -45^\circ = -\frac{\pi}{4}$ rad
The sine and cosine of an angle

For $0 \le \theta \le \frac{\pi}{2}$ (called the first quadrant) we have the following situation with our unit radius circle.
See Figure 22.
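Figure 22: Radius vector OP in the first quadrant, with Q its projection on the x-axis and R its projection on the y-axis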
The projection of OP along the positive x-axis is OQ. But, in the right-angled triangle OPQ
$\cos\theta = \frac{OQ}{OP}$ or $OQ = OP\cos\theta$
and since OP has unit length $\cos\theta = OQ$ (3)
Similarly in this right-angled triangle
$\sin\theta = \frac{PQ}{OP}$ or $PQ = OP\sin\theta$
but PQ = OR and OP has unit length
so $\sin\theta = OR$ (4)
Equations (3) and (4) tell us that we can interpret cos θ as the projection of OP along the positive x-axis
and sin θ as the projection of OP along the positive y-axis.
We shall use these interpretations as the definitions of sin θ and cos θ for any values of θ.
Key Point 7
For a radius vector OP of a circle of unit radius making an angle θ with the positive x−axis
cos θ = projection of OP along the positive x−axis
sin θ = projection of OP along the positive y−axis
Sine and cosine in the four quadrants
First quadrant (0 ≤ θ ≤ 90◦ )
θ = 0°: OQ = OP = 1 ∴ cos 0° = 1; OR = 0 ∴ sin 0° = 0
0 < θ < 90°: cos θ = OQ ∴ 0 < cos θ < 1; sin θ = OR ∴ 0 < sin θ < 1
θ = 90°: OQ = 0 ∴ cos 90° = 0; OR = OP = 1 ∴ sin 90° = 1
Figure 23: Position of OP for θ = 0°, 0 < θ < 90° and θ = 90°
It follows from Figure 23 that cos θ decreases from 1 to 0 as OP rotates from the horizontal position
to the vertical, i.e. as θ increases from 0◦ to 90◦ .
sin θ = OR increases from 0 (when θ = 0) to 1 (when θ = 90◦ ).
Second quadrant (90◦ ≤ θ ≤ 180◦ )
Referring to Figure 24, remember that it is the projections along the positive x and y axes that
are used to define cos θ and sin θ respectively. It follows that as θ increases from 90◦ to 180◦ , cos θ
decreases from 0 to −1 and sin θ decreases from 1 to 0.
θ = 90°: cos 90° = 0, sin 90° = 1
90° < θ < 180°: cos θ = OQ (negative), sin θ = OR (positive)
θ = 180°: cos θ = OQ = −1, sin θ = OR = 0
Figure 24: Position of OP for θ = 90°, 90° < θ < 180° and θ = 180°
Considering for example an angle of 135◦ , referring to Figure 25, by symmetry we have:
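$\sin 135^\circ = OR = \sin 45^\circ = \frac{1}{\sqrt{2}}$, $\cos 135^\circ = OQ_2 = -OQ_1 = -\cos 45^\circ = -\frac{1}{\sqrt{2}}$
Figure 25: Radius vectors $OP_1$ at 45° and $OP_2$ at 135°, with common projection R on the y-axis and projections $Q_1$, $Q_2$ on the x-axis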
Key Point 8
sin(180° − x) ≡ sin x and cos(180° − x) ≡ − cos x
Task
Without using a calculator write down the values of
sin 120◦ , sin 150◦ , cos 120◦ , cos 150◦ , tan 120◦ , tan 150◦ .
(Note that $\tan\theta \equiv \frac{\sin\theta}{\cos\theta}$ for any value of θ for which cos θ ≠ 0.)
Your solution
Answer
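$\sin 120^\circ = \sin(180^\circ - 60^\circ) = \sin 60^\circ = \frac{\sqrt{3}}{2}$
$\sin 150^\circ = \sin(180^\circ - 30^\circ) = \sin 30^\circ = \frac{1}{2}$
$\cos 120^\circ = -\cos 60^\circ = -\frac{1}{2}$
$\cos 150^\circ = -\cos 30^\circ = -\frac{\sqrt{3}}{2}$
$\tan 120^\circ = \frac{\sqrt{3}/2}{-1/2} = -\sqrt{3}$
$\tan 150^\circ = \frac{1/2}{-\sqrt{3}/2} = -\frac{1}{\sqrt{3}}$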
Third quadrant (180◦ ≤ θ ≤ 270◦ ).
θ = 180°: cos 180° = −1, sin 180° = 0
180° < θ < 270°: cos θ = OQ (negative), sin θ = OR (negative)
θ = 270°: cos θ = ?, sin θ = ?
Figure 26: Position of OP for θ = 180°, 180° < θ < 270° and θ = 270°
Task
Using the projection definition write down the values of cos 270◦ and sin 270◦ .
Your solution
Answer
cos 270◦ = 0 (OP has zero projection along the positive x−axis)
sin 270° = −1 (OP is directed along the negative y-axis)
Thus in the third quadrant, as θ increases from 180◦ to 270◦ so cos θ increases from −1 to 0 whereas
sin θ decreases from 0 to −1.
From the results of the last Task, with θ = 180◦ + x (see Figure 27) we obtain for all x the relations:
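$\sin\theta = \sin(180^\circ + x) = OR = -OR' = -\sin x$, $\cos\theta = \cos(180^\circ + x) = OQ = -OQ' = -\cos x$
Hence $\tan(180^\circ + x) = \frac{\sin(180^\circ + x)}{\cos(180^\circ + x)} = \frac{\sin x}{\cos x} = +\tan x$ for all x.
Figure 27: θ = 180° + x, showing OP in the third quadrant with projections Q and R, and the corresponding first-quadrant projections Q′ and R′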
Key Point 9
sin(180 + x) ≡ − sin x cos(180 + x) ≡ − cos x tan(180 + x) ≡ + tan x
Fourth quadrant (270◦ ≤ θ ≤ 360◦ )
θ = 270°: cos θ = 0, sin θ = −1
270° < θ < 360° (alternatively −90° < θ < 0°): cos θ = OQ > 0, sin θ = OR < 0
θ = 360°: results as for 0°
Figure 28: Position of OP for θ = 270°, 270° < θ < 360° and θ = 360°
From Figure 28 the results in Key Point 10 should be clear.
Key Point 10
cos(−x) ≡ cos x sin(−x) ≡ − sin x tan(−x) ≡ − tan x.
Task
Write down (without using a calculator) the values of
sin 300◦ , sin(−60◦ ), cos 330◦ , cos(−30◦ ).
Describe the behaviour of cos θ and sin θ as θ increases from 270◦ to 360◦ .
Your solution
Answer
$\sin 300^\circ = -\sin 60^\circ = -\sqrt{3}/2$, $\cos 330^\circ = \cos 30^\circ = \sqrt{3}/2$
$\sin(-60^\circ) = -\sin 60^\circ = -\sqrt{3}/2$, $\cos(-30^\circ) = \cos 30^\circ = \sqrt{3}/2$
cos θ increases from 0 to 1 and sin θ increases from −1 to 0 as θ increases from 270◦ to 360◦ .
Rotation beyond the fourth quadrant (360◦ < θ)
If the vector OP continues to rotate around the circle of unit radius then in the next complete
rotation θ increases from 360◦ to 720◦ . However, a θ value of, say, 405◦ is indistinguishable from
one of 45◦ (just one extra complete revolution is involved).
So $\sin(405^\circ) = \sin 45^\circ = \frac{1}{\sqrt{2}}$ and $\cos(405^\circ) = \cos 45^\circ = \frac{1}{\sqrt{2}}$
In general sin(360◦ + x◦ ) = sin x◦ , cos(360◦ + x◦ ) = cos x◦
Key Point 11
If n is any integer sin(x◦ + 360n◦ ) ≡ sin x◦ cos(x◦ + 360n◦ ) ≡ cos x◦
or, since 360° ≡ 2π radians, sin(x + 2nπ) ≡ sin x cos(x + 2nπ) ≡ cos x
We say that the functions sin x and cos x are periodic with period (in radian measure) of 2π.
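The symmetry and periodicity relations of Key Points 8 to 11 can be spot-checked numerically. The following Python sketch is an illustration only (the function name is hypothetical); it tests the relations for one angle x in degrees and one integer n, using a small tolerance because floating-point arithmetic is not exact.

```python
import math

def symmetry_relations_hold(x_deg, n, tol=1e-9):
    """Check Key Points 8, 9, 10 and 11 for an angle x (degrees) and an integer n."""
    s = lambda d: math.sin(math.radians(d))
    c = lambda d: math.cos(math.radians(d))
    checks = [
        math.isclose(s(180 - x_deg), s(x_deg), abs_tol=tol),       # Key Point 8
        math.isclose(c(180 - x_deg), -c(x_deg), abs_tol=tol),
        math.isclose(s(180 + x_deg), -s(x_deg), abs_tol=tol),      # Key Point 9
        math.isclose(c(180 + x_deg), -c(x_deg), abs_tol=tol),
        math.isclose(s(-x_deg), -s(x_deg), abs_tol=tol),           # Key Point 10
        math.isclose(c(-x_deg), c(x_deg), abs_tol=tol),
        math.isclose(s(x_deg + 360 * n), s(x_deg), abs_tol=tol),   # Key Point 11
        math.isclose(c(x_deg + 360 * n), c(x_deg), abs_tol=tol),
    ]
    return all(checks)

print(symmetry_relations_hold(37.0, 3))
```

A numerical check of this kind does not prove an identity, but a single failing angle would be enough to disprove one.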
3. Graphs of trigonometric functions

Graphs of sin θ and cos θ
Since we have defined both sin θ and cos θ in terms of the projections of the radius vector OP of a
circle of unit radius it follows immediately that
−1 ≤ sin θ ≤ +1 and − 1 ≤ cos θ ≤ +1 for any value of θ.
We have discussed the behaviour of sin θ and cos θ in each of the four quadrants in the previous
subsection.
Using all the above results we can draw the graphs of these two trigonometric functions. See Figure
29. We have labelled the horizontal axis using radians and have shown two periods in each case.
Figure 29: Graphs of sin θ and cos θ for −2π ≤ θ ≤ 2π (horizontal axis in radians, vertical axis from −1 to 1)
We have extended the graphs to negative values of θ using the relations sin(−θ) = − sin θ, cos(−θ) =
cos θ. Both graphs could be extended indefinitely to the left (θ → −∞) and right (θ → +∞).
Task
(a) Using the graphs in Figure 29 and the fact that tan θ ≡ sin θ/ cos θ calculate
the values of tan 0, tan π, tan 2π.
(b) For what values of θ is tan θ undefined?
(c) State whether tan θ is positive or negative in each of the four quadrants.
Your solution
(a)
(b)
(c)
Answer
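(a)
$\tan 0 = \frac{\sin 0}{\cos 0} = \frac{0}{1} = 0$
$\tan\pi = \frac{\sin\pi}{\cos\pi} = \frac{0}{-1} = 0$
$\tan 2\pi = \frac{\sin 2\pi}{\cos 2\pi} = \frac{0}{1} = 0$
(b)
tan θ is not defined when cos θ = 0, i.e. when $\theta = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}, \pm\frac{5\pi}{2}, \ldots$
(c)
1st quadrant: $\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{\text{+ve}}{\text{+ve}} = \text{+ve}$
2nd quadrant: $\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{\text{+ve}}{\text{−ve}} = \text{−ve}$
3rd quadrant: $\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{\text{−ve}}{\text{−ve}} = \text{+ve}$
4th quadrant: $\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{\text{−ve}}{\text{+ve}} = \text{−ve}$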
The graph of tan θ
The graph of tan θ against θ, for −2π ≤ θ ≤ 2π is then as in Figure 30. Note that whereas sin θ
and cos θ have period 2π, tan θ has period π.
Figure 30: Graph of tan θ for −2π ≤ θ ≤ 2π, with vertical asymptotes at $\theta = \pm\frac{\pi}{2}$ and $\theta = \pm\frac{3\pi}{2}$
Task
On the following diagram showing the four quadrants mark which trigonometric
quantities cos, sin, tan, are positive in the four quadrants. One entry has been
made already.
Your solution
cos
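Answer
First quadrant: all positive. Second quadrant: sin positive. Third quadrant: tan positive. Fourth quadrant: cos positive.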
Optical interference fringes due to a glass plate
Monochromatic light of intensity I0 propagates in air before impinging on a glass plate (see Figure
31). If a screen is placed beyond the plate then a pattern is observed including alternate light and
dark regions. These are interference fringes.
Figure 31: Geometry of a light ray transmitted and reflected through a glass plate (incident intensity $I_0$ in air, incidence angle α, angle ψ inside the plate)
The intensity I of the light wave transmitted through the plate is given by
$I = \frac{I_0 |t|^4}{1 + |r|^4 - 2|r|^2\cos\theta}$
where t and r are the complex transmission and reflection coefficients. The phase angle θ is the sum
of
(i) a phase proportional to the incidence angle α and
(ii) a fixed phase lag due to multiple reflections.
The problem is to establish the form of the intensity pattern (i.e. the minima and maxima characteristics
of interference fringes due to the plate), and deduce the shape and position θ of the fringes
captured by a screen beyond the plate.
Solution
The intensity of the optical wave outgoing from the glass plate is given by
$I = \frac{I_0 |t|^4}{1 + |r|^4 - 2|r|^2\cos\theta}$ (1)
The light intensity depends solely on the variable θ as shown in equation (1), and the objective is
to find the values θ that will minimize and maximize I. The angle θ is introduced in equation (1)
through the function cos θ in the denominator. We consider first the maxima of I.
Light intensity maxima
I is maximum when the denominator is minimum. Because of the minus sign in the denominator, this
condition is obtained when the term $2|r|^2\cos\theta$ is maximum. As stated in Section 4.2, the
maxima of $2|r|^2\cos\theta$ occur when cos θ = +1. Values of cos θ = +1 correspond to θ = 2nπ where
n = . . . − 2, −1, 0, 1, 2, . . . (see Section 4.5) and θ is measured in radians. Setting cos θ = +1 in
equation (1) gives the intensity maxima
$I_{\max} = \frac{I_0 |t|^4}{1 + |r|^4 - 2|r|^2}$.
Since the denominator can be identified as the square of $(1 - |r|^2)$, the final result for maximum
intensity can be written as
$I_{\max} = \frac{I_0 |t|^4}{(1 - |r|^2)^2}$. (2)
Light intensity minima
I is minimum when the denominator in (1) is maximum. As a result of the minus sign in the
denominator, this condition is obtained when the term $2|r|^2\cos\theta$ is minimum. The minima of
$2|r|^2\cos\theta$ occur when cos θ = −1. Values of cos θ = −1 correspond to θ = π(2n + 1) where
n = . . . − 2, −1, 0, 1, 2, . . . (see Section 4.5). Setting cos θ = −1 in equation (1) gives an expression
for the intensity minima
$I_{\min} = \frac{I_0 |t|^4}{1 + |r|^4 + 2|r|^2}$.
Since the denominator can be recognised as the square of $(1 + |r|^2)$, the final result for minimum
intensity can be written as
$I_{\min} = \frac{I_0 |t|^4}{(1 + |r|^2)^2}$ (3)
Interpretation
The interference fringes for intensity maxima or minima occur at constant angle θ and therefore
describe concentric rings of alternating light and shadow as sketched in the figure below. From the
centre to the periphery of the concentric ring system, the fringes occur in the following order
(a) a fringe of maximum light at the centre (bright dot for θ = 0),
(b) a circular fringe of minimum light at angle θ = π,
(c) a circular fringe of maximum light at 2π etc.
Figure 32: Sketch of interference fringes due to a glass plate (concentric rings labelled θ = π, θ = 2π, θ = 3π)
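The maxima and minima of the transmitted intensity can be confirmed numerically. In the Python sketch below, which is an illustration only, the function name and the values chosen for |r| and |t| are hypothetical; they are not taken from the problem. The code evaluates equation (1) at θ = 0 and θ = π and compares the results with the factorised forms of the maximum and minimum intensity.

```python
import math

def transmitted_intensity(theta, r_abs, t_abs, i0=1.0):
    """Equation (1): I = I0 |t|^4 / (1 + |r|^4 - 2 |r|^2 cos(theta))."""
    return i0 * t_abs**4 / (1 + r_abs**4 - 2 * r_abs**2 * math.cos(theta))

# Hypothetical coefficient magnitudes, chosen only for illustration
r_abs, t_abs = 0.5, 0.8

i_max = transmitted_intensity(0.0, r_abs, t_abs)       # cos(theta) = +1
i_min = transmitted_intensity(math.pi, r_abs, t_abs)   # cos(theta) = -1

print(f"I_max = {i_max:.4f}, closed form {t_abs**4 / (1 - r_abs**2)**2:.4f}")
print(f"I_min = {i_min:.4f}, closed form {t_abs**4 / (1 + r_abs**2)**2:.4f}")
```

Evaluating `transmitted_intensity` at intermediate angles shows the intensity falling smoothly from the bright centre to the first dark ring at θ = π.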
Exercises
1. Express the following angles in radians (as multiples of π)
(a) 120◦ (b) 20◦ (c) 135◦ (d) 300◦ (e) −90◦ (f) 720◦
2. Express in degrees the following quantities which are in radians

(a) $\frac{\pi}{2}$ (b) $\frac{3\pi}{2}$ (c) $\frac{5\pi}{6}$ (d) $\frac{11\pi}{9}$ (e) $-\frac{\pi}{8}$ (f) $\frac{1}{\pi}$
3. Obtain the precise values of all 6 trigonometric functions of the angle θ for the situation shown
in the figure:
P (−3, 1)
θ
4. Obtain all the values of x between 0 and 2π such that

(a) $\sin x = \frac{1}{\sqrt{2}}$ (b) $\cos x = \frac{1}{2}$ (c) $\sin x = -\frac{\sqrt{3}}{2}$ (d) $\cos x = -\frac{1}{\sqrt{2}}$ (e) $\tan x = 2$
(f) $\tan x = -\frac{1}{2}$ (g) $\cos(2x + 60^\circ) = 2$ (h) $\cos(2x + 60^\circ) = \frac{1}{2}$
5. Obtain all the values of θ in the given domain satisfying the following quadratic equations
(a) $2\sin^2\theta - \sin\theta = 0$, 0 ≤ θ ≤ 360°
(b) $2\cos^2\theta + 7\cos\theta + 3 = 0$, 0 ≤ θ ≤ 360°
(c) $4\sin^2\theta - 1 = 0$
6. (a) Show that the area A of a sector formed by a central angle θ radians in a circle of radius
r is given by
$A = \frac{1}{2}r^2\theta$.
(Hint: By proportionality the ratio of the area of the sector to the total area of the circle
equals the ratio of θ to the total angle at the centre of the circle.)
(b) What is the value of the shaded area shown in the figure if θ is measured (i) in radians,
(ii) in degrees?
r
θ
R
7. Sketch, over 0 < θ < 2π, the graph of (a) $\sin 2\theta$ (b) $\sin\frac{1}{2}\theta$ (c) $\cos 2\theta$ (d) $\cos\frac{1}{2}\theta$.
Mark the horizontal axis in radians in each case. Write down the period of $\sin 2\theta$ and the
period of $\cos\frac{1}{2}\theta$.
Answers
1. (a) $\frac{2\pi}{3}$ (b) $\frac{\pi}{9}$ (c) $\frac{3\pi}{4}$ (d) $\frac{5\pi}{3}$ (e) $-\frac{\pi}{2}$ (f) 4π
2. (a) 90° (b) 270° (c) 150° (d) 220° (e) −22.5° (f) $\frac{180^\circ}{\pi^2}$
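3. The distance of the point P from the origin is $r = \sqrt{(-3)^2 + 1^2} = \sqrt{10}$. Then, since P lies
on a circle radius $\sqrt{10}$ rather than a circle of unit radius:
$\sin\theta = \frac{1}{\sqrt{10}}$, $\operatorname{cosec}\theta = \sqrt{10}$
$\cos\theta = -\frac{3}{\sqrt{10}}$, $\sec\theta = -\frac{\sqrt{10}}{3}$
$\tan\theta = \frac{1}{-3} = -\frac{1}{3}$, $\cot\theta = -3$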
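4. (a) x = 45° ($\frac{\pi}{4}$ radians), x = 135° ($\frac{3\pi}{4}$) (recall sin(180° − x) = sin x)
(b) x = 60° ($\frac{\pi}{3}$), x = 300° ($\frac{5\pi}{3}$)
(c) x = 240° ($\frac{4\pi}{3}$), x = 300° ($\frac{5\pi}{3}$)
(d) x = 135° ($\frac{3\pi}{4}$), x = 225° ($\frac{5\pi}{4}$)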
(e) x = 63.43◦ x = 243.43◦ (remember tan x has period 180◦ or π radians)
(f) x = 153.43◦ x = 333.43◦
(g) No solution !
(h) x = 0◦ , 120◦ , 180◦ , 300◦ , 360◦
5. (a) $2\sin^2\theta - \sin\theta = 0$ so $\sin\theta(2\sin\theta - 1) = 0$ so sin θ = 0
giving θ = 0°, 180°, 360° or $\sin\theta = \frac{1}{2}$ giving θ = 30°, 150°
(b) $2\cos^2\theta + 7\cos\theta + 3 = 0$. With x = cos θ we have $2x^2 + 7x + 3 = 0$, i.e. $(2x + 1)(x + 3) = 0$
(factorising) so 2x = −1 or $x = -\frac{1}{2}$. The solution x = −3 is impossible since x = cos θ.
The equation $x = \cos\theta = -\frac{1}{2}$ has solutions θ = 120°, 240°
(c) $4\sin^2\theta = 1$ so $\sin^2\theta = \frac{1}{4}$ i.e. $\sin\theta = \pm\frac{1}{2}$ giving θ = 30°, 150°, 210°, 330°
Answers continued
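6. (a) Using the hint,
$\frac{\theta}{2\pi} = \frac{A}{\pi r^2}$
from where we obtain $A = \frac{\pi r^2\theta}{2\pi} = \frac{r^2\theta}{2}$
(b) With θ in radians the shaded area is
$S = \frac{R^2\theta}{2} - \frac{r^2\theta}{2} = \frac{\theta}{2}(R^2 - r^2)$
If θ is in degrees, then since x radians = $\frac{180x}{\pi}$ degrees, or x° = $\frac{\pi x}{180}$ radians, we have
$S = \frac{\pi\theta^\circ}{360^\circ}(R^2 - r^2)$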
7. The graphs of sin 2θ and cos 2θ are identical in form with those of sin θ and cos θ respectively
but oscillate twice as rapidly.
The graphs of $\sin\frac{1}{2}\theta$ and $\cos\frac{1}{2}\theta$ oscillate half as rapidly as those of sin θ and cos θ.
(Figure: graphs of $\sin 2\theta$, $\cos 2\theta$, $\sin\frac{1}{2}\theta$ and $\cos\frac{1}{2}\theta$ for 0 ≤ θ ≤ 2π, vertical axis from −1 to 1)
From the graphs sin 2θ has period π and $\cos\frac{1}{2}\theta$ has period 4π. In general sin nθ has period
2π/n.
Trigonometric
Identities 4.3

Introduction
A trigonometric identity is a relation between trigonometric expressions which is true for all values
of the variables (usually angles). There are a very large number of such identities. In this Section we
discuss only the most important and widely used. Any engineer using trigonometry in an application
is likely to encounter some of these identities.


triangles

Learning Outcomes
On completion you should be able to . . .
• use the main trigonometric identities
• use trigonometric identities to combine trigonometric functions

1. Trigonometric identities
An identity is a relation which is always true. To emphasise this the symbol ‘≡’ is often used rather
than ‘=’. For example, $(x + 1)^2 \equiv x^2 + 2x + 1$ (always true) but $(x + 1)^2 = 0$ (only true for x = −1).
Task
(a) Using the exact values, evaluate $\sin^2\theta + \cos^2\theta$ for (i) θ = 30° (ii) θ = 45°
[Note that $\sin^2\theta$ means $(\sin\theta)^2$, $\cos^2\theta$ means $(\cos\theta)^2$]
(b) Choose a non-integer value for θ and use a calculator to evaluate $\sin^2\theta + \cos^2\theta$.
Your solution
Answer
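(a) (i) $\sin^2 30^\circ + \cos^2 30^\circ = \left(\frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2 = \frac{1}{4} + \frac{3}{4} = 1$
(ii) $\sin^2 45^\circ + \cos^2 45^\circ = \left(\frac{1}{\sqrt{2}}\right)^2 + \left(\frac{1}{\sqrt{2}}\right)^2 = \frac{1}{2} + \frac{1}{2} = 1$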
(b) The answer should be 1 whatever value you choose.
Key Point 12
For any value of θ
$\sin^2\theta + \cos^2\theta \equiv 1$ (5)
One way of proving the result in Key Point 12 is to use the definitions of sin θ and cos θ obtained
from the circle of unit radius. Refer back to Figure 22 on page 23.
Recall that cos θ = OQ, sin θ = OR = QP . By Pythagoras’ theorem
$(OQ)^2 + (QP)^2 = (OP)^2 = 1$
hence $\cos^2\theta + \sin^2\theta = 1$.
We have demonstrated the result (5) using an angle θ in the first quadrant but the result is true for
any θ i.e. it is indeed an identity.
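Part (b) of the Task above can be repeated for several angles at once. The short Python sketch below is illustrative only; the function name and the list of test angles (chosen to fall in different quadrants and beyond one revolution) are assumptions made for the example.

```python
import math

def pythagorean_sum(theta_deg):
    """Return sin^2(theta) + cos^2(theta) for an angle given in degrees."""
    t = math.radians(theta_deg)
    return math.sin(t) ** 2 + math.cos(t) ** 2

# Arbitrary non-integer angles in all four quadrants, including a negative angle
angles = [-250.3, 17.5, 123.4, 200.9, 331.7, 1000.1]
for a in angles:
    print(f"theta = {a:8.1f} degrees: sin^2 + cos^2 = {pythagorean_sum(a):.12f}")
```

Any departure from 1 in the printed values is due only to floating-point rounding.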
Task
By dividing the identity $\sin^2\theta + \cos^2\theta \equiv 1$ by (a) $\sin^2\theta$ (b) $\cos^2\theta$ obtain two
further identities.
[Hint: Recall the definitions of cosec θ, sec θ, cot θ.]
Your solution
Answer
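(a) $\frac{\sin^2\theta}{\sin^2\theta} + \frac{\cos^2\theta}{\sin^2\theta} = \frac{1}{\sin^2\theta}$, giving $1 + \cot^2\theta \equiv \operatorname{cosec}^2\theta$
(b) $\frac{\sin^2\theta}{\cos^2\theta} + \frac{\cos^2\theta}{\cos^2\theta} = \frac{1}{\cos^2\theta}$, giving $\tan^2\theta + 1 \equiv \sec^2\theta$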
Key Point 13 introduces two further important identities.
Key Point 13
sin(A + B) ≡ sin A cos B + cos A sin B (6)
cos(A + B) ≡ cos A cos B − sin A sin B (7)
Note carefully the addition sign in (6) but the subtraction sign in (7).
Further identities can readily be obtained from (6) and (7).
Dividing (6) by (7) we obtain
$\tan(A + B) \equiv \frac{\sin(A + B)}{\cos(A + B)} \equiv \frac{\sin A\cos B + \cos A\sin B}{\cos A\cos B - \sin A\sin B}$
Dividing every term by cos A cos B we obtain
$\tan(A + B) \equiv \frac{\tan A + \tan B}{1 - \tan A\tan B}$
Replacing B by −B in (6) and (7) and remembering that cos(−B) ≡ cos B, sin(−B) ≡ − sin B
we find
sin(A − B) ≡ sin A cos B − cos A sin B
cos(A − B) ≡ cos A cos B + sin A sin B
Task
Using the identities sin(A − B) ≡ sin A cos B − cos A sin B and
cos(A − B) ≡ cos A cos B + sin A sin B obtain an expansion for tan(A − B):
Your solution
Answer
$\tan(A - B) \equiv \frac{\sin A\cos B - \cos A\sin B}{\cos A\cos B + \sin A\sin B}$.
Dividing every term by cos A cos B gives
$\tan(A - B) \equiv \frac{\tan A - \tan B}{1 + \tan A\tan B}$
The following identities are derived from those in Key Point 13.
Key Point 14
$\tan(A + B) \equiv \frac{\tan A + \tan B}{1 - \tan A\tan B}$ (8)
$\sin(A - B) \equiv \sin A\cos B - \cos A\sin B$ (9)
$\cos(A - B) \equiv \cos A\cos B + \sin A\sin B$ (10)
$\tan(A - B) \equiv \frac{\tan A - \tan B}{1 + \tan A\tan B}$ (11)
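The addition formulae of Key Points 13 and 14 can be spot-checked numerically. In the Python sketch below (illustrative only, with hypothetical function names and arbitrarily chosen test angles in radians) each function evaluates the right-hand side of an identity, and the result is compared with the left-hand side computed directly.

```python
import math

def sin_of_sum(a, b):
    """Right-hand side of (6): sin A cos B + cos A sin B."""
    return math.sin(a) * math.cos(b) + math.cos(a) * math.sin(b)

def cos_of_sum(a, b):
    """Right-hand side of (7): cos A cos B - sin A sin B."""
    return math.cos(a) * math.cos(b) - math.sin(a) * math.sin(b)

def tan_of_sum(a, b):
    """Right-hand side of (8): (tan A + tan B) / (1 - tan A tan B)."""
    return (math.tan(a) + math.tan(b)) / (1 - math.tan(a) * math.tan(b))

A, B = 0.7, 0.4  # arbitrary test angles in radians

print(math.isclose(sin_of_sum(A, B), math.sin(A + B)))
print(math.isclose(cos_of_sum(A, B), math.cos(A + B)))
print(math.isclose(tan_of_sum(A, B), math.tan(A + B)))
# Replacing B by -B gives the difference formulae (9), (10) and (11)
print(math.isclose(sin_of_sum(A, -B), math.sin(A - B)))
```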
Amplitude modulation
Introduction
Amplitude Modulation (the AM in AM radio) is a method of sending electromagnetic signals of a
certain frequency (signal frequency) at another frequency (carrier frequency) which may be better
for transmission. Modulation can be represented by the multiplication of the carrier and modulating
signals. To demodulate the signal the carrier frequency must be removed from the modulated
signal.
Problem in words
(a) A single frequency of 200 Hz (message signal) is amplitude modulated with a carrier frequency
of 2 MHz. Show that the modulated signal can be represented by the sum of two frequencies at
$2 \times 10^6 \pm 200$ Hz.
(b) Show that the modulated signal can be demodulated by using a locally generated carrier and
applying a low-pass filter.
(a) Express the message signal as $m = a\cos(\omega_m t)$ and the carrier as $c = b\cos(\omega_c t)$.
Assume that the modulation gives the product $mc = ab\cos(\omega_c t)\cos(\omega_m t)$.
Use trigonometric identities to show that
$mc = ab\cos(\omega_c t)\cos(\omega_m t) = k_1\cos((\omega_c - \omega_m)t) + k_2\cos((\omega_c + \omega_m)t)$
where $k_1$ and $k_2$ are constants.
Then substitute $\omega_c = 2 \times 10^6 \times 2\pi$ and $\omega_m = 200 \times 2\pi$ to calculate the two resulting frequencies.
(b) Use trigonometric identities to show that multiplying the modulated signal by $b\cos(\omega_c t)$ results
in the lowest frequency component of the output having a frequency equal to the original message
signal.
(a) The message signal has a frequency of $f_m = 200$ Hz so $\omega_m = 2\pi f_m = 2\pi \times 200 = 400\pi$ radians
per second.
The carrier signal has a frequency of $f_c = 2 \times 10^6$ Hz.
Hence $\omega_c = 2\pi f_c = 2\pi \times 2 \times 10^6 = 4 \times 10^6\pi$ radians per second.
So $mc = ab\cos(4 \times 10^6\pi t)\cos(400\pi t)$.
Adding identities (7) and (10) gives the identity:
$\cos(A + B) + \cos(A - B) \equiv 2\cos(A)\cos(B)$
Rearranging gives the identity:
$\cos(A)\cos(B) \equiv \frac{1}{2}(\cos(A + B) + \cos(A - B))$ (1)
Using (1) with $A = 4 \times 10^6\pi t$ and $B = 400\pi t$ gives
$mc = ab\cos(4 \times 10^6\pi t)\cos(400\pi t)$
$= \frac{1}{2}ab(\cos(4 \times 10^6\pi t + 400\pi t) + \cos(4 \times 10^6\pi t - 400\pi t))$
$= \frac{1}{2}ab(\cos(4000400\pi t) + \cos(3999600\pi t))$
So the modulated signal is the sum of two waves with angular frequencies of 4000400π and 3999600π
radians per second corresponding to frequencies of 4000400π/(2π) and 3999600π/(2π), that is
2000200 Hz and 1999800 Hz i.e. $2 \times 10^6 \pm 200$ Hz.
(b) Taking identity (1) and multiplying through by cos(A) gives
$\cos(A)\cos(A)\cos(B) \equiv \frac{1}{2}\cos(A)(\cos(A + B) + \cos(A - B))$
so
$\cos(A)\cos(A)\cos(B) \equiv \frac{1}{2}(\cos(A)\cos(A + B) + \cos(A)\cos(A - B))$ (2)
Identity (1) can be applied to both expressions in the right-hand side of (2). In the first expression,
using A + B instead of ‘B’, gives
$\cos(A)\cos(A + B) \equiv \frac{1}{2}(\cos(A + A + B) + \cos(A - A - B)) \equiv \frac{1}{2}(\cos(2A + B) + \cos(B))$
where we have used cos(−B) ≡ cos(B).
Similarly, in the second expression, using A − B instead of ‘B’, gives
$\cos(A)\cos(A - B) \equiv \frac{1}{2}(\cos(2A - B) + \cos(B))$
Substituting these into (2), each with its factor of $\frac{1}{2}$, gives:
$\cos(A)\cos(A)\cos(B) \equiv \frac{1}{4}(\cos(2A + B) + \cos(B) + \cos(2A - B) + \cos(B))$
$\equiv \frac{1}{2}\cos(B) + \frac{1}{4}(\cos(2A + B) + \cos(2A - B))$
With $A = 4 \times 10^6\pi t$ and $B = 400\pi t$ and substituting for the given frequencies, the modulated signal
multiplied by the original carrier signal gives
$ab^2\cos(4 \times 10^6\pi t)\cos(4 \times 10^6\pi t)\cos(400\pi t) =$
$\frac{1}{2}ab^2\cos(2\pi \times 200t) + \frac{1}{4}ab^2(\cos(2 \times 4 \times 10^6\pi t + 400\pi t) + \cos(2 \times 4 \times 10^6\pi t - 400\pi t))$
The last two terms have frequencies of $4 \times 10^6 \pm 200$ Hz which are sufficiently high that a low-pass
filter would remove them and leave only the term
$\frac{1}{2}ab^2\cos(2\pi \times 200t)$
which is the original message signal multiplied by a constant term.
HELM (2006): 41
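The identities used in parts (a) and (b) can be checked numerically. The following Python sketch is illustrative only: the function names, the unit amplitudes a = b = 1 and the sample instants are assumptions and are not part of the original example.

```python
import math

FM, FC = 200.0, 2.0e6          # message and carrier frequencies in Hz (from the example)
A_MSG, B_CAR = 1.0, 1.0        # amplitudes a and b: assumed values for illustration

def message(t):
    return A_MSG * math.cos(2 * math.pi * FM * t)

def carrier(t):
    return B_CAR * math.cos(2 * math.pi * FC * t)

def sidebands(t):
    """mc written as the sum of two cosines at FC - FM and FC + FM."""
    k = 0.5 * A_MSG * B_CAR
    return k * (math.cos(2 * math.pi * (FC - FM) * t)
                + math.cos(2 * math.pi * (FC + FM) * t))

def demodulated(t):
    """mc * carrier written as a low-frequency term plus two high-frequency terms."""
    low = 0.5 * A_MSG * B_CAR**2 * math.cos(2 * math.pi * FM * t)
    high = 0.25 * A_MSG * B_CAR**2 * (
        math.cos(2 * math.pi * (2 * FC + FM) * t)
        + math.cos(2 * math.pi * (2 * FC - FM) * t)
    )
    return low + high

times = [k * 1.3e-7 for k in range(200)]   # arbitrary sample instants in seconds
err_a = max(abs(message(t) * carrier(t) - sidebands(t)) for t in times)
err_b = max(abs(message(t) * carrier(t) * carrier(t) - demodulated(t)) for t in times)
print(err_a, err_b)   # both should be at rounding-error level
```

Both differences stay at rounding-error level, which is consistent with the modulated signal containing only the frequencies $f_c \pm f_m$ and with the message frequency being the lowest component after multiplication by the carrier.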
Interpretation
Amplitude modulation of a single frequency message signal (fm ) with a single frequency carrier signal
(fc ) can be shown to be equal to the sum of two cosines with frequencies fc ± fm . Multiplying the
modulated signal by a locally generated carrier signal and applying a low-pass filter can reproduce
the frequency, fm , of the message signal.
This is known as double side band amplitude modulation.
Example 2
Obtain expressions for cos θ in terms of the sine function and for sin θ in terms of
the cosine function.
Solution
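Using (9) with $A = \theta$, $B = \dfrac{\pi}{2}$ we obtain
\[ \cos\left(\theta - \frac{\pi}{2}\right) \equiv \cos\theta\cos\frac{\pi}{2} + \sin\theta\sin\frac{\pi}{2} \equiv \cos\theta\,(0) + \sin\theta\,(1) \]
i.e.
\[ \sin\theta \equiv \cos\left(\theta - \frac{\pi}{2}\right) \equiv \cos\left(\frac{\pi}{2} - \theta\right) \]
This result explains why the graph of $\sin\theta$ has exactly the same shape as the graph of $\cos\theta$ but is
shifted to the right by $\dfrac{\pi}{2}$. (See Figure 29 on page 28). A similar calculation using (6) yields the
result
\[ \cos\theta \equiv \sin\left(\theta + \frac{\pi}{2}\right). \]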
Double angle formulae

If we put B = A in the identity given in (6) we obtain Key Point 15:
Key Point 15
sin 2A ≡ sin A cos A + cos A sin A so sin 2A ≡ 2 sin A cos A (12)
Task
Substitute B = A in identity (7) in Key Point 13 on page 38 to obtain an identity
for $\cos 2A$. Using $\sin^2 A + \cos^2 A \equiv 1$ obtain two alternative forms of the identity.
Your solution
Answer
Using (7) with $B = A$
\[ \cos(2A) \equiv (\cos A)(\cos A) - (\sin A)(\sin A) \]
\[ \therefore\ \cos(2A) \equiv \cos^2 A - \sin^2 A \qquad (13) \]
Substituting for $\sin^2 A$ in (13) we obtain
\[ \cos 2A \equiv \cos^2 A - (1 - \cos^2 A) \equiv 2\cos^2 A - 1 \qquad (14) \]
Alternatively substituting for $\cos^2 A$ in (13)
\[ \cos 2A \equiv (1 - \sin^2 A) - \sin^2 A \]
\[ \cos 2A \equiv 1 - 2\sin^2 A \qquad (15) \]
Task
Use (14) and (15) to obtain, respectively, $\cos^2 A$ and $\sin^2 A$ in terms of $\cos 2A$.
Your solution
Answer
From (14) $\cos^2 A \equiv \tfrac{1}{2}(1 + \cos 2A)$. From (15) $\sin^2 A \equiv \tfrac{1}{2}(1 - \cos 2A)$.
Task
Use (12) and (13) to obtain an identity for tan 2A in terms of tan A.
Your solution
Answer
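\[ \tan 2A \equiv \frac{\sin 2A}{\cos 2A} \equiv \frac{2\sin A\cos A}{\cos^2 A - \sin^2 A} \]
Dividing numerator and denominator by $\cos^2 A$ we obtain
\[ \tan 2A \equiv \frac{2\,\dfrac{\sin A}{\cos A}}{1 - \dfrac{\sin^2 A}{\cos^2 A}} \equiv \frac{2\tan A}{1 - \tan^2 A} \qquad (16) \]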
Half-angle formulae
If we replace $A$ by $\dfrac{A}{2}$ and, consequently, $2A$ by $A$, in (12) we obtain
\[ \sin A \equiv 2\sin\frac{A}{2}\cos\frac{A}{2} \qquad (17) \]
Similarly from (14)
\[ \cos A \equiv 2\cos^2\frac{A}{2} - 1. \qquad (18) \]
These are examples of half-angle formulae. We can obtain a half-angle formula for $\tan A$ using
(16). Replacing $A$ by $\dfrac{A}{2}$ and $2A$ by $A$ in (16) we obtain
\[ \tan A \equiv \frac{2\tan\dfrac{A}{2}}{1 - \tan^2\dfrac{A}{2}} \qquad (19) \]
Other formulae, useful for integration when trigonometric functions are present, can be obtained
using (17), (18) and (19) shown in the Key Point 16.
Key Point 16

If $t = \tan\dfrac{A}{2}$ then
\[ \sin A = \frac{2t}{1 + t^2} \qquad (20) \]
\[ \cos A = \frac{1 - t^2}{1 + t^2} \qquad (21) \]
\[ \tan A = \frac{2t}{1 - t^2} \qquad (22) \]
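As a quick numerical check of (20) and (21), the short Python sketch below compares both sides for a few angles. The function name and the test angles are arbitrary choices made for illustration.

```python
import math

def from_half_angle_tangent(t):
    """Return (sin A, cos A) computed from t = tan(A/2) using (20) and (21)."""
    return 2 * t / (1 + t * t), (1 - t * t) / (1 + t * t)

# Arbitrary angles in radians (none of them makes tan(A/2) undefined).
for A in (0.3, 1.0, 2.5, -1.2):
    s, c = from_half_angle_tangent(math.tan(A / 2))
    print(f"A = {A:5.2f}: sin A = {math.sin(A):.6f} vs {s:.6f}, "
          f"cos A = {math.cos(A):.6f} vs {c:.6f}")
```

Formula (22) then follows by dividing (20) by (21), provided $t^2 \neq 1$.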
Sum of two sines and sum of two cosines

Finally, in this Section, we obtain results that are widely used in areas of science and engineering
such as vibration theory, wave theory and electric circuit theory.
We return to the identities (6) and (9)
sin(A + B) ≡ sin A cos B + cos A sin B

sin(A − B) ≡ sin A cos B − cos A sin B
Adding these identities gives

sin(A + B) + sin(A − B) ≡ 2 sin A cos B (23)
Subtracting the identities produces
sin(A + B) − sin(A − B) ≡ 2 cos A sin B (24)
It is now convenient to let C = A + B and D = A − B so that
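\[ A = \frac{C + D}{2} \quad\text{and}\quad B = \frac{C - D}{2} \]
Hence (23) becomes
\[ \sin C + \sin D \equiv 2\sin\left(\frac{C + D}{2}\right)\cos\left(\frac{C - D}{2}\right) \qquad (25) \]
Similarly (24) becomes
\[ \sin C - \sin D \equiv 2\cos\left(\frac{C + D}{2}\right)\sin\left(\frac{C - D}{2}\right) \qquad (26) \]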
Task
Use (7) and (10) to obtain results for the sum and difference of two cosines.
Your solution
Answer
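$\cos(A + B) \equiv \cos A\cos B - \sin A\sin B$ and $\cos(A - B) \equiv \cos A\cos B + \sin A\sin B$
\[ \therefore\ \cos(A + B) + \cos(A - B) \equiv 2\cos A\cos B \]
\[ \cos(A + B) - \cos(A - B) \equiv -2\sin A\sin B \]
Hence with $C = A + B$ and $D = A - B$
\[ \cos C + \cos D \equiv 2\cos\left(\frac{C + D}{2}\right)\cos\left(\frac{C - D}{2}\right) \qquad (27) \]
\[ \cos C - \cos D \equiv -2\sin\left(\frac{C + D}{2}\right)\sin\left(\frac{C - D}{2}\right) \qquad (28) \]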
Summary
In this Section we have covered a large number of trigonometric identities. The most important of
them and probably the ones most worth memorising are given in the following Key Point.
Key Point 17
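\[ \cos^2\theta + \sin^2\theta \equiv 1 \]
\[ \sin 2\theta \equiv 2\sin\theta\cos\theta \]
\[ \cos 2\theta \equiv \cos^2\theta - \sin^2\theta \equiv 2\cos^2\theta - 1 \equiv 1 - 2\sin^2\theta \]
\[ \sin(A \pm B) \equiv \sin A\cos B \pm \cos A\sin B \]
\[ \cos(A \pm B) \equiv \cos A\cos B \mp \sin A\sin B \]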
Task
A projectile is fired from the ground with an initial speed $u$ m s$^{-1}$ at an angle of
elevation $\alpha^\circ$. If air resistance is neglected, the vertical height, $y$ m, is related to
the horizontal distance, $x$ m, by the equation
\[ y = x\tan\alpha - \frac{g x^2\sec^2\alpha}{2u^2} \]
where $g$ m s$^{-2}$ is the acceleration due to gravity.
[This equation is derived in 34 Modelling Motion pages 16-17.]
(a) Confirm that y = 0 when x = 0:
Your solution
Answer
When y = 0, the left-hand side of the equation is zero. Since x appears in both of the terms on
the right-hand side, when x = 0, the right-hand side is zero.
(b) Find an expression for the value of x other than x = 0 at which y = 0 and state how this value
is related to the maximum range of the projectile:
Your solution
Answer
When $y = 0$, the equation can be written
\[ \frac{g x^2\sec^2\alpha}{2u^2} - x\tan\alpha = 0 \]
If $x = 0$ is excluded from consideration, we can divide through by $x$ and rearrange to give
\[ \frac{g x\sec^2\alpha}{2u^2} = \tan\alpha \]
To make $x$ the subject of the equation we need to multiply both sides by $\dfrac{2u^2}{g\sec^2\alpha}$.
Given that $1/\sec^2\alpha \equiv \cos^2\alpha$, $\tan\alpha \equiv \sin\alpha/\cos\alpha$ and $\sin 2\alpha \equiv 2\sin\alpha\cos\alpha$, this results in
\[ x = \frac{2u^2\sin\alpha\cos\alpha}{g} = \frac{u^2\sin 2\alpha}{g} \]
This represents the maximum range.
(c) Find the value of x for which the value of y would be a maximum and thereby obtain an expression
for the maximum height:
Your solution
Answer
If air resistance is neglected, we can assume that the parabolic path of the projectile is symmetrical
about its highest point. So the highest point will occur at half the maximum range i.e. where
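\[ x = \frac{u^2\sin 2\alpha}{2g} \]
Substituting this expression for $x$ in the equation for $y$ gives
\[ y = \frac{u^2\sin 2\alpha}{2g}\tan\alpha - \left(\frac{u^2\sin 2\alpha}{2g}\right)^2\frac{g\sec^2\alpha}{2u^2} \]
Using the same trigonometric identities as before,
\[ y = \frac{u^2\sin^2\alpha}{g} - \frac{u^2\sin^2\alpha}{2g} = \frac{u^2\sin^2\alpha}{2g} \]
This represents the maximum height.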
(d) Assuming $u = 20$ m s$^{-1}$, $\alpha = 60^\circ$ and $g = 10$ m s$^{-2}$, find the maximum value of the range and
the horizontal distances travelled when the height is 10 m:
Your solution
Answer
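From part (b), the maximum value of the range is $\dfrac{u^2\sin 2\alpha}{g} = \dfrac{400\sin 120^\circ}{10} = 34.64$ m.
Substitution of $u = 20$, $\alpha = 60^\circ$, $g = 10$ and $y = 10$ in the original equation gives a quadratic for
$x$:
\[ 10 = 1.732x - 0.05x^2 \quad\text{or}\quad 0.05x^2 - 1.732x + 10 = 0 \]
Solution of this quadratic yields $x = 7.32$ or $x = 27.32$ as the two horizontal ranges at which
$y = 10$. These values are illustrated in the diagram below which shows the complete trajectory of
the projectile.
[Figure: trajectory of the projectile, Height plotted against Horizontal Range, with $x_1 = 7.32$ and $x_2 = 27.32$ marked at a height of 10]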
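The results of parts (b), (c) and (d) can be reproduced with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the default $g = 10$ is simply the value assumed in part (d).

```python
import math

def max_range(u, alpha_deg, g=10.0):
    """Range u^2 sin(2 alpha) / g from part (b)."""
    return u**2 * math.sin(2 * math.radians(alpha_deg)) / g

def max_height(u, alpha_deg, g=10.0):
    """Maximum height u^2 sin^2(alpha) / (2 g) from part (c)."""
    return u**2 * math.sin(math.radians(alpha_deg))**2 / (2 * g)

def distances_at_height(y, u, alpha_deg, g=10.0):
    """Roots of k x^2 - tan(alpha) x + y = 0, where k = g sec^2(alpha) / (2 u^2)."""
    a = math.radians(alpha_deg)
    k = g / (2 * u**2 * math.cos(a)**2)
    disc = math.tan(a)**2 - 4 * k * y
    if disc < 0:
        return []                      # the projectile never reaches this height
    root = math.sqrt(disc)
    return [(math.tan(a) - root) / (2 * k), (math.tan(a) + root) / (2 * k)]

print(max_range(20, 60), max_height(20, 60), distances_at_height(10, 20, 60))
```

With $u = 20$, $\alpha = 60^\circ$ and $g = 10$ this gives a range of about 34.64 m, a maximum height of 15 m, and horizontal distances of about 7.32 m and 27.32 m at a height of 10 m.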
Exercises
1. Show that $\sin t\,\sec t \equiv \tan t$.
2. Show that $(1 + \sin t)(1 + \sin(-t)) \equiv \cos^2 t$.
3. Show that $\dfrac{1}{\tan\theta + \cot\theta} \equiv \dfrac{1}{2}\sin 2\theta$.
4. Show that $\sin^2(A + B) - \sin^2(A - B) \equiv \sin 2A\sin 2B$.
(Hint: the left-hand side is the difference of two squared quantities.)
5. Show that $\dfrac{\sin 4\theta + \sin 2\theta}{\cos 4\theta + \cos 2\theta} \equiv \tan 3\theta$.
6. Show that $\cos^4 A - \sin^4 A \equiv \cos 2A$.
7. Express each of the following as the sum (or difference) of 2 sines (or cosines)
(a) $\sin 5x\cos 2x$ (b) $8\cos 6x\cos 4x$ (c) $\dfrac{1}{3}\sin\dfrac{1}{2}x\,\cos\dfrac{3}{2}x$
8. Express (a) $\sin 3\theta$ in terms of $\sin\theta$. (b) $\cos 3\theta$ in terms of $\cos\theta$.
9. By writing $\cos 4x$ as $\cos 2(2x)$, or otherwise, express $\cos 4x$ in terms of $\cos x$.
10. Show that $\tan 2t \equiv \dfrac{2\tan t}{2 - \sec^2 t}$.
11. Show that $\dfrac{\cos 10t - \cos 12t}{\sin 10t + \sin 12t} \equiv \tan t$.
12. Show that the area of an isosceles triangle with equal sides of length $x$ is $\dfrac{x^2}{2}\sin\theta$
where $\theta$ is the angle between the two equal sides. Hint: use the following diagram:
[Diagram: isosceles triangle ABC with apex A and equal sides AB = AC = x; D is the midpoint of the base BC, so that AD divides the angle θ at A into two angles of θ/2]
Answers
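1. $\sin t\,\sec t \equiv \sin t\cdot\dfrac{1}{\cos t} \equiv \dfrac{\sin t}{\cos t} \equiv \tan t$.
2. $(1 + \sin t)(1 + \sin(-t)) \equiv (1 + \sin t)(1 - \sin t) \equiv 1 - \sin^2 t \equiv \cos^2 t$
3.
\[ \frac{1}{\tan\theta + \cot\theta} \equiv \frac{1}{\dfrac{\sin\theta}{\cos\theta} + \dfrac{\cos\theta}{\sin\theta}} \equiv \frac{1}{\dfrac{\sin^2\theta + \cos^2\theta}{\sin\theta\cos\theta}} \equiv \frac{\sin\theta\cos\theta}{\sin^2\theta + \cos^2\theta} \equiv \sin\theta\cos\theta \equiv \frac{1}{2}\sin 2\theta \]
4. Using the hint and the identity $x^2 - y^2 \equiv (x - y)(x + y)$ we have
\[ \sin^2(A + B) - \sin^2(A - B) \equiv (\sin(A + B) - \sin(A - B))(\sin(A + B) + \sin(A - B)) \]
The first bracket gives
\[ \sin A\cos B + \cos A\sin B - (\sin A\cos B - \cos A\sin B) \equiv 2\cos A\sin B \]
Similarly the second bracket gives $2\sin A\cos B$.
Multiplying we obtain $(2\cos A\sin A)(2\cos B\sin B) \equiv \sin 2A\,\sin 2B$
5. $\dfrac{\sin 4\theta + \sin 2\theta}{\cos 4\theta + \cos 2\theta} \equiv \dfrac{2\sin 3\theta\cos\theta}{2\cos 3\theta\cos\theta} \equiv \dfrac{\sin 3\theta}{\cos 3\theta} \equiv \tan 3\theta$
6.
\[ \begin{aligned}
\cos^4 A - \sin^4 A &\equiv (\cos A)^4 - (\sin A)^4 \equiv (\cos^2 A)^2 - (\sin^2 A)^2 \\
&\equiv (\cos^2 A - \sin^2 A)(\cos^2 A + \sin^2 A) \\
&\equiv \cos^2 A - \sin^2 A \equiv \cos 2A
\end{aligned} \]
7. (a) Using $\sin A + \sin B \equiv 2\sin\left(\dfrac{A + B}{2}\right)\cos\left(\dfrac{A - B}{2}\right)$.
Clearly here $\dfrac{A + B}{2} = 5x$ and $\dfrac{A - B}{2} = 2x$, giving $A = 7x$, $B = 3x$
\[ \therefore\ \sin 5x\cos 2x \equiv \frac{1}{2}(\sin 7x + \sin 3x) \]
(b) Using $\cos A + \cos B \equiv 2\cos\left(\dfrac{A + B}{2}\right)\cos\left(\dfrac{A - B}{2}\right)$.
With $\dfrac{A + B}{2} = 6x$ and $\dfrac{A - B}{2} = 4x$, giving $A = 10x$, $B = 2x$
\[ \therefore\ 8\cos 6x\cos 4x \equiv 4(\cos 10x + \cos 2x) \]
(c) $\dfrac{1}{3}\sin\dfrac{1}{2}x\,\cos\dfrac{3x}{2} \equiv \dfrac{1}{6}(\sin 2x - \sin x)$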
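8.
(a)
\[ \begin{aligned}
\sin 3\theta &\equiv \sin(2\theta + \theta) \equiv \sin 2\theta\cos\theta + \cos 2\theta\sin\theta \\
&\equiv 2\sin\theta\cos^2\theta + (\cos^2\theta - \sin^2\theta)\sin\theta \\
&\equiv 3\sin\theta\cos^2\theta - \sin^3\theta \\
&\equiv 3\sin\theta(1 - \sin^2\theta) - \sin^3\theta \equiv 3\sin\theta - 4\sin^3\theta
\end{aligned} \]
(b)
\[ \begin{aligned}
\cos 3\theta &\equiv \cos(2\theta + \theta) \equiv \cos 2\theta\cos\theta - \sin 2\theta\sin\theta \\
&\equiv (\cos^2\theta - \sin^2\theta)\cos\theta - 2\sin\theta\cos\theta\sin\theta \\
&\equiv \cos^3\theta - 3\sin^2\theta\cos\theta \\
&\equiv \cos^3\theta - 3(1 - \cos^2\theta)\cos\theta \\
&\equiv 4\cos^3\theta - 3\cos\theta
\end{aligned} \]
9.
\[ \begin{aligned}
\cos 4x \equiv \cos 2(2x) &\equiv 2\cos^2(2x) - 1 \\
&\equiv 2(\cos 2x)^2 - 1 \\
&\equiv 2(2\cos^2 x - 1)^2 - 1 \\
&\equiv 2(4\cos^4 x - 4\cos^2 x + 1) - 1 \equiv 8\cos^4 x - 8\cos^2 x + 1.
\end{aligned} \]
10. $\tan 2t \equiv \dfrac{2\tan t}{1 - \tan^2 t} \equiv \dfrac{2\tan t}{1 - (\sec^2 t - 1)} \equiv \dfrac{2\tan t}{2 - \sec^2 t}$
11. $\cos 10t - \cos 12t \equiv 2\sin 11t\sin t$ and $\sin 10t + \sin 12t \equiv 2\sin 11t\cos(-t)$
\[ \therefore\ \frac{\cos 10t - \cos 12t}{\sin 10t + \sin 12t} \equiv \frac{\sin t}{\cos(-t)} \equiv \frac{\sin t}{\cos t} \equiv \tan t \]
12. The right-angled triangle ACD has area $\dfrac{1}{2}(CD)(AD)$.
But $\sin\dfrac{\theta}{2} = \dfrac{CD}{x}$ $\therefore$ $CD = x\sin\dfrac{\theta}{2}$
and $\cos\dfrac{\theta}{2} = \dfrac{AD}{x}$ $\therefore$ $AD = x\cos\dfrac{\theta}{2}$
\[ \therefore\ \text{area of } \triangle ACD = \frac{1}{2}x^2\sin\frac{\theta}{2}\cos\frac{\theta}{2} = \frac{1}{4}x^2\sin\theta \]
\[ \therefore\ \text{area of } \triangle ABC = 2 \times \text{area of } \triangle ACD = \frac{1}{2}x^2\sin\theta \]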
Applications of
Trigonometry
to Triangles 4.4
Introduction
We originally introduced trigonometry using right-angled triangles. However, the subject has
applications in dealing with any triangles such as those that might arise in surveying, navigation or the
study of mechanisms.
In this Section we show how, given certain information about a triangle, we can use appropriate rules,
called the Sine rule and the Cosine rule, to fully ‘solve the triangle’ i.e. obtain the lengths of all
the sides and the size of all the angles of that triangle.
Prerequisites
Before starting this Section you should . . .
• have a knowledge of the basics of trigonometry
• be aware of the standard trigonometric identities
Learning Outcomes
On completion you should be able to . . .
• use trigonometry in everyday situations
• fully determine all the sides and angles and the area of any triangle from partial information
1. Applications of trigonometry to triangles
Area of a triangle
The area $S$ of any triangle is given by $S = \dfrac{1}{2} \times (\text{base}) \times (\text{perpendicular height})$ where ‘perpendicular
height’ means the perpendicular distance from the side called the ‘base’ to the opposite vertex. Thus
for the right-angled triangle shown in Figure 33(a) $S = \dfrac{1}{2}b\,a$. For the obtuse-angled triangle
shown in Figure 33(b) the area is $S = \dfrac{1}{2}bh$.
[Figure 33: (a) a right-angled triangle ABC with sides a, b, c; (b) an obtuse-angled triangle ABC with the perpendicular BD of length h drawn from B to the line AC extended]
If we use C to denote the angle ACB in Figure 33(b) then
\[ \sin(180^\circ - C) = \frac{h}{a} \quad \text{(triangle BCD is right-angled)} \]
\[ \therefore\ h = a\sin(180^\circ - C) = a\sin C \quad \text{(see the graph of the sine wave or expand } \sin(180^\circ - C)\text{)} \]
\[ \therefore\ S = \frac{1}{2}b\,a\sin C \qquad 1(a) \]
By other similar constructions we could demonstrate that
\[ S = \frac{1}{2}a\,c\sin B \qquad 1(b) \]
and
\[ S = \frac{1}{2}b\,c\sin A \qquad 1(c) \]
Note the pattern here: in each formula for the area the angle involved is the one between the sides
whose lengths occur in that expression.
Clearly if $C$ is a right-angle (so $\sin C = 1$) then
\[ S = \frac{1}{2}b\,a \quad \text{as for Figure 33(a).} \]
Note: from now on we will not generally write ‘≡’ but use the more usual ‘=’.
The Sine rule

The Sine rule is a formula which, if we are given certain information about a triangle, enables us to
fully ‘solve the triangle’ i.e. obtain the lengths of all three sides and the value of all three angles.
To show the rule we note that from the formulae (1a) and (1b) for the area S of the triangle ABC
in Figure 33 we have
\[ ba\sin C = ac\sin B \quad\text{or}\quad \frac{b}{\sin B} = \frac{c}{\sin C} \]
Similarly using (1b) and (1c)
\[ ac\sin B = bc\sin A \quad\text{or}\quad \frac{a}{\sin A} = \frac{b}{\sin B} \]
Key Point 18
The Sine Rule
For any triangle ABC, where $a$ is the length of the side opposite angle $A$, $b$ the side length opposite
angle $B$ and $c$ the side length opposite angle $C$, the Sine rule states
\[ \frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C} \]
Use of the Sine rule

To be able to fully determine all the angles and sides of a triangle it follows from the Sine rule that
we must know
either two angles and one side : (knowing two angles of a triangle really means that all
three are known since the sum of the angles is 180◦ )
or two sides and an angle opposite one of those two sides.
Example 3
Solve the triangle ABC given that a = 32 cm, b = 46 cm and angle B = 63.25◦ .
Solution
Using the first pair of equations in the Sine rule (Key Point 18) we have
\[ \frac{32}{\sin A} = \frac{46}{\sin 63.25^\circ} \quad\therefore\quad \sin A = \frac{32}{46}\sin 63.25^\circ = 0.6212 \]
so $A = \sin^{-1}(0.6212) = 38.4^\circ$ (by calculator).
You should, however, note carefully that because of the form of the graph of the sine function there
are two angles between 0◦ and 180◦ which have the same value for their sine i.e. x and (180 − x).
See Figure 34.
[Figure 34: graph of sin θ against θ, showing the two angles x and 180° − x which have the same sine]
In our example
A = sin−1 (0.6212) = 38.4◦
or
A = 180◦ − 38.4◦ = 141.6◦ .
However, since we are given that angle $B$ is $63.25^\circ$, the value of $141.6^\circ$ for angle $A$ is clearly
impossible, because $A + B$ would then exceed $180^\circ$.
To complete the problem we simply note that
C = 180◦ − (38.4◦ + 63.25◦ ) = 78.35◦
The remaining side c is calculated from the Sine rule, using either a and sin A or b and sin B.
Task
Find the length of side c in Example 3.
Your solution
Answer
Using, for example, $\dfrac{a}{\sin A} = \dfrac{c}{\sin C}$
we have
\[ c = a\,\frac{\sin C}{\sin A} = 32 \times \frac{\sin 78.35^\circ}{0.6212} = \frac{32 \times 0.9794}{0.6212} = 50.45 \text{ cm.} \]
The ambiguous case

When, as in Example 3, we are given two sides and the non-included angle of a triangle, particular
care is required.
Suppose that sides $b$ and $c$ and the angle $B$ are given. Then the angle $C$ is given by the Sine rule as
\[ \sin C = c\,\frac{\sin B}{b} \]
[Figure 35: triangle ABC with sides a, b, c opposite the angles A, B, C respectively]
Various cases can arise:
(i) $c\sin B > b$
This implies that $\dfrac{c\sin B}{b} > 1$ in which case no triangle exists since $\sin C$ cannot exceed 1.
(ii) $c\sin B = b$
In this case $\sin C = \dfrac{c\sin B}{b} = 1$ so $C = 90^\circ$.
(iii) $c\sin B < b$
Hence $\sin C = \dfrac{c\sin B}{b} < 1$.
As mentioned earlier there are two possible values of angle C in the range 0 to 180◦ , one acute angle
(< 90◦ ) and one obtuse (between 90◦ and 180◦ .) These angles are C1 = x and C2 = 180 − x. See
Figure 36.
If the given angle B is greater than 90◦ then the obtuse angle C2 is not a possible solution because,
of course, a triangle cannot possess two obtuse angles.
[Figure 36: the two possible triangles, in which the side b meets the base at either the acute angle C1 or the obtuse angle C2]
For B less than 90◦ there are still two possibilities.
If the given side b is greater than the given side c, the obtuse angle solution C2 is not possible because
then the larger angle would be opposite the smaller side. (This was the situation in Example 3.)
The final case
b < c, B < 90◦
does give rise to two possible values C1 , C2 of the angle C and is referred to as the ambiguous
case. In this case there will be two possible values a1 and a2 for the third side of the triangle
corresponding to the two angle values
A1 = 180◦ − (B + C1 )
A2 = 180◦ − (B + C2 )
Task
Show that two triangles fit the following data for a triangle ABC:
a = 4.5 cm b = 7 cm A = 35◦
Obtain the sides and angle of both possible triangles.
Your solution
Answer
We have, by the Sine rule,
\[ \sin B = \frac{b\sin A}{a} = \frac{7\sin 35^\circ}{4.5} = 0.8922 \]
So $B = \sin^{-1}(0.8922) = 63.15^\circ$ (by calculator) or $180^\circ - 63.15^\circ = 116.85^\circ$.
In this case, both values of $B$ are indeed possible since both values are larger than angle $A$ (side $b$
is longer than side $a$). This is the ambiguous case with two possible triangles.
First triangle: $B = B_1 = 63.15^\circ$, $C = C_1 = 81.85^\circ$ and $c = c_1$ where
\[ \frac{c_1}{\sin 81.85^\circ} = \frac{4.5}{\sin 35^\circ} \quad\text{so}\quad c_1 = \frac{4.5 \times 0.9899}{0.5736} = 7.766 \text{ cm} \]
Second triangle: $B = B_2 = 116.85^\circ$, $C = C_2 = 28.15^\circ$ and $c = c_2$ where
\[ \frac{c_2}{\sin 28.15^\circ} = \frac{4.5}{\sin 35^\circ} \quad\text{so}\quad c_2 = \frac{4.5 \times 0.4718}{0.5736} = 3.701 \text{ cm} \]
You can clearly see that we have one acute angled triangle $AB_1C_1$ and one obtuse angled $AB_2C_2$
corresponding to the given data.
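The case analysis for two sides and a non-included angle can be sketched in Python as follows. The function name `solve_ssa` and its argument order are hypothetical choices for illustration; the logic simply applies the Sine rule and rejects any candidate angle that would make the angle sum reach or exceed 180°.

```python
import math

def solve_ssa(side_opp, side_other, angle_deg):
    """Given a side, the angle (degrees) opposite it, and one other side, return
    every (other_angle, third_angle, third_side) allowed by the Sine rule."""
    s = side_other * math.sin(math.radians(angle_deg)) / side_opp
    if s > 1:
        return []                          # no triangle: a sine cannot exceed 1
    acute = math.degrees(math.asin(s))
    solutions = []
    # The two candidates are x and 180 - x (they coincide when x = 90).
    for other in sorted({acute, 180.0 - acute}):
        third = 180.0 - angle_deg - other
        if third > 0:                      # the three angles must sum to 180
            third_side = (side_opp * math.sin(math.radians(third))
                          / math.sin(math.radians(angle_deg)))
            solutions.append((other, third, third_side))
    return solutions

print(solve_ssa(46, 32, 63.25))   # Example 3: b = 46, a = 32, B = 63.25 degrees
print(solve_ssa(4.5, 7, 35))      # Task above: a = 4.5, b = 7, A = 35 degrees
```

For the data of Example 3 only one triangle is returned, while the data of the Task above produce two triangles, in agreement with the worked answers.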
The Cosine rule

The Cosine rule is an alternative formula for ‘solving a triangle’ ABC. It is particularly useful for
the case where the Sine rule cannot be used, i.e. when two sides of the triangle are known together
with the angle between these two sides.
Consider the two triangles ABC shown in Figure 37.
[Figure 37: triangle ABC with the perpendicular BD drawn from B to the line AC, (a) when the angle A is acute, (b) when the angle A is obtuse]
In Figure 37(a), using the right-angled triangle ABD, $BD = c\sin A$.
In Figure 37(b), using the right-angled triangle ABD, $BD = c\sin(180^\circ - A) = c\sin A$.
In Figure 37(a) $DA = c\cos A$ $\therefore$ $CD = b - c\cos A$
In Figure 37(b) $DA = c\cos(180^\circ - A) = -c\cos A$ $\therefore$ $CD = b + AD = b - c\cos A$
In both cases, in the right-angled triangle BDC
\[ (BC)^2 = (CD)^2 + (BD)^2 \]
So, using the above results,
\[ a^2 = (b - c\cos A)^2 + c^2\sin^2 A = b^2 - 2bc\cos A + c^2(\cos^2 A + \sin^2 A) \]
giving
\[ a^2 = b^2 + c^2 - 2bc\cos A \qquad (3) \]
Equation (3) is one form of the Cosine rule. Clearly it can be used, as we stated above, to calculate
the side a if the sides b and c and the included angle A are known.
Note that if A = 90◦ , cos A = 0 and (3) reduces to Pythagoras’ theorem.
Two similar formulae to (3) for the Cosine rule can be similarly derived - see following Key Point:
Key Point 19
Cosine Rule
For any triangle with sides $a$, $b$, $c$ and corresponding angles $A$, $B$, $C$
\[ a^2 = b^2 + c^2 - 2bc\cos A \qquad \cos A = \frac{b^2 + c^2 - a^2}{2bc} \]
\[ b^2 = c^2 + a^2 - 2ca\cos B \qquad \cos B = \frac{c^2 + a^2 - b^2}{2ca} \]
\[ c^2 = a^2 + b^2 - 2ab\cos C \qquad \cos C = \frac{a^2 + b^2 - c^2}{2ab} \]
Example 4
Solve the triangle where b = 7.00 cm, c = 3.59 cm, A = 47◦ .
Solution
Since two sides and the angle $A$ between these sides are given we must first use the Cosine rule in
the form (3):
\[ a^2 = (7.00)^2 + (3.59)^2 - 2(7.00)(3.59)\cos 47^\circ = 49 + 12.888 - 34.277 = 27.610 \]
so $a = \sqrt{27.610} = 5.255$ cm.
We can now most easily use the Sine rule to solve one of the remaining angles:
\[ \frac{7.00}{\sin B} = \frac{5.255}{\sin 47^\circ} \quad\text{so}\quad \sin B = \frac{7.00\sin 47^\circ}{5.255} = 0.9742 \]
from which $B = B_1 = 76.96^\circ$ or $B = B_2 = 103.04^\circ$.
At this stage it is not obvious which value is correct or whether this is the ambiguous case and both
values of $B$ are possible.
The two possible values for the remaining angle $C$ are
\[ C_1 = 180^\circ - (47^\circ + 76.96^\circ) = 56.04^\circ \]
\[ C_2 = 180^\circ - (47^\circ + 103.04^\circ) = 29.96^\circ \]
Since for the sides of this triangle $b > a > c$ then similarly for the angles we must have
$B > A > C$, so the value $C_2 = 29.96^\circ$ is the correct one for the third angle (and hence $B = 103.04^\circ$).
The Cosine rule can also be applied to some triangles where the lengths a, b and c of the three sides
are known and the only calculations needed are finding the angles.
Task
A triangle ABC has sides
$a = 7$ cm, $b = 11$ cm, $c = 12$ cm.
Obtain the values of all the angles of the triangle. (Use Key Point 19.)
Your solution
Answer
Suppose we find angle $A$ first using the following formula from Key Point 19
\[ \cos A = \frac{b^2 + c^2 - a^2}{2bc} \]
Here $\cos A = \dfrac{11^2 + 12^2 - 7^2}{2 \times 11 \times 12} = 0.818$ so $A = \cos^{-1}(0.818) = 35.1^\circ$
(There is no other possibility between $0^\circ$ and $180^\circ$ for $A$. No ‘ambiguous case’ arises using the
Cosine rule!)
Another angle $B$ or $C$ could now be obtained using the Sine rule or the Cosine rule.
Using the following formula from Key Point 19:
\[ \cos B = \frac{c^2 + a^2 - b^2}{2ca} = \frac{12^2 + 7^2 - 11^2}{2 \times 12 \times 7} = 0.429 \quad\text{so}\quad B = \cos^{-1}(0.429) = 64.6^\circ \]
Since $A + B + C = 180^\circ$ we can deduce $C = 80.3^\circ$
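Both uses of the Cosine rule (finding a side from two sides and the included angle, and finding the angles from three sides) can be sketched in Python. The function names are hypothetical and angles are taken in degrees.

```python
import math

def side_from_sas(b, c, angle_a_deg):
    """Side a from sides b, c and the included angle A (Cosine rule)."""
    return math.sqrt(b * b + c * c - 2 * b * c * math.cos(math.radians(angle_a_deg)))

def angles_from_sss(a, b, c):
    """Angles (A, B, C) in degrees from the three sides a, b, c."""
    A = math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))
    B = math.degrees(math.acos((c * c + a * a - b * b) / (2 * c * a)))
    return A, B, 180.0 - A - B    # the third angle follows from the angle sum

print(side_from_sas(7.00, 3.59, 47))   # Example 4
print(angles_from_sss(7, 11, 12))      # Task above
```

Because the inverse cosine returns a single angle between 0° and 180°, no ambiguous case needs to be handled here.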
Exercises
1. Determine the remaining angles and sides for the following triangles:
(a) [Diagram: triangle ABC with angle A = 130°, angle B = 20° and side AC = 6; the sides a and c are unknown]
(b) [Diagram: triangle ABC with AB = 3, AC = 4 and angle B = 80°; the side a is unknown]
(c) [Diagram: triangle ABC with AB = 10, BC = 12 and angle B = 26°; the side b is unknown]
(d) The triangles ABC with $B = 50^\circ$, $b = 5$, $c = 6$. (Take special care here!)
2. Determine all the angles of the triangle ABC where the sides have lengths $a = 7$, $b = 6$
and $c = 9$.
3. Two ships leave a port at 8.00 am, one travelling at 12 knots (nautical miles per hour) the
other at 10 knots. The faster ship maintains a bearing of N 47° W, the slower one a bearing
S 20° W. Calculate the separation of the ships at midday. (Hint: Draw an appropriate diagram.)
4. The crank mechanism shown below has an arm OA of length 30 mm rotating anticlockwise
about O and a connecting rod AB of length 60 mm. B moves along the horizontal line
OB. What is the length OB when OA has rotated by $\dfrac{1}{8}$ of a complete revolution from the
horizontal?
[Diagram: crank OA and connecting rod AB, with B on the horizontal line through O]
Answers
1.
(a) Using the Sine rule $\dfrac{a}{\sin 130^\circ} = \dfrac{6}{\sin 20^\circ} = \dfrac{c}{\sin C}$. From the two left-hand equations
$a = 6\,\dfrac{\sin 130^\circ}{\sin 20^\circ} \simeq 13.44$.
Then, since $C = 30^\circ$, the right-hand pair of equations give $c = 6\,\dfrac{\sin 30^\circ}{\sin 20^\circ} \simeq 8.77$.
(b) Again using the Sine rule $\dfrac{a}{\sin A} = \dfrac{4}{\sin 80^\circ} = \dfrac{3}{\sin C}$ so $\sin C = \dfrac{3}{4}\sin 80^\circ = 0.7386$.
There are two possible angles satisfying $\sin C = 0.7386$ or $C = \sin^{-1}(0.7386)$.
These are $47.61^\circ$ and $180^\circ - 47.61^\circ = 132.39^\circ$. However the obtuse angle value is
impossible here because the angle $B$ is $80^\circ$ and the sum of the angles would then exceed
$180^\circ$. Hence $C = 47.61^\circ$ so $A = 180^\circ - (80^\circ + 47.61^\circ) = 52.39^\circ$.
Then, $\dfrac{a}{\sin 52.39^\circ} = \dfrac{4}{\sin 80^\circ}$ so $a = 4\,\dfrac{\sin 52.39^\circ}{\sin 80^\circ} \simeq 3.22$
(c) In this case since two sides and the included angle are given we must use the Cosine rule.
The appropriate form is
\[ b^2 = c^2 + a^2 - 2ca\cos B = 10^2 + 12^2 - (2)(10)(12)\cos 26^\circ = 28.2894 \]
so $b = \sqrt{28.2894} = 5.32$
Continuing we use the Cosine rule again to determine say angle $C$ where
$c^2 = a^2 + b^2 - 2ab\cos C$ that is $10^2 = 12^2 + (5.32)^2 - 2(12)(5.32)\cos C$
from which $\cos C = 0.5663$ and $C = 55.51^\circ$ (There is no other possibility for $C$ between
$0^\circ$ and $180^\circ$. Recall that the cosine of an angle between $90^\circ$ and $180^\circ$ is negative.)
Finally, $A = 180^\circ - (26^\circ + 55.51^\circ) = 98.49^\circ$.
(d) By the Sine rule
\[ \frac{a}{\sin A} = \frac{5}{\sin 50^\circ} = \frac{6}{\sin C} \quad\therefore\quad \sin C = 6\,\frac{\sin 50^\circ}{5} = 0.9193 \]
Then $C = \sin^{-1}(0.9193) = 66.82^\circ$ (calculator) or $180^\circ - 66.82^\circ = 113.18^\circ$. In this case
both values of $C$, say $C_1 = 66.82^\circ$ and $C_2 = 113.18^\circ$, are possible and there are two
possible triangles satisfying the given data. Continued use of the Sine rule produces
(i) with $C_1 = 66.82^\circ$ (acute angle triangle) $A = A_1 = 180^\circ - (66.82^\circ + 50^\circ) = 63.18^\circ$,
$a = a_1 = 5.83$
(ii) with $C_2 = 113.18^\circ$, $A = A_2 = 16.82^\circ$, $a = a_2 = 1.89$
2. We use the Cosine rule firstly to find the angle opposite the longest side. This will tell us
whether the triangle contains an obtuse angle. Hence we solve for $C$ using
\[ c^2 = a^2 + b^2 - 2ab\cos C \qquad 81 = 49 + 36 - 84\cos C \]
from which $84\cos C = 4$, $\cos C = 4/84$ giving $C = 87.27^\circ$.
So there is no obtuse angle in this triangle and we can use the Sine rule knowing that there
is only one possible triangle fitting the data. (We could continue to use the Cosine rule if we
wished of course.) Choosing to find the angle $B$ we have
\[ \frac{6}{\sin B} = \frac{9}{\sin 87.27^\circ} \]
from which $\sin B = 0.6659$ giving $B = 41.75^\circ$. (The obtuse case for $B$ is not possible, as
explained above.) Finally $A = 180^\circ - (41.75^\circ + 87.27^\circ) = 50.98^\circ$.
3. [Diagram: port O with ship A 48 nautical miles away on a bearing N 47° W and ship B 40 nautical miles away on a bearing S 20° W; the distance AB is c]
At midday (4 hours travelling) ships A and B are respectively 48 and 40 nautical miles from
the port O. In triangle AOB we have
\[ \angle AOB = 180^\circ - (47^\circ + 20^\circ) = 113^\circ. \]
We must use the Cosine rule to obtain the required distance apart of the ships. Denoting the
distance AB by $c$, as usual,
$c^2 = 48^2 + 40^2 - 2(48)(40)\cos 113^\circ$ from which $c^2 = 5404.41$ and $c = 73.5$ nautical miles.
4. By the Sine rule $\dfrac{30}{\sin B} = \dfrac{60}{\sin 45^\circ}$ $\therefore$ $\sin B = \dfrac{30}{60}\sin 45^\circ = 0.353$ so $B = 20.704^\circ$.
[Diagram: triangle OAB showing the position after 1/8 revolution, with OA = 30 mm, AB = 60 mm and the angle at O equal to 45°]
The obtuse value of $\sin^{-1}(0.353)$ is impossible. Hence,
\[ A = 180^\circ - (45^\circ + 20.704^\circ) = 114.296^\circ. \]
Using the Sine rule again $\dfrac{30}{\sin 20.704^\circ} = \dfrac{OB}{\sin 114.296^\circ}$ from which $OB \simeq 77.3$ mm.
Applications of
Trigonometry
to Waves 4.5
Introduction
Waves and vibrations occur in many contexts. The water waves on the sea and the vibrations of
a stringed musical instrument are just two everyday examples. If the vibrations are simple ‘to and
fro’ oscillations they are referred to as ‘sinusoidal’ which implies that a knowledge of trigonometry,
particularly of the sine and cosine functions, is a necessary pre-requisite for dealing with their analysis.
In this Section we give a brief introduction to this topic.
Prerequisites
Before starting this Section you should . . .
• have a knowledge of the basics of trigonometry
• be aware of the standard trigonometric identities
Learning Outcomes
On completion you should be able to . . .
• use simple trigonometric functions to describe waves
• combine two waves of the same frequency as a single wave in amplitude-phase form
1. Applications of trigonometry to waves
Two-dimensional motion
Suppose that a wheel of radius R is rotating anticlockwise as shown in Figure 38.
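[Figure 38: wheel of radius R centred at O, with the point B on the rim at angle ωt from OA; P and Q are the projections of B onto the x-axis and y-axis]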
Assume that the wheel is rotating with an angular velocity ω radians per second about O so that, in
a time t seconds, a point (x, y) initially at position A on the rim of the wheel moves to a position B
such that angle AOB = ωt radians.
Then the coordinates (x, y) of B are given by
x = OP = R cos ωt
y = OQ = P B = R sin ωt
We know that both the standard sine and cosine functions have period $2\pi$. Since the angular velocity
is $\omega$ radians per second the wheel will make one complete revolution in $\dfrac{2\pi}{\omega}$ seconds.
The time $T = \dfrac{2\pi}{\omega}$ (measured in seconds in this case) for one complete revolution is called the period of
rotation of the wheel. The number of complete revolutions per second is thus $\dfrac{1}{T} = f$, say, which is
called the frequency of revolution. Clearly $f = \dfrac{1}{T} = \dfrac{\omega}{2\pi}$ relates the three quantities
introduced here. The angular velocity $\omega = 2\pi f$ is sometimes called the angular frequency.
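The relations between $\omega$, $T$ and $f$, and the coordinates of the point on the rim, can be written as a few small Python functions. The names used here are hypothetical and the numerical values in the usage lines are arbitrary.

```python
import math

def rim_point(R, omega, t):
    """Coordinates (x, y) of the point on the rim after t seconds."""
    return R * math.cos(omega * t), R * math.sin(omega * t)

def period(omega):
    """Time in seconds for one complete revolution: T = 2*pi / omega."""
    return 2 * math.pi / omega

def frequency(omega):
    """Revolutions per second: f = 1 / T = omega / (2*pi)."""
    return omega / (2 * math.pi)

omega = math.pi                     # arbitrary value: half a revolution per second
print(period(omega), frequency(omega))
print(rim_point(2.0, omega, 0.5))   # a quarter of a revolution from the start
```

After one full period the point returns to its starting position, since `rim_point(R, omega, t + period(omega))` agrees with `rim_point(R, omega, t)` up to rounding.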
One-dimensional motion
The situation we have just outlined is two-dimensional motion. More simply we might consider
one-dimensional motion.
An example is the motion of the projection onto the x-axis of a point B which moves with uniform
angular velocity ω round a circle of radius R (see Figure 39). As B moves round, its projection P
moves to and fro across the diameter of the circle.
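[Figure 39: point B moving round a circle of radius R centred at O, at angle ωt from OA, with P its projection onto the x-axis at distance x from O]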
The position of P is given by
x = R cos ωt (1)
Clearly, from the known properties of the cosine function, we can deduce the following:
1. $x$ varies periodically with $t$ with period $T = \dfrac{2\pi}{\omega}$.
2. x will have maximum value +R and minimum value −R.
(This quantity R is called the amplitude of the motion.)
Task
Using (1) write down the values of x at the following times:
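$t = 0$, $t = \dfrac{\pi}{2\omega}$, $t = \dfrac{\pi}{\omega}$, $t = \dfrac{3\pi}{2\omega}$, $t = \dfrac{2\pi}{\omega}$.
Your solution
Answer
\[ \begin{array}{c|ccccc}
t & 0 & \dfrac{\pi}{2\omega} & \dfrac{\pi}{\omega} & \dfrac{3\pi}{2\omega} & \dfrac{2\pi}{\omega} \\ \hline
x & R & 0 & -R & 0 & R
\end{array} \]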
From (1), this ‘to and fro’ or ‘vibrational’ or ‘oscillatory’ motion between $R$ and $-R$ continues
indefinitely. The technical name for this motion is simple harmonic. To a good approximation it
is the motion exhibited (i) by the end of a pendulum pulled through a small angle and then released
(ii) by the end of a hanging spring pulled down and then released. See Figure 40 (in these cases
damping of the pendulum or spring is ignored).
Figure 40
Task
Using your knowledge of the cosine function and the results of the previous Task
sketch the graph of x against t where
\[ x = R\cos\omega t \quad\text{for } t = 0 \text{ to } t = \frac{4\pi}{\omega} \]
Your solution
Answer
[Graph: x = R cos ωt plotted against t from 0 to 4π/ω, oscillating between R and −R, with one period 2π/ω marked]
This graph shows part of a cosine wave, specifically two periods of oscillation. The shape of the
graph suggests that the term wave is indeed an appropriate description.
We know that the shape of the cosine graph and the sine graph are identical but offset by $\dfrac{\pi}{2}$
radians horizontally. Bearing this in mind, attempt the following Task.
Task
Write the equation of the wave x(t), part of which is shown in the following graph.
You will need to find the period T and angular frequency ω.
[Graph: x plotted against t (secs), oscillating between 5 and −5, with t = 4 and t = 8 marked on the t-axis]
Your solution
Answer
From the shape of the graph we have a sine wave rather than a cosine wave. The amplitude is 5.
The period $T = 4$ s so the angular frequency $\omega = \dfrac{2\pi}{4} = \dfrac{\pi}{2}$. Hence $x = 5\sin\left(\dfrac{\pi t}{2}\right)$.
The quantity x, a function of t, is referred to as the displacement of the wave.
Phase of a wave
We recall that $\cos\left(\theta - \dfrac{\pi}{2}\right) = \sin\theta$ which means that the graph of $x = \sin\theta$ is the same shape
as that of $x = \cos\theta$ but is shifted to the right by $\dfrac{\pi}{2}$.
Suppose now that we consider the waves
\[ x_1 = R\cos 2t \qquad x_2 = R\sin 2t \]
Both have amplitude $R$, angular frequency $\omega = 2$ rad s$^{-1}$. Also
\[ x_2 = R\cos\left(2t - \frac{\pi}{2}\right) = R\cos\left[2\left(t - \frac{\pi}{4}\right)\right] \]
The graphs of $x_1$ against $t$ and of $x_2$ against $t$ are said to have a phase difference of $\dfrac{\pi}{4}$. Specifically
$x_1$ is ahead of, or ‘leads’ $x_2$ by $\dfrac{\pi}{4}$ radians.
More generally, consider the following two sine waves of the same amplitude and frequency:
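\[ x_1(t) = R\sin\omega t \]
\[ x_2(t) = R\sin(\omega t - \alpha) \]
Now
\[ x_1\left(t - \frac{\alpha}{\omega}\right) = R\sin\left[\omega\left(t - \frac{\alpha}{\omega}\right)\right] = R\sin(\omega t - \alpha) = x_2(t) \]
so it is clear that the waves $x_1$ and $x_2$ are out of phase by $\dfrac{\alpha}{\omega}$. Specifically $x_1$ leads $x_2$ by $\dfrac{\alpha}{\omega}$.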
Task
Calculate the phase difference between the waves
\[ x_1 = 3\cos(10\pi t) \]
\[ x_2 = 3\cos\left(10\pi t + \frac{\pi}{4}\right) \]
where the time t is in seconds.
Your solution
Answer
Note firstly that the waves have the same amplitude 3 and angular frequency $10\pi$ (corresponding
to a common period $\dfrac{2\pi}{10\pi} = \dfrac{1}{5}$ s).
Now
\[ \cos\left(10\pi t + \frac{\pi}{4}\right) = \cos\left[10\pi\left(t + \frac{1}{40}\right)\right] \]
so $x_1\left(t + \dfrac{1}{40}\right) = x_2(t)$.
In other words the phase difference is $\dfrac{1}{40}$ s; the wave $x_2$ leads the wave $x_1$ by this amount.
Alternatively we could say that $x_1$ lags $x_2$ by $\dfrac{1}{40}$ s.
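The time shift in this Task is the phase angle divided by the angular frequency, which can be confirmed numerically. The Python sketch below is illustrative only; the helper name `time_shift` and the sample times are assumptions.

```python
import math

def time_shift(phase_angle, omega):
    """Time in seconds corresponding to a phase angle (radians) at angular frequency omega."""
    return phase_angle / omega

omega = 10 * math.pi
shift = time_shift(math.pi / 4, omega)    # equals 1/40 s

def x1(t):
    return 3 * math.cos(omega * t)

def x2(t):
    return 3 * math.cos(omega * t + math.pi / 4)

# x1 evaluated a time `shift` later should reproduce x2 at every sample time.
worst = max(abs(x1(t + shift) - x2(t)) for t in (0.0, 0.013, 0.05, 0.2))
print(shift, worst)
```

The same helper applied to a phase angle $\alpha$ gives the shift $\alpha/\omega$ obtained above for two sine waves of the same frequency.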
Key Point 20
The equations
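\[ x = R\cos\omega t \qquad x = R\sin\omega t \]
both represent waves of amplitude $R$ and period $\dfrac{2\pi}{\omega}$.
The phase difference between these waves is $\dfrac{\pi}{2\omega}$ because $\cos\left\{\omega\left(t - \dfrac{\pi}{2\omega}\right)\right\} = \sin\omega t$.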
Combining two wave equations

A situation that arises in some applications is the need to combine two trigonometric terms such as
A cos θ + B sin θ where A and B are constants.
For example this sort of situation might arise if we wish to combine two waves of the same frequency
but not necessarily the same amplitude and with a phase difference. In particular we wish to be able
to deal with an expression of the form
\[ R_1\cos\omega t + R_2\sin\omega t \]
where the individual waves have, as we have seen, a phase difference of $\dfrac{\pi}{2\omega}$.
Consider an expression A cos θ + B sin θ. We seek to transform this into the single form
C cos(θ − α) (or C sin(θ − α)), where C and α have to be determined. The problem is easily solved
with the aid of trigonometric identities.
We know that
C cos(θ − α) ≡ C(cos θ cos α + sin θ sin α)
Hence if A cos θ + B sin θ = C cos(θ − α) then
A cos θ + B sin θ = (C cos α) cos θ + (C sin α) sin θ
For this to be an identity (true for all values of θ) we must be able to equate the coefficients of cos θ
and sin θ on each side.
Hence
A = C cos α and B = C sin α (2)
Task
By squaring and adding the Equations (2), obtain C in terms of A and B.
Your solution
Answer
$A = C\cos\alpha$ and $B = C\sin\alpha$ gives
\[ A^2 + B^2 = C^2\cos^2\alpha + C^2\sin^2\alpha = C^2(\cos^2\alpha + \sin^2\alpha) = C^2 \]
\[ \therefore\ C = \sqrt{A^2 + B^2} \quad \text{(We take the positive square root.)} \]
Task
By eliminating C from Equations (2) and using the result of the previous Task,
obtain α in terms of A and B.
Your solution
Answer
By division, $\dfrac{B}{A} = \dfrac{C\sin\alpha}{C\cos\alpha} = \tan\alpha$ so α is obtained by solving $\tan\alpha = \dfrac{B}{A}$. However, care must be
taken to obtain the correct quadrant for α.
Key Point 21
If $A\cos\theta + B\sin\theta = C\cos(\theta - \alpha)$ then $C = \sqrt{A^2 + B^2}$ and $\tan\alpha = \dfrac{B}{A}$.
Note that the following cases arise for the location of α:
1. A > 0, B > 0 : 1st quadrant
2. A < 0, B > 0 : 2nd quadrant
3. A < 0, B < 0 : 3rd quadrant
4. A > 0, B < 0 : 4th quadrant
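The quadrant bookkeeping in Key Point 21 can be delegated to a two-argument arctangent. The following is a minimal Python sketch, not part of the original Workbook; the function name `amplitude_phase` is an illustrative choice. It relies on `math.atan2(B, A)`, which uses the signs of both arguments to place the angle in the correct quadrant and returns a value in the interval (−π, π].

```python
import math

def amplitude_phase(A, B):
    """Return (C, alpha) such that A*cos(x) + B*sin(x) == C*cos(x - alpha).

    alpha is in radians, in the interval (-pi, pi].
    """
    C = math.hypot(A, B)          # sqrt(A**2 + B**2), the positive root
    alpha = math.atan2(B, A)      # quadrant-aware solution of tan(alpha) = B/A
    return C, alpha

for A, B in [(3, 3), (-3, 3), (-3, -3), (3, -3)]:
    C, alpha = amplitude_phase(A, B)
    print(A, B, C, math.degrees(alpha))
```

Note that a third-quadrant angle is reported as a negative value (for example −135° rather than 225°); the two differ by a full turn and so describe the same wave.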
In terms of waves, using Key Point 21 we have

$R_1\cos\omega t + R_2\sin\omega t = R\cos(\omega t - \alpha)$
where $R = \sqrt{R_1^2 + R_2^2}$ and $\tan\alpha = \dfrac{R_2}{R_1}$.
The form R cos(ωt − α) is said to be the amplitude/phase form of the wave.
Example 5
Express in the form C cos(θ − α) each of the following:
(a) 3 cos θ + 3 sin θ

(b) −3 cos θ + 3 sin θ
(c) −3 cos θ − 3 sin θ
(d) 3 cos θ − 3 sin θ
Solution
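In each case $C = \sqrt{A^2 + B^2} = \sqrt{9 + 9} = \sqrt{18}$
(a) $\tan\alpha = \dfrac{B}{A} = \dfrac{3}{3} = 1$ gives $\alpha = 45^\circ$ (A and B are both positive so the first quadrant
is the correct one.) Hence $3\cos\theta + 3\sin\theta = \sqrt{18}\cos(\theta - 45^\circ) = \sqrt{18}\cos\left(\theta - \dfrac{\pi}{4}\right)$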
(b) The angle α must be in the second quadrant as A = −3 < 0, B = +3 > 0. By
calculator : tan α = −1 gives α = −45◦ but this is in the 4th quadrant. Remembering
that tan α has period π or 180◦ we must therefore add 180◦ to the calculator value to
obtain the correct α value of 135◦ . Hence
$-3\cos\theta + 3\sin\theta = \sqrt{18}\cos(\theta - 135^\circ)$
(c) Here A = −3, B = −3 so α must be in the 3rd quadrant. $\tan\alpha = \dfrac{-3}{-3} = 1$ giving
$\alpha = 45^\circ$ by calculator. Hence adding $180^\circ$ to this tells us that
$-3\cos\theta - 3\sin\theta = \sqrt{18}\cos(\theta - 225^\circ)$
(d) Here A = 3, B = −3 so α is in the 4th quadrant. $\tan\alpha = -1$ gives us (correctly)
$\alpha = -45^\circ$ so
$3\cos\theta - 3\sin\theta = \sqrt{18}\cos(\theta + 45^\circ)$.
Note that in the amplitude/phase form the angle may be expressed in degrees or radians.
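The four results of Example 5 can be checked by evaluating both sides at many angles. The following is a minimal Python sketch, not part of the original Workbook; the helper `max_error` and the one-degree sampling are illustrative assumptions. For each case it records the largest difference between A cos θ + B sin θ and √18 cos(θ − α), using the angles α quoted in the Solution.

```python
import math

C = math.sqrt(18)

# (A, B, alpha in degrees) for parts (a) to (d) of Example 5
cases = [(3, 3, 45), (-3, 3, 135), (-3, -3, 225), (3, -3, -45)]

def max_error(A, B, alpha_deg, samples=360):
    """Largest |A cos(theta) + B sin(theta) - C cos(theta - alpha)| over sampled angles."""
    alpha = math.radians(alpha_deg)
    worst = 0.0
    for k in range(samples):
        theta = math.radians(k)                       # one-degree steps
        lhs = A * math.cos(theta) + B * math.sin(theta)
        rhs = C * math.cos(theta - alpha)
        worst = max(worst, abs(lhs - rhs))
    return worst

errors = [max_error(A, B, alpha_deg) for A, B, alpha_deg in cases]
print(errors)   # each entry should be at the level of floating-point rounding
```

A numerical check of this kind does not prove the identity, but it quickly exposes a wrong quadrant: replacing 135 by −45 in case (b) gives an error comparable to the amplitude itself.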
Task
Write the wave form x = 3 cos ωt + 4 sin ωt in amplitude/phase form. Express
the phase in radians to 3 d.p.
Your solution
Answer
We have $x = R\cos(\omega t - \alpha)$ where $R = \sqrt{3^2 + 4^2} = 5$ and $\tan\alpha = \dfrac{4}{3}$ from which, using the
calculator in radian mode, α = 0.927 radians. This is in the first quadrant $0 < \alpha < \dfrac{\pi}{2}$ which is
correct since A = 3 and B = 4 are both positive. Hence $x = 5\cos(\omega t - 0.927)$.
Exercises
1. Write down the amplitude and the period of $y = \dfrac{5}{2}\sin 2\pi t$.
2. Write down the amplitude, frequency and phase of
(a) $y = 3\sin\left(2t - \dfrac{\pi}{3}\right)$ (b) $y = 15\cos\left(5t - \dfrac{3\pi}{2}\right)$
3. The current in an a.c. circuit is i(t) = 30 sin 120πt amp where t is measured in seconds.
What is the maximum current and at what times does it occur?
4. The depth y of water at the entrance to a small harbour at time t is $y = a\sin\left\{b\left(t - \dfrac{\pi}{2}\right)\right\} + k$
where k is the average depth. If the tidal period is 12 hours, the depths at high tide and low
tide are 18 metres and 6 metres respectively, obtain a, b, k and sketch two cycles of the graph
of y.
5. The Fahrenheit temperature at a certain location over 1 complete day is modelled by
$F(t) = 60 + 10\sin\left(\dfrac{\pi}{12}(t - 8)\right) \qquad 0 \le t \le 24$
where t is the time in hours after midnight.
(a) What are the temperatures at 8.00 am and 12.00 noon?

(b) At what time is the temperature 60◦ F?
(c) Obtain the maximum and minimum temperatures and the times at which they occur.
6. In each of the following write down expressions for shifted sine and shifted cosine functions
that satisfy the given conditions:
(a) Amplitude 3, Period $\dfrac{2\pi}{3}$, Phase shift $\dfrac{\pi}{3}$
(b) Amplitude 0.7, Period 0.5, Phase shift 4.
7. Write the a.c. current i = 3 cos 5t + 4 sin 5t in the form i = C cos(5t − α).
8. Show that if A cos ωt + B sin ωt = C sin(ωt + α) then
$C = \sqrt{A^2 + B^2}, \quad \cos\alpha = \dfrac{B}{C}, \quad \sin\alpha = \dfrac{A}{C}.$
9. Using Exercise 8 express the following in the amplitude/phase form C sin(ωt + α)
(a) $y = -\sqrt{3}\sin 2t + \cos 2t$ (b) $y = \cos 2t + \sqrt{3}\sin 2t$
10. The motion of a weight on a spring is given by $y = \dfrac{2}{3}\cos 8t - \dfrac{1}{6}\sin 8t$.
Obtain C and α such that y = C sin(8t + α)
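11. Show that for the two a.c. currents
$i_1 = \sin\left(\omega t + \dfrac{\pi}{3}\right)$ and $i_2 = 3\cos\left(\omega t - \dfrac{\pi}{6}\right)$ then $i_1 + i_2 = 4\cos\left(\omega t - \dfrac{\pi}{6}\right)$.
12. Show that the power $P = \dfrac{v^2}{R}$ in an electrical circuit where $v = V_0\cos\left(\omega t + \dfrac{\pi}{4}\right)$ is
$P = \dfrac{V_0^2}{2R}(1 - \sin 2\omega t)$
13. Show that the product of the two signals
$f_1(t) = A_1\sin\omega t \qquad f_2(t) = A_2\sin\{\omega(t + \tau) + \phi\}$ is given by
$f_1(t)f_2(t) = \dfrac{A_1A_2}{2}\{\cos(\omega\tau + \phi) - \cos(2\omega t + \omega\tau + \phi)\}.$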
Answers
1. $y = \dfrac{5}{2}\sin 2\pi t$ has amplitude $\dfrac{5}{2}$. The period is $\dfrac{2\pi}{2\pi} = 1$.
Check: $y(t + 1) = \dfrac{5}{2}\sin(2\pi(t + 1)) = \dfrac{5}{2}\sin(2\pi t + 2\pi) = \dfrac{5}{2}\sin 2\pi t = y(t)$
2. (a) Amplitude 3, Period $\dfrac{2\pi}{2} = \pi$. Writing $y = 3\sin 2\left(t - \dfrac{\pi}{6}\right)$ we see that there is a
phase shift of $\dfrac{\pi}{6}$ radians in this wave compared with $y = 3\sin 2t$.
(b) Amplitude 15, Period $\dfrac{2\pi}{5}$. Clearly $y = 15\cos 5\left(t - \dfrac{3\pi}{10}\right)$ so there is a phase shift of
$\dfrac{3\pi}{10}$ compared with $y = 15\cos 5t$.
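3. Maximum current = 30 amps at a time t such that $120\pi t = \dfrac{\pi}{2}$, i.e. $t = \dfrac{1}{240}$ s.
This maximum will occur again at $\left(\dfrac{1}{240} + \dfrac{n}{60}\right)$ s, n = 1, 2, 3, . . .
4. $y = a\sin\left\{b\left(t - \dfrac{\pi}{2}\right)\right\} + k$. The period is $\dfrac{2\pi}{b} = 12$ hr $\therefore\ b = \dfrac{\pi}{6}$ hr$^{-1}$.
Also since $y_{\max} = a + k$ and $y_{\min} = -a + k$ we have $a + k = 18$ and $-a + k = 6$ so k = 12
m, a = 6 m, i.e. $y = 6\sin\left\{\dfrac{\pi}{6}\left(t - \dfrac{\pi}{2}\right)\right\} + 12$.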
5. $F(t) = 60 + 10\sin\left(\dfrac{\pi}{12}(t - 8)\right) \qquad 0 \le t < 24$
(a) At t = 8 : temp = 60°F. At t = 12: temp = $60 + 10\sin\dfrac{\pi}{3} = 68.7$°F
(b) F(t) = 60 when $\dfrac{\pi}{12}(t - 8) = 0, \pi, 2\pi, \ldots$ giving t − 8 = 0, 12, 24, . . . hours so
t = 8, 20, 32, . . . hours i.e. in 1 day at t = 8 (8.00 am) and t = 20 (8.00 pm)
(c) Maximum temperature is 70°F when $\dfrac{\pi}{12}(t - 8) = \dfrac{\pi}{2}$ i.e. at t = 14 (2.00 pm).
Minimum temperature is 50°F when $\dfrac{\pi}{12}(t - 8) = \dfrac{3\pi}{2}$ i.e. at t = 26, which corresponds to t = 2 (2.00 am) within the 24-hour day.
6. (a) y = 3 sin(3t − π), y = 3 cos(3t − π) (b) y = 0.7 sin(4πt − 16π), y = 0.7 cos(4πt − 16π)
7. $C = \sqrt{3^2 + 4^2} = 5$, $\tan\alpha = \dfrac{4}{3}$ and α must be in the first quadrant (since A = 3, B = 4 are
both positive.) $\therefore\ \alpha = \tan^{-1}\dfrac{4}{3} = 0.9273$ rad $\therefore\ i = 5\cos(5t - 0.9273)$
8. Since sin(ωt + α) = sin ωt cos α + cos ωt sin α then A = C sin α (coefficients of cos ωt),
B = C cos α (coefficients of sin ωt) from which $C^2 = A^2 + B^2$, $\sin\alpha = \dfrac{A}{C}$, $\cos\alpha = \dfrac{B}{C}$
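9. (a) $C = \sqrt{3 + 1} = 2$; $\cos\alpha = -\dfrac{\sqrt{3}}{2}$, $\sin\alpha = \dfrac{1}{2}$ so α is in the second quadrant,
$\alpha = \dfrac{5\pi}{6}$ $\therefore\ y = 2\sin\left(2t + \dfrac{5\pi}{6}\right)$ (b) $y = 2\sin\left(2t + \dfrac{\pi}{6}\right)$
10. $C^2 = \dfrac{4}{9} + \dfrac{1}{36} = \dfrac{17}{36}$ so $C = \dfrac{\sqrt{17}}{6}$, $\cos\alpha = \dfrac{-1/6}{\sqrt{17}/6} = -\dfrac{1}{\sqrt{17}}$, $\sin\alpha = \dfrac{2/3}{\sqrt{17}/6} = \dfrac{4}{\sqrt{17}}$
so α is in the second quadrant. α = 1.8158 radians.
11. Since $\sin x = \cos\left(x - \dfrac{\pi}{2}\right)$, $\sin\left(\omega t + \dfrac{\pi}{3}\right) = \cos\left(\omega t + \dfrac{\pi}{3} - \dfrac{\pi}{2}\right) = \cos\left(\omega t - \dfrac{\pi}{6}\right)$
$\therefore\ i_1 + i_2 = \cos\left(\omega t - \dfrac{\pi}{6}\right) + 3\cos\left(\omega t - \dfrac{\pi}{6}\right) = 4\cos\left(\omega t - \dfrac{\pi}{6}\right)$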
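12. $v = V_0\cos\left(\omega t + \dfrac{\pi}{4}\right) = V_0\left(\cos\omega t\cos\dfrac{\pi}{4} - \sin\omega t\sin\dfrac{\pi}{4}\right) = \dfrac{V_0}{\sqrt{2}}(\cos\omega t - \sin\omega t)$
$\therefore\ v^2 = \dfrac{V_0^2}{2}(\cos^2\omega t + \sin^2\omega t - 2\sin\omega t\cos\omega t) = \dfrac{V_0^2}{2}(1 - \sin 2\omega t)$
and hence $P = \dfrac{v^2}{R} = \dfrac{V_0^2}{2R}(1 - \sin 2\omega t)$.
13. Since the required answer involves the difference of two cosine functions we use the identity
$\cos A - \cos B = 2\sin\left(\dfrac{A + B}{2}\right)\sin\left(\dfrac{B - A}{2}\right)$
Hence with $\dfrac{A + B}{2} = \omega t$, $\dfrac{B - A}{2} = \omega t + \omega\tau + \phi$.
We find, by adding these equations, $B = 2\omega t + \omega\tau + \phi$ and, by subtracting, $A = -\omega\tau - \phi$.
Hence $\sin(\omega t)\sin(\omega t + \omega\tau + \phi) = \dfrac{1}{2}\{\cos(\omega\tau + \phi) - \cos(2\omega t + \omega\tau + \phi)\}$.
(Recall that cos(−x) = cos x.) The required result then follows immediately.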
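The sine form used in Exercises 8 to 10 can be computed in the same way as the cosine form. The following is a minimal Python sketch, not part of the original Workbook; the function name `sine_amplitude_phase` is an illustrative choice. Because Exercise 8 gives sin α = A/C and cos α = B/C, the arguments of `math.atan2` appear in the order (A, B).

```python
import math

def sine_amplitude_phase(A, B):
    """Return (C, alpha) such that A*cos(x) + B*sin(x) == C*sin(x + alpha).

    alpha is in radians, in the interval (-pi, pi].
    """
    C = math.hypot(A, B)
    alpha = math.atan2(A, B)   # sin(alpha) = A/C and cos(alpha) = B/C
    return C, alpha

# Exercise 10: y = (2/3) cos 8t - (1/6) sin 8t
C10, alpha10 = sine_amplitude_phase(2 / 3, -1 / 6)
print(C10, alpha10)

# Exercise 9(a): y = cos 2t - sqrt(3) sin 2t
C9, alpha9 = sine_amplitude_phase(1.0, -math.sqrt(3))
print(C9, alpha9)
```

The values printed can be compared with √17/6 and 1.8158 rad for Exercise 10, and with 2 and 5π/6 for Exercise 9(a).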
Contents 5
Functions and Modelling
5.1 The Modelling Cycle and Functions
5.2 Quadratic Functions and Modelling
5.3 Oscillating Functions and Modelling
5.4 Inverse Square Law Modelling
Learning outcomes
After studying the Workbook and completing associated Tasks and Exercises you should
be able to: list and explain the stages of the modelling cycle; use linear, quadratic and
power law functions in modelling where appropriate.
The Modelling Cycle
and Functions 5.1

Introduction
In this Section we look at the process of modelling with mathematics which is vitally important in
engineering. Knowledge of mathematics is not much use to an engineer unless it can be applied to
engineering problems. After discussing the mathematical modelling process we discuss the use of
linear models.

Prerequisites
Before starting this Section you should . . .
• be competent at algebraic manipulation
• be familiar with linear functions

Learning Outcomes
On completion you should be able to . . .
• understand the basics of the modelling process
• use linear functions to model motion under constant acceleration
• analyse motion under gravity
2 HELM (2006):
Workbook 5: Functions and Modelling
1. Functions and modelling
Engineers use mathematics to a considerable extent. Mathematical techniques offer ways of handling
mathematical models of an engineering problem and coming up with a solution. Of course it is
possible to model a problem in ways that are not mathematical e.g. by physical or scale modelling,
but this Workbook is concerned exclusively with mathematical modelling, so we will drop the word
‘mathematical’ and refer just to modelling. This Section is intended to introduce some modelling
ideas as well as to show applications of the functions and techniques introduced in Workbooks 2,
3 and 6. By modelling we mean the process by which we set up a mathematical model of
a situation or of an assumed situation, use the model to make some predictions and then interpret
the results in the original context. The mathematical techniques themselves contribute only to part
of the modelling procedure. The modelling procedure can be regarded as a cycle. If we do not like
the outcome for some reason we can try again. Five steps of a modelling cycle can be identified as
follows:
Step 1 Specify the purpose of the model.
Step 2 Create the mathematical model after making and stating relevant assumptions.
Step 3 Do the resulting mathematics.
Step 4 Interpret the results.
Step 5 Evaluate the outcome, usually by comparing with reality and/or purpose and, if
necessary, try again!
Much of this first Section is concerned with steps 2 and 3 of the cycle: creating a mathematical
model and doing the maths. Engineering case studies found in many Workbooks will aim to
demonstrate the complete cycle. An important part of step 2 may be choosing an appropriate
function based on the assumptions that are also made as part of this step. This choice will influence the
kind of mathematical activity that is involved in step 3.
So far in your engineering mathematical studies you might have had little opportunity to think about
what is ‘appropriate’, since the type of function to be studied and used has been chosen for you.
Sometimes, however, you may be faced with making appropriate choices of function for yourself so it
is important to have some understanding of what might be appropriate in any given circumstance. A
well chosen function will be appropriate in two different ways. Firstly the function should be consistent
with the purpose of the model, with known data or theory or facts, and with known or assumed
behaviours. For example, the purpose might be to predict the future behaviour of a quantity which
is expected to increase with time. In this case time can be identified as the independent variable
since the quantity depends on time. The function chosen for mathematical activity should be one in
which the value of the dependent variable increases with time. Secondly, bearing in mind that the
modelling process is a cycle and so it is possible, and usual, to go round it more than once, the
first choice of function should be as simple as allowed by the modelling context. The main reason
for doing this is to avoid complication unless it is really necessary. Philosophically, an initial choice of
a simple function is consistent with the fundamental belief that most phenomena may be modelled
adequately by simple laws and theories. It is common engineering practice always to use the simplest
model possible in a given situation. So, for the first trip around the cycle, the appropriate function
should be the simplest that is consistent with known facts, behaviours, theory or data. If the quantity
of interest is known not to be constant, this might be a linear function. If the first choice turns out
to be inadequate at the stage of the cycle where the result is interpreted or the outcome is evaluated
(step 5) then it is reasonable to try something more complicated; a quadratic function might be the
second choice if the first choice was linear.
It is important to realise that sophistication is not necessarily a virtue in itself. The merits of
complication depend upon the purpose for which the model is being formulated. A model of the
weather that enables a decision on whether or not to take an umbrella to work on any particular day
will be rather less sophisticated than that required to give an accurate prediction of the amount of
rainfall in the vicinity of the workplace on that day.
In the next subsection we will look at various types of functions that have been introduced so far
but in a different way, concentrating more on their graphical behaviour and their parameters. As
mentioned earlier, appropriateness is determined by the extent to which the behaviour of the chosen
function reflects the behaviour to be modelled as the independent variable varies. The behaviour of
a function is determined by whether it is linear, non-linear, or periodic and its range of validity. An
important task of this Workbook is to get you to think more and more in modelling terms about the
forms and associated behaviours of functions. We shall also take the opportunity of deriving some
generalities from specific examples.
2. Constant functions
There are two physical interpretations of constancy that are of interest here. A very common form is
constancy with time. Motion under gravity may be modelled as motion with constant acceleration.
By definition, Fixed Rate Mortgages (increasingly popular in the late 1990s) offer a constant rate
of interest over a specified period. In these examples, the constancy will be limited to a certain
time interval. Motion under gravity will only involve the constant acceleration due to the Earth’s
gravitational pull as long as the motion is close to the Earth’s surface. In any case the acceleration
will only be from the time the object is released to the time it stops. Unfortunately, increases in base
interest rates eventually feed into mortgage rates. So mortgage lenders are only able to offer fixed
rates for a certain time. A mathematical statement of these limits is a statement of the range of
validity of the constant function model.
Another type of constancy is constancy in space. Long stretches of Roman roads were built in
a fixed direction. For at least part of their lengths, roads have constant width. In modelling the
formation and movement of seismic waves in the Earth’s crust it is convenient to assume that the
layers from which the Earth’s crust is formed have constant thickness with respect to the Earth’s
surface. In these cases the assumption of constancy will only be valid within certain limits in space.
Example 1
The rate of flow of water from a tap is denoted as r (litres per minute). The time
for which it is turned on is denoted by t (minutes). Suppose that a tap is turned
on and that the rate of water running out of a tap is assumed to be constant at
3 litres per minute and that it is turned off after 10 minutes.
(a) Write down a mathematical statement of the model for the flow from the
tap, including its range of validity.
(b) Sketch a graph of the variation of r with t.
(c) Find an equation for the number of litres of water that have run out of the
tap after t minutes.
(d) Calculate the volume of water that has run out of the tap three minutes
before it is turned off.
Solution
(a) r = 3 (0 ≤ t ≤ 10)
(b) [Figure 1: Flow from tap. Graph of r (litres per min) against t (minutes).]
(c) V = rt = 3t (0 ≤ t ≤ 10), where V litres is the volume of water that has run out after t minutes
(d) The tap will have run for seven minutes; 3 litres per minute × 7 minutes = 21 litres
Note that a more sophisticated model would allow for the variation in flow rate as the tap is turned
on and turned off.
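The constant-flow model and its range of validity can be written as a short program. The following is a minimal Python sketch, not part of the original Workbook; the names `RATE`, `T_OFF`, `flow_rate` and `volume` are illustrative assumptions. The range of validity 0 ≤ t ≤ 10 is enforced explicitly, so that the model refuses to answer outside it.

```python
RATE = 3.0     # litres per minute, assumed constant as in Example 1
T_OFF = 10.0   # minutes; the tap is turned off at this time

def flow_rate(t):
    """Flow rate r (litres per minute) at time t minutes, valid for 0 <= t <= T_OFF."""
    if not 0 <= t <= T_OFF:
        raise ValueError("model is only valid for 0 <= t <= 10 minutes")
    return RATE

def volume(t):
    """Volume V (litres) that has run out after t minutes: V = r * t."""
    return flow_rate(t) * t

print(volume(7.0))   # three minutes before the tap is turned off
```

Raising an error outside the range is one possible design choice; another would be to return the final volume of 30 litres for any t greater than 10.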
3. Linear functions
Workbook 2 has introduced linear functions of the form y = ax + b. Such functions give rise to straight-
line graphs. The coefficient a is the slope. If a is positive the graph of y against x slopes upwards. If
a is negative the graph slopes downwards. The coefficient b gives the intercept on the y-axis. The
terms a and b may be called the parameters of the line. Note that this is a different use of the
term ‘parameter’ than in the parametrisation of functions discussed in Workbook 2.
Linear models for falling rocks

In modelling it is wise to use a notation which fits in with the application. When modelling velocity
under constant acceleration, we shall replace the dependent variable y by v (for velocity), and the
independent variable x by t (for time). The acceleration will be denoted by the symbol a. Consider
the motion of a rock dislodged from the top of a cliff (35 m high) by a villain during the filming of a
thriller. The film producer might be interested in how long the rock would take to fall to the ground
below the cliff and how fast it would be travelling at ground impact. The rock may be assumed
to have a constant downward acceleration of 9.8 m s−2 which is the acceleration due to gravity. The
velocity (v m s−1 ) of a rock, falling from the top of a cliff 35 m high, can be modelled by the equation
v = 9.8t (0 ≤ t ≤ 2.7)
where t is the time in seconds after the rock starts to fall. This follows from the fact that acceleration
is the rate of change of velocity with time. If the acceleration is constant and the object starts from
rest, then the velocity is given simply by the product of acceleration and time. The upper limit for t
is the time at which the rock hits the ground measured with a stop-watch (about 2.7 s in this case).
Figure 2 shows v as a function of t. Velocity is a linearly increasing function of time and its graph is
a straight line passing through t = 0, v = 0. Note that various assumptions are needed to obtain the
quoted result of a linear variation in speed with time: it is assumed that there is no air resistance,
no spinning and no wind.
[Figure 2: Graph of v = 9.8t for the falling rock. Velocity (m s−1) against Time (s).]

In what way should the equation for v be altered if the villain were able to throw the rock downwards
at 5 m s−1 ? Provided we are measuring position or displacement downwards, a downwards velocity
is positive. Now we have that v = 5 when t = 0. So a new model for v is
v = 9.8t + 5 (0 ≤ t ≤ T1 )
Since they are both downwards, the initial velocity simply adds to the velocity at any time resulting
from falling under gravity. Note that T1 is being used now for the upper limit on t (instead of 2.7)
because 2.7 is (approximately) the time taken to fall 35 m from rest rather than with an initial
downwards velocity. (Using the symbol T1 saves us trying to work out its value for the moment!)
Note that a general form of the model for motion under constant acceleration of magnitude a m s−2
given an initial speed b m s−1 is v = at + b. In the model just considered a = 9.8 and b = 5.
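The general model v = at + b is easy to tabulate. The following is a minimal Python sketch, not part of the original Workbook; the function name `velocity` and the chosen sample times are illustrative assumptions. It compares the dislodged rock (b = 0) with the rock thrown downwards at 5 m s−1 (b = 5), with position measured downwards in both cases.

```python
def velocity(t, a=9.8, b=0.0):
    """Velocity v = a*t + b (m/s) at time t (s), with position measured downwards."""
    return a * t + b

for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    dislodged = velocity(t)             # starts from rest
    thrown_down = velocity(t, b=5.0)    # initial downward velocity of 5 m/s
    print(t, dislodged, thrown_down)
```

At every time the two velocities differ by the same 5 m s−1, which is why the second graph is the first one displaced upwards.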
Task
For the above example modelling a falling rock:
(a) Determine whether T1 is more or less than 2.7.
(b) Sketch a graph of v for 0 ≤ t ≤ T1 .
Your solution
Answer
(a) T1 will be less than 2.7 since the rock will be moving faster throughout its descent.
(b) The graph is still a straight line but displaced upwards compared with Figure 2.
[Graph of v = 9.8t + 5 for the falling rock. Velocity (m s−1) against Time (s).]
Consider now how the function for v will change if the villain is even mightier than we previously
thought and throws the rock upwards with an initial speed of 5 m s−1 instead of simply dislodging
it or throwing it downwards. In this circumstance, the initial velocity is directed upwards, and since
position is being measured downwards, the initial velocity is negative. We can use the equation
v = 9.8t + b again. This time v = −5 when t = 0, leading to b = −5 and
v = 9.8t − 5 (0 ≤ t ≤ T2 )
The new time at which the rock hits the ground is denoted by T2 . The rock will rise before falling
to the ground this time so T2 will be larger than T1.
From the modelling point of view, there is one other significant time before the rock hits the ground.
Figure 3 shows the new graph of v against t. Notice that there is a time at which v (which starts at
−5) is zero. What does this mean?
[Figure 3: Graph of v = 9.8t − 5 for the falling rock. Velocity (m s−1) against Time (s).]

As time goes by, the fact that gravity is causing the rock to accelerate downwards means that the
rock’s upward motion will slow. Its velocity will decrease in magnitude until it reaches zero. At
this particular instant the rock will be at its highest point and its velocity will change from upwards
(negative) to downwards (positive) passing instantaneously through zero in the process.
We can calculate this time by substituting zero for v and working out the corresponding t.
$0 = 9.8t - 5$, so $t = \dfrac{5}{9.8} = 0.51$.
This means that the rock is stationary about a half second after being thrown upwards. Subsequently
the rock will fall until it hits the ground. But there is yet one more time that may be significant
in the modelling context chosen here. During its journey to the ground 35 m beneath the cliff-top,
the rock will pass the top of the cliff again. Note that we are modelling the motion of a particular
point, say the lowest point, on the rock. A real rock, with appreciable size, will only pass the top
of the cliff, without landing on it or hitting it, if it is thrown a little forward as well as up. Anyway,
in principle we could use the function that we started with, representing the velocity of an object
falling from rest under gravity, to work out how long the rock will take to pass the top of the cliff
having reached the highest point in its path. A simpler method is to argue that, as long as the rock
is thrown from the cliff top level (this requires the villain to be lying down!), the rock should take
exactly the same time (approximately 0.5 s) to return to the level of the cliff top as it took to rise
above the cliff top to the highest point in its path. So we simply double 0.5 s to deduce that the
rock passes the cliff top again about 1 s after being thrown.
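The two times discussed here follow directly from the linear model. The following is a minimal Python sketch, not part of the original Workbook; the function name `time_of_zero_velocity` is an illustrative choice. It solves 0 = at + b for the instant at which the rock is stationary and then doubles that time, using the symmetry argument given above, to estimate when the rock passes the cliff top again.

```python
def time_of_zero_velocity(a, b):
    """Solve 0 = a*t + b for t (s); a is the acceleration, b the initial velocity."""
    if a == 0:
        raise ValueError("with zero acceleration the velocity never changes")
    return -b / a

t_top = time_of_zero_velocity(9.8, -5.0)   # rock thrown upwards at 5 m/s
t_back = 2 * t_top                          # symmetry: same time to come back down

print(round(t_top, 2), round(t_back, 1))
```

The doubling step relies on the assumption that the rock is thrown from cliff-top level; the program does not check that assumption.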
Task
This Task concerns the falling rock model just discussed.
(a) Add lines to a sketch version of Figure 3 to represent velocity as a function of time if the rock is
(i) dislodged (ii) thrown with velocity 3 m s−1 downwards (iii) thrown with velocity −2 m s−1 :
Your solution
Answer
[Graph of Velocity (m s−1) against Time (s).]
(b) What do you deduce about the effect of the initial velocity on the graph of velocity against time?
Your solution
Answer
The effect of changing the initial velocity (in size or in sign) is simply to displace the straight line
upwards or downwards without changing its slope.
(c) Imagine that the filming was on the Moon with roughly one-sixth the gravitational pull of Earth.
Find a linear function that would describe the velocity of a dislodged rock:
Your solution
Answer
$v = \dfrac{9.8}{6}\,t \approx 1.6t$
(d) What do you deduce about the effect of changing the acceleration due to gravity on the graph
of velocity against time?
Your solution
Answer
The graph of velocity against time is still linear but the change in the acceleration due to gravity
changes the slope.
So, in the context of modelling motion under gravity, the initial velocity determines the vertical
displacement of the line, its intercept on the v-axis, and the acceleration determines the slope.
Again, given the modelling context, both of these influence the range of validity of the model since
they alter the time taken for the rock to reach the ground and this fixes the upper limit on time.
Like velocity, acceleration has direction as well as magnitude. As long as position is being measured
downwards, and only gravity is considered to act, falling objects do not provide any examples of
negative accelerations - but rocket motion does. Where downwards accelerations are represented as
positive, an upwards acceleration will be negative. So a model of the motion of a rocket accelerating
away from the Earth could include a constant negative acceleration. Horizontal acceleration, say
of a road vehicle, in the same direction as position is being measured, is represented as positive.
Deceleration, for example when this vehicle is being braked, implies that velocity is decreasing with
time, and is represented as negative. In mathematical modelling, it is usual to refer to acceleration,
whether it represents positive acceleration or deceleration.
Suppose that we are describing the motion of a rocket taking off vertically during its initial booster
stage of 10 s. We might model the acceleration as a constant −20 m s−2 . The negative sign arises
because downwards is being taken as the positive direction but the acceleration is upwards. Since
the rocket is starting from rest, an appropriate function is
v = −20t (0 ≤ t ≤ 10)
This should describe the variation of its velocity with time until the end of the initial booster stage
of its flight. Figure 4 shows the corresponding graph of velocity against time. Note the way in which
the graph slopes downwards to the right. This function describes an increasingly negative velocity as
time passes, consistent with an increasing upwards velocity. The corresponding graph for a positive
acceleration of the same magnitude would slope upwards towards the right.
[Figure 4: Variation of velocity of rocket during the initial booster stage. v (m s−1) against t (s).]
Task
Imagine that a satellite is falling towards Earth at 5 m s−1 when a booster rocket
is fired for 5 s accelerating it away from the Earth at 10 m s−2 .
(a) Write down a corresponding linear function that would describe its velocity during the booster
stage.
Your solution
Answer
If position is measured downwards, acceleration away from the Earth may be written as −10 m s−2 .
The initial velocity towards the Earth may be denoted by (+)5 m s−1 so v = −10t+5 (0 ≤ t ≤ 5).
If position is measured upwards v = 10t − 5 (0 ≤ t ≤ 5).
(b) Sketch the corresponding graph of velocity against time if position is measured downwards
towards Earth:
Your solution
Answer
[Graph: Satellite velocity (position measured downwards towards Earth). v (m s−1) against t (s).]
(c) Sketch the corresponding graph of velocity against time if position is measured upwards away
from the Earth:
Your solution
Answer
[Graph: Satellite velocity (position measured upwards away from Earth). v (m s−1) against t (s).]
(d) At what time would the velocity of the satellite be zero?
Your solution
Answer
When v is 0, 0 = −10t + 5, so t = 0.5. The satellite has zero velocity towards the Earth after 0.5 s.
(e) What is the value of the velocity at the end of the booster stage?
Your solution
Answer
When t = 5, either v = −10 × 5 + 5 = −45, so the velocity is 45 m s−1 away from the Earth, or,
using the second equation in (a), v = 10 × 5 − 5 = 45, leading to the same conclusion.
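The two sign conventions in this Task can be compared side by side. The following is a minimal Python sketch, not part of the original Workbook; the names `v_down`, `v_up` and `BOOST_END` are illustrative choices. It evaluates both forms of the model from part (a) at the start of the booster stage, at the instant of zero velocity, and at the end of the booster stage.

```python
BOOST_END = 5.0   # s; duration of the booster stage

def v_down(t):
    """Velocity (m/s) with position measured downwards, towards the Earth."""
    return -10.0 * t + 5.0

def v_up(t):
    """Velocity (m/s) with position measured upwards, away from the Earth."""
    return 10.0 * t - 5.0

t_zero = 5.0 / 10.0   # solves 0 = -10 t + 5

for t in (0.0, t_zero, BOOST_END):
    print(t, v_down(t), v_up(t))
```

At every time the two values are equal in magnitude and opposite in sign, so either convention describes the same motion.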
Other contexts for linear models

Linear functions may arise in other contexts. In each of these situations, the slope and intercept
values will have some modelling significance. Indeed the behaviour and hence the suitability of a
linear function, of the form y = ax + b, when modelling any given situation will be determined by
the values of a and b.
Task
During 20 minutes of rain, a cylindrical rain barrel that is initially empty is filled
to a depth of 1.5 cm.
(a) Choose variables to represent the level of water in the barrel and time. Sketch a graph representing
the level of water in the barrel if the intensity of rainfall remains constant over the 20 minute period.
Your solution
Answer
In this answer h cm is used for the level of water measured from the bottom of the barrel and t
minutes for time.
[Graph: Height (depth) of rainwater in a barrel. h (cm) against t (minutes).]
(b) Write down a linear function that represents the level of water in the vessel together with its
range of validity.
Your solution
Answer
The intensity of rainfall is stated to be constant, so the rate at which the barrel fills may be taken as
constant. The gradient of an appropriate linear function relating level of water (h cm) measured from
the bottom of the vessel and time (t minutes) would be $\dfrac{1.5}{20} = 0.075$ and an appropriate
linear function would be h = 0.075t + c. Since the barrel is empty to start with, h = 0 when t =
0, implying that c = 0. So the appropriate linear function and its range of validity are expressed by
h = 0.075t, (0 ≤ t ≤ 20).
(c) State any assumptions that you have made:
Your solution
Answer
It is assumed that the barrel has a uniformly cylindrical cross section, that no water is removed
during the rainfall and there are no holes or leaks up to 1.5 cm depth.
(d) Write down the amended form of your answer to (b), if the vessel contains 2 cm of water initially.
Your solution
Answer
h = 0.075t + 2 (0 ≤ t ≤ 20)
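The rain barrel model, including the amended starting level, can be sketched as a single function. The following is a minimal Python sketch, not part of the original Workbook; the function name `water_level` and its parameter `initial_cm` are illustrative assumptions. The gradient is computed from the stated data (1.5 cm in 20 minutes) and the range of validity is enforced.

```python
def water_level(t, initial_cm=0.0):
    """Water level h (cm) after t minutes of steady rain, valid for 0 <= t <= 20."""
    if not 0 <= t <= 20:
        raise ValueError("model is only valid for 0 <= t <= 20 minutes")
    rate = 1.5 / 20            # cm per minute, the gradient of the line
    return rate * t + initial_cm

print(water_level(10))                  # barrel initially empty
print(water_level(10, initial_cm=2.0))  # barrel initially holding 2 cm of water
```

Changing `initial_cm` changes only the intercept of the line, not its gradient.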
Task
Suppose that you travel often from Nottingham to Milton Keynes which is a
distance of 87 miles almost all of which is along the M1 motorway. Usually it
takes 1.5 hours. Suppose also that, on one occasion, you have agreed to pick
someone up at the Leicester junction (21) of the M1. This is 25 miles from the
start of your journey in Nottingham. If you start your journey at 8 a.m., what time
should you advise for the pick-up?
Your solution
Answer
(Graphical method)
Assume a constant speed for the whole journey. This means that if 87 miles is covered in 1.5 hours,
then half the distance (43.5 miles) is covered in 0.75 hours and so on.
[Graph of distance in miles against time in hours.]
A distance of 25 miles will be covered in 0.43 hours.
The average speed is $\dfrac{87}{1.5} = 58$ mph. This is also the gradient of the graph.
Answer
(Symbolic method)
Let d miles be the distance travelled in time t hours. Then d = 58t. This is valid only for the
duration of the journey (0 ≤ t ≤ 1.5). The equation can be used to find the time at which d = 25.
Now 25 = 58t and so $t = \dfrac{25}{58} = 0.43103448 = 0.43$ (to two decimal places). Either way, given that
0.43 h is about 26 minutes, a possible suggestion is that the passenger should be advised 8.26 a.m.
for the pick-up. But the assumption of constant speed has its limitations. It would be safer to say
“be there by 8.20 a.m. but be prepared to wait perhaps until 8.30 a.m.”
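The symbolic method can be carried through to a clock time. The following is a minimal Python sketch, not part of the original Workbook; the function name `pickup_time` and its parameters are illustrative assumptions. It assumes constant speed over the whole journey, exactly as the model does, and rounds the result to the nearest minute.

```python
def pickup_time(start_hour, total_miles, total_hours, pickup_miles):
    """Return (average speed in mph, time to pick-up in hours, clock time as 'h:mm')."""
    speed = total_miles / total_hours        # gradient of the distance-time line
    t = pickup_miles / speed                 # solve pickup_miles = speed * t
    minutes = round(t * 60)                  # nearest whole minute
    hour = start_hour + minutes // 60
    return speed, t, f"{hour}:{minutes % 60:02d}"

speed, t, clock = pickup_time(8, 87, 1.5, 25)
print(speed, round(t, 2), clock)
```

The clock time is only as reliable as the constant-speed assumption, which is why the answer above suggests allowing a margin on either side.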
Task
A local authority has flood control plans in which the emergency and rescue services
are alerted when the river level rises to critical values. A linear model is used to
estimate the variation of height with time. After a period of continuous heavy rain
the level one day was 1.5 m at 8 a.m. and 1.8 m at 2 p.m.
(a) Use a linear model to write down an equation for estimating the level of the river at different
times of the day:
Your solution
Answer
If the level of water is represented by L m and time by t hours after 8 a.m. then a linear model for
the level as a function of time may be written
L = at + b
where a and b are constants to be found from the other information in the problem. Specifically, it
is stated that L = 1.5 when t = 0 and L = 1.8 when t = 6. The first statement implies that
1.5 = 0 + b or b = 1.5
The second statement implies that
1.8 = 6a + b
or, after substituting for b,
1.8 = 6a + 1.5 or 0.3 = 6a or a = 0.05
So the equation for estimating the level of the river at different times is
L = 0.05t + 1.5
(b) Suggest a suitable range of values of time for which the model could be used:
Your solution
Answer
The model is valid between 8 a.m. and 2 p.m. and, subsequently, only as long as the river level
rises steadily.
(c) What time does the model predict that the level of the river will reach 2 m?
Your solution
Answer
The model will predict a level of 2 m at time t given by
2 = 0.05t + 1.5 or t = 0.5/0.05 = 10
i.e. 10 hours after 8 a.m. which is 6 p.m.
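The answers to parts (a) and (c) can be checked with a few lines of Python. This is only a sketch: the names `river_level` and `time_to_reach` are our own and do not appear in the Workbook, and the constants are those found in part (a).

```python
# Linear model from part (a): L = 0.05t + 1.5, with t in hours after 8 a.m.
A = 0.05   # gradient in metres per hour
B = 1.5    # level in metres at t = 0 (8 a.m.)

def river_level(t):
    """Predicted river level (m) at t hours after 8 a.m."""
    return A * t + B

def time_to_reach(level):
    """Hours after 8 a.m. at which the model predicts the given level (m)."""
    return (level - B) / A

print(round(river_level(6), 2))      # level at 2 p.m., one of the two data points
print(round(time_to_reach(2.0), 2))  # hours after 8 a.m. at which L = 2 m
```

Note that `time_to_reach` simply transposes the formula for t. It says nothing about whether the model is still valid at the time it returns.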
Task
During one winter, the roads in a rural area were completely free from snow when
it started snowing at midnight. It snowed heavily all night and day. By 10 a.m. it
was 19 cm deep.
To save money the local authorities wait until the snow is 30 cm deep before
ploughing the snow away from the roads. Forecast when ploughing should start,
stating any assumptions you have made.
Your solution
Answer
If the depth of snow is represented by D cm and time by t hours after midnight then a linear model
for the depth as a function of time may be written:
D = at + b
where a and b are constants to be found from the other information in the problem or from assump-
tions. As there was no snow at midnight
0=0+b or b=0
It is stated that D = 19 when t = 10, i.e.
19 = 10a or a = 1.9
So the equation for estimating the depth of snow at different times is
D = 1.9t
The model will predict a depth of 30 cm at a time t given by
30 = 1.9t or t = 30/1.9 = 15.789
i.e. 15.789 hours after midnight which is a little after 3.47 p.m.
Assuming that the snow build up is steady e.g. no drifting or change in precipitation, this suggests
that ploughing should start about 3.45 p.m.
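Converting a decimal number of hours into a clock time is an easy place to slip. The following Python sketch shows one way to do it. The helper `clock_time` is a hypothetical name of our own, and it rounds down to the whole minute.

```python
def clock_time(hours_after_midnight):
    """Convert decimal hours after midnight to (hour, minute), rounding down to the minute."""
    total_minutes = int(hours_after_midnight * 60)
    return divmod(total_minutes, 60)

rate = 19 / 10          # cm per hour, from 19 cm of snow after 10 hours
t_plough = 30 / rate    # hours after midnight when the depth reaches 30 cm
hour, minute = clock_time(t_plough)
print(f"{t_plough:.3f} h after midnight, i.e. {hour:02d}:{minute:02d}")
```

With the figures from this Task the sketch reports 15:47, in agreement with the answer above.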
Exercises
1. A cross-channel ferry usually takes 2 hours to make the 40 km crossing from England to
France.
(a) What is the boat’s average speed?

(b) Derive a linear model connecting distance from England and time since leaving port. State
any limitations of the model.
(c) According to this model, when will the boat be 15 km and 35 km from England?
2. During one winter, the roads in the country district were completely free from snow when it
started snowing at 2:30 a.m. and it snowed steadily all day. At 7:30 a.m. it was 14 cm deep.
To save money, the local practice was to wait until the snow was 20 cm deep before ploughing
the roads. Forecast when ploughing would start, stating any assumptions.
3. In a drought, the population of a particular species of water beetle in a pond is observed to have
halved when the volume of water in the pond has fallen by half. Make a simple assumption
about the relationship between the beetle population and the volume of water in the pond and
express this in symbols as an equation. What would your model predict for the population
when the water volume is only one third of what it was originally?
4. A firm produces a specialised instrument and, although it has the facilities to produce 100
instruments per week, it rarely produces more than 50. It is finding it difficult to assess the
cost of producing the instruments and to set realistic prices. The firm’s accountant estimates
that the firm pays out £5000 per week on fixed costs (overheads, salaries etc.) and that the
additional cost of producing each instrument is £50.
(a) Derive and use a linear model for the variation in total costs with the quantity of instru-
ments produced. State any limitation of this model.
(b) What is the model’s prediction for the cost of producing 80 instruments per week?
Answers
1. (a) A distance of 40 km is covered in 2 hours. So the average speed is 40/2 = 20 km h⁻¹.
(b) A linear model assumes that the boat is a point moving at constant speed and will only
be valid for 2 hours (or 40 km) while the boat is travelling from England to France. It does
not allow for variations in speed. If the distance from the English port at any time t hours is
denoted by d km, then d = 20t.
(c) When the boat is 15 km from England 15 = 20t, so t = 15/20 = 0.75, so the boat
is 0.75 hour (45 minutes) from port. When the boat is 35 km from England, 35 = 20t, so
t = 35/20 = 1.75, so the boat is 1.75 hour (1 hour 45 minutes) from port.
2. Assume that there is no snow at 2:30 a.m. and that the rate of accumulation of snow is
constant. Then, if the snow is 14 cm deep at 7:30 a.m., the rate of accumulation is 2.8 cm
per hour. A linear model for the depth (d cm) of snow t hours after 2:30 a.m. is d = 2.8t. d
will be 20 when 20 = 2.8t, i.e. t = 20/2.8 = 7.143. This corresponds to about 9:39 a.m. So
ploughing should start at about 9:40 a.m.
3. Denote population by P and volume of pond by V . Then P is proportional to V so P = kV

where k is a constant of proportionality. When V becomes V /3, then P becomes P/3.
4. (a) Denote the number of instruments made per week by N and the corresponding cost by
£C. Assume that C increases at a constant rate with N (i.e. C is proportional to N ). Then, adding the
fixed costs, a linear model for total costs (£T ) is T = 5000 + 50N . This will be valid only for 0 ≤ N ≤ 100.
If N = 80, then T = 5000 + 50 × 80 = 9000.
(b) The predicted total cost is £9000.
4. Methods for calculating gradient

Occasionally you may be faced with two different pairs of values or coordinates with which to de-
termine the parameters of a linear function. Put another way, two pairs of values are needed to
determine the two (unknown) parameters. Perhaps, unconsciously, you might have used this result
already when carrying out the rain barrel task. The gradient, written as 1.5/20 in the answer, may be
expressed also as (1.5 − 0)/(20 − 0) since the line connects the (time, level of water) coordinates (20, 1.5) with
(0, 0). In general the gradient is given by
gradient = (the change in the dependent variable) / (the corresponding change in the independent variable)
Once the gradient of the line has been calculated, it can be used with one of the known points to
determine the intercept. If one of the points is (0, 0) the intercept is zero.
Suppose that a new type of automatic car is being road tested. The measuring team wants to know
the maximum acceleration between 0 and 30 m s⁻¹. It plans to calculate this by assuming that it
is constant and measuring the time taken from rest to achieve a speed of 30 m s⁻¹ at maximum
acceleration. In their first test the speedometer reading is 30 m s⁻¹ after 12 s from start of timing
and motion. We can think of these values in terms of (time, velocity) coordinates. At the start
of timing the coordinates are (0, 0). When the speedometer reads 30 m s⁻¹ the coordinates are
(12, 30). If the acceleration is constant then its magnitude will be given by the gradient of the line
joining these two points. Using the ‘change in variable’ idea, the gradient is (30 − 0)/(12 − 0) = 2.5, and so the
magnitude of the acceleration is 2.5 m s⁻². The ‘change in variable’ route to calculating the gradient
is an abridged version of a more general method. The two pairs of coordinates may be used with
the general equation of a line to work out the parameters of the particular line that passes through
these two points. The assumption of constant acceleration leads to a linear relationship between
the velocity (v m s⁻¹) and time (t s) of the form v = at + b where a and b are the parameters
corresponding to gradient and intercept respectively. The road test gives v = 0 when t = 0 and
v = 30 when t = 12. These may be substituted into the general form to give
0 = 0 + b and 30 = 12a + b.
You may recognise that these are simultaneous equations. The first gives b = 0 which may be
substituted into the second to give a = 2.5, corresponding to an acceleration of 2.5 m s⁻² as before.
Suppose that the test team carry out a second test. In this test they note when speeds of 15 m s⁻¹
and 27 m s⁻¹ are reached and assume constant acceleration between these times and speeds.
The speedometer reads 15 m s⁻¹ after 4 seconds from the start of motion and 27 m s⁻¹ after 9
s from the start of motion. We apply the general method to the data from this test. The (time,
velocity) coordinates corresponding to the readings are (4, 15) and (9, 27). The equations resulting
from substitutions in the general form are
15 = 4a + b
27 = 9a + b
We use the elimination method of solving these simultaneous equations (see Workbook 3). The first of these
equations may be subtracted from the second to eliminate b.
27 − 15 = 9a + b − 4a − b
or 12 = 5a, so
a = 2.4.
The resulting value of a may be substituted into either of the equations expressing the data to
calculate b. In the first, 15 = 4 × 2.4 + b, so b = 5.4. The resulting model is
v = 2.4t + 5.4 (4 ≤ t ≤ 9).
This model predicts an acceleration of 2.4 m s⁻², which is fairly close to the previous result of
2.5 m s⁻², but if we try to use this model at t = 0, what do we predict? The model predicts that
v = 5.4 when t = 0. This is not consistent with t = 0 being the time at which the vehicle starts to
move! So, even if the acceleration is constant between 15 m s⁻¹ and 27 m s⁻¹, it does not have the same
value between 0 m s⁻¹ and 15 m s⁻¹ or between 27 m s⁻¹ and 30 m s⁻¹. A
more general principle is illustrated by this example. It may be dangerous to use a model based on
certain data at points other than those given by these data! The business of using a model outside
the range of data for which it is known to be valid is called extrapolation. Use of the model between
the data points on which it is based is called interpolation. So the general principle may also be
stated as that it is very risky to extrapolate and it can be risky to interpolate. Nevertheless
extrapolation or interpolation may be part of the purpose for a mathematical model in the first place.
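The danger of extrapolation can be made concrete by comparing the line fitted to the second test with the two readings from the first test, both of which lie outside 4 ≤ t ≤ 9. The following Python sketch does this. The variable and function names are our own choices, not part of the Workbook.

```python
# Second road test: (time in s, velocity in m/s)
t1, v1 = 4, 15
t2, v2 = 9, 27

a = (v2 - v1) / (t2 - t1)   # subtracting the two equations eliminates b
b = v1 - a * t1             # substitute a back into the first equation

def predicted_velocity(t):
    """Velocity (m/s) from the fitted line v = at + b."""
    return a * t + b

# Readings from the first test, both outside the fitted range 4 <= t <= 9
for t, measured in [(0, 0), (12, 30)]:
    error = predicted_velocity(t) - measured
    print(f"t = {t} s: predicted {predicted_velocity(t):.1f}, measured {measured}, error {error:+.1f}")
```

The fitted line overestimates both readings, by 5.4 m s⁻¹ at t = 0 and by 4.2 m s⁻¹ at t = 12, even though it reproduces the two readings it was built from exactly.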
The method of finding gradient and intercept just exemplified may be generalised. Suppose that we
are specifying a linear function y = ax + b where the dependent variable is y and the independent
variable is x. We represent two known points by (p, q) and (r, s). The gradient, a, for the straight
line, may be calculated either from (q − s)/(p − r) or by substituting y = q when x = p and y = s
when x = r in y = ax + b to
obtain two simultaneous equations. Subtraction of these eliminates b and allows a to be calculated.
The intercept of the line on the y-axis, b, may be found by substitution in y = ax + b, of either p, q
and a or r, s and a.
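The general method translates directly into a short function. The sketch below is one possible version in Python. The name `line_through` is hypothetical, and the check for equal x values is our own addition, since the gradient formula breaks down in that case.

```python
def line_through(p, q, r, s):
    """Return (a, b) for the line y = ax + b through (p, q) and (r, s)."""
    if p == r:
        raise ValueError("the two points need different x values")
    a = (q - s) / (p - r)   # change in y divided by the corresponding change in x
    b = q - a * p           # substitute (p, q) and a into y = ax + b
    return a, b

print(line_through(0, 0, 12, 30))    # first road test
print(line_through(4, 15, 9, 27))    # second road test
```

Either known point can be used to find b. Using (r, s) instead of (p, q) in the second step gives the same intercept.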
Task
Use the general method to deduce the different accelerations (assuming that they
are constant) between the start of motion and 15 m s⁻¹ and between velocities of
27 m s⁻¹ and 30 m s⁻¹.
Your solution
Answer
For the (time, velocity) coordinates (0, 0) and (4, 15),
0 = 0a + b
15 = 4a + b
From the first of these b = 0 and hence, in the second, a = 15/4 = 3.75. So the acceleration up to
15 m s⁻¹ is 3.75 m s⁻². For the (time, velocity) coordinates (9, 27) and (12, 30),
27 = 9a + b
30 = 12a + b
Subtracting the first from the second gives
3 = 3a so a = 1,
so the acceleration between 27 m s⁻¹ and 30 m s⁻¹ is 1 m s⁻².
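All of the readings from the two road tests can be treated together by calculating the gradient between each consecutive pair. A minimal Python sketch, with the list name `readings` chosen by us for illustration:

```python
# (time in s, velocity in m/s) readings from the two road tests
readings = [(0, 0), (4, 15), (9, 27), (12, 30)]

# Gradient of the line joining each consecutive pair of readings
accelerations = [
    (v_next - v) / (t_next - t)
    for (t, v), (t_next, v_next) in zip(readings, readings[1:])
]
print(accelerations)   # one value per interval: 0-4 s, 4-9 s, 9-12 s
```

The three gradients are 3.75, 2.4 and 1 m s⁻², which shows the acceleration falling as the speed increases.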
Linear functions may be useful in economics. A lot of attention is paid to the way in which demand
for a product varies with its price. A measure of demand is the number of items sold, if available, in
a given period. For example, the purpose might be to determine the best price for a product given
certain details about costs and with certain assumptions about the way the number of items sold per
month varies with price. The price affects the profit and hence, in turn, the number manufactured
in response to the demand. The number of items manufactured in a given period is known as the
supply. Information about the variation of demand or supply with price may be obtained from market
surveys. Constant functions are not appropriate in this context since both demand and supply vary
with price. In the absence of other information the simplest way to model the variation of either
demand or supply with price is to use a linear function.
Task
When the price of a luxury consumer item is £1000, a market survey reveals that
the demand is 100,000 items per year. However another survey has shown that
at a price of £600, the demand for the item is 200,000 items per year. Assuming
that both surveys are valid, find a linear function that relates demand Q to price
P . What demand would be predicted by the linear function at a price of £750, and at a price of £500?
Comment on the validity of both predictions.
Your solution
Answer
The linear function will be of the form
Q = aP + b (600 ≤ P ≤ 1000)
The limits on P represent the given range of data on price. Substituting the first pair of values of
Q and P :
100000 = 1000a + b
Substituting the second pair of values:
200000 = 600a + b
Subtracting the first expression from the second:
100000 = −400a so a = −250
Note that the negative gradient is consistent with the fact that demand falls as price increases.
Check that the ‘change in variable’ definition for finding a works.
Change in dependent variable (Q) = 200000 − 100000 = 100000.
Corresponding change in independent variable (P ) = 600 − 1000 = −400. The ratio of these
changes is 100000/(−400) = −250.
This value of a may be used with the first pair of values,
100000 = −250000 + b
so
b = 350000
and the linear function relating demand and price is
Q = 350000 − 250P.
[A precautionary check is to make sure that this result is consistent with the other pair of values.
When P = 600, Q = 350000 − 250 × 600 = 350000 − 150000 = 200000, as required.] When P
= 750:
Q = 350000 − 250 × 750 = 350000 − 187500 = 162500.
So a linear relationship between demand and the price for this luxury suggests a demand of 162500
items per year when the price per item is £750. At a price of £500, P = 500, and the model
predicts that
Q = −250 × 500 + 350000 = 225000.
So the linear model suggests a demand of 225,000 items per year when the price per item is £500.
Such a price however is outside the range of given data. Consequently the corresponding demand
prediction represents an extrapolation and this might not be reliable. On the other hand, the
price of £750 lies within the given range of data and the corresponding demand prediction is an
interpolation. If the given data points are close to each other then interpolation between these
points is more reliable than extrapolation to points further away.
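The demand model, together with a simple test of whether a price lies inside the surveyed range, might be sketched in Python as follows. The function names `demand` and `within_data` are assumptions of ours, not notation from the Workbook.

```python
# Survey data: (price in pounds, demand in items per year)
P1, Q1 = 1000, 100_000
P2, Q2 = 600, 200_000

a = (Q2 - Q1) / (P2 - P1)   # gradient: change in demand per pound change in price
b = Q1 - a * P1             # intercept, from the first survey point

def demand(price):
    """Demand predicted by the linear model Q = aP + b."""
    return a * price + b

def within_data(price):
    """True if the price lies inside the surveyed range (interpolation)."""
    return min(P1, P2) <= price <= max(P1, P2)

for price in (750, 500):
    label = "interpolation" if within_data(price) else "extrapolation"
    print(price, demand(price), label)
```

Labelling each prediction in this way is a reminder that the two numbers do not carry the same weight, even though the same formula produced both.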
Quadratic Functions
and Modelling 5.2
Introduction
This Section describes forms of equations for quadratic functions (also called parabolas), ways in
which quadratic functions can be used to model motion involving projectiles, and certain kinds of
problem involving a single maximum or minimum.


Prerequisites
• be familiar with quadratic functions
Learning Outcomes
• use quadratic functions to model motion under constant acceleration
• express the equation of a parabola in a general form
1. Quadratic functions
Quadratic functions and parabolas
Graphs of y against x resulting from quadratic functions (see Section 2.8, Table 1) are called parabolas.
These take the general form: y = ax² + bx + c. The coefficients a, b and c influence the shape, form
and position of the graph of the associated parabola. They are the parameters of the parabola.
In particular the magnitude of a determines how wide the parabola opens (large a implies a narrow
parabola, small a implies a wide parabola) and the sign of a determines whether the parabola has a
lowest point (minimum) or highest point (maximum). Negative a implies a parabola with a highest
point. The most useful form of equation for determining the graphical appearance of a parabola is
y − C = A(x − B)². To see the relation between this form and the general form simply expand:
y = Ax² − 2ABx + AB² + C
so, comparing with y = ax² + bx + c we have:
a ≡ A, b ≡ −2AB, c ≡ AB² + C
We deduce that the relation between the two sets of constants A, B, C and a, b, c is:
A = a, B = −b/(2a) and C = c − b²/(4a)
This new form for the parabola enables the coordinates of the highest or lowest point, known as
the vertex, to be written down immediately. The coordinates of the vertex are given by (B, C).
Changing the value of B shifts the vertex, and hence the whole parabola, to left or right. Changing the
value of C shifts the vertex, and hence the whole parabola, up or down.
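The relations between the two sets of constants can be packaged as a small conversion function. The following Python sketch uses a function name, `vertex_form`, and an example quadratic of our own choosing.

```python
def vertex_form(a, b, c):
    """Convert y = ax^2 + bx + c into (A, B, C) for y - C = A(x - B)^2."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    A = a
    B = -b / (2 * a)
    C = c - b * b / (4 * a)
    return A, B, C

A, B, C = vertex_form(1, -4, 7)   # y = x^2 - 4x + 7, an arbitrary example
print("vertex at", (B, C), "which is a", "minimum" if A > 0 else "maximum")
```

For this example the vertex is at (2, 3), which can be confirmed by expanding (x − 2)² + 3.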
Task
Assume the variation of an object’s location with time is represented by a quadratic
function:
s = t²/9 (0 ≤ t ≤ 30)
Compare this function with the general form y − C = A(x − B)².
(a) What variables correspond to y and x in this case?
(b) What are the values of C, A and B?
Your solution
Answer
(a) s corresponds to y, and t corresponds to x (b) C = 0, A = 1/9 and B = 0
2. Modelling with parabolas
The function
s = t²/9 (0 ≤ t ≤ 30)
is part of a parabola starting at the origin (s = 0 and t = 0) and rising to s = 100 at the end of its
range of validity. s represents the distance of the object from the origin - N.B. Do not confuse this
s with the symbol for seconds. ‘Negative’ time corresponds to time before the motion of the object
is being considered. What would this parabolic function have predicted if it were valid up to 30 s
before the ‘zero’ time? The answer to this can be deduced from the left-hand part of the graph of
the function shown as a dashed curve in Figure 4, i.e. the part corresponding to −30 ≤ t ≤ 0.
Figure 4: Graph of s = t²/9 for −30 ≤ t ≤ 30
The parabolic form predicts that at t = −30, the object was 100 m away and for (−30 ≤ t ≤ 0) it
was moving towards the point at which the original timing started. The rate of change of position,
or instantaneous velocity, is given by the gradient of the position-time graph. Since the gradient of
the parabola for s is steeper near t = −30 than near t = 0, the chosen function for s and new range
of validity suggests that the object was moving quickly at the start of the motion, slows down on
approaching the initial starting point, and then moves away again accelerating as it does so. Note
that the velocity (i.e. the gradient) for (−30 ≤ t ≤ 0) is negative while for (0 ≤ t ≤ 30) it is
positive. This is consistent with the change in direction at t = 0.
We will consider falling objects again and return to the context of the thriller film and the villain on
a cliff-top dislodging a rock. Suppose that, as film director, you are considering a variation of the
plot whereby, instead of the ground, the rock hits the roof of a vehicle carrying the hero and heroine.
This means that you might be interested in the position as well as the velocity of the rock at any
time. We can start from the linear function relating velocity and time for the dislodged rock,
v = 9.8t (0 ≤ t ≤ T )
where T represents the time at which the rock hits the roof of the vehicle. The precise value of T
will depend upon the height of the vehicle. If s is measured from the cliff-top and timing starts with
release of the rock, so that s = 0 when t = 0, the resulting function is
s = 4.9t² (0 ≤ t ≤ T )
(Note that s = 4.9t² is a particular case of a standard model for falling objects: s = ½gt².)
Task
This Task refers to the model discussed above.
(a) What kind of function is s = 4.9t²?

Your solution
Answer
Quadratic, or parabolic
(b) If the vehicle roof is 2 m above the ground and the cliff-top is 35 m above the ground, calculate
a value for T , the time when a rock falling from the cliff-top hits the car roof:
Your solution
Answer
t = T when s = 35 − 2 = 33
⇒ 33 = 4.9T² so T = √(33/4.9) = 2.5951 ≈ 2.6 (only positive T makes sense)
(c) Given this value for T sketch the function:

Your solution
Answer
[Sketch: s (m) against t (s), the part of the parabola s = 4.9t² rising from s = 0 at t = 0 to s = 33 at t ≈ 2.595]
In this modelling context, negative time would correspond to time before the villain dislodges the
rock. It seems likely that the rock was stationary before this instant. The parabolic function would
not be appropriate for t ≤ 0 since it would predict that the rock was moving. An appropriate function
would have two parts to its domain:
For t ≤ 0, s would be constant (= 0) and for 0 ≤ t ≤ T, s = 4.9t².
The corresponding graph would also have two parts:
A flat line along the s = 0 axis for t ≤ 0 and part of a parabola for 0 < t ≤ T .
A different form of quadratic function for position is appropriate if position is measured upwards as
height (h) above the ground below the cliff-top. This is given as
h = 35 − 4.9t² (0 ≤ t ≤ 2.67)
Note that once t = 2.67 (to two decimal places) then h = 0 and the rock cannot fall any further. When position is measured
upwards, velocities and accelerations, which are downwards for falling objects, will be negative.
Task
This Task refers to the model discussed above.
(a) By comparing h = 35 − 4.9t² with y = ax² + bx + c, deduce values for a, b and c and determine
whether the parabola corresponding to this function has a highest or lowest point:
Your solution
Answer
Here h corresponds to y and t to x in the general form. The coefficient corresponding to a is
−4.9, b = 0 and c = 35. The value of a is negative so the parabola opens downwards and has a highest point.
(b) Write down an appropriate function for the variation of h with t if height is measured upwards
from the top of a 2 m high vehicle:
Your solution
Answer
h = 33 − 4.9t² (0 ≤ t ≤ 2.5951), where 2.5951 = 2.60 to 2 d.p.
(c) Sketch this function:
Your solution
Answer
[Sketch: h (m) against t (s), the part of the parabola h = 33 − 4.9t² falling from h = 33 at t = 0 to h = 0 at t ≈ 2.6]
Consider the situation in which position is measured downwards from the cliff-top again but the
villain is lying down on the cliff-top and throws the rock upwards with speed 5 m s⁻¹. The distance
it would travel in time t seconds if gravity were not acting would be −5t metres (distance is speed
multiplied by time but in the negative s direction in this case). To obtain the resulting distance in
the presence of gravity we add this to the distance function s = 4.9t² that applies when the rock is
simply dropped. The appropriate quadratic function for s is now
s = 4.9t² − 5t (0 ≤ t ≤ T )
The nature of this quadratic function means that for any given value of s there are two possible
values of t. If we write the function in a slightly different way, taking out a common factor of t,
s = t(4.9t − 5) (0 ≤ t ≤ T )
it is possible to see that s = 0 at two different times. These are when t = 0 and when 4.9t − 5 = 0.
The first possibility is consistent with the initial position of the rock. The second possibility gives
t = 5/4.9 which is a little more than 1. The rock will be at the cliff-top level at two different times.
It is there at the instant when it is thrown. It rises until its speed is zero and then descends, passing
cliff-top level again on its way to impact with the ground below or with the vehicle roof. Since the
initial motion of the rock is upwards and position is defined as positive downwards, the initial part
of the rock’s path corresponds to negative s. The parabola associated with the appropriate function
crosses the s = 0 axis twice and has a vertex at which s is negative. A sketch of s against t for this
case is shown in Figure 5.
Figure 5: Graph of rock’s position (measured downwards) when rock is thrown upwards
Task
For the above modelling of falling rocks, calculate how high the rock rises after
being thrown upwards at 5 m s⁻¹. (Hint: use the previously determined value of
the time when the rock reaches its highest point.)
Your solution
Answer
The value of t at which the rock’s velocity is zero was worked out as t = 5/9.8. This value can be
used in the function for s to give
s = (5/9.8)(4.9 × 5/9.8 − 5) = −25/19.6 = −1.2755
So the rock rises to a little less than 1.28 m above the cliff-top.
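The key times and positions for the thrown rock can be checked numerically. The Python sketch below assumes, as the text does, that position is measured downwards from the cliff-top and that g = 9.8 m s⁻². The names `position`, `t_top` and `t_return` are ours, and the velocity v = 9.8t − 5 used in the comment is the downward velocity of the dropped rock with the launch velocity added.

```python
G = 9.8    # acceleration due to gravity in m/s^2, as used in the text
U = 5.0    # upward launch speed in m/s

def position(t):
    """Distance below the cliff-top (m) at time t (s); negative means above it."""
    return 0.5 * G * t**2 - U * t

t_top = U / G          # the velocity v = 9.8t - 5 is zero here
t_return = 2 * U / G   # second root of t(4.9t - 5) = 0

print(round(t_top, 4), round(position(t_top), 4))
print(round(t_return, 4), abs(round(position(t_return), 4)))
```

The rock takes the same time to fall back to cliff-top level as it took to rise, so `t_return` is exactly twice `t_top`. This reflects the symmetry of the parabola about its vertex.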
Note that the form of the parabola makes it inevitable that, as long as it is plotted over a sufficiently
wide range, and apart from its vertex, there will always be two values on the curve for each value
of one of the variables. Which of these values makes sense in a mathematical model will depend on
the modelling context. In each of the contexts mentioned so far in this Section each context has
determined the part of the parabola that is of interest.
Note also that there is a connection between the vertex on a parabola and the point where the
gradient of that parabola is zero. In fact these points are the same!
3. Parabolas and optimisation

Because the vertex may represent a highest or lowest point, a quadratic function may be the ap-
propriate type of function to choose in a modelling problem where a maximum or a minimum is
involved (optimisation problems for example). Consider the problem of working out the selling price
for the product of a cottage industry that would maximise the profit, given certain details of costs
and assumptions about market behaviour. A possible function relating profit (£M ) to selling price
(£P ), is
M = −10P² + 320P − 2420 (12 ≤ P ≤ 20).
Note that this is a quadratic function. By comparing this function with the form y = ax² + bx + c it
is possible to decide whether the corresponding parabola that would result from graphing M against
P , would open upwards or downwards. Here M corresponds to y and P to x. The coefficient
corresponding to a in the general form is −10. This is negative, so the resulting parabola will open
downwards. In other words it will have a highest point or maximum for some value of P . This is
comforting in the context of an optimisation problem! We can go further in specifying the resulting
parabola by reference to the other general form: y − C = A(x − B)². If we multiply out the bracket
on the right hand side we get (as seen at the beginning of Section 5.2)
y − C = Ax² − 2ABx + AB²
or
y = Ax² − 2ABx + AB² + C.
Comparing this general form with the function relating profit and price for the cottage industry:
y = Ax² − 2ABx + AB² + C
↓ ↓ ↓
M = −10P² + 320P − 2420
Using the equivalences suggested by the arrows, we see that
A = −10,
2AB = −320,
AB² + C = −2420.
These are three equations for three unknowns. Putting A = −10 in the second equation gives
B = 16. Putting A = −10 and B = 16 in the third equation gives
−2560 + C = −2420,
and so
C = 140.
This means that the equation for M may also be written in the form
M − 140 = −10(P − 16)²,
corresponding to the general form y − C = A(x − B)². In the general form, C corresponds to the
value of y at the vertex of the parabola. Since y in the general form corresponds to M in the current
modelling context, we deduce that M = 140 at the highest point on the parabola. B represents the
value of x at the lowest or highest point of the general parabola. Here x corresponds to P , so we
deduce that P = 16 at the vertex of the parabola corresponding to the function relating profit and
price. These deductions mean that a maximum profit of £140 is obtained when the selling price is
£16.
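An independent check on this result is to evaluate the profit function directly across its range of validity and pick out the largest value. The Python sketch below does this at whole-pound prices only, which is an assumption made to keep the example short. The function name `profit` is ours.

```python
def profit(price):
    """Profit (pounds) for a selling price (pounds), valid for 12 <= price <= 20."""
    return -10 * price**2 + 320 * price - 2420

# Evaluate the model at whole-pound prices across its range of validity
prices = range(12, 21)
best_price = max(prices, key=profit)
print(best_price, profit(best_price))

# The vertex form M - 140 = -10(P - 16)^2 should agree with the expanded form
assert all(profit(p) == 140 - 10 * (p - 16) ** 2 for p in prices)
```

A search of this kind only finds the best of the prices tried. Completing the square, as in the text, locates the vertex exactly.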
4. Finding the equation of a parabola
Consider a parabola that has its vertex at s = 50 when t = 0 and rises to s = 100 when t = 30. In
coordinate terms, we need the equation of a parabola that has its lowest point or vertex at (0, 50)
and passes through (30, 100). The general form
y − C = A(x − B)²
is useful here.
In this case y corresponds to s and x to t. So the equation relating s and t is
s − C = A(t − B)²
According to the general form, the coordinates of the vertex are (B, C). We know that the coordinates
of the vertex are (0, 50). So we can deduce that B = 0 and C = 50. It remains to find A. The fact
that the parabola must pass through (30, 100) may be used for this purpose. These values together
with those for B and C may be substituted in the general equation:
100 − 50 = A(30 − 0)²
so 50 = 900A or A = 1/18 and the function we want is
s = 50 + t²/18 (0 ≤ t ≤ 30)
Task
Find the equation of a parabola with vertex at (0, 2) and passing through the point
(4, 4).
Your solution
Answer
Using the general form, with B = 0 and C = 2,
y − 2 = A(x − 0)² or y − 2 = Ax²
Then using the point (4, 4)
4 − 2 = 16A so A = 2/16 = 1/8
and the required equation is
y = 2 + x²/8
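The procedure used in this subsection, reading B and C from the vertex and then finding A from one further point, can be sketched as a Python function. The name `parabola_from_vertex` is hypothetical.

```python
def parabola_from_vertex(vertex, point):
    """Return (A, B, C) for y - C = A(x - B)^2 with the given vertex, passing through point."""
    B, C = vertex
    x, y = point
    if x == B:
        raise ValueError("the point must not share the vertex's x value")
    A = (y - C) / (x - B) ** 2
    return A, B, C

print(parabola_from_vertex((0, 50), (30, 100)))   # worked example: A = 1/18
print(parabola_from_vertex((0, 2), (4, 4)))       # Task: A = 1/8
```

The guard against a point directly above or below the vertex is our addition. Such a point either is the vertex itself or cannot lie on any parabola of this form.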
Exercise
An open-topped carton is constructed from a 200 mm × 300 mm sheet of cardboard, using simple
folds as shown in the diagram.
[Diagram: a 300 mm × 200 mm cardboard sheet that folds to make an open-topped carton]
(a) Show that the volume of the carton (in cm³) is
V = x(300 − 2x)(200 − 2x)/1000
so V = x³/250 − x² + 60x . . . (*)
(b) Sketch Equation (*) as V vs x and hence estimate the maximum volume of carton that may
be obtained by folding the cardboard sheet.
(c) A carton with a volume of 1000 cm³ is to be made from the cardboard sheet.
(i) Show that one solution is to use a height x = 50 mm.

(ii) By factorisation of Equation (*) for V = 1000 cm³, find a second solution for x which
would give the same carton volume.
(iii) Why does the third root have no physical meaning?
Answer
(a) V = x(300 − 2x)(200 − 2x)/1000 = x³/250 − x² + 60x (cm³)
(b) [Sketch of V against x, with the maximum V ≈ 1056 marked near x = 40]
Vmax ≈ 1056 cm³ when x ≈ 39.2
(c) (i) x = 50 mm ⇒ V = 1000 cm³ as required.
(ii) (x³ − 250x² + 15000x)/250 − 1000 = 0 factorises to
(x − 50)(x² − 200x + 5000) = 0
so x = 50 or x = 100 ± 10√50 ≈ 29.3 or 170.7. The second root is 29.3.
(iii) The third root 170.7 is impossible as 200 − 2x must be a positive distance.
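These answers can be checked numerically. The Python sketch below estimates the maximum by evaluating V on a grid of heights and then finds the remaining two roots with the quadratic formula. The grid spacing of 0.1 mm and the feasibility test 0 < x < 100 are our own choices for illustration.

```python
import math

def volume(x):
    """Carton volume in cm^3 for a fold height of x mm (0 < x < 100)."""
    return x * (300 - 2 * x) * (200 - 2 * x) / 1000

# (b) Estimate the maximum by evaluating V on a fine grid of heights
heights = [i / 10 for i in range(1, 1000)]       # 0.1 mm steps up to 99.9 mm
best = max(heights, key=volume)
print(best, round(volume(best), 1))

# (c) Roots of x^2 - 200x + 5000 = 0, the quadratic factor left after removing (x - 50)
disc = math.sqrt(200**2 - 4 * 5000)
roots = [(200 - disc) / 2, (200 + disc) / 2]
for x in [50] + roots:
    print(round(x, 1), round(volume(x), 1), "feasible" if 0 < x < 100 else "not feasible")
```

All three roots give V = 1000 when substituted into the formula, which is why the physical constraint, and not the algebra, has to rule out the third.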
Oscillating Functions
and Modelling 5.3
Introduction
This Section describes ways in which trigonometric functions can be used to model situations involving
periodic motion, which occur in a wide variety of scientific and engineering situations, and in nature.


Prerequisites
• be familiar with trigonometric functions
Learning Outcomes
• use trigonometric functions to model periodic motion
• define terms associated with the description of periodic motion
1. Oscillating functions: amplitude, period and frequency
Particular types of periodic functions (see Section 2.2) that are especially important in engineering are
the sine and cosine functions. These are possible choices when modelling behaviour that involves
oscillation or motion in a circle. The usefulness of these functions is rather limited if we confine our
attention only to sin(x) and cos(x). Use of functions such as 3sin(2x), 5cos(3x) and so on, and
other functions made up of sums of functions of this type, enables the modelling of a great variety
of situations where the quantity being modelled is known to change in a periodic way. Here we will
examine the behaviour of sine and cosine functions and consider a modelling context where choice
of a sine function is appropriate. Figure 6 shows how the terms amplitude, period and frequency
are defined with respect to a general sinusoid (the name for any general sine or cosine function).
frequency = 1/period
Figure 6: Defining amplitude and period for a sinusoid

The amplitude represents the difference between the maximum (or minimum) value of a sinusoidal
function and its mean value (which is zero in Figure 6). The frequency represents the number of
complete cycles of the function in each unit change in x. The period is such that f (x + T ) = f (x)
for all x, e.g. for sin x, T = 2π.
Example 2
Sketch the sinusoids:
(a) y = sin x (b) y = 2 sin x (c) y = cos x (d) y = cos(x/2)
Solution
Figure 7 [graphs of y = sin x and y = 2 sin x for 0 ≤ x ≤ 40]
Figure 8 [graphs of y = cos x and y = cos(x/2) for 0 ≤ x ≤ 40]
Task
Using the graphs in Figures 7 and 8, state the amplitude, frequency
and period of
(a) sin x (b) 2 sin x (c) cos x (d) cos(x/2)
Give frequency and period in terms of π.
Your solution
Answer
(a) amplitude = 1, frequency = 1/(2π), period = 2π.
(b) amplitude = 2, frequency = 1/(2π), period = 2π.
(c) amplitude = 1, frequency = 1/(2π), period = 2π.
(d) amplitude = 1, frequency = 1/(4π), period = 4π.
See Figure 7 for the sine functions and Figure 8 for the cosine functions.
Note that (b) has twice the amplitude of (a) and (d) has half the frequency and twice the period
of (c).
Note that the cosine functions cos nx have the same shape as the sine functions sin nx but, at x = 0,
the cosine functions have a peak or maximum, whereas the sine functions have the value zero, which
is the mean value for both of these functions. Indeed the graph of y = cos x is exactly like that for
y = sin x with all the x values displaced by π/2.
More general forms of sine and cosine function are given by y = a sin(bx), and y = a cos(bx) where
a and b are arbitrary constants. These are functions with frequency b/(2π), period 2π/b and amplitude
a. Taking a and b positive, the peak values of the sine functions occur where bx equals π/2, 5π/2, 9π/2 etc.
The minimum values occur where bx equals 3π/2, 7π/2, 11π/2 etc.
When the period is measured in seconds, frequency is measured in cycles per second or Hz which has
units of 1/time.
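The relations between a, b and the amplitude, period and frequency can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sinusoid_properties` is our own, and we assume that a negative a or b should be reported through its magnitude.

```python
import math

def sinusoid_properties(a, b):
    """Return (amplitude, period, frequency) for y = a*sin(b*x) or y = a*cos(b*x)."""
    if b == 0:
        raise ValueError("b must be non-zero for the function to oscillate")
    amplitude = abs(a)                   # assumption: use the magnitude if a is negative
    period = 2 * math.pi / abs(b)        # length of one complete cycle
    frequency = abs(b) / (2 * math.pi)   # cycles per unit change in x
    return amplitude, period, frequency

# Three of the sinusoids met in Example 2
for label, a, b in [("sin x", 1, 1), ("2 sin x", 2, 1), ("cos(x/2)", 1, 0.5)]:
    amplitude, period, frequency = sinusoid_properties(a, b)
    print(f"{label}: amplitude {amplitude}, period {period / math.pi:g}*pi, "
          f"frequency {frequency * math.pi:g}/pi")
```

Period and frequency are printed as multiples of π and 1/π so that they can be compared directly with the answers to the Task above.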
Exercises
1. Figure 7 on page 37 shows on the same axes the graphs of y = sin x and y = 2 sin x.
(a) State in words how the graph of y = 2 sin x relates to the graph of y = sin x
(b) Sketch the graphs of (i) y = (1/2) sin x, (ii) y = (1/2) sin x + 1/2
2. Figure 8 on page 37 shows on the same axes the graphs of y = cos x and y = cos(x/2).
(a) State in words how the graph of y = cos x relates to the graph of y = cos(x/2).
(b) Sketch graphs of (i) y = cos 2x, (ii) y = 2 cos x
Answers
1. y = 2 sin x has the same form as y = sin x but all the y values are doubled. The graph is
‘stretched’ vertically.
2. y = cos x has the same form as y = cos(x/2) but all the x values are halved. The graph is
‘shrunk’ horizontally.
2. Oscillating functions: modelling tides

We consider how the function
h = 3.2 sin(2.7t + 8.5)
might be used to model the rise and fall of the tide in a harbour.
Figure 9 shows a graph of this function for 0 ≤ t ≤ 5.
[Figure 9: graph of height against time for 0 ≤ t ≤ 5]
We consider some aspects of this graph and model. It seems reasonable to suppose that the tide
creates an oscillation of the water level in the harbour of h m about some mean value represented on
the graph by h = 0. There seems to be a low tide near t = 1 and another low tide just after t = 3.
Since we expect intervals of 12 to 14 hours between low tides around the U.K., this suggests that
time in this graph is specified in 6-hour intervals.
Task
Write down the amplitude, period and frequency of h = 3.2 sin(2.7t + 8.5)
Your solution
Answer
The amplitude of the change in water level in the harbour is 3.2 m. The period of the function is
given by 2π/2.7 = 2.3271 between successive high tides or successive low tides. This corresponds to
2.3271 × 6 hours = 13.96 hours between high tides. The frequency of the function is 2.7/(2π) = 0.4297.
The peak levels of the graph correspond to times when the sine function has the value 1. The lowest
points correspond to times when the sine function is −1. At these times the arguments of the sine
function (i.e. 2.7t + 8.5) are odd multiples of π/2; at the low tides they are equivalent, up to multiples of 2π, to 3π/2.
So far all of this may be deduced from the general form y = a sin(bx) and from the modelling context.
However there is an additional term in the function being considered here. This is a constant 8.5
within the sine function. When t = 0 the presence of this constant means that the intercept on the
height axis is 3.2 sin(8.5) = 2.56, implying that the water level is 2.56 m above the mean value at
the start of timing. The constant 8.5 has displaced the sine curve sideways. This constant is known
as the phase of the function. Phase is measured in radians as it is an angle.
As remarked earlier, at t = 0, this function has the value 3.2 sin(8.5). Since sin(8.5) = sin(8.5 − 2π) ≈ sin(2.2168), we can replace the constant 8.5 by 2.2168 without altering the values on the
graph. This means that the function
h = 3.2 sin(2.7t + 2.2168)
does just as well as the original function in representing the tidal variation in the harbour. We now
rewrite this latest form of the function, representing the variation of water level in the harbour, so
that time is measured in hours rather than in six-hourly intervals. The effect of changing the units
of time to hours from 6 hours is to decrease the coefficient of t in the sine function by a factor of 6,
so that the new function is
h = 3.2 sin(0.45t + 2.2168). See Figure 10.
[Figure 10: graph of height (metres) against time (hours) for 0 ≤ t ≤ 30]
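A quick numerical check can confirm that reducing the phase by 2π and rescaling the time unit leave the modelled water level unchanged. The sketch below is illustrative only; the function names `h_original` and `h_hours` are ours, not part of the model.

```python
import math

def h_original(t6):
    """Water level (m) with time t6 measured in 6-hour units."""
    return 3.2 * math.sin(2.7 * t6 + 8.5)

def h_hours(t):
    """Water level (m) with time t in hours and the phase reduced by 2*pi."""
    phase = 8.5 - 2 * math.pi            # approximately 2.2168
    return 3.2 * math.sin(0.45 * t + phase)

# One hour is 1/6 of a 6-hour unit, so h_hours(t) should match h_original(t / 6).
for t in [0, 3, 6, 12, 24]:
    print(t, round(h_original(t / 6), 4), round(h_hours(t), 4))
```

The two printed columns should agree at every time, which is what we mean by saying that the rewritten function "does just as well" as the original.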
We can use the latest form of the function to calculate the time of the first low tide assuming that
t = 0 corresponds to midnight.
At the first low tide, h = −3.2 and sin(0.45t + 2.2168) = −1.
Using the fact that sin(3π/2) = −1, we have
0.45t + 2.2168 = 3π/2, giving t = 5.5458 = 5.55 to 2 d.p.
so the first low tide is just before 6 a.m.
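The same calculation can be sketched in Python. The helper `first_time_after_midnight` is a hypothetical name; it solves 0.45t + 2.2168 = (target angle) and, as an added assumption, shifts the answer by whole periods so that it falls in the first cycle after t = 0.

```python
import math

B = 0.45         # coefficient of t (radians per hour)
PHASE = 2.2168   # phase (radians)

def first_time_after_midnight(target_angle):
    """Smallest t >= 0 at which B*t + PHASE equals target_angle up to whole turns."""
    period = 2 * math.pi / B
    return ((target_angle - PHASE) / B) % period

t_low = first_time_after_midnight(3 * math.pi / 2)   # sine equals -1 at 3*pi/2
hours = int(t_low)
minutes = round((t_low - hours) * 60)
print(f"first low tide at t = {t_low:.4f} hours, about {hours}:{minutes:02d} a.m.")
```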
Task
For the above tide modelling situation, assume that t = 0 corresponds to midnight.
Calculate
(a) the time of the first high tide after midnight
(b) the times either side of midnight at which the water is at its mean level.
Your solution
Answer
(a) At the first high tide, h = 3.2 and sin(0.45t + 2.2168) = 1, so 0.45t + 2.2168 = 5π/2 giving
t = 12.5271 so the first high tide is a little before 1 p.m.
(b) When the water level is at the mean value,
sin(0.45t + 2.2168) = 0.
At the mean level before midnight, using the fact that sin(0) = 0 we have
0.45t = −2.2168 so t = −4.9262 = −4.93 to 2 d.p.
So this mean level occurs nearly 5 hours before midnight, i.e. about 7 p.m. the previous day.
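The level next passes through its mean value half a period, or 6.98 hours, later, at t = 2.06, i.e. just after 2 a.m. (It returns to the mean level again one full period, or 13.963 hours, after the first time, at approximately 9 a.m.)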
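These times can be checked numerically. In the sketch below the variable names are ours; the only facts used are that the sine function equals 1 when its argument is 5π/2 and equals 0 when its argument is a whole multiple of π, so successive mean-level crossings are half a period apart.

```python
import math

B, PHASE = 0.45, 2.2168

def h(t):
    """Water level (m) relative to the mean, t in hours after midnight."""
    return 3.2 * math.sin(B * t + PHASE)

t_high = (5 * math.pi / 2 - PHASE) / B    # first peak after midnight
t_mean_before = (0 - PHASE) / B           # argument 0: mean level before midnight
t_mean_after = (math.pi - PHASE) / B      # argument pi: mean level after midnight

print(f"high tide:          t = {t_high:.4f}")
print(f"mean level before:  t = {t_mean_before:.4f}")
print(f"mean level after:   t = {t_mean_after:.4f}")
```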
There are various rules connected with sine and cosine functions that can be summarised at this
point.
(1) Placing a multiplier before sin x or cos x (e.g. 2 sin x) changes the amplitude without changing
the period.
(2) Placing a multiplier before x in sin x or cos x, (e.g. sin 3x), changes the period or frequency
without changing the amplitude.
(3) As with any function, the addition of a constant (e.g. 4 + sin x) raises or lowers the whole
graph of the sine or cosine function. It alters the mean value without changing the amplitude.
(4) Changing the sign within a cosine function has no effect, (e.g. cos(−x) = cos x).
(5) Changing the sign within a sine function changes the sign of the function, (e.g. sin(−x) =
− sin x).
(6) Placing a constant or altering the constant b in sin(ax + b) or cos(ax + b) changes the phase
and shifts the sine or cosine function along the x-axis.
Task
(a) Write down the amplitude and period of y = sin(3x)
(b) Write down the amplitude and frequency of y = 3sin(2x)
(c) Write down the amplitude, period and frequency of y = a sin(bx)
(d) Write down the amplitude, period, frequency and phase of
y = 4 sin(2x + 7).
(e) Write down an equivalent expression to that in (d) but with the phase less
than 2π.
Your solution
Answer
(a) amplitude = 1, period = 2π/3
(b) amplitude = 3, frequency = 2/(2π) = 1/π
(c) amplitude = a, period = 2π/b, frequency = b/(2π)
(d) amplitude = 4, period = 2π/2 = π, frequency = 1/π, phase = 7
(e) y = 4 sin(2x + 7 − 2π) = 4 sin(2x + 0.7168)
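Reducing a phase to an equivalent value below 2π, as in part (e), amounts to taking a remainder on division by 2π. A minimal Python sketch (the name `reduce_phase` is our own):

```python
import math

def reduce_phase(phase):
    """Return the equivalent phase in the interval [0, 2*pi)."""
    return phase % (2 * math.pi)

print(round(reduce_phase(7), 4))     # phase used in part (e)
print(round(reduce_phase(8.5), 4))   # phase used in the tide model
```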
Task
Write down a function relating water level (L m) in a harbour to time (T hours),
starting when the level is equal to the mean level of 5 m, that has an amplitude
of 2 m and has a period of twelve hours.
Your solution
Answer
In the general form y = a sin(bx + c) + d, the phase c = 0, the period 2π/b = 12, so b = π/6, the
amplitude a = 2, the mean value d = 5.
L = 2 sin(πT/6) + 5 (T ≥ 0)
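As a check, the function can be evaluated at a few convenient times. This sketch assumes nothing beyond the formula just obtained; `water_level` is a hypothetical name.

```python
import math

def water_level(T):
    """Water level L (m) at time T (hours), for T >= 0."""
    return 2 * math.sin(math.pi * T / 6) + 5

# Mean level at the start, a peak after a quarter period, a trough after three quarters
for T in [0, 3, 6, 9, 12]:
    print(T, round(water_level(T), 3))
```

The level starts at the mean of 5 m, swings 2 m either side of it, and repeats after 12 hours, as the Task requires.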
Task
The diagram shows a graph of a typical variation of the depth (d metres) of water
in a particular harbour with time (t hours) as the depth changes with the tide.
[Diagram: depth against time t, with the t axis marked at 0, 12.5 and 25]
(a) Find a suitable equation for the curve in the diagram:
Your solution
Answer
Equation is of the form
h = a + b cos(ωt) (or h = a + b sin(ωt + π/2))
By inspection, a = 5 and b = 3.
The period T = 12.5 = 2π/ω so ω = 4π/25 (= 0.502655)
so the equation of the curve is h = 5 + 3 cos(4πt/25)
(b) A boat enters the harbour in late morning on a day when the high tide is at 2 p.m. The boat
needs a water depth of 4 m to sail safely. What advice would you give to its pilot about when to
leave the harbour if the boat is not to be forced to wait in the harbour through the evening low tide?
Your solution
Answer
Put h = 4 into the equation:
4 = 5 + 3 cos(4πt/25) implying −1/3 = cos(4πt/25)
Now, inverting the cosine:
4πt/25 = cos⁻¹(−1/3) = 1.91063 giving t = 3.80108 hours.
Since t is measured from the high tide at 2 p.m., the depth falls to 4 m at about 5:48 p.m.
So the advice to the pilot should be that he needs to be clear of the harbour by 5:45 pm at the very
latest, and that he should allow a safety margin.
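The inversion of the cosine can be reproduced in Python. This is a sketch under the same modelling assumptions as the answer: t = 0 is taken to be the high tide at 2 p.m., and the variable names are ours.

```python
import math

omega = 4 * math.pi / 25     # from the period of 12.5 hours
required_depth = 4.0         # metres

# Solve 5 + 3*cos(omega*t) = required_depth for the first t after high tide (t = 0).
t = math.acos((required_depth - 5) / 3) / omega

high_tide_minutes = 14 * 60                  # 2 p.m. in minutes after midnight
deadline = high_tide_minutes + t * 60        # minutes after midnight
print(f"t = {t:.5f} hours after high tide")
print(f"depth falls to 4 m at about {int(deadline // 60)}:{int(deadline % 60):02d}")
```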
(c) State two modelling assumptions you have made:
Your solution
Answer
Assumptions likely are:
The tide on the day in question is typical.
No waves.
A sinusoidal function accurately models the effect of the tide on sea level.
Inverse Square Law Modelling 5.4
Introduction
This Section describes how functions involving a constant numerator and a squared variable denominator can be used in adding sound energies of different sources.
Prerequisites
• be familiar with polynomial functions
• be able to use Pythagoras’ theorem
• be able to use the formula for solving quadratics
Learning Outcomes
On completion you should be able to . . .
• model inverse square problems
• use a graphical method to solve a quadratic equation
1. Introduction
Many aspects of physics and engineering involve inverse square law dependence. For example, gravitational forces and electrostatic forces vary with the inverse square of distance from the mass or
charge. The following short case study illustrates this and concerns the dependence of sound intensity
on distance from a source.
Sound intensity
Introduction
For a single source of sound power W (watts) the dependence of sound intensity magnitude I (W m⁻²) on distance r (m) from a source is expressed as
$$I = \frac{W}{4\pi r^2}$$
The way in which sounds from different sources are added depends on whether or not there is
a phase relationship between them. There will be a phase relationship between two loudspeakers
connected to the same amplifier. A stereo system will sound best if the loudspeakers are in phase.
The loudspeaker sources are said to be coherent sources. Between such sources there can be
reinforcement or cancellation depending on position. Usually there is no phase relationship between
two separate items of industrial equipment. Such sources are called incoherent. For two such
incoherent sources A and B the combined sound intensity magnitude ($I_C$ W m⁻²) at a specific
point is given by the sum of the magnitudes of the intensities due to each source at that point. So
$$I_C = I_A + I_B = \frac{W_A}{4\pi r_A^2} + \frac{W_B}{4\pi r_B^2}$$
where $W_A$ and $W_B$ are the respective sound powers of the sources; $r_A$ and $r_B$ are the respective
distances from the point of interest. Note that sound intensity is directional. So if A and B are on
opposite sides of the receiver’s position their intensity contributions will have opposite directions.
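The addition of intensity magnitudes for incoherent sources can be sketched as a pair of small Python functions. The names `intensity` and `combined_intensity`, and the receiver distances used in the example, are hypothetical choices made for illustration.

```python
import math

def intensity(power, distance):
    """Intensity magnitude (W m^-2) at `distance` metres from a point source of `power` watts."""
    return power / (4 * math.pi * distance ** 2)

def combined_intensity(sources):
    """Sum of intensity magnitudes for incoherent sources given as (power, distance) pairs."""
    return sum(intensity(power, distance) for power, distance in sources)

# Hypothetical receiver 5 m from a 1.9 W source and 5 m from a 4.1 W source
print(combined_intensity([(1.9, 5.0), (4.1, 5.0)]))

# Inverse square law: doubling the distance quarters the intensity
print(intensity(1.0, 2.0) / intensity(1.0, 1.0))
```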
Problem in words
With reference to the situation shown in Figure 11, given incoherent point sources A and B, with
sound powers 1.9 W and 4.1 W respectively, 6 m apart, find the sound intensity magnitude at points
C and D at distances p and q from the line joining A and B and find the locations of C, D and E
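that correspond to sound intensity magnitudes of 0.02, 0.06 and 0.015 W m⁻² respectively.
[Figure 11: positions of the sources A and B, a distance m apart, and of the points C, D and E, with the distances p, q, m/2, m and n marked]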
(a) Write down an expression for the sound intensity magnitudes at point C due to the independent sources A and B with powers $W_A$ and $W_B$, taking advantage of the symmetry of their locations about the line through C at right-angles to the line joining A and B.
(b) Find the expression for p in terms of $I_C$, $W_A$, $W_B$ and m.
(c) If $W_A$ = 1.9 W, $W_B$ = 4.1 W and m = 6 m, calculate the distance p at which the sound intensity is 0.02 W m⁻².
(d) Find an expression for the intensity magnitude at point D.
(e) Find the value for q such that the intensity magnitude at D is 0.06 W m⁻² and the other values are as in part (c).
(f) Find an equation in powers of n relating $I_E$ (intensity magnitude at point E), $W_A$, $W_B$, n and m.
(g) By plotting this function for $I_E$ = 0.015 W m⁻², m = 6 m, $W_A$ = 1.9 W, $W_B$ = 4.1 W, find the corresponding values for n.
(a) The combined sound intensity magnitude $I_C$ W m⁻² is given by the sum of the intensity magnitudes due to each source at C. Because of symmetry of the position of C with respect to A and B, write $|\overrightarrow{AC}| = |\overrightarrow{BC}| = r$, then
$$I_C = I_A + I_B = \frac{W_A}{4\pi r_A^2} + \frac{W_B}{4\pi r_B^2} = \frac{W_A + W_B}{4\pi r^2}$$
Using Pythagoras’ theorem,
$$r^2 = \left(\frac{m}{2}\right)^2 + p^2 \quad\text{hence}\quad I_C = \frac{W_A + W_B}{4\pi\left((m/2)^2 + p^2\right)} = \frac{W_A + W_B}{\pi(m^2 + 4p^2)}$$
(b) Making p the subject of the last formula,
$$p = \pm\frac{1}{2}\sqrt{\frac{W_A + W_B}{\pi I_C} - m^2}$$
The result that there are two possible values of p is a consequence of the symmetry of the
sound field about the line joining the two sources. The positive value gives the required
location of C above the line joining A and B in Figure 11. The negative value gives a
symmetrical location ‘below’ the line.
Note also that if $0 = \frac{W_A + W_B}{\pi I_C} - m^2$, or $I_C = \frac{W_A + W_B}{\pi m^2}$, then p = 0, i.e. C would
be on the line joining A and B.
(c) Using the given values, p = 3.86 m.
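The value in (c) can be reproduced with a short Python sketch of the formula from (b). The function name `distance_p` is ours, and the error raised when the expression under the square root is negative is an added safeguard rather than part of the original working.

```python
import math

def distance_p(I_C, W_A, W_B, m):
    """Positive solution p of I_C = (W_A + W_B) / (pi * (m**2 + 4 * p**2))."""
    radicand = (W_A + W_B) / (math.pi * I_C) - m ** 2
    if radicand < 0:
        raise ValueError("no real p: the requested intensity is not reached on this line")
    return 0.5 * math.sqrt(radicand)

p = distance_p(I_C=0.02, W_A=1.9, W_B=4.1, m=6.0)
print(round(p, 2))
```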
(d) Using Pythagoras’ theorem again, the distance from A to D is given by $\sqrt{q^2 + m^2}$. So
$$I_D = I_A + I_B = \frac{W_A}{4\pi r_A^2} + \frac{W_B}{4\pi r_B^2} = \frac{W_A}{4\pi(q^2 + m^2)} + \frac{W_B}{4\pi q^2}$$
(e) Multiplying through by $4\pi q^2(q^2 + m^2)$ and collecting together like powers of q produces
a quartic equation,
$$4\pi I_D q^4 + [4\pi m^2 I_D - (W_A + W_B)]q^2 - W_B m^2 = 0.$$
Since the quartic equation contains only even powers of q, it can be regarded as a quadratic
equation in $q^2$ and this can be solved by the standard formula. Hence
$$q^2 = \frac{-[4\pi m^2 I_D - (W_A + W_B)] \pm \sqrt{[4\pi m^2 I_D - (W_A + W_B)]^2 + 16\pi I_D W_B m^2}}{8\pi I_D}$$
Using the given values, $q^2 = \dfrac{-21.14 \pm 29.87}{1.51}$
Since q must be real, the negative value of $q^2$ can be ignored. Hence q ≈ 2.41 m.
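The quadratic in q² can be solved numerically as a check on the arithmetic. The following is a sketch only; `distance_q` is a hypothetical name and the code simply applies the standard quadratic formula to the coefficients of the quartic above.

```python
import math

def distance_q(I_D, W_A, W_B, m):
    """Positive q solving 4*pi*I_D*q^4 + [4*pi*m^2*I_D - (W_A + W_B)]*q^2 - W_B*m^2 = 0."""
    a = 4 * math.pi * I_D
    b = 4 * math.pi * m ** 2 * I_D - (W_A + W_B)
    c = -W_B * m ** 2
    discriminant = b * b - 4 * a * c
    # c is negative, so the discriminant exceeds b**2 and only the '+' root is positive
    q_squared = (-b + math.sqrt(discriminant)) / (2 * a)
    return math.sqrt(q_squared)

q = distance_q(I_D=0.06, W_A=1.9, W_B=4.1, m=6.0)
print(round(q, 3))
```

Substituting the computed q back into the expression for the intensity at D should return 0.06 W m⁻².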
(f) Using the same procedure as in (d) and (e),
$$I_E = I_A + I_B = \frac{W_A}{4\pi r_A^2} + \frac{W_B}{4\pi r_B^2} = \frac{W_A}{4\pi(m + n)^2} + \frac{W_B}{4\pi n^2}$$
Multiplying through by $4\pi n^2(m + n)^2$ gives
$$4\pi n^2(m + n)^2 I_E = W_A n^2 + (m + n)^2 W_B$$
A general expression for the distance n at which the intensity at point E is $I_E$ is given
by collecting like powers of n and is another quartic equation, i.e.
$$4\pi I_E n^4 + 8\pi I_E m n^3 + [4\pi I_E m^2 - (W_A + W_B)]n^2 - 2mW_B n - m^2 W_B = 0$$
Unfortunately this cannot be treated simply as a quadratic equation in $n^2$ since there are
terms in odd powers of n. One way forward is to plot the curve corresponding to the
equation after substituting the given values, another is to use a numerical method such
as Newton-Raphson.
(g) Substitution of the given values produces the equation
$$0.1885n^4 + 2.2619n^3 + 0.7858n^2 - 49.2n - 147.6 = 0.$$
The plot of the quartic equation in Figure 12 shows that there are two roots of interest.
Use of a numerical method for finding the roots of polynomials gives values of the roots
to any desired accuracy i.e. n ≈ 4.876 m and n ≈ −9.628 m.
[Figure 12: plot of f(n) against n for −15 ≤ n ≤ 10, crossing the n axis near n = −9.6 and n = 4.9]
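One simple numerical method for locating the two roots is bisection, sketched below in Python. Bisection is used here in place of Newton-Raphson purely for simplicity; the function names and the bracketing intervals (read off a plot such as Figure 12) are our own assumptions.

```python
import math

def f(n, I_E=0.015, W_A=1.9, W_B=4.1, m=6.0):
    """Left-hand side of the quartic in n from part (f)."""
    k = 4 * math.pi * I_E
    return (k * n ** 4 + 2 * k * m * n ** 3 + (k * m ** 2 - (W_A + W_B)) * n ** 2
            - 2 * m * W_B * n - m ** 2 * W_B)

def bisect(func, lo, hi, tol=1e-9):
    """Find a root of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid    # the sign change is in the right half
    return 0.5 * (lo + hi)

root_right = bisect(f, 0.0, 10.0)     # E to the right of B
root_left = bisect(f, -15.0, -5.0)    # E to the left of A
print(round(root_right, 3), round(root_left, 3))
```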
Interpretation
The result for part (g) implies that there are two locations for E along the line joining the two sources
where the intensity magnitude will have the given value. One position is about 3.6 m to the left of
source A and the other is about 4.9 m to the right of source B.
Contents 6
Exponential and
Logarithmic Functions
6.1 The Exponential Function 2
6.2 The Hyperbolic Functions 11
6.3 Logarithms 19
6.4 The Logarithmic Function 27
6.5 Modelling Exercises 38
6.6 Log-linear Graphs 58
Learning outcomes
In this Workbook you will learn about one of the most important functions in mathematics,
science and engineering - the exponential function. You will learn how to combine
exponential functions to produce other important functions, the hyperbolic functions,
which are related to the trigonometric functions.
You will also learn about logarithms and the logarithmic function which is the function
inverse to the exponential function. Finally you will learn what a log-linear graph is and
how it can be used to simplify the presentation of certain kinds of data.
The Exponential Function 6.1
Introduction
In this Section we revisit the use of exponents. We consider how the expression $a^x$ is defined when a
is a positive number and x is irrational. Previously we have only considered examples in which x is a
rational number. We consider these exponential functions $f(x) = a^x$ in more depth and in particular
consider the special case when the base a is the exponential constant e where:
e = 2.7182818 . . .
We then examine the behaviour of $e^x$ as x → ∞, called exponential growth, and of $e^{-x}$ as x → ∞,
called exponential decay.
Prerequisites
Before starting this Section you should . . .
• have a good knowledge of indices and their laws
• have knowledge of rational and irrational numbers
Learning Outcomes
On completion you should be able to . . .
• approximate $a^x$ when x is irrational
• describe the behaviour of $a^x$: in particular the exponential function $e^x$
• understand the terms exponential growth and exponential decay
1. Exponents revisited
We have seen (in 1.2) the meaning to be assigned to the expression $a^p$ where a is a positive
number. We remind the reader that ‘a’ is called the base and ‘p’ is called the exponent (or power
or index). There are various cases to consider:
If m, n are positive integers
• $a^n = a \times a \times \cdots \times a$ with n terms
• $a^{1/n}$ means the nth root of a. That is, $a^{1/n}$ is that positive number which satisfies
$$(a^{1/n}) \times (a^{1/n}) \times \cdots \times (a^{1/n}) = a$$
where there are n terms on the left hand side.
• $a^{m/n} = (a^{1/n}) \times (a^{1/n}) \times \cdots \times (a^{1/n})$ where there are m terms.
• $a^{-n} = \dfrac{1}{a^n}$
For convenience we again list the basic laws of exponents:
Key Point 1
$$a^m a^n = a^{m+n} \qquad \frac{a^m}{a^n} = a^{m-n} \qquad (a^m)^n = a^{mn}$$
$$a^1 = a, \quad\text{and if } a \neq 0 \quad a^0 = 1$$
Example 1
Simplify the expression $\dfrac{p^{n-2}p^m}{p^3p^{2m}}$
Solution
First we simplify the numerator:
$$p^{n-2}p^m = p^{n-2+m}$$
Next we simplify the denominator:
$$p^3p^{2m} = p^{3+2m}$$
Now we combine them and simplify:
$$\frac{p^{n-2}p^m}{p^3p^{2m}} = \frac{p^{n-2+m}}{p^{3+2m}} = p^{n-2+m}p^{-3-2m} = p^{n-2+m-3-2m} = p^{n-m-5}$$
Task
Simplify the expression $\dfrac{b^{m-n}b^3}{b^{2m}}$
First simplify the numerator:
Your solution
$b^{m-n}b^3 =$
Answer
$b^{m-n}b^3 = b^{m+3-n}$
Now include the denominator:
Your solution
$\dfrac{b^{m-n}b^3}{b^{2m}} = \dfrac{b^{m+3-n}}{b^{2m}} =$
Answer
$\dfrac{b^{m+3-n}}{b^{2m}} = b^{m+3-n-2m} = b^{3-m-n}$
Task
Simplify the expression $\dfrac{(5a^m)^2a^2}{(a^3)^2}$
Simplify the numerator:
Your solution
$(5a^m)^2a^2 =$
Answer
$(5a^m)^2a^2 = 25a^{2m}a^2 = 25a^{2m+2}$
Now include the denominator:
Your solution
$\dfrac{(5a^m)^2a^2}{(a^3)^2} = \dfrac{25a^{2m+2}}{a^6} =$
Answer
$\dfrac{(5a^m)^2a^2}{(a^3)^2} = \dfrac{25a^{2m+2}}{a^6} = 25a^{2m+2-6} = 25a^{2m-4}$
$a^x$ when x is any real number
So far we have given the meaning of $a^p$ where p is an integer or a rational number, that is, one which
can be written as a quotient of integers. So, if p is rational, then
$$p = \frac{m}{n} \quad\text{where m, n are integers}$$
Now consider x as a real number which cannot be written as a rational number. Two common
examples of these irrational numbers are $\sqrt{2}$ and π. What we shall do is approximate x by a
rational number by working to a fixed number of decimal places. For example if
x = 3.14159265 . . .
then, if we are working to 3 d.p. we would write
x ≈ 3.142
and this number can certainly be expressed as a rational number:
$$x \approx 3.142 = \frac{3142}{1000}$$
so, in this case
$$a^x = a^{3.14159\ldots} \approx a^{3.142} = a^{3142/1000}$$
and the final term, $a^{3142/1000}$, can be determined in the usual way by calculator. Henceforth we shall
therefore assume that the expression $a^x$ is defined for all positive values of a and for all real values
of x.
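The idea of approaching $a^x$ through rational exponents can be illustrated in Python. In this sketch the helper `rational_power` and the base a = 2 are hypothetical choices; the exponent π is rounded to a given number of decimal places and expressed as an exact fraction before the power is taken.

```python
from fractions import Fraction
import math

def rational_power(a, x, places):
    """Approximate a**x by rounding x to `places` decimal places and using that rational exponent."""
    exponent = Fraction(str(round(x, places)))   # e.g. 3.142 becomes 1571/500
    return exponent, a ** float(exponent)

a = 2.0   # hypothetical base chosen for illustration
for places in range(1, 7):
    exponent, value = rational_power(a, math.pi, places)
    print(places, exponent, value)
print("direct:", a ** math.pi)
```

As more decimal places are kept, the values settle towards the directly computed power.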
Task
By working to 3 d.p. find, using your calculator, the value of $3^{\pi/2}$.
First, approximate the value of π/2:
Your solution
π/2 ≈          to 3 d.p.
Answer
π/2 ≈ 3.1415927 . . . / 2 = 1.5707963 · · · ≈ 1.571
Now determine $3^{\pi/2}$:
Your solution
$3^{\pi/2} \approx$
Answer
$3^{\pi/2} \approx 3^{1.571} = 5.618$ to 3 d.p.
2. Exponential functions
For a fixed value of the base a the expression $a^x$ clearly varies with the value of x: it is a function of
x. We show in Figure 1 the graphs of $(0.5)^x$, $(0.3)^x$, $1^x$, $2^x$ and $3^x$.
The functions $a^x$ (as different values are chosen for a) are called exponential functions. From the
graphs we see (and these are true for all exponential functions):
If a > b > 0 then
$a^x > b^x$ if x > 0 and $a^x < b^x$ if x < 0
[Figure 1: $y = a^x$ for various values of a, showing $(0.3)^x$, $(0.5)^x$, $1^x$, $2^x$ and $3^x$]
The most important and widely used exponential function has the particular base e = 2.7182818 . . . .
It will not be clear to the reader why this particular value is so important. However, its importance
will become clear as your knowledge of mathematics increases. The number e is as important as the
number π and, like π, is also irrational. The approximate value of e is stored in most calculators.
There are numerous ways of calculating the value of e. For example, it can be shown that the value
of e is the end-point of the sequence of numbers:
$$\left(\frac{2}{1}\right)^1, \left(\frac{3}{2}\right)^2, \left(\frac{4}{3}\right)^3, \ldots, \left(\frac{17}{16}\right)^{16}, \ldots, \left(\frac{65}{64}\right)^{64}, \ldots$$
which, in decimal form (each to 6 d.p.) are
2.000000, 2.250000, 2.370370, ..., 2.637928, . . . , 2.697345, ...
This is a slowly converging sequence. However, it does lead to a precise definition for the value of e:
$$e = \lim_{n\to\infty}\left(\frac{n+1}{n}\right)^n$$
A quicker way of calculating e is to use the (infinite) series:
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots + \frac{1}{n!} + \ldots$$
where, we remember,
n! = n × (n − 1) × (n − 2) × . . . (3) × (2) × (1)
(This is discussed more fully in 16: Sequences and Series.)
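The difference in speed between the two approaches can be seen with a short Python sketch. The function names `e_from_limit` and `e_from_series` are ours; the first evaluates one term of the sequence ((n + 1)/n)^n and the second adds the first few terms of the series.

```python
import math

def e_from_limit(n):
    """The n-th term of the sequence ((n + 1) / n) ** n."""
    return ((n + 1) / n) ** n

def e_from_series(terms):
    """Sum of the first `terms` terms of 1 + 1/1! + 1/2! + ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term /= (k + 1)    # turns 1/k! into 1/(k+1)!
    return total

for n in [1, 2, 3, 16, 64, 10000]:
    print(f"sequence, n = {n:>5}: {e_from_limit(n):.6f}")
for terms in [2, 5, 8, 11]:
    print(f"series, {terms:>2} terms:   {e_from_series(terms):.6f}")
print(f"math.e:             {math.e:.6f}")
```

A handful of terms of the series already agrees with e to several decimal places, whereas the sequence needs a very large n to do as well.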
Although all functions of the form $a^x$ are called exponential functions we usually refer to $e^x$ as the
exponential function.
Key Point 2
$e^x$ is the exponential function where e = 2.71828 . . .
[Figure 2: $y = e^x$]
Exponential functions (and variants) appear in various areas of mathematics and engineering. For
example, the shape of a hanging chain or rope, under the effect of gravity, is well described by a
combination of the exponential curves $e^{kx}$, $e^{-kx}$. The function $e^{-x^2}$ plays a major role in statistics;
it being fundamental in the important normal distribution which describes the variability in many
naturally occurring phenomena. The exponential function $e^{-kx}$ appears directly, again in the area of
statistics, in the Poisson distribution which (amongst other things) is used to predict the number of
events (which occur randomly) in a given time interval.
From now on, when we refer to an exponential function, it will be to the function $e^x$ (Figure 2) that
we mean, unless stated otherwise.
Task
Use a calculator to determine the following values correct to 2 d.p.
(a) $e^{1.5}$, (b) $e^{-2}$, (c) $e^{17}$.
Your solution
(a) $e^{1.5}$ =      (b) $e^{-2}$ =      (c) $e^{17}$ =
Answer
(a) $e^{1.5} = 4.48$, (b) $e^{-2} = 0.14$, (c) $e^{17} = 2.4 \times 10^7$
Task
Simplify the expression $\dfrac{e^{2.7}e^{-3(1.2)}}{e^2}$ and determine its numerical value to 3 d.p.
First simplify the expression:
Your solution
$\dfrac{e^{2.7}e^{-3(1.2)}}{e^2} =$
Answer
$\dfrac{e^{2.7}e^{-3(1.2)}}{e^2} = e^{2.7}e^{-3.6}e^{-2} = e^{2.7-3.6-2} = e^{-2.9}$
Now evaluate its value to 3 d.p.:
Your solution
$e^{-2.9} =$
Answer
0.055
3. Exponential growth
If a > 1 then it can be shown that, no matter how large K is:
$$\frac{a^x}{x^K} \to \infty \quad\text{as}\quad x \to \infty$$
That is, if K is fixed (though chosen as large as desired) then eventually, as x increases, $a^x$ will become
larger than the value $x^K$ provided a > 1. The growth of $a^x$ as x increases is called exponential
growth.
Task
A function f(x) grows exponentially and is such that f(0) = 1 and f(2) = 4.
Find the exponential curve that fits through these points. Assume the function
is $f(x) = e^{kx}$ where k is to be determined from the given information. Find the
value of k.
First, find f(0) and f(2) by substituting in $f(x) = e^{kx}$:
Your solution
When x = 0, $f(0) = e^0 = 1$
When x = 2, f(2) = 4 so $e^{2k} = 4$
By trying values of k: 0.6, 0.7, 0.8, find the value such that $e^{2k} \approx 4$:
Your solution
$e^{2(0.6)}$ =      $e^{2(0.7)}$ =      $e^{2(0.8)}$ =
Answer
$e^{2(0.6)} = 3.32$ (too low) $e^{2(0.7)} = 4.055$ (too high)
Now try values of k: k = 0.67, 0.68, 0.69:
Your solution
$e^{2(0.67)}$ =      $e^{2(0.68)}$ =      $e^{2(0.69)}$ =
Answer
$e^{2(0.67)} = 3.819$ (low) $e^{2(0.68)} = 3.896$ (low) $e^{2(0.69)} = 3.975$ (low)
Next try values of k = 0.691, 0.692, 0.693:
Your solution
$e^{2(0.691)}$ =      $e^{2(0.692)}$ =      $e^{2(0.693)}$ =
Answer
$e^{2(0.691)} = 3.983$ (low) $e^{2(0.692)} = 3.991$ (low) $e^{2(0.693)} = 3.999$ (low)
Finally, state the exponential function with k to 3 d.p. which most closely satisfies the conditions:
Your solution
y =
Answer
The exponential function is $e^{0.693x}$.
We shall meet, in Section 6.4, a much more efficient way of finding the value of k.
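The trial-and-improvement search in this Task can be automated by repeatedly halving an interval known to contain k. The Python sketch below is illustrative only; `find_k` is a hypothetical name, and the starting bracket 0.6 to 0.7 is taken from the first step of the Task.

```python
import math

def find_k(target=4.0, x=2.0, lo=0.6, hi=0.7, tol=1e-6):
    """Trial and improvement (bisection) for k in e^(k*x) = target, given a bracket [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if math.exp(mid * x) < target:
            lo = mid    # too low: k lies in the upper half
        else:
            hi = mid    # too high: k lies in the lower half
    return (lo + hi) / 2

k = find_k()
print(round(k, 3))
```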
4. Exponential decay
As we have noted, the behaviour of $e^x$ as x → ∞ is called exponential growth. In a similar manner
we characterise the behaviour of the function $e^{-x}$ as x → ∞ as exponential decay. The graphs of
$e^x$ and $e^{-x}$ are shown in Figure 3.
[Figure 3: $y = e^x$ and $y = e^{-x}$]
Exponential growth is very rapid and similarly exponential decay is also very rapid. In fact $e^{-x}$ tends
to zero so quickly as x → ∞ that, no matter how large the constant K is,
$$x^K e^{-x} \to 0 \quad\text{as}\quad x \to \infty$$
The next Task investigates this.
Task
Choose K = 10 in the expression $x^K e^{-x}$ and calculate $x^K e^{-x}$ using your calculator
for x = 5, 10, 15, 20, 25, 30, 35.
Your solution
x:               5    10    15    20    25    30    35
$x^{10}e^{-x}$:
Answer
x:               5           10          15          20          25     30   35
$x^{10}e^{-x}$:  6.6 × 10⁴   4.5 × 10⁵   1.8 × 10⁵   2.1 × 10⁴   1324   55   1.7
The topics of exponential growth and decay are explored further in Section 6.5.
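The table in the Task above can be generated with a few lines of Python, which also makes it easy to try other values of K. The name `decay_product` is our own.

```python
import math

def decay_product(x, K=10):
    """Value of x**K * e**(-x)."""
    return x ** K * math.exp(-x)

for x in range(5, 40, 5):
    print(f"{x:>3}  {decay_product(x):.3g}")
```

The product rises at first, because $x^{10}$ dominates for small x, but beyond its maximum the exponential factor takes over and the values fall rapidly towards zero.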
Exercises
1. Find, to 3 d.p., the values of
(a) $2^{-8}$ (b) $(5.1)^4$ (c) $(2/10)^{-3}$ (d) $(0.111)^6$ (e) $2^{1/2}$ (f) $\pi^{\pi}$ (g) $e^{\pi/4}$ (h) $(1.71)^{-1.71}$
2. Plot $y = x^3$ and $y = e^x$ for 0 < x < 7. For which integer values of x is $e^x > x^3$?
Answers
1. (a) 0.004 (b) 676.520 (c) 125 (d) 0.000 (e) 1.414 (f) 36.462 (g) 2.193 (h) 0.400
2. For integer values of x in this range, $e^x > x^3$ if x = 1 or x ≥ 5
The Hyperbolic Functions 6.2
Introduction
The hyperbolic functions sinh x, cosh x, tanh x etc are certain combinations of the exponential
functions $e^x$ and $e^{-x}$. The notation implies a close relationship between these functions and the
trigonometric functions sin x, cos x, tan x etc. The close relationship is algebraic rather than geometrical. For example, the functions cosh x and sinh x satisfy the relation
$$\cosh^2 x - \sinh^2 x \equiv 1$$
which is very similar to the trigonometric identity $\cos^2 x + \sin^2 x \equiv 1$. (In fact every trigonometric
identity has an equivalent hyperbolic function identity.)
The hyperbolic functions are not introduced because they are a mathematical nicety. They arise
naturally and sufficiently often to warrant sustained study. For example, the shape of a chain hanging
under gravity is well described by cosh and the deformation of uniform beams can be expressed in
terms of tanh.
Prerequisites
Before starting this Section you should . . .
• have a good knowledge of the exponential function
• have knowledge of odd and even functions
• have familiarity with the definitions of tan, sec, cosec, cot and of trigonometric identities
Learning Outcomes
• explain how hyperbolic functions are defined in terms of exponential functions
• obtain and use hyperbolic function identities
• manipulate expressions involving hyperbolic functions
1. Even and odd functions
Constructing even and odd functions
A given function f(x) can always be split into two parts, one of which is even and one of which is
odd. To do this write f(x) as $\frac{1}{2}[f(x) + f(x)]$ and then simply add and subtract $\frac{1}{2}f(-x)$ to this to
give
$$f(x) = \frac{1}{2}[f(x) + f(-x)] + \frac{1}{2}[f(x) - f(-x)]$$
The term $\frac{1}{2}[f(x) + f(-x)]$ is even because when x is replaced by −x we have $\frac{1}{2}[f(-x) + f(x)]$
which is the same as the original. However, the term $\frac{1}{2}[f(x) - f(-x)]$ is odd since, on replacing x
by −x we have $\frac{1}{2}[f(-x) - f(x)] = -\frac{1}{2}[f(x) - f(-x)]$ which is the negative of the original.
Example 2
Separate $x^3 + 2^x$ into odd and even parts.
Solution
$f(x) = x^3 + 2^x$
$f(-x) = (-x)^3 + 2^{-x} = -x^3 + 2^{-x}$
Even part:
$$\frac{1}{2}(f(x) + f(-x)) = \frac{1}{2}(x^3 + 2^x - x^3 + 2^{-x}) = \frac{1}{2}(2^x + 2^{-x})$$
Odd part:
$$\frac{1}{2}(f(x) - f(-x)) = \frac{1}{2}(x^3 + 2^x + x^3 - 2^{-x}) = \frac{1}{2}(2x^3 + 2^x - 2^{-x})$$
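The construction of even and odd parts works for any function defined at both x and −x, so it can be written once and reused. The Python sketch below is illustrative; `even_part` and `odd_part` are hypothetical names, applied here to the function of Example 2.

```python
def even_part(f):
    """Return the even part of f, i.e. x -> (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return the odd part of f, i.e. x -> (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

def f(x):
    return x ** 3 + 2 ** x        # the function from Example 2

fe, fo = even_part(f), odd_part(f)
for x in [0.5, 1.0, 2.0]:
    # even part unchanged, odd part negated, and the two parts add back to f
    print(x, fe(x) == fe(-x), fo(x) == -fo(-x), abs(fe(x) + fo(x) - f(x)) < 1e-12)
```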
Task
Separate the function $x^2 - 3^x$ into odd and even parts.
First, define f(x) and find f(−x):
Your solution
f(x) =          f(−x) =
Answer
$f(x) = x^2 - 3^x$, $f(-x) = x^2 - 3^{-x}$
Now construct $\frac{1}{2}[f(x) + f(-x)]$, $\frac{1}{2}[f(x) - f(-x)]$:
Your solution
$\frac{1}{2}[f(x) + f(-x)] =$          $\frac{1}{2}[f(x) - f(-x)] =$
Answer
$$\frac{1}{2}[f(x) + f(-x)] = \frac{1}{2}(x^2 - 3^x + x^2 - 3^{-x}) = x^2 - \frac{1}{2}(3^x + 3^{-x}).$$
This is the even part of f(x).
$$\frac{1}{2}[f(x) - f(-x)] = \frac{1}{2}(x^2 - 3^x - x^2 + 3^{-x}) = \frac{1}{2}(3^{-x} - 3^x).$$
This is the odd part of f(x).
The odd and even parts of the exponential function

Using the approach outlined above we see that the even part of $e^x$ is
$$\frac{1}{2}(e^x + e^{-x})$$
and the odd part of $e^x$ is
$$\frac{1}{2}(e^x - e^{-x})$$
We give these new functions special names: cosh x (pronounced ‘cosh’ x) and sinh x (pronounced
‘shine’ x).
Key Point 3
Hyperbolic Functions
$$\cosh x \equiv \frac{1}{2}(e^x + e^{-x})$$
$$\sinh x \equiv \frac{1}{2}(e^x - e^{-x})$$
These two functions, when added and subtracted, give
$$\cosh x + \sinh x \equiv e^x \quad\text{and}\quad \cosh x - \sinh x \equiv e^{-x}$$
The graphs of cosh x and sinh x are shown in Figure 4.
[Figure 4: sinh x and cosh x, shown together with $e^x$ and $e^{-x}$]

Note that cosh x > 0 for all values of x and that sinh x is zero only when x = 0.
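The definitions of cosh x and sinh x as the even and odd parts of the exponential function can be checked against the library functions `math.cosh` and `math.sinh` in Python. The names `cosh_from_exp` and `sinh_from_exp` are our own.

```python
import math

def cosh_from_exp(x):
    """Even part of e**x."""
    return 0.5 * (math.exp(x) + math.exp(-x))

def sinh_from_exp(x):
    """Odd part of e**x."""
    return 0.5 * (math.exp(x) - math.exp(-x))

for x in [-2.0, 0.0, 0.5, 3.0]:
    print(x,
          cosh_from_exp(x), math.cosh(x),
          sinh_from_exp(x), math.sinh(x),
          cosh_from_exp(x) + sinh_from_exp(x), math.exp(x))
```

Each row should show matching pairs of values, and the final pair illustrates that cosh x + sinh x reproduces $e^x$.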
2. Hyperbolic identities
The hyperbolic functions cosh x, sinh x satisfy similar (but not exactly equivalent) identities to
those satisfied by cos x, sin x. We note first some basic notation similar to that employed with
trigonometric functions:
$\cosh^n x$ means $(\cosh x)^n$, $\sinh^n x$ means $(\sinh x)^n$, n ≠ −1
In the special case that n = −1 we do not use $\cosh^{-1} x$ and $\sinh^{-1} x$ to mean $\dfrac{1}{\cosh x}$ and $\dfrac{1}{\sinh x}$
respectively. The notation $\cosh^{-1} x$ and $\sinh^{-1} x$ is reserved for the inverse functions of cosh x
and sinh x respectively.
Task
Show that $\cosh^2 x - \sinh^2 x \equiv 1$ for all x.
(a) First, express $\cosh^2 x$ in terms of the exponential functions $e^x$, $e^{-x}$:
Your solution
$\cosh^2 x \equiv \left[\frac{1}{2}(e^x + e^{-x})\right]^2 \equiv$
Answer
$$\frac{1}{4}(e^x + e^{-x})^2 \equiv \frac{1}{4}[(e^x)^2 + 2e^xe^{-x} + (e^{-x})^2] \equiv \frac{1}{4}[e^{2x} + 2e^{x-x} + e^{-2x}] \equiv \frac{1}{4}[e^{2x} + 2 + e^{-2x}]$$
(b) Similarly, express $\sinh^2 x$ in terms of $e^x$ and $e^{-x}$:
Your solution
$\sinh^2 x \equiv \left[\frac{1}{2}(e^x - e^{-x})\right]^2 \equiv$
Answer
$$\frac{1}{4}(e^x - e^{-x})^2 \equiv \frac{1}{4}[(e^x)^2 - 2e^xe^{-x} + (e^{-x})^2] \equiv \frac{1}{4}[e^{2x} - 2e^{x-x} + e^{-2x}] \equiv \frac{1}{4}[e^{2x} - 2 + e^{-2x}]$$
(c) Finally determine $\cosh^2 x - \sinh^2 x$ using the results from (a) and (b):
Your solution
$\cosh^2 x - \sinh^2 x \equiv$
Answer
$$\cosh^2 x - \sinh^2 x \equiv \frac{1}{4}[e^{2x} + 2 + e^{-2x}] - \frac{1}{4}[e^{2x} - 2 + e^{-2x}] \equiv 1$$
As an alternative to the calculation in this Task we could, instead, use the relations
$$e^x \equiv \cosh x + \sinh x \qquad e^{-x} \equiv \cosh x - \sinh x$$
and remembering the algebraic identity $(a + b)(a - b) \equiv a^2 - b^2$, we see that
$$(\cosh x + \sinh x)(\cosh x - \sinh x) \equiv e^xe^{-x} \equiv 1 \quad\text{that is}\quad \cosh^2 x - \sinh^2 x \equiv 1$$
Key Point 4
The fundamental identity relating hyperbolic functions is:
$$\cosh^2 x - \sinh^2 x \equiv 1$$
This is the hyperbolic function equivalent of the trigonometric identity: $\cos^2 x + \sin^2 x \equiv 1$
Task
Show that cosh(x + y) ≡ cosh x cosh y + sinh x sinh y.
First, express cosh x cosh y in terms of exponentials:
Your solution
$\cosh x \cosh y \equiv \left(\dfrac{e^x + e^{-x}}{2}\right)\left(\dfrac{e^y + e^{-y}}{2}\right) \equiv$
Answer
$$\left(\frac{e^x + e^{-x}}{2}\right)\left(\frac{e^y + e^{-y}}{2}\right) \equiv \frac{1}{4}[e^xe^y + e^{-x}e^y + e^xe^{-y} + e^{-x}e^{-y}] \equiv \frac{1}{4}(e^{x+y} + e^{-x+y} + e^{x-y} + e^{-x-y})$$
Now express sinh x sinh y in terms of exponentials:
Your solution
$\left(\dfrac{e^x - e^{-x}}{2}\right)\left(\dfrac{e^y - e^{-y}}{2}\right) \equiv$
Answer
$$\left(\frac{e^x - e^{-x}}{2}\right)\left(\frac{e^y - e^{-y}}{2}\right) \equiv \frac{1}{4}(e^{x+y} - e^{-x+y} - e^{x-y} + e^{-x-y})$$
Now express cosh x cosh y + sinh x sinh y in terms of a hyperbolic function:
Your solution
cosh x cosh y + sinh x sinh y =
Answer
$\cosh x \cosh y + \sinh x \sinh y \equiv \frac{1}{2}(e^{x+y} + e^{-(x+y)})$ which we recognise as cosh(x + y)
Other hyperbolic function identities can be found in a similar way. The most commonly used are
listed in the following Key Point.
Key Point 5
Hyperbolic Identities
• $\cosh^2 x - \sinh^2 x \equiv 1$
• cosh(x + y) ≡ cosh x cosh y + sinh x sinh y
• sinh(x + y) ≡ sinh x cosh y + cosh x sinh y
• sinh 2x ≡ 2 sinh x cosh x
• $\cosh 2x \equiv \cosh^2 x + \sinh^2 x$ or $\cosh 2x \equiv 2\cosh^2 x - 1$ or $\cosh 2x \equiv 1 + 2\sinh^2 x$
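A numerical spot check is a useful guard against sign slips when working with these identities. The Python sketch below (with the hypothetical name `check_identities`) evaluates both sides of each identity at chosen values of x and y; agreement at a few points is of course evidence rather than proof.

```python
import math

def check_identities(x, y, tol=1e-9):
    """Compare both sides of the hyperbolic identities at the given x and y."""
    c, s = math.cosh, math.sinh
    checks = {
        "cosh^2 x - sinh^2 x = 1": (c(x) ** 2 - s(x) ** 2, 1.0),
        "cosh(x + y)": (c(x + y), c(x) * c(y) + s(x) * s(y)),
        "sinh(x + y)": (s(x + y), s(x) * c(y) + c(x) * s(y)),
        "sinh 2x": (s(2 * x), 2 * s(x) * c(x)),
        "cosh 2x (first form)": (c(2 * x), c(x) ** 2 + s(x) ** 2),
        "cosh 2x (second form)": (c(2 * x), 2 * c(x) ** 2 - 1),
        "cosh 2x (third form)": (c(2 * x), 1 + 2 * s(x) ** 2),
    }
    return {name: math.isclose(lhs, rhs, rel_tol=tol, abs_tol=tol)
            for name, (lhs, rhs) in checks.items()}

for name, ok in check_identities(0.7, -1.3).items():
    print(f"{name}: {ok}")
```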
3. Related hyperbolic functions

Given the trigonometric functions cos x, sin x related functions can be defined; tan x, sec x, cosec x and cot x
through the relations:
$$\tan x \equiv \frac{\sin x}{\cos x} \qquad \sec x \equiv \frac{1}{\cos x} \qquad \operatorname{cosec} x \equiv \frac{1}{\sin x} \qquad \cot x \equiv \frac{\cos x}{\sin x}$$
In an analogous way, given cosh x and sinh x we can introduce hyperbolic functions tanh x, sech x,
cosech x and coth x. These functions are defined in the following Key Point:
Key Point 6
Further Hyperbolic Functions
$\tanh x \equiv \frac{\sinh x}{\cosh x}$
$\mathrm{sech}\, x \equiv \frac{1}{\cosh x}$
$\mathrm{cosech}\, x \equiv \frac{1}{\sinh x}$
$\coth x \equiv \frac{\cosh x}{\sinh x}$
Task
Show that $1 - \tanh^2 x \equiv \mathrm{sech}^2 x$
Use the identity $\cosh^2 x - \sinh^2 x \equiv 1$:

Your solution
Answer
Dividing both sides by $\cosh^2 x$ gives
$1 - \frac{\sinh^2 x}{\cosh^2 x} \equiv \frac{1}{\cosh^2 x}$ implying (see Key Point 6) $1 - \tanh^2 x \equiv \mathrm{sech}^2 x$
Exercises
1. Express
(a) $2 \sinh x + 3 \cosh x$ in terms of $e^{x}$ and $e^{-x}$.
(b) $2 \sinh 4x - 7 \cosh 4x$ in terms of $e^{4x}$ and $e^{-4x}$.
2. Express
(a) $2e^{x} - e^{-x}$ in terms of sinh x and cosh x.
(b) $\frac{7e^{x}}{e^{x} - e^{-x}}$ in terms of sinh x and cosh x, and then in terms of coth x.
(c) $4e^{-3x} - 3e^{3x}$ in terms of sinh 3x and cosh 3x.
3. Using only the cosh and sinh keys on your calculator (or the $e^{x}$ key) find the values of
(a) tanh 0.35, (b) cosech 2, (c) sech 0.6.
Answers
1. (a) $\frac{5}{2}e^{x} + \frac{1}{2}e^{-x}$ (b) $-\frac{5}{2}e^{4x} - \frac{9}{2}e^{-4x}$
2. (a) $\cosh x + 3 \sinh x$, (b) $\frac{7(\cosh x + \sinh x)}{2 \sinh x}$, $\frac{7}{2}(\coth x + 1)$ (c) $\cosh 3x - 7 \sinh 3x$
3. (a) 0.3364, (b) 0.2757 (c) 0.8436
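The values in Exercise 3 can be reproduced using only the exponential function, mirroring the use of the $e^{x}$ key. This is a small sketch; the helper names sinh_from_exp and cosh_from_exp are introduced here for illustration only.

```python
import math

def sinh_from_exp(x):
    """sinh x built from the exponential definition."""
    return (math.exp(x) - math.exp(-x)) / 2

def cosh_from_exp(x):
    """cosh x built from the exponential definition."""
    return (math.exp(x) + math.exp(-x)) / 2

tanh_035 = sinh_from_exp(0.35) / cosh_from_exp(0.35)   # tanh x = sinh x / cosh x
cosech_2 = 1 / sinh_from_exp(2)                        # cosech x = 1 / sinh x
sech_06 = 1 / cosh_from_exp(0.6)                       # sech x = 1 / cosh x

print(round(tanh_035, 4), round(cosech_2, 4), round(sech_06, 4))
```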

Logarithms 6.3
Introduction
In this Section we introduce the logarithm: $\log_a b$. The operation of taking a logarithm essentially
reverses the operation of raising a number to a power. We will formulate the basic laws satisfied by
all logarithms and learn how to manipulate expressions involving logarithms. We shall see that to
every law of indices there is an equivalent law of logarithms. Although logarithms to any positive
base are defined it is common practice to employ only two kinds of logarithms: logs to base 10 and
logs to base e.

Prerequisites
• have a knowledge of exponents and of the laws of indices

Learning Outcomes
• invert $b = a^n$ using logarithms
• simplify expressions involving logarithms
• change bases in logarithms
1. Logarithms
Logarithms reverse the process of raising a base ‘a’ to a power ‘n’. As with all exponentials, the base
should be a positive number.
If $b = a^n$ then we write $\log_a b = n$.
Of course, the reverse statement is equivalent:
If $\log_a b = n$ then $b = a^n$.
The expression $\log_a b = n$ is read
“The log to base a of the number b is equal to n”
The term “log” is short for the word logarithm.
Example 3
Determine the log equivalents of
(a) $16 = 2^4$, (b) $16 = 4^2$, (c) $1000 = 10^3$,
(d) $134.896 = 10^{2.13}$, (e) $8.414867 = e^{2.13}$
Solution
(a) Since $16 = 2^4$ then $\log_2 16 = 4$
(b) Since $16 = 4^2$ then $\log_4 16 = 2$
(c) Since $1000 = 10^3$ then $\log_{10} 1000 = 3$
(d) Since $134.896 = 10^{2.13}$ then $\log_{10} 134.896 = 2.13$
(e) Since $8.414867 = e^{2.13}$ then $\log_e 8.414867 = 2.13$
Key Point 7
If $b = a^n$ then $\log_a b = n$
If $\log_a b = n$ then $b = a^n$
Task
Find the log equivalent of (a) $100 = 10^2$ (b) $\frac{1}{1000} = 10^{-3}$
Here, on the right-hand sides, the base is 10 in each case so:

Your solution
(a) $100 = 10^2$ implies
(b) $\frac{1}{1000} = 10^{-3}$ implies
Answer
(a) $\log_{10} 100 = 2$
(b) $\log_{10} \frac{1}{1000} = -3$
Task
Find the log equivalent of (a) $b = a^n$, (b) $c = a^m$, (c) $bc = a^{n+m}$
(a) Here the base is a so:

Your solution
$b = a^n$ implies n =
Answer
$n = \log_a b$
(b) Here the base is a so:

Your solution
$c = a^m$ implies m =
Answer
$m = \log_a c$
(c) Here the base is a so:

Your solution
$bc = a^{n+m}$ implies n + m =
Answer
$n + m = \log_a (bc)$
From the last Task we have found, using the property of indices, that
$\log_a (bc) = n + m = \log_a b + \log_a c$.
We conclude that the index law $a^n a^m = a^{n+m}$ has an equivalent logarithm law
$\log_a (bc) = \log_a b + \log_a c$
In words: “The log of a product is the sum of logs.”
Indeed this property is one of the major advantages of using logarithms. They transform a product
of numbers (a relatively difficult operation) to a sum of numbers (a relatively easy operation).
Each index law has an equivalent logarithm law, true for any base, listed in the following Key Point:
Key Point 8
The laws of logarithms                          The laws of indices
1. $\log_a (AB) = \log_a A + \log_a B$          1. $a^A a^B = a^{A+B}$
2. $\log_a (\frac{A}{B}) = \log_a A - \log_a B$  2. $a^A / a^B = a^{A-B}$
3. $\log_a (A^k) = k \log_a A$                  3. $(a^A)^k = a^{kA}$
4. $\log_a (a^A) = A$                           4. $a^{\log_a A} = A$
5. $\log_a a = 1$                               5. $a^1 = a$
6. $\log_a 1 = 0$                               6. $a^0 = 1$
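The six laws of logarithms can be checked numerically for a particular base. The sketch below assumes the arbitrary illustrative values a = 3, A = 5, B = 7 and k = 2.5; the helper log_a is defined here and is not part of the Workbook.

```python
import math

a, A, B, k = 3.0, 5.0, 7.0, 2.5   # arbitrary positive test values (assumed)

def log_a(v):
    """Logarithm of v to the base a."""
    return math.log(v, a)

checks = [
    math.isclose(log_a(A * B), log_a(A) + log_a(B)),   # law 1: log of a product
    math.isclose(log_a(A / B), log_a(A) - log_a(B)),   # law 2: log of a quotient
    math.isclose(log_a(A ** k), k * log_a(A)),         # law 3: log of a power
    math.isclose(log_a(a ** A), A),                    # law 4: log undoes the power
    math.isclose(log_a(a), 1.0),                       # law 5
    math.isclose(log_a(1.0), 0.0, abs_tol=1e-12),      # law 6
]
laws_hold = all(checks)
print(laws_hold)
```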
2. Simplifying expressions involving logarithms

To simplify an expression involving logarithms their laws, given in Key Point 8, need to be used.
Example 4
Simplify: $\log_{10} 2 - \log_{10} 4 + \log_{10} (4^2) + \log_{10} (\frac{10}{4})$
Solution
The third term $\log_{10} (4^2)$ simplifies to $2 \log_{10} 4$ and the last term
$\log_{10} (\frac{10}{4}) = \log_{10} 10 - \log_{10} 4 = 1 - \log_{10} 4$
So $\log_{10} 2 - \log_{10} 4 + \log_{10} (4^2) + \log_{10} (\frac{10}{4}) = \log_{10} 2 - \log_{10} 4 + 2 \log_{10} 4 + 1 - \log_{10} 4 = \log_{10} 2 + 1$
Task
Simplify the expression:
$\log_{10} (\frac{1}{10}) - \log_{10} (\frac{10}{27}) + \log_{10} 1000$
(a) First simplify $\log_{10} (\frac{1}{10})$:
Your solution
$\log_{10} (\frac{1}{10}) =$
Answer
$\log_{10} (\frac{1}{10}) = \log_{10} 1 - \log_{10} 10 = 0 - 1 = -1$
(b) Now simplify $\log_{10} (\frac{10}{27})$:
Your solution
$\log_{10} (\frac{10}{27}) =$
Answer
$\log_{10} (\frac{10}{27}) = \log_{10} 10 - \log_{10} 27 = 1 - \log_{10} 27$
(c) Now simplify $\log_{10} 1000$:
Your solution
Answer
3
(d) Finally collect all the terms together from (a), (b), (c) and simplify:
Your solution
Answer
$-1 - (1 - \log_{10} 27) + 3 = 1 + \log_{10} 27$
3. Logs to base 10 and natural logs

In practice only two kinds of logarithms are commonly used, those to base 10, written $\log_{10}$ (or just
simply log) and those to base e, written $\log_e$ or more usually ln (called natural logarithms). Most
scientific calculators will determine the logarithm to base 10 and to base e. For example,
$\log 13 = 1.11394$ (implying $10^{1.11394} = 13$), $\ln 23 = 3.13549$ (implying $e^{3.13549} = 23$)
Task
Use your calculator to determine (a) log 10, (b) log 1000000, (c) log 0.1
Your solution
(a) log 10 = (b) log 1000000 = (c) log 0.1 =
Answer
(a) 1, (b) 6, (c) −1.
Each of the above results could be determined directly, without the use of a calculator. For example:
Since $\log_a a = 1$ then $\log 10\ (\equiv \log_{10} 10) = 1$.
Since $\log_a A^k = k \log_a A$ then $\log 1000000 = \log 10^6 = 6 \log 10 = 6$.
Since $\log_a (\frac{A}{B}) = \log_a A - \log_a B$ and $\log_a 1 = 0$ and $\log_a a = 1$, then
$\log 0.1 = \log(\frac{1}{10}) = \log 1 - \log(10) = -1$
Task
Use your calculator to determine
(a) ln 29.42, (b) ln e, (c) ln 0.1
Your solution
(a) ln 29.42 = (b) ln e = (c) ln 0.1 =
Answer
(a) ln 29.42 = 3.38167, (b) ln e = 1, (c) ln 0.1 = −2.30259
4. Changing base in logarithms

It is sometimes required to express the logarithm with respect to one base in terms of a logarithm
with respect to another base.
Now
$b = a^n$ implies $\log_a b = n$
where we have used logs to base a. What happens if, for some reason, we want to use another base,
p say? We take logs (to base p) of both sides of $b = a^n$:
$\log_p (b) = \log_p (a^n) = n \log_p a$ (using one of the logarithm laws)
So
$n = \frac{\log_p (b)}{\log_p (a)}$, that is, $\log_a b = \frac{\log_p (b)}{\log_p (a)}$
This is the rule to be used when converting logarithms from one base to another.
Key Point 9
$\log_a b = \frac{\log_p b}{\log_p a}$
For base 10 logs:
$\log_a b = \frac{\log(b)}{\log(a)}$
For example,
$\log_3 7 = \frac{\log 7}{\log 3} = \frac{0.8450980}{0.4771212} = 1.7712437$
(Check, on your calculator, that $3^{1.7712437} = 7$).
For natural logs:
$\log_a b = \frac{\ln(b)}{\ln(a)}$
For example,
$\log_3 7 = \frac{\ln 7}{\ln 3} = \frac{1.9459101}{1.0986123} = 1.7712437$
Of course, $\log_3 7$ cannot be determined directly on your calculator since logs to base 3 are not
available but it can be found using the above method.
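The change of base rule translates directly into code. The sketch below evaluates $\log_3 7$ through base 10 and through base e; the function name log_base is chosen here for illustration.

```python
import math

def log_base(b, a):
    """log of b to base a, computed via base 10 logs (change of base rule)."""
    return math.log10(b) / math.log10(a)

via_base10 = log_base(7, 3)               # log 7 / log 3
via_ln = math.log(7) / math.log(3)        # ln 7 / ln 3

print(via_base10, via_ln)
print(3 ** via_base10)                    # should recover 7, up to rounding
```

Both routes give the same value because the ratio of two logarithms does not depend on the common base p.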
Task
Use your calculator to determine the value of $\log_{21} 7$ using first base 10 then check
using base e.
Re-express $\log_{21} 7$ using base 10 then base e:

Your solution
$\log_{21} 7 = \frac{\log 7}{\log 21} =$        $\log_{21} 7 = \frac{\ln 7}{\ln 21} =$
Answer
$\log_{21} 7 = \frac{\log 7}{\log 21} = 0.6391511$        $\log_{21} 7 = \frac{\ln 7}{\ln 21} = 0.6391511$
Example 5
Simplify the expression $10^{\log x}$.
Solution
Let $y = 10^{\log x}$ then take logs (to base 10) of both sides:
$\log y = \log(10^{\log x}) = (\log x) \log 10$
where we have used: $\log A^k = k \log A$. However, since we are using logs to base 10 then $\log 10 = 1$
and so
$\log y = \log x$ implying $y = x$
Therefore, finally we conclude that
$10^{\log x} = x$
This is an important result true for logarithms of any base. It follows from the basic definition of the
logarithm.
Key Point 10
$a^{\log_a x} = x$
Raising to the power and taking logs are inverse operations.
Exercises
1. Find the values of (a) $\log_2 8$ (b) $\log_{16} 50$ (c) $\ln 28$
2. Simplify
(a) $\log 1 - 3 \log 2 + \log 16$.
(b) $10 \log x - 2 \log x^2$.
(c) $\ln(8x - 4) - \ln(4x - 2)$.
(d) $\ln 10 \log 7 - \ln 7$.
Answers
1. (a) 3 (b) 1.41096 (c) 3.332
2. (a) $\log 2$, (b) $6 \log x$ or $\log x^6$, (c) $\ln 2$, (d) 0
The Logarithmic
Function 6.4
Introduction
In this Section we consider the logarithmic function $y = \log_a x$ and examine its important characteristics.
We see that this function is only defined if x is a positive number. We also see that the
log function is the inverse of the exponential function and vice versa. We show, through numerous
examples, how equations involving logarithms and exponentials can be solved.
Prerequisites
• have knowledge of inverse functions
• have knowledge of the laws of logarithms and of the laws of indices
• be able to solve quadratic equations

Learning Outcomes
• explain the relation between the logarithm and the exponential function
• solve equations involving exponentials and logarithms
1. The logarithmic function
In Section 6.3 we introduced the operation of taking logarithms which reverses the operation of
exponentiation.
If $a > 0$ and $a \neq 1$ then $x = a^y$ implies $y = \log_a x$
In this Section we consider the log function in more detail. We shall concentrate only on the functions
log x (i.e. to base 10) and ln x (i.e. to base e). The functions y = log x and y = ln x have similar
characteristics. We can never choose x as a negative number since $10^y$ and $e^y$ are each always
positive. The graphs of y = log x and y = ln x are shown in Figure 5.
[Figure 5: Logarithmic and exponential functions, showing the curves $10^x$, $e^x$, $\ln x$ and $\log x$]

From the graphs we see that both functions are one-to-one so each has an inverse function - the
inverse function of $\log_a x$ is $a^x$. Let us do this for logs to base 10.
2. Solving equations involving logarithms and exponentials

To solve equations which involve logarithms or exponentials we need to be aware of the basic laws
which govern both of these mathematical concepts. We illustrate by considering some examples.
Example 6
Solve for the variable x: (a) $3 = 10^x$, (b) $10^{x/4} = \log 3$, (c) $\frac{1}{17 - e^{x}} = 4$
Solution
(a) Here we take logs (to base 10 because of the term $10^x$) of both sides to get
$\log 3 = \log 10^x = x \log 10 = x$
where we have used the general property that $\log_a A^k = k \log_a A$ and the specific property
that $\log 10 = 1$. Hence $x = \log 3$ or, in numerical form, x = 0.47712 to 5 d.p.
(b) The approach used in (a) is used here. Take logs of both sides: $\log(10^{x/4}) = \log(\log 3)$,
that is, $\frac{x}{4} \log 10 = \log(\log 3) = \log(0.4771212) = -0.3213712$
So, since $\log 10 = 1$, we have $x = 4(-0.3213712) = -1.28548$ to 5 d.p.
(c) Here we simplify the expression before taking logs.
$\frac{1}{17 - e^{x}} = 4$ implies $1 = 4(17 - e^{x})$
or $4e^{x} = 4(17) - 1 = 67$ so $e^{x} = 16.75$. Now taking natural logs of both sides
(because of the presence of the $e^{x}$ term) we have:
$\ln(e^{x}) = \ln(16.75) = 2.8183983$

But $\ln(e^{x}) = x \ln e = x$ and so the solution to $\frac{1}{17 - e^{x}} = 4$ is x = 2.81840 to 5 d.p.
Task
Solve the equation $(e^{x})^2 = 50$
First solve for $e^{x}$ by taking square roots of both sides:

Your solution
$(e^{x})^2 = 50$ implies $e^{x} =$
Answer
$(e^{x})^2 = 50$ implies $e^{x} = \sqrt{50} = 7.071068$. Here we have taken the positive value for the square
root since we know that exponential functions are always positive.
Now take logarithms to an appropriate base to find x:

Your solution
$e^{x} = 7.071068$ implies x =
Answer
$e^{x} = 7.071068$ implies $x = \ln(7.071068) = 1.95601$ to 5 d.p.
Task
Solve the equation $e^{2x} = 17e^{x}$
First simplify the expression as much as possible (divide both sides by $e^{x}$):

Your solution
$e^{2x} = 17e^{x}$ implies $\frac{e^{2x}}{e^{x}} = 17$ so
Answer
$\frac{e^{2x}}{e^{x}} = 17$ implies $e^{2x-x} = 17$ so $e^{x} = 17$
Now complete the solution for x:
Your solution
$e^{x} = 17$ implies x =
Answer
$x = \ln(17) = 2.8332133$
Example 7
Find x if $10^{x} - 5 + 6(10^{-x}) = 0$
Solution
We first simplify this expression by multiplying through by $10^{x}$ (to eliminate the term $10^{-x}$):
$10^{x}(10^{x}) - 10^{x}(5) + 10^{x}(6(10^{-x})) = 0$
or
$(10^{x})^2 - 5(10^{x}) + 6 = 0$ since $10^{x}(10^{-x}) = 10^{0} = 1$
We realise that this expression is a quadratic equation. Let us put $y = 10^{x}$ to give
$y^2 - 5y + 6 = 0$
Now, we can factorise to give
$(y - 3)(y - 2) = 0$ so that y = 3 or y = 2
For each of these values of y we obtain a separate value for x since $y = 10^{x}$.
Case 1 If y = 3 then $3 = 10^{x}$ implying $x = \log 3 = 0.4771212$
Case 2 If y = 2 then $2 = 10^{x}$ implying $x = \log 2 = 0.3010300$
We conclude that the equation $10^{x} - 5 + 6(10^{-x}) = 0$ has two possible solutions for x: either
x = 0.4771212 or x = 0.3010300, to 7 d.p.
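The substitution used in Example 7 can be followed step by step in code. This is a minimal sketch under the assumption that the quadratic formula replaces the factorisation step; the variable names are illustrative.

```python
import math

# With y = 10**x the equation 10^x - 5 + 6*10^(-x) = 0 becomes y^2 - 5y + 6 = 0.
a, b, c = 1.0, -5.0, 6.0
disc = b * b - 4 * a * c
roots_y = [(-b + sign * math.sqrt(disc)) / (2 * a) for sign in (+1, -1)]

# Only positive roots are admissible because 10**x is always positive.
solutions = [math.log10(y) for y in roots_y if y > 0]

# Substitute back into the original equation as a check.
residuals = [10 ** x - 5 + 6 * 10 ** (-x) for x in solutions]
print(solutions, residuals)
```

Filtering on y > 0 makes no difference here, since both roots are positive, but it matters in equations where one root of the quadratic is negative.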
Task
Solve $2e^{2x} - 7e^{x} + 3 = 0$.
First write this equation as a quadratic in the variable $y = e^{x}$ remembering that $e^{2x} \equiv (e^{x})^2$:
Your solution
If $y = e^{x}$ then $2e^{2x} - 7e^{x} + 3 = 0$ becomes
Answer
$2y^2 - 7y + 3 = 0$
Now solve the quadratic for y:

Your solution
$2y^2 - 7y + 3 = 0$ implies (2y    )(y    ) = 0
Answer
$(2y - 1)(y - 3) = 0$ therefore $y = \frac{1}{2}$ or y = 3
Finally, for each of your values of y, find x:

Your solution
If $y = \frac{1}{2}$ then $\frac{1}{2} = e^{x}$ implies x =
If y = 3 then $3 = e^{x}$ implies x =
Answer
x = −0.693147 or x = 1.0986123
Task
The temperature T, in degrees C, of a chemical reaction is given by the formula
$T = 80e^{0.03t}$, $t \geq 0$, where t is the time, in seconds.
Calculate the time taken for the temperature to reach 150 °C.
Answer
$150 = 80e^{0.03t} \Rightarrow 1.875 = e^{0.03t} \Rightarrow \ln(1.875) = 0.03t \Rightarrow t = \frac{\ln(1.875)}{0.03}$
This gives t = 20.95 to 2 d.p.
So the time is approximately 21 seconds.
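The same rearrangement can be carried out numerically. In this short sketch the names T0, rate and target are labels introduced here for the constants 80, 0.03 and 150 in the formula.

```python
import math

T0 = 80.0        # temperature at t = 0, in degrees C
rate = 0.03      # exponent coefficient, per second
target = 150.0   # temperature to be reached, in degrees C

# T = T0 * exp(rate * t)  =>  t = ln(T / T0) / rate
t = math.log(target / T0) / rate
print(round(t, 2))
```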
Arrhenius’ law
Introduction
Chemical reactions are very sensitive to temperature; normally, the rate of reaction increases as
temperature increases. For example, the corrosion of iron and the spoiling of food are more rapid
at higher temperatures. Chemically, the probability of collision between two molecules increases
with temperature, and an increased collision rate results in higher kinetic energy, thus increasing
the proportion of molecules that have the activation energy for the reaction, i.e. the minimum
energy required for a reaction to occur. Based upon his observations, the Swedish chemist, Svante
Arrhenius, proposed that the rate of a chemical reaction increases exponentially with temperature.
This relationship, now known as Arrhenius’ law, is written as

$k = k_0 \exp\left(\frac{-E_a}{RT}\right)$ (1)
where k is the reaction rate constant, $k_0$ is the frequency factor, $E_a$ is the activation energy, R is
the universal gas constant and T is the absolute temperature. Thus, the reaction rate constant, k,
depends on the quantities $k_0$ and $E_a$, which characterise a given reaction, and are generally assumed
to be temperature independent.
Problem in words
In a laboratory, ethyl acetate is reacted with sodium hydroxide to investigate the reaction kinetics.
Calculate the frequency factor and activation energy of the reaction from Arrhenius’ Law, using the
experimental measurements of temperature and reaction rate constant in the table:
T (K)           310         350
k (s$^{-1}$)    7.757192    110.9601

Given that $k = 7.757192\ \mathrm{s}^{-1}$ at T = 310 K and $k = 110.9601\ \mathrm{s}^{-1}$ at T = 350 K, use Equation (1)
to produce two linear equations in $E_a$ and $k_0$. Solve these to find $E_a$ and $k_0$. (Assume that the gas
constant $R = 8.314\ \mathrm{J\,K^{-1}\,mol^{-1}}$.)
Taking the natural logarithm of both sides of (1)
$\ln k = \ln\left[k_0 \exp\left(\frac{-E_a}{RT}\right)\right] = \ln k_0 - \frac{E_a}{RT}$
Now inserting the experimental data gives the two linear equations in $E_a$ and $k_0$
$\ln k_1 = \ln k_0 - \frac{E_a}{R T_1}$ (2)
$\ln k_2 = \ln k_0 - \frac{E_a}{R T_2}$ (3)
where $k_1 = 7.757192$, $T_1 = 310$ and $k_2 = 110.9601$, $T_2 = 350$.
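Firstly, to find $E_a$, subtract Equation (2) from Equation (3)
$\ln k_2 - \ln k_1 = \frac{E_a}{R T_1} - \frac{E_a}{R T_2} = \frac{E_a}{R}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)$
so that
$E_a = \frac{R (\ln k_2 - \ln k_1)}{\frac{1}{T_1} - \frac{1}{T_2}}$
and substituting the values gives
$E_a = 60000\ \mathrm{J\,mol^{-1}} = 60\ \mathrm{kJ\,mol^{-1}}$
Secondly, to find $k_0$, from (2)
$\ln k_0 = \ln k_1 + \frac{E_a}{R T_1} \Rightarrow k_0 = \exp\left(\ln k_1 + \frac{E_a}{R T_1}\right) = k_1 \exp\left(\frac{E_a}{R T_1}\right)$
and substituting the values gives
$k_0 = 1.0 \times 10^{11}\ \mathrm{s}^{-1}$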
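The two substitutions above can be reproduced with a few lines of arithmetic. The following sketch uses the measured values from the table and the stated value of R; the variable names are illustrative.

```python
import math

R = 8.314                      # gas constant, J K^-1 mol^-1
T1, k1 = 310.0, 7.757192       # first measurement (K, s^-1)
T2, k2 = 350.0, 110.9601       # second measurement (K, s^-1)

# Equation (3) minus Equation (2), rearranged for the activation energy.
Ea = R * (math.log(k2) - math.log(k1)) / (1 / T1 - 1 / T2)

# Equation (2) rearranged for the frequency factor.
k0 = k1 * math.exp(Ea / (R * T1))

print(Ea, k0)
```

As a consistency check, inserting Ea and k0 back into Equation (1) at T2 should return the second measured rate constant.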
Task
The reaction
$2\mathrm{NO_2(g)} \longrightarrow 2\mathrm{NO(g)} + \mathrm{O_2(g)}$
has a reaction rate constant of $1.0 \times 10^{-10}\ \mathrm{s}^{-1}$ at 300 K and activation energy of
$111\ \mathrm{kJ\,mol^{-1}} = 111\,000\ \mathrm{J\,mol^{-1}}$. Use Arrhenius’ law to find the reaction rate
constant at a temperature of 273 K.
Your solution
Answer
Rearranging Arrhenius’ equation gives
$k_0 = k \exp\left(\frac{E_a}{RT}\right)$
Substituting the values gives $k_0 = 2.126 \times 10^{9}\ \mathrm{s}^{-1}$
Now we use this value of $k_0$ with $E_a$ in Arrhenius’ equation (1) to find k at T = 273 K
$k = k_0 \exp\left(\frac{-E_a}{RT}\right) = 1.226 \times 10^{-12}\ \mathrm{s}^{-1}$
Task
For a chemical reaction with frequency factor $k_0 = 0.5\ \mathrm{s}^{-1}$ and ratio $E_a/R = 800$
K, use Arrhenius’ law to find the temperature at which the reaction rate constant
would be equal to $0.1\ \mathrm{s}^{-1}$.
Your solution
Answer
Rearranging Equation (1)
$\frac{k}{k_0} = \exp\left(\frac{-E_a}{RT}\right)$
Taking the natural logarithm of both sides
$\ln\left(\frac{k}{k_0}\right) = \frac{-E_a}{RT}$
so that
$T = \frac{-E_a}{R \ln(k/k_0)} = \frac{E_a}{R \ln(k_0/k)}$
Substituting the values gives T = 497 K
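The final substitution is a one-line calculation. In this sketch Ea_over_R is a name introduced here for the given ratio $E_a/R$.

```python
import math

k0 = 0.5           # frequency factor, s^-1
k = 0.1            # required reaction rate constant, s^-1
Ea_over_R = 800.0  # given ratio Ea / R, in K

# T = Ea / (R ln(k0 / k))
T = Ea_over_R / math.log(k0 / k)
print(round(T))
```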
As a final example we consider equations involving the hyperbolic functions.
Example 8
Solve the equations
(a) $\cosh 3x = 1$ (b) $\cosh 3x = 2$ (c) $2 \cosh^2 x = 3 \cosh 2x - 3$
Solution
(a) From its graph we know that cosh x = 1 only when x = 0, so we need 3x = 0 which implies
x = 0.
(b) $\cosh 3x = 2$ implies $\frac{e^{3x} + e^{-3x}}{2} = 2$ or $e^{3x} + e^{-3x} - 4 = 0$
Now multiply through by $e^{3x}$ (to eliminate the term $e^{-3x}$) to give
$e^{3x} e^{3x} + e^{3x} e^{-3x} - 4e^{3x} = 0$ or $(e^{3x})^2 - 4e^{3x} + 1 = 0$

This is a quadratic equation in the variable $e^{3x}$ so substituting $y = e^{3x}$ gives
$y^2 - 4y + 1 = 0$ implying $y = 2 \pm \sqrt{3}$ so y = 3.7321 or 0.26795
$e^{3x} = 3.7321$ implies $x = \frac{1}{3} \ln 3.7321 = 0.439$ to 3 d.p.
$e^{3x} = 0.26795$ implies $x = \frac{1}{3} \ln 0.26795 = -0.439$ to 3 d.p.
(c) We first simplify this expression by using the identity: $\cosh 2x = 2 \cosh^2 x - 1$. Thus the original
equation $2 \cosh^2 x = 3 \cosh 2x - 3$ becomes $\cosh 2x + 1 = 3 \cosh 2x - 3$ or, when written in terms
of exponentials:
$\frac{e^{2x} + e^{-2x}}{2} = 3\left(\frac{e^{2x} + e^{-2x}}{2}\right) - 4$
Multiplying through by $2e^{2x}$ gives $e^{4x} + 1 = 3(e^{4x} + 1) - 8e^{2x}$ or, after simplifying:
$e^{4x} - 4e^{2x} + 1 = 0$
Writing $y = e^{2x}$ we easily obtain $y^2 - 4y + 1 = 0$ with solution (using the quadratic formula):
$y = \frac{4 \pm \sqrt{16 - 4}}{2} = 2 \pm \sqrt{3}$
If $y = 2 + \sqrt{3}$ then $2 + \sqrt{3} = e^{2x}$ implying x = 0.65848 to 5 d.p.
If $y = 2 - \sqrt{3}$ then $2 - \sqrt{3} = e^{2x}$ implying x = −0.65848 to 5 d.p.
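Part (b) of this Example can be cross-checked against the inverse hyperbolic cosine available in Python's math module. This is a sketch for verification only; the inverse function math.acosh is not used in the Workbook's own solution.

```python
import math

# cosh(3x) = 2 leads to y^2 - 4y + 1 = 0 with y = e^(3x), so y = 2 +/- sqrt(3).
roots_y = [2 + math.sqrt(3), 2 - math.sqrt(3)]
xs = [math.log(y) / 3 for y in roots_y]

# Independent check: acosh returns the non-negative solution of cosh(u) = 2.
x_check = math.acosh(2) / 3

print(xs, x_check)
```

The two roots are equal and opposite because cosh is an even function.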
Task
Find the solution for x if tanh x = 0.5.
First re-write tanh x in terms of exponentials:
Your solution
tanh x =
Answer
$\tanh x = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \frac{e^{2x} - 1}{e^{2x} + 1}$
Now substitute into tanh x = 0.5:
Your solution
tanh x = 0.5 implies $\frac{e^{2x} - 1}{e^{2x} + 1} = 0.5$ so, on simplifying, $e^{2x} =$
Answer
$\frac{e^{2x} - 1}{e^{2x} + 1} = 0.5$ implies $(e^{2x} - 1) = \frac{1}{2}(e^{2x} + 1)$ so $\frac{e^{2x}}{2} = \frac{3}{2}$ so, finally, $e^{2x} = 3$
Now complete your solution by finding x:
Your solution
$e^{2x} = 3$ so x =
Answer
$x = \frac{1}{2} \ln 3 = 0.549306$
Alternatively, many calculators can directly calculate the inverse function $\tanh^{-1}$. If you have such
a calculator then you can use the fact that
tanh x = 0.5 implies $x = \tanh^{-1} 0.5$ to obtain directly x = 0.549306
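Both routes to the answer can be compared in code. The sketch below uses math.atanh from the Python standard library as the counterpart of the calculator's $\tanh^{-1}$ key.

```python
import math

x_from_logs = 0.5 * math.log(3)   # from e^(2x) = 3
x_from_atanh = math.atanh(0.5)    # direct inverse hyperbolic tangent

print(x_from_logs, x_from_atanh)
```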
Example 9
Solve for x if $3 \ln x + 4 \log x = 1$.
Solution
This has logs to two different bases. So we must first express each logarithm in terms of logs to the
same base, e say. From Key Point 9
$\log x = \frac{\ln x}{\ln 10}$
So $3 \ln x + 4 \log x = 1$ becomes
$3 \ln x + 4 \frac{\ln x}{\ln 10} = 1$ or $\left(3 + \frac{4}{\ln 10}\right) \ln x = 1$
leading to $\ln x = \frac{\ln 10}{3 \ln 10 + 4} = \frac{2.302585}{10.907755} = 0.211096$ and so
$x = e^{0.211096} = 1.2350311$
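The result of Example 9 can be verified by substituting back into the original equation. A minimal sketch:

```python
import math

# (3 + 4 / ln 10) ln x = 1  =>  ln x = ln 10 / (3 ln 10 + 4)
ln_x = math.log(10) / (3 * math.log(10) + 4)
x = math.exp(ln_x)

# Substitute back into 3 ln x + 4 log x = 1; the residual should be near zero.
residual = 3 * math.log(x) + 4 * math.log10(x) - 1
print(x, residual)
```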
Exercises
1. Solve for the variable x: (a) $\pi = 10^{x}$ (b) $10^{-x/2} = 3$ (c) $\frac{1}{17 - \pi^{x}} = 4$
2. Solve the equations
(a) $e^{2x} = 17e^{x}$, (b) $e^{2x} - 2e^{x} - 6 = 0$, (c) $\cosh x = 3$.
Answers
1. (a) $x = \log \pi = 0.497$
(b) $-x/2 = \log 3$ and so $x = -2 \log 3 = -0.954$
(c) $17 - \pi^{x} = 0.25$ so $\pi^{x} = 16.75$ therefore $x = \frac{\log 16.75}{\log \pi} = \frac{1.224}{0.497} = 2.462$
2. (a) Take logs of both sides: $2x = \ln 17 + x$ ∴ $x = \ln 17 = 2.833$
(b) Let $y = e^{x}$ then $y^2 - 2y - 6 = 0$ therefore $y = 1 \pm \sqrt{7}$ (we cannot take the negative sign
since exponentials can never be negative). Thus $x = \ln(1 + \sqrt{7}) = 1.2936$.
(c) $e^{x} + e^{-x} = 6$ therefore $e^{2x} - 6e^{x} + 1 = 0$ so $e^{x} = \frac{6 \pm \sqrt{36 - 4}}{2} = 3 \pm \sqrt{8}$
We have, finally $x = \ln(3 + \sqrt{8}) = 1.7627$ or $x = \ln(3 - \sqrt{8}) = -1.7627$

Modelling Exercises 6.5
Introduction
This Section provides examples and tasks employing exponential functions and logarithmic functions,
such as growth and decay models which are important throughout science and engineering.
Prerequisites
Before starting this Section you should . . .
• be familiar with the laws of logarithms
• have knowledge of logarithms to base 10
• be able to solve equations involving logarithms and exponentials

Learning Outcomes
• develop exponential growth and decay models


1. Exponential increase
Task
(a) Look back at Section 6.2 to review the definitions of an exponential function
and the exponential function.
(b) List examples in this Workbook of contexts in which exponential functions

are appropriate.
Your solution
Answer
(a) An exponential function has the form $y = a^{x}$ where a > 0. The exponential function has
the form $y = e^{x}$ where e = 2.718282......
(b) It is stated that exponential functions are useful when modelling the shape of a hanging chain
or rope under the effect of gravity or for modelling exponential growth or decay.
We will look at a specific example of the exponential function used to model a population increase.
Task
Given that
$P = 12e^{0.1t}$ $(0 \leq t \leq 25)$
where P is the number in the population of a city in millions at time t in years
answer these questions.
(a) What does this model imply about P when t = 0?
(b) What is the stated upper limit of validity of the model?
(c) What does the model imply about values of P over time?
(d) What does the model predict for P when t = 10? Comment on this.
(e) What does the model predict for P when t = 25? Comment on this.
Your solution
(a)
(b)
(c)
(d)
(e)
Answer
(a) At t = 0, P = 12 which represents the initial population of 12 million. (Recall that $e^{0} = 1$.)
(b) The time interval during which the model is valid is stated as $(0 \leq t \leq 25)$ so the model is
intended to apply for 25 years.
(c) This is exponential growth so P will increase from 12 million at an accelerating rate.
(d) $P(10) = 12e^{1} \approx 33$ million. This is getting very large for a city but might be attainable in
10 years and just about sustainable.
(e) $P(25) = 12e^{2.5} \approx 146$ million. This is unrealistic for a city.
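The model values quoted in this Answer can be tabulated with a short function. In this sketch the function name population and the range check are illustrative choices; the range check simply enforces the stated interval of validity.

```python
import math

def population(t, P0=12.0, k=0.1):
    """Population in millions at time t years for P = P0 * exp(k t), 0 <= t <= 25."""
    if not 0 <= t <= 25:
        raise ValueError("the model is only stated to be valid for 0 <= t <= 25")
    return P0 * math.exp(k * t)

for t in (0, 10, 25):
    print(t, round(population(t)))
```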
Note that exponential population growth of the form $P = P_0 e^{kt}$ means that as t becomes large and
positive, P becomes very large. Normally such a population model would be used to predict values
of P for t > 0, where t = 0 represents the present or some fixed time when the population is known.
In Figure 6, values of P are shown for t < 0. These correspond to extrapolation of the model into
the past. Note that as t becomes increasingly negative, P becomes very small but is never zero or
negative because $e^{kt}$ is positive for all values of t. The parameter k is called the instantaneous
fractional growth rate.
[Figure 6: The function $P = 12e^{0.1t}$, plotted for $-10 \leq t \leq 10$ with P on the vertical axis]
For the model $P = 12e^{kt}$ we see that k = 0.1 is unrealistic, and more realistic values would be
k = 0.01 or k = 0.02. These would be similar but k = 0.02 implies a faster growth for t > 0 than
k = 0.01. This is clear in the graphs for k = 0.01 and k = 0.02 in Figure 7. The functions are
plotted up to 200 years to emphasize the increasing difference as t increases.
[Figure 7: Comparison of the functions $P = 12e^{0.01t}$ and $P = 12e^{0.02t}$, plotted for $0 \leq t \leq 200$]

The exponential function may be used in models for other types of growth as well as population
growth. A general form may be written
$y = ae^{bx}$, $a > 0$, $b > 0$, $c \leq x \leq d$
where a represents the value of y at x = 0. The value a is the intercept on the y-axis of a graphical
representation of the function. The value b controls the rate of growth and c and d represent limits
on x.
In the general form, a and b represent the parameters of the exponential function which can be
selected to fit any given modelling situation where an exponential function is appropriate.
2. Linearisation of exponential functions

This subsection relates to the description of log-linear plots covered in Section 6.6.
Frequently in engineering, the question arises of how the parameters of an exponential function
might be found from given data. The method follows from the fact that it is possible to ‘undo’
the exponential function and obtain a linear function by means of the logarithmic function. Before
showing the implications of this method, it may be necessary to remind you of some rules for
manipulating logarithms and exponentials. These are summarised in Table 1 on the next page, which
exactly matches the general list provided in Key Point 8 in Section 6.3 (page 22.)
Table 1: Rules for manipulating base e logarithms and exponentials
Number  Rule                            Number  Rule
1a      $\ln(xy) = \ln(x) + \ln(y)$     1b      $e^{x} \times e^{y} = e^{x+y}$
2a      $\ln(x/y) = \ln(x) - \ln(y)$    2b      $e^{x}/e^{y} = e^{x-y}$
3a      $\ln(x^{y}) = y \ln(x)$         3b      $(e^{x})^{y} = e^{xy}$
4a      $\ln(e^{x}) = x$                4b      $e^{\ln(x)} = x$
5a      $\ln(e) = 1$                    5b      $e^{1} = e$
6a      $\ln(1) = 0$                    6b      $e^{0} = 1$
We will try ‘undoing’ the exponential in the particular example
$P = 12e^{0.1t}$
We take the natural logarithm (ln) of both sides, which means logarithm to the base e. So
$\ln(P) = \ln(12e^{0.1t})$
The result of using Rule 1a in Table 1 is
$\ln(P) = \ln(12) + \ln(e^{0.1t})$.
The natural logarithmic function ‘undoes’ the exponential function, so by Rule 4a,
$\ln(e^{0.1t}) = 0.1t$
and the original equation for P becomes
$\ln(P) = \ln(12) + 0.1t$.
Compare this with the general form of a linear function y = ax + b.
     y    =   a x   +   b
     ↓        ↓         ↓
$\ln(P)$  =  0.1 t  +  $\ln(12)$
If we regard ln(P ) as equivalent to y, 0.1 as equivalent to the constant a, t as equivalent to x, and
ln(12) as equivalent to the constant b, then we can identify a linear relationship between ln(P ) and
t. A plot of ln(P ) against t should result in a straight line, of slope 0.1, which crosses the ln(P )
axis at ln(12). (Such a plot is called a log-linear or log-lin plot.) This is not particularly interesting
here because we know the values 12 and 0.1 already.
Suppose, though, we want to try using the general form of the exponential function
$P = ae^{bt}$ $(c \leq t \leq d)$
to create a continuous model for a population for which we have some discrete data. The first thing
to do is to take logarithms of both sides
$\ln(P) = \ln(ae^{bt})$ $(c \leq t \leq d)$.
Rule 1a from Table 1 then gives
$\ln(P) = \ln(a) + \ln(e^{bt})$ $(c \leq t \leq d)$.
But, by Rule 4a, $\ln(e^{bt}) = bt$, so this means that
$\ln(P) = \ln(a) + bt$ $(c \leq t \leq d)$.
So, given some ‘population versus time’ data which you believe can be modelled by some version
of the exponential function, plot the natural logarithm of population against time. If the exponential
function is appropriate, the resulting data points should lie on or near a straight line. The slope of
the straight line will give an estimate for b and the intercept with the ln(P ) axis will give an estimate
for ln(a). You will have carried out a logarithmic transformation of the original data for P . We
say the original variation has been linearised.
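The linearisation procedure can be sketched as a straight-line fit to ln(P) against t. The data below are synthetic, generated from $P = 12e^{0.1t}$ purely for illustration, and the least-squares formulas are one common way of fitting the line, assumed here rather than taken from the Workbook.

```python
import math

# Synthetic 'population versus time' data (illustrative only, not measured).
times = [0, 5, 10, 15, 20, 25]
populations = [12 * math.exp(0.1 * t) for t in times]

# Logarithmic transformation: work with ln(P) instead of P.
ys = [math.log(P) for P in populations]

# Ordinary least-squares straight line through the points (t, ln P).
n = len(times)
t_mean = sum(times) / n
y_mean = sum(ys) / n
slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
         / sum((t - t_mean) ** 2 for t in times))
intercept = y_mean - slope * t_mean

b_est = slope                 # the slope estimates b
a_est = math.exp(intercept)   # the intercept estimates ln(a)
print(a_est, b_est)
```

With real data the points would only lie near a line, and the recovered values would be estimates rather than exact parameters.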
A similar procedure will work also if any exponential function rather than the base e exponential
function is used. For example, suppose that we try to use the function
$P = A \times 2^{Bt}$ $(C \leq t \leq D)$,
where A and B are constant parameters to be derived from the given data. We can take natural
logarithms again to give
$\ln(P) = \ln(A \times 2^{Bt})$ $(C \leq t \leq D)$.
Rule 1a from Table 1 then gives
$\ln(P) = \ln(A) + \ln(2^{Bt})$ $(C \leq t \leq D)$.
Rule 3a then gives
$\ln(2^{Bt}) = Bt \ln(2) = B \ln(2)\, t$
and so
$\ln(P) = \ln(A) + B \ln(2)\, t$ $(C \leq t \leq D)$.
Again we have a straight line graph with the same intercept as before, ln A, but this time with slope
B ln(2).
Task
The amount of money £M to which £1 grows after earning interest of 5% p.a.
for N years is worked out as
$M = 1.05^{N}$
Find a linearised form of this equation.
Your solution
Answer
Take natural logarithms of both sides.
$\ln(M) = \ln(1.05^{N})$.
Rule 3a gives
$\ln(M) = N \ln(1.05)$.
So a plot of ln(M) against N would be a straight line passing through (0, 0) with slope ln(1.05).
The linearisation procedure also works if logarithms other than natural logarithms are used. We start
again with
$P = A \times 2^{Bt}$ $(C \leq t \leq D)$.
and will take logarithms to base 10 instead of natural logarithms. Table 2 presents the laws of
logarithms and indices (based on Key Point 8 page 22) interpreted for $\log_{10}$.
Table 2: Rules for manipulating base 10 logarithms and exponentials
Number  Rule                                           Number  Rule
1a      $\log_{10}(AB) = \log_{10} A + \log_{10} B$    1b      $10^{A} 10^{B} = 10^{A+B}$
2a      $\log_{10}(A/B) = \log_{10} A - \log_{10} B$   2b      $10^{A}/10^{B} = 10^{A-B}$
3a      $\log_{10}(A^{k}) = k \log_{10} A$             3b      $(10^{A})^{k} = 10^{kA}$
4a      $\log_{10}(10^{A}) = A$                        4b      $10^{\log_{10} A} = A$
5a      $\log_{10} 10 = 1$                             5b      $10^{1} = 10$
6a      $\log_{10} 1 = 0$                              6b      $10^{0} = 1$
Taking logs of $P = A \times 2^{Bt}$ gives:
$\log_{10}(P) = \log_{10}(A \times 2^{Bt})$ $(C \leq t \leq D)$.
Rule 1a from Table 2 then gives
$\log_{10}(P) = \log_{10}(A) + \log_{10}(2^{Bt})$ $(C \leq t \leq D)$.
Use of Rule 3a gives the result
$\log_{10}(P) = \log_{10}(A) + B \log_{10}(2)\, t$ $(C \leq t \leq D)$.
Task
(a) Write down the straight line function corresponding to taking logarithms of
the general exponential function
$P = ae^{bt}$ $(c \leq t \leq d)$
by taking logarithms to base 10.
(b) Write down the slope of this line.
Your solution
Answer
(a) log10 (P ) = log10 (a) + (b log10 (e))t (c ≤ t ≤ d)
(b) b log10 (e)
It is not usually necessary to declare the subscript 10 when indicating logarithms to base 10. If you
meet the term ‘log’ it will probably imply “to the base 10”. In the remainder of this Section, the
subscript 10 is dropped where log10 is implied.
3. Exponential decrease
Consider the value, £D, of a car subject to depreciation, in terms of the age A years of the car. The
car was bought for £10500. The function
$D = 10500e^{-0.25A}$ $(0 \leq A \leq 6)$
could be considered appropriate on the grounds that (a) D has a fixed value of £10500 when
A = 0, (b) D decreases as A increases and (c) D decreases faster when A is small than when A is
large. A plot of this function is shown in Figure 8.
[Figure 8: Plot of car depreciation over 6 years, D (pounds) against A (years)]
Task
Produce the linearised model of $D = 10500e^{-0.25A}$.
Your solution
Answer
$\ln D = \ln 10500 + \ln(e^{-0.25A})$
so $\ln D = \ln 10500 - 0.25A$
Exponential decay of sound intensity
Introduction
The rate at which a quantity decays is important in many branches of engineering and science. A
particular example of this is exponential decay. Ideally the sound level in a room where there are
substantial contributions from reflections at the walls, floor and ceiling will decay exponentially once
the source of sound is stopped. The decay in the sound intensity is due to absorption of sound at the
room surfaces and air absorption although the latter is significant only when the room is very large.
The contributions from reflection are known as reverberation. A measurement of reverberation in
a room of known volume and surface area can be used to indicate the amount of absorption.
Problem in words
As part of an emergency test of the acoustics of a concert hall during an orchestral rehearsal,
consultants asked the principal trombone to play a single note at maximum volume. Once the sound
had reached its maximum intensity the player stopped and the sound intensity was measured for the
next 0.2 seconds at regular intervals of 0.02 seconds. The initial maximum intensity at time 0 was
1. The readings were as follows:
time 0 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20
intensity 1 0.63 0.35 0.22 0.13 0.08 0.05 0.03 0.02 0.01 0.005
Draw a graph of intensity against time and, assuming that the relationship is exponential, find a
function which expresses the relationship between intensity and time.
If the relationship is exponential then it will be a function of the form
$I = I_0 10^{kt}$
and a log-linear graph of the values should lie on a straight line. Therefore we can plot the values
and find the gradient and the intercept of the resulting straight-line graph in order to find the values
for $I_0$ and k.
k is the gradient of the log-linear graph i.e.
$k = \frac{\text{change in } \log_{10}(\text{intensity})}{\text{change in time}}$
and $I_0$ is found from where the graph crosses the vertical axis: $\log_{10}(I_0) = c$
Figure 9(a) shows the graph of intensity against time.
We calculate the log10 (intensity) to create the table below:

time 0 0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18 0.20
log10 (intensity) 0 -0.22 -0.46 -0.66 -0.89 -1.1 -1.3 -1.5 -1.7 -2.0 -2.2
Figure 9(b) shows the graph of log (intensity) against time.
Figure 9: (a) Graph of sound intensity against time (b) Graph of log10(intensity) against time
and a line fitted by eye to the data. The line goes through the points (0, 0) and (0.2, −2.2).
We can see that the second graph is approximately a straight line and therefore we can assume that
the relationship between the intensity and time is exponential and can be expressed as
I = I_0 × 10^{kt}.
The log10 of this gives
log10(I) = log10(I_0) + kt.
From graph (b) we can measure the gradient, k, using
k = (change in log10(intensity)) / (change in time)
giving k = (−2.2 − 0) / (0.2 − 0) = −11
The point at which it crosses the vertical axis gives
log10(I_0) = 0 ⇒ I_0 = 10^0 = 1
Therefore the expression I = I_0 × 10^{kt} becomes
I = 10^{−11t}
Interpretation
The data recorded for the sound intensity fit an exponential decay with time. We have used a
log-linear plot to obtain the approximate function:
I = 10^{−11t}
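The graphical estimate can be cross-checked numerically. The following is a minimal sketch in Python, not part of the original working: it assumes, as the text does, that I_0 = 1, so the fitted line is forced through (0, 0), and it replaces the line fitted by eye with a least-squares gradient. The function name gradient_through_origin is illustrative only.

```python
import math

# Readings from the problem statement: time (s) and relative intensity.
times = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20]
intensity = [1, 0.63, 0.35, 0.22, 0.13, 0.08, 0.05, 0.03, 0.02, 0.01, 0.005]

def gradient_through_origin(ts, ys):
    """Least-squares gradient of a line y = k*t that passes through (0, 0)."""
    return sum(t * y for t, y in zip(ts, ys)) / sum(t * t for t in ts)

# Taking log10 of the intensities turns I = 10^(k t) into the line log10(I) = k t.
log_intensity = [math.log10(i) for i in intensity]
k = gradient_through_origin(times, log_intensity)

def model(t):
    """Fitted intensity I = 10^(k t), with I_0 = 1 assumed."""
    return 10 ** (k * t)

print(f"estimated gradient k = {k:.2f}")
```

The gradient obtained this way should lie close to the value −11 read from the graph; small differences are expected because the line in Figure 9(b) was fitted by eye.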
4. Growth and decay to a limit
Consider a function intended to represent the speed of a parachutist after the opening of the parachute
where v m s^{−1} is the instantaneous speed at time t s. An appropriate function is
v = 12 − 8e^{−1.25t} (t ≥ 0).
We will look at some of the properties and modelling implications of this function. Consider first the
value of v when t = 0:
v = 12 − 8e^0 = 12 − 8 = 4
This means the function predicts that the parachutist is moving at 4 m s^{−1} when the parachute
opens. Consider next the value of v when t is arbitrarily large. For such a value of t, 8e^{−1.25t} would
be arbitrarily small, so v would be very close to the value 12. The modelling interpretation of this is
that eventually the speed becomes very close to a constant value, 12 m s^{−1}, which will be maintained
until the parachutist lands.
The steady speed which is approached by the parachutist (or anything else falling against air
resistance) is called the terminal velocity. The parachute, of course, is designed to ensure that the
terminal velocity is sufficiently low (12 m s^{−1} in the specific case we have looked at here) to give a
reasonably gentle landing and avoid injury.
Now consider what happens as t increases from near zero. When t is near zero, the speed will be
near 4 m s^{−1}. The amount being subtracted from 12, through the term 8e^{−1.25t}, is close to 8 because
e^0 = 1. As t increases the value of 8e^{−1.25t} decreases fairly rapidly at first and then more gradually
until v is very nearly 12. This is sketched in Figure 10. In fact v is never equal to 12 but gets
as close to 12 as anyone would like as t increases. The value shown as a horizontal broken line
in Figure 10 is called an asymptotic limit for v.
Figure 10: Graph of a parachutist’s speed against time (v in m s^{−1}, t in s)
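The approach to the asymptotic limit can be seen by tabulating the function. This is a small illustrative sketch in Python; the function name speed and the chosen sample times are assumptions made for the example, not part of the original text.

```python
import math

def speed(t):
    """Modelled speed in m/s at time t seconds: v = 12 - 8 e^(-1.25 t), t >= 0."""
    return 12 - 8 * math.exp(-1.25 * t)

# Tabulate the speed at whole seconds to show the approach to 12 m/s.
for t in [0, 1, 2, 3, 4, 5]:
    print(f"t = {t} s   v = {speed(t):.3f} m/s")
```

The printed values start at 4 m s^{−1}, rise quickly at first and then level off just below 12 m s^{−1}, without ever reaching it.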

The model concerned the approach of a parachutist’s velocity to terminal velocity but the kind of
behaviour portrayed by the resulting function is useful generally in modelling any growth to a limit.
A general form of this type of growth-to-a-limit function is
y = a − be^{−kx} (C ≤ x ≤ D)
where a, b and k are positive constants (parameters) and C and D represent values of the independent
variable between which the function is valid. We will now check on the properties of this general
function. When x = 0, y = a − be^0 = a − b. As x increases the exponential factor e^{−kx} gets smaller,
so y will increase from the value a − b but at an ever-decreasing rate. As be^{−kx} becomes very small,
y approaches the value a. This value represents the limit towards which y grows. If a function of
this general form was being used to create a model of population growth to a limit, then a would
represent the limiting population, and a − b would represent the starting population.
There are three parameters, a, b, and k in the general form. Knowledge of the initial and limiting
population only gives two pieces of information. A value for the population at some non-zero time is
needed also to evaluate the third parameter k.
As an example we will obtain a function to describe a food-limited bacterial culture that has 300
cells when first counted, has 600 cells after 30 minutes but seems to have approached a limit of 4000
cells after 18 hours.
We start by assuming the general form of growth-to-a-limit function for the bacteria population, with
time measured in hours
P = a − be^{−kt} (0 ≤ t ≤ 18).
When t = 0 (the start of counting), P = 300. Since the general form gives P = a − b when t = 0,
this means that
a − b = 300.
The limit of P as t gets large, according to the general form P = a − be^{−kt}, is a, so a = 4000. From
this and the value of a − b, we deduce that b = 3700. Finally, we use the information that P = 600
when t (measuring time in hours) = 0.5. Substitution in the general form gives
600 = 4000 − 3700e^{−0.5k}
3400 = 3700e^{−0.5k}
3400/3700 = e^{−0.5k}
Taking natural logs of both sides:
ln(3400/3700) = −0.5k so k = −2 ln(34/37) = 0.1691
Note, as a check, that k turns out to be positive as required for a growth-to-a-limit behaviour. Finally
the required function may be written
P = 4000 − 3700e^{−0.1691t} (0 ≤ t ≤ 18).
As a check we should substitute t = 18 in this equation. The result is P = 3824 which is close to
the required value of 4000.
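The three steps used above (limit gives a, starting value gives b, one further reading gives k) can be sketched as a short Python function. The function name fit_growth_to_limit and its argument names are hypothetical, chosen only for this illustration.

```python
import math

def fit_growth_to_limit(p_start, p_limit, t_mid, p_mid):
    """Parameters (a, b, k) of P = a - b e^(-k t) from three pieces of information.

    p_start: value at t = 0, p_limit: limiting value,
    (t_mid, p_mid): one further reading at a non-zero time.
    """
    a = p_limit                      # the limit as t gets large
    b = p_limit - p_start            # because P(0) = a - b
    k = -math.log((a - p_mid) / b) / t_mid
    return a, b, k

# Bacterial culture from the text: 300 cells at t = 0, 600 at t = 0.5 h, limit 4000.
a, b, k = fit_growth_to_limit(300, 4000, 0.5, 600)
p18 = a - b * math.exp(-k * 18)      # check value at t = 18 hours
print(f"a = {a}, b = {b}, k = {k:.4f}, P(18) = {p18:.0f}")
```

The same function can be applied to the Task that follows by passing the starting value 3000, the limit 12000 and the reading 6000 at t = 1.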
Task
Find a function that could be used to model the growth of a population that
has a value of 3000 when counts start, reaches a value of 6000 after 1 year but
approaches a limit of 12000 after a period of 10 years.
(a) First find the modelling equation:

Your solution
Answer
Start with
P = a − be^{−kt} (0 ≤ t ≤ 10).
where P is the number of members of the population at time t years. The given data requires that
a is 12000 and that a − b = 3000, so b = 9000.
The corresponding curve must pass through (t = 1, P = 6000) so
6000 = 12000 − 9000e^{−k}
e^{−k} = (12000 − 6000)/9000 = 2/3 so e^{−kt} = (e^{−k})^t = (2/3)^t (using Rule 3b, Table 1, page 42)
So the population function is
P = 12000 − 9000(2/3)^t (0 ≤ t ≤ 10).
Note that P(10) according to this formula is approximately 11840, which is reasonably close to the
required value of 12000.
(b) Now sketch this function:
Your solution
Answer
[Sketch: P against t (years) for 0 ≤ t ≤ 10, rising from 3000 at t = 0 towards the limit 12000]
5. Inverse square law decay
Inverse square law decay of electromagnetic power
Introduction
Engineers are concerned with using and intercepting many kinds of wave forms including electromag-
netic, elastic and acoustic waves. In many situations the intensity of these signals decreases with
the square of the distance. This is known as the inverse square law. The power received from a
beacon antenna is expected to conform to the inverse square law with distance.
Problem in words
Check whether the data in the table below confirms that the measured power obeys this behaviour
with distance.
Power received, W 0.393 0.092 0.042 0.021 0.013 0.008
Distance from antenna, m 1 2 3 4 5 6
Represent power by P and distance by r. To show that the data fit the function P = A/r^2, where
A is a constant, plot log(P) against log(r) (or plot the ‘raw’ data on log-log axes) and check
(a) how close the resulting graph is to that of a straight line
(b) how close the slope is to −2.
The values corresponding to log(P) and log(r) are
log(P) −0.428 −1.041 −1.399 −1.653 −1.851 −2.012
log(r) 0 0.301 0.477 0.602 0.699 0.778
These are plotted in Figure 11 and it is clear that they lie close to a straight line.
Figure 11: Plot of log(P) against log(r)
The slope of a line through the first and third points can be found from
(−1.399 − (−0.428)) / (0.477 − 0) = −2.035
The negative value means that the line slopes downwards for increasing r. It would have been possible
to use any pair of points to obtain a suitable line but note that the last point is least ‘in line’ with
the others. Taking logarithms of the equation P = A/r^n gives log(P) = log(A) − n log(r).
The inverse square law corresponds to n = 2. In this case the data yield n = 2.035 ≈ 2. Where
log(r) = 0, log(P) = log(A). This means that the intercept of the line with the log(P) axis gives
the value of log(A) = −0.428. So A = 10^{−0.428} ≈ 0.37.
Interpretation
If the power decreases with distance according to the inverse square law, then the slope of the line
should be −2. The calculated value of n = 2.035 is sufficiently close to confirm the inverse square
law. The values of A and n calculated from the data imply that P varies with r approximately according to
P = 0.4/r^2
The slope of the line on a log-log plot is a little steeper than −2. Moreover the points at 5 m and 6 m
range fall below the line so there may be additional attenuation of the power with distance compared
with predictions of the inverse square law.
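The slope calculation can be repeated directly from the measured values. The sketch below is an illustration in Python rather than part of the original solution; it takes logarithms of the raw table entries, so individual logarithms may differ slightly from the tabulated ones. The helper name slope is an assumption for this example.

```python
import math

# Measured data from the table: power received (W) and distance from antenna (m).
power = [0.393, 0.092, 0.042, 0.021, 0.013, 0.008]
distance = [1, 2, 3, 4, 5, 6]

log_p = [math.log10(p) for p in power]
log_r = [math.log10(r) for r in distance]

def slope(i, j):
    """Slope of the straight line joining points i and j on the log-log plot."""
    return (log_p[j] - log_p[i]) / (log_r[j] - log_r[i])

# First and third points, as in the text (indices 0 and 2).
n = -slope(0, 2)
print(f"slope = {slope(0, 2):.3f}, so n = {n:.3f}")
```

Calling slope with other index pairs shows how sensitive the estimate of n is to the choice of points, which is the concern raised about the readings at 5 m and 6 m.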
Exercises
1. Sketch the graphs of (a) y = e^t (b) y = e^t + 3 (c) y = e^{−t} (d) y = e^{−t} − 1
2. The figure below shows the graphs of y = e^t, y = 2e^t and y = e^{2t}.
[Figure: graphs of y = e^t, y = 2e^t and y = e^{2t} for −2 ≤ t ≤ 2, with the vertical axis running from 0 to 16]
State in words how the graphs of y = 2e^t and y = e^{2t} relate to the graph of y = e^t.
3. The figures below show graphs of y = −e^{−t}, y = 4 − e^{−t} and y = 4 − 3e^{−t}.
[Figure: three graphs, y = −e^{−t}, y = 4 − e^{−t} and y = 4 − 3e^{−t}]
Use the above graphs to help you to sketch graphs of (a) y = 5 − e^{−t} (b) y = 5 − 2e^{−t}
4. (a) The graph (a) in the figure below has an equation of the form
y = A + e^{−kt}, where A and k are constants. What is the value of A?

(b) The graph (b) below has an equation of the form y = Ae^{kt} where A and k are constants.
What is the value of A?
(c) Write down a possible form of the equation of the exponential graph (c) giving numerical
values to as many constants as possible.
(d) Write down a possible form of the equation of the exponential graph (d) giving numerical
values to as many constants as possible.
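[Figure: four exponential graphs against t. (a) a broken horizontal line at y = 2; (b) the value y = 5 marked on the y-axis; (c) a broken horizontal line at y = 6 with y = 2 marked on the y-axis; (d) a broken horizontal line at y = 1 with y = 3 marked on the y-axis]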
Answers
1. [Sketch: the graphs of y = e^t + 3, y = e^t, y = e^{−t} and y = e^{−t} − 1 on common axes]
2. (a) y = 2e^t is the same shape as y = e^t but with all y values doubled.
(b) y = e^{2t} is much steeper than y = e^t for t > 0 and much flatter for t < 0. Both pass
through (0, 1). Note that y = e^{2t} = (e^t)^2 so each value of y = e^{2t} is the square of the
corresponding value of y = e^t.
3. (a), (b) [Sketches of the two graphs]
4. (a) 2 (b) 5 (c) y = 6 − 4e^{−kt} (d) y = 1 + 2e^{−kt}
6. Logarithmic relationships
Experimental psychology is concerned with observing and measuring human response to various
stimuli. In particular, sensations of light, colour, sound, taste, touch and muscular tension are
produced when an external stimulus acts on the associated sense. A nineteenth century German,
Ernst Weber, conducted experiments involving sensations of heat, light and sound and associated
stimuli. Weber measured the response of subjects, in a laboratory setting, to input stimuli measured
in terms of energy or some other physical attribute and discovered that:
(1) No sensation is felt until the stimulus reaches a certain value, known as the threshold value.
(2) After this threshold is reached an increase in stimulus produces an increase in sensation.
(3) This increase in sensation occurs at a diminishing rate as the stimulus is increased.
Task
(a) Do Weber’s results suggest a linear or non-linear relationship between sensa-
tion and stimulus? Sketch a graph of sensation against stimulus according
to Weber’s results.
(b) Consider whether an exponential function or a growth-to-a-limit function

might be an appropriate model.
Answer
(a) Non-linearity is required by observation (3).
[Sketch: sensation S against stimulus P, with S increasing at a diminishing rate as P increases]
(b) An exponential-type of growth is not appropriate for a model consistent with these experimen-
tal results, since we need a diminishing rate of growth in sensation as the stimulus increases.
A growth-to-a-limit type of function is not appropriate since the data, at least over the range
of Weber’s experiments, do not suggest that there is a limit to the sensation with continuing
increase in stimulus; only that the increase in sensation occurs more and more slowly.
A late nineteenth century German scientist, Gustav Fechner, studied Weber’s results. Fechner sug-
gested that an appropriate function modelling Weber’s findings would be logarithmic. He suggested
that the variation in sensation (S) with the stimulus input (P ) is modelled by
S = A log(P/T ) (0 < T ≤ 1)
where T represents the threshold of stimulus input below which there is no sensation and A is a
constant. Note that when P = T, log(P/T ) = log(1) = 0, so this function is consistent with item
(1) of Weber’s results. Recall also that log means logarithm to base 10, so when P = 10T, S =
A log(10) = A. When P = 100T, S = A log(100) = 2A. The logarithmic function predicts that
a tenfold increase in the stimulus input from T to 10T will result in the same change in sensation
as a further tenfold increase in stimulus input to 100T. Each tenfold change in stimulus results in
the same increase, A, in sensation. So, although sensation is predicted to increase with stimulus, the stimulus
has to increase at a faster and faster rate (i.e. exponentially) to achieve a given change in sensation.
These points are consistent with items (2) and (3) of Weber’s findings. Fechner’s suggestion, that
the logarithmic function is an appropriate one for a model of the relationship between sensation and
stimulus, seems reasonable. Note that the logarithmic function suggested by Fechner is not defined
for zero stimulus but we are only interested in the model at and above the threshold stimulus, i.e.
for values of the logarithm equal to and above zero. Note also that the logarithmic function is useful
for looking at changes in sensation relative to stimulus values other than the threshold stimulus.
According to Rule 2a in Table 2 on page 42, Fechner’s sensation function may be written
S = A log(P/T ) = A[log(P ) − log(T )] (P ≥ T > 0).
Suppose that the sensation has the value S1 at P1 and S2 at P2 , so that
S1 = A[log(P1 ) − log(T )] (P1 ≥ T > 0),
and
S2 = A[log(P2 ) − log(T )] (P2 ≥ T > 0).
If we subtract the first of these two equations from the second, we get
S2 − S1 = A[log(P2 ) − log(P1 )] = A log(P2 /P1 ),
where Rule 2a of Table 2 has been used again for the last step. According to this form of equation,
the change in sensation between two stimuli values depends on the ratio of the stimuli values.
We start with
S = A log(P/T) (1 ≥ T > 0).
Divide both sides by A:
S/A = log(P/T) (1 ≥ T > 0).
‘Undo’ the logarithm on both sides by raising 10 to the power of each side:
10^{S/A} = 10^{log(P/T)} = P/T (1 ≥ T > 0), using Rule 4b of Table 2.
So P = T × 10^{S/A} (1 ≥ T > 0) which is an exponential relationship between stimulus and
sensation.
A logarithmic relationship between sensation and stimulus therefore implies an exponential rela-
tionship between stimulus and sensation. The relationship may be written in two different forms with
the variables playing opposite roles in the two functions.
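The two forms of the relationship can be checked against each other numerically. The following Python sketch is illustrative only: the values A = 2 and T = 0.5 are arbitrary choices for the example, not data from the text, and the function names sensation and stimulus are assumptions.

```python
import math

def sensation(p, a, t):
    """Fechner's model S = A log10(P / T), valid for P >= T > 0."""
    if p < t:
        raise ValueError("stimulus is below the threshold")
    return a * math.log10(p / t)

def stimulus(s, a, t):
    """The inverse relationship P = T * 10^(S / A)."""
    return t * 10 ** (s / a)

A, T = 2.0, 0.5          # illustrative constants only

s_threshold = sensation(T, A, T)        # no sensation at the threshold
s_tenfold = sensation(10 * T, A, T)     # a tenfold stimulus gives S = A
p_back = stimulus(sensation(3.0, A, T), A, T)   # round trip returns the stimulus
print(s_threshold, s_tenfold, p_back)
```

The round trip through both functions returns the original stimulus (up to rounding), which is the sense in which the variables play opposite roles in the two forms.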
The logarithmic relationship between sensation and stimulus is known as the Weber-Fechner Law of
Sensation. The idea that a mathematical function could describe our sensations was startling when
first propounded. Indeed it may seem quite amazing to you now. Moreover it doesn’t always work.
Nevertheless the idea has been quite fruitful. Out of it has come much quantitative experimental
psychology of interest to sound engineers. For example, it relates to the sensation of the loudness of
sound. Sound level is expressed on a logarithmic scale. At a frequency of 1 kHz an increase of 10
dB corresponds to a doubling of loudness.
Task
Given a relationship between y and x of the form y = 3 log(x/4) (x ≥ 4), find
the relationship between x and y.
Your solution
Answer
One way of answering is to compare with the example preceding this task. We have y in place of
S, x in place of P , 3 in place of A, 4 in place of T . So it is possible to write down immediately
x = 4 × 10^{y/3} (y ≥ 0)
Alternatively we can manipulate the given expression algebraically.
Starting with y = 3 log(x/4), divide both sides by 3 to give y/3 = log(x/4).
Raise 10 to the power of each side to eliminate the log, so that 10^{y/3} = x/4.
Multiply both sides by 4 and rearrange, to obtain x = 4 × 10^{y/3}, as before.
The associated range is the result of the fact that x ≥ 4, so 10^{y/3} ≥ 1, so y/3 ≥ 0 which means
y ≥ 0.

Log-linear Graphs 6.6
Introduction
In this Section we employ our knowledge of logarithms to simplify plotting the relation between one
variable and another. In particular we consider those situations in which one of the variables requires
scaling because the range of its data values is very large in comparison to the range of the other
variable.
We will only employ logarithms to base 10. To aid the plotting process we explain how log-linear
graph paper is used. Unlike ordinary graph paper, one of the axes is scaled using logarithmic values
instead of the values themselves. By this process, values which range from (say) 1 to 1,000,000 are
scaled down to range over the values 0 to 6. We do not discuss log-log graphs, in which both data
sets require scaling, as the reader will easily be able to adapt the technique described here to those
situations.
Prerequisites
Before starting this Section you should . . .
• be familiar with the laws of logarithms
• have knowledge of logarithms to base 10
• be able to solve equations involving logarithms

Learning Outcomes
On completion you should be able to . . .
• decide when to use log-linear graph paper
• use log-linear graph paper to analyse functions of the form y = ka^{px}
1. Logarithms and scaling

In this Section we shall work entirely with logarithms to base 10.
We are already familiar with a particular property of logarithms: log A^k = k log A.
Now, choosing A = 10 we see that: log 10^k = k log 10 = k.
The effect of taking a logarithm is to replace a power, 10^k (which could be very large), by the value
of the exponent k. Thus a range of numbers extending from 1 to 1,000,000 say, can be transformed,
by taking logarithms to base 10, into a range of numbers from 0 to 6. This approach is especially
useful in the exercise of plotting one variable against another in which one of the variables has a wide
range of values.
Example 10
Plot the following values (x, y):
x 1.0 1.1 1.2 1.3 1.4 1.5 1.6
y 1.0 2.14 4.3 8.16 14.8 25.6 42.9
Estimate the value of y when x = 1.35.
Solution
If we attempt to plot these values on ordinary graph paper in which both vertical and horizontal
scales are linear we find the large range in the y-values presents a problem. The values near the
lower end are bunched together and interpolating to find the value of y when x = 1.35 is difficult.
Figure 12: Plot of y against x with linear scales on both axes
Example 11
To alleviate the scaling problem in Example 10 employ logarithms to scale down
the y-values, giving:
x 1 1.1 1.2 1.3 1.4 1.5 1.6
log y 0 0.33 0.63 0.91 1.17 1.41 1.63
Plot these values and estimate the value of y when x = 1.35.
Solution
Figure 13: Plot of log y against x
This approach has spaced-out the vertical values allowing a much easier assessment for the value
of y at x = 1.35. From the graph we see that at x = 1.35 the ‘log y’ value is approximately 1.05.
Taking log y = 1.05 and inverting we get
y = 10^{1.05} = 11.22
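Reading a value from the graph of log y can be imitated by interpolating linearly between neighbouring logarithms and then inverting. The Python sketch below is one such illustration, under the assumption that straight-line interpolation between adjacent data points is adequate; the function name interpolate_log is hypothetical.

```python
import math

# Data from Example 10.
xs = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
ys = [1.0, 2.14, 4.3, 8.16, 14.8, 25.6, 42.9]

def interpolate_log(x, xs, ys):
    """Interpolate log10(y) linearly between neighbouring points, then invert."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            frac = (x - x0) / (x1 - x0)
            log_y0, log_y1 = math.log10(ys[i]), math.log10(ys[i + 1])
            return 10 ** (log_y0 + frac * (log_y1 - log_y0))
    raise ValueError("x is outside the tabulated range")

y_est = interpolate_log(1.35, xs, ys)
print(f"estimated y at x = 1.35: {y_est:.2f}")
```

The estimate obtained this way is close to, but not identical with, the value 11.22 read from the graph, since a reading taken by eye from a smooth curve and a straight-line interpolation between two points need not agree exactly.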
2. Log-linear graph paper

Ordinary graph paper has linear scales in both the horizontal (x) and vertical (y) directions. As we
have seen, this can pose problems if the range of one of the variables, y say, is very large. One way
round this is to take the logarithm of the y-values and re-plot on ordinary graph paper. Another
common approach is to use log-linear graph paper in which the vertical scale is a non-linear
logarithmic scale. Use of this special graph paper means that the original data can be plotted
directly without the need to convert to logarithms which saves time and effort.
In log-linear graph paper the vertical axis is divided into a number of cycles. Each cycle corresponds
to a jump in the data values by a factor of 10. For example, if the range of y-values extends from
(say) 1 to 100 (or equivalently 10^0 to 10^2) then 2-cycle log-linear paper would be required. If the
y-values extend from (say) 100 to 100,000 (or equivalently from 10^2 to 10^5) then 3-cycle log-linear
paper would be used. Some other examples are given in Table 3:
Table 3
y-values        log y values    no. of cycles
1 → 10          0 → 1           1
1 → 100         0 → 2           2
10 → 10,000     1 → 4           3
1/10 → 100      −1 → 2          3
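The rule behind Table 3 is that each cycle covers one power of 10, so the number of cycles is the number of whole decades needed to contain the data. A minimal Python sketch of that rule follows; the function name cycles_needed is an assumption, and the rounding step is a guard against floating-point error rather than part of the method.

```python
import math

def cycles_needed(y_min, y_max):
    """Number of log-paper cycles (decades) needed to span y_min to y_max."""
    if y_min <= 0 or y_max < y_min:
        raise ValueError("need 0 < y_min <= y_max")
    # Round before floor/ceil so exact powers of 10 are not misplaced.
    lowest_decade = math.floor(round(math.log10(y_min), 9))
    highest_decade = math.ceil(round(math.log10(y_max), 9))
    return max(highest_decade - lowest_decade, 1)

# The rows of Table 3.
for lo, hi in [(1, 10), (1, 100), (10, 10_000), (0.1, 100)]:
    print(f"{lo} to {hi}: {cycles_needed(lo, hi)} cycle(s)")
```

Data that do not start or end on a power of 10 are handled the same way: values from 52 to 190, for instance, lie between 10 and 1000 and so need 2-cycle paper.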
An example of 2-cycle log-linear graph paper is shown in Figure 14. We see that the horizontal scale
is linear. The vertical scale is divided by lines denoted by 1,2,3,. . . ,10,20,30,. . . ,100. In the first
cycle each of the horizontal blocks (separated by a slightly thicker line) is also divided according to
a log-linear scale; so, for example, in the range 1 → 2 we have 9 horizontal lines representing the
values 1.1, 1.2, . . . , 1.9. These subdivisions have been repeated (appropriately scaled) in blocks 2-3,
3-4, 4-5, 5-6, 6-7. The subdivisions have been omitted from blocks 7-8, 8-9, 9-10 for reasons of
clarity. On this graph paper, we have noted the positions of A : (1, 2), B : (1, 23), C : (4, 23), D :
(6, 2.5), E : (3, 61).
Figure 14: 2-cycle log-linear graph paper (vertical logarithmic scale from 1 to 100 in two cycles, horizontal linear scale from 1 to 7) showing the points A, B, C, D and E
Task
On the 2-cycle log-linear graph paper (below) locate the positions of the points
F : (2, 21), G : (2, 51), H : (5, 3.5). [The correct positions are shown on the
graph on next page.]
[Blank 2-cycle log-linear graph paper: vertical axis log y, horizontal axis x]
[Graph: 2-cycle log-linear graph paper (vertical logarithmic scale from 1 to 100, horizontal linear scale from 1 to 7) showing the correct positions of F, G and H]
Example 12
It is thought that the relationship between two variables x, y is exponential
y = ka^x
An experiment is performed and the following pairs of data values (x, y) were
obtained
x 1 2 3 4 5
y 5.9 12 26 49 96
Verify that the relation y = ka^x is valid by plotting values on log-linear paper to
obtain a set of points lying on a straight line. Estimate the values of k, a.
Solution
First we rearrange the relation y = ka^x by taking logarithms (to base 10).
∴ log y = log(ka^x) = log k + x log a
So, if we define a new variable Y ≡ log y then the relationship between Y and x will be linear −
its graph (on log-linear paper) should be a straight line. The vertical intercept of this line is log k
and the gradient of the line is log a. Each of these can be obtained from the graph and the values
of a, k inferred.
When using log-linear graphs, the reader should keep in mind that, on the vertical axis, the values
are not as written but the logarithms of those values.
We have plotted the points and drawn a straight line (as best we can) through them; see Figure
15. (We will see in a later Workbook (31) how we might improve on this subjective approach
to fitting straight lines to data points.) The line intersects the vertical axis at a value log(3.13) and
the gradient of the line is
(log 96 − log 3.13)/(5 − 0) = log(96/3.13)/5 = (log 30.67)/5 = 0.297
But the intercept is log k so
log k = log 3.13 implying k = 3.13
and the gradient is log a so
log a = 0.297 implying a = 10^{0.297} = 1.98
We conclude that the relation between the x, y variables is well modelled by the
relation y = 3.13(1.98)^x. If the points did not lie more-or-less on a straight line then we would
conclude that the relationship was not of the form y = ka^x.
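A less subjective alternative to drawing the line by eye is an ordinary least-squares fit of log y against x. The Python sketch below is offered as an illustration of that swapped-in technique, not as the method used in the solution above; the function name fit_line is hypothetical.

```python
import math

# Data from Example 12.
x = [1, 2, 3, 4, 5]
y = [5.9, 12, 26, 49, 96]

def fit_line(xs, ys):
    """Least-squares gradient and intercept of the straight line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(xs, ys))
    sxx = sum((u - mean_x) ** 2 for u in xs)
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x

# Fit Y = log10(y) against x: gradient = log a, intercept = log k.
gradient, intercept = fit_line(x, [math.log10(v) for v in y])
a = 10 ** gradient
k = 10 ** intercept
print(f"k = {k:.2f}, a = {a:.2f}")
```

The fitted values should be close to, though not identical with, the graphical estimates k = 3.13 and a = 1.98, because a line drawn by eye and a least-squares line generally differ a little.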
Figure 15: Log-linear plot of the data (log y against x) on 2-cycle paper
Task
Using a log-linear graph estimate the values of k, a if it is assumed
that y = ka^{−2x} and the data values connecting x, y are:
x −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3
y 190 155 123 100 80 63 52
First take logs of the relation y = ka^{−2x} and introduce an appropriate new variable:
Your solution
y = ka^{−2x} implies log y = log(ka^{−2x}) =
introduce Y =
log y = log k − 2x log a. Let Y = log y then Y = log k + x(−2 log a). We therefore expect a linear
relation between Y and x (i.e. on log-linear paper).
Now determine how many cycles are required in your log-linear paper:
Your solution
The range of values of y is 140; from 5.2 × 10 to 1.9 × 10^2. So 2-cycle log-linear paper is needed.
Now plot the data values directly onto log-linear paper (supplied on the next page) and decide
whether the relation y = ka^{−2x} is acceptable:
Your solution
It is acceptable. On plotting the points a straight line fits the data well which is what we expect
from Y = log k + x(−2 log a).
Now, using knowledge of the intercept and the gradient, find the values of k, a:
Your solution
See the graph two pages further on. k ≈ 94 (intercept on x = 0 line). The gradient is
(log 235 − log 52)/(−0.4 − 0.3) = −log(235/52)/0.7 = −0.655/0.7 = −0.935
But the gradient is −2 log a. Thus −2 log a = −0.935 which implies a = 10^{0.468} = 2.93
[Blank log-linear graph paper (vertical scale 10 to 1000, horizontal scale −0.3 to 0.3): Your solution to Task on page 67]
[Log-linear graph (vertical scale 10 to 1000, horizontal scale −0.3 to 0.3): Answer to Task on page 67]
Use the log-linear graph sheets supplied on the following pages for these Exercises.
Exercises
1. Estimate the values of k and a if y = ka^x represents the following set of data values:
x 0.5 1 2 3 4
y 5.93 8.8 19.36 42.59 93.70
2. Estimate the values of k and a if the relation y = k(a)^{−x} is a good representation for the data
values:
x 2 2.5 3 3.5 4
y 7.9 3.6 1.6 0.7 0.3
Answers
1. k ≈ 4 a ≈ 2.2
2. k ≈ 200 a ≈ 5
[Three blank sheets of log-linear graph paper: vertical axis log y, horizontal axis x]
Contents 7
Matrices
7.1 Introduction to Matrices
7.2 Matrix Multiplication
7.3 Determinants
7.4 The Inverse of a Matrix
Learning outcomes
In this Workbook you will learn about matrices. In the first instance you will learn about the
algebra of matrices: how they can be added, subtracted and multiplied. You will learn
about a characteristic quantity associated with square matrices - the determinant. Using
knowledge of determinants you will learn how to find the inverse of a matrix. Also, a
second method for finding a matrix inverse will be outlined - the Gaussian elimination
method.
A working knowledge of matrices is a vital attribute of any mathematician,
engineer or scientist. You will find that matrices arise in many varied areas of science.
Introduction to
Matrices 7.1
Introduction
When we wish to solve large systems of simultaneous linear equations, which arise for example in the
problem of finding the forces on members of a large framed structure, we can isolate the coefficients
of the variables as a block of numbers called a matrix. There are many other applications of matrices.
In this Section we develop the terminology and basic properties of a matrix.

Prerequisites
Before starting this Section you should . . .
• be familiar with the rules of number algebra

Learning Outcomes
On completion you should be able to . . .
• express a system of linear equations in matrix form
• recognise and use the basic terminology associated with matrices
• carry out addition and subtraction with two given matrices or state that the operation is
not possible
1. Applications of matrices
The solution of simultaneous linear equations is a task frequently occurring in engineering. In electrical
engineering the analysis of circuits provides a ready example.
However the simultaneous equations arise, we need to study two things:
(a) how we can conveniently represent large systems of linear equations
(b) how we might find the solution of such equations.
We shall discover that knowledge of the theory of matrices is an essential mathematical tool in this
area.
Representing simultaneous linear equations

Suppose that we wish to solve the following three equations in three unknowns x_1, x_2 and x_3:
3x_1 + 2x_2 − x_3 = 3
x_1 − x_2 + x_3 = 4
2x_1 + 3x_2 + 4x_3 = 5
We can isolate three facets of this system: the coefficients of x_1, x_2, x_3; the unknowns x_1, x_2, x_3;
and the numbers on the right-hand sides.
Notice that in the system
3x + 2y − z = 3
x−y+z = 4
2x + 3y + 4z = 5
the only difference from the first system is the names given to the unknowns. It can be checked that
the first system has the solution x_1 = 2, x_2 = −1, x_3 = 1. The second system therefore has the
solution x = 2, y = −1, z = 1.
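We can isolate the three facets of the first system by using arrays of numbers and of unknowns:
\[ \begin{bmatrix} 3 & 2 & -1 \\ 1 & -1 & 1 \\ 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} \]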
Even more conveniently we represent the arrays with letters (usually capital letters)
AX = B
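Here, to be explicit, we write
\[ A = \begin{bmatrix} 3 & 2 & -1 \\ 1 & -1 & 1 \\ 2 & 3 & 4 \end{bmatrix}, \qquad X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} \]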
Here A is called the matrix of coefficients, X is called the matrix of unknowns and B is called
the matrix of constants.
If we now append to A the column of right-hand sides we obtain the augmented matrix for the
system:
\[ \begin{bmatrix} 3 & 2 & -1 & 3 \\ 1 & -1 & 1 & 4 \\ 2 & 3 & 4 & 5 \end{bmatrix} \]
The order of the entries, or elements, is crucial. For example, all the entries in the second row relate
to the second equation, the entries in column 1 are the coefficients of the unknown x_1, and those in
the last column are the constants on the right-hand sides of the equations.
In particular, the entry in row 2 column 3 is the coefficient of x_3 in equation 2.
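The matrix form can be checked by substituting the stated solution back into the equations. The following Python sketch stores the arrays as plain lists; the helper names mat_vec and augmented are assumptions made for this illustration and matrix multiplication itself is treated later in the Workbook.

```python
# Matrix of coefficients and matrix of constants from the first system.
A = [[3, 2, -1],
     [1, -1, 1],
     [2, 3, 4]]
B = [3, 4, 5]

def mat_vec(matrix, vector):
    """Evaluate the left-hand side of each equation for the given unknowns."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def augmented(matrix, constants):
    """Append the column of right-hand sides to the matrix of coefficients."""
    return [row + [c] for row, c in zip(matrix, constants)]

X = [2, -1, 1]                 # the stated solution x_1, x_2, x_3
print(mat_vec(A, X))           # left-hand sides evaluated at the solution
print(augmented(A, B))
```

Because Python lists are indexed from 0, the entry in row 2, column 3 of the augmented matrix is augmented(A, B)[1][2].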
Representing networks
Shortest-distance problems are important in communications study. Figure 1 illustrates schematically
a system of four towns connected by a set of roads.
Figure 1: Four towns a, b, c and d connected by a set of roads
The system can be represented by the matrix
\[ \begin{array}{c|cccc} & a & b & c & d \\ \hline a & 0 & 1 & 0 & 0 \\ b & 1 & 0 & 1 & 1 \\ c & 0 & 1 & 0 & 1 \\ d & 0 & 1 & 1 & 0 \end{array} \]
The row refers to the town from which the road starts and the column refers to the town where the
road ends. An entry of 1 indicates that two towns are directly connected by a road (for example b
and d) and an entry of zero indicates that there is no direct road (for example a and c). Of course,
if there is a road from b to d (say) it is also a road from d to b.
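The road matrix can be stored and queried directly. This Python sketch is an illustration only; the names towns, roads, connected and is_symmetric are assumptions for the example. The symmetry check expresses the remark that a road from b to d is also a road from d to b.

```python
towns = ["a", "b", "c", "d"]

# Row = town where the road starts, column = town where it ends.
roads = [[0, 1, 0, 0],
         [1, 0, 1, 1],
         [0, 1, 0, 1],
         [0, 1, 1, 0]]

def connected(start, end):
    """True if the two towns are directly connected by a road."""
    return roads[towns.index(start)][towns.index(end)] == 1

def is_symmetric(matrix):
    """True if entry (i, j) equals entry (j, i) for every pair of indices."""
    size = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(size) for j in range(size))

print(connected("b", "d"), connected("a", "c"), is_symmetric(roads))
```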
In this Section we shall develop some basic ideas about matrices.
2. Definitions
An array of numbers, rectangular in shape, is called a matrix. The first matrix below has 3 rows
and 2 columns and is said to be a ‘3 by 2’ matrix (written 3 × 2). The second matrix is a ‘2 by 4’
matrix (written 2 × 4).
\[ \begin{bmatrix} 1 & 4 \\ -2 & 3 \\ 2 & 1 \end{bmatrix} \qquad \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 9 \end{bmatrix} \]
The general 3 × 3 matrix can be written
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \]
where a_{ij} denotes the element in row i, column j.
For example in the matrix:
\[ A = \begin{bmatrix} 0 & -1 & -3 \\ 0 & 6 & -12 \\ 5 & 7 & 123 \end{bmatrix} \]
a_{11} = 0, a_{12} = −1, a_{13} = −3, ... a_{22} = 6, ... a_{32} = 7, a_{33} = 123
Key Point 1
The General Matrix
A general m × n matrix A has m rows and n columns.
The entries in the matrix A are called the elements of A.
In matrix A the element in row i and column j is denoted by a_{ij}.
A matrix with only one column is called a column vector (or column matrix).
For example, \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} and \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} are both 3 × 1 column vectors.
A matrix with only one row is called a row vector (or row matrix). For example [2, −3, 8, 9] is a
1 × 4 row vector. Often the entries in a row vector are separated by commas for clarity.
Square matrices
When the number of rows is the same as the number of columns, i.e. m = n, the matrix is said to
be square and of order n (or m).
• In an n × n square matrix A, the leading diagonal (or principal diagonal) is the ‘north-west
to south-east’ collection of elements $a_{11}, a_{22}, \ldots, a_{nn}$. The sum of the elements in the leading
diagonal of A is called the trace of the matrix, denoted by tr(A).
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \qquad \mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn} \]
• A square matrix in which all the elements below the leading diagonal are zero is called an
upper triangular matrix, often denoted by U.
\[ U = \begin{bmatrix} u_{11} & u_{12} & \cdots & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & \cdots & u_{2n} \\ 0 & 0 & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & u_{nn} \end{bmatrix} \qquad u_{ij} = 0 \text{ when } i > j \]
• A square matrix in which all the elements above the leading diagonal are zero is called a lower
triangular matrix, often denoted by L.
\[ L = \begin{bmatrix} l_{11} & 0 & 0 & \cdots & 0 \\ l_{21} & l_{22} & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & 0 \\ l_{n1} & l_{n2} & \cdots & \cdots & l_{nn} \end{bmatrix} \qquad l_{ij} = 0 \text{ when } i < j \]
• A square matrix where all the non-zero elements are along the leading diagonal is called a
diagonal matrix, often denoted by D.
\[ D = \begin{bmatrix} d_{11} & 0 & 0 & \cdots & 0 \\ 0 & d_{22} & 0 & \cdots & 0 \\ 0 & 0 & \ddots & \ddots & 0 \\ 0 & 0 & 0 & \cdots & d_{nn} \end{bmatrix} \qquad d_{ij} = 0 \text{ when } i \neq j \]
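Some examples of matrices and their classification
$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ is 2 × 3. It is not square.
$B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ is 2 × 2. It is square.
Also, tr(A) does not exist, and tr(B) = 1 + 4 = 5.
$C = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -2 & -5 \\ 0 & 0 & 1 \end{bmatrix}$ and $D = \begin{bmatrix} 4 & 0 & 3 \\ 0 & -2 & 5 \\ 0 & 0 & 1 \end{bmatrix}$ are both 3 × 3, square and upper triangular.
Also, tr(C) = 0 and tr(D) = 3.
$E = \begin{bmatrix} 1 & 0 & 0 \\ 2 & -2 & 0 \\ 3 & -5 & 1 \end{bmatrix}$ and $F = \begin{bmatrix} -1 & 0 & 0 \\ 1 & 4 & 0 \\ 0 & 1 & 1 \end{bmatrix}$ are both 3 × 3, square and lower triangular.
Also, tr(E) = 0 and tr(F) = 4.
$G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -3 \end{bmatrix}$ and $H = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ are both 3 × 3, square and diagonal.
Also, tr(G) = 0 and tr(H) = 6.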
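The classifications above can be checked mechanically. The following is a minimal sketch in Python, with matrices held as lists of rows; the function names are illustrative choices, not notation from the workbook.

```python
def is_square(m):
    """True if the number of rows equals the number of columns."""
    return all(len(row) == len(m) for row in m)

def trace(m):
    """Sum of the leading-diagonal elements; defined only for square matrices."""
    if not is_square(m):
        raise ValueError("the trace is defined only for square matrices")
    return sum(m[i][i] for i in range(len(m)))

def is_upper_triangular(m):
    """Square, and every element below the leading diagonal (i > j) is zero."""
    return is_square(m) and all(m[i][j] == 0 for i in range(len(m)) for j in range(i))

def is_lower_triangular(m):
    """Square, and every element above the leading diagonal (i < j) is zero."""
    n = len(m)
    return is_square(m) and all(m[i][j] == 0 for i in range(n) for j in range(i + 1, n))

def is_diagonal(m):
    """Diagonal means both upper and lower triangular."""
    return is_upper_triangular(m) and is_lower_triangular(m)

C = [[1, 2, 3], [0, -2, -5], [0, 0, 1]]
F = [[-1, 0, 0], [1, 4, 0], [0, 1, 1]]
H = [[4, 0, 0], [0, 2, 0], [0, 0, 0]]
```

With these definitions C is reported as upper triangular, F as lower triangular and H as diagonal, and `trace` raises an error for a non-square matrix such as the 2 × 3 matrix A.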
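Task
Classify the following matrices (and, where possible, find the trace):
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ -1 & -3 & -2 & -4 \end{bmatrix} \quad C = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix} \]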
Your solution
Answer
A is 3 × 2, B is 3 × 4, C is 4 × 4 and square.
The trace is not defined for A or B. However, tr(C) = 34.
Task
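Classify the following matrices:
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \quad C = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \quad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]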
Your solution
Answer
A is 3 × 3 and square, B is 3 × 3 lower triangular, C is 3 × 3 upper triangular and D is 3 × 3
diagonal.
Equality of matrices
As we noted earlier, the terms in a matrix are called the elements of the matrix.
The elements of the matrix $A = \begin{bmatrix} 1 & 2 \\ -1 & -4 \end{bmatrix}$ are 1, 2, −1, −4
We say two matrices A, B are equal to each other only if A and B have the same number of rows
and the same number of columns and if each element of A is equal to the corresponding element of
B. When this is the case we write A = B. For example if the following two matrices are equal:
\[ A = \begin{bmatrix} 1 & \alpha \\ -1 & -\beta \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 2 \\ -1 & -4 \end{bmatrix} \]
then we can conclude that α = 2 and β = 4.
The unit matrix
The unit matrix or the identity matrix, denoted by $I_n$ (or, often, simply I), is the diagonal matrix
of order n in which all diagonal elements are 1.
Hence, for example, $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
The zero matrix

The zero matrix or null matrix is the matrix all of whose elements are zero. There is a zero matrix
for every size. For example the 2 × 3 and 2 × 2 cases are:
\[ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]
Zero matrices, of whatever size, are denoted by 0.
The transpose of a matrix

The transpose of a matrix A is a matrix where the rows of A become the columns of the new matrix
and the columns of A become its rows. For example
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \quad \text{becomes} \quad \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \]
The resulting matrix is called the transposed matrix of A and denoted $A^T$. In the previous example
it is clear that $A^T$ is not equal to A since the matrices are of different sizes. If A is square n × n
then $A^T$ will also be n × n.
Example 1
Find the transpose of the matrix $B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$
Solution
Interchanging rows with columns we find
\[ B^T = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} \]
Both matrices are 3 × 3 but B and $B^T$ are clearly different.
When the transpose of a matrix is equal to the original matrix i.e. $A^T = A$, then we say that the
matrix A is symmetric. (This is because it has symmetry about the leading diagonal.)
In Example 1 B is not symmetric.
Example 2
Show that the matrix $C = \begin{bmatrix} 1 & -2 & 3 \\ -2 & 4 & -5 \\ 3 & -5 & 6 \end{bmatrix}$ is symmetric.
Solution
Taking the transpose of C:
\[ C^T = \begin{bmatrix} 1 & -2 & 3 \\ -2 & 4 & -5 \\ 3 & -5 & 6 \end{bmatrix}. \]
Clearly $C^T = C$ and so C is a symmetric matrix. Notice how the leading diagonal acts as a “mirror”;
for example $c_{12} = -2$ and $c_{21} = -2$. In general $c_{ij} = c_{ji}$ for a symmetric matrix.
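Transposition and the test for symmetry are short to express in code. The following is a minimal sketch in Python, assuming a matrix is stored as a list of rows; `transpose` and `is_symmetric` are illustrative names.

```python
def transpose(m):
    """Rows of m become the columns of the result."""
    return [list(column) for column in zip(*m)]

def is_symmetric(m):
    """A matrix is symmetric when it equals its own transpose."""
    return transpose(m) == m

B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
C = [[1, -2, 3], [-2, 4, -5], [3, -5, 6]]

print(transpose(B))     # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(is_symmetric(B))  # False
print(is_symmetric(C))  # True
```

A non-square matrix can never equal its transpose, so `is_symmetric` returns False for it without a separate size check.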
Task
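Find the transpose of each of the following matrices. Which are symmetric?
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 2 \\ 4 & 5 \\ 7 & 8 \end{bmatrix}, \quad E = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]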
Your solution
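Answer
\[ A^T = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}, \quad B^T = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad C^T = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} = C, \text{ symmetric} \]
\[ D^T = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \end{bmatrix}, \quad E^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = E, \text{ symmetric} \]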
3. Addition and subtraction of matrices
Under what circumstances can we add two matrices i.e. define A + B for given matrices A, B?
Consider
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 & 6 & 9 \\ 7 & 8 & 10 \end{bmatrix} \]
There is no sensible way to define A + B in this case since A and B are different sizes.
However, if we consider matrices of the same size then addition can be defined in a very natural
way. Consider $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$. The ‘natural’ way to add A and B is to add
corresponding elements together:
\[ A + B = \begin{bmatrix} 1+5 & 2+6 \\ 3+7 & 4+8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix} \]
In general if A and B are both m × n matrices, with elements $a_{ij}$ and $b_{ij}$ respectively, then their
sum is a matrix C, also m × n, such that the elements of C are
\[ c_{ij} = a_{ij} + b_{ij} \qquad i = 1, 2, \ldots, m \quad j = 1, 2, \ldots, n \]
In the above example
$c_{11} = a_{11} + b_{11} = 1 + 5 = 6$, $c_{21} = a_{21} + b_{21} = 3 + 7 = 10$ and so on.
Subtraction of matrices follows along similar lines:
\[ D = A - B = \begin{bmatrix} 1-5 & 2-6 \\ 3-7 & 4-8 \end{bmatrix} = \begin{bmatrix} -4 & -4 \\ -4 & -4 \end{bmatrix} \]
4. Multiplication of a matrix by a number

There is also a natural way of defining the product of a matrix with a number. Using the matrix A
above, we note that
\[ A + A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} \]
What we see is that 2A (which is the shorthand notation for A + A) is obtained by multiplying every
element of A by 2.
In general if A is an m × n matrix with typical element $a_{ij}$ then the product of a number k with A
is written kA and has the corresponding elements $k a_{ij}$.
Hence, again using the matrix A above,
\[ 7A = 7 \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 7 & 14 \\ 21 & 28 \end{bmatrix} \]
Similarly:
\[ -3A = \begin{bmatrix} -3 & -6 \\ -9 & -12 \end{bmatrix} \]
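The element-by-element rules for addition, subtraction and multiplication by a number can be sketched as follows. This is an illustrative Python version, assuming matrices are stored as lists of rows; the function names `add`, `subtract` and `scale` are not from the workbook.

```python
def same_size(a, b):
    """True if a and b have the same number of rows and of columns."""
    return len(a) == len(b) and all(len(r) == len(s) for r, s in zip(a, b))

def add(a, b):
    """Element-wise sum c_ij = a_ij + b_ij, defined only for equal sizes."""
    if not same_size(a, b):
        raise ValueError("A + B is defined only for matrices of the same size")
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def scale(k, a):
    """Multiply every element of a by the number k."""
    return [[k * x for x in row] for row in a]

def subtract(a, b):
    """A - B computed as A + (-1)B."""
    return add(a, scale(-1, b))

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(add(A, B))       # [[6, 8], [10, 12]]
print(subtract(A, B))  # [[-4, -4], [-4, -4]]
print(scale(7, A))     # [[7, 14], [21, 28]]
```

Calling `add` with the 2 × 2 matrix A and a 2 × 3 matrix raises an error, mirroring the remark that there is no sensible way to add matrices of different sizes.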
Task
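For the following matrices find, where possible, A + B, A − B, B − A, 2A.
1. $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
2. $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 1 & 1 \\ -1 & -1 & -1 \\ 1 & 1 & 1 \end{bmatrix}$
3. $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$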
Your solution
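Answer
1. $A + B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \quad A - B = \begin{bmatrix} 0 & 1 \\ 2 & 3 \end{bmatrix} \quad B - A = \begin{bmatrix} 0 & -1 \\ -2 & -3 \end{bmatrix} \quad 2A = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}$
2. $A + B = \begin{bmatrix} 2 & 3 & 4 \\ 3 & 4 & 5 \\ 8 & 9 & 10 \end{bmatrix} \quad A - B = \begin{bmatrix} 0 & 1 & 2 \\ 5 & 6 & 7 \\ 6 & 7 & 8 \end{bmatrix} \quad B - A = \begin{bmatrix} 0 & -1 & -2 \\ -5 & -6 & -7 \\ -6 & -7 & -8 \end{bmatrix}$
$2A = \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \\ 14 & 16 & 18 \end{bmatrix}$
3. None of A + B, A − B, B − A, are defined. $2A = \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \\ 14 & 16 & 18 \end{bmatrix}$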
5. Some simple matrix properties
Using the definition of matrix addition described above we can easily verify the following properties
of matrix addition:
Key Point 2
Basic Properties of Matrices

Matrix addition is commutative: A + B = B + A
Matrix addition is associative: A + (B + C) = (A + B) + C
The distributive law holds: k(A + B) = k A + k B
These Key Point results follow from the fact that aij + bij = bij + aij etc.
We can also show that the transpose of a matrix satisfies the following simple properties:
Key Point 3
Properties of Transposed Matrices
$(A + B)^T = A^T + B^T$
$(A - B)^T = A^T - B^T$
$(A^T)^T = A$
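Example 3
Show that $(A^T)^T = A$ for the matrix $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$
Solution
\[ A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \quad \text{so that} \quad (A^T)^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = A \]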
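Task
For the matrices $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ verify that
(i) $3(A + B) = 3A + 3B$ (ii) $(A - B)^T = A^T - B^T$.
Your solution
Answer
(i) $A + B = \begin{bmatrix} 2 & 1 \\ 2 & 5 \end{bmatrix}$; $3(A + B) = \begin{bmatrix} 6 & 3 \\ 6 & 15 \end{bmatrix}$; $3A = \begin{bmatrix} 3 & 6 \\ 9 & 12 \end{bmatrix}$;
$3B = \begin{bmatrix} 3 & -3 \\ -3 & 3 \end{bmatrix}$; $3A + 3B = \begin{bmatrix} 6 & 3 \\ 6 & 15 \end{bmatrix}$.
(ii) $A - B = \begin{bmatrix} 0 & 3 \\ 4 & 3 \end{bmatrix}$; $(A - B)^T = \begin{bmatrix} 0 & 4 \\ 3 & 3 \end{bmatrix}$; $A^T = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$;
$B^T = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$; $A^T - B^T = \begin{bmatrix} 0 & 4 \\ 3 & 3 \end{bmatrix}$.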
Exercises
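1. Find the coefficient matrix A of the system:
$2x_1 + 3x_2 - x_3 = 1$
$4x_1 + 4x_2 = 0$
$2x_1 - x_2 - x_3 = 0$
If $B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 0 & 1 \end{bmatrix}$ determine $(3A^T - B)^T$.
2. If $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ and $B = \begin{bmatrix} -1 & 4 \\ 0 & 1 \\ 2 & 7 \end{bmatrix}$ verify that $3(A^T - B) = (3A - 3B^T)^T$.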
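Answers
1. $A = \begin{bmatrix} 2 & 3 & -1 \\ 4 & 4 & 0 \\ 2 & -1 & -1 \end{bmatrix}$, $A^T = \begin{bmatrix} 2 & 4 & 2 \\ 3 & 4 & -1 \\ -1 & 0 & -1 \end{bmatrix}$, $3A^T = \begin{bmatrix} 6 & 12 & 6 \\ 9 & 12 & -3 \\ -3 & 0 & -3 \end{bmatrix}$
$3A^T - B = \begin{bmatrix} 5 & 10 & 3 \\ 5 & 7 & -9 \\ -3 & 0 & -4 \end{bmatrix}$, $(3A^T - B)^T = \begin{bmatrix} 5 & 5 & -3 \\ 10 & 7 & 0 \\ 3 & -9 & -4 \end{bmatrix}$
2. $A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$, $A^T - B = \begin{bmatrix} 2 & 0 \\ 2 & 4 \\ 1 & -1 \end{bmatrix}$, $3(A^T - B) = \begin{bmatrix} 6 & 0 \\ 6 & 12 \\ 3 & -3 \end{bmatrix}$
$B^T = \begin{bmatrix} -1 & 0 & 2 \\ 4 & 1 & 7 \end{bmatrix}$, $3A - 3B^T = \begin{bmatrix} 3 & 6 & 9 \\ 12 & 15 & 18 \end{bmatrix} - \begin{bmatrix} -3 & 0 & 6 \\ 12 & 3 & 21 \end{bmatrix} = \begin{bmatrix} 6 & 6 & 3 \\ 0 & 12 & -3 \end{bmatrix}$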

Matrix Multiplication 7.2

Introduction
When we wish to multiply matrices together we have to ensure that the operation is possible - and
this is not always so. Also, unlike number arithmetic and algebra, even when the product exists the
order of multiplication may have an effect on the result. In this Section we pick our way through the
minefield of matrix multiplication.

Prerequisites
• understand the concept of a matrix and associated terms

Learning Outcomes
• decide when the product AB exists
• recognise that AB ≠ BA in most cases
• carry out the multiplication AB
• explain what is meant by the identity matrix I
1. Multiplying row matrices and column matrices together
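Let A be a 1 × 2 row matrix and B be a 2 × 1 column matrix:
\[ A = \begin{bmatrix} a & b \end{bmatrix} \qquad B = \begin{bmatrix} c \\ d \end{bmatrix} \]
The product of these two matrices is written AB and is the 1 × 1 matrix defined by:
\[ AB = \begin{bmatrix} a & b \end{bmatrix} \times \begin{bmatrix} c \\ d \end{bmatrix} = [ac + bd] \]
Note that corresponding elements are multiplied together and the results are then added together.
For example
\[ \begin{bmatrix} 2 & -3 \end{bmatrix} \times \begin{bmatrix} 6 \\ 5 \end{bmatrix} = [12 - 15] = [-3] \]
This matrix product is easily generalised to other row and column matrices. For example if C is a
1 × 4 row matrix and D is a 4 × 1 column matrix:
\[ C = \begin{bmatrix} 2 & -4 & 3 & 2 \end{bmatrix} \qquad D = \begin{bmatrix} 3 \\ 3 \\ -2 \\ 5 \end{bmatrix} \]
then we define the product of C with D as
\[ CD = \begin{bmatrix} 2 & -4 & 3 & 2 \end{bmatrix} \times \begin{bmatrix} 3 \\ 3 \\ -2 \\ 5 \end{bmatrix} = [6 - 12 - 6 + 10] = [-2] \]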
The only requirement is that the number of elements of the row matrix is the same as the number
of elements of the column matrix.
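The row-by-column rule can be written as a single sum of products. The following is a minimal sketch in Python; the function name `row_times_column` is an illustrative choice, and the 1 × 1 result is returned as a plain number.

```python
def row_times_column(row, column):
    """Multiply corresponding elements and add the results."""
    if len(row) != len(column):
        raise ValueError("row and column must have the same number of elements")
    return sum(r * c for r, c in zip(row, column))

print(row_times_column([2, -3], [6, 5]))               # -3
print(row_times_column([2, -4, 3, 2], [3, 3, -2, 5]))  # -2
```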
2. Multiplying two 2×2 matrices

If A and B are two matrices then the product AB is obtained by multiplying the rows of A with the
columns of B in the manner described above. This will only be possible if the number of elements
in the rows of A is the same as the number of elements in the columns of B. In particular, we
define the product of two 2 × 2 matrices A and B to be another 2 × 2 matrix C whose elements are
calculated according to the following pattern

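\[ \underbrace{\begin{bmatrix} a & b \\ c & d \end{bmatrix}}_{A} \times \underbrace{\begin{bmatrix} w & x \\ y & z \end{bmatrix}}_{B} = \underbrace{\begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{bmatrix}}_{C} \]
The rule for calculating the elements of C is described in the following Key Point: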
Key Point 4
Matrix Product
AB = C
The element in the ith row and j th column of C is obtained
by multiplying the ith row of A with the j th column of B.
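We illustrate this construction for the abstract matrices A and B given above:
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \times \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} w \\ y \end{bmatrix} & \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} x \\ z \end{bmatrix} \\[1ex] \begin{bmatrix} c & d \end{bmatrix} \begin{bmatrix} w \\ y \end{bmatrix} & \begin{bmatrix} c & d \end{bmatrix} \begin{bmatrix} x \\ z \end{bmatrix} \end{bmatrix} = \begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \end{bmatrix} \]
For example
\[ \begin{bmatrix} 2 & -1 \\ 3 & -2 \end{bmatrix} \times \begin{bmatrix} 2 & 4 \\ 6 & 1 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 2 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 2 & -1 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} \\[1ex] \begin{bmatrix} 3 & -2 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 3 & -2 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} -2 & 7 \\ -6 & 10 \end{bmatrix} \]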
Task
Find the product AB where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix}$
First write down row 1 of A, column 2 of B and form the element in row 1, column 2 of the product AB:
Your solution
Answer
[1, 2] and $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$; their product is 1 × (−1) + 2 × 1 = 1.
Now repeat the process for row 2 of A, column 1 of B:
Your solution
Answer
[3, 4] and $\begin{bmatrix} 1 \\ -2 \end{bmatrix}$. Their product is 3 × 1 + 4 × (−2) = −5
Finally find the two other elements of C = AB and hence write down the matrix C:
Your solution
Answer
Row 1 column 1 is 1 × 1 + 2 × (−2) = −3. Row 2 column 2 is 3 × (−1) + 4 × 1 = 1
\[ C = \begin{bmatrix} -3 & 1 \\ -5 & 1 \end{bmatrix} \]
Clearly, matrix multiplication is tricky and not at all ‘natural’. However, it is a very important
mathematical procedure with many engineering applications so must be mastered.
3. Some surprising results

We have already calculated the product AB where
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix} \]
Now complete the following task in which you are asked to determine the product BA, i.e. with the
matrices in reverse order.
Task
For matrices $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix}$ form the products of
• row 1 of B and column 1 of A
• row 1 of B and column 2 of A
• row 2 of B and column 1 of A
• row 2 of B and column 2 of A
Now write down the matrix BA:

Your solution
Answer
row 1, column 1 is 1 × 1 + (−1) × 3 = −2
row 1, column 2 is 1 × 2 + (−1) × 4 = −2
row 2, column 1 is −2 × 1 + 1 × 3 = 1
row 2, column 2 is −2 × 2 + 1 × 4 = 0
BA is $\begin{bmatrix} -2 & -2 \\ 1 & 0 \end{bmatrix}$
It is clear that AB and BA are not in general the same. In fact it is the exception that AB = BA.
In the special case in which AB = BA we say that the matrices A and B commute.
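The two products can be compared directly in code. The following is a minimal sketch in Python, assuming matrices are stored as lists of rows; `matmul` is an illustrative helper that applies the row-by-column rule of Key Point 4.

```python
def matmul(a, b):
    """Element (i, j) of the product is row i of a multiplied with column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[1, 2], [3, 4]]
B = [[1, -1], [-2, 1]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)        # [[-3, 1], [-5, 1]]
print(BA)        # [[-2, -2], [1, 0]]
print(AB == BA)  # False: A and B do not commute
```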
Task
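Calculate AB and BA where
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
Your solution
Answer
\[ AB = BA = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
We call B the 2 × 2 zero matrix, written 0, so that A × 0 = 0 × A = 0 for any 2 × 2 matrix A.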
Now in the multiplication of numbers, the equation

ab = 0
implies that either a is zero or b is zero or both are zero. The following task shows that this is not
necessarily true for matrices.
Task
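Carry out the multiplication AB where
\[ A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \]
Your solution
Answer
\[ AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]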
Here we have a zero product yet neither A nor B is the zero matrix! Thus the statement AB = 0
does not allow us to conclude that either A = 0 or B = 0.
Task
Find the product AB where $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
Your solution
Answer
\[ AB = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = A \]
The matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is called the identity matrix or unit matrix of order 2, and is usually denoted
by the symbol I. (Strictly we should write $I_2$, to indicate the size.) I plays the same role in matrix
multiplication as the number 1 does in number multiplication.
Hence just as a × 1 = 1 × a = a for any number a, so AI = IA = A for any matrix A.
4. Multiplying two 3×3 matrices

The definition of the product C = AB where A and B are two 3 × 3 matrices is as follows
\[ C = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} r & s & t \\ u & v & w \\ x & y & z \end{bmatrix} = \begin{bmatrix} ar + bu + cx & as + bv + cy & at + bw + cz \\ dr + eu + fx & ds + ev + fy & dt + ew + fz \\ gr + hu + ix & gs + hv + iy & gt + hw + iz \end{bmatrix} \]
This looks a rather daunting amount of algebra but in fact the construction of the matrix on the
right-hand side is straightforward if we follow the simple rule from Key Point 4 that the element in
the ith row and j th column of C is obtained by multiplying the ith row of A with the j th column of
B.
For example, to obtain the element in row 2, column 3 of C we take row 2 of A: [d, e, f ] and multiply
it with column 3 of B in the usual way to produce [dt + ew + f z].
By repeating this process we obtain every element of C.
Task
Calculate $AB = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 4 & 0 \\ 1 & 5 & -2 \end{bmatrix} \begin{bmatrix} 2 & -1 & 3 \\ 1 & -2 & 1 \\ 0 & 3 & -2 \end{bmatrix}$
First find the element in row 2 column 1 of the product:
Your solution
Answer
Row 2 of A is (3, 4, 0); column 1 of B is $\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}$
The combination required is 3 × 2 + 4 × 1 + (0) × (0) = 10.
Now complete the multiplication to find all the elements of the matrix AB:
Your solution
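Answer
In full detail, the elements of AB are:
\[ \begin{bmatrix} 1 \times 2 + 2 \times 1 + (-1) \times 0 & 1 \times (-1) + 2 \times (-2) + (-1) \times 3 & 1 \times 3 + 2 \times 1 + (-1) \times (-2) \\ 3 \times 2 + 4 \times 1 + 0 \times 0 & 3 \times (-1) + 4 \times (-2) + 0 \times 3 & 3 \times 3 + 4 \times 1 + 0 \times (-2) \\ 1 \times 2 + 5 \times 1 + (-2) \times 0 & 1 \times (-1) + 5 \times (-2) + (-2) \times 3 & 1 \times 3 + 5 \times 1 + (-2) \times (-2) \end{bmatrix} \]
i.e. $AB = \begin{bmatrix} 4 & -8 & 7 \\ 10 & -11 & 13 \\ 7 & -17 & 12 \end{bmatrix}$
The 3 × 3 unit matrix is: $I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and as in the 2 × 2 case this has the property that
AI = IA = A
The 3 × 3 zero matrix is $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$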
5. Multiplying non-square matrices together
So far, we have just looked at multiplying 2 × 2 matrices and 3 × 3 matrices. However, products
between non-square matrices may be possible.
Key Point 5
General Matrix Products

The general rule is that an n × p matrix A can be multiplied
by a p × m matrix B to form an n × m matrix AB = C.
In words:
For the matrix product AB to be defined the number
of columns of A must equal the number of rows of B.
The elements of C are found in the usual way:
The element in the ith row and j th column of C is obtained
by multiplying the ith row of A with the j th column of B.
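Key Point 5 translates into a size check followed by the row-by-column rule. The following is a minimal sketch in Python; the helper names `shape` and `matmul`, and the two sample matrices, are illustrative assumptions rather than examples from the workbook.

```python
def shape(m):
    """Return (number of rows, number of columns)."""
    return len(m), len(m[0])

def matmul(a, b):
    """Product of an n x p matrix a and a p x m matrix b, giving an n x m matrix."""
    n, p = shape(a)
    rows_b, m = shape(b)
    if p != rows_b:
        raise ValueError("columns of A must equal rows of B")
    # Element (i, j): row i of a multiplied with column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]

A = [[1, 2], [3, 4]]        # 2 x 2 (illustrative)
B = [[1, 2, 3], [4, 5, 6]]  # 2 x 3 (illustrative)

print(matmul(A, B))  # [[9, 12, 15], [19, 26, 33]]
```

Here AB exists and is 2 × 3, whereas `matmul(B, A)` raises an error because B has three columns and A has only two rows.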
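Example 4
Find the product AB if $A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 5 \\ 6 & 1 \\ 4 & 3 \end{bmatrix}$
Solution
Since A is a 2 × 3 and B is a 3 × 2 matrix the product AB can be found and results in a 2 × 2
matrix.
\[ AB = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 3 & 4 \end{bmatrix} \times \begin{bmatrix} 2 & 5 \\ 6 & 1 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 1 & 2 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \\ 4 \end{bmatrix} & \begin{bmatrix} 1 & 2 & 2 \end{bmatrix} \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix} \\[1ex] \begin{bmatrix} 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \\ 4 \end{bmatrix} & \begin{bmatrix} 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} 5 \\ 1 \\ 3 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 22 & 13 \\ 38 & 25 \end{bmatrix} \]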
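Task
Obtain the product AB if $A = \begin{bmatrix} 1 & -2 \\ 2 & -3 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 4 & 1 \\ 6 & 1 & 0 \end{bmatrix}$
Your solution
Answer
AB is a 2 × 3 matrix.
\[ AB = \begin{bmatrix} 1 & -2 \\ 2 & -3 \end{bmatrix} \times \begin{bmatrix} 2 & 4 & 1 \\ 6 & 1 & 0 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} & \begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\[1ex] \begin{bmatrix} 2 & -3 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \end{bmatrix} & \begin{bmatrix} 2 & -3 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} & \begin{bmatrix} 2 & -3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} -10 & 2 & 1 \\ -14 & 5 & 2 \end{bmatrix} \]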
6. The rules of matrix multiplication

It is worth noting that the process of multiplication can be continued to form products of more than
two matrices.
Although two matrices may not commute (i.e. in general AB ≠ BA) the associative law always
holds i.e. for matrices which can be multiplied,
A(BC) = (AB)C.
The general principle is keep the left to right order, but within that limitation any two adjacent
matrices can be multiplied.
It is important to note that it is not always possible to multiply together any two given matrices.
For example if $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$ then $AB = \begin{bmatrix} a + 2d & b + 2e & c + 2f \\ 3a + 4d & 3b + 4e & 3c + 4f \end{bmatrix}$.
However $BA = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ is not defined since each row of B has three elements
whereas each column of A has two elements and we cannot multiply these elements in the manner
described.
 
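Task
Given $A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$
State which of the products AB, BA, AC, CA, BC, CB, (AB)C, A(CB) is defined and state
the size (n × m) of the product when defined.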
Your solution
AB
BA
AC
CA
BC
CB
(AB)C
A(CB)
Answer
Product   Sizes of the factors    Result
AB        (2 × 3)(2 × 2)          not possible
BA        (2 × 2)(2 × 3)          possible; result 2 × 3
AC        (2 × 3)(3 × 2)          possible; result 2 × 2
CA        (3 × 2)(2 × 3)          possible; result 3 × 3
BC        (2 × 2)(3 × 2)          not possible
CB        (3 × 2)(2 × 2)          possible; result 3 × 2
(AB)C                             not possible, AB not defined
A(CB)     (2 × 3)(3 × 2)          possible; result 2 × 2
We now list together some properties of matrix multiplication and compare them with corresponding
properties for multiplication of numbers.
Key Point 6
Matrix algebra                              Number algebra
A(B + C) = AB + AC                          a(b + c) = ab + ac
AB ≠ BA in general                          ab = ba
A(BC) = (AB)C                               a(bc) = (ab)c
AI = IA = A                                 1.a = a.1 = a
A0 = 0A = 0                                 0.a = a.0 = 0
AB may not be possible                      ab is always possible
AB = 0 does not imply A = 0 or B = 0        ab = 0 → a = 0 or b = 0
Application of matrices to networks

A network is a collection of points (nodes) some of which are connected together by lines (paths).
The information contained in a network can be conveniently stored in the form of a matrix.
Example 5
Petrol is delivered to terminals T1 and T2 . They distribute the fuel to 3 storage
depots (S1 , S2 , S3 ). The network diagram below shows what fraction of the fuel
goes from each terminal to the three storage depots. In turn the 3 depots supply
fuel to 4 petrol stations (P1 , P2 , P3 , P4 ) as shown in Figure 2:
[Figure 2: network diagram showing the fraction of fuel passing from terminals T1, T2 to depots S1, S2, S3 and from the depots to petrol stations P1, P2, P3, P4]
Show how this situation may be described using matrices.
Solution
Denote the amount of fuel, in litres, flowing from T1 by t1 and from T2 by t2 and the quantity being
received at Si by si for i = 1, 2, 3. This situation is described in the following diagram:
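[Diagram: fractions of fuel passing from terminals T1, T2 to depots S1, S2, S3]
From this diagram we see that
\[ \begin{aligned} s_1 &= 0.4t_1 + 0.5t_2 \\ s_2 &= 0.4t_1 + 0.2t_2 \\ s_3 &= 0.2t_1 + 0.3t_2 \end{aligned} \qquad \text{or, in matrix form:} \qquad \begin{bmatrix} s_1 \\ s_2 \\ s_3 \end{bmatrix} = \begin{bmatrix} 0.4 & 0.5 \\ 0.4 & 0.2 \\ 0.2 & 0.3 \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} \]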
Solution (contd.)
In turn the 3 depots supply fuel to 4 petrol stations as shown in the next diagram:
[Diagram: fractions of fuel passing from depots S1, S2, S3 to petrol stations P1, P2, P3, P4]
If the petrol stations receive $p_1, p_2, p_3, p_4$ litres respectively then from the diagram we have:
\[ \begin{aligned} p_1 &= 0.6s_1 + 0.2s_2 \\ p_2 &= 0.2s_1 + 0.5s_2 \\ p_3 &= 0.2s_1 + 0.2s_2 + 0.4s_3 \\ p_4 &= 0.1s_2 + 0.6s_3 \end{aligned} \qquad \text{or, in matrix form:} \qquad \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \end{bmatrix} = \begin{bmatrix} 0.6 & 0.2 & 0 \\ 0.2 & 0.5 & 0 \\ 0.2 & 0.2 & 0.4 \\ 0 & 0.1 & 0.6 \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \\ s_3 \end{bmatrix} \]
Combining the equations, substituting expressions for $s_1, s_2, s_3$ in the equations for $p_1, p_2, p_3, p_4$
we get:
\[ \begin{aligned} p_1 &= 0.6s_1 + 0.2s_2 \\ &= 0.6(0.4t_1 + 0.5t_2) + 0.2(0.4t_1 + 0.2t_2) \\ &= 0.32t_1 + 0.34t_2 \end{aligned} \]
with similar results for $p_2$, $p_3$ and $p_4$.

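This is equivalent to combining the two networks. The results can be obtained more easily by
multiplying the matrices:
\[ \begin{aligned} \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \end{bmatrix} &= \begin{bmatrix} 0.6 & 0.2 & 0 \\ 0.2 & 0.5 & 0 \\ 0.2 & 0.2 & 0.4 \\ 0 & 0.1 & 0.6 \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \\ s_3 \end{bmatrix} \\ &= \begin{bmatrix} 0.6 & 0.2 & 0 \\ 0.2 & 0.5 & 0 \\ 0.2 & 0.2 & 0.4 \\ 0 & 0.1 & 0.6 \end{bmatrix} \begin{bmatrix} 0.4 & 0.5 \\ 0.4 & 0.2 \\ 0.2 & 0.3 \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} \\ &= \begin{bmatrix} 0.32 & 0.34 \\ 0.28 & 0.20 \\ 0.24 & 0.26 \\ 0.16 & 0.20 \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} 0.32t_1 + 0.34t_2 \\ 0.28t_1 + 0.20t_2 \\ 0.24t_1 + 0.26t_2 \\ 0.16t_1 + 0.20t_2 \end{bmatrix} \end{aligned} \]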
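The combination of the two networks can be reproduced numerically. The following is a minimal sketch in Python; the variable names are illustrative, and the entries are rounded to two decimal places because decimal fractions are not stored exactly in floating-point arithmetic.

```python
def matmul(a, b):
    """Element (i, j) of the product is row i of a multiplied with column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Fractions of fuel sent from terminals T1, T2 (columns) to depots S1, S2, S3 (rows).
depots_from_terminals = [[0.4, 0.5], [0.4, 0.2], [0.2, 0.3]]

# Fractions sent from depots S1, S2, S3 (columns) to stations P1..P4 (rows).
stations_from_depots = [
    [0.6, 0.2, 0.0],
    [0.2, 0.5, 0.0],
    [0.2, 0.2, 0.4],
    [0.0, 0.1, 0.6],
]

combined = matmul(stations_from_depots, depots_from_terminals)
rounded = [[round(x, 2) for x in row] for row in combined]
print(rounded)  # [[0.32, 0.34], [0.28, 0.2], [0.24, 0.26], [0.16, 0.2]]
```

Each column of the combined matrix sums to 1, as it should: all the fuel leaving a terminal eventually arrives at one of the four petrol stations.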
Communication network
Problem in words
Figure 3 represents a communication network. Vertices a, b, f and g represent offices. Vertices c, d
and e represent switching centres. The numbers marked along the edges represent the number of
connections between any two vertices. Calculate the number of routes from a and b to f and g.
[Figure 3: A communication network where a, b, f and g are offices and c, d and e are switching centres]

The number of routes from a to f can be calculated by taking the number via c plus the number via
d plus the number via e. In each case this is given by multiplying the number of connections along
the edges connecting a to c, c to f etc. This gives the result:
Number of routes from a to f = 3 × 2 + 4 × 6 + 1 × 1 = 31.
The nature of matrix multiplication means that the number of routes is obtained by multiplying the
matrix representing the number of connections from ab to cde by the matrix representing the number
of connections from cde to f g.
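The matrix representing the number of connections from ab to cde is:
\[ \begin{array}{c|ccc} & c & d & e \\ \hline a & 3 & 4 & 1 \\ b & 2 & 1 & 3 \end{array} \]
The matrix representing the number of connections from cde to fg is:
\[ \begin{array}{c|cc} & f & g \\ \hline c & 2 & 1 \\ d & 6 & 3 \\ e & 1 & 2 \end{array} \]
The product of these two matrices gives the total number of routes.
\[ \begin{bmatrix} 3 & 4 & 1 \\ 2 & 1 & 3 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 6 & 3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 3 \times 2 + 4 \times 6 + 1 \times 1 & 3 \times 1 + 4 \times 3 + 1 \times 2 \\ 2 \times 2 + 1 \times 6 + 3 \times 1 & 2 \times 1 + 1 \times 3 + 3 \times 2 \end{bmatrix} = \begin{bmatrix} 31 & 17 \\ 13 & 11 \end{bmatrix} \]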
Interpretation
We can interpret the resulting (product) matrix by labelling the columns and rows.
\[ \begin{array}{c|cc} & f & g \\ \hline a & 31 & 17 \\ b & 13 & 11 \end{array} \]
Hence there are 31 routes from a to f , 17 from a to g, 13 from b to f and 11 from b to g.
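The route count is an ordinary matrix product of the two connection matrices. The following is a minimal sketch in Python; the variable names are illustrative.

```python
def matmul(a, b):
    """Element (i, j) of the product is row i of a multiplied with column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Rows a, b; columns c, d, e: connections from each office to each switching centre.
offices_to_centres = [[3, 4, 1], [2, 1, 3]]

# Rows c, d, e; columns f, g: connections from each switching centre to each office.
centres_to_offices = [[2, 1], [6, 3], [1, 2]]

routes = matmul(offices_to_centres, centres_to_offices)
print(routes)  # [[31, 17], [13, 11]]
```

The entry `routes[0][0]` is the 31 routes from a to f computed by hand above.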
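Exercises
1. If $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$, $C = \begin{bmatrix} 0 & -1 \\ 2 & -3 \end{bmatrix}$ find
(a) AB, (b) AC, (c) (A + B)C, (d) AC + BC, (e) 2A − 3C
2. If a rotation through an angle θ is represented by the matrix $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ and a
second rotation through an angle φ is represented by the matrix $B = \begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix}$ show
that both AB and BA represent a rotation through an angle θ + φ.
3. If $A = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -1 & -1 \\ 2 & 2 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 2 & 4 \\ -1 & 2 \\ 5 & 6 \end{bmatrix}$, $C = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, find AB and BC.
4. If $A = \begin{bmatrix} 1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 & 3 \\ 5 & 0 & 0 \\ 1 & 2 & -1 \end{bmatrix}$, $C = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}$,
verify A(BC) = (AB)C.
5. If $A = \begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 2 \\ 4 & 5 & 6 \end{bmatrix}$ then show that $AA^T$ is symmetric.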


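6. If $A = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 1 & 3 \end{bmatrix}$ verify that $(AB)^T = \begin{bmatrix} 0 & 1 \\ 1 & 3 \\ 2 & 7 \end{bmatrix} = B^T A^T$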
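Answers
1. (a) $AB = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$ (b) $AC = \begin{bmatrix} 4 & -7 \\ 8 & -15 \end{bmatrix}$ (c) $(A + B)C = \begin{bmatrix} 16 & -30 \\ 24 & -46 \end{bmatrix}$
(d) $AC + BC = \begin{bmatrix} 16 & -30 \\ 24 & -46 \end{bmatrix}$ (e) $\begin{bmatrix} 2 & 7 \\ 0 & 17 \end{bmatrix}$
2. $AB = \begin{bmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi & \cos\theta\sin\phi + \sin\theta\cos\phi \\ -\sin\theta\cos\phi - \cos\theta\sin\phi & -\sin\theta\sin\phi + \cos\theta\cos\phi \end{bmatrix} = \begin{bmatrix} \cos(\theta + \phi) & \sin(\theta + \phi) \\ -\sin(\theta + \phi) & \cos(\theta + \phi) \end{bmatrix}$
which clearly represents a rotation through angle θ + φ. BA gives the same result.
3. $AB = \begin{bmatrix} 15 & 26 \\ -6 & -12 \\ 12 & 24 \end{bmatrix}$, $BC = \begin{bmatrix} 8 & 10 \\ 0 & 3 \\ 16 & 17 \end{bmatrix}$
4. $A(BC) = (AB)C = \begin{bmatrix} -8 \\ 8 \end{bmatrix}$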

Determinants 7.3
Introduction
Among other uses, determinants allow us to determine whether a system of linear equations has a
unique solution or not. The evaluation of a determinant is a key skill in engineering mathematics and
this Section concentrates on the evaluation of small size determinants. For evaluating larger sizes we
can often use some properties of determinants to help simplify the task.

Prerequisites
• know what a matrix is

Learning Outcomes
• evaluate a 2 × 2 determinant
• use the method of expansion along the top row to evaluate a determinant
• use the properties of determinants to aid their evaluation
1. Determinant of a 2×2 matrix

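The determinant of the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is denoted by $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$ (note the change from square
brackets to vertical lines) and is defined to be the number ad − bc. That is:
\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]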

We can use the notation det(A) or | A | or ∆ to denote the determinant of A.
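The 2 × 2 rule is a one-line calculation. The following is a minimal sketch in Python; the function name `det2` and the sample matrix are illustrative assumptions, not taken from the workbook.

```python
def det2(m):
    """Determinant ad - bc of the 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

print(det2([[3, 1], [5, 2]]))  # 3*2 - 1*5 = 1
```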
Task
Find the determinants of the matrices

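\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 4 & -1 \\ -2 & -3 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}, \]
\[ E = \begin{bmatrix} 2 & 0 \\ 0 & 4 \end{bmatrix}, \quad F = \begin{bmatrix} -1 & 0 \\ 0 & -3 \end{bmatrix}, \quad G = \begin{bmatrix} 1 & 2 \\ -2 & -4 \end{bmatrix}. \]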
Your solution
Answer
| A |= 1 × 4 − 2 × 3 = −2 | B |= 4 × (−3) − (−1) × (−2) = −12 − 2 = −14
| C |= 0 | D |= 3 | E |= 8 | F |= 3 | G |= −4 + 4 = 0
2. Laplace expansion along the top row

This is a technique which can be used to evaluate determinants of any order. In principle, this method
can use any row or any column as its starting point. We quote one example: using the top row.
Consider
\[ \Delta = \begin{vmatrix} 4 & 1 & 1 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix}. \]
First we introduce the idea of a minor. Each element in this array of numbers has an associated
minor formed by removing the column and row in which the element lies and taking the determinant
of the remainder. For example consider element $a_{23} = 3$. We strike out the second row and the third
column of Δ to leave
\[ \begin{vmatrix} 4 & 1 \\ 3 & 1 \end{vmatrix} = 4 - 3 = 1. \]
For the element $a_{31} = 3$ we strike out the third row and first column of Δ to leave
\[ \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} = 3 - 2 = 1. \]
Task
What is the minor of the element a22 = 2?
Your solution
Answer
\[ \begin{vmatrix} 4 & 1 \\ 3 & 2 \end{vmatrix} = 8 - 3 = 5 \]

Next we introduce the idea of a cofactor. This is a minor with a sign attached. The appropriate
sign comes from the pattern of signs appropriate to a 3 × 3 array:
+ − +
− + −
+ − +
(i.e. positive signs on the leading diagonal and the signs ‘alternate’ everywhere else.)
Each element has a cofactor associated with it. The cofactor of element a11 is denoted by A11 , that
of a23 by A23 and so on.
To obtain the cofactor of an element of a 3 × 3 matrix we simply multiply the minor of that element
by the corresponding sign from the 3 × 3 array of signs.
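Hence the cofactor corresponding to $a_{23}$ is
\[ A_{23} = -\begin{vmatrix} 4 & 1 \\ 3 & 1 \end{vmatrix} = -1 \]
and the cofactor corresponding to $a_{31}$ is
\[ A_{31} = +\begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} = 1. \]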
Task
What is the cofactor of the element a22 ?
Your solution
Answer
The sign in the position of a22 in the array of signs is +
Hence, since the minor of this element is +5 the cofactor is A22 = +5.
Cofactors are important as it can be shown that the value of the determinant of a 3 × 3 matrix can
be found from the formula
\[ \Delta = a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13}. \]
In words “the determinant of a 3 × 3 matrix is obtained by multiplying each element of the first row
by its corresponding cofactor and then adding the three together”. (In fact this rule can be extended
to apply to any row or any column and to any order square matrix.)
Key Point 7
Evaluating General Determinants

If A is an n × n square matrix then:
\[ \det(A) = \sum_{j=1}^{n} a_{ij} A_{ij} \]
In words:
The determinant of a square matrix is obtained by multiplying each element
of row i by its corresponding cofactor and then adding these products together.

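In the case of $\Delta = \begin{vmatrix} 4 & 1 & 1 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix}$ we have $a_{11} = 4$, $a_{12} = 1$, $a_{13} = 1$,
\[ A_{11} = +\begin{vmatrix} 2 & 3 \\ 1 & 2 \end{vmatrix} = 4 - 3 = 1 \]
\[ A_{12} = -\begin{vmatrix} 1 & 3 \\ 3 & 2 \end{vmatrix} = -(2 - 9) = 7 \]
\[ A_{13} = +\begin{vmatrix} 1 & 2 \\ 3 & 1 \end{vmatrix} = 1 - 6 = -5 \]
Hence Δ = 4 × 1 + 1 × 7 + 1 × (−5) = 6.
Alternatively, choosing to expand along the second row:
\[ \Delta = a_{21} A_{21} + a_{22} A_{22} + a_{23} A_{23} = 1 \left( -\begin{vmatrix} 1 & 1 \\ 1 & 2 \end{vmatrix} \right) + 2 \left( +\begin{vmatrix} 4 & 1 \\ 3 & 2 \end{vmatrix} \right) + 3 \left( -\begin{vmatrix} 4 & 1 \\ 3 & 1 \end{vmatrix} \right) = -1 + 10 - 3 = 6 \quad \text{as before.} \]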
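Expansion along the top row lends itself to a recursive calculation, since each cofactor is itself a smaller determinant. The following is a minimal sketch in Python, assuming matrices are stored as lists of rows and using 0-based indices (so the element $a_{23}$ is `m[1][2]`); the function names are illustrative. It is intended for small determinants only, as the amount of work grows very quickly with the order.

```python
def minor_matrix(m, i, j):
    """The array left after striking out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by Laplace expansion along the top row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def cofactor(m, i, j):
    """Minor of element (i, j) with the sign (-1)^(i+j) attached."""
    return (-1) ** (i + j) * det(minor_matrix(m, i, j))

D = [[4, 1, 1], [1, 2, 3], [3, 1, 2]]

print(det(D))                                         # 6
print(cofactor(D, 1, 2))                              # A23 = -1
print(sum(D[1][j] * cofactor(D, 1, j) for j in range(3)))  # 6, expanding along row 2
```

The sign factor `(-1) ** (i + j)` reproduces the alternating pattern of signs, with + on the leading diagonal.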

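Task
Use expansion along the first row to find $\Delta = \begin{vmatrix} 1 & -1 & 3 \\ 0 & 2 & 6 \\ -2 & 1 & 5 \end{vmatrix}$
Your solution
Answer
$a_{11} = 1$, $a_{12} = -1$, $a_{13} = 3$
\[ A_{11} = +\begin{vmatrix} 2 & 6 \\ 1 & 5 \end{vmatrix} = 10 - 6 = 4 \]
\[ A_{12} = -\begin{vmatrix} 0 & 6 \\ -2 & 5 \end{vmatrix} = -(0 + 12) = -12 \]
\[ A_{13} = +\begin{vmatrix} 0 & 2 \\ -2 & 1 \end{vmatrix} = 0 + 4 = 4. \]
Hence Δ = 1 × 4 + (−1) × (−12) + 3 × 4 = 4 + 12 + 12 = 28.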
3. Properties of determinants
Often, especially with determinants of large order, we can simplify the evaluation rules. In this Section
we quote some useful properties of determinants in general.
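1. If two rows (or two columns) of a determinant are interchanged then the value of the determinant
is multiplied by (−1).
For example
\[ \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix} = 8 - 3 = 5 \]
but (interchanging columns)
\[ \begin{vmatrix} 3 & 4 \\ 2 & 1 \end{vmatrix} = 3 - 8 = -5 \]
and (interchanging rows)
\[ \begin{vmatrix} 1 & 2 \\ 4 & 3 \end{vmatrix} = 3 - 8 = -5. \]
2. The determinant of a matrix A and the determinant of its transpose $A^T$ are equal.
\[ \begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = \begin{vmatrix} 1 & 3 \\ 2 & 4 \end{vmatrix} = 4 - 6 = -2 \]
3. If two rows (or two columns) of a matrix A are equal then it has zero determinant.
For example, the following determinant has two identical rows:
\[ \begin{vmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ 4 & 5 & 6 \end{vmatrix} = 1 \times \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} + 2 \times \left( -\begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix} \right) + 3 \times \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} = -3 + 2 \times (6) + 3 \times (-3) = 0. \]
4. If the elements of one row (or one column) of a determinant are multiplied by k, then the
resulting determinant is k times the given determinant:
\[ \begin{vmatrix} 1 & 2 & 3 \\ 4 & 8 & 6 \\ 7 & 8 & 9 \end{vmatrix} = 2 \begin{vmatrix} 1 & 2 & 3 \\ 2 & 4 & 3 \\ 7 & 8 & 9 \end{vmatrix}. \]
Note that if one row (or column) of a determinant is a multiple of another row (or column)
then the value of the determinant is zero. (This follows from properties 3 and 4.)
For example:
\[ \begin{vmatrix} 2 & 4 & -1 \\ 4 & 2 & 1 \\ -4 & -8 & 2 \end{vmatrix} = 2 \times \begin{vmatrix} 2 & 1 \\ -8 & 2 \end{vmatrix} + 4 \times \left( -\begin{vmatrix} 4 & 1 \\ -4 & 2 \end{vmatrix} \right) - 1 \times \begin{vmatrix} 4 & 2 \\ -4 & -8 \end{vmatrix} = 2(12) + 4(-12) - (-24) = 0 \]
This is predictable as the 3rd row is (−2) times the first row.
5. If we add (or subtract) a multiple of one row (or column) to another, the value of the determinant
is unchanged.
Given $\begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix}$, adding (2 × row 1) to (row 2) gives
\[ \begin{vmatrix} 1 & 2 \\ 4 + 2 \times 1 & 5 + 2 \times 2 \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ 6 & 9 \end{vmatrix} = 9 - 12 = -3 = \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} \]
6. The determinant of a lower triangular matrix, an upper triangular matrix or a diagonal matrix
is the product of the elements on the leading diagonal.
As an example, it is easily confirmed that each of the following determinants has the same
value 1 × 4 × 6 = 24.
\[ \begin{vmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{vmatrix}, \quad \begin{vmatrix} 1 & 0 & 0 \\ 2 & 4 & 0 \\ 3 & 5 & 6 \end{vmatrix}, \quad \begin{vmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 6 \end{vmatrix} \]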
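The six properties can be spot-checked numerically on one small determinant. The following is a minimal sketch in Python; the function `det` (expansion along the top row) and the chosen test matrix are illustrative assumptions, and a check on one example is of course not a proof.

```python
def det(m):
    """Determinant by expansion along the top row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]  # strike out row 1, column j
        total += (-1) ** j * m[0][j] * det(sub)
    return total

M = [[4, 1, 1], [1, 2, 3], [3, 1, 2]]  # determinant 6

swapped = [M[1], M[0], M[2]]                               # property 1
transposed = [list(col) for col in zip(*M)]                # property 2
equal_rows = [M[0], M[0], M[2]]                            # property 3
scaled = [[2 * x for x in M[0]], M[1], M[2]]               # property 4, k = 2
row_op = [M[0], [x + 2 * y for x, y in zip(M[1], M[0])], M[2]]  # property 5
upper = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]                  # property 6

print(det(M), det(swapped), det(transposed))  # 6 -6 6
print(det(equal_rows), det(scaled))           # 0 12
print(det(row_op), det(upper))                # 6 24
```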
Task
This task is in four parts. Consider
\[ \Delta = \begin{vmatrix} 1 & 4 & 8 & 2 \\ 2 & -1 & 1 & -3 \\ 0 & 2 & 4 & 2 \\ 0 & 3 & 6 & 3 \end{vmatrix} \]
(a) Use property 2 to find another matrix whose determinant is equal to Δ:
Your solution
Answer
\[ \Delta = \begin{vmatrix} 1 & 2 & 0 & 0 \\ 4 & -1 & 2 & 3 \\ 8 & 1 & 4 & 6 \\ 2 & -3 & 2 & 3 \end{vmatrix}, \text{ by transposing the matrix.} \]
(b) Now expand along the top row to express Δ as the sum of two products, each of a number and
a 3 × 3 determinant:
Your solution
Answer
\[ \Delta = 1 \times \begin{vmatrix} -1 & 2 & 3 \\ 1 & 4 & 6 \\ -3 & 2 & 3 \end{vmatrix} - 2 \times \begin{vmatrix} 4 & 2 & 3 \\ 8 & 4 & 6 \\ 2 & 2 & 3 \end{vmatrix} \]
(c) Use the statement after property 4 to show that the second of the 3 × 3 determinants is zero:
Your solution
Answer
In the second 3 × 3 determinant, row 2 = 2 × row 1 hence the determinant has value zero.
(d) Use the statement after property 4 to evaluate the first determinant:
Your solution
Answer
In the first 3 × 3 determinant column 3 = $\frac{3}{2}$ × column 2. Hence this determinant is also zero.
Therefore Δ = 0.
Exercises
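1. Use Laplace expansion along the 1st row to determine
\[ \begin{vmatrix} 3 & 1 & -4 \\ 6 & 9 & -2 \\ -1 & 2 & 1 \end{vmatrix} \]
Show that the same value is obtained if you choose any other row or column for your expansion.
2. Using any of the properties of determinants to minimise the arithmetic, evaluate
(a) $\begin{vmatrix} 12 & 27 & 12 \\ 28 & 18 & 24 \\ 70 & 15 & 40 \end{vmatrix}$ (b) $\begin{vmatrix} 2 & 4 & 6 & 4 \\ 0 & 4 & 6 & 9 \\ 2 & 1 & 4 & 0 \\ 1 & 2 & 3 & 2 \end{vmatrix}$
3. Find the cofactors of x, y, z in the determinant
\[ \begin{vmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ x & y & z \end{vmatrix} \]
4. Prove that, no matter what the values of x, y, z, are
\[ \begin{vmatrix} y + z & z + x & x + y \\ x & y & z \\ 1 & 1 & 1 \end{vmatrix} = 0 \]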
Answers

9 −2 6 −2 6 9
1. 3
− 1
− 4
= 3(9 + 4) − 1(6 − 2) − 4(12 + 9) = −49
2 1 −1 1 −1 2
2. (a) Take out common factors in rows and columns

2 3 1 0 0 1

720 7 3 3 = 720 1 −6 3 using (−2C3 + C1 ) then (−3C3 + C2 ).
7 1 2 3 −5 2
The value of the determinant (expand along top row) is then easily found
as 720 × 13 = 9360.
(b) Zero since (row 1) is 2 × (row 4).
3. Cofactors of x, y, z are 1, −2, 1 respectively.
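The claim in Exercise 1, that every row or column gives the same value, can be confirmed with a short Python sketch. This is an illustrative aside rather than part of the Workbook, and the helper names (`minor`, `expand_along_row`, `expand_along_column`) are assumptions of ours.

```python
def minor(m, i, j):
    """Matrix left after deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def expand_along_row(m, i):
    """Laplace expansion along row i: sum of element times cofactor."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j)) for j in range(len(m)))


def expand_along_column(m, j):
    """Laplace expansion along column j."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j)) for i in range(len(m)))


m = [[3, 1, -4], [6, 9, -2], [-1, 2, 1]]
values = [expand_along_row(m, i) for i in range(3)]
values += [expand_along_column(m, j) for j in range(3)]
assert all(value == -49 for value in values)
```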

The Inverse of a Matrix 7.4
Introduction
In number arithmetic every number a (≠ 0) has a reciprocal b, written as $a^{-1}$ or $\frac{1}{a}$, such that
ba = ab = 1. Some, but not all, square matrices have inverses. If a square matrix A has an inverse,
$A^{-1}$, then
\[
AA^{-1} = A^{-1}A = I.
\]
We develop a rule for finding the inverse of a 2 × 2 matrix (where it exists) and we look at two
methods of finding the inverse of a 3 × 3 matrix (where it exists).
Non-square matrices do not possess inverses so this Section only refers to square matrices.
Prerequisites
• be familiar with the algebra of matrices
• be able to calculate a determinant
• know what a cofactor is

Learning Outcomes
• state the condition for the existence of an inverse matrix
• use the formula for finding the inverse of a 2 × 2 matrix
• find the inverse of a 3 × 3 matrix using row operations and using the determinant method
1. The inverse of a square matrix

We know that any non-zero number k has an inverse; for example 2 has an inverse $\frac{1}{2}$ or $2^{-1}$. The
inverse of the number k is usually written $\frac{1}{k}$ or, more formally, $k^{-1}$. This numerical inverse has
the property that
\[
k \times k^{-1} = k^{-1} \times k = 1
\]
We now show that an inverse of a matrix can, in certain circumstances, also be defined.
Given an n × n square matrix A, then an n × n square matrix B is said to be the inverse matrix
of A if
AB = BA = I
where I is, as usual, the identity matrix (or unit matrix) of the appropriate size.
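Example 6
Show that the inverse matrix of $A = \begin{bmatrix} -1 & 1 \\ -2 & 0 \end{bmatrix}$ is $B = \begin{bmatrix} 0 & -\frac{1}{2} \\ 1 & -\frac{1}{2} \end{bmatrix}$
Solution
All we need do is to check that AB = BA = I.
\[
AB = \begin{bmatrix} -1 & 1 \\ -2 & 0 \end{bmatrix} \times \frac{1}{2}\begin{bmatrix} 0 & -1 \\ 2 & -1 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} -1 & 1 \\ -2 & 0 \end{bmatrix} \times \begin{bmatrix} 0 & -1 \\ 2 & -1 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]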
The reader should check that BA = I also.
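Both products in Example 6 can be checked by machine. The sketch below is illustrative only (the helper `matmul` is our own, and exact fractions from Python's standard library are used so that no rounding occurs).

```python
from fractions import Fraction


def matmul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


half = Fraction(1, 2)
A = [[-1, 1], [-2, 0]]
B = [[0, -half], [1, -half]]
identity = [[1, 0], [0, 1]]

# B is the inverse of A only if BOTH products give the identity matrix.
assert matmul(A, B) == identity
assert matmul(B, A) == identity
```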
We make three important remarks:
• Non-square matrices do not have inverses.

• The inverse of A is usually written $A^{-1}$.
• Not all square matrices have inverses.
Task
Consider $A = \begin{bmatrix} 1 & 0 \\ 2 & 0 \end{bmatrix}$, and let $B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ be a possible inverse of A.
(a) Find AB and BA:
Your solution
AB = BA =
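Answer
\[
AB = \begin{bmatrix} a & b \\ 2a & 2b \end{bmatrix}, \qquad BA = \begin{bmatrix} a + 2b & 0 \\ c + 2d & 0 \end{bmatrix}
\]
(b) Equate the elements of AB to those of $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and solve the resulting equations: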
Your solution
Answer
a = 1, b = 0, 2a = 0, 2b = 1. Hence a = 1, b = 0, a = 0, b = $\frac{1}{2}$. This is not possible!
Hence, we have a contradiction. The matrix A therefore has no inverse and is said to be a singular
matrix. A matrix which has an inverse is said to be non-singular.
• If a matrix has an inverse then that inverse is unique.
Suppose B and C are both inverses of A. Then, by definition of the inverse,
AB = BA = I and AC = CA = I
Consider the two ways of forming the product CAB
1. CAB = C(AB) = CI = C
2. CAB = (CA)B = IB = B.
Hence B = C and the inverse is unique.

• There is no such operation as division in matrix algebra.
We do not write $\frac{B}{A}$ but rather
$A^{-1}B$ or $BA^{-1}$,
depending on the order required.
• Assuming that the square matrix A has an inverse $A^{-1}$ then the solution of
the system of equations AX = B is found by pre-multiplying both sides by $A^{-1}$.
\[
\begin{aligned}
& & AX &= B \\
&\text{pre-multiplying by } A^{-1}: & A^{-1}(AX) &= A^{-1}B, \\
&\text{using associativity:} & (A^{-1}A)X &= A^{-1}B, \\
&\text{using } A^{-1}A = I: & IX &= A^{-1}B, \\
&\text{using property of } I: & X &= A^{-1}B \quad \text{which is the solution we seek.}
\end{aligned}
\]
2. The inverse of a 2×2 matrix

In this subsection we show how the inverse of a 2 × 2 matrix can be obtained (if it exists).
Task
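Form the matrix products AB and BA where
\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]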
Your solution
AB = BA =
Answer
\[
AB = \begin{bmatrix} ad - bc & 0 \\ 0 & ad - bc \end{bmatrix} = (ad - bc)\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = (ad - bc)I
\]
\[
BA = \begin{bmatrix} ad - bc & 0 \\ 0 & ad - bc \end{bmatrix} = (ad - bc)I
\]
You will see that had we chosen $C = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$ instead of B then both products AC
and CA will be equal to I. This requires ad − bc ≠ 0. Hence this matrix C is the inverse
of A.
However, note that if ad − bc = 0 then A has no inverse. (Note that for the matrix $A = \begin{bmatrix} 1 & 0 \\ 2 & 0 \end{bmatrix}$,
which occurred in the last task, ad − bc = 1 × 0 − 0 × 2 = 0 confirming, as we found, that A has
no inverse.)
Key Point 8
The Inverse of a 2 × 2 Matrix
If ad − bc ≠ 0 then the 2 × 2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ has a (unique) inverse given by
\[
A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]
Note that ad − bc = |A|, the determinant of the matrix A.

In words: To find the inverse of a 2 × 2 matrix A we interchange the diagonal elements, change the
sign of the other two elements, and then divide by the determinant of A.
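Key Point 8 translates directly into a few lines of code. The following Python sketch is an illustration, not part of the Workbook: the function name `inverse_2x2` is our own, integer entries are assumed so that exact fractions can be used, and the matrix [[4, 7], [2, 6]] is a made-up example.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of [[a, b], [c, d]] with integer entries, or None if ad - bc = 0."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        return None  # singular matrix: no inverse exists
    # Interchange the diagonal elements, change the sign of the other two,
    # then divide every element by the determinant.
    return [
        [Fraction(d, det), Fraction(-b, det)],
        [Fraction(-c, det), Fraction(a, det)],
    ]


# The singular matrix met in the earlier Task has no inverse.
assert inverse_2x2([[1, 0], [2, 0]]) is None

# A hypothetical non-singular example with determinant 4*6 - 7*2 = 10.
assert inverse_2x2([[4, 7], [2, 6]]) == [
    [Fraction(3, 5), Fraction(-7, 10)],
    [Fraction(-1, 5), Fraction(2, 5)],
]
```

Returning `None` for a zero determinant is one possible design choice; raising an exception would serve equally well.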
Task
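Which of the following matrices has an inverse?
\[
A = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & -1 \\ -2 & 2 \end{bmatrix}, \quad
D = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]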
Your solution
Answer
|A| = 1 × 3 − 0 × 2 = 3; |B| = 1 + 1 = 2; |C| = 2 − 2 = 0; |D| = 1 − 0 = 1.
Therefore, A, B and D each has an inverse. C does not because it has a zero determinant.
Task
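Find the inverses of the matrices A, B and D in the previous Task.
Use Key Point 8:

Your solution
$A^{-1}$ =        $B^{-1}$ =        $D^{-1}$ =
Answer
\[
A^{-1} = \frac{1}{3}\begin{bmatrix} 3 & 0 \\ -2 & 1 \end{bmatrix}, \quad
B^{-1} = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad
D^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = D
\]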

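It can be shown that the matrix $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ represents an anti-clockwise rotation
through an angle θ in an xy-plane about the origin. The matrix B represents a rotation clockwise
through an angle θ. It is given therefore by
\[
B = \begin{bmatrix} \cos(-\theta) & \sin(-\theta) \\ -\sin(-\theta) & \cos(-\theta) \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\]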
Task
Form the products AB and BA for these ‘rotation matrices’. Confirm that B is
the inverse matrix of A.
Your solution
AB =
BA =
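Answer
\[
\begin{aligned}
AB &= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos^2\theta + \sin^2\theta & -\cos\theta\sin\theta + \sin\theta\cos\theta \\
-\sin\theta\cos\theta + \cos\theta\sin\theta & \sin^2\theta + \cos^2\theta \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I
\end{aligned}
\]
Similarly, BA = I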
Effectively: a rotation through an angle θ followed by a rotation through angle −θ is equivalent to
zero rotation.
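A numerical spot-check of this result is sketched below in Python. It is illustrative only: the helper names are ours, the angle 0.7 radians is an arbitrary choice, and because floating-point arithmetic is used the products are compared with I to within a small tolerance rather than exactly.

```python
import math


def matmul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


def rotation_pair(theta):
    """The matrices A and B of the text for a given angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]], [[c, -s], [s, c]]


def is_identity(m, tol=1e-12):
    """True if m equals the identity matrix to within tol."""
    return all(
        math.isclose(m[i][j], 1.0 if i == j else 0.0, abs_tol=tol)
        for i in range(len(m))
        for j in range(len(m))
    )


A, B = rotation_pair(0.7)  # arbitrary sample angle
assert is_identity(matmul(A, B))
assert is_identity(matmul(B, A))
```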
3. The inverse of a 3×3 matrix - Gauss elimination method
It is true, in general, that if the determinant of a matrix is zero then that matrix has no inverse. If
the determinant is non-zero then the matrix has a (unique) inverse. In this Section and the next we
look at two ways of finding the inverse of a 3 × 3 matrix; larger matrices can be inverted by the same
methods - the process is more tedious and takes longer. The 2 × 2 case could be handled similarly
but as we have seen we have a simple formula to use.
The method we now describe for finding the inverse of a matrix has many similarities to a technique
used to obtain solutions of simultaneous equations. This method involves operating on the rows of
a matrix in order to reduce it to a unit matrix.
The row operations we shall use are
(i) interchanging two rows

(ii) multiplying a row by a constant factor
(iii) adding a multiple of one row to another.
Note that in (ii) and (iii) the multiple could be negative or fractional, or both.
The Gauss elimination method is outlined in the following Key Point:
Key Point 9
Matrix Inverse − Gauss Elimination Method
We use the result, quoted without proof, that:
if a sequence of row operations applied to a square matrix A reduces
it to the identity matrix I of the same size then the same sequence of
operations applied to I reduces it to $A^{-1}$.
Three points to note:
• If it is impossible to reduce A to I then $A^{-1}$ does not exist. This will become evident by the
appearance of a row of zeros.
• There is no unique procedure for reducing A to I and it is experience which leads to selection
of the optimum route.
• It is more efficient to do the two reductions, A to I and I to $A^{-1}$, simultaneously.
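Suppose we wish to find the inverse of the matrix
\[
A = \begin{bmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 2 & 7 & 7 \end{bmatrix}
\]
We first place A and I adjacent to each other.
\[
\begin{bmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 2 & 7 & 7 \end{bmatrix}
\quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Phase 1
We now proceed by changing the columns of A left to right to reduce A to the form
\[
\begin{bmatrix} 1 & * & * \\ 0 & 1 & * \\ 0 & 0 & 1 \end{bmatrix}
\]
where ∗ can be any number. This form is called upper triangular.
First we subtract row 1 from row 2 and twice row 1 from row 3. ‘Row’ refers to both matrices.
\[
\begin{bmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 2 & 7 & 7 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{array}{l} \\ R_2 - R_1 \\ R_3 - 2R_1 \end{array}
\Rightarrow
\begin{bmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}
\]
Now we subtract row 2 from row 3
\[
\begin{bmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}
\begin{array}{l} \\ \\ R_3 - R_2 \end{array}
\Rightarrow
\begin{bmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\]
Phase 2
This consists of continuing the row operations to reduce the elements above the leading diagonal to
zero.
We proceed right to left. We subtract 3 times row 3 from row 1 (the element in row 2 column 3 is
already zero).
\[
\begin{bmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\begin{array}{l} R_1 - 3R_3 \\ \\ \\ \end{array}
\Rightarrow
\begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 4 & 3 & -3 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\]
Finally we subtract 3 times row 2 from row 1.
\[
\begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 4 & 3 & -3 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\begin{array}{l} R_1 - 3R_2 \\ \\ \\ \end{array}
\Rightarrow
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 7 & 0 & -3 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\]
Then we have
\[
A^{-1} = \begin{bmatrix} 7 & 0 & -3 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix}
\]
(This can be verified by showing that $AA^{-1} = I$ or $A^{-1}A = I$.)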
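The row-operation method lends itself to automation. The Python sketch below is an illustrative aside, not part of the Workbook, and the function name `gauss_inverse` is our own. It uses only the three permitted row operations but, unlike the two-phase hand calculation above, it clears the entries above and below each pivot in a single pass (often called Gauss-Jordan elimination). Exact fractions avoid rounding error.

```python
from fractions import Fraction


def gauss_inverse(a):
    """Inverse of a square matrix by row operations on [A | I], or None if singular."""
    n = len(a)
    # Place A and I side by side so that every row operation acts on both.
    rows = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # (i) interchange rows, if necessary, to bring a non-zero pivot into place
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None  # A cannot be reduced to I, so no inverse exists
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # (ii) multiply the pivot row by a constant so that the pivot becomes 1
        p = rows[col][col]
        rows[col] = [x / p for x in rows[col]]
        # (iii) add multiples of the pivot row to the other rows to clear the column
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    # The left half is now I, so the right half is the inverse.
    return [row[n:] for row in rows]


A = [[1, 3, 3], [1, 4, 3], [2, 7, 7]]
assert gauss_inverse(A) == [[7, 0, -3], [-1, 1, 0], [-1, -1, 1]]
assert gauss_inverse([[1, 2], [2, 4]]) is None  # zero determinant
```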
Task
Consider $A = \begin{bmatrix} 0 & 1 & 1 \\ 2 & 3 & -1 \\ -1 & 2 & 1 \end{bmatrix}$, $I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
Use the Gauss elimination method to obtain $A^{-1}$.
First interchange rows 1 and 2, then carry out the operation (row 3) + $\frac{1}{2}$(row 1):
Your solution
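Answer
\[
\begin{bmatrix} 0 & 1 & 1 \\ 2 & 3 & -1 \\ -1 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{array}{l} R_1 \leftrightarrow R_2 \\ \\ \\ \end{array}
\Rightarrow
\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 1 \\ -1 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
\[
\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 1 \\ -1 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{array}{l} \\ \\ R_3 + \frac{1}{2}R_1 \end{array}
\Rightarrow
\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & \frac{7}{2} & \frac{1}{2} \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & \frac{1}{2} & 1 \end{bmatrix}
\]
Now carry out the operation (row 3) − $\frac{7}{2}$(row 2) followed by (row 1) − $\frac{1}{3}$(row 3)
and (row 2) + $\frac{1}{3}$(row 3):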
Your solution
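Answer
\[
\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & \frac{7}{2} & \frac{1}{2} \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & \frac{1}{2} & 1 \end{bmatrix}
\begin{array}{l} \\ \\ R_3 - \frac{7}{2}R_2 \end{array}
\Rightarrow
\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & -3 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ -\frac{7}{2} & \frac{1}{2} & 1 \end{bmatrix}
\]
\[
\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & -3 \end{bmatrix}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ -\frac{7}{2} & \frac{1}{2} & 1 \end{bmatrix}
\begin{array}{l} R_1 - \frac{1}{3}R_3 \\ R_2 + \frac{1}{3}R_3 \\ \\ \end{array}
\Rightarrow
\begin{bmatrix} 2 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix}
\begin{bmatrix} \frac{7}{6} & \frac{5}{6} & -\frac{1}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{3} \\ -\frac{7}{2} & \frac{1}{2} & 1 \end{bmatrix}
\]
Next, subtract 3 times row 2 from row 1, then divide row 1 by 2 and row 3 by (−3).
Finally identify $A^{-1}$: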
Your solution
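Answer
\[
\begin{bmatrix} 2 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix}
\begin{bmatrix} \frac{7}{6} & \frac{5}{6} & -\frac{1}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{3} \\ -\frac{7}{2} & \frac{1}{2} & 1 \end{bmatrix}
\begin{array}{l} R_1 - 3R_2 \\ \\ \\ \end{array}
\Rightarrow
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix}
\begin{bmatrix} \frac{10}{6} & \frac{2}{6} & -\frac{4}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{3} \\ -\frac{7}{2} & \frac{1}{2} & 1 \end{bmatrix}
\]
\[
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix}
\begin{bmatrix} \frac{10}{6} & \frac{2}{6} & -\frac{4}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{3} \\ -\frac{7}{2} & \frac{1}{2} & 1 \end{bmatrix}
\begin{array}{l} R_1 \div 2 \\ \\ R_3 \div (-3) \end{array}
\Rightarrow
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{5}{6} & \frac{1}{6} & -\frac{2}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{3} \\ \frac{7}{6} & -\frac{1}{6} & -\frac{1}{3} \end{bmatrix}
\]
Hence
\[
A^{-1} = \begin{bmatrix} \frac{5}{6} & \frac{1}{6} & -\frac{2}{3} \\ -\frac{1}{6} & \frac{1}{6} & \frac{1}{3} \\ \frac{7}{6} & -\frac{1}{6} & -\frac{1}{3} \end{bmatrix}
= \frac{1}{6}\begin{bmatrix} 5 & 1 & -4 \\ -1 & 1 & 2 \\ 7 & -1 & -2 \end{bmatrix}
\]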
4. The inverse of a 3×3 matrix - determinant method
This method, which employs determinants, is of importance from a theoretical perspective. The
numerical computations involved are too heavy for matrices of higher order than 3 × 3 and in such
cases the Gauss elimination approach is preferred.
To obtain $A^{-1}$ using the determinant approach the steps in the following Key Point are followed:
Key Point 10
Matrix Inverse − the Determinant Method
Given a square matrix A:
• Find |A|. If |A| = 0 then $A^{-1}$ does not exist. If |A| ≠ 0 we can proceed to find the inverse
matrix, as follows.
• Replace each element of A by its cofactor (see Section 7.3).
• Transpose the result to form the adjoint matrix, denoted by adj(A).
• Then calculate $A^{-1} = \dfrac{1}{|A|}\,\mathrm{adj}(A)$.
Task
Find the inverse of $A = \begin{bmatrix} 0 & 1 & 1 \\ 2 & 3 & -1 \\ -1 & 2 & 1 \end{bmatrix}$. This will require five stages.
(a) First find |A|:

Your solution
Answer
|A| = 0 × 5 + 1 × (−1) + 1 × 7 = 6
(b) Now replace each element of A by its minor:
Your solution
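Answer
\[
\begin{bmatrix}
\begin{vmatrix} 3 & -1 \\ 2 & 1 \end{vmatrix} & \begin{vmatrix} 2 & -1 \\ -1 & 1 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ -1 & 2 \end{vmatrix} \\[2ex]
\begin{vmatrix} 1 & 1 \\ 2 & 1 \end{vmatrix} & \begin{vmatrix} 0 & 1 \\ -1 & 1 \end{vmatrix} & \begin{vmatrix} 0 & 1 \\ -1 & 2 \end{vmatrix} \\[2ex]
\begin{vmatrix} 1 & 1 \\ 3 & -1 \end{vmatrix} & \begin{vmatrix} 0 & 1 \\ 2 & -1 \end{vmatrix} & \begin{vmatrix} 0 & 1 \\ 2 & 3 \end{vmatrix}
\end{bmatrix}
=
\begin{bmatrix} 5 & 1 & 7 \\ -1 & 1 & 1 \\ -4 & -2 & -2 \end{bmatrix}
\]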
(c) Now attach the signs from the array

+ − +
− + −
+ − +
(so that where a + sign is met no action is taken and where a − sign is met the sign is changed) to
obtain the matrix of cofactors:
Your solution
Answer
\[
\begin{bmatrix} 5 & -1 & 7 \\ 1 & 1 & -1 \\ -4 & 2 & -2 \end{bmatrix}
\]
(d) Then transpose the result to obtain the adjoint matrix:
Your solution
Answer
Transposing,
\[
\mathrm{adj}(A) = \begin{bmatrix} 5 & 1 & -4 \\ -1 & 1 & 2 \\ 7 & -1 & -2 \end{bmatrix}
\]
(e) Finally obtain $A^{-1}$:

Your solution
Answer
\[
A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A) = \frac{1}{6}\begin{bmatrix} 5 & 1 & -4 \\ -1 & 1 & 2 \\ 7 & -1 & -2 \end{bmatrix}
\quad \text{as before using Gauss elimination.}
\]
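The stages of this Task (determinant, minors, cofactors, transpose, divide) can be sketched in Python as follows. This is an illustration rather than part of the Workbook: the names `minor`, `det` and `adjoint_inverse` are our own, and integer entries and a matrix of size at least 2 × 2 are assumed.

```python
from fractions import Fraction


def minor(m, i, j):
    """Matrix left after deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjoint_inverse(m):
    """Inverse by the determinant method, or None if |A| = 0."""
    n = len(m)
    d = det(m)
    if d == 0:
        return None
    # Replace each element by its minor and attach the sign (-1)^(i+j).
    cofactors = [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)] for i in range(n)]
    # Transpose the matrix of cofactors to obtain adj(A).
    adjoint = [[cofactors[j][i] for j in range(n)] for i in range(n)]
    # Divide every element by the determinant.
    return [[Fraction(x, d) for x in row] for row in adjoint]


A = [[0, 1, 1], [2, 3, -1], [-1, 2, 1]]
assert det(A) == 6
assert adjoint_inverse(A) == [[Fraction(x, 6) for x in row]
                              for row in [[5, 1, -4], [-1, 1, 2], [7, -1, -2]]]
```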
Exercises
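1. Find the inverses of the following matrices
\[
\text{(a)} \ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \qquad
\text{(b)} \ \begin{bmatrix} -1 & 0 \\ 0 & 4 \end{bmatrix} \qquad
\text{(c)} \ \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
\]
2. Use the determinant method and also the Gauss elimination method to find the inverse of the
following matrices
\[
\text{(a)} \ A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 0 & 0 \\ 4 & 1 & 2 \end{bmatrix} \qquad
\text{(b)} \ B = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]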
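Answers
1. (a) $-\dfrac{1}{2}\begin{bmatrix} 4 & -2 \\ -3 & 1 \end{bmatrix}$ (b) $\begin{bmatrix} -1 & 0 \\ 0 & \frac{1}{4} \end{bmatrix}$ (c) $\dfrac{1}{2}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$
2. (a)
\[
A^{-1} = -\frac{1}{2}\begin{bmatrix} 0 & -2 & 1 \\ -2 & 4 & 2 \\ 0 & 0 & -1 \end{bmatrix}^{T}
= -\frac{1}{2}\begin{bmatrix} 0 & -2 & 0 \\ -2 & 4 & 0 \\ 1 & 2 & -1 \end{bmatrix}
\]
(b)
\[
B^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}^{T}
= \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}
\]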
Contents 8
Matrix Solution of Equations
8.1 Solution by Cramer’s Rule
8.2 Solution by Inverse Matrix Method
8.3 Solution by Gauss Elimination
Learning outcomes
In this Workbook you will learn to apply your knowledge of matrices to solve systems of
linear equations. Such systems of equations arise very often in mathematics, science
and engineering. Three basic techniques are outlined, Cramer's method, the inverse
matrix approach and the Gauss elimination method. The Gauss elimination method is,
by far, the most widely used (since it can be applied to all systems of linear equations).
However, you will learn that, for certain (usually small) systems of linear equations the
other two techniques may be better.
Solution by
Cramer’s Rule 8.1
Introduction
The need to solve systems of linear equations arises frequently in engineering. The analysis of electric
circuits and the control of systems are two examples. Cramer’s rule for solving such systems involves
the calculation of determinants and their ratio. For systems containing only a few equations it is a
useful method of solution.

Prerequisites
• be able to evaluate 2 × 2 and 3 × 3 determinants

Learning Outcomes
On completion you should be able to . . .
• state and apply Cramer’s rule to find the solution of two simultaneous linear equations
• state and apply Cramer’s rule to find the solution of three simultaneous linear equations
• recognise cases where the solution is not unique or a solution does not exist
1. Solving two equations in two unknowns

If we have one linear equation
ax = b
in which the unknown is x and a and b are constants then there are just three possibilities:
• a ≠ 0 then $x = \frac{b}{a} = a^{-1}b$. In this case the equation ax = b has a unique solution for x.
• a = 0, b = 0 then the equation ax = b becomes 0 = 0 and any value of x will do. In this case
the equation ax = b has infinitely many solutions.
• a = 0 and b ≠ 0 then ax = b becomes 0 = b which is a contradiction. In this case the equation
ax = b has no solution for x.
What happens if we have more than one equation and more than one unknown? We shall find that
the solutions to such systems can be characterised in a manner similar to that occurring for a single
equation; that is, a system may have a unique solution, an infinity of solutions or no solution at all.
In this Section we examine a method, known as Cramer’s rule and employing determinants, for solving
systems of linear equations.
Consider the equations
ax + by = e (1)
cx + dy = f (2)
where a, b, c, d, e, f are given numbers. The variables x and y are unknowns we wish to find. The
pairs of values of x and y which simultaneously satisfy both equations are called solutions. Simple
algebra will eliminate the variable y between these equations. We multiply equation (1) by d, equation
(2) by b and subtract:
first, (1) × d adx + bdy = ed
then, (2) × b bcx + bdy = bf

(we multiplied in this way to make the coefficients of y equal.)
Now subtract to obtain
(ad − bc)x = ed − bf (3)
Task
Starting with equations (1) and (2) above, eliminate x.
Your solution
Answer
Multiply equation (1) by c and equation (2) by a to obtain
acx + bcy = ec and acx + ady = af.
Now subtract to obtain
(bc − ad)y = ec − af
If we multiply this last equation in the Task above by −1 we obtain
(ad − bc)y = af − ec (4)
Dividing equations (3) and (4) by ad − bc we obtain the solutions
\[
x = \frac{ed - bf}{ad - bc}, \qquad y = \frac{af - ec}{ad - bc} \qquad (5)
\]
There is of course one proviso: if ad − bc = 0 then neither x nor y has a defined value.
If we choose to express these solutions in terms of determinants we have the formulation for the
solution of simultaneous equations known as Cramer’s rule.
If we define ∆ as the determinant $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$ and provided ∆ ≠ 0 then the unique solution of the
equations
ax + by = e
cx + dy = f
is by (5) given by
\[
x = \frac{\Delta_x}{\Delta}, \quad y = \frac{\Delta_y}{\Delta} \quad \text{where} \quad
\Delta_x = \begin{vmatrix} e & b \\ f & d \end{vmatrix}, \quad
\Delta_y = \begin{vmatrix} a & e \\ c & f \end{vmatrix}
\]
Now ∆ is the determinant of coefficients on the left-hand sides of the equations. In the expression
$\Delta_x$ the coefficients of x (i.e. $\begin{matrix} a \\ c \end{matrix}$, which is column 1 of ∆) are replaced by the terms on the
right-hand sides of the equations (i.e. by $\begin{matrix} e \\ f \end{matrix}$). Similarly in $\Delta_y$ the coefficients of y (column 2 of
∆) are replaced by the terms on the right-hand sides of the equations.
Key Point 1
Cramer’s Rule for Two Equations

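The unique solution to the equations:
ax + by = e
cx + dy = f
is given by:
\[
x = \frac{\Delta_x}{\Delta}, \qquad y = \frac{\Delta_y}{\Delta}
\]
in which
\[
\Delta = \begin{vmatrix} a & b \\ c & d \end{vmatrix}, \qquad
\Delta_x = \begin{vmatrix} e & b \\ f & d \end{vmatrix}, \qquad
\Delta_y = \begin{vmatrix} a & e \\ c & f \end{vmatrix}
\]
If ∆ = 0 this method of solution cannot be used.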
Task
Use Cramer’s rule to solve the simultaneous equations
2x + y = 7
3x − 4y = 5
Your solution
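Answer
Calculating $\Delta = \begin{vmatrix} 2 & 1 \\ 3 & -4 \end{vmatrix} = -11$. Since ∆ ≠ 0 we can proceed with Cramer’s solution.
\[
x = \frac{1}{\Delta}\begin{vmatrix} 7 & 1 \\ 5 & -4 \end{vmatrix}, \qquad
y = \frac{1}{\Delta}\begin{vmatrix} 2 & 7 \\ 3 & 5 \end{vmatrix}
\]
i.e. $x = \dfrac{(-28 - 5)}{(-11)}$, $y = \dfrac{(10 - 21)}{(-11)}$, implying: $x = \dfrac{-33}{-11} = 3$, $y = \dfrac{-11}{-11} = 1$.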
You can check by direct substitution that these are the exact solutions to the equations.
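Key Point 1 can be expressed as a short function. The Python sketch below is illustrative only; the name `cramer_2x2` and its argument order (a, b, c, d, e, f, matching the equations ax + by = e, cx + dy = f) are our own choices, and integer coefficients are assumed so that exact fractions can be returned.

```python
from fractions import Fraction


def cramer_2x2(a, b, c, d, e, f):
    """Solve ax + by = e, cx + dy = f by Cramer's rule; None if Delta = 0."""
    delta = a * d - b * c
    if delta == 0:
        return None  # the method cannot be used
    delta_x = e * d - b * f  # column 1 of Delta replaced by the right-hand sides
    delta_y = a * f - e * c  # column 2 of Delta replaced by the right-hand sides
    return Fraction(delta_x, delta), Fraction(delta_y, delta)


# The Task above: 2x + y = 7, 3x - 4y = 5.
x, y = cramer_2x2(2, 1, 3, -4, 7, 5)
assert (x, y) == (3, 1)
# Direct substitution confirms the solution.
assert 2 * x + y == 7 and 3 * x - 4 * y == 5
```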
Task
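Use Cramer’s rule to solve the equations
\[
\text{(a)} \ \begin{aligned} 2x - 3y &= 6 \\ 4x - 6y &= 12 \end{aligned}
\qquad
\text{(b)} \ \begin{aligned} 2x - 3y &= 6 \\ 4x - 6y &= 10 \end{aligned}
\]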
Your solution
Answer
You should have checked $\begin{vmatrix} 2 & -3 \\ 4 & -6 \end{vmatrix}$ first, since
\[
\begin{vmatrix} 2 & -3 \\ 4 & -6 \end{vmatrix} = -12 - (-12) = 0.
\]
Hence there is no unique solution in either case.
In the system (a) the second equation is twice the first so there are infinitely many solutions. (Here
we can give y any value we wish, t say; but then the x value is always (6 + 3t)/2. So for each value
of t there are values for x and y which simultaneously satisfy both equations. There is an infinite
number of possible solutions.) In (b) the equations are inconsistent (since the first is 2x − 3y = 6
and the second, on division by 2, is 2x − 3y = 5, which is not possible). Hence there are no solutions.
Notation
For ease of generalisation to larger systems we write the two-equation system in a different notation:
a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2
Here the unknowns are x1 and x2 , the right-hand sides are b1 and b2 and the coefficients are aij
where, for example, a21 is the coefficient of x1 in equation two. In general, aij is the coefficient of
xj in equation i.
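Cramer’s rule can then be stated as follows:
If $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \neq 0$, then the equations
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2
\end{aligned}
\]
have solution
\[
x_1 = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}, \qquad
x_2 = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}.
\]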
2. Solving three equations in three unknowns

Cramer’s rule can be extended to larger systems of simultaneous equations but the calculational effort
increases rapidly as the size of the system increases. We quote Cramer’s rule for a system of three
equations.
Key Point 2
Cramer’s Rule for Three Equations
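The unique solution to the system of equations:
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3
\end{aligned}
\]
is
\[
x_1 = \frac{\Delta_{x_1}}{\Delta}, \qquad x_2 = \frac{\Delta_{x_2}}{\Delta}, \qquad x_3 = \frac{\Delta_{x_3}}{\Delta}
\]
in which
\[
\Delta = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
\]
and
\[
\Delta_{x_1} = \begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix}, \quad
\Delta_{x_2} = \begin{vmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{vmatrix}, \quad
\Delta_{x_3} = \begin{vmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{vmatrix}
\]
If ∆ = 0 this method of solution cannot be used.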
Notice that the structure of the fractions is similar to that for the two-equation case. For example,
the determinant forming the numerator of x1 is obtained from the determinant of coefficients, ∆, by
replacing the first column by the right-hand sides of the equations.
Notice too the increase in calculation: in the two-equation case we had to evaluate three
2 × 2 determinants, whereas in the three-equation case we have to evaluate four 3 × 3 determi-
nants. Hence Cramer’s rule is not really practicable for larger systems.
Task
Use Cramer’s rule to solve the system
x1 − 2x2 + x3 = 3
2x1 + x2 − x3 = 5
3x1 − x2 + 2x3 = 12.
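First check that ∆ ≠ 0:

Your solution
Answer
\[
\Delta = \begin{vmatrix} 1 & -2 & 1 \\ 2 & 1 & -1 \\ 3 & -1 & 2 \end{vmatrix}.
\]
Expanding along the top row,
\[
\begin{aligned}
\Delta &= 1 \times \begin{vmatrix} 1 & -1 \\ -1 & 2 \end{vmatrix} - (-2) \times \begin{vmatrix} 2 & -1 \\ 3 & 2 \end{vmatrix} + 1 \times \begin{vmatrix} 2 & 1 \\ 3 & -1 \end{vmatrix} \\
&= 1 \times (2 - 1) + 2 \times (4 + 3) + 1 \times (-2 - 3) \\
&= 1 + 14 - 5 = 10
\end{aligned}
\]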
Now find the value of x1 . First write down the expression for x1 in terms of determinants:
Your solution
Answer
\[
x_1 = \begin{vmatrix} 3 & -2 & 1 \\ 5 & 1 & -1 \\ 12 & -1 & 2 \end{vmatrix} \div \Delta
\]
Now calculate x1 explicitly:

Your solution
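Answer
The numerator is found by expanding along the top row to be
\[
\begin{aligned}
& 3 \times \begin{vmatrix} 1 & -1 \\ -1 & 2 \end{vmatrix} - (-2) \times \begin{vmatrix} 5 & -1 \\ 12 & 2 \end{vmatrix} + 1 \times \begin{vmatrix} 5 & 1 \\ 12 & -1 \end{vmatrix} \\
&= 3 \times 1 + 2 \times 22 + 1 \times (-17) \\
&= 30
\end{aligned}
\]
Hence $x_1 = \dfrac{1}{10} \times 30 = 3$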
In a similar way find the values of x2 and x3 :
Your solution
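Answer
\[
\begin{aligned}
x_2 &= \frac{1}{10}\begin{vmatrix} 1 & 3 & 1 \\ 2 & 5 & -1 \\ 3 & 12 & 2 \end{vmatrix} \\
&= \frac{1}{10}\left\{ 1 \times \begin{vmatrix} 5 & -1 \\ 12 & 2 \end{vmatrix} - 3 \times \begin{vmatrix} 2 & -1 \\ 3 & 2 \end{vmatrix} + 1 \times \begin{vmatrix} 2 & 5 \\ 3 & 12 \end{vmatrix} \right\} \\
&= \frac{1}{10}\{22 - 3 \times 7 + 9\} = 1
\end{aligned}
\]
\[
\begin{aligned}
x_3 &= \frac{1}{10}\left\{ 1 \times \begin{vmatrix} 1 & 5 \\ -1 & 12 \end{vmatrix} - (-2) \times \begin{vmatrix} 2 & 5 \\ 3 & 12 \end{vmatrix} + 3 \times \begin{vmatrix} 2 & 1 \\ 3 & -1 \end{vmatrix} \right\} \\
&= \frac{1}{10}\{17 + 2 \times 9 + 3 \times (-5)\} = 2
\end{aligned}
\]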
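The column-replacement pattern of Key Point 2 is the same for every unknown, so it can be written as a loop. The Python sketch below is an illustrative aside (the names `det` and `cramer` are our own, and integer coefficients are assumed); it reproduces the solution of the Task above.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def cramer(a, b):
    """Solve the square system with coefficient matrix a and right-hand sides b."""
    delta = det(a)
    if delta == 0:
        return None  # Cramer's rule cannot be used
    solution = []
    for j in range(len(a)):
        # Replace column j of the coefficient matrix by the right-hand sides.
        replaced = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(replaced), delta))
    return solution


coefficients = [[1, -2, 1], [2, 1, -1], [3, -1, 2]]
right_hand_sides = [3, 5, 12]
assert det(coefficients) == 10
assert cramer(coefficients, right_hand_sides) == [3, 1, 2]
```

Each unknown costs one further determinant, which is why the text regards the rule as impracticable for larger systems.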
Stresses and strains on a section of material
Introduction
An important engineering problem is to determine the effect on materials of different types of loading.
One way of measuring the effects is through the strain or fractional change in dimensions in the
material which can be measured using a strain gauge.
Problem in words
In a homogeneous, isotropic and linearly elastic material, the strains (i.e. fractional displacements)
on a section of the material, represented by $\varepsilon_x$, $\varepsilon_y$, $\varepsilon_z$ for the x-, y-, z-directions respectively, can
be related to the stresses (i.e. force per unit area), $\sigma_x$, $\sigma_y$, $\sigma_z$ by the following system of equations.
\[
\begin{aligned}
\varepsilon_x &= \frac{1}{E}(\ \ \sigma_x - v\sigma_y - v\sigma_z) \\
\varepsilon_y &= \frac{1}{E}(-v\sigma_x + \sigma_y - v\sigma_z) \\
\varepsilon_z &= \frac{1}{E}(-v\sigma_x - v\sigma_y + \sigma_z)
\end{aligned}
\]
where E is the modulus of elasticity (also called Young’s modulus) and v is Poisson’s ratio which
relates the lateral strain to the axial strain.
Find expressions for the stresses σx , σy , σz , in terms of the strains εx , εy , and εz .

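The given system of equations can be written as a matrix equation:
\[
\begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \varepsilon_z \end{bmatrix}
= \frac{1}{E}\begin{bmatrix} 1 & -v & -v \\ -v & 1 & -v \\ -v & -v & 1 \end{bmatrix}
\begin{bmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \end{bmatrix}
\]
We can write this equation as
\[
\varepsilon = \frac{1}{E}A\sigma
\]
where
\[
\varepsilon = \begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \varepsilon_z \end{bmatrix}, \quad
A = \begin{bmatrix} 1 & -v & -v \\ -v & 1 & -v \\ -v & -v & 1 \end{bmatrix} \quad \text{and} \quad
\sigma = \begin{bmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \end{bmatrix}
\]
This matrix equation must be solved to find the vector σ in terms of the vector ε and the inverse of
the matrix A.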
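\[
\varepsilon = \frac{1}{E}A\sigma
\]
Multiplying both sides of the expression by E we get
\[
E\varepsilon = A\sigma
\]
Multiplying both sides by $A^{-1}$ we find that:
\[
A^{-1}E\varepsilon = A^{-1}A\sigma
\]
But $A^{-1}A = I$ so this becomes
\[
\sigma = EA^{-1}\varepsilon
\]
To find expressions for the stresses $\sigma_x$, $\sigma_y$, $\sigma_z$, in terms of the strains $\varepsilon_x$, $\varepsilon_y$ and $\varepsilon_z$, we must find
the inverse of the matrix A.
To find the inverse of $\begin{bmatrix} 1 & -v & -v \\ -v & 1 & -v \\ -v & -v & 1 \end{bmatrix}$ we first find the matrix of minors which is:
\[
\begin{bmatrix}
\begin{vmatrix} 1 & -v \\ -v & 1 \end{vmatrix} & \begin{vmatrix} -v & -v \\ -v & 1 \end{vmatrix} & \begin{vmatrix} -v & 1 \\ -v & -v \end{vmatrix} \\[2ex]
\begin{vmatrix} -v & -v \\ -v & 1 \end{vmatrix} & \begin{vmatrix} 1 & -v \\ -v & 1 \end{vmatrix} & \begin{vmatrix} 1 & -v \\ -v & -v \end{vmatrix} \\[2ex]
\begin{vmatrix} -v & -v \\ 1 & -v \end{vmatrix} & \begin{vmatrix} 1 & -v \\ -v & -v \end{vmatrix} & \begin{vmatrix} 1 & -v \\ -v & 1 \end{vmatrix}
\end{bmatrix}
=
\begin{bmatrix}
1 - v^2 & -v - v^2 & v^2 + v \\
-v - v^2 & 1 - v^2 & -v - v^2 \\
v^2 + v & -v - v^2 & 1 - v^2
\end{bmatrix}.
\]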
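We then apply the pattern of signs:
\[
\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}
\]
to obtain the matrix of cofactors
\[
\begin{bmatrix}
1 - v^2 & v + v^2 & v^2 + v \\
v + v^2 & 1 - v^2 & v + v^2 \\
v^2 + v & v + v^2 & 1 - v^2
\end{bmatrix}.
\]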
To find the adjoint we take the transpose of the above (which is the same as the matrix of cofactors,
since that matrix is symmetric):
\[
\begin{bmatrix}
1 - v^2 & v + v^2 & v^2 + v \\
v + v^2 & 1 - v^2 & v + v^2 \\
v^2 + v & v + v^2 & 1 - v^2
\end{bmatrix}.
\]
The determinant of the original matrix is
\[
1 \times (1 - v^2) - v(v + v^2) - v(v^2 + v) = 1 - 3v^2 - 2v^3.
\]
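Finally we divide the adjoint by the determinant to find the inverse, giving
\[
\frac{1}{1 - 3v^2 - 2v^3}
\begin{bmatrix}
1 - v^2 & v + v^2 & v + v^2 \\
v + v^2 & 1 - v^2 & v + v^2 \\
v + v^2 & v + v^2 & 1 - v^2
\end{bmatrix}.
\]
Now we found that $\sigma = EA^{-1}\varepsilon$ so
\[
\begin{bmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \end{bmatrix}
= \frac{E}{1 - 3v^2 - 2v^3}
\begin{bmatrix}
1 - v^2 & v + v^2 & v + v^2 \\
v + v^2 & 1 - v^2 & v + v^2 \\
v + v^2 & v + v^2 & 1 - v^2
\end{bmatrix}
\begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \varepsilon_z \end{bmatrix}
\]
We can write this matrix equation as 3 equations relating the stresses $\sigma_x$, $\sigma_y$, $\sigma_z$, in terms of the
strains $\varepsilon_x$, $\varepsilon_y$ and $\varepsilon_z$, by multiplying out this matrix expression, giving:
\[
\begin{aligned}
\sigma_x &= \frac{E}{1 - 3v^2 - 2v^3}\left[ (1 - v^2)\varepsilon_x + (v + v^2)\varepsilon_y + (v + v^2)\varepsilon_z \right] \\
\sigma_y &= \frac{E}{1 - 3v^2 - 2v^3}\left[ (v + v^2)\varepsilon_x + (1 - v^2)\varepsilon_y + (v + v^2)\varepsilon_z \right] \\
\sigma_z &= \frac{E}{1 - 3v^2 - 2v^3}\left[ (v + v^2)\varepsilon_x + (v + v^2)\varepsilon_y + (1 - v^2)\varepsilon_z \right]
\end{aligned}
\]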
Interpretation
Matrix manipulation has been used to transform three simultaneous equations relating strain to stress
into simultaneous equations relating stress to strain in terms of the elastic constants. These would
be useful for deducing the applied stress if the strains are known. The original equations enable
calculation of strains if the applied stresses are known.
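The inverse derived above can be spot-checked with exact arithmetic. The Python sketch below is illustrative only: the function names are ours, and the values v = 3/10, E = 200 and the stress vector (10, 0, −5) are made-up sample numbers, not data from the Workbook. Note that 1 − 3v² − 2v³ = (1 + v)²(1 − 2v), so the inverse exists unless v = −1 or v = 1/2.

```python
from fractions import Fraction


def strain_matrix(v):
    """The matrix A of the text."""
    return [[1, -v, -v], [-v, 1, -v], [-v, -v, 1]]


def strain_matrix_inverse(v):
    """The inverse of A derived in the text by the determinant method."""
    d = 1 - 3 * v**2 - 2 * v**3
    diag = (1 - v**2) / d
    off = (v + v**2) / d
    return [[diag, off, off], [off, diag, off], [off, off, diag]]


def strains_from_stresses(E, v, stresses):
    """epsilon = (1/E) A sigma."""
    a = strain_matrix(v)
    return [sum(a[i][j] * stresses[j] for j in range(3)) / E for i in range(3)]


def stresses_from_strains(E, v, strains):
    """sigma = E A^{-1} epsilon."""
    inv = strain_matrix_inverse(v)
    return [E * sum(inv[i][j] * strains[j] for j in range(3)) for i in range(3)]


v = Fraction(3, 10)   # hypothetical Poisson's ratio
E = Fraction(200)     # hypothetical modulus of elasticity
sigma = [Fraction(10), Fraction(0), Fraction(-5)]  # hypothetical stresses

# Converting stresses to strains and back should return the original stresses.
epsilon = strains_from_stresses(E, v, sigma)
assert stresses_from_strains(E, v, epsilon) == sigma
```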
Exercises
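1. Solve the following using Cramer’s rule:
\[
\text{(a)} \ \begin{aligned} 2x - 3y &= 1 \\ 4x + 4y &= 2 \end{aligned}
\qquad
\text{(b)} \ \begin{aligned} 2x - 5y &= 2 \\ -4x + 10y &= 1 \end{aligned}
\qquad
\text{(c)} \ \begin{aligned} 6x - y &= 0 \\ 2x - 4y &= 1 \end{aligned}
\]
2. Using Cramer’s rule obtain the solutions to the following sets of equations:
\[
\text{(a)} \ \begin{aligned} 2x_1 + x_2 - x_3 &= 0 \\ x_1 + x_3 &= 4 \\ x_1 + x_2 + x_3 &= 0 \end{aligned}
\qquad
\text{(b)} \ \begin{aligned} x_1 - x_2 + x_3 &= 1 \\ -x_1 + x_3 &= 1 \\ x_1 + x_2 - x_3 &= 0 \end{aligned}
\]
Answers
1. (a) $x = \frac{1}{2}$, $y = 0$ (b) ∆ = 0, no solution (c) $x = -\frac{1}{22}$, $y = -\frac{3}{11}$
2. (a) $x_1 = \frac{8}{3}$, $x_2 = -4$, $x_3 = \frac{4}{3}$ (b) $x_1 = \frac{1}{2}$, $x_2 = 1$, $x_3 = \frac{3}{2}$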
Solution by
Inverse Matrix Method 8.2
Introduction
The power of matrix algebra is seen in the representation of a system of simultaneous linear equations
as a matrix equation. Matrix algebra allows us to write the solution of the system using the inverse
matrix of the coefficients. In practice the method is suitable only for small systems. Its main use is
the theoretical insight into such problems which it provides.

Prerequisites
• be familiar with the basic rules of matrix algebra
• be able to evaluate 2 × 2 and 3 × 3 determinants
• be able to find the inverse of 2 × 2 and 3 × 3 matrices

Learning Outcomes
On completion you should be able to . . .
• use the inverse matrix of coefficients to solve a system of two linear simultaneous equations
• use the inverse matrix of coefficients to solve a system of three linear simultaneous equations
• recognise and identify cases where there is no solution or no unique solution
1. Solving a system of two equations using the inverse matrix
If we have one linear equation
ax = b
in which the unknown is x and a and b are constants and a ≠ 0 then $x = \frac{b}{a} = a^{-1}b$.
What happens if we have more than one equation and more than one unknown? In this Section we
copy the algebraic solution $x = a^{-1}b$ used for a single equation to solve a system of linear equations.
As we shall see, this will be a very natural way of solving the system if it is first written in matrix
form.
Consider the system
2x1 + 3x2 = 5
x1 − 2x2 = −1.
In matrix form this becomes
\[
\begin{bmatrix} 2 & 3 \\ 1 & -2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ -1 \end{bmatrix}
\quad \text{which is of the form } AX = B.
\]
If $A^{-1}$ exists then the solution is
$X = A^{-1}B$. (compare the solution $x = a^{-1}b$ above)
Task
Given the matrix $A = \begin{bmatrix} 2 & 3 \\ 1 & -2 \end{bmatrix}$ find its determinant. What does this tell you
about $A^{-1}$?
Your solution
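Answer
|A| = 2 × (−2) − 1 × 3 = −7
since |A| ≠ 0 then $A^{-1}$ exists.
Now find $A^{-1}$

Your solution
Answer
\[
A^{-1} = \frac{1}{(-7)}\begin{bmatrix} -2 & -3 \\ -1 & 2 \end{bmatrix} = \frac{1}{7}\begin{bmatrix} 2 & 3 \\ 1 & -2 \end{bmatrix}
\]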
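Task
Solve the system AX = B where $A = \begin{bmatrix} 2 & 3 \\ 1 & -2 \end{bmatrix}$ and B is $\begin{bmatrix} 5 \\ -1 \end{bmatrix}$.
Your solution
Answer
\[
X = A^{-1}B = \frac{1}{7}\begin{bmatrix} 2 & 3 \\ 1 & -2 \end{bmatrix}\begin{bmatrix} 5 \\ -1 \end{bmatrix}
= \frac{1}{7}\begin{bmatrix} 7 \\ 7 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
Hence x1 = 1, x2 = 1.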
Task
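Use the inverse matrix method to solve
2x1 + 3x2 = 3
5x1 + 4x2 = 11
Your solution
Answer
AX = B is
\[
\begin{bmatrix} 2 & 3 \\ 5 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 11 \end{bmatrix}
\]
|A| = 2 × 4 − 3 × 5 = −7 and $A^{-1} = -\dfrac{1}{7}\begin{bmatrix} 4 & -3 \\ -5 & 2 \end{bmatrix}$
Using $X = A^{-1}B$:
\[
X = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= -\frac{1}{7}\begin{bmatrix} 4 & -3 \\ -5 & 2 \end{bmatrix}\begin{bmatrix} 3 \\ 11 \end{bmatrix}
= -\frac{1}{7}\begin{bmatrix} 4 \times 3 - 3 \times 11 \\ -5 \times 3 + 2 \times 11 \end{bmatrix}
= -\frac{1}{7}\begin{bmatrix} -21 \\ 7 \end{bmatrix}
= \begin{bmatrix} 3 \\ -1 \end{bmatrix}
\]
So x1 = 3, x2 = −1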
Currents in two loops
In the circuit shown find the currents (i1 , i2 ) in the loops.
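[Figure 1: circuit diagram of two loops carrying currents i1 and i2, with sources labelled 3 V and 4 V and resistors labelled 5 Ω, 3 Ω and 6 Ω]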
Solution
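We note that the current through the 3 Ω resistor (travelling top to bottom in the diagram) is given
by (i1 − i2). With this proviso we can apply Kirchhoff’s law:
In the left-hand loop 3(i1 − i2) + 5i1 = 3 → 8i1 − 3i2 = 3
In the right-hand loop 3(i2 − i1) + 6i2 = 4 → −3i1 + 9i2 = 4
In matrix form:
\[
\begin{bmatrix} 8 & -3 \\ -3 & 9 \end{bmatrix}\begin{bmatrix} i_1 \\ i_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}
\]
The inverse of $\begin{bmatrix} 8 & -3 \\ -3 & 9 \end{bmatrix}$ is $\dfrac{1}{63}\begin{bmatrix} 9 & 3 \\ 3 & 8 \end{bmatrix}$ so solving gives
\[
\begin{bmatrix} i_1 \\ i_2 \end{bmatrix} = \frac{1}{63}\begin{bmatrix} 9 & 3 \\ 3 & 8 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix}
= \frac{1}{63}\begin{bmatrix} 39 \\ 41 \end{bmatrix}
\]
so $i_1 = \dfrac{39}{63}$, $i_2 = \dfrac{41}{63}$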
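The loop currents can be recomputed in a few lines. The Python sketch below is illustrative (the function name `solve_by_inverse_2x2` is our own and integer coefficients are assumed); it forms the 2 × 2 inverse from the formula of Section 7.4 and multiplies it by the right-hand sides.

```python
from fractions import Fraction


def solve_by_inverse_2x2(m, rhs):
    """Solve M X = rhs for a 2 x 2 integer matrix M via X = M^{-1} rhs; None if |M| = 0."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        return None  # the inverse matrix does not exist
    inverse = [
        [Fraction(d, det), Fraction(-b, det)],
        [Fraction(-c, det), Fraction(a, det)],
    ]
    return [
        inverse[0][0] * rhs[0] + inverse[0][1] * rhs[1],
        inverse[1][0] * rhs[0] + inverse[1][1] * rhs[1],
    ]


i1, i2 = solve_by_inverse_2x2([[8, -3], [-3, 9]], [3, 4])
assert (i1, i2) == (Fraction(39, 63), Fraction(41, 63))
# Substituting back into the two loop equations:
assert 8 * i1 - 3 * i2 == 3
assert -3 * i1 + 9 * i2 == 4
```

`Fraction` reduces 39/63 to its lowest terms, 13/21, automatically; the two forms are equal.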
2. Non-unique solutions
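The key to obtaining a unique solution of the system AX = B is to find $A^{-1}$. What happens when
$A^{-1}$ does not exist?
Consider the system
2x1 + 3x2 = 5 (1)
4x1 + 6x2 = 10 (2)
In matrix form this becomes
\[
\begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 10 \end{bmatrix}.
\]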
Task
Identify the matrix A and hence find $A^{-1}$.
Your solution
Answer
$A = \begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}$ and |A| = 2 × 6 − 4 × 3 = 0. Hence $A^{-1}$ does not exist.
Looking at the original system we see that Equation (2) is simply Equation (1) with all coefficients
doubled. In effect we have only one equation for the two unknowns x1 and x2. The equations are
consistent and have infinitely many solutions.
If we let x2 assume a particular value, t say, then x1 must take the value $x_1 = \frac{1}{2}(5 - 3t)$, i.e. the
solution to the given equations is:
\[
x_2 = t, \quad x_1 = \frac{1}{2}(5 - 3t), \quad \text{where } t \text{ is called a parameter.}
\]
For each value of t there are unique values for x1 and x2 which satisfy the original system of equations.
For example, if t = 1, then x2 = 1, x1 = 1; if t = −3 then x2 = −3, x1 = 7; and so on.
Now consider the system
2x1 + 3x2 = 5 (3)
4x1 + 6x2 = 9 (4)
Since the left-hand sides are the same as those in the previous system then A is the same and again
$A^{-1}$ does not exist, so the inverse matrix method cannot be used.
If we double Equation (3) we obtain
4x1 + 6x2 = 10,
which conflicts with Equation (4). There are thus no solutions to (3) and (4) and the equations are
said to be inconsistent.
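The three possible outcomes for a pair of equations can be distinguished mechanically. The Python sketch below is an illustration under our own assumptions, not a method stated in the Workbook: when the determinant of coefficients is zero, it tests whether the two numerator determinants of Cramer's rule also vanish, which (provided at least one coefficient is non-zero) indicates that one equation is a multiple of the other. The function name `classify_2x2` is hypothetical.

```python
def classify_2x2(a, b, c, d, e, f):
    """Classify the system ax + by = e, cx + dy = f."""
    if a * d - b * c != 0:
        return "unique"
    if a == b == c == d == 0:
        # Degenerate case: both equations read 0 = right-hand side.
        return "infinitely many" if e == f == 0 else "none"
    # Determinant of coefficients is zero: the left-hand sides are proportional.
    # The system is consistent only if the right-hand sides share that proportion.
    if a * f - e * c == 0 and e * d - b * f == 0:
        return "infinitely many"
    return "none"


assert classify_2x2(2, 3, 4, 6, 5, 10) == "infinitely many"  # Equations (1), (2)
assert classify_2x2(2, 3, 4, 6, 5, 9) == "none"               # Equations (3), (4)
assert classify_2x2(2, 3, 1, -2, 5, -1) == "unique"
```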
Task
What can you conclude about the solutions of the following systems?
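\[
\text{(a)} \ \begin{aligned} x_1 - 2x_2 &= 1 \\ 3x_1 - 6x_2 &= 3 \end{aligned}
\qquad
\text{(b)} \ \begin{aligned} 3x_1 + 2x_2 &= 7 \\ -6x_1 - 4x_2 &= 5 \end{aligned}
\]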
First write the systems in matrix form and find |A|:

Your solution
(a)
(b)
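Answer
(a) $\begin{bmatrix} 1 & -2 \\ 3 & -6 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$, |A| = −6 + 6 = 0;
(b) $\begin{bmatrix} 3 & 2 \\ -6 & -4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 5 \end{bmatrix}$, |A| = −12 + 12 = 0.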
Now compare the two equations in each system in turn:

Your solution
(a)
(b)
Answer
(a) The second equation is 3 times the first equation. There are infinitely many solutions of the
form x2 = t, x1 = 1 + 2t where t is arbitrary.
(b) If we multiply the first equation by (−2) we obtain −6x1 − 4x2 = −14 which is in conflict with
the second equation. The equations are inconsistent and have no solution.

It is much more tedious to use the inverse matrix to solve a system of three equations although in
principle, the method is the same as for two equations.
Consider the system
x1 − 2x2 + x3 = 3
2x1 + x2 − x3 = 5
3x1 − x2 + 2x3 = 12
We met this system in Section 8.1 where we found that |A| = 10. Hence $A^{-1}$ exists.
Task
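Find $A^{-1}$ by the method of determinants.
First form the matrix where each element of A is replaced by its minor:
Your solution
Answer
\[
\begin{bmatrix}
\begin{vmatrix} 1 & -1 \\ -1 & 2 \end{vmatrix} & \begin{vmatrix} 2 & -1 \\ 3 & 2 \end{vmatrix} & \begin{vmatrix} 2 & 1 \\ 3 & -1 \end{vmatrix} \\[2ex]
\begin{vmatrix} -2 & 1 \\ -1 & 2 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 3 & 2 \end{vmatrix} & \begin{vmatrix} 1 & -2 \\ 3 & -1 \end{vmatrix} \\[2ex]
\begin{vmatrix} -2 & 1 \\ 1 & -1 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 2 & -1 \end{vmatrix} & \begin{vmatrix} 1 & -2 \\ 2 & 1 \end{vmatrix}
\end{bmatrix}
=
\begin{bmatrix} 1 & 7 & -5 \\ -3 & -1 & 5 \\ 1 & -3 & 5 \end{bmatrix}.
\]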
Now use the 3 × 3 array of signs to obtain the matrix of cofactors:

Your solution
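Answer
The array of signs is $\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}$ so that we obtain $\begin{bmatrix} 1 & -7 & -5 \\ 3 & -1 & -5 \\ 1 & 3 & 5 \end{bmatrix}$.
Now transpose this matrix and divide by |A| to obtain $A^{-1}$:

Your solution
Answer
Transposing gives $\begin{bmatrix} 1 & 3 & 1 \\ -7 & -1 & 3 \\ -5 & -5 & 5 \end{bmatrix}$. Finally, $A^{-1} = \dfrac{1}{10}\begin{bmatrix} 1 & 3 & 1 \\ -7 & -1 & 3 \\ -5 & -5 & 5 \end{bmatrix}$.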
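Now use $X = A^{-1}B$ to solve the system of linear equations:
Your solution
Answer
\[
X = \frac{1}{10}\begin{bmatrix} 1 & 3 & 1 \\ -7 & -1 & 3 \\ -5 & -5 & 5 \end{bmatrix}\begin{bmatrix} 3 \\ 5 \\ 12 \end{bmatrix}
= \frac{1}{10}\begin{bmatrix} 30 \\ 10 \\ 20 \end{bmatrix}
= \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}
\]
Then x1 = 3, x2 = 1, x3 = 2.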
Comparing this approach to the use of Cramer’s rule for three equations (in subsection 2 of Section
8.1) we can say that the two methods are both rather tedious!
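Because the hand calculation is long, it is worth verifying the result. The Python sketch below (illustrative only; `matmul` is our own helper) checks that the matrix found above really is the inverse of A and then forms the product A⁻¹B.

```python
from fractions import Fraction


def matmul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


A = [[1, -2, 1], [2, 1, -1], [3, -1, 2]]
# The inverse found in the Task: (1/10) times the transposed matrix of cofactors.
A_inv = [[Fraction(x, 10) for x in row] for row in [[1, 3, 1], [-7, -1, 3], [-5, -5, 5]]]
B = [[3], [5], [12]]  # right-hand sides as a column matrix

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert matmul(A, A_inv) == identity
assert matmul(A_inv, A) == identity

X = matmul(A_inv, B)
assert X == [[3], [1], [2]]
```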
Equations with no unique solution

If |A| = 0, $A^{-1}$ does not exist and therefore it is easy to see that the system of equations has no
unique solution. But it is not obvious whether this is because the equations are inconsistent and have
no solution or whether they are consistent and have infinitely many solutions.
Task
Consider the systems
(a)
x1 − x2 + x3 = 4
2x1 + 3x2 − 2x3 = 3
3x1 + 2x2 − x3 = 7
(b)
x1 − x2 + x3 = 4
2x1 + 3x2 − 2x3 = 3
x1 − 11x2 + 9x3 = 13
In system (a) add the first equation to the second. What does this tell you about the system?
Your solution
Answer
The sum is 3x1 + 2x2 − x3 = 7, which is identical to the third equation. Thus, essentially, there
are only two equations x1 − x2 + x3 = 4 and 3x1 + 2x2 − x3 = 7. Now adding these two gives
4x1 + x2 = 11 or x2 = 11 − 4x1 and then
x3 = 4 − x1 + x2 = 4 − x1 + 11 − 4x1 = 15 − 5x1
Hence if we give x1 a value, t say, then x2 = 11 − 4t and x3 = 15 − 5t. Thus there is an infinite
number of solutions (one for each value of t).
In system (b) take the combination 5 times the first equation minus 2 times the second equation.
What does this tell you about the system?
Your solution
Answer
The combination is x1 − 11x2 + 9x3 = 14, which conflicts with the third equation. There is thus
no solution.
In practice, systems containing three or more linear equations are best solved by the method which
we shall introduce in Section 8.3.
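The two combinations used in the Task above can be checked with a few lines of Python. This is only a sketch: each equation is stored as a hypothetical row [a1, a2, a3, b] meaning a1 x1 + a2 x2 + a3 x3 = b, and `combine` is an illustrative helper name.

```python
def combine(rows, weights):
    """Weighted sum of equations stored as rows [a1, a2, a3, b]."""
    return [sum(w * r[k] for w, r in zip(weights, rows)) for k in range(4)]

system_a = [[1, -1, 1, 4], [2, 3, -2, 3], [3, 2, -1, 7]]
system_b = [[1, -1, 1, 4], [2, 3, -2, 3], [1, -11, 9, 13]]

# (a): equation 1 + equation 2 reproduces equation 3 exactly
sum_a = combine(system_a[:2], [1, 1])
print(sum_a == system_a[2])      # True: consistent, infinitely many solutions

# (b): 5*(equation 1) - 2*(equation 2) has the left-hand side of equation 3
combo_b = combine(system_b[:2], [5, -2])
print(combo_b, system_b[2])      # [1, -11, 9, 14] [1, -11, 9, 13]
```

In (b) the left-hand sides agree but the right-hand sides (14 and 13) do not, which is the conflict that makes the system inconsistent.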
Exercises
1. Solve the following using the inverse matrix method:
(a) 2x − 3y = 1
    4x + 4y = 2
(b) 2x − 5y = 2
    −4x + 10y = 1
(c) 6x − y = 0
    2x − 4y = 1
2. Solve the following equations using matrix methods:
(a) 2x1 + x2 − x3 = 0
    x1 + x3 = 4
    x1 + x2 + x3 = 0
(b) x1 − x2 + x3 = 1
    −x1 + x3 = 1
    x1 + x2 − x3 = 0
Answers
1. (a) $x = \tfrac{1}{2}$, $y = 0$ (b) $A^{-1}$ does not exist. (c) $x = -\tfrac{1}{22}$, $y = -\tfrac{3}{11}$
2. (a) $x_1 = \tfrac{8}{3}$, $x_2 = -4$, $x_3 = \tfrac{4}{3}$ (b) $x_1 = \tfrac{1}{2}$, $x_2 = 1$, $x_3 = \tfrac{3}{2}$
Solution by
Gauss Elimination 8.3
Introduction
Engineers often need to solve large systems of linear equations; for example in determining the forces
in a large framework or finding currents in a complicated electrical circuit. The method of Gauss
elimination provides a systematic approach to their solution.

Prerequisites
• be familiar with matrix algebra

Learning Outcomes
• identify the row operations which allow the reduction of a system of linear equations to upper triangular form
• use back-substitution to solve a system of equations in echelon form
• understand and use the method of Gauss elimination to solve a system of three simultaneous linear equations

The easiest set of three simultaneous linear equations to solve is of the following type:
3x1 = 6,
2x2 = 5,
4x3 = 7
which obviously has solution $[x_1, x_2, x_3]^T = \left[2, \tfrac{5}{2}, \tfrac{7}{4}\right]^T$ or $x_1 = 2$, $x_2 = \tfrac{5}{2}$, $x_3 = \tfrac{7}{4}$.

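In matrix form AX = B the equations are
$$\begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \\ 7 \end{bmatrix}$$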
where the matrix of coefficients, A, is clearly diagonal.
Task
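Solve the equations
$$\begin{bmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 8 \\ 2 \\ -6 \end{bmatrix}.$$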
Your solution
Answer
[x1 , x2 , x3 ]T = [4, −2, −2]T .
The next easiest system of equations to solve is of the following kind:
3x1 + x2 − x3 = 0
2x2 + x3 = 12
3x3 = 6.
The last equation can be solved immediately to give x3 = 2.
Substituting this value of x3 into the second equation gives
2x2 + 2 = 12 from which 2x2 = 10 so that x2 = 5
Substituting these values of x2 and x3 into the first equation gives
3x1 + 5 − 2 = 0 from which 3x1 = −3 so that x1 = −1
Hence the solution is [x1 , x2 , x3 ]T = [−1, 5, 2]T .
This process of solution is called back-substitution.
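In matrix form the system of equations is
$$\begin{bmatrix} 3 & 1 & -1 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 12 \\ 6 \end{bmatrix}.$$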
The matrix of coefficients is said to be upper triangular because all elements below the leading
diagonal are zero. Any system of equations in which the coefficient matrix is triangular (whether
upper or lower) will be particularly easy to solve.
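Back-substitution for an upper triangular system can be sketched in Python as follows. The function name `back_substitute` is illustrative, and the sketch assumes that every diagonal element is non-zero; it is applied to the worked example above.

```python
from fractions import Fraction

def back_substitute(U, c):
    """Solve U x = c, where U is upper triangular with non-zero diagonal."""
    n = len(c)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):           # start with the last equation
        # contribution of the unknowns that have already been found
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(c[i] - known) / U[i][i]
    return x

U = [[3, 1, -1], [0, 2, 1], [0, 0, 3]]
c = [0, 12, 6]
solution = back_substitute(U, c)
print([str(v) for v in solution])   # ['-1', '5', '2']
```

Using `Fraction` keeps the arithmetic exact when the solution is not made up of integers.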
Task
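Solve the following system of equations by back-substitution.
$$\begin{bmatrix} 2 & -1 & 3 \\ 0 & 3 & -1 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 7 \\ 5 \\ 2 \end{bmatrix}.$$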
Write the equations in expanded form:

Your solution
Answer
2x1 − x2 + 3x3 = 7
3x2 − x3 = 5
2x3 = 2
Now find the solution for x3 :

Your solution
x3 =
Answer
The last equation can be solved immediately to give x3 = 1.
Using this value for x3 , obtain x2 and x1 :
Your solution
x2 = x1 =
Answer
x2 = 2, x1 = 3. Therefore the solution is x1 = 3, x2 = 2 and x3 = 1.
Although we have worked so far with integers this will not always be the case and fractions will enter
the solution process. We must then take care and it is always wise to check that the equations
balance using the calculated solution.
2. The general system of three simultaneous linear equations

In the previous subsection we met systems of equations which could be solved by back-substitution
alone. In this Section we meet systems which are not so amenable and where preliminary work must
be done before back-substitution can be used.
Consider the system
x1 + 3x2 + 5x3 = 14
2x1 − x2 − 3x3 = 3
4x1 + 5x2 − x3 = 7
We will use the solution method known as Gauss elimination, which has three stages. In the first
stage the equations are written in matrix form. In the second stage the matrix equations are replaced
by a system of equations having the same solution but which are in triangular form. In the final
stage the new system is solved by back-substitution.
Stage 1: Matrix Formulation
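The first step is to write the equations in matrix form:
$$\begin{bmatrix} 1 & 3 & 5 \\ 2 & -1 & -3 \\ 4 & 5 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 14 \\ 3 \\ 7 \end{bmatrix}.$$
Then, for conciseness, we combine the matrix of coefficients with the column vector of right-hand sides to produce the augmented matrix:
$$\left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 2 & -1 & -3 & 3 \\ 4 & 5 & -1 & 7 \end{array}\right]$$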
If the general system of equations is written AX = B then the augmented matrix is written [A|B].
Hence the first equation
x1 + 3x2 + 5x3 = 14
is replaced by the first row of the augmented matrix,
1 3 5 | 14 and so on.
Stage 1 has now been completed. We will next triangularise the matrix of coefficients by means of
row operations. There are three possible row operations:
• interchange two rows;
• multiply or divide a row by a non-zero constant factor;
• add to, or subtract from, one row a multiple of another row.
Note that interchanging two rows of the augmented matrix is equivalent to interchanging the two
corresponding equations. The shorthand notation we use is introduced by example. To interchange
row 1 and row 3 we write R1 ↔ R3. To divide row 2 by 5 we write R2 ÷ 5. To add three times row
1 to row 2, we write R2 + 3R1. In the Task which follows you will see where these annotations are
placed.
Note that these operations neither create nor destroy solutions so that at every step the system of
equations has the same solution as the original system.
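The three row operations can be sketched as small Python functions acting on an augmented matrix stored as a list of rows. The names `swap`, `scale` and `add_multiple` are hypothetical, and rows are counted from 0 in the code, so row 1 of the text is index 0.

```python
from fractions import Fraction

def swap(M, i, j):
    """Ri <-> Rj: interchange two rows (returns a new matrix)."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale(M, i, k):
    """Multiply row i by a non-zero constant k (dividing by 5 is k = 1/5)."""
    if k == 0:
        raise ValueError("the factor must be non-zero")
    M = [row[:] for row in M]
    M[i] = [k * v for v in M[i]]
    return M

def add_multiple(M, i, j, k):
    """Ri + k Rj: add k times row j to row i."""
    M = [row[:] for row in M]
    M[i] = [a + k * b for a, b in zip(M[i], M[j])]
    return M

aug = [[1, 3, 5, 14], [2, -1, -3, 3], [4, 5, -1, 7]]
r1_r3 = swap(aug, 0, 2)                      # R1 <-> R3
r2_div5 = scale(aug, 1, Fraction(1, 5))      # R2 / 5
r2_plus_3r1 = add_multiple(aug, 1, 0, 3)     # R2 + 3R1
print(r2_plus_3r1[1])                        # [5, 8, 12, 45]
```

Each function returns a new matrix and leaves its argument unchanged, which makes it easy to compare the system before and after an operation.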
Stage 2: Triangularisation
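The second stage proceeds by first eliminating x1 from the second and third equations using row operations.
$$\left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 2 & -1 & -3 & 3 \\ 4 & 5 & -1 & 7 \end{array}\right]\begin{array}{l} \phantom{R1} \\ R2 - 2 \times R1 \\ R3 - 4 \times R1 \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 0 & -7 & -13 & -25 \\ 0 & -7 & -21 & -49 \end{array}\right]$$
In the above we have subtracted twice row (equation) 1 from row (equation) 2, and four times row (equation) 1 from row (equation) 3.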
In full these operations would be written, respectively, as
(2x1 − x2 − 3x3 ) − 2(x1 + 3x2 + 5x3 ) = 3 − 2 × 14 or −7x2 − 13x3 = −25
and
(4x1 + 5x2 − x3 ) − 4(x1 + 3x2 + 5x3 ) = 7 − 4 × 14 or −7x2 − 21x3 = −49.
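Now since all the non-zero elements in rows 2 and 3 are negative we multiply throughout by −1:
$$\left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 0 & -7 & -13 & -25 \\ 0 & -7 & -21 & -49 \end{array}\right]\begin{array}{l} \phantom{R1} \\ R2 \times (-1) \\ R3 \times (-1) \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 0 & 7 & 13 & 25 \\ 0 & 7 & 21 & 49 \end{array}\right]$$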
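Finally, we eliminate x2 from the third equation by subtracting equation 2 from equation 3
i.e. R3 − R2:
$$\left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 0 & 7 & 13 & 25 \\ 0 & 7 & 21 & 49 \end{array}\right]\begin{array}{l} \phantom{R1} \\ \phantom{R2} \\ R3 - R2 \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & 3 & 5 & 14 \\ 0 & 7 & 13 & 25 \\ 0 & 0 & 8 & 24 \end{array}\right]$$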
The system is now in triangular form.
Stage 3: Back Substitution
Here we solve the equations from bottom to top. At each step of the back substitution process we
encounter equations which only have a single unknown and so can be easily solved.
Task
Now complete the solution to the above system by back-substitution.
Your solution
Answer
In full the equations are
x1 + 3x2 + 5x3 = 14
7x2 + 13x3 = 25
8x3 = 24
From the last equation we see that x3 = 3.

Substituting this value into the second equation gives
7x2 + 39 = 25 or 7x2 = −14 so that x2 = −2.
Finally, using these values for x2 and x3 in equation 1 gives x1 − 6 + 15 = 14. Hence x1 = 5. The
solution is therefore [x1 , x2 , x3 ]T = [5, −2, 3]T
Check that these values satisfy the original system of equations.
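The three stages can be combined into one short routine. The following Python sketch assumes that the system has a unique solution and uses exact fractions; `gauss_solve` is an illustrative name, and the order of row operations it chooses (interchange rows only when the pivot is zero, then subtract multiples of the pivot row) may differ from the hand calculation above while giving the same solution.

```python
from fractions import Fraction

def gauss_solve(A, B):
    """Solve A x = B by Gauss elimination followed by back-substitution."""
    n = len(B)
    # Stage 1: form the augmented matrix [A|B]
    M = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(A, B)]
    # Stage 2: triangularisation
    for col in range(n):
        # interchange rows if necessary so that the pivot is non-zero
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # row r is replaced by (row r) - factor * (row col)
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    # Stage 3: back-substitution, from the last row upwards
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

x = gauss_solve([[1, 3, 5], [2, -1, -3], [4, 5, -1]], [14, 3, 7])
print([str(v) for v in x])   # ['5', '-2', '3']
```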
Task
Solve
2x1 − 3x2 + 4x3 = 2

4x1 + x2 + 2x3 = 2
x1 − x2 + 3x3 = 3
Write down the augmented matrix for this system and then interchange rows 1 and 3:
Your solution
Answer
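Augmented matrix:
$$\left[\begin{array}{ccc|c} 2 & -3 & 4 & 2 \\ 4 & 1 & 2 & 2 \\ 1 & -1 & 3 & 3 \end{array}\right]\begin{array}{l} R1 \leftrightarrow R3 \\ \phantom{R2} \\ \phantom{R3} \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & -1 & 3 & 3 \\ 4 & 1 & 2 & 2 \\ 2 & -3 & 4 & 2 \end{array}\right]$$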
Now subtract suitable multiples of row 1 from row 2 and from row 3 to eliminate the x1 coefficient
from rows 2 and 3:
Your solution
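Answer
$$\left[\begin{array}{ccc|c} 1 & -1 & 3 & 3 \\ 4 & 1 & 2 & 2 \\ 2 & -3 & 4 & 2 \end{array}\right]\begin{array}{l} \phantom{R1} \\ R2 - 4R1 \\ R3 - 2R1 \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & -1 & 3 & 3 \\ 0 & 5 & -10 & -10 \\ 0 & -1 & -2 & -4 \end{array}\right]$$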
Now divide row 2 by 5 and add a suitable multiple of the result to row 3:
Your solution
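Answer
$$\left[\begin{array}{ccc|c} 1 & -1 & 3 & 3 \\ 0 & 5 & -10 & -10 \\ 0 & -1 & -2 & -4 \end{array}\right]\begin{array}{l} \phantom{R1} \\ R2 \div 5 \\ \phantom{R3} \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & -1 & 3 & 3 \\ 0 & 1 & -2 & -2 \\ 0 & -1 & -2 & -4 \end{array}\right]\begin{array}{l} \phantom{R1} \\ \phantom{R2} \\ R3 + R2 \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & -1 & 3 & 3 \\ 0 & 1 & -2 & -2 \\ 0 & 0 & -4 & -6 \end{array}\right]$$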
Now complete the solution using back-substitution:

Your solution
Answer
The equations in full are
x1 − x2 + 3x3 = 3
x2 − 2x3 = −2
−4x3 = −6.
The last equation reduces to $x_3 = \tfrac{3}{2}$.

Using this value in the second equation gives $x_2 - 3 = -2$ so that $x_2 = 1$.
Finally, $x_1 - 1 + \tfrac{9}{2} = 3$ so that $x_1 = -\tfrac{1}{2}$.
The solution is therefore $[x_1, x_2, x_3]^T = \left[-\tfrac{1}{2}, 1, \tfrac{3}{2}\right]^T$.

You should check these values in the original equations to ensure that the equations balance.
Again we emphasise that we chose a particular set of procedures in Stage 2. This was chosen mainly
to keep the arithmetic simple by delaying the introduction of fractions. Sometimes we are courageous
and take fewer, harder steps.
An important point to note is that when in Stage 2 we wrote R2 − 2R1 against row 2, what we
meant is that row 2 is replaced by the combination (row 2) − 2×(row 1).
In general, the operation
row i − α × row j
means replace row i by the combination
row i − α × row j.
3. Equations which have an infinite number of solutions

Consider the following system of equations
x1 + x2 − 3x3 = 3
2x1 − 3x2 + 4x3 = −4
x1 − x2 + x3 = −1
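In augmented form we have:
$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 3 \\ 2 & -3 & 4 & -4 \\ 1 & -1 & 1 & -1 \end{array}\right]$$
Now performing the usual Gauss elimination operations we have
$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 3 \\ 2 & -3 & 4 & -4 \\ 1 & -1 & 1 & -1 \end{array}\right]\begin{array}{l} \phantom{R1} \\ R2 - 2 \times R1 \\ R3 - R1 \end{array} \Rightarrow \left[\begin{array}{ccc|c} 1 & 1 & -3 & 3 \\ 0 & -5 & 10 & -10 \\ 0 & -2 & 4 & -4 \end{array}\right]$$
Now applying R2 ÷ (−5) and R3 ÷ (−2) gives
$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 3 \\ 0 & 1 & -2 & 2 \\ 0 & 1 & -2 & 2 \end{array}\right]$$
Then R3 − R2 gives
$$\left[\begin{array}{ccc|c} 1 & 1 & -3 & 3 \\ 0 & 1 & -2 & 2 \\ 0 & 0 & 0 & 0 \end{array}\right]$$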
We see that all the elements in the last row are zero. This means that the variable x3 can take any
value whatsoever, so let x3 = t then using back substitution the second row now implies
x2 = 2 + 2x3 = 2 + 2t
and then the first row implies
x1 = 3 − x2 + 3x3 = 3 − (2 + 2t) + 3(t) = 1 + t
In this example the system of equations has an infinite number of solutions:
x1 = 1 + t, x2 = 2 + 2t, x3 = t or [x1 , x2 , x3 ]T = [1 + t, 2 + 2t, t]T
where t can be assigned any value. For every value of t these expressions for x1 , x2 and x3 will
simultaneously satisfy each of the three given equations.
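A quick numerical check of this family of solutions can be sketched in Python. The helper names `family` and `satisfies` are illustrative; the check substitutes several arbitrarily chosen values of t into the three original equations.

```python
# each equation is stored as (coefficients, right-hand side)
equations = [([1, 1, -3], 3), ([2, -3, 4], -4), ([1, -1, 1], -1)]

def family(t):
    """The one-parameter family x1 = 1 + t, x2 = 2 + 2t, x3 = t."""
    return [1 + t, 2 + 2 * t, t]

def satisfies(x):
    """True if x satisfies all three original equations."""
    return all(sum(a * v for a, v in zip(coeffs, x)) == rhs
               for coeffs, rhs in equations)

checks = [satisfies(family(t)) for t in (-2, 0, 1, 7)]
print(checks)   # [True, True, True, True]
```

A finite number of checks is not a proof, but the algebra above shows that the same cancellation happens for every value of t.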
Systems of linear equations arise in the modelling of electrical circuits or networks. By breaking
down a complicated system into simple loops, Kirchhoff’s law can be applied. This leads to a set of
linear equations in the unknown quantities (usually currents) which can easily be solved by one of
the methods described in this Workbook.
Currents in three loops
In the circuit shown find the currents (i1 , i2 , i3 ) in the loops.
Figure 2 (circuit with three loops carrying currents i1, i2 and i3; resistors of 2 Ω, 3 Ω, 6 Ω and 4 Ω; cells of 4 V, 5 V and 6 V)
Solution
Loop 1 gives
2(i1 ) + 3(i1 − i2 ) = 5 → 5i1 − 3i2 = 5
Loop 2 gives
6(i2 − i3 ) + 3(i2 − i1 ) = 4 → −3i1 + 9i2 − 6i3 = 4
Loop 3 gives
6(i3 − i2 ) + 4(i3 ) = 6 − 5 → −6i2 + 10i3 = 1
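Note that in loop 3, the current generated by the 6 V cell is positive and for the 5 V cell negative in
the direction of the arrow.
In matrix form
$$\begin{bmatrix} 5 & -3 & 0 \\ -3 & 9 & -6 \\ 0 & -6 & 10 \end{bmatrix}\begin{bmatrix} i_1 \\ i_2 \\ i_3 \end{bmatrix} = \begin{bmatrix} 5 \\ 4 \\ 1 \end{bmatrix}$$
Solving gives
$$i_1 = \frac{34}{15}, \quad i_2 = \frac{19}{9}, \quad i_3 = \frac{41}{30}$$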
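The text does not say which method was used to solve the loop equations. As one possibility, the following Python sketch applies Cramer's rule (Section 8.1) with exact fractions; the function names `det3` and `cramer` are illustrative.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer(A, B):
    """Solve the 3x3 system A x = B by Cramer's rule (needs |A| != 0)."""
    d = det3(A)
    if d == 0:
        raise ValueError("|A| = 0: no unique solution")
    solution = []
    for k in range(3):
        # replace column k of A by the column of right-hand sides
        Ak = [[B[r] if c == k else A[r][c] for c in range(3)] for r in range(3)]
        solution.append(Fraction(det3(Ak), d))
    return solution

A = [[5, -3, 0], [-3, 9, -6], [0, -6, 10]]
B = [5, 4, 1]
i1, i2, i3 = cramer(A, B)
print(i1, i2, i3)   # 34/15 19/9 41/30
```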
Velocity of a rocket
The upward velocity of a rocket, measured at 3 different times, is shown in the following table:

Time, t (seconds)    Velocity, v (metres/second)
5                    106.8
8                    177.2
12                   279.2

The velocity over the time interval 5 ≤ t ≤ 12 is approximated by a quadratic expression as
$v(t) = a_1 t^2 + a_2 t + a_3$
Find the values of a1 , a2 and a3 .
Solution
Substituting the values from the table into the quadratic equation for v(t) gives:
106.8 = 25a1 + 5a2 + a3
177.2 = 64a1 + 8a2 + a3
279.2 = 144a1 + 12a2 + a3
or
$$\begin{bmatrix} 25 & 5 & 1 \\ 64 & 8 & 1 \\ 144 & 12 & 1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 106.8 \\ 177.2 \\ 279.2 \end{bmatrix}$$
Applying one of the methods from this Workbook gives the solution as
a1 = 0.2905, a2 = 19.6905, a3 = 1.0857 to 4 d.p.
As the original values were all experimental observations then the values of the unknowns are all
approximations. The relation $v(t) = 0.2905t^2 + 19.6905t + 1.0857$ can now be used to predict the
approximate velocity of the rocket for any time within the interval 5 ≤ t ≤ 12.
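The fitted quadratic can be checked against the table with a short Python sketch. Because the coefficients are rounded to 4 d.p., the tabulated velocities are reproduced only approximately; the time t = 10 used for the prediction is an arbitrary illustrative choice.

```python
def v(t, a1=0.2905, a2=19.6905, a3=1.0857):
    """Quadratic approximation to the velocity (m/s) for 5 <= t <= 12."""
    return a1 * t ** 2 + a2 * t + a3

table = {5: 106.8, 8: 177.2, 12: 279.2}

# difference between the fitted and the measured velocity at each time
residuals = {t: v(t) - measured for t, measured in table.items()}
for t, r in residuals.items():
    print(f"t = {t:2d}: fitted - measured = {r:.4f}")

# predicted velocity at a time inside the interval
v_at_10 = v(10)
print(f"v(10) = {v_at_10:.4f} m/s")   # v(10) = 227.0407 m/s
```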
Exercises
Solve the following using Gauss elimination:
1.
2x1 + x2 − x3 = 0
x1 + x3 = 4
x1 + x2 + x3 = 0
2.
x1 − x2 + x3 = 1
−x1 + x3 = 1
x1 + x2 − x3 = 0
3.
x1 + x2 + x3 = 2
2x1 + 3x2 + 4x3 = 3
x1 − 2x2 − x3 = 1
4.
x1 − 2x2 − 3x3 = −1
3x1 + x2 + x3 = 4
11x1 − x2 − 3x3 = 10
You may need to think carefully about this system.
Answers
(1) $x_1 = \tfrac{8}{3}$, $x_2 = -4$, $x_3 = \tfrac{4}{3}$
(2) $x_1 = \tfrac{1}{2}$, $x_2 = 1$, $x_3 = \tfrac{3}{2}$
(3) x1 = 2, x2 = 1, x3 = −1
(4) infinite number of solutions: x1 = t, x2 = 11 − 10t, x3 = 7t − 7
Contents 9
Vectors
9.1 Basic Concepts of Vectors
9.2 Cartesian Components of Vectors
9.3 The Scalar Product
9.4 The Vector Product
9.5 Lines and Planes
Learning outcomes
In this Workbook you will learn what a vector is and how to combine vectors together using
the triangle law. You will be able to represent a vector by its Cartesian components. You
will be able to multiply vectors together using either the scalar product or the vector
product. You will be able to apply your knowledge of vectors to solve problems involving
forces and to geometric problems involving lines and planes.
Basic Concepts
of Vectors 9.1
Introduction
In engineering, frequent reference is made to physical quantities, such as force, speed and time. For
example, we talk of the speed of a car, and the force in a compressed spring. It is useful to separate
these physical quantities into two types. Quantities of the first type are known as scalars. These
can be fully described by a single number known as the magnitude. Quantities of the second type
are those which require the specification of a direction, in addition to a magnitude, before they
are completely described. These are known as vectors. Special methods have been developed for
handling vectors in calculations, giving rise to subjects such as vector algebra, vector geometry and
vector calculus. Quantities that are vectors must be manipulated according to certain rules, which
are described in this and subsequent Sections.

Prerequisites
• be familiar with all the basic rules of algebra

Learning Outcomes
On completion you should be able to . . .
• categorize a number of common physical quantities as scalar or vector
• represent vectors by directed line segments
• combine, or add, vectors using the triangle law
• resolve a vector into two perpendicular components
2 HELM (2006):
Workbook 9: Vectors
1. Introduction
It is useful to separate physical quantities into two types: the first are called scalars; the second are
known as vectors. A scalar is a quantity that can be described by a single number which can be
positive, negative or zero. An example of a scalar quantity is the mass of an object, so we might
state that ‘the mass of the stone is 3 kg’. It is important to give the units in which the quantity
is measured. Obvious examples of scalars are temperature and length, but there are many other
engineering applications in which scalars play an important role. For example, speed, work, voltage
and energy are all scalars.
On the other hand, vectors are quantities which require the specification of a magnitude and a
direction. An example of a vector quantity is the force applied to an object to make it move. When
the object shown in Figure 1 is moved by applying a force to it we achieve different effects depending
on the direction of the force.
Figure 1: A force is a vector quantity
In order to specify the force completely we must state not only its magnitude (its ‘strength’) but
also the direction in which the force acts. For example we might state that ‘a force of 5 newtons
is applied vertically from above’. Clearly this force would have a different effect from one applied
horizontally. The direction in which the force acts is crucial.
There are many engineering applications where vectors are important. Force, acceleration, velocity,
electric and magnetic fields are all described by vectors. Furthermore, when computer software is
written to control the position of a robot, the position is described by vectors.
Sometimes confusion can arise because words used in general conversation have specific technical
meanings when used in engineering calculations. An example is the use of the words ‘speed’ and
‘velocity’. In everyday conversation these words have the same meaning and are used interchangeably.
However in more precise language they are not the same. Speed is a scalar quantity described by
giving a single number in appropriate units; for example ‘the speed of the car is 40 kilometres per
hour’. On the other hand velocity is a vector quantity and must be specified by giving a direction
as well. For example ‘the velocity of the aircraft is 20 metres per second due north’.
In engineering calculations, the words speed and velocity cannot be used interchangeably. Similar
problems arise from use of the words ‘mass’ and ‘weight’. Mass is a scalar which describes the
amount of substance in an object. The unit of mass is the kilogramme. Weight is a vector, the
direction of which is vertically downwards because weight arises from the action of gravity. The unit
of weight is the newton. Displacement and distance are related quantities which can also cause
confusion. Whereas distance is a scalar, displacement is ‘directed distance’, that is, distance together
with a specified direction. So, referring to Figure 2, if an object is moved from point A to point B,
we can state that the distance moved is 10 metres, but the displacement is 10 metres in the direction
from A to B.
You will meet many other quantities in the course of your studies and it will be helpful to know which
are vectors and which are scalars. Some common quantities and their type are listed in Table 1. The
S.I. units in which these are measured are also shown.
Figure 2: Displacement means directed distance (the displacement from A to B, of length 10 m)
Table 1. Some common scalar and vector quantities

quantity        type      S.I. unit
distance        scalar    metre, m
mass            scalar    kilogramme, kg
temperature     scalar    kelvin, K
pressure        scalar    pascal, Pa
work            scalar    joule, J
energy          scalar    joule, J
displacement    vector    metre, m
force           vector    newton, N
velocity        vector    metres per second, m s$^{-1}$
acceleration    vector    metres per second per second, m s$^{-2}$
Exercise
State which of the following are scalars and which are vectors:
(a) the volume of a petrol tank,
(b) a length measured in metres,
(c) a length measured in miles,
(d) the angular velocity of a flywheel,
(e) the relative velocity of two aircraft,
(f) the work done by a force,
(g) electrostatic potential,
(h) the momentum of an atomic particle.
Answer
(a), (b), (c), (f) and (g) are scalars.
(d), (e) and (h) are vectors.
2. The mathematical description of vector quantities

Because a vector has a direction as well as a magnitude we can represent a vector by drawing a
line. The length of the line represents the magnitude of the vector given some appropriate scale,
and the direction of the line represents the direction of the vector. We call this representation a
directed line segment. For example, Figure 3 shows a vector which represents a velocity of 3 m s$^{-1}$ north-west. Note that the arrow on the vector indicates the direction required.
Figure 3: A vector quantity can be represented by drawing a line (the figure includes a scale bar representing 1 m s$^{-1}$)
More generally, Figure 4 shows an arbitrary vector quantity.
Figure 4: Representation of vectors (the vector from A to B, labelled $\overrightarrow{AB}$ or a)
It is important when writing vectors to distinguish them from scalars. Various notations are used.
In Figure 4 we emphasise that we are dealing with the vector from A to B by using an arrow and
writing $\overrightarrow{AB}$. Often, in textbooks, vectors are indicated by using a bold typeface such as $\mathbf{a}$. It is
difficult when handwriting to reproduce the bold face and so it is conventional to underline vector
quantities and write $\underline{a}$ instead. So $\overrightarrow{AB}$ and $\underline{a}$ represent the same vector in Figure 4. We can also use
the notation AB. In general in this Workbook we will use underlining but we will also use the arrow
notation where it is particularly helpful.
Example 1
Figure 5 shows an object pulled by a force of 10 N at an angle of 60° to the
horizontal.
Figure 5 (a force of 10 N acting at 60° to the horizontal)
Show how this force can be represented by a vector.
Solution
The force can be represented by drawing a line of length 10 units at an angle of 60° to the horizontal,
as shown below.
Figure 6 (a line of length 10 units drawn at 60° to the horizontal, with the scale of 1 unit marked)
We have labelled the force F. When several forces are involved they can be labelled $F_1$, $F_2$ and so
on.
When we wish to refer simply to the magnitude (or length) of a vector we write this using the
modulus sign as $|\overrightarrow{AB}|$, or $|a|$, or simply a (without the underline).
In general two vectors are said to be equal vectors if they have the same magnitude and same
direction. So, in Figure 7 the vectors $\overrightarrow{CD}$ and $\overrightarrow{AB}$ are equal even though their locations differ.
Figure 7: Equal vectors
This is a useful and important property of vectors: a vector is defined only by its direction and
magnitude, not by its location in space. These vectors are often called free vectors.
The vector −a is a vector in the opposite direction to a, but has the same magnitude as a, as shown
in Figure 8.
Figure 8 (the vector a from A to B, and the vector $-a = \overrightarrow{BA}$ from B to A)
Geometrically, if $a = \overrightarrow{AB}$ then $-a = \overrightarrow{BA}$.
Exercises
1. An object is subject to two forces, one of 3 N vertically downwards, and one of 8 N, horizontally
to the right. Draw a diagram representing these two forces as vectors.
2. Draw a diagram showing an arbitrary vector F . On the diagram show the vector −F .
3. Vectors p and q are equal vectors. Draw a diagram to represent p and q.
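4. If $\underline{F}$ is a vector, what is meant by $F$ (written without the underline)?

Answers
1., 2., 3. (Diagrams: 1. a force of 8 N drawn horizontally to the right and a force of 3 N drawn vertically downwards; 2. a vector F together with −F; 3. two equal vectors p and q.)
4. $F$ is the magnitude of $\underline{F}$.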
3. Addition of vectors
Vectors are added in a particular way known as the triangle law. To see why it is appropriate
to add them in this way, consider the following example:
Example: The route taken by an automated vehicle

An unmanned vehicle moves on tracks around a factory floor carrying components from the store at
A to workers at C as shown in Figure 9.
Figure 9 (triangle ABC with the store at A and the workers at C; its sides are the displacements $\overrightarrow{AB}$, $\overrightarrow{BC}$ and $\overrightarrow{AC}$)
The vehicle may arrive at C either directly or via an intermediate point B. The movement from A
to B can be represented by a displacement vector $\overrightarrow{AB}$. Similarly, movement from B to C can be
represented by the displacement vector $\overrightarrow{BC}$, and movement from A to C can be represented by $\overrightarrow{AC}$.
Since travelling from A to B and then B to C is equivalent to travelling directly from A to C we
write
$$\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$$
This is an example of the triangle law for adding vectors. We add vectors $\overrightarrow{AB}$ and $\overrightarrow{BC}$ by placing
the tail of $\overrightarrow{BC}$ at the head of $\overrightarrow{AB}$ and completing the third side of the triangle so formed ($\overrightarrow{AC}$).
Figure 10: Two vectors a and b

Consider the more general situation in Figure 10. Suppose we wish to add b to a. To do this b is
translated, keeping its direction and length unchanged, until its tail coincides with the head of a.
Then the sum a + b is defined by the vector represented by the third side of the completed triangle,
that is c in Figure 11. Note, from Figure 11, that we can write c = a + b since going along a and
then along b is equivalent to going along c.
Figure 11: Addition of the two vectors of Figure 10 using the triangle law
Task
Using vectors a and b shown below, draw a diagram to find a + b. Find also b + a.
Is a + b the same as b + a?
(Diagram: the vectors a and b)
Your solution
Answer
(Diagram: a and b placed head to tail, with a + b as the third side of the triangle)
Yes, a + b = b + a.
It is possible, using the triangle law, to prove the following rules which apply to any three vectors a,
b and c:
Key Point 1
Vector Addition
a + b = b + a                  (vector addition is commutative)
a + (b + c) = (a + b) + c      (vector addition is associative)
Example: Resultant of two forces acting upon a body

A force $F_1$ of 2 N acts vertically downwards, and a force $F_2$ of 3 N acts horizontally to the right,
upon the body shown in Figure 12.
Figure 12 (triangle ABC formed by $F_1$ = 2 N, $F_2$ = 3 N and the resultant R, with the angle θ marked)
We can use vector addition to find the combined effect or resultant of the two concurrent forces.
(Concurrent means that the forces act through the same point.) Translating $F_2$ until its tail touches
the head of $F_1$, we complete the triangle ABC as shown. The vector represented by the third side is
the resultant, R. We write
$R = F_1 + F_2$
and say that R is the vector sum of $F_2$ and $F_1$. The resultant force acts at an angle of θ below the
horizontal where tan θ = 2/3, so that θ = 33.7°, and has magnitude (given by Pythagoras’ theorem)
$\sqrt{13}$ N.
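The resultant can also be found numerically. The sketch below is written in Python and assumes a set of axes that the text does not introduce until later: x horizontally to the right and y vertically upwards, so that each force is stored as a pair of perpendicular components.

```python
import math

# assumed axes: x to the right, y upwards
F1 = (0.0, -2.0)   # 2 N vertically downwards
F2 = (3.0, 0.0)    # 3 N horizontally to the right

# the components of the resultant are the sums of the components
R = (F1[0] + F2[0], F1[1] + F2[1])

magnitude = math.hypot(R[0], R[1])                    # Pythagoras' theorem
angle_below = math.degrees(math.atan2(-R[1], R[0]))   # angle below the horizontal

print(f"{magnitude:.3f} N at {angle_below:.1f} degrees below the horizontal")
# 3.606 N at 33.7 degrees below the horizontal
```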
Example: Resolving a force into two perpendicular directions

In the previous Example we saw that two forces acting upon a body can be replaced by a single force
which has the same effect. It is sometimes useful to reverse this process and consider a single force
as equivalent to two forces acting at right-angles to each other.
Consider the force F inclined at an angle θ to the horizontal as shown in Figure 13.
Figure 13 (the force F at angle θ to the horizontal, with perpendicular components F cos θ and F sin θ)
F can be replaced by two forces, one of magnitude F cos θ and one of magnitude F sin θ as shown.
We say that F has been resolved into two perpendicular components. This is sensible because
if we re-combine the two perpendicular forces of magnitudes F cos θ and F sin θ using the triangle
law we find F to be the resultant force.
For example, Figure 14 shows a force of 5 N acting at an angle of 30° to the x axis. It can be resolved
into two components, one directed along the x axis with magnitude 5 cos 30° and one perpendicular
to this of magnitude 5 sin 30°. Together, these two components have the same effect as the original
force.
Figure 14 (a force of 5 N at 30° to the x axis, with components 5 cos 30° N along the x axis and 5 sin 30° N along the y axis)
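Resolving a force into two perpendicular components is a one-line calculation. The following Python sketch uses the hypothetical helper name `resolve` and measures the angle from the x axis, as in the example of the 5 N force.

```python
import math

def resolve(magnitude, angle_deg):
    """Components along and perpendicular to the x axis."""
    theta = math.radians(angle_deg)   # math.cos and math.sin work in radians
    return magnitude * math.cos(theta), magnitude * math.sin(theta)

fx, fy = resolve(5, 30)
print(f"{fx:.2f} N along x, {fy:.2f} N along y")   # 4.33 N along x, 2.50 N along y

# recombining the two components recovers the original magnitude
print(f"{math.hypot(fx, fy):.2f} N")               # 5.00 N
```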
Task
Consider the force shown in the diagram below.
(Diagram: a force of 15 N acting at 40° to the horizontal)
Resolve this force into two perpendicular components, one horizontally to the right,
and one vertically upwards.
Your solution
Answer
Horizontal component is 15 cos 40° N = 11.49 N; vertical component is 15 sin 40° N = 9.64 N
The need to resolve a vector along a given direction occurs in other areas. For example, as a police
car or ambulance with its siren operating passes by, the pitch of the siren appears to increase as the
vehicle approaches and decrease as it goes away. This change in pitch is known as the Doppler
effect. This effect occurs in any situation where waves are reflected from a moving object.
A radar gun produces a signal which is bounced off the target moving vehicle so that when it returns to
the gun, which also acts as a receiver, it has changed pitch. The speed of the vehicle can be calculated
from the change in pitch. The speed indicated on the radar gun is the speed directly towards or away
from the gun. However it is not usual to place oneself directly in front of a moving vehicle when using the
radar gun (Figure 15(a)). Consequently the gun is used at an angle to the line of traffic (Figure 15(b)).
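Figure 15 ((a) the radar gun directly in line with the velocity v of the vehicle; (b) the radar gun at an angle to the line of traffic)
This means that it registers only the component of the velocity towards the gun. Suppose that the
true speed along the road is v and that θ is the angle between the road and the line from the gun to the vehicle. Then the component measured by the gun (v cos θ) is less than v.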
Example 2
A safety inspector wishes to check the speed of a train along a straight piece of
track. She stands 10 m to the side of the track and uses a radar gun. If the
reading on the gun is to be within 5% of the true speed of the train, how far away
from the approaching train should the reading be taken?
Solution
Figure 16 (the radar gun at distance s to the side of the track and distance d from the train, with the angle θ marked)
For an error of 5%, the gun should read 0.95|v|. So
|v| cos θ = 0.95|v| or cos θ = 0.95
If the distance to the side of the track at which the gun is used is s m and the distance between
the radar gun and the train is d m, then
$$\sin\theta = \frac{s}{d}$$
Here s = 10 so
$$\sqrt{1 - \cos^2\theta} = \frac{10}{d}.$$
So, if cos θ = 0.95
$$d = \frac{10}{\sqrt{1 - (0.95)^2}} = 32.03$$
This means that the reading should be taken when the train is over 32 m from the radar gun to
ensure an error of less than 5%.
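The calculation generalises to other offsets and tolerances. In the Python sketch below, `min_distance` is an illustrative name; it assumes, as in the solution above, that the gun reads v cos θ and that sin θ = s/d.

```python
import math

def min_distance(side_offset, max_error):
    """Distance d (m) at which the reading is a fraction max_error below the true speed."""
    cos_theta = 1 - max_error                   # required ratio reading / true speed
    sin_theta = math.sqrt(1 - cos_theta ** 2)   # from sin^2 + cos^2 = 1
    return side_offset / sin_theta              # because sin(theta) = s / d

d = min_distance(10, 0.05)
print(f"{d:.2f} m")   # 32.03 m
```

For any distance greater than this value the angle θ is smaller, so the error is less than 5%.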
The force vectors on an aeroplane in steady flight

The forces acting on an aeroplane are shown in Figure 17.
Figure 17 (labels: lift L, thrust T, drag D, the angles α and β, the flight path and the horizon)
The magnitudes (strengths) of the forces are indicated by
T : the thrust provided by the engines,
W : the weight,
D: the drag (acting against the direction of flight) and
L: the lift (taken perpendicular to the path.)
In a more realistic situation force vectors in three dimensions would need to be considered. These
are introduced later in this Workbook.
As the plane is in steady flight the sum of the forces in any direction is zero. (If this were not
the case, then, by Newton’s second law, the non-zero resultant force would cause the aeroplane to
accelerate.)
So, resolving forces in the direction of the path:
T cos α − D − W sin β = 0
Then, resolving forces perpendicular to the path:
T sin α + L − W cos β = 0
If the plane has mass 72 tonnes (72 000 kg), the drag is 130 kN, the lift is 690 kN and β = 6°, find the
magnitude of the thrust and the value of α to maintain steady flight. From these two equations we
see:
T cos α = D + W sin β = 130000 + (72000)(9.81) sin 6° = 203830.54
and
T sin α = W cos β − L = (72000)(9.81) cos 6° − 690000 = 12450.71
hence
$$\tan\alpha = \frac{T\sin\alpha}{T\cos\alpha} = \frac{12450.71}{203830.54} = 0.061084 \;\rightarrow\; \alpha = 3.50^\circ$$
and consequently, for the thrust:
T = 204210 N.
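The arithmetic can be reproduced with a short Python sketch. It takes the weight as W = mg with the mass in kilograms (72 000 kg, the value used in the working above) and g = 9.81 m s$^{-2}$; the variable names are illustrative.

```python
import math

mass = 72000.0          # kg
g = 9.81                # m/s^2
D = 130e3               # drag, N
L = 690e3               # lift, N
beta = math.radians(6)  # angle between the path and the horizon

W = mass * g                          # weight, N
T_cos = D + W * math.sin(beta)        # T cos(alpha), resolving along the path
T_sin = W * math.cos(beta) - L        # T sin(alpha), resolving perpendicular to it

alpha = math.degrees(math.atan2(T_sin, T_cos))
T = math.hypot(T_cos, T_sin)          # T^2 = (T cos(alpha))^2 + (T sin(alpha))^2

print(f"alpha = {alpha:.2f} degrees, T = {T:.0f} N")
# alpha = 3.50 degrees, T = 204210 N
```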
4. Subtraction of vectors
Subtraction of one vector from another is performed by adding the corresponding negative vector.
That is, if we seek a − b we form a + (−b). This is shown geometrically in Figure 18. Note that in
the right-hand diagram the arrow on b has been reversed to give −b.
Figure 18: Subtraction of a vector is performed by adding a negative vector
Exercises
1. Vectors p and q represent two perpendicular sides of a square. Find vector expressions which
represent the diagonals of the square.
2. In the rectangle ABCD, side AB is represented by the vector p and side BC is represented
by the vector q. State the physical significance of the vectors p − q and p + q.
3. An object is positioned at the origin of a set of axes. Two forces act upon it. The first has
magnitude 9 N and acts in the direction of the positive y axis. The second has magnitude 4 N
and acts in the direction of the negative x axis. Calculate the magnitude and direction of the
resultant force.
4. An object moves in the xy plane with a velocity of 15 m s−1 in a direction 48◦ above the
positive x axis. Resolve this velocity into two components, one along the x axis and one along
the y axis.
Answers
1. p + q, q − p. Acceptable answers are also −(p + q), p − q.
2. p + q is the diagonal AC, p − q is the diagonal DB.
3. Magnitude $\sqrt{97}$, at an angle 66° above the negative x axis.
4. 10.04 m s−1 along the x axis, and 11.15 m s−1 along the y axis.
5. Multiplying a vector by a scalar

If k is any positive scalar and a is a vector then ka is a vector in the same direction as a but k times
as long. If k is negative, ka is a vector in the opposite direction to a and |k| times as long. See Figure
19. The vector ka is said to be a scalar multiple of a.
Figure 19: Multiplying a vector by a scalar (left: k positive; right: k negative)

The vector 3a is three times as long as a and has the same direction. The vector ½r is in the same
direction as r but is half as long. The vector −4b is in the opposite direction to b and four times as
long.
For any scalars k and l, and any vectors a and b, the following rules hold:
Key Point 2
k(a + b) = ka + kb
(k + l)a = ka + la
k(la) = (kl)a
Task
Using the rules given in Key Point 2, simplify the following:
(a) 3a + 7a (b) 2(7b) (c) 4q + 4r
Your solution
Answer
(a) Using the second rule, 3a + 7a can be simplified to (3 + 7)a = 10a.
(b) Using the third rule 2(7b) = (2 × 7)b = 14b.
(c) Using the first rule 4q + 4r = 4(q + r).
Unit vectors
A vector which has a magnitude of 1 is called a unit vector. If a has magnitude 3, then a unit
vector in the direction of a is ⅓a, as shown in Figure 20.
Figure 20: A unit vector has length one unit

A unit vector in the direction of a given vector is found by dividing the given vector by its magnitude.
A unit vector in the direction of a is given the ‘hat’ symbol â.
Key Point 3
A unit vector can be found by dividing a vector by its modulus:
\[ \hat{\mathbf{a}} = \frac{\mathbf{a}}{|\mathbf{a}|} \]
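Key Point 3 can be sketched in code. The example below assumes that a vector is stored as a tuple of Cartesian components (an idea developed in the next Section); the function names `modulus` and `unit_vector` are hypothetical and chosen only for illustration.

```python
import math

def modulus(v):
    """Length of a vector given as a sequence of Cartesian components."""
    return math.sqrt(sum(c * c for c in v))

def unit_vector(v):
    """Divide v by its modulus. The zero vector has no direction, so reject it."""
    m = modulus(v)
    if m == 0:
        raise ValueError("the zero vector has no unit vector")
    return tuple(c / m for c in v)

a = (3.0, 0.0)              # an illustrative vector of magnitude 3
a_hat = unit_vector(a)      # one third of a
print(a_hat, modulus(a_hat))
```

Whatever non-zero vector is supplied, the result should have modulus 1 to within rounding error.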
Exercises
1. Draw an arbitrary vector r. On your diagram draw 2r, 4r, −r, −3r and ½r.
2. In triangle OAB the point P divides AB internally in the ratio m : n. If $\overrightarrow{OA} = \mathbf{a}$ and $\overrightarrow{OB} = \mathbf{b}$
depict this on a diagram and then find an expression for $\overrightarrow{OP}$ in terms of a and b.
Answers
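1. [Diagram: the vectors r, 2r, 4r, −r, −3r and ½r]
2. $\overrightarrow{OP} = \mathbf{a} + \dfrac{m}{m+n}(\mathbf{b} - \mathbf{a}) = \dfrac{n\mathbf{a} + m\mathbf{b}}{m+n}$.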
Cartesian Components
of Vectors 9.2

Introduction
It is useful to be able to describe vectors with reference to specific coordinate systems, such as the
Cartesian coordinate system. So, in this Section, we show how this is possible by defining unit vectors
in the directions of the x and y axes. Any other vector in the xy plane can then be represented as a
combination of these basis vectors. The idea is then extended to three dimensional vectors. This
is useful because most engineering problems involve 3D situations.
Prerequisites
Before starting this Section you should . . .
• be able to distinguish between a vector and a scalar
• be able to represent a vector as a directed line segment
• understand the Cartesian coordinate system

Learning Outcomes
On completion you should be able to . . .
• explain the meaning of the unit vectors i, j and k
• express two dimensional and three dimensional vectors in Cartesian form
• find the modulus of a vector expressed in Cartesian form
• find a ‘position vector’
1. Two-dimensional coordinate frames
Figure 21 shows a two-dimensional coordinate frame. Any point P in the xy plane can be defined in
terms of its x and y coordinates.
Figure 21: a point P(x, y) in the xy plane
A unit vector pointing in the positive direction of the x-axis is denoted by i. (Note that it is common
practice to write this particular unit vector without the hat ˆ.) It follows that any vector in the
direction of the x-axis will be a multiple of i. Figure 22 shows vectors i, 2i, 5i and −3i. In general
a vector of length ℓ in the direction of the x-axis can be written ℓi.
Figure 22: All these vectors are multiples of i
Similarly, a unit vector pointing in the positive direction of the y-axis is denoted by j. So any vector in the direction
of the y-axis will be a multiple of j. Figure 23 shows j, 4j and −2j. In general a vector of length ℓ
in the direction of the y-axis can be written ℓj.
Figure 23: All these vectors are multiples of j
Key Point 4
i represents a unit vector in the direction of the positive x-axis

j represents a unit vector in the direction of the positive y-axis
Example 3
Draw the vectors 5i and 4j. Use your diagram and the triangle law of addition
to add these two vectors together. First draw the vectors 5i and 4j. Then, by
translating the vectors so that they lie head to tail, find the vector sum 5i + 4j.
Solution
Figure 24: the vectors 5i and 4j (left) and, placed head to tail, their sum 5i + 4j (right)
We now generalise the situation in Example 3. Consider Figure 25.

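It shows a vector $\mathbf{r} = \overrightarrow{AB}$. We can regard r as being the resultant of the two vectors $\overrightarrow{AC} = a\mathbf{i}$ and
$\overrightarrow{CB} = b\mathbf{j}$. From the triangle law of vector addition
\[ \overrightarrow{AB} = \overrightarrow{AC} + \overrightarrow{CB} = a\mathbf{i} + b\mathbf{j} \]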
We conclude that any vector in the xy plane can be expressed in the form r = ai + bj. The numbers
a and b are called the components of r in the x and y directions. Sometimes, for emphasis, we will
use $a_x$ and $a_y$ instead of a and b to denote the components in the x- and y-directions respectively.
In that case we would write $\mathbf{r} = a_x\mathbf{i} + a_y\mathbf{j}$.
Figure 25: $\overrightarrow{AB} = \overrightarrow{AC} + \overrightarrow{CB}$ by the triangle law
Column vector notation

An alternative, useful, and often briefer notation is to write the vector r = ai + bj in column vector
notation as
\[ \mathbf{r} = \begin{pmatrix} a \\ b \end{pmatrix} \]
Task
(a) Draw an xy plane and show the vectors p = 2i + 3j, and q = 5i + j.
(b) Express p and q using column vector notation.
(c) By translating one of the vectors apply the triangle law to show the sum p + q.
(d) Express the resultant p + q in terms of i and j.
(a) Draw the xy plane and the required vectors. (They can be drawn from any point in the plane):
Your solution
(b) The column vector form of p is $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$. Write down the column vector form of q:
Your solution
Answer
$\mathbf{p} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$, $\mathbf{q} = \begin{pmatrix} 5 \\ 1 \end{pmatrix}$
(c) Translate one of the vectors in part (a) so that they lie head to tail, completing the third side of
the triangle to give the resultant p + q:
Your solution
Answer
Note that the vectors have not been drawn to scale.
[Diagram: left, p = 2i + 3j and q = 5i + j; right, the two vectors placed head to tail with resultant 7i + 4j]
(d) By studying your diagram note that the resultant has two components 7i, horizontally, and 4j
vertically. Hence write down an expression for p + q:
Your solution
Answer
7i + 4j
It is very important to note from the last task that vectors in Cartesian form can be added by simply
adding their respective i and j components.
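Thus, if $\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j}$ and $\mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j}$ then
\[ \mathbf{a} + \mathbf{b} = (a_x + b_x)\mathbf{i} + (a_y + b_y)\mathbf{j} \]
A similar rule applies when subtracting:
\[ \mathbf{a} - \mathbf{b} = (a_x - b_x)\mathbf{i} + (a_y - b_y)\mathbf{j} \]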
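The component-wise rules above translate directly into code. This is a minimal sketch, assuming a vector is held as a tuple of its i and j components; the helper names `add` and `subtract` are illustrative and not part of the workbook.

```python
def add(a, b):
    """Add two vectors given as tuples of Cartesian components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return tuple(x + y for x, y in zip(a, b))

def subtract(a, b):
    """Subtract b from a, component by component."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return tuple(x - y for x, y in zip(a, b))

p = (2, 3)   # p = 2i + 3j
q = (5, 1)   # q = 5i + j
print(add(p, q))        # components of p + q
print(subtract(p, q))   # components of p - q
```

The same functions work unchanged for three-component vectors, since they simply pair up corresponding components.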
Task
If a = 9i + 7j and b = 8i + 3j find (a) a + b (b) a − b
Your solution
Answer
(a) Simply add the respective components: 17i + 10j,
(b) Simply subtract the respective components: i + 4j
Now consider the special case when r represents the vector from the origin O to the point P (a, b).
This vector is known as the position vector of P and is shown in Figure 26.
Key Point 5
Figure 26: the position vector r of the point P (a, b), drawn from the origin O
The position vector of P with coordinates (a, b) is $\mathbf{r} = \overrightarrow{OP} = a\mathbf{i} + b\mathbf{j}$
Unlike most vectors, position vectors cannot be freely translated. Because they indicate the position
of a point they are fixed vectors in the sense that the tail of a position vector is always located at
the origin.
Example 4
State the position vectors of the points with coordinates
(a) P (2, 4), (b) Q(−1, 5), (c) R(−1, −7), (d) S(8, −4).
Solution
(a) 2i + 4j. (b) −i + 5j. (c) −i − 7j. (d) 8i − 4j.
Example 5
Sketch the position vectors r1 = 3i + 4j, r2 = −2i + 5j, r3 = −3i − 2j.
Solution
The vectors are shown below. Note that all position vectors start at the origin.
Figure 27: position vectors drawn from the origin to the points (3, 4), (−2, 5) and (−3, −2)
The modulus of any vector r is equal to its length. As we have noted earlier, the modulus of r is
usually denoted by |r|. When r = ai + bj the modulus can be obtained using Pythagoras’ theorem.
If r is the position vector of point P then the modulus is, clearly, the distance of P from the origin.
Key Point 6
If $\mathbf{r} = a\mathbf{i} + b\mathbf{j}$ then $|\mathbf{r}| = \sqrt{a^2 + b^2}$
Example 6
Find the modulus of each of the vectors shown in Example 5.
Solution
(a) $|\mathbf{r}_1| = |3\mathbf{i} + 4\mathbf{j}| = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$.
(b) $|\mathbf{r}_2| = \sqrt{(-2)^2 + 5^2} = \sqrt{4 + 25} = \sqrt{29}$.
(c) $|\mathbf{r}_3| = \sqrt{(-3)^2 + (-2)^2} = \sqrt{9 + 4} = \sqrt{13}$
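The moduli found in Example 6 can be reproduced with a few lines of code. The sketch below assumes each position vector is stored as a pair (a, b) and uses the standard library function `math.hypot`, which returns √(a² + b²); the dictionary names are illustrative.

```python
import math

# Position vectors from Example 5, stored as (a, b) component pairs.
position_vectors = {"r1": (3, 4), "r2": (-2, 5), "r3": (-3, -2)}

# |r| = sqrt(a^2 + b^2), evaluated with math.hypot.
moduli = {name: math.hypot(a, b) for name, (a, b) in position_vectors.items()}

for name, value in moduli.items():
    print(f"|{name}| = {value:.4f}")
```

The printed decimals can be compared with 5, √29 and √13 respectively.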
Task
Point A has coordinates (3, 5). Point B has coordinates (7, 8).
(a) Draw a diagram which shows points A and B and draw the vectors $\overrightarrow{OA}$ and $\overrightarrow{OB}$:
Your solution
Answer
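[Diagram: the points A(3, 5) and B(7, 8), with the vectors $\overrightarrow{OA}$ and $\overrightarrow{OB}$ drawn from the origin O]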
(b) State the position vectors of A and B:
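Your solution
$\overrightarrow{OA} =$
$\overrightarrow{OB} =$
Answer
$\overrightarrow{OA} = \mathbf{a} = 3\mathbf{i} + 5\mathbf{j}$, $\overrightarrow{OB} = \mathbf{b} = 7\mathbf{i} + 8\mathbf{j}$
(c) Referring to your figure and using the triangle law you can write $\overrightarrow{OA} + \overrightarrow{AB} = \overrightarrow{OB}$ so that
$\overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA}$. Hence write down an expression for $\overrightarrow{AB}$ in terms of the unit vectors i and j:
Your solution
Answer
$\overrightarrow{AB} = (7\mathbf{i} + 8\mathbf{j}) - (3\mathbf{i} + 5\mathbf{j}) = 4\mathbf{i} + 3\mathbf{j}$
(d) Calculate the length of $\overrightarrow{AB}$, that is $|4\mathbf{i} + 3\mathbf{j}|$:
Your solution
Answer
$|\overrightarrow{AB}| = \sqrt{4^2 + 3^2} = \sqrt{25} = 5$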
Exercises
1. Explain the distinction between a position vector, and a more general free vector.
2. What is meant by the symbols i and j ?
3. State the position vectors of the points with coordinates
(a) P (4, 7) , (b) Q(−3, 5), (c) R(0, 3), (d) S(−1, 0)
4. State the coordinates of the point P if its position vector is:
(a) 3i − 7j, (b) −4i, (c) −0.5i + 13j, (d) ai + bj
5. Find the modulus of each of the following vectors:

(a) r = 7i + 3j, (b) r = 17i, (c) r = 2i − 3j, (d) r = −3j,
(e) r = ai + bj, (f) r = ai − bj
6. Point P has coordinates (7, 8). Point Q has coordinates (−2, 4).
(a) Draw a sketch showing vectors $\overrightarrow{OP}$, $\overrightarrow{OQ}$,
(b) State the position vectors of P and Q,
(c) Find an expression for $\overrightarrow{PQ}$,
(d) Find $|\overrightarrow{PQ}|$.
Answers
1. Free vectors may be translated provided their direction and length remain unchanged. Position
vectors must always start at the origin.
2. i is a unit vector in the direction of the positive x-axis. j is a unit vector in the direction of
the positive y-axis.
3. (a) 4i + 7j, (b) −3i + 5j, (c) 3j, (d) −i.
4. (a) (3, −7), (b) (−4, 0), (c) (−0.5, 13), (d) (a, b)
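5. (a) $\sqrt{58}$, (b) 17, (c) $\sqrt{13}$, (d) 3, (e) $\sqrt{a^2 + b^2}$, (f) $\sqrt{a^2 + b^2}$.
6. (b) $\overrightarrow{OP} = 7\mathbf{i} + 8\mathbf{j}$ and $\overrightarrow{OQ} = -2\mathbf{i} + 4\mathbf{j}$, (c) $\overrightarrow{PQ} = -9\mathbf{i} - 4\mathbf{j}$, (d) $|\overrightarrow{PQ}| = \sqrt{97}$.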
2. Three-dimensional coordinate frames

The real world is three-dimensional and in order to solve many engineering problems it is necessary to
develop expertise in the mathematics of three-dimensional space. An important application of vectors
is their use to locate points in three dimensions. When two distinct points are known we can draw
a unique straight line between them. Three distinct points which do not lie on the same line form a
unique plane. Vectors can be used to describe points, lines, and planes in three dimensions. These
mathematical foundations underpin much of the technology associated with computer graphics and
the control of robots. In this Section we shall introduce the vector methods which underlie these
applications.
Figure 28: three-dimensional coordinate frames, panels (a), (b) and (c), showing the point P(a, b, c), its position vector r, the point Q and the unit vectors i, j and k
Figure 28(a) shows a three-dimensional coordinate frame. Note that the third dimension requires
the addition of a third axis, the z-axis. Although these three axes are drawn in the plane of the
paper you should remember that we are now thinking of three-dimensional situations. Just as in
two dimensions the x and y axes are perpendicular, in three dimensions the x, y and z axes are all
perpendicular to each other. We say they are mutually perpendicular. There is no reason why we
could not have chosen the z-axis in the opposite direction. However, it is conventional to choose the
directions shown in Figure 28(a).
Any point in the three dimensional space can be defined in terms of its x, y and z coordinates.
Consider the point P with coordinates (a, b, c) as shown in Figure 28(b). The vector from the origin
to the point P is known as the position vector of P, denoted $\overrightarrow{OP}$ or r. To arrive at P from O we can
think of moving a units in the x direction, b units in the y direction and c units in the z direction.
A unit vector pointing in the positive direction of the z-axis is denoted by k. See Figure 28(c).
Noting that $\overrightarrow{OQ} = a\mathbf{i} + b\mathbf{j}$ and that $\overrightarrow{QP} = c\mathbf{k}$ we can state
\[ \mathbf{r} = \overrightarrow{OP} = \overrightarrow{OQ} + \overrightarrow{QP} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k} \]
We conclude that the position vector of the point with coordinates (a, b, c) is r = ai + bj + ck.
(We might, for convenience, sometimes use a subscript notation. For example we might refer to the
position vector r as $\mathbf{r} = r_x\mathbf{i} + r_y\mathbf{j} + r_z\mathbf{k}$ in which $(r_x, r_y, r_z)$ have taken the place of (a, b, c).)
Key Point 7
If P has coordinates (a, b, c) then its position vector is
\[ \mathbf{r} = \overrightarrow{OP} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k} \]
Task
State the position vector of the point with coordinates (9, −8, 6).
Your solution
Answer
9i − 8j + 6k
The modulus of the vector $\overrightarrow{OP}$ is equal to the distance OP, which can be obtained by Pythagoras’
theorem:
Key Point 8
If $\mathbf{r} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$ then
\[ |\mathbf{r}| = \sqrt{a^2 + b^2 + c^2} \]
Task
Find the modulus of the vector r = 4i + 2j + 3k.
Your solution
Answer
$|\mathbf{r}| = \sqrt{4^2 + 2^2 + 3^2} = \sqrt{16 + 4 + 9} = \sqrt{29}$
Example 7
Points A, B and C have coordinates (−1, 1, 4), (8, 0, 2) and (5, −2, 11) respectively.
(a) Find the position vectors of A, B and C.
(b) Find $\overrightarrow{AB}$ and $\overrightarrow{BC}$.
(c) Find $|\overrightarrow{AB}|$ and $|\overrightarrow{BC}|$.
Solution
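(a) Denoting the position vectors of A, B and C by a, b and c respectively, we find
a = −i + j + 4k, b = 8i + 2k, c = 5i − 2j + 11k
(b) $\overrightarrow{AB} = \mathbf{b} - \mathbf{a} = 9\mathbf{i} - \mathbf{j} - 2\mathbf{k}$. $\overrightarrow{BC} = \mathbf{c} - \mathbf{b} = -3\mathbf{i} - 2\mathbf{j} + 9\mathbf{k}$.
(c) $|\overrightarrow{AB}| = \sqrt{9^2 + (-1)^2 + (-2)^2} = \sqrt{86}$. $|\overrightarrow{BC}| = \sqrt{(-3)^2 + (-2)^2 + 9^2} = \sqrt{94}$.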
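The steps of Example 7 can be sketched in code: subtract position vectors to obtain the vector joining two points, then apply Key Point 8. The helper names `displacement` and `modulus` are assumptions made for this illustration.

```python
import math

def displacement(start, end):
    """Vector from start to end: position vector of end minus that of start."""
    return tuple(e - s for s, e in zip(start, end))

def modulus(v):
    """Length of a vector from its Cartesian components."""
    return math.sqrt(sum(c * c for c in v))

A = (-1, 1, 4)
B = (8, 0, 2)
C = (5, -2, 11)

AB = displacement(A, B)
BC = displacement(B, C)
print(AB, modulus(AB))   # components of AB and its length
print(BC, modulus(BC))   # components of BC and its length
```

Squaring the printed lengths should recover 86 and 94, the values under the square roots in the worked solution.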
Exercises
1. State the position vector of the point with coordinates (4, −4, 3).
2. Find the modulus of each of the following vectors.

(a) 7i + 2j + 3k, (b) 7i − 2j + 3k, (c) 2j + 8k, (d) −i − 2j + 3k, (e) ai + bj + ck
3. Points P, Q and R have coordinates (9, 1, 0), (8, −3, 5), and (5, 5, 7) respectively.
(a) Find the position vectors p, q, r of P, Q and R,
(b) Find $\overrightarrow{PQ}$ and $\overrightarrow{QR}$,
(c) Find $|\overrightarrow{PQ}|$ and $|\overrightarrow{QR}|$.
Answers
1. 4i − 4j + 3k
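2. (a) $\sqrt{62}$, (b) $\sqrt{62}$, (c) $\sqrt{68}$, (d) $\sqrt{14}$, (e) $\sqrt{a^2 + b^2 + c^2}$.
3. (a) p = 9i + j, q = 8i − 3j + 5k, r = 5i + 5j + 7k.
(b) $\overrightarrow{PQ} = -\mathbf{i} - 4\mathbf{j} + 5\mathbf{k}$, $\overrightarrow{QR} = -3\mathbf{i} + 8\mathbf{j} + 2\mathbf{k}$
(c) $|\overrightarrow{PQ}| = \sqrt{42}$, $|\overrightarrow{QR}| = \sqrt{77}$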

The Scalar Product 9.3
Introduction
There are two kinds of multiplication involving vectors. The first is known as the scalar product
or dot product. This is so-called because when the scalar product of two vectors is calculated the
result is a scalar. The second product is known as the vector product. When this is calculated the
result is a vector. The definitions of these products may seem rather strange at first, but they are
widely used in applications. In this Section we consider only the scalar product.
Prerequisites
Before starting this Section you should . . .
• know that a vector can be represented as a directed line segment
• know how to express a vector in Cartesian form
• know how to find the modulus of a vector

Learning Outcomes
• calculate, from its definition, the scalar product of two given vectors
• calculate the scalar product of two vectors given in Cartesian form
• use the scalar product to find the angle between two vectors
• use the scalar product to test whether two vectors are perpendicular
1. Definition of the scalar product

Consider the two vectors a and b shown in Figure 29.
Figure 29: Two vectors subtend an angle θ

Note that the tails of the two vectors coincide and that the angle between the vectors is labelled θ.
Their scalar product, denoted by a · b, is defined as the product |a| |b| cos θ. It is very important to
use the dot in the formula. The dot is the specific symbol for the scalar product, and is the reason
why the scalar product is also known as the dot product. You should not use a × sign in this
context because this sign is reserved for the vector product which is quite different.
The angle θ is always chosen to lie between 0 and π, and the tails of the two vectors must coincide.
Figure 30 shows two incorrect ways of measuring θ.
Figure 30: θ should not be measured in these ways
Key Point 9
The scalar product of a and b is: a · b = |a| |b| cos θ
We can remember this formula as:
“The modulus of the first vector, multiplied by the modulus of the second vector,
multiplied by the cosine of the angle between them.”
Clearly b · a = |b| |a| cos θ and so

a · b = b · a.
Thus we can evaluate a scalar product in any order: the operation is commutative.
Example 8
Vectors a and b are shown in Figure 31. Vector a has modulus 6 and vector b
has modulus 7 and the angle between them is 60°. Calculate a · b.
Figure 31
Solution
The angle between the two vectors is 60◦ . Hence
a · b = |a| |b| cos θ = (6)(7) cos 60◦ = 21
The scalar product of a and b is 21. Note that a scalar product is always a scalar.
Example 9
Find i · i where i is the unit vector in the direction of the positive x axis.
Solution
Because i is a unit vector its modulus is 1. Also, the angle between i and itself is zero. Therefore
i · i = (1)(1) cos 0° = 1
So the scalar product of i with itself equals 1. It is easy to verify that j · j = 1 and k · k = 1.
Example 10
Find i · j where i and j are unit vectors in the directions of the x and y axes.
Solution
Because i and j are unit vectors they each have a modulus of 1. The angle between the two vectors
is 90◦ . Therefore
i · j = (1)(1) cos 90◦ = 0
That is i · j = 0.
The following results are easily verified:
Key Point 10
i · i = j · j = k · k = 1
i · j = j · i = 0
i · k = k · i = 0
j · k = k · j = 0
Generally, whenever any two vectors are perpendicular to each other their scalar product is zero
because the angle between the vectors is 90◦ and cos 90◦ = 0.
Key Point 11
The scalar product of perpendicular vectors is zero.
2. A formula for finding the scalar product

We can use the results summarized in Key Point 10 to obtain a formula for finding a scalar product
when the vectors are given in Cartesian form. We consider vectors in the xy plane. Suppose
a = a1 i + a2 j and b = b1 i + b2 j. Then
a · b = (a1 i + a2 j) · (b1 i + b2 j)
= a1 i · (b1 i + b2 j) + a2 j · (b1 i + b2 j)
= a1 b1 i · i + a1 b2 i · j + a2 b1 j · i + a2 b2 j · j
Using the results in Key Point 10 we can simplify this to give the following formula:
Key Point 12
If a = a1 i + a2 j and b = b1 i + b2 j then
a · b = a1 b1 + a2 b2
Thus to find the scalar product of two vectors their i components are multiplied together, their j
components are multiplied together and the results are added.
Example 11
If a = 7i + 8j and b = 5i − 2j, find the scalar product a · b.
Solution
We use Key Point 12:
a · b = (7i + 8j) · (5i − 2j) = (7)(5) + (8)(−2) = 35 − 16 = 19
The formula readily generalises to vectors in three dimensions as follows:
Key Point 13
If a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k then
a · b = a1 b1 + a2 b2 + a3 b3
Example 12
If a = 5i + 3j − 2k and b = 8i − 9j + 11k, find a · b.
Solution
We use the formula in Key Point 13:
a · b = (5)(8) + (3)(−9) + (−2)(11) = 40 − 27 − 22 = −9
Note again that the result is a scalar: there are no i’s, j’s, or k’s in the answer.
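Key Points 12 and 13 amount to multiplying corresponding components and adding the results. A minimal sketch follows, assuming vectors are stored as tuples of components; the function name `dot` is an illustrative choice.

```python
def dot(a, b):
    """Scalar product of two vectors given as tuples of Cartesian components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))

# Example 11: a = 7i + 8j, b = 5i - 2j
print(dot((7, 8), (5, -2)))

# Example 12: a = 5i + 3j - 2k, b = 8i - 9j + 11k
print(dot((5, 3, -2), (8, -9, 11)))
```

The two printed values should agree with the results 19 and −9 obtained in Examples 11 and 12.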
Task
If p = 4i − 3j + 7k and q = 6i − j + 2k, find p · q.
Use Key Point 13:
Your solution
Answer
41
Task
If r = 3i + 2j + 9k find r · r. Show that this is the same as |r|².
Your solution
Answer
r · r = (3i + 2j + 9k) · (3i + 2j + 9k) = 3i · 3i + 3i · 2j + · · · = 9 + 0 + · · · = 94.
$|\mathbf{r}| = \sqrt{9 + 4 + 81} = \sqrt{94}$, hence |r|² = r · r.
The above result is generally true:
Key Point 14
For any vector r, |r|² = r · r
3. Resolving one vector along another
The scalar product can be used to find the component of a vector in the direction of another vector.
Consider Figure 32 which shows two arbitrary vectors a and n. Let n̂ be a unit vector in the direction
of n.
Figure 32: the projection of a onto n
Study the figure carefully and note that a perpendicular has been drawn from P to meet n at Q.
The distance OQ is called the projection of a onto n. Simple trigonometry tells us that the length
of the projection is |a| cos θ. Now by taking the scalar product of a with the unit vector n̂ we find
a · n̂ = |a| |n̂| cos θ = |a| cos θ (since |n̂| = 1)
We conclude that
Key Point 15
Resolving One Vector Along Another
a · n̂ is the component of a in the direction of n
Example 13
Figure 33 shows a plane containing the point A which has position vector a. The
vector n̂ is a unit vector perpendicular to the plane (such a vector is called a
normal vector). Find an expression for the perpendicular distance, ℓ, of the plane
from the origin.
Figure 33
Solution
From the diagram we note that the perpendicular distance ℓ of the plane from the origin is the
projection of a onto n̂ and, using Key Point 15, is thus a · n̂.
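Key Point 15 can be illustrated as follows. The sketch assumes vectors are tuples of Cartesian components and that n need not already be a unit vector, so it is divided by its modulus first; the vectors used and the name `component_along` are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def component_along(a, n):
    """Component of a in the direction of n, that is a . n_hat."""
    n_length = math.sqrt(dot(n, n))
    if n_length == 0:
        raise ValueError("n must be a non-zero vector")
    return dot(a, n) / n_length

a = (3.0, 4.0, 0.0)   # illustrative vector
n = (2.0, 0.0, 0.0)   # illustrative direction along the x axis
print(component_along(a, n))
```

Here the direction is along the x axis, so the result is simply the i component of a. A negative result would indicate that the angle between a and n exceeds 90°.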
4. Using the scalar product to find the angle between vectors

We have two distinct ways of calculating the scalar product of two vectors. From Key Point 9
a · b = |a| |b| cos θ whilst from Key Point 13 a · b = a1 b1 + a2 b2 + a3 b3 . Both methods of calculating
the scalar product are entirely equivalent and will always give the same value for the scalar product.
We can exploit this correspondence to find the angle between two vectors. The following example
illustrates the procedure to be followed.
Example 14
Find the angle between the vectors a = 5i + 3j − 2k and b = 8i − 9j + 11k.
Solution
The scalar product
p of these two vectors √ has already been found inp Example 12 to be −9.√ The
modulus of a is 52 + 32 + (−2)2 = 38. The modulus of b is 82 + (−9)2 + 112 = 266.
Substituting these values for a · b, |a| and b into the formula for the scalar product we find
a · b = |a| |b| cos θ

√ √
−9 = 38 266 cos θ
from which
−9
cos θ = √ √ = −0.0895
38 266
so that θ = cos−1 (−0.0895) = 95.14◦
In general, the angle between two vectors can be found from the following formula:
Key Point 16
The angle θ between vectors a, b is such that:

\[ \cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} \]
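The formula in Key Point 16 can be evaluated numerically. The following is a sketch under the assumption that vectors are tuples of components; `angle_between` is an illustrative name, and the clamping step is an addition to guard against rounding pushing the cosine marginally outside the interval from −1 to 1.

```python
import math

def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def modulus(v):
    """Length of a vector; uses |v|^2 = v . v."""
    return math.sqrt(dot(v, v))

def angle_between(a, b):
    """Angle in degrees between two non-zero vectors, between 0 and 180."""
    cos_theta = dot(a, b) / (modulus(a) * modulus(b))
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding error
    return math.degrees(math.acos(cos_theta))

theta = angle_between((5, 3, -2), (8, -9, 11))   # vectors of Example 14
print(f"theta = {theta:.2f} degrees")
```

Because `math.acos` returns a value between 0 and π, the angle automatically lies in the range required by the definition of the scalar product.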
Exercises
1. If a = 2i − 5j and b = 3i + 2j find a · b and verify that a · b = b · a.
2. Find the angle between p = 3i − j and q = −4i + 6j.
3. Use the definition of the scalar product to show that if two vectors are perpendicular, their
scalar product is zero.
4. If a and b are perpendicular, simplify (a − 2b) · (3a + 5b).
5. If p = i + 8j + 7k and q = 3i − 2j + 5k, find p · q.
6. Show that the vectors ½i + j and 2i − j are perpendicular.
7. The work done by a force F in moving a body through a displacement r is given by F · r.

Find the work done by the force F = 3i + 7k if it causes a body to move from the point with
coordinates (1, 1, 2) to the point (7, 3, 5).
8. Find the angle between the vectors i − j − k and 2i + j + 2k.

Answers
1. −4.
2. 142.1°
3. This follows from the fact that cos θ = 0 since θ = 90°.
4. 3|a|² − 10|b|².
5. 22.
6. This follows from the scalar product being zero.
7. 39 units.
8. 101.1◦
5. Vectors and electrostatics

Electricity is important in several branches of engineering, not only in electrical or electronic
engineering. For example the design of the electrostatic precipitator plates used for cleaning in solid fuel
power stations involves both mechanical engineering (structures and mechanical rapping systems for
cleaning the plates) and electrostatics (to determine the electrical forces between solid particles and
plates).
The following example and tasks relate to the electrostatic forces between particles. Electric charge
is measured in coulombs (C). Charges can be either positive or negative.
The force between two charges
Let q1 and q2 be two charges in free space located at points P1 and P2. Then q1 will experience a
force due to the presence of q2 and directed from P2 towards P1.
This force is of magnitude $K\dfrac{q_1 q_2}{r^2}$ where r is the distance between P1 and P2 and K is a constant.
In vector notation this coulomb force (measured in newtons) can then be expressed as
\[ \mathbf{F} = K\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} \]
where r̂ is a unit vector directed from P2 towards P1.
The constant K is known to be $\dfrac{1}{4\pi\varepsilon_0}$ where $\varepsilon_0 = 8.854 \times 10^{-12}$ F m⁻¹ (farads per metre).
The electric field
A unit charge located at a general point G will then experience a force $\dfrac{Kq_1}{r_1^2}\,\hat{\mathbf{r}}_1$ (where $\hat{\mathbf{r}}_1$ is the unit
vector directed from P1 towards G) due to a charge q1 located at P1. This is the electric field E
newtons per coulomb (N C⁻¹ or alternatively V m⁻¹) at G due to the presence of q1.
For several point charges q1 at P1, q2 at P2 etc., the total electric field E at G is given by
\[ \mathbf{E} = \frac{Kq_1}{r_1^2}\,\hat{\mathbf{r}}_1 + \frac{Kq_2}{r_2^2}\,\hat{\mathbf{r}}_2 + \dots \]
where $\hat{\mathbf{r}}_i$ is the unit vector directed from point $P_i$ towards G.
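From the definition of a unit vector we see that
\[ \mathbf{E} = \frac{Kq_1}{r_1^2}\frac{\mathbf{r}_1}{|\mathbf{r}_1|} + \frac{Kq_2}{r_2^2}\frac{\mathbf{r}_2}{|\mathbf{r}_2|} + \dots = \frac{Kq_1}{|\mathbf{r}_1|^3}\,\mathbf{r}_1 + \frac{Kq_2}{|\mathbf{r}_2|^3}\,\mathbf{r}_2 + \dots = \frac{1}{4\pi\varepsilon_0}\left[\frac{q_1}{|\mathbf{r}_1|^3}\,\mathbf{r}_1 + \frac{q_2}{|\mathbf{r}_2|^3}\,\mathbf{r}_2 + \dots\right] \]
where $\mathbf{r}_i$ is the vector directed from point $P_i$ towards G, so that $\mathbf{r}_1 = \overrightarrow{OG} - \overrightarrow{OP_1}$ etc., where $\overrightarrow{OG}$
and $\overrightarrow{OP_1}$ are the position vectors of G and P1 (see Figure 34).
Figure 34: $\overrightarrow{OP_1} + \overrightarrow{P_1G} = \overrightarrow{OG}$, so $\overrightarrow{P_1G} = \overrightarrow{OG} - \overrightarrow{OP_1}$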
The work done
The work done W (energy expended) in moving a charge q through a distance dS, in a direction
given by the unit vector S/|S|, in an electric field E is defined by
W = −qE · dS (4)
where W is in joules.
Field due to point charges
In free space, point charge q1 = 10 nC (1 nC = 10⁻⁹ C, i.e. a nanocoulomb) is at P1 (0, −4, 0) and
charge q2 = 20 nC is at P2 (0, 0, 4).
[Note: Since the x-coordinate of both charges is zero, the problem is two-dimensional in the yz plane
as shown in Figure 35.]
Figure 35: the charges at P1 (0, −4, 0) and P2 (0, 0, 4) in the yz plane
(a) Find the field at the origin, $\mathbf{E}_{1,2}$, due to q1 and q2.
(b) Where should a third charge q3 = 30 nC be placed in the yz plane so that the total field
due to q1, q2, q3 is zero at the origin?
Solution
(a) Total field at the origin $\mathbf{E}_{1,2}$ = (field at origin due to charge at P1) + (field at origin due to
charge at P2). Therefore
\[ \mathbf{E}_{1,2} = \frac{10\times10^{-9}}{4\pi \times 8.854\times10^{-12} \times 4^2}\,\mathbf{j} + \frac{20\times10^{-9}}{4\pi \times 8.854\times10^{-12} \times 4^2}\,(-\mathbf{k}) = 5.617\mathbf{j} - 11.23\mathbf{k} \]
(The negative sign in front of the second term results from the fact that the direction from P2 to
O is in the −z direction.)
(b) Suppose the third charge q3 = 30 nC is placed at P3 (0, a, b). The field at the origin due to the
third charge is
\[ \mathbf{E}_3 = \frac{30\times10^{-9}}{4\pi\times 8.854\times10^{-12}\times(a^2+b^2)} \times \frac{-(a\mathbf{j}+b\mathbf{k})}{(a^2+b^2)^{1/2}}, \]
where $\dfrac{a\mathbf{j}+b\mathbf{k}}{(a^2+b^2)^{1/2}}$ is the unit vector in the direction from O to P3.
If the position of the third charge is such that the total field at the origin is zero, then $\mathbf{E}_3 = -\mathbf{E}_{1,2}$.
There are two unknowns (a and b). We can write down two equations by considering the j and k
directions.
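Solution (contd.)
\[ \mathbf{E}_3 = -269.6\left[\frac{a}{(a^2+b^2)^{3/2}}\,\mathbf{j} + \frac{b}{(a^2+b^2)^{3/2}}\,\mathbf{k}\right], \qquad \mathbf{E}_{1,2} = 5.617\mathbf{j} - 11.23\mathbf{k} \]
So
\[ 5.617 = 269.6 \times \frac{a}{(a^2+b^2)^{3/2}} \qquad (1) \]
\[ -11.23 = 269.6 \times \frac{b}{(a^2+b^2)^{3/2}} \qquad (2) \]
So
\[ \frac{a}{(a^2+b^2)^{3/2}} = 0.02083 \qquad (3) \]
\[ \frac{b}{(a^2+b^2)^{3/2}} = -0.04165 \qquad (4) \]
Squaring and adding (3) and (4) gives
\[ \frac{a^2+b^2}{(a^2+b^2)^3} = 0.002169, \quad\text{that is}\quad \frac{1}{(a^2+b^2)^2} = 0.002169 \]
So
\[ (a^2+b^2) = 21.47 \qquad (5) \]
Substituting back from (5) into (1) and (2) gives a = 2.07 and b = −4.14, to 3 s.f.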
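The solution can be checked numerically. The sketch below follows the same reasoning in a slightly different order: writing u = E₁,₂/(Kq₃), the condition E₃ = −E₁,₂ gives (a, b)/(a² + b²)^{3/2} = u, so that a² + b² = 1/|u| and (a, b) = u (a² + b²)^{3/2}. All variable names are illustrative assumptions.

```python
import math

EPS0 = 8.854e-12                    # permittivity of free space, F m^-1
K = 1.0 / (4.0 * math.pi * EPS0)
q1, q2, q3 = 10e-9, 20e-9, 30e-9    # charges in coulombs

# Field at the origin due to q1 at (0, -4, 0) and q2 at (0, 0, 4): (j, k) components.
E12_j = K * q1 / 4**2               # points from P1 towards O, the +y direction
E12_k = -K * q2 / 4**2              # points from P2 towards O, the -z direction

# Condition E3 = -E12 gives (a, b) / rho^3 = u, where rho^2 = a^2 + b^2.
u_j = E12_j / (K * q3)
u_k = E12_k / (K * q3)
rho = math.hypot(u_j, u_k) ** -0.5  # since |u| = 1 / rho^2
a = u_j * rho**3
b = u_k * rho**3

# Confirm that the total field at the origin vanishes for this position.
E3_j = -K * q3 * a / rho**3
E3_k = -K * q3 * b / rho**3
print(f"E12 = {E12_j:.3f} j + {E12_k:.2f} k")
print(f"a = {a:.2f}, b = {b:.2f}, a^2 + b^2 = {rho**2:.2f}")
print(f"residual field: {E12_j + E3_j:.2e} j, {E12_k + E3_k:.2e} k")
```

Small differences in the last digit from the worked solution arise because the solution rounds intermediate values to four significant figures.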
Task
Eight point charges of 1 nC each are located at the corners of a cube in free space
which is 1 m on each side (see Figure 36). Calculate |E| at
(a) the centre of the cube

(b) the centre of any face
(c) the centre of any edge.
Figure 36: cube with corners O(0, 0, 0), A(0, 0, 1), B(1, 0, 1), D(1, 0, 0), R(0, 1, 0), S(0, 1, 1), P (1, 1, 1) and T (1, 1, 0)
Your solution
Work the problem on a separate piece of paper but record here your main results and conclusions.
Answer
(a) The field at the centre of the cube is zero because of the symmetrical distribution of the charges.
(b) Because of the symmetrical nature of the problem it does not matter which face is chosen in
order to find the magnitude of the field at the centre of a face. Suppose the chosen face has corners
located at P (1, 1, 1), T (1, 1, 0), R(0, 1, 0) and S(0, 1, 1); then the centre (C) of this face can be
seen from the diagram to be located at $C\left(\tfrac{1}{2}, 1, \tfrac{1}{2}\right)$.
The electric field at C due to the charges at the corners P, T, R and S will then be zero since the field
vectors due to equal charges located at opposite corners of the square PTRS cancel one another
out. The field at C is then due to the equal charges located at the remaining four corners (OABD)
of the cube, and we note from the symmetry of the cube that the distance of each of these corners
from C will be the same. In particular the distance $OC = \sqrt{\left(\tfrac{1}{2}\right)^2 + 1^2 + \left(\tfrac{1}{2}\right)^2} = \sqrt{1.5}$ m. The
electric field E at C due to the remaining charges can then be found using
\[ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{4}\frac{q_i\,\mathbf{r}_i}{|\mathbf{r}_i|^3} \]
where q1 to q4 are the equal charges (10⁻⁹ coulombs) and r1 to r4 are the vectors directed from
the four corners, where the charges are located, towards C. In this case since $q_i = 10^{-9}$ coulombs
and $|\mathbf{r}_i| = \sqrt{1.5}$ for i = 1 to i = 4 we have
\[ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\,\frac{10^{-9}}{(1.5)^{3/2}}\left[\mathbf{r}_1 + \mathbf{r}_2 + \mathbf{r}_3 + \mathbf{r}_4\right], \]
where
\[ \mathbf{r}_1 = \overrightarrow{AC} = \begin{pmatrix} \tfrac{1}{2} \\ 1 \\ \tfrac{1}{2} \end{pmatrix} - \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \qquad \mathbf{r}_2 = \overrightarrow{BC} = \begin{pmatrix} \tfrac{1}{2} \\ 1 \\ \tfrac{1}{2} \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \quad\text{etc.} \]
Thus
\[ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\,\frac{10^{-9}}{(1.5)^{3/2}}\begin{pmatrix} 0 \\ 4 \\ 0 \end{pmatrix} \]
and
\[ |\mathbf{E}| = \frac{1}{\pi\varepsilon_0}\,\frac{10^{-9}}{(1.5)^{3/2}} = \frac{10^{-9}}{\pi \times 8.854\times10^{-12}\,(1.5)^{3/2}} = 19.57\ \text{V m}^{-1} \]
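Answer
(c) Suppose the chosen edge connects A(0, 0, 1) to B(1, 0, 1); then the centre point (G)
will be located at $G\left(\tfrac{1}{2}, 0, 1\right)$.
By symmetry the field at G due to the charges at A and B will be zero.
We note that the distances DG, OG, PG and SG are all equal. In the case of OG we calculate
by Pythagoras that this distance is $\sqrt{\left(\tfrac{1}{2}\right)^2 + 0^2 + 1^2} = \sqrt{1.25}$.
Similarly the distances TG and RG are equal to $\sqrt{2.25}$.
Using the result that $\mathbf{E} = \dfrac{1}{4\pi\varepsilon_0}\sum \dfrac{q_i\,\mathbf{r}_i}{|\mathbf{r}_i|^3}$ gives
\[ \mathbf{E} = \frac{10^{-9}}{4\pi\varepsilon_0}\left[\frac{1}{(1.25)^{3/2}}\left\{\begin{pmatrix} -\tfrac{1}{2} \\ 0 \\ 1 \end{pmatrix} + \begin{pmatrix} \tfrac{1}{2} \\ 0 \\ 1 \end{pmatrix} + \begin{pmatrix} -\tfrac{1}{2} \\ -1 \\ 0 \end{pmatrix} + \begin{pmatrix} \tfrac{1}{2} \\ -1 \\ 0 \end{pmatrix}\right\} + \frac{1}{(2.25)^{3/2}}\left\{\begin{pmatrix} -\tfrac{1}{2} \\ -1 \\ 1 \end{pmatrix} + \begin{pmatrix} \tfrac{1}{2} \\ -1 \\ 1 \end{pmatrix}\right\}\right] \]
\[ = \frac{10^{-9}}{4\pi\varepsilon_0}\left[\frac{1}{(1.25)^{3/2}}\begin{pmatrix} 0 \\ -2 \\ 2 \end{pmatrix} + \frac{1}{(2.25)^{3/2}}\begin{pmatrix} 0 \\ -2 \\ 2 \end{pmatrix}\right] \]
\[ = \frac{10^{-9}}{4\pi\varepsilon_0}\begin{pmatrix} 0 \\ -2.02367 \\ 2.02367 \end{pmatrix} \]
Thus
\[ |\mathbf{E}| = \frac{10^{-9}}{4 \times \pi \times 8.854\times10^{-12}}\sqrt{0^2 + (-2.02367)^2 + (2.02367)^2} = 25.72\ \text{V m}^{-1}\ \text{(2 d.p.)}. \]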
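All three parts of this Task can be checked by summing the contributions of the eight charges directly, without appealing to symmetry. The following is a minimal sketch, assuming the cube has corners at every combination of coordinates 0 and 1 as in Figure 36; the function name `field_magnitude` is hypothetical.

```python
import itertools
import math

EPS0 = 8.854e-12                    # permittivity of free space, F m^-1
K = 1.0 / (4.0 * math.pi * EPS0)
CHARGE = 1e-9                       # each corner carries 1 nC

# The eight corners of the unit cube.
corners = list(itertools.product((0.0, 1.0), repeat=3))

def field_magnitude(point):
    """|E| at point, summing K q r / |r|^3 over all eight corner charges."""
    total = [0.0, 0.0, 0.0]
    for corner in corners:
        r = [p - c for p, c in zip(point, corner)]   # vector from charge to point
        dist = math.sqrt(sum(x * x for x in r))
        for i in range(3):
            total[i] += K * CHARGE * r[i] / dist**3
    return math.sqrt(sum(x * x for x in total))

centre = field_magnitude((0.5, 0.5, 0.5))   # part (a)
face = field_magnitude((0.5, 1.0, 0.5))     # part (b), the point C
edge = field_magnitude((0.5, 0.0, 1.0))     # part (c), the point G

print(f"centre of cube: {centre:.2f} V/m")
print(f"centre of face: {face:.2f} V/m")
print(f"centre of edge: {edge:.2f} V/m")
```

The brute-force sum includes the charges whose contributions cancel, so agreement with the values above also confirms the symmetry arguments used in the Answer.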
Task
If E = −50i − 50j + 30k V m⁻¹ where i, j and k are unit vectors in the x, y
and z directions respectively, find the differential amount of work done in moving
a 2 µC point charge a distance of 5 mm.
(a) From P (1, 2, 3) towards Q(2, 4, 1)

(b) From Q(2, 4, 1) towards P (1, 2, 3)
Your solution
Answer
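(a) The work done in moving a 2 µC charge through a distance of 5 mm towards Q is
\[ W = -q\mathbf{E}\cdot d\mathbf{s} = -(2\times10^{-6})(5\times10^{-3})\,\mathbf{E}\cdot\frac{\overrightarrow{PQ}}{|\overrightarrow{PQ}|} \]
\[ = -10^{-8}\,(-50\mathbf{i} - 50\mathbf{j} + 30\mathbf{k})\cdot\frac{(\mathbf{i} + 2\mathbf{j} - 2\mathbf{k})}{\sqrt{1^2 + 2^2 + (-2)^2}} \]
\[ = \frac{10^{-8}(50 + 100 + 60)}{3} = 7\times10^{-7}\ \text{J} \]
(b) A similar calculation yields that the work done in moving the same charge through the
same distance in the direction from Q to P is W = −7 × 10⁻⁷ J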
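Both parts of this Task can be reproduced with a short calculation. The sketch below assumes the field is uniform over the 5 mm displacement, as the Answer does, and uses W = −qE · ds with ds directed along the unit vector from the start point towards the end point; the function name `work_done` is illustrative.

```python
import math

E = (-50.0, -50.0, 30.0)     # electric field in V m^-1
q = 2e-6                     # charge in coulombs
step = 5e-3                  # distance moved in metres

P = (1.0, 2.0, 3.0)
Q = (2.0, 4.0, 1.0)

def work_done(start, end):
    """W = -q E . ds for a displacement of length `step` from start towards end."""
    direction = [e - s for s, e in zip(start, end)]
    length = math.sqrt(sum(c * c for c in direction))
    ds = [step * c / length for c in direction]
    return -q * sum(e_i * ds_i for e_i, ds_i in zip(E, ds))

print(f"P towards Q: {work_done(P, Q):.2e} J")
print(f"Q towards P: {work_done(Q, P):.2e} J")
```

Reversing the direction of travel reverses the sign of ds and therefore the sign of the work done, as stated in part (b).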

The Vector Product 9.4

Introduction
In this Section we describe how to find the vector product of two vectors. Like the scalar product,
its definition may seem strange when first met but the definition is chosen because of its many
applications. When vectors are multiplied using the vector product the result is always a vector.
Prerequisites
• know that a vector can be represented as a directed line segment
• know how to express a vector in Cartesian form
• know how to evaluate 3 × 3 determinants

Learning Outcomes
On completion you should be able to . . .
• use the right-handed screw rule
• calculate the vector product of two given vectors
• use determinants to calculate the vector product of two vectors given in Cartesian form
1. The right-handed screw rule
To understand how the vector product is formed it is helpful to consider first the right-handed screw
rule. Consider the two vectors a and b shown in Figure 37.
Figure 37
The two vectors lie in a plane; this plane is shaded in Figure 37. Figure 38 shows the same two
vectors and the plane in which they lie together with a unit vector, denoted ê, which is perpendicular
to this plane. Imagine turning a right-handed screw, aligned along ê, in the direction from a towards
b as shown. A right-handed screw is one which when turned clockwise enters the material into which
it is being screwed (most screws are of this kind). You will see from Figure 38 that the screw will
advance in the direction of ê.
Figure 38
On the other hand, if the right-handed screw is turned from b towards a the screw will retract in the
direction of fˆ as shown in Figure 39.
Figure 39
We are now in a position to describe the vector product.
2. Definition of the vector product

We define the vector product of a and b, written a × b, as
a × b = |a| |b| sin θ ê
This formula shows that a × b is a vector of magnitude |a| |b| sin θ in the direction
of the vector ê, where ê is a unit vector perpendicular to the plane containing a and b in the sense
defined by the right-handed screw rule. The quantity a × b is read as “a cross b” and is sometimes
referred to as the cross product. The angle θ is chosen to lie between 0 and π. See Figure 40.
Figure 40: a × b is perpendicular to the plane containing a and b, with length |a||b| sin θ

Formally we have
Key Point 17
The vector product of a and b is: a × b = |a| |b| sin θ ê

The modulus of the vector product is: |a × b| = |a| |b| sin θ
Note that |a| |b| sin θ gives the modulus of the vector product whereas ê gives its direction.
Now study Figure 41 which is used to illustrate the calculation of b × a. In particular note the
direction of b × a arising through the application of the right-handed screw rule.
We see that a × b is not equal to b × a because their directions are opposite. In fact a × b = −b × a.
Figure 41: b × a is in the opposite direction to a × b
Example 15
If a and b are parallel, show that a × b = 0.
Solution
If a and b are parallel then the angle θ between them is zero. Consequently sin θ = 0 from which it
follows that a × b = |a| |b| sin θ ê = 0. Note that the result, 0, is the zero vector.
Note in particular the important results which follow:
Key Point 18
i × i = 0,   j × j = 0,   k × k = 0
Example 16
Show that i × j = k and find expressions for j × k and k × i.
Solution
Note that i and j are perpendicular so that the angle between them is 90◦ . Also, the vector k is
perpendicular to both i and j. Using Key Point 17, the modulus of i × j is (1)(1) sin 90◦ = 1. So
i × j is a unit vector. The unit vector perpendicular to i and j in the sense defined by the right-handed
screw rule is k as shown in Figure 42(a). Therefore i × j = k as required.
Figure 42: (a) i × j = k, (b) j × k = i, (c) k × i = j
Similarly you should verify that j × k = i (Figure 42(b)) and k × i = j (Figure 42(c)).
Key Point 19
i × j = k,   j × k = i,   k × i = j
j × i = −k,   k × j = −i,   i × k = −j
To help remember these results you might like to think of the vectors i, j and k written in alphabetical
order like this:
i j k i j k
Moving left to right yields a positive result: e.g. k × i = j.
Moving right to left yields a negative result: e.g. j × i = −k.
3. A formula for finding the vector product

We can use the results in Key Point 19 to develop a formula for finding the vector product of two
vectors given in Cartesian form: Suppose a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k then
a × b = (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)
= a1 i × (b1 i + b2 j + b3 k)
+ a2 j × (b1 i + b2 j + b3 k)
+ a3 k × (b1 i + b2 j + b3 k)
= a1 b1 (i × i) + a1 b2 (i × j) + a1 b3 (i × k)
+ a2 b1 (j × i) + a2 b2 (j × j) + a2 b3 (j × k)
+ a3 b1 (k × i) + a3 b2 (k × j) + a3 b3 (k × k)
Using Key Point 19, this expression simplifies to

a × b = (a2 b3 − a3 b2 )i − (a1 b3 − a3 b1 )j + (a1 b2 − a2 b1 )k
This gives us Key Point 20:
Key Point 20
a × b = (a2 b3 − a3 b2 )i − (a1 b3 − a3 b1 )j + (a1 b2 − a2 b1 )k
Example 17
Evaluate the vector product a × b if a = 3i − 2j + 5k and b = 7i + 4j − 8k.
Solution
Identifying a1 = 3, a2 = −2, a3 = 5, b1 = 7, b2 = 4, b3 = −8 we find
a × b = ((−2)(−8) − (5)(4))i − ((3)(−8) − (5)(7))j + ((3)(4) − (−2)(7))k

= −4i + 59j + 26k
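The formula in Key Point 20 translates directly into code. The sketch below is one possible Python rendering; the function name `cross` and the use of plain tuples for vectors are choices made here for illustration only.

```python
def cross(a, b):
    """Vector product of two 3-component vectors, written term by term
    in the same form as Key Point 20."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,        # i component
            -(a1 * b3 - a3 * b1),     # j component (note the minus sign)
            a1 * b2 - a2 * b1)        # k component

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j) == k)                 # the unit-vector rule i x j = k
print(cross((3, -2, 5), (7, 4, -8)))    # the vectors of Example 17
```

Swapping the arguments reverses the sign of every component, in line with a × b = −b × a.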
Task
Use Key Point 20 to find the vector product of
p = 3i + 5j and q = 2i − j.
Note that in this case there are no k components so a3 and b3 are both zero:
Your solution
p×q =
Answer
−13k
4. Using determinants to evaluate a vector product

Evaluation of a vector product using the formula in Key Point 20 is very cumbersome. A more
convenient and easily remembered method is to use determinants. Recall from Workbook 7 that, for
a 3 × 3 determinant,

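\[ \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix} \]
The vector product of two vectors a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k can be found by
evaluating the determinant:
\[ a \times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \]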
in which i, j and k are (temporarily) treated as if they were scalars.
Key Point 21

\[ a \times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = i(a_2 b_3 - a_3 b_2) - j(a_1 b_3 - a_3 b_1) + k(a_1 b_2 - a_2 b_1) \]
Example 18
Find the vector product of a = 3i − 4j + 2k and b = 9i − 6j + 2k.
Solution

We have
\[ a \times b = \begin{vmatrix} i & j & k \\ 3 & -4 & 2 \\ 9 & -6 & 2 \end{vmatrix} \]
which, when evaluated, gives
\[ a \times b = i(-8 - (-12)) - j(6 - 18) + k(-18 - (-36)) = 4i + 12j + 18k \]
Example 19
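The area $A_T$ of the triangle shown in Figure 43 is given by the formula
$A_T = \frac{1}{2} bc \sin\alpha$. Show that an equivalent formula is $A_T = \frac{1}{2} |\overrightarrow{AB} \times \overrightarrow{AC}|$.
Figure 43
Solution
We use the definition of the vector product: $|\overrightarrow{AB} \times \overrightarrow{AC}| = |\overrightarrow{AB}|\,|\overrightarrow{AC}| \sin\alpha$.
Since α is the angle between $\overrightarrow{AB}$ and $\overrightarrow{AC}$, and $|\overrightarrow{AB}| = c$ and $|\overrightarrow{AC}| = b$, the required result follows
immediately:
\[ \frac{1}{2} |\overrightarrow{AB} \times \overrightarrow{AC}| = \frac{1}{2}\, c \cdot b \cdot \sin\alpha. \]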
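The area formula of Example 19 can be tried on a triangle whose area is known in advance. The following Python sketch uses a right-angled triangle with legs 3 and 4 (area 6); the helper names `cross` and `triangle_area` and the sample vertices are assumptions made for this illustration.

```python
import math

def cross(a, b):
    """Vector product of two 3-component vectors (Key Point 20)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def triangle_area(A, B, C):
    """Area of triangle ABC as half the modulus of AB x AC."""
    AB = tuple(b - a for a, b in zip(A, B))
    AC = tuple(c - a for a, c in zip(A, C))
    n = cross(AB, AC)
    return 0.5 * math.sqrt(sum(x * x for x in n))

# Right-angled triangle in the xy plane with legs of length 3 and 4.
area = triangle_area((0, 0, 0), (3, 0, 0), (0, 4, 0))
print(area)
```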
Moments
The moment (or torque) of the force F about a point O is defined as
Mo = r × F
where r is a position vector from O to any point on the line of action of F as shown in Figure 44.
Figure 44
It may seem strange that any point on the line of action may be taken but it is easy to show that
exactly the same vector M o is always obtained.
By the properties of the cross product the direction of Mo is perpendicular to the plane containing
r and F (i.e. out of the paper). The magnitude of the moment is
|Mo| = |r||F| sin θ.
From Figure 44, |r| sin θ = D. Hence |Mo| = D|F|. This would be the same no matter which point
on the line of action of F was chosen.
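The statement that any point on the line of action gives the same moment can be checked in one line. As a sketch, suppose r is one position vector from O to the line of action; any other such vector can be written as r + λF for some scalar λ (the symbol λ is introduced here for illustration and is not used elsewhere in this Section). Then, using the fact that the vector product of parallel vectors is the zero vector,

```latex
\[
(r + \lambda F) \times F = r \times F + \lambda\,(F \times F) = r \times F + 0 = M_o
\]
```

so the extra term contributed by moving along the line of action vanishes.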
Example 20
Find the moment of the force given by F = 3i + 4j + 5k (N) acting at the point
(14, −3, 6) about the point P (2, −2, 1).
Figure 45
Solution
The vector r can be any vector from the point P to any point on the line of action of F . Choosing
r to be the vector connecting P to (14, −3, 6) (and measuring distances in metres) we have:
r = (14 − 2)i + (−3 − (−2))j + (6 − 1)k = 12i − j + 5k.

The moment is
\[ M = r \times F = \begin{vmatrix} i & j & k \\ 12 & -1 & 5 \\ 3 & 4 & 5 \end{vmatrix} = -25i - 45j + 51k \quad \text{(N m)} \]
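The moment in Example 20 can be reproduced numerically. The Python sketch below also repeats the calculation with a second point on the line of action, to illustrate that the choice of point does not matter; the helper names `cross` and `moment` and the second point are illustrative assumptions.

```python
def cross(a, b):
    """Vector product of two 3-component vectors (Key Point 20)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def moment(force, acting_at, about):
    """Moment r x F, where r runs from `about` to the point `acting_at`."""
    r = tuple(p - o for p, o in zip(acting_at, about))
    return cross(r, force)

F = (3, 4, 5)                       # force in N
P = (2, -2, 1)                      # point about which the moment is taken
M = moment(F, (14, -3, 6), P)

# A second point on the same line of action: (14, -3, 6) + 2F.
other_point = tuple(p + 2 * f for p, f in zip((14, -3, 6), F))
M_other = moment(F, other_point, P)
print(M, M_other)
```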
Exercises
1. Show that if a and b are parallel vectors then their vector product is the zero vector.
2. Find the vector product of p = −2i − 3j and q = 4i + 7j.
3. If a = i + 2j + 3k and b = 4i + 3j + 2k find a × b. Show that a × b ≠ b × a.
4. Points A, B and C have coordinates (9, 1, −2), (3, 1, 3), and (1, 0, −1) respectively. Find the
vector product $\overrightarrow{AB} \times \overrightarrow{AC}$.
5. Find a vector which is perpendicular to both of the vectors a = i + 2j + 7k and b = i + j − 2k.

Hence find a unit vector which is perpendicular to both a and b.
6. Find a vector which is perpendicular to the plane containing 6i + k and 2i + j.
7. For the vectors a = 4i + 2j + k, b = i − 2j + k, and c = 3i − 3j + 4k, evaluate both a × (b × c)

and (a × b) × c. Deduce that, in general, the vector product is not associative.
8. Find the area of the triangle with vertices at the points with coordinates (1, 2, 3), (4, −3, 2)
and (8, 1, 1).
9. For the vectors r = i + 2j + 3k, s = 2i − 2j − 5k, and t = i − 3j − k, evaluate

(a) (r · t)s − (s · t)r. (b) (r × s) × t. Deduce that (r · t)s − (s · t)r = (r × s) × t.
Answers
1. This uses the fact that sin 0 = 0.
2. −2k
3. −5i + 10j − 5k
4. 5i − 34j + 6k
5. −11i + 9j − k, $\frac{1}{\sqrt{203}}(-11i + 9j - k)$
6. −i + 2j + 6k for example.
7. 7i − 17j + 6k, −42i − 46j − 3k. These are different so the vector product is not associative.
8. $\frac{1}{2}\sqrt{1106}$
9. Each gives −29i − 10j + k

Lines and Planes 9.5
Introduction
Vectors are very convenient tools for analysing lines and planes in three dimensions. In this Section you
will learn about direction ratios and direction cosines and then how to formulate the vector equation of
a line and the vector equation of a plane. Direction ratios provide a convenient way of specifying the
direction of a line in three dimensional space. Direction cosines are the cosines of the angles between
a line and the coordinate axes. We begin this Section by showing how these quantities are calculated.
Prerequisites
Before starting this Section you should . . .
• understand and be able to calculate the scalar product of two vectors
• understand and be able to calculate the vector product of two vectors

Learning Outcomes
On completion you should be able to . . .
• obtain the vector equation of a line
• obtain the vector equation of a plane passing through a given point and which is perpendicular to a given vector
• obtain the vector equation of a plane which is a given distance from the origin and which is perpendicular to a given vector
1. The direction ratio and direction cosines

Consider the point P (4, 5) and its position vector 4i + 5j shown in Figure 46.
Figure 46
The direction ratio of the vector $\overrightarrow{OP}$ is defined to be 4:5. We can interpret this as stating that to
move in the direction of the line OP we must move 4 units in the x direction for every 5 units in the
y direction.
The direction cosines of the vector $\overrightarrow{OP}$ are the cosines of the angles between the vector and each
of the axes. Specifically, referring to Figure 46 these are
cos α and cos β
Noting that the length of $\overrightarrow{OP}$ is $\sqrt{4^2 + 5^2} = \sqrt{41}$, we can write
\[ \cos\alpha = \frac{4}{\sqrt{41}}, \qquad \cos\beta = \frac{5}{\sqrt{41}}. \]
It is conventional to label the direction cosines as $\ell$ and $m$ so that
\[ \ell = \frac{4}{\sqrt{41}}, \qquad m = \frac{5}{\sqrt{41}}. \]
More generally we have the following result:
Key Point 22
For any vector r = ai + bj, its direction ratio is a : b.
Its direction cosines are
\[ \ell = \frac{a}{\sqrt{a^2 + b^2}}, \qquad m = \frac{b}{\sqrt{a^2 + b^2}} \]
Example 21
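Point A has coordinates (3, 5), and point B has coordinates (7, 8).
(a) Write down the vector $\overrightarrow{AB}$.
(b) Find the direction ratio of the vector $\overrightarrow{AB}$.
(c) Find its direction cosines, $\ell$ and $m$.
(d) Show that $\ell^2 + m^2 = 1$.
Solution
(a) $\overrightarrow{AB} = b - a = 4i + 3j$.
(b) The direction ratio of $\overrightarrow{AB}$ is therefore 4:3.
(c) The direction cosines are
\[ \ell = \frac{4}{\sqrt{4^2 + 3^2}} = \frac{4}{5}, \qquad m = \frac{3}{\sqrt{4^2 + 3^2}} = \frac{3}{5} \]
(d)
\[ \ell^2 + m^2 = \left(\frac{4}{5}\right)^2 + \left(\frac{3}{5}\right)^2 = \frac{16}{25} + \frac{9}{25} = \frac{25}{25} = 1 \]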
The final result in the previous Example is true in general:
Key Point 23
If $\ell$ and $m$ are the direction cosines of a vector lying in the xy plane, then $\ell^2 + m^2 = 1$
Exercise
P and Q have coordinates (−2, 4) and (7, 8) respectively.
(a) Find the direction ratio of the vector $\overrightarrow{PQ}$. (b) Find the direction cosines of $\overrightarrow{PQ}$.
Answer
(a) 9 : 4, (b) $\frac{9}{\sqrt{97}}, \ \frac{4}{\sqrt{97}}$.
2. Direction ratios and cosines in three dimensions

The concepts of direction ratio and direction cosines extend naturally to three dimensions. Consider
Figure 47.
Figure 47
Given a vector r = ai + bj + ck its direction ratios are a : b : c. This means that to move in the
direction of the vector we must move a units in the x direction and b units in the y direction
for every c units in the z direction.
The direction cosines are the cosines of the angles between the vector and each of the axes. It is
conventional to label direction cosines as $\ell$, $m$ and $n$ and they are given by
\[ \ell = \cos\alpha = \frac{a}{\sqrt{a^2 + b^2 + c^2}}, \qquad m = \cos\beta = \frac{b}{\sqrt{a^2 + b^2 + c^2}}, \qquad n = \cos\gamma = \frac{c}{\sqrt{a^2 + b^2 + c^2}} \]
We have the following general result:
Key Point 24
For any vector r = ai + bj + ck its direction ratios are a : b : c.
Its direction cosines are
\[ \ell = \frac{a}{\sqrt{a^2 + b^2 + c^2}}, \qquad m = \frac{b}{\sqrt{a^2 + b^2 + c^2}}, \qquad n = \frac{c}{\sqrt{a^2 + b^2 + c^2}} \]
where $\ell^2 + m^2 + n^2 = 1$
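Key Point 24 can be explored numerically. The Python sketch below computes the direction cosines of a sample vector and the corresponding angles with the axes; the function name `direction_cosines` and the sample vector (1, 2, 2) are illustrative choices, not taken from the Workbook.

```python
import math

def direction_cosines(v):
    """Divide each component by the length of the vector (Key Point 24)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

l, m, n = direction_cosines((1, 2, 2))     # this vector has length 3
angles = [math.degrees(math.acos(c)) for c in (l, m, n)]
print(l, m, n)
print(angles)                              # angles with the x, y and z axes
print(l * l + m * m + n * n)               # should be 1, up to rounding
```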
Exercises
1. Points A and B have position vectors a = −3i + 2j + 7k, and b = 3i + 4j − 5k respectively.
Find
(a) $\overrightarrow{AB}$
(b) $|\overrightarrow{AB}|$
(c) The direction ratios of $\overrightarrow{AB}$
(d) The direction cosines $(\ell, m, n)$ of $\overrightarrow{AB}$.
(e) Show that $\ell^2 + m^2 + n^2 = 1$.
2. Find the direction ratios, the direction cosines and the angles that the vector $\overrightarrow{OP}$ makes with
each of the axes when P is the point with coordinates (2, 4, 3).
3. A line is inclined at 60° to the x axis and 45° to the y axis. Find its inclination to the z axis.
Answers
1. (a) 6i + 2j − 12k, (b) $\sqrt{184}$, (c) 6 : 2 : −12, (d) $\frac{6}{\sqrt{184}}, \ \frac{2}{\sqrt{184}}, \ \frac{-12}{\sqrt{184}}$
2. 2:4:3; $\frac{2}{\sqrt{29}}, \ \frac{4}{\sqrt{29}}, \ \frac{3}{\sqrt{29}}$; 68.2°, 42.0°, 56.1°.
3. 60° or 120°.
3. The vector equation of a line

Consider the straight line AP B shown in Figure 48. This is a line in three-dimensional space.
Figure 48
Points A and B are fixed and known points on the line, and have position vectors a and b respectively.
Point P is any other arbitrary point on the line, and has position vector r. Note that because $\overrightarrow{AB}$
and $\overrightarrow{AP}$ are parallel, $\overrightarrow{AP}$ is simply a scalar multiple of $\overrightarrow{AB}$, that is, $\overrightarrow{AP} = t\,\overrightarrow{AB}$ where t is a number.
Task
Referring to Figure 48, write down an expression for the vector $\overrightarrow{AB}$ in terms of a
and b.
Your solution
Answer
$\overrightarrow{AB} = b - a$
Task
Referring to Figure 48, use the triangle law for vector addition to find an expression
for r in terms of a, b and t, where $\overrightarrow{AP} = t\,\overrightarrow{AB}$.
Your solution
Answer
\[ \overrightarrow{OP} = \overrightarrow{OA} + \overrightarrow{AP} \]
so that
\[ r = a + t(b - a) \quad \text{since } \overrightarrow{AP} = t\,\overrightarrow{AB} \]
The answer to the above Task, r = a + t(b − a), is the vector equation of the line through A and
B. It is a rule which gives the position vector r of a general point on the line in terms of the given
vectors a, b. By varying the value of t we can move to any point on the line. For example, referring
to Figure 48,
when t = 0, the equation gives r = a, which locates point A,
when t = 1, the equation gives r = b, which locates point B.

If 0 < t < 1 the point P lies on the line between A and B. If t > 1 the point P lies on the line
beyond B (to the right in the figure). If t < 0 the point P lies on the line beyond A (to the left in
the figure).
Key Point 25
The vector equation of the line through points A and B with position vectors a and b is
r = a + t(b − a)
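The role of the parameter t in Key Point 25 can be seen by evaluating the equation for a few values. The following Python sketch does this; the function name `point_on_line` and the two sample points are assumptions made here for illustration.

```python
def point_on_line(a, b, t):
    """Position vector r = a + t(b - a) of a point on the line through a and b."""
    return tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))

a, b = (1, 0, 2), (3, 4, 6)            # sample position vectors of A and B
print(point_on_line(a, b, 0))          # t = 0 locates A
print(point_on_line(a, b, 1))          # t = 1 locates B
print(point_on_line(a, b, 0.5))        # 0 < t < 1: between A and B (midpoint)
print(point_on_line(a, b, 2))          # t > 1: beyond B
print(point_on_line(a, b, -1))         # t < 0: beyond A
```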
Task
Write down the vector equation of the line which passes through the points with
position vectors a = 3i + 2j and b = 7i + 5j. Also express the equation in column
vector form.
Your solution
Answer
b − a = (7i + 5j) − (3i + 2j) = 4i + 3j

The equation of the line is then
r = a + t(b − a)
= (3i + 2j) + t(4i + 3j)
Using column vector notation we could write
\[ r = \begin{pmatrix} 3 \\ 2 \end{pmatrix} + t \begin{pmatrix} 4 \\ 3 \end{pmatrix} \]
Task
Using column vector notation, write down the vector equation of the line which
passes through the points with position vectors a = 5i−2j+3k and b = 2i+j−4k.
Your solution
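Answer
Using column vector notation note that
\[ b - a = \begin{pmatrix} 2 \\ 1 \\ -4 \end{pmatrix} - \begin{pmatrix} 5 \\ -2 \\ 3 \end{pmatrix} = \begin{pmatrix} -3 \\ 3 \\ -7 \end{pmatrix} \]
The equation of the line is then
\[ r = a + t(b - a) = \begin{pmatrix} 5 \\ -2 \\ 3 \end{pmatrix} + t \begin{pmatrix} -3 \\ 3 \\ -7 \end{pmatrix} \]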
Cartesian form
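On occasions it is useful to convert the vector form of the equation of a straight line into Cartesian
form. Suppose we write
\[ a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}, \qquad r = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]
then r = a + t(b − a) implies
\[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} + t \begin{pmatrix} b_1 - a_1 \\ b_2 - a_2 \\ b_3 - a_3 \end{pmatrix} = \begin{pmatrix} a_1 + t(b_1 - a_1) \\ a_2 + t(b_2 - a_2) \\ a_3 + t(b_3 - a_3) \end{pmatrix} \]
Equating the individual components we find
\[ x = a_1 + t(b_1 - a_1), \quad \text{or equivalently} \quad t = \frac{x - a_1}{b_1 - a_1} \]
\[ y = a_2 + t(b_2 - a_2), \quad \text{or equivalently} \quad t = \frac{y - a_2}{b_2 - a_2} \]
\[ z = a_3 + t(b_3 - a_3), \quad \text{or equivalently} \quad t = \frac{z - a_3}{b_3 - a_3} \]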
Each expression on the right is equal to t and so we can write
\[ \frac{x - a_1}{b_1 - a_1} = \frac{y - a_2}{b_2 - a_2} = \frac{z - a_3}{b_3 - a_3} \]
This gives the Cartesian form of the equations of the straight line which passes through the points
with coordinates (a1 , a2 , a3 ) and (b1 , b2 , b3 ).
Key Point 26
The Cartesian form of the equation of the straight line which passes through the points with
coordinates $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$ is
\[ \frac{x - a_1}{b_1 - a_1} = \frac{y - a_2}{b_2 - a_2} = \frac{z - a_3}{b_3 - a_3} \]
Example 22
(a) Write down the Cartesian form of the equation of the straight line
which passes through the two points (9, 3, −2) and (4, 5, −1).
(b) State the equivalent vector equation.
Solution
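(a)
\[ \frac{x - 9}{4 - 9} = \frac{y - 3}{5 - 3} = \frac{z - (-2)}{-1 - (-2)} \]
that is
\[ \frac{x - 9}{-5} = \frac{y - 3}{2} = \frac{z + 2}{1} \qquad \text{(Cartesian form)} \]
(b) The vector equation is
\[
\begin{aligned}
r &= a + t(b - a) \\
&= \begin{pmatrix} 9 \\ 3 \\ -2 \end{pmatrix} + t \left( \begin{pmatrix} 4 \\ 5 \\ -1 \end{pmatrix} - \begin{pmatrix} 9 \\ 3 \\ -2 \end{pmatrix} \right) \\
&= \begin{pmatrix} 9 \\ 3 \\ -2 \end{pmatrix} + t \begin{pmatrix} -5 \\ 2 \\ 1 \end{pmatrix}
\end{aligned}
\]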
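The Cartesian form says that a point lies on the line exactly when the three ratios give the same value of t. The Python sketch below tests this for the line of Example 22. It assumes integer coordinates and two distinct points, and it treats a zero denominator as meaning that the corresponding coordinate must stay constant along the line; that convention, and the function name `line_parameter`, are assumptions added here rather than something stated in the text.

```python
from fractions import Fraction

def line_parameter(a, b, p):
    """Return t such that p = a + t(b - a), or None if p is not on the line.
    Assumes integer coordinates and a != b."""
    t = None
    for ai, bi, pi in zip(a, b, p):
        d = bi - ai
        if d == 0:
            if pi != ai:          # this coordinate cannot change along the line
                return None
            continue
        ti = Fraction(pi - ai, d)  # the ratio (p_i - a_i) / (b_i - a_i)
        if t is None:
            t = ti
        elif ti != t:             # ratios disagree: not on the line
            return None
    return t

a, b = (9, 3, -2), (4, 5, -1)               # the two points of Example 22
print(line_parameter(a, b, (-1, 7, 0)))     # all three ratios equal 2
print(line_parameter(a, b, (4, 5, 0)))      # ratios disagree, so None
```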
Exercises
1. (a) Write down the vector $\overrightarrow{AB}$ joining the points A and B with coordinates (3, 2, 7) and
(−1, 2, 3) respectively.
(b) Find the equation of the straight line through A and B.
2. Write down the vector equation of the line passing through the points with position vectors
p = 3i + 7j − 2k and q = −3i + 2j + 2k. Find also the Cartesian equation of this line.
3. Find the vector equation of the line passing through (9, 1, 2) and which is parallel to the vector
(1, 1, 1).
Answers
   
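1. (a) −4i − 4k. (b) $r = \begin{pmatrix} 3 \\ 2 \\ 7 \end{pmatrix} + t \begin{pmatrix} -4 \\ 0 \\ -4 \end{pmatrix}$.
2. $r = \begin{pmatrix} 3 \\ 7 \\ -2 \end{pmatrix} + t \begin{pmatrix} -6 \\ -5 \\ 4 \end{pmatrix}$. Cartesian form $\frac{x - 3}{-6} = \frac{y - 7}{-5} = \frac{z + 2}{4}$.
3. $r = \begin{pmatrix} 9 \\ 1 \\ 2 \end{pmatrix} + t \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.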
4. The vector equation of a plane

Consider the plane shown in Figure 49.
Figure 49
Suppose that A is a fixed point in the plane and has position vector a. Suppose that P is any other
arbitrary point in the plane with position vector r. Clearly the vector $\overrightarrow{AP}$ lies in the plane.
Task
Referring to Figure 49, find the vector $\overrightarrow{AP}$ in terms of a and r.
Your solution
Answer
r−a
Also shown in Figure 49 is a vector which is perpendicular to the plane and denoted by n.
Task
What relationship exists between n and the vector $\overrightarrow{AP}$?
Hint: think about the scalar product:
Your solution
Answer
Because $\overrightarrow{AP}$ and n are perpendicular their scalar product must equal zero, that is
(r − a).n = 0 so that r.n = a.n
The answer to the above Task, r.n = a.n, is the equation of a plane, written in vector form,
passing through A and perpendicular to n.
Key Point 27
A plane perpendicular to the vector n and passing through the point with position vector a, has
equation
r.n = a.n
In this formula it does not matter whether or not n is a unit vector.

If n̂ is a unit vector then a.n̂ represents the perpendicular distance from the origin to the plane which
we usually denote by d (for details of this see Section 9.3). Hence we can write
r.n̂ = d
This is the equation of a plane, written in vector form, with unit normal n̂ and which is a perpen-
dicular distance d from O.
Key Point 28
A plane with unit normal n̂, which is a perpendicular distance d from O is given by
r.n̂ = d
Example 23
(a) Find the vector equation of the plane which passes through the point with
position vector 3i + 2j + 5k and which is perpendicular to i + k.
(b) Find the Cartesian equation of this plane.
Solution
(a) Using the previous results we can write down the equation
r.(i + k) = (3i + 2j + 5k).(i + k) = 3 + 5 = 8
(b) Writing r as xi + yj + zk we have the Cartesian form:
(xi + yj + zk).(i + k) = 8
so that
x+z =8
Task
(a) Find the vector equation of the plane through (7, 3, −5) for which
n = (1, 1, 1) is a vector normal to the plane.
(b) What is the distance of the plane from O?
Your solution
Answer
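(a) Using the formula r · n = a · n the equation of the plane is
\[ r \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 7 \\ 3 \\ -5 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = 7 \times 1 + 3 \times 1 - 5 \times 1 = 5 \]
(b) The distance from the origin is
\[ a \cdot \hat{n} = \begin{pmatrix} 7 \\ 3 \\ -5 \end{pmatrix} \cdot \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{5}{\sqrt{3}} \]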
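Key Points 27 and 28 can be combined in a short numerical check. The Python sketch below computes the constant a · n on the right-hand side of r · n = a · n, and the distance of the plane from the origin by normalising n. Taking the absolute value so that the distance is never negative is a convention assumed here, and the helper names are illustrative.

```python
import math

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def plane_constant(a, n):
    """Right-hand side of the plane equation r . n = a . n."""
    return dot(a, n)

def distance_from_origin(a, n):
    """|a . n| / |n|, i.e. |a . n_hat| with n_hat the unit normal."""
    return abs(dot(a, n)) / math.sqrt(dot(n, n))

print(plane_constant((3, 2, 5), (1, 0, 1)))            # Example 23: x + z = 8
print(plane_constant((7, 3, -5), (1, 1, 1)))           # the Task above, part (a)
print(distance_from_origin((7, 3, -5), (1, 1, 1)))     # part (b): 5 / sqrt(3)
```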
Exercises
1. Find the equation of a plane which is normal to 8i + 9j + k and which is a distance 1 from the
origin. Give both vector and Cartesian forms.
2. Find the equation of a plane which passes through (8, 1, 0) and which is normal to the vector
i + 2j − 3k.
 
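3. What is the distance of the plane $r \cdot \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = 5$ from the origin?
Answers
1. $r \cdot \frac{1}{\sqrt{146}} \begin{pmatrix} 8 \\ 9 \\ 1 \end{pmatrix} = 1$; $8x + 9y + z = \sqrt{146}$.
2. $r \cdot \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 8 \\ 1 \\ 0 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}$, that is $r \cdot \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} = 10$.
3. $\frac{5}{\sqrt{14}}$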
Contents 10
Complex Numbers
10.1 Complex Arithmetic
10.2 Argand Diagrams and the Polar Form
10.3 The Exponential Form of a Complex Number
10.4 De Moivre’s Theorem
Learning outcomes
In this Workbook you will learn what a complex number is and how to combine complex
numbers together using the familiar operations of addition, subtraction, multiplication
and division. You will also learn how to describe a complex number graphically using
the Argand diagram. The connection between the exponential function and the
trigonometric functions is explained. You will understand how De Moivre's theorem
is used to obtain fractional powers of complex numbers.

Complex Arithmetic 10.1
Introduction
Complex numbers are used in many areas of engineering and science. In this Section we define
what a complex number is and explore how two such numbers may be combined together by adding,
subtracting, multiplying and dividing. We also show how to find ‘complex roots’ of polynomial
equations.
A complex number is a generalisation of an ordinary real number. In fact, as we shall see, a complex
number is a pair of real numbers ordered in a particular way. Fundamental to the study of complex
numbers is the symbol i with the strange looking property $i^2 = -1$. Apart from this property complex
numbers follow the usual rules of number algebra.
Prerequisites
Before starting this Section you should . . .
real numbers
• be able to combine algebraic fractions together
• understand what a polynomial is
• have a knowledge of trigonometric identities

Learning Outcomes
On completion you should be able to . . .
• combine complex numbers together
• find the modulus and conjugate of a complex number
• obtain complex solutions to polynomial equations
2 HELM (2006):
Workbook 10: Complex Numbers
®
1. What is a complex number?

We assume that you are familiar with the properties of ordinary numbers; examples are
\[ 1, \quad -2, \quad \frac{3}{10}, \quad 2.634, \quad -3.111, \quad \pi, \quad e, \quad \sqrt{2} \]
We all know how to add, subtract, multiply and divide such numbers. We are aware that the
numbers can be positive or negative or zero and also aware of their geometrical interpretation as
being represented by points on a ‘real’ axis known as a number line (Figure 1).
Figure 1
The real axis is a line with a direction (usually chosen to be from left to right) indicated by an arrow.
We shall refer to this as the x-axis. On this axis we select a point, arbitrarily, and refer to this as the
origin O. The origin (where zero is located) distinguishes positive numbers from negative numbers:
• to the right of the origin are the positive numbers
• to the left of the origin are the negative numbers
Thus we can ‘locate’ the numbers in our example. See Figure 2.
Figure 2
From now on we shall refer to these ‘ordinary’ numbers as real numbers. We can formalise the
algebra of real numbers into a set of rules which they obey.
So if $x_1$, $x_2$ and $x_3$ are any three real numbers then we know that, in particular:
1. $x_1 + x_2 = x_2 + x_1$ and $x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3$
2. $1 \times x_1 = x_1$ and $0 \times x_1 = 0$
3. $x_1 \times x_2 = x_2 \times x_1$ and $x_1 \times (x_2 + x_3) = x_1 \times x_2 + x_1 \times x_3$
Also, in multiplication we are familiar with the elementary rules:
(positive) × (positive) = positive (positive) × (negative) = negative
(negative) × (positive) = negative (negative) × (negative) = positive
It follows that if x represents any real number then
$x^2 \geq 0$
in words, the square of a real number is always non-negative.
In this Workbook we will consider a kind of number (a generalisation of a real number) whose square
is not necessarily positive (and not necessarily real either). Don’t worry that i ‘does not exist’.
Because of that it is called imaginary! We just define it and get on and use it and it then turns out
to be very useful and important in many practical applications. However, it is important to get to
know how to handle complex numbers before using them in calculations. This will not be difficult as
the new set of rules is, in fact, precisely the same set of rules obeyed by the ‘real’ numbers. These
new numbers are called complex numbers.
A complex number is an ordered pair of real numbers, usually denoted by z or w etc. So if a, b are
real numbers then we designate a complex number through:
z = a + ib
where i is a symbol obeying the rule
$i^2 = -1$
For simplicity we shall assume we can write
$i = \sqrt{-1}$.
(Often, particularly in engineering applications, the symbol j is used instead of i.) Also note that,
conventionally, examples of actual complex numbers such as 2 + 3i are written like this and not 2 + i3.
Again we ask the reader to accept matters at this stage without worrying about the meaning of
finding the square root of a negative number. Using this notation we can write
$\sqrt{-4} = \sqrt{(4)(-1)} = \sqrt{4}\,\sqrt{-1} = 2i$ etc.
Key Point 1
The symbol i is such that
$i^2 = -1$
Using the normal rules of algebra it follows that
$i^3 = i^2 \times i = -i$ and $i^4 = i^2 \times i^2 = (-1) \times (-1) = 1$
and so on.
Simple examples of complex numbers are

$z_1 = 3 + 2i$,   $z_2 = -3 + (2.461)i$,   $z_3 = 17i$,   $z_4 = 3 + 0i = 3$
Generally, if z = a + ib then ‘a’ is called the real part of z, or Re(z) for short, and ‘b’ is called the
imaginary part of z or Im(z). The fourth example indicates that the real numbers can be considered
a subset of the complex numbers.
Key Point 2
If z = a + ib then Re(z) = a and Im(z) = b
Both the real and imaginary parts of a complex number are real.
Key Point 3
Two complex numbers z = a + ib and w = c + id are said to be equal if and only if both their real
parts are the same and both their imaginary parts are the same, that is
a=c and b=d
Key Point 4
The modulus of a complex number z = a + ib is denoted by |z| and is defined by
$|z| = \sqrt{a^2 + b^2}$
so that the modulus is always a non-negative real number.
Example 1
If z = 3 − 2i then find Re(z), Im(z) and |z|.
Solution
Here Re(z) = 3, Im(z) = −2 and $|z| = \sqrt{3^2 + (-2)^2} = \sqrt{13}$.
Complex conjugate
If z = a + ib is any complex number then the complex conjugate of z is denoted by $z^*$ and is defined
by $z^* = a - ib$. (Sometimes the notation $\bar{z}$ is used instead of $z^*$ to denote the conjugate.) For
example if z = 2 − 3i then $z^* = 2 + 3i$. If z is entirely real then $z^* = z$ whereas if z is entirely
imaginary then $z^* = -z$. E.g. if z = 17i then $z^* = -17i$. In fact the following relationships are
easily obtained:
\[ \mathrm{Re}(z) = \frac{z + z^*}{2} \quad \text{and} \quad \mathrm{Im}(z) = \frac{i(z^* - z)}{2} \]
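These definitions can be tried out with Python's built-in complex type, which writes the imaginary unit as `j` (the engineering convention mentioned earlier) rather than i. The sketch below uses the number z = 3 − 2i of Example 1; the variable names are illustrative.

```python
z = 3 - 2j                 # the complex number 3 - 2i of Example 1
zc = z.conjugate()         # complex conjugate z* = 3 + 2i

print(z.real, z.imag)      # Re(z) and Im(z): both are real numbers
print(abs(z))              # modulus |z| = sqrt(13)

# The two relationships quoted above, evaluated directly.
re_from_formula = (z + zc) / 2
im_from_formula = 1j * (zc - z) / 2
print(re_from_formula, im_from_formula)
```

Both formulas return complex values whose imaginary part is zero, which is why they compare equal to the real numbers 3 and −2.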
Task
If z = −2 + i find expressions for $\mathrm{Re}(z^*)$ and $\mathrm{Im}(i(z^* - z))$.
Hint: first find $z^*$, $z^* - z$, and $i(z^* - z)$:
Your solution
Answer
$\mathrm{Re}(z^*) = -2$ and $\mathrm{Im}(i(z^* - z)) = 0$
2. The algebra of complex numbers

Complex numbers are added, subtracted, multiplied and divided in much the same way as these
operations are carried out for real numbers.
Addition and subtraction of complex numbers

Let z and w be any two complex numbers
z = a + ib w = c + id
then
z + w = (a + c) + i(b + d) z − w = (a − c) + i(b − d)
For example if z = 2 − 3i, w = −4 + 2i then
z + w = {2 + (−4)} + {(−3) + 2}i = −2 − i z − w = {2 − (−4)} + {(−3) − 2}i = 6 − 5i
Multiplying one complex number by another

In multiplication we proceed using an obvious approach: again consider any two complex numbers
z = a + ib and w = c + id. Then
\[ zw = (a + ib)(c + id) = ac + aid + ibc + i^2 bd \]
obtained in the usual way by multiplying all the terms in one bracket by all the terms in the other
bracket. Now we use the fundamental relation $i^2 = -1$ so that
\[ zw = ac + aid + ibc - bd = ac - bd + i(ad + bc) \]
where we have re-grouped terms with the ‘i’ symbol and terms without the ‘i’ symbol separately.
These are the real and imaginary parts of the product zw respectively. A numerical example will
confirm the approach. If z = 2 − 3i and w = −4 + 2i then
\[
\begin{aligned}
zw &= (2 - 3i)(-4 + 2i) \\
&= 2(-4) + 2(2i) - 3i(-4) - 3i(2i) \\
&= -8 + 4i + 12i - 6i^2 \\
&= -8 + 16i + 6 \\
&= -2 + 16i
\end{aligned}
\]
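The multiplication rule zw = (ac − bd) + i(ad + bc) can be coded directly and compared with Python's built-in complex arithmetic. In the sketch below a complex number a + ib is represented as the ordered pair (a, b); the function name `multiply` is an illustrative choice.

```python
def multiply(z, w):
    """Product of z = a + ib and w = c + id, each given as a pair (real, imaginary)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)   # (real part, imaginary part)

product = multiply((2, -3), (-4, 2))        # (2 - 3i)(-4 + 2i)
print(product)
print(complex(2, -3) * complex(-4, 2))      # the same product using the built-in type
```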
Task
If z = −2 + i and w = 3 + 2i find expressions for
(a) z + 2w, (b) |z − w| and (c) zw
Your solution
(a)
Answer
z + 2w = 4 + 5i
Your solution
(b) Hint: you should find that z − w = −5 − i
Answer
$|z - w| = \sqrt{(-5)^2 + (-1)^2} = \sqrt{26}$
Your solution
(c)
Answer
$zw = -6 + 3i - 4i + 2i^2 = -8 - i$
In general the square of a complex number is not necessarily a positive real number; it may not even
be real at all. For example if z = −2 + i then
\[ z^2 = (-2 + i)^2 = 4 - 4i + i^2 = 4 - 4i - 1 = 3 - 4i \]
However, the product of a complex number with its conjugate is always a non-negative real number.
If z = a + ib then
\[
\begin{aligned}
zz^* &= (a + ib)(a - ib) \\
&= a^2 - a(ib) + (ib)a - i^2 b^2 \\
&= a^2 - i^2 b^2 \\
&= a^2 + b^2 \qquad \text{since } i^2 = -1
\end{aligned}
\]
For example, if z = 2 + i then
\[ zz^* = (2 + i)(2 - i) = 4 + 1 = 5 \]
Task
Show, for any complex number z = a + ib, that $zz^* = |z|^2$.
Your solution
Answer
By definition $|z| = \sqrt{a^2 + b^2}$, so that $|z|^2 = a^2 + b^2$. Now $zz^* = a^2 + b^2$ so that $zz^* = |z|^2$.
Dividing one complex number by another

Here we consider the operation of dividing one complex number z = a + ib by another, w = c + id:
\[ \frac{z}{w} = \frac{a + ib}{c + id} \]
We wish to simplify the right-hand side into the standard form of a complex number (this is called
the Cartesian form):
(Real part) + i (Imaginary part)
or the equivalent:
(Real part) + (Imaginary part) i
To do this we multiply ‘top and bottom’ by the complex conjugate of the bottom (the denominator),
that is, by c − id (this is called rationalising):
\[ \frac{z}{w} = \frac{a + ib}{c + id} = \frac{a + ib}{c + id} \times \frac{c - id}{c - id} \]
and then carry out the multiplication, top and bottom:
\[ \frac{z}{w} = \frac{(ac + bd) + i(bc - ad)}{c^2 + d^2} = \frac{ac + bd}{c^2 + d^2} + i\,\frac{bc - ad}{c^2 + d^2} \]
which is now in the required form.

The reason for rationalising is to get a real number in the denominator since a complex number
divided by a real number is easy to evaluate.
Example 2
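Find $\frac{z}{w}$ if z = 2 − 3i and w = 2 + i.
Solution
\[
\begin{aligned}
\frac{z}{w} = \frac{2 - 3i}{2 + i} &= \frac{(2 - 3i) \times (2 - i)}{(2 + i) \times (2 - i)} && \text{rationalising} \\
&= \frac{4 - 3 + i(-6 - 2)}{4 + 1} && \text{multiplying out} \\
&= \frac{1}{5} - \frac{8}{5}\, i && \text{dividing through}
\end{aligned}
\]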
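The rationalising procedure can be written as a short function. The Python sketch below represents a + ib as the pair (a, b), assumes integer parts, and uses exact fractions so that the result can be compared with the hand calculation; the function name `divide` is illustrative.

```python
from fractions import Fraction

def divide(z, w):
    """Quotient z / w for z = a + ib and w = c + id given as integer pairs,
    obtained by multiplying top and bottom by the conjugate c - id."""
    a, b = z
    c, d = w
    denominator = c * c + d * d          # w times its conjugate: a real number
    if denominator == 0:
        raise ZeroDivisionError("division by the complex number zero")
    return (Fraction(a * c + b * d, denominator),   # real part
            Fraction(b * c - a * d, denominator))   # imaginary part

quotient = divide((2, -3), (2, 1))       # (2 - 3i) / (2 + i), as in Example 2
print(quotient)
```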
Task
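If z = 3 − i and w = 1 + 3i find $\frac{2z + 3w}{2z - 3w}$.
Your solution
Answer
\[
\begin{aligned}
\frac{2z + 3w}{2z - 3w} = \frac{9 + 7i}{3 - 11i} &= \frac{(9 + 7i)(3 + 11i)}{(3 - 11i)(3 + 11i)} \\
&= \frac{27 - 77 + (21 + 99)i}{9 + 121} \\
&= -\frac{50}{130} + \frac{120}{130}\, i = -\frac{5}{13} + \frac{12}{13}\, i
\end{aligned}
\]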
Exercises
1. If z = 2 − i, w = 3 + 4i find expressions (in standard Cartesian form) for
(a) $z - 3w$, (b) $zw^*$ (c) $\left(\frac{z}{w}\right)^*$ (d) $\left|\frac{z}{w}\right|$
2. Verify the following statements for general complex numbers z = a + ib and w = c + id
(a) $\left|\frac{z}{w}\right| = \frac{|z|}{|w|}$ (b) $(zw)^* = z^* w^*$ (c) $\mathrm{Re}(z) = \frac{z + z^*}{2}$ (d) $\mathrm{Im}(z) = \frac{i(z^* - z)}{2}$.
3. Find z such that $zz^* + 3(z - z^*) = 13 + 12i$
Answers
1. (a) −7 − 13i (b) 2 − 11i (c) $\frac{2}{25} + \frac{11}{25}\, i$ (d) $\frac{\sqrt{5}}{5}$
2. Note that since $z^* - z$ is imaginary then $i(z^* - z)$ is real!
3. z = ±3 + 2i
3. Solutions of polynomial equations

With the introduction of complex numbers we can now obtain solutions to those polynomial equations
which may have real solutions, complex solutions or a combination of real and complex solutions.
For example, the simple quadratic equation:
$x^2 + 16 = 0$ can be rearranged: $x^2 = -16$
and then taking square roots:
\[ x = \pm\sqrt{-16} = \pm 4\sqrt{-1} = \pm 4i \]
where we are replacing $\sqrt{-1}$ by the symbol ‘i’.
This approach can be extended to the general quadratic equation
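\[ ax^2 + bx + c = 0 \quad \text{with roots} \quad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]
so that for example, if
\[ 3x^2 + 2x + 2 = 0 \]
then solving for x:
\[ x = \frac{-2 \pm \sqrt{4 - 4(3)(2)}}{2(3)} = \frac{-2 \pm \sqrt{-20}}{6} = \frac{-2 \pm i\sqrt{20}}{6} \]
so, (as $\frac{\sqrt{20}}{6} = \frac{2\sqrt{5}}{6} = \frac{\sqrt{5}}{3}$), the two roots are $-\frac{1}{3} + \frac{\sqrt{5}}{3}\, i$ and $-\frac{1}{3} - \frac{\sqrt{5}}{3}\, i$.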
In this example we see that the two solutions (roots) are complex conjugates of each other. In fact
this will always be the case if the polynomial equation has real coefficients: that is, if any complex
roots occur they will always occur in complex conjugate pairs.
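As a numerical check of the quadratic formula with a negative discriminant, the following minimal sketch uses Python's standard `cmath` module. The function name `quadratic_roots` is an assumption of this sketch, not something defined in the Workbook.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a != 0), complex if needed."""
    disc = cmath.sqrt(b * b - 4 * a * c)   # principal square root; imaginary if b^2 < 4ac
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

r1, r2 = quadratic_roots(3, 2, 2)
print(r1, r2)
# With real coefficients the two complex roots should be conjugates of each other
print(abs(r1 - r2.conjugate()) < 1e-12)
```

The last line tests the conjugate-pair property for this one equation only; it is a check, not a proof.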
Key Point 5
Complex roots to polynomial equations having real coefficients
always occur in complex conjugate pairs
Example 3
Given that x = 3 − 2i is one root of the cubic equation $x^3 - 7x^2 + 19x - 13 = 0$
find the other two roots.
Solution
Since the coefficients of the equation are real and 3 − 2i is a root then its complex conjugate 3 + 2i is
also a root which implies that x − (3 − 2i) and x − (3 + 2i) are factors of the given cubic expression.
Multiplying together these two factors:
\[ (x - (3 - 2i))(x - (3 + 2i)) = x^2 - x(3 - 2i) - x(3 + 2i) + 13 = x^2 - 6x + 13 \]
So $x^2 - 6x + 13$ is a quadratic factor of the cubic expression. The remaining factor must take the
form (x + a) where a is real, since only one more linear factor of the cubic expression is required, and
so we write:
\[ x^3 - 7x^2 + 19x - 13 = (x^2 - 6x + 13)(x + a) \]
By inspection (consider for example the constant terms), it is clear that a = −1 so that the final
factor is (x − 1), implying that the original cubic equation has a root at x = 1.
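The factorisation in this Example can be checked by multiplying the factors back together and by substituting each root into the cubic. The helper names `poly_mul` and `poly_eval` below are assumptions of this sketch; coefficients are listed from the highest power down.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] += pc * qc
    return out

def poly_eval(p, x):
    """Evaluate a polynomial (highest power first) at x using nested multiplication."""
    acc = 0
    for coeff in p:
        acc = acc * x + coeff
    return acc

cubic = [1, -7, 19, -13]                       # x^3 - 7x^2 + 19x - 13
print(poly_mul([1, -6, 13], [1, -1]))          # (x^2 - 6x + 13)(x - 1)
for root in (3 - 2j, 3 + 2j, 1):
    print(root, poly_eval(cubic, root))        # each value should be zero
```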
Exercises
1. Find the roots of the equation $x^2 + 2x + 2 = 0$.
2. If i is one root of the cubic equation $x^3 + 2x^2 + x + 2 = 0$ find the two other roots.
3. Find the complex number z if $2z + z^* + 3i + 2 = 0$.
4. If z = cos θ + i sin θ show that $\frac{z}{z^*} = \cos 2\theta + i \sin 2\theta$.
Answers 1. x = −1 ± i 2. −i, −2 3. $-\frac{2}{3} - 3i$
Argand Diagrams
and the Polar Form 10.2
Introduction
In the first part of this Section we introduce a geometrical interpretation of a complex number. Since
a complex number z = x + iy is defined by two real numbers x and y it is natural to consider a plane
in which to place a complex number. We shall see that there is a close connection between complex
numbers and two-dimensional vectors.
In the second part of this Section we introduce an alternative form, called the polar form, for
representing complex numbers. We shall see that the polar form is particularly advantageous when
multiplying and dividing complex numbers.
Prerequisites
Before starting this Section you should . . .
• know what a complex number is
• be able to use trigonometric functions sin, cos and tan
• understand what a polynomial is
• possess a knowledge of vectors
Learning Outcomes
On completion you should be able to . . .
• represent complex numbers on an Argand diagram
• obtain the polar form of a complex number
• multiply and divide complex numbers in polar form
1. The Argand diagram

In Section 10.1 we met a complex number z = x + iy in which x, y are real numbers and
$i^2 = -1$. We learned how to combine complex numbers together using the usual operations of
addition, subtraction, multiplication and division. In this Section we examine a useful geometrical
description of complex numbers.
Since a complex number is specified by two real numbers x, y it is natural to represent a complex
number by a vector in a plane. We take the usual Oxy plane in which the ‘horizontal’ axis is the
x-axis and the ‘vertical’ axis is the y-axis.
[Figure 3: Argand diagram with x- and y-axes, showing z as a line from the origin to (2, 3) and w as a line from the origin to (−1, 1)]
Thus the complex number z = 2 + 3i would be represented by a line starting from the origin and
ending at the point with coordinates (2, 3) and w = −1 + i is represented by the line starting from
the origin and ending at the point with coordinates (−1, 1). See Figure 3. When the Oxy plane is
used in this way it is called an Argand diagram. With this interpretation the modulus of z, that is
|z| is the length of the line which represents z.
Note: An alternative interpretation is to consider the complex number a + ib to be represented by
the point (a, b) rather than the line from 0 to (a, b).
Task
Given that z = 1 + i, w = i , represent the three complex numbers z, w and
2z − 3w − 1 on an Argand diagram.
Your solution
Answer
Noting that 2z − 3w − 1 = 2 + 2i − 3i − 1 = 1 − i you should obtain the following diagram.
[Argand diagram: w = i drawn from the origin to (0, 1), z = 1 + i drawn to (1, 1), and 2z − 3w − 1 = 1 − i drawn to (1, −1)]
If we have two complex numbers z = a + ib, w = c + id then, as we already know

z + w = (a + c) + i(b + d)
that is, the real parts add together and the imaginary parts add together. But this is precisely what
occurs with the addition of two vectors. If p and q are 2-dimensional vectors then:
p = ai + bj q = ci + dj
where i and j are unit vectors in the x- and y-directions respectively. So, using vector addition:
p + q = (a + c)i + (b + d)j
[Figure 4: two panels. Left, vector addition: p, q and p + q. Right, complex addition: z, w and z + w.]
We conclude from this that addition (and hence subtraction) of complex numbers is essentially
equivalent to addition (subtraction) of two-dimensional vectors. (See Figure 4.) Because of this,
complex numbers (when represented on an Argand diagram) are slidable — as long as you keep
their length and direction the same, you can position them anywhere on an Argand diagram.
We see that the Cartesian form of a complex number: z = a + ib is a particularly suitable form for
addition (or subtraction) of complex numbers. However, when we come to consider multiplication
and division of complex numbers, the Cartesian description is not the most convenient form that is
available to us. A much more convenient form is the polar form which we now introduce.
2. The polar form of a complex number

We have seen, above, that the complex number z = a + ib can be represented by a line pointing out
from the origin and ending at a point with Cartesian coordinates (a, b).
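[Figure 5: the line z from the origin 0 to the point P, of length r and at angle θ to the positive x-axis; P has Cartesian coordinates (a, b)]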
To locate the point P we introduce polar coordinates (r, θ) where r is the positive distance from 0
and θ is the angle measured from the positive x-axis, as shown in Figure 5. From the properties of
the right-angled triangle there is an obvious relation between (a, b) and (r, θ):
a = r cos θ b = r sin θ
or equivalently,
\[ r = \sqrt{a^2 + b^2} \qquad \tan\theta = \frac{b}{a}. \]
This leads to an alternative way of writing a complex number:
z = a + ib = r cos θ + ir sin θ
= r(cos θ + i sin θ)
The angle θ is called the argument of z and written, for short, arg(z). The non-negative real number
r is the modulus of z. We normally consider θ measured in radians to lie in the interval −π < θ ≤ π
although any value θ + 2kπ for integer k will be equivalent to θ. The angle θ may be expressed in
radians or degrees.
Key Point 6
If z = a + ib then
z = r(cos θ + i sin θ)
in which
\[ r = |z| = \sqrt{a^2 + b^2} \quad \text{and} \quad \theta = \arg(z) = \tan^{-1}\frac{b}{a} \]
Example 4
Find the polar coordinate form of (a) z = 3 + 4i (b) z = −3 − i
Solution
(a) Here
\[ r = |z| = \sqrt{3^2 + 4^2} = \sqrt{25} = 5 \qquad \theta = \arg(z) = \tan^{-1}\left(\frac{4}{3}\right) = 53.13^\circ \]
so that z = 5(cos 53.13° + i sin 53.13°)
(b) Here
\[ r = |z| = \sqrt{(-3)^2 + (-1)^2} = \sqrt{10} \approx 3.16 \qquad \theta = \arg(z) = \tan^{-1}\frac{(-1)}{(-3)} \]
It is natural to assume that $\tan^{-1}\frac{(-1)}{(-3)} = \tan^{-1}\frac{1}{3}$. Using this value on your calculator (unless
it is very sophisticated) you will obtain a value of about 18.43° for $\tan^{-1}(\frac{1}{3})$. This is incorrect since
if we use the Argand diagram to plot z = −3 − i we get:
[Figure 6: Argand diagram of z = −3 − i, drawn from the origin to the point (−3, −1) in the third quadrant, with the angle θ marked]
The angle θ is clearly −180° + 18.43° = −161.57°.
This example warns us to take care when determining arg(z) purely using algebra. You will always
find it helpful to construct the Argand diagram to locate the particular quadrant into which your
complex number is pointing. Your calculator cannot do this for you.
Finally, in this example, z = 3.16(cos 198.43° + i sin 198.43°), since −161.57° and
−161.57° + 360° = 198.43° describe the same direction.
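The quadrant problem in part (b) can be reproduced in Python. The sketch below compares the naive inverse tangent of b/a with the two-argument function `math.atan2`, which takes the signs of both parts into account, and with `cmath.phase`. The variable names are illustrative only.

```python
import cmath
import math

z = complex(-3, -1)

naive = math.degrees(math.atan(z.imag / z.real))      # ignores the quadrant
correct = math.degrees(math.atan2(z.imag, z.real))    # uses the signs of both parts
print(round(naive, 2), round(correct, 2))              # about 18.43 and -161.57
print(round(math.degrees(cmath.phase(z)), 2))          # same principal value as atan2
print(round(correct % 360, 2))                         # the same angle shifted into 0 to 360 degrees
```

Drawing the Argand diagram remains a useful independent check on whatever a program or calculator reports.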
Task
Find the polar coordinate form of the complex numbers
(a) z = −i (b) z = 3 − 4i
Your solution
(a)
Answer
z = 1(cos 270◦ + i sin 270◦ )
Your solution
(b)
Answer
z = 5(cos 306.87◦ + i sin 306.87◦ )
Remember, to get the correct angle, draw your complex number on an Argand diagram.
Multiplication and division using polar coordinates

The reader will perhaps be wondering why we have bothered to introduce the polar form of a complex
number. After all, the calculation of arg(z) is not particularly straightforward. However, as we have
said, the polar form of a complex number is a much more convenient vehicle to use for multiplication
and division of complex numbers. To see why, let us consider two complex numbers in polar form:
z = r(cos θ + i sin θ) w = t(cos φ + i sin φ)
Then the product zw is calculated in the usual way
zw = [r (cos θ + i sin θ)][t (cos φ + i sin φ)]

≡ rt [cos θ cos φ − sin θ sin φ + i(sin θ cos φ + cos θ sin φ)]
≡ rt [cos(θ + φ) + i sin(θ + φ)]
in which we have used the standard trigonometric identities

cos(θ + φ) ≡ cos θ cos φ − sin θ sin φ sin(θ + φ) ≡ sin θ cos φ + cos θ sin φ.
We see that in calculating the product the moduli r and t multiply together whilst the arguments
arg(z) = θ and arg(w) = φ add together.
Task
If z = r(cos θ + i sin θ) and w = t(cos φ + i sin φ) find the polar expression for $\frac{z}{w}$.
Your solution
Answer
\[ \frac{z}{w} = \frac{r}{t}(\cos(\theta - \phi) + i \sin(\theta - \phi)) \]
We see that in calculating the quotient the moduli r and t divide whilst the arguments arg(z) = θ
and arg(w) = φ subtract.
Key Point 7
If z = r(cos θ + i sin θ) and w = t(cos φ + i sin φ) then
\[ zw = rt(\cos(\theta + \phi) + i \sin(\theta + \phi)) \qquad \frac{z}{w} = \frac{r}{t}(\cos(\theta - \phi) + i \sin(\theta - \phi)) \]
We conclude that addition and subtraction are most easily carried out in Cartesian form whereas
multiplication and division are most easily carried out in polar form.
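The polar rules for products and quotients can be checked numerically. This sketch uses `cmath.polar` and `cmath.rect` from the Python standard library; the two sample numbers are arbitrary choices, not taken from the Workbook.

```python
import cmath

z, w = 3 + 4j, -1 + 2j
r, theta = cmath.polar(z)     # modulus and argument (in radians) of z
t, phi = cmath.polar(w)       # modulus and argument (in radians) of w

product = cmath.rect(r * t, theta + phi)     # moduli multiply, arguments add
quotient = cmath.rect(r / t, theta - phi)    # moduli divide, arguments subtract

print(cmath.isclose(product, z * w), cmath.isclose(quotient, z / w))
```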
Complex numbers and rotations

We have seen that, when multiplying one complex number by another, the moduli multiply together
and the arguments add together. If, in particular, w is a complex number with a modulus t
w = t(cos φ + i sin φ) (i.e. r = t)
and if z is a complex number with modulus 1
z = (cos θ + i sin θ) (i.e. r = 1)
then multiplying w by z gives
wz = t(cos(θ + φ) + i sin(θ + φ)) (using Key Point 7)
We see that the effect of multiplying w by z is to rotate the line representing the complex number
w anti-clockwise through an angle θ which is arg(z), and preserving the length. See Figure 7.
[Figure 7: two panels. Left: w drawn at angle φ to the x-axis. Right: after multiplying by z, wz is drawn at angle θ + φ with the same length.]
This result would certainly be difficult to obtain had we continued to use the Cartesian form.
Since, in terms of the polar form of a complex number
−1 = 1(cos 180◦ + i sin 180◦ )
we see that multiplying a number by −1 produces a rotation through 180◦ . In particular multiplying
a number by −1 and then by (−1) again (i.e. (−1)(−1)) rotates the number through 180◦ twice,
totalling 360◦ , which is equivalent to leaving the number unchanged. Hence the introduction of
complex numbers has ‘explained’ the accepted (though not obvious) result (−1)(−1) = +1.
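The rotation effect can be seen numerically. In this sketch the number w = 2 + i and the 90° rotation are arbitrary illustrative choices: multiplying four times by a number of modulus 1 and argument 90° leaves the length unchanged and returns to the starting point.

```python
import cmath
import math

w = 2 + 1j
z = cmath.rect(1, math.radians(90))    # modulus 1, argument 90 degrees (approximately i)

current = w
for step in range(1, 5):
    current *= z                       # each product rotates by 90 degrees anti-clockwise
    angle = math.degrees(cmath.phase(current))
    print(step, round(abs(current), 6), round(angle, 2))

print(cmath.isclose(current, w, abs_tol=1e-12))   # four quarter-turns give the original number
```

The printed angles are principal values, so they lie between −180° and 180°.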
Exercises
1. Display, on an Argand diagram, the complex numbers 1 − i, 1 + 3i and −1 + 2i.
2. Find the polar form of (a) 1 − i, (b) 1 + 3i (c) 2i − 1. Hence calculate $\frac{(1 + 3i)}{(-1 + 2i)}$
3. On an Argand diagram draw the complex number 1 + 2i. By changing to polar form examine
the effect of multiplying 1 + 2i by, in turn, $i, i^2, i^3, i^4$. Represent these new complex numbers
on an Argand diagram.
4. By utilising the Argand diagram convince yourself that |z + w| ≤ |z| + |w| for any two complex
numbers z, w. This is known as the triangle inequality.
Answers
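1. [Argand diagram showing 1 + 3i in the first quadrant, 2i − 1 in the second quadrant and 1 − i in the fourth quadrant]
2. (a) $\sqrt{2}(\cos 315^\circ + i \sin 315^\circ)$ (b) $\sqrt{10}(\cos 71.57^\circ + i \sin 71.57^\circ)$
(c) $\sqrt{5}(\cos 116.57^\circ + i \sin 116.57^\circ)$.
\[ \frac{(1 + 3i)}{(-1 + 2i)} = \sqrt{2}(\cos(-45^\circ) + i \sin(-45^\circ)) = \sqrt{2}(\cos(45^\circ) - i \sin(45^\circ)) = (1 - i). \]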
3. Each time you multiply through by i you effect a rotation through 90◦ of the line representing
the complex number 1 + 2i. After four such products you are back to where you started, at
1 + 2i.
4. This inequality states that no one side of a triangle is greater in length than the sum of the
lengths of the other two sides.
The Exponential Form
of a Complex Number 10.3
Introduction
In this Section we introduce a third way of expressing a complex number: the exponential form. We
shall discover, through the use of the complex number notation, the intimate connection between
the exponential function and the trigonometric functions. We shall also see, using the exponential
form, that certain calculations, particularly multiplication and division of complex numbers, are even
easier than when expressed in polar form.
The exponential form of a complex number is in widespread use in engineering and science.
Prerequisites
Before starting this Section you should . . .
• be able to convert from degrees to radians
• understand how to use the Cartesian and polar forms of a complex number
• be familiar with the hyperbolic functions cosh x and sinh x
Learning Outcomes
On completion you should be able to . . .
• explain the relations between the exponential function $e^x$ and the trigonometric functions cos x, sin x
• interchange between Cartesian, polar and exponential forms of a complex number
• explain the relation between hyperbolic and trigonometric functions
1. Series expansions for exponential and trigonometric functions
We have, so far, considered two ways of representing a complex number:
z = a + ib Cartesian form
or
z = r(cos θ + i sin θ) polar form
In this Section we introduce a third way of denoting a complex number: the exponential form.
If x is a real number then, as we shall verify in Workbook 16, the exponential number e raised to the
power x can be written as a series of powers of x:
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots \]
in which n! = n(n − 1)(n − 2) . . . (3)(2)(1) is the factorial of the integer n. Although there are
an infinite number of terms on the right-hand side, in any practical calculation we could only use a
finite number. For example if we choose x = 1 (and taking only six terms) then
\[ e^1 \approx 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} \]
\[ = 2 + 0.5 + 0.16666 + 0.04166 + 0.00833 \]
\[ = 2.71666 \]
which is fairly close to the accurate value of e = 2.71828 (to 5 d.p.)

We ask you to accept that $e^x$, for any real value of x, is the same as $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$ and
that if we wish to calculate $e^x$ for a particular value of x we will only take a finite number of terms
in the series. Obviously the more terms we take in any particular calculation the more accurate will
be our calculation.
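The effect of taking more terms can be seen with a short Python sketch. The function name `exp_partial_sum` is an assumption of this illustration; it adds up the first few terms of the series and compares the result with the library value `math.e`.

```python
import math

def exp_partial_sum(x, n_terms):
    """Sum the first n_terms of 1 + x + x**2/2! + x**3/3! + ..."""
    total, term = 0.0, 1.0
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)      # turns x**n/n! into x**(n+1)/(n+1)!
    return total

for n_terms in (6, 10, 15):
    approx = exp_partial_sum(1.0, n_terms)
    print(n_terms, approx, abs(approx - math.e))   # the error shrinks as terms are added
```

With six terms the sum is the value 2.71666... obtained above.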
As we shall also see in Workbook 16, similar series expansions exist for the trigonometric functions sin x
and cos x:
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \]
in which x is measured in radians.
The observant reader will see that these two series for sin x and cos x are similar to the series for $e^x$.
Through the use of the symbol i (where $i^2 = -1$) we will examine this close correspondence.
In the series for $e^x$ replace x on both left-hand and right-hand sides by iθ to give:
\[ e^{i\theta} = 1 + (i\theta) + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \cdots \]
Then, as usual, replace every occurrence of $i^2$ by −1 to give
\[ e^{i\theta} = 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} + \cdots \]
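which, when re-organised into real and imaginary terms gives, finally:
\[ e^{i\theta} = \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right) \]
\[ = \cos\theta + i \sin\theta \]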
Key Point 8
\[ e^{i\theta} \equiv \cos\theta + i \sin\theta \]
Example 5
Find complex number expressions, in Cartesian form, for
(a) $e^{i\pi/4}$ (b) $e^{-i}$ (c) $e^{i\pi}$
Solution
We use Key Point 8:
(a) $e^{i\pi/4} = \cos\frac{\pi}{4} + i \sin\frac{\pi}{4} = \frac{1}{\sqrt{2}} + i\frac{1}{\sqrt{2}}$
(b) $e^{-i} = \cos(-1) + i \sin(-1) = 0.540 - i(0.841)$ (don’t forget: use radians)
(c) $e^{i\pi} = \cos\pi + i \sin\pi = -1 + i(0) = -1$
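Key Point 8 can be tested numerically by summing the series with iθ in place of x and comparing the result with cos θ + i sin θ. This is a minimal sketch: the function name `exp_i_theta_series` and the choice of 30 terms are assumptions made for illustration.

```python
import cmath
import math

def exp_i_theta_series(theta, n_terms=30):
    """Partial sum of 1 + (i theta) + (i theta)**2/2! + ... with n_terms terms."""
    total, term = 0j, 1 + 0j
    for n in range(n_terms):
        total += term
        term *= 1j * theta / (n + 1)   # next term of the series
    return total

for theta in (math.pi / 4, -1.0, math.pi):      # the three angles of Example 5
    series = exp_i_theta_series(theta)
    euler = complex(math.cos(theta), math.sin(theta))
    print(theta, series, cmath.isclose(series, euler, abs_tol=1e-9))
```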
2. The exponential form

Since z = r(cos θ + i sin θ) and since $e^{i\theta} = \cos\theta + i \sin\theta$ we therefore obtain another way in which
to denote a complex number: $z = re^{i\theta}$, called the exponential form.
Key Point 9
The exponential form of a complex number is
$z = re^{i\theta}$ in which r = |z| and θ = arg(z)
so
\[ z = re^{i\theta} = r(\cos\theta + i \sin\theta) \]
Task
Express $z = 3e^{i\pi/6}$ in Cartesian form, correct to 2 d.p.
Use Key Point 9:
Your solution
Answer
\[ z = 3e^{i\pi/6} = 3\left(\cos\frac{\pi}{6} + i \sin\frac{\pi}{6}\right) \]
\[ = 3(0.8660 + 0.5000i) \]
\[ = 2.60 + 1.50i \quad \text{to 2 d.p.} \]
Example 6
If $z = re^{i\theta}$ and $w = te^{i\phi}$ then find expressions for (a) $z^{-1}$ (b) $z^*$ (c) $zw$
Solution
(a) If $z = re^{i\theta}$ then $z^{-1} = \frac{1}{re^{i\theta}} = \frac{1}{r}e^{-i\theta}$ using the normal rules for indices.
(b) Working in polar form: if $z = re^{i\theta} = r(\cos\theta + i \sin\theta)$ then
\[ z^* = r(\cos\theta - i \sin\theta) = r(\cos(-\theta) + i \sin(-\theta)) = re^{-i\theta} \]
since cos(−θ) = cos θ and sin(−θ) = − sin θ. In fact this reflects the general rule: to find the
complex conjugate of any expression simply replace i by −i wherever it occurs in the expression.
(c) $zw = (re^{i\theta})(te^{i\phi}) = rte^{i\theta}e^{i\phi} = rte^{i\theta + i\phi} = rte^{i(\theta + \phi)}$ which is again the result we are familiar with:
when complex numbers are multiplied their moduli multiply and their arguments add.
We see that in some circumstances the exponential form is even more convenient than the polar form
since we need not worry about cumbersome trigonometric relations.
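The results of Example 6 can be tried out with a short sketch built on `cmath.polar` and `cmath.exp`. The helper names `to_exponential` and `from_exponential` and the sample number 2 + 3i are assumptions of this illustration.

```python
import cmath

def to_exponential(z):
    """Return (r, theta) with z = r * e**(i*theta) and -pi < theta <= pi."""
    return cmath.polar(z)

def from_exponential(r, theta):
    """Rebuild the Cartesian form of r * e**(i*theta)."""
    return r * cmath.exp(1j * theta)

z = 2 + 3j
r, theta = to_exponential(z)
inverse = from_exponential(1 / r, -theta)     # z**(-1) = (1/r) e**(-i theta)
conjugate = from_exponential(r, -theta)       # z*      = r e**(-i theta)

print(round(r, 4), round(theta, 4))           # modulus and argument in radians
print(cmath.isclose(inverse, 1 / z), cmath.isclose(conjugate, z.conjugate()))
```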
Task
Express the following complex numbers in exponential form:
(a) z = 1 − i (b) z = 2 + 3i (c) z = −6.
Your solution
(a)
Answer
$z = \sqrt{2}e^{i7\pi/4}$ (or, equivalently, $\sqrt{2}e^{-i\pi/4}$)
Your solution
(b)
Answer
$z = \sqrt{13}e^{i(0.9828)}$
Your solution
(c)
Answer
$z = 6e^{i\pi}$
3. Hyperbolic and trigonometric functions

We have seen in subsection 1 (Key Point 8) that
\[ e^{i\theta} = \cos\theta + i \sin\theta \]
It follows from this that
\[ e^{-i\theta} = \cos(-\theta) + i \sin(-\theta) = \cos\theta - i \sin\theta \]
Now if we add these two relations together we obtain
\[ \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \]
whereas if we subtract the second from the first we have
\[ \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} \]
These new relations are reminiscent of the hyperbolic functions introduced in Workbook 6. There we
defined cosh x and sinh x in terms of the exponential function:
\[ \cosh x = \frac{e^x + e^{-x}}{2} \qquad \sinh x = \frac{e^x - e^{-x}}{2} \]
In fact, if we replace x by iθ in these last two equations we obtain
\[ \cosh(i\theta) = \frac{e^{i\theta} + e^{-i\theta}}{2} \equiv \cos\theta \quad \text{and} \quad \sinh(i\theta) = \frac{e^{i\theta} - e^{-i\theta}}{2} \equiv i \sin\theta \]
Although, by our notation, we have implied that both x and θ are real quantities, in fact these
expressions for cosh and sinh in terms of cos and sin are quite general.
Key Point 10
If z is any complex number then
cosh(iz) ≡ cos z and sinh(iz) ≡ i sin z
Equivalently, replacing z by iz:
cosh z ≡ cos(iz) and i sinh z ≡ sin(iz)
Task
Given that $\cos^2 z + \sin^2 z \equiv 1$ for all z then, utilising complex numbers, obtain
the equivalent identity for hyperbolic functions.
Your solution
Answer
You should obtain $\cosh^2 z - \sinh^2 z \equiv 1$ since, if we replace z by iz in the given identity then
$\cos^2(iz) + \sin^2(iz) \equiv 1$. But as noted above cos(iz) ≡ cosh z and sin(iz) ≡ i sinh z so the result
follows.
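The relations of Key Point 10 and the identity obtained in this Task can be spot-checked at a single complex value using Python's `cmath` module. The test value of z below is an arbitrary choice, and agreement at one point is a sanity check rather than a proof.

```python
import cmath

z = 0.7 - 1.3j    # an arbitrary complex test value

check_cosh = cmath.isclose(cmath.cosh(1j * z), cmath.cos(z))         # cosh(iz) = cos z
check_sinh = cmath.isclose(cmath.sinh(1j * z), 1j * cmath.sin(z))    # sinh(iz) = i sin z
check_identity = cmath.isclose(cmath.cosh(z) ** 2 - cmath.sinh(z) ** 2, 1)

print(check_cosh, check_sinh, check_identity)
```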
Further analysis similar to that in the above task leads to Osborne’s rule:
Key Point 11
Osborne’s Rule
Hyperbolic function identities are obtained from trigonometric function
identities by replacing sin θ by sinh θ and cos θ by cosh θ except that
every occurrence of $\sin^2\theta$ is replaced by $-\sinh^2\theta$.
Example 7
Use Osborne’s rule to obtain the hyperbolic identity equivalent to
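$1 + \tan^2\theta \equiv \sec^2\theta$.
Solution
Here $1 + \tan^2\theta \equiv \sec^2\theta$ is equivalent to $1 + \frac{\sin^2\theta}{\cos^2\theta} \equiv \frac{1}{\cos^2\theta}$. Hence if
\[ \sin^2\theta \to -\sinh^2\theta \quad \text{and} \quad \cos^2\theta \to \cosh^2\theta \]
then we obtain
\[ 1 - \frac{\sinh^2\theta}{\cosh^2\theta} \equiv \frac{1}{\cosh^2\theta} \quad \text{or, equivalently,} \quad 1 - \tanh^2\theta \equiv \mathrm{sech}^2\theta \]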
Feedback applied to an amplifier
Feedback is applied to an amplifier such that
\[ A' = \frac{A}{1 - \beta A} \]
where A′, A and β are complex quantities. A is the amplifier gain, A′ is the gain with feedback and
β is the proportion of the output which has been fed back.
[Figure 8: An amplifier with feedback. The input $V_i$ passes through a gain block A to the output $V_o$, with a feedback loop around the gain block.]
(a) If at 30 Hz, A = −500 and $\beta = 0.005e^{8\pi i/9}$, calculate A′ in exponential form.
(b) At a particular frequency it is desired to have $A' = 300e^{5\pi i/9}$ where it is known that
$A = 400e^{11\pi i/18}$. Find the value of β necessary to achieve this gain modification.

For (a): substitute A = −500 and $\beta = 0.005e^{8\pi i/9}$ into $A' = \frac{A}{1 - \beta A}$ in order to find A′.
For (b): we need to solve for β when $A' = 300e^{5\pi i/9}$ and $A = 400e^{11\pi i/18}$.
(a)
\[ A' = \frac{A}{1 - \beta A} = \frac{-500}{1 - 0.005e^{8\pi i/9} \times (-500)} = \frac{-500}{1 + 2.5e^{8\pi i/9}} \]
Expressing the bottom line of this expression in Cartesian form this becomes:
\[ A' = \frac{-500}{1 + 2.5\cos(\frac{8\pi}{9}) + 2.5i \sin(\frac{8\pi}{9})} \approx \frac{-500}{-1.349 + 0.855i} \]
Expressing both the top and bottom lines in exponential form we get:
\[ A' \approx \frac{500e^{i\pi}}{1.597e^{i2.576}} \approx 313e^{0.566i} \]
(b)
\[ A' = \frac{A}{1 - \beta A} \;\to\; A'(1 - \beta A) = A \;\to\; -\beta AA' = A - A' \]
\[ \text{i.e.} \quad \beta = \frac{A' - A}{AA'} \;\to\; \beta = \frac{1}{A} - \frac{1}{A'} \]
So
\[ \beta = \frac{1}{A} - \frac{1}{A'} = \frac{1}{400e^{11\pi i/18}} - \frac{1}{300e^{5\pi i/9}} \approx 0.0025e^{-11\pi i/18} - 0.00333e^{-5\pi i/9} \]
Expressing both complex numbers in Cartesian form gives
\[ \beta = 0.0025\cos(-\frac{11\pi}{18}) + 0.0025i \sin(-\frac{11\pi}{18}) - 0.00333\cos(-\frac{5\pi}{9}) - 0.00333i \sin(-\frac{5\pi}{9}) \]
\[ = -2.768 \times 10^{-4} + 9.3017 \times 10^{-4}i = 9.7048 \times 10^{-4}e^{1.86i} \]
So to 3 significant figures $\beta = 9.70 \times 10^{-4}e^{1.86i}$
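Both parts of this calculation can be repeated in Python with `cmath`. The function names below are assumptions of this sketch. Because the sketch uses 1/300 unrounded rather than 0.00333, it should give a modulus for β of roughly 9.73 × 10⁻⁴, slightly different from the hand-worked figure; the argument should agree to the accuracy quoted.

```python
import cmath
import math

def gain_with_feedback(A, beta):
    """Gain with feedback, A' = A / (1 - beta*A)."""
    return A / (1 - beta * A)

def feedback_for_gain(A, A_fb):
    """Feedback proportion beta = 1/A - 1/A' needed to obtain the gain A_fb."""
    return 1 / A - 1 / A_fb

# Part (a): A = -500, beta = 0.005 e^(8 pi i / 9)
A_fb = gain_with_feedback(-500, 0.005 * cmath.exp(8j * math.pi / 9))
print(cmath.polar(A_fb))                 # (modulus, argument in radians)

# Part (b): A = 400 e^(11 pi i / 18), desired A' = 300 e^(5 pi i / 9)
A = 400 * cmath.exp(11j * math.pi / 18)
target = 300 * cmath.exp(5j * math.pi / 9)
beta = feedback_for_gain(A, target)
print(cmath.polar(beta))
print(cmath.isclose(gain_with_feedback(A, beta), target))   # beta reproduces the desired gain
```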
Exercises
1. Two standard identities in trigonometry are sin 2z ≡ 2 sin z cos z and $\cos 2z \equiv \cos^2 z - \sin^2 z$.
Use Osborne’s rule to obtain the corresponding identities for hyperbolic functions.
2. Express sinh(a + ib) in Cartesian form.
3. Express the following complex numbers in Cartesian form (a) $3e^{i\pi/3}$ (b) $e^{-2\pi i}$ (c) $e^{i\pi/2}e^{i\pi/4}$.
4. Express the following complex numbers in exponential form
(a) z = 2 − i (b) z = 4 − 3i (c) $z^{-1}$ where z = 2 − 3i.
5. Obtain the real and imaginary parts of $\sinh(1 + \frac{i\pi}{6})$.
Answers
1. sinh 2z ≡ 2 sinh z cosh z, $\cosh 2z \equiv \cosh^2 z + \sinh^2 z$.
2. sinh(a + ib) ≡ sinh a cosh ib + cosh a sinh ib
≡ sinh a cos b + cosh a(i sin b)
≡ sinh a cos b + i cosh a sin b
3. (a) 1.5 + i(2.598) (b) 1 (c) −0.707 + i(0.707)
4. (a) $\sqrt{5}e^{i(5.820)}$ (b) $5e^{i(5.6397)}$ (c) $2 - 3i = \sqrt{13}e^{i(5.300)}$ therefore $\frac{1}{2 - 3i} = \frac{1}{\sqrt{13}}e^{-i(5.300)}$
5. $\sinh(1 + \frac{i\pi}{6}) = \frac{\sqrt{3}}{2}\sinh 1 + \frac{i}{2}\cosh 1 = 1.0178 + i(0.7715)$
6 2 2

De Moivre’s Theorem 10.4

Introduction
In this Section we introduce De Moivre’s theorem and examine some of its consequences. We shall
see that one of its uses is in obtaining relationships between trigonometric functions of multiple angles
(like sin 3x, cos 7x) and powers of trigonometric functions (like $\sin^2 x$, $\cos^4 x$). Another important
use of De Moivre’s theorem is in obtaining complex roots of polynomial equations. In this application
we re-examine our definition of the argument arg(z) of a complex number.
Prerequisites
Before starting this Section you should . . .
• be familiar with the polar form of a complex number
• be familiar with the Argand diagram
• be familiar with the trigonometric identity $\cos^2\theta + \sin^2\theta \equiv 1$
• know how to expand $(x + y)^n$ when n is a positive integer
Learning Outcomes
On completion you should be able to . . .
• employ De Moivre’s theorem in a number of applications
• fully define the argument arg(z) of a complex number
• obtain complex roots of complex numbers
1. De Moivre’s theorem
We have seen, in Section 10.2 Key Point 7, that, in polar form, if z = r(cos θ + i sin θ) and
w = t(cos φ + i sin φ) then the product zw is:
zw = rt(cos(θ + φ) + i sin(θ + φ))
In particular, if r = 1, t = 1 and θ = φ (i.e. z = w = cos θ + i sin θ), we obtain
\[ (\cos\theta + i \sin\theta)^2 = \cos 2\theta + i \sin 2\theta \]
Multiplying each side of the above equation by cos θ + i sin θ gives
\[ (\cos\theta + i \sin\theta)^3 = (\cos 2\theta + i \sin 2\theta)(\cos\theta + i \sin\theta) = \cos 3\theta + i \sin 3\theta \]
on adding the arguments of the terms in the product.
Similarly
\[ (\cos\theta + i \sin\theta)^4 = \cos 4\theta + i \sin 4\theta. \]
After completing p such products we have:
\[ (\cos\theta + i \sin\theta)^p = \cos p\theta + i \sin p\theta \]
where p is a positive integer.
In fact this result can be shown to be true for those cases in which p is a negative integer and even
when p is a rational number e.g. $p = \frac{1}{2}$.
Key Point 12
If p is a rational number:
\[ (\cos\theta + i \sin\theta)^p \equiv \cos p\theta + i \sin p\theta \]
This result is known as De Moivre’s theorem.
Recalling from Key Point 8 that $\cos\theta + i \sin\theta = e^{i\theta}$, De Moivre’s theorem is simply a statement of
the laws of indices:
\[ (e^{i\theta})^p = e^{ip\theta} \]
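For integer powers the theorem is easy to test numerically. The sketch below is restricted to integer p, since fractional powers are many-valued; the function name `de_moivre_gap` and the angle 0.6 radians are arbitrary choices for illustration.

```python
import math

def de_moivre_gap(theta, p):
    """Size of (cos t + i sin t)**p - (cos pt + i sin pt) for an integer power p."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** p
    rhs = complex(math.cos(p * theta), math.sin(p * theta))
    return abs(lhs - rhs)

for p in (2, 3, 4, -1, -5):
    print(p, de_moivre_gap(0.6, p) < 1e-12)    # the two sides agree to rounding error
```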
2. De Moivre’s theorem and root finding

In this subsection we ask if we can obtain fractional powers of complex numbers; for example what
are the values of $8^{1/3}$ or $(-24)^{1/4}$ or even $(1 + i)^{1/2}$?
More precisely, for these three examples, we are asking for those values of z which satisfy
\[ z^3 - 8 = 0 \quad \text{or} \quad z^4 + 24 = 0 \quad \text{or} \quad z^2 - (1 + i) = 0 \]
Each of these problems involves finding roots of a complex number.

To solve problems such as these we shall need to be more careful with our interpretation of arg(z)
for a given complex number z.
Arg(z) revisited
By definition arg(z) is the angle made by the line representing z with the positive x-axis. See Figure
9(a). However, as Figure 9(b) shows, you can increase θ by 2π (or 360°) and still obtain the same
line in the xy plane. In general, as indicated in Figure 9(c), any integer multiple of 2π can be added
to or subtracted from arg(z) without affecting the Cartesian form of the complex number.
[Figure 9: three panels showing the line z from the origin to the point P, with its angle to the x-axis marked as (a) θ, (b) θ + 2π, (c) θ + 2kπ]
Key Point 13
arg(z) is unique only up to an integer multiple of 2π radians
For example:
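\[ z = 1 + i = \sqrt{2}\left(\cos\frac{\pi}{4} + i \sin\frac{\pi}{4}\right) \quad \text{in polar form} \]
However, we could also write, equivalently:
\[ z = 1 + i = \sqrt{2}\left(\cos(\frac{\pi}{4} + 2\pi) + i \sin(\frac{\pi}{4} + 2\pi)\right) \]
or, in full generality:
\[ z = 1 + i = \sqrt{2}\left(\cos(\frac{\pi}{4} + 2k\pi) + i \sin(\frac{\pi}{4} + 2k\pi)\right) \quad k = 0, \pm 1, \pm 2, \cdots \]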
This last expression shows that in the polar form of a complex number the argument of z, arg(z),
can assume infinitely many different values, each one differing by an integer multiple of 2π. This is
nothing more than a consequence of the well-known properties of the trigonometric functions:
cos(θ + 2kπ) ≡ cos θ, sin(θ + 2kπ) ≡ sin θ for any integer k
We shall now show how we can use this more general interpretation of arg(z) in the process of finding
roots.
Example 8
Find all the values of $8^{1/3}$.
Solution
Solving $z = 8^{1/3}$ for z is equivalent to solving the cubic equation $z^3 - 8 = 0$. We expect that there
are three possible values of z satisfying this cubic equation. Thus, rearranging: $z^3 = 8$. Now write
the right-hand side as a complex number in polar form:
\[ z^3 = 8(\cos 0 + i \sin 0) \]
(i.e. r = |8| = 8 and arg(8) = 0). However, if we now generalise our expression for the argument,
by adding an arbitrary integer multiple of 2π, we obtain the modified expression:
\[ z^3 = 8(\cos(2k\pi) + i \sin(2k\pi)) \quad k = 0, \pm 1, \pm 2, \cdots \]
Now take the cube root of both sides:
\[ z = \sqrt[3]{8}\,(\cos(2k\pi) + i \sin(2k\pi))^{1/3} \]
\[ = \sqrt[3]{8}\left(\cos\frac{2k\pi}{3} + i \sin\frac{2k\pi}{3}\right) \quad \text{using De Moivre’s theorem.} \]
Now in this expression k can take any integer value (including zero). The normal procedure is to take three
consecutive values of k (say k = 0, 1, 2). Any other value of k chosen will lead to a root (a value
of z) which repeats one of the three already determined.
So if k = 0: $z_0 = 2(\cos 0 + i \sin 0) = 2$
k = 1: $z_1 = 2(\cos\frac{2\pi}{3} + i \sin\frac{2\pi}{3}) = -1 + i\sqrt{3}$
k = 2: $z_2 = 2(\cos\frac{4\pi}{3} + i \sin\frac{4\pi}{3}) = -1 - i\sqrt{3}$
These are the three (complex) values of $8^{1/3}$. The reader should verify, by direct multiplication, that
$(-1 + i\sqrt{3})^3 = 8$ and that $(-1 - i\sqrt{3})^3 = 8$.
The reader may have noticed within this Example a subtle change in notation. When, for example,
we write $8^{1/3}$ then we are expecting three possible values, as calculated above. However, when we
write $\sqrt[3]{8}$ then we are only expecting one value: that delivered by your calculator.
Note the two complex roots are complex conjugates (since $z^3 - 8 = 0$ is a polynomial equation with
real coefficients).
In Example 8 we have worked with the polar form. Precisely the same calculation can be carried
through using the exponential form of a complex number. We take this opportunity to repeat this
calculation but working exclusively in exponential form.
Thus
\[ z^3 = 8 \]
\[ = 8e^{i(0)} \quad \text{(i.e. } r = |8| = 8 \text{ and } \arg(8) = 0) \]
\[ = 8e^{i(2k\pi)} \quad k = 0, \pm 1, \pm 2, \cdots \]
therefore taking cube roots
\[ z = \sqrt[3]{8}\left(e^{i(2k\pi)}\right)^{1/3} \]
\[ = \sqrt[3]{8}\,e^{i2k\pi/3} \quad \text{using De Moivre’s theorem.} \]
Again k can take any integer value (including zero). Any three consecutive values will give the roots.
So if k = 0: $z_0 = 2e^{i0} = 2$
k = 1: $z_1 = 2e^{i2\pi/3} = -1 + i\sqrt{3}$
k = 2: $z_2 = 2e^{i4\pi/3} = -1 - i\sqrt{3}$
These are the three (complex) values of $8^{1/3}$ obtained using the exponential form. Of course at the
end of the calculation we have converted back to standard Cartesian form.
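The root-finding procedure generalises to any n-th root, and a minimal Python sketch of it follows. The function name `nth_roots` is an assumption of this illustration; it takes the real n-th root of the modulus and divides the generalised argument arg(z) + 2kπ by n for n consecutive values of k.

```python
import cmath
import math

def nth_roots(z, n):
    """All n values of z**(1/n), using arg(z) + 2*k*pi for k = 0, 1, ..., n - 1."""
    r, theta = cmath.polar(z)
    root_modulus = r ** (1.0 / n)      # the single real, non-negative n-th root of r
    return [cmath.rect(root_modulus, (theta + 2 * k * math.pi) / n) for k in range(n)]

for root in nth_roots(8, 3):
    print(root, root ** 3)             # each cube should be 8, up to rounding error
```

The same function can be applied to the other problems posed at the start of this subsection, for example `nth_roots(1 + 1j, 2)`.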
Task
Following the procedure outlined in Example 8 obtain the two complex values of
$(1 + i)^{1/2}$.
Begin by obtaining the polar form (using the general form of the argument) of (1 + i):
Your solution
Answer
You should obtain $1 + i = \sqrt{2}(\cos(\frac{\pi}{4} + 2k\pi) + i \sin(\frac{\pi}{4} + 2k\pi))$, k = 0, ±1, ±2, · · ·.
Now take the square root and use De Moivre’s theorem to complete the solution:
Your solution
Answer
You should obtain
\[ z_1 = \sqrt[4]{2}\left(\cos\frac{\pi}{8} + i \sin\frac{\pi}{8}\right) = 1.099 + 0.455i \]
\[ z_2 = \sqrt[4]{2}\left(\cos(\frac{\pi}{8} + \pi) + i \sin(\frac{\pi}{8} + \pi)\right) = -1.099 - 0.455i \]
A good exercise would be to repeat the calculation using the exponential form.
Exercise
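Find all those values of z which satisfy $z^4 + 1 = 0$. Write your values in standard Cartesian form.
Answer
\[ z_0 = \frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}} \quad z_1 = -\frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}} \quad z_2 = -\frac{1}{\sqrt{2}} - \frac{i}{\sqrt{2}} \quad z_3 = \frac{1}{\sqrt{2}} - \frac{i}{\sqrt{2}} \]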
Contents 11
Differentiation
11.1 Introducing Differentiation
11.2 Using a Table of Derivatives
11.3 Higher Derivatives
11.4 Differentiating Products and Quotients
11.5 The Chain Rule
11.6 Parametric Differentiation
11.7 Implicit Differentiation
Learning outcomes
In this Workbook you will learn what a derivative is and how to obtain the derivative of
many commonly occurring functions. You will learn of the relationship between a derivative
and the tangent line to a curve. You will learn something of the limiting process which
arises in many areas of mathematics. You will learn how to use a table of derivatives to
obtain the derivative of simple combinations of functions. Finally, you will learn how to take
higher derivatives.
Introducing
Differentiation 11.1
Introduction
Differentiation is a technique which can be used for analysing the way in which functions change. In
particular, it measures how rapidly a function is changing at any point. In engineering applications
the function may, for example, represent the magnetic field strength of a motor, the voltage across
a capacitor, the temperature of a chemical mix, and it is often important to know how quickly these
quantities change.
In this Section we explain what is meant by the gradient of a curve and introduce differentiation as
a method for finding the gradient at any point.

Prerequisites
Before starting this Section you should . . .
• understand functional notation, e.g. y = f(x)
• be able to calculate the gradient of a straight line
Learning Outcomes
On completion you should be able to . . .
• explain what is meant by the tangent to a curve
• explain what is meant by the gradient of a curve at a point
• calculate the derivative of a number of simple functions from first principles
Workbook 11: Differentiation
1. Drawing tangents
Look at the graph shown in Figure 1a. A and B are two points on the graph, and they have been
joined by a straight line. The straight line segment AB is known as a chord. We have lengthened
the chord on both sides so that it extends beyond both A and B.
[Figure 1a, Figure 1b, Figure 1c: three graphs of the same curve with the point A marked, each showing a straight line through A]
In Figure 1b we have moved point B nearer to point A before drawing the extended chord. Imagine
what would happen if we continue moving B nearer and nearer to A. You can do this for yourself by
drawing additional points on the graph. Eventually, when B coincides with A, the extended chord is
a straight line which just touches the curve at A. This line is now called the tangent to the curve
at A, and is shown in Figure 1c.
If we know the position of two points on the line we can find the gradient of the straight line and
can calculate the gradient of the tangent. We define the gradient of the curve at A to be the
gradient of the tangent there. If this gradient is large at a particular point, the rate at which the
function is changing is large too. If the gradient is small, the rate at which the function is changing
is small. This is illustrated in Figure 2. Because of this, the gradient at A is also known as the
instantaneous rate of change of the curve at A. Recall from your knowledge of the straight line,
that if the line slopes upwards as we look from left to right, the gradient of the line is positive,
whereas if the line slopes downwards, the gradient is negative.
[Figure 2: two graphs in the x-y plane, one with a tangent drawn at P and one with a tangent drawn at Q]
The gradient of the tangent at P is small, so the rate at which the function is changing is small.
The gradient of the tangent at Q is large, so the rate at which the function is changing is large.
HELM (2006): 3
Section 11.1: Introducing Differentiation
Key Point 1
The gradient of the curve at a point, P, is equal to $\tan\theta$ where θ is the angle the tangent line at
P makes with the positive x axis.
[Figure 3: the tangent line at P and the angle θ it makes with the positive x axis]
Task
Draw in, by eye, tangents to the curve shown below, at points A to E. State
whether each tangent has positive, negative or zero gradient.
[Figure: a curve plotted against x with the points A, B, C, D and E marked on it]
Your solution
Answer
A negative, B zero, C positive, D zero, E negative
In the following subsection we will see how to calculate the gradient of a curve precisely.
2. Finding the gradient at a specific point
In this subsection we shall consider a simple function to illustrate the calculation of a gradient. Look
at the graph of the function $y(x) = x^2$ shown in Figure 4. Notice that the gradient of the graph
changes as we move from point to point along the curve. In some places the gradient is positive; at
others it is negative. The gradient is also greater at some points than at others.
[Figure 4: graph of $y = x^2$ for x from −4 to 4, with the points A(1, 1) and B(4, 16) marked; the slope is negative to the left of the y axis and positive to the right]
Inspect the graph carefully and make the following observations:
(a) A is the point with coordinates (1, 1).
(b) B is the point with coordinates (4, 16).
(c) We can calculate the gradient of the line AB from the formula
$$\text{gradient} = \frac{\text{difference between } y \text{ coordinates}}{\text{difference between } x \text{ coordinates}}$$
Therefore the gradient of chord AB is equal to $\frac{16 - 1}{4 - 1} = \frac{15}{3} = 5$. The gradient of AB is not the
same as the gradient of the graph at A but we can regard it as an approximation, or estimate, of the
gradient at A. Is it an over-estimate or an under-estimate?
Task
Add the point C to the graph in Figure 4 where C has coordinates (3, 9). Draw
the line AC and calculate its gradient.
Your solution
gradient =
Answer
$\frac{9 - 1}{3 - 1} = 4$. Would you agree that this is a better estimate of the gradient at A than using AB?
We now carry the last task further by introducing point D at (2, 4) and point E at (1.5, 2.25) as
shown in Figure 5. The gradient of AD is found to be 3 and the gradient of AE is 2.5.
[Figure 5: graph of $y = x^2$ with the points A(1, 1), E(1.5, 2.25), D(2, 4) and B(4, 16) marked]
Observe that each time we carry out this procedure, and move the second point closer to A, the
gradient of the line drawn is getting closer and closer to the gradient of the tangent at A. If we
continue, the value we eventually obtain is the gradient of the tangent at A whose value is 2 as we
will see shortly. This procedure illustrates how we define the gradient of the curve at A.
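The procedure of moving the second point towards A can also be tried numerically. The following is only a small sketch: the helper name `chord_gradient` and the two extra points at x = 1.1 and x = 1.01 are illustrative choices and do not come from the text.

```python
# Chord gradients for y = x^2 from the fixed point A(1, 1) to a second
# point that is moved steadily closer to A.

def y(x):
    return x ** 2

def chord_gradient(x1, x2):
    """Gradient of the chord joining (x1, y(x1)) and (x2, y(x2))."""
    return (y(x2) - y(x1)) / (x2 - x1)

# x coordinates of B, C, D and E from the text, followed by two closer
# points (the last two are extra illustrative values, not from the text).
second_points = [4, 3, 2, 1.5, 1.1, 1.01]
gradients = [chord_gradient(1, x) for x in second_points]

for x, g in zip(second_points, gradients):
    print(f"second point x = {x}: chord gradient = {g:.4f}")
```

The first four values reproduce the gradients of AB, AC, AD and AE, and the later ones continue to approach 2.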
3. Finding the gradient at a general point
We now carry out the previous procedure more mathematically. Consider the graph of $y(x) = x^2$ in
Figure 6. Let point A be any point with coordinates $(a, a^2)$, and let point B be a second point with
x coordinate $(a + h)$.
The y coordinate at A is $a^2$, because A lies on the graph $y = x^2$.
Similarly the y coordinate at B is $(a + h)^2$.
Therefore the gradient of the chord AB is
$$\frac{(a + h)^2 - a^2}{h}$$
This simplifies to
$$\frac{a^2 + 2ha + h^2 - a^2}{h} = \frac{2ha + h^2}{h} = \frac{h(2a + h)}{h} = 2a + h$$
This is the gradient of the line AB. As we let B move closer to A the value of h gets smaller and
smaller and eventually tends to zero. We write this as $h \to 0$.
Now, as $h \to 0$, the gradient of AB tends to $2a$. Thus the gradient of the tangent to the curve at
point A is $2a$. Because A is an arbitrary point, this result gives us a formula for finding the gradient
of the graph of $y = x^2$ at any point: the gradient is simply twice the x coordinate there. For
example when x = 3 the gradient is 2 × 3, that is 6, and when x = 1 the gradient is 2 × 1, that is 2
as we saw in the previous subsection.
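The algebra above can be checked with exact arithmetic. The sketch below assumes the illustrative point a = 3 and a hypothetical helper `gradient_of_chord`; it uses Python's `fractions.Fraction` so that no rounding hides the result.

```python
from fractions import Fraction

def gradient_of_chord(a, h):
    """Gradient of the chord of y = x^2 between x = a and x = a + h."""
    return ((a + h) ** 2 - a ** 2) / h

a = Fraction(3)  # illustrative choice of the point A
for h in [Fraction(1), Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)]:
    g = gradient_of_chord(a, h)
    # Exact arithmetic lets us confirm the simplified form 2a + h.
    assert g == 2 * a + h
    print(f"h = {h}: gradient of AB = {g} = {float(g)}")
```

As h shrinks, the printed gradients approach 2a = 6, in line with the limit described in the text.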
[Figure 6: graph of $y = x^2$ with the points $A(a, a^2)$ and $B(a + h, (a + h)^2)$ marked, a horizontal distance h apart]
Generally, at a point whose coordinate is x the gradient is given by $2x$. The function $2x$, which
gives the gradient of $y = x^2$, is called the derivative of y with respect to x. It has other names too
including the rate of change of y with respect to x.
A special notation is used to represent the derivative. It is not a particularly user-friendly notation
but it is important to get used to it anyway. We write the derivative as $\frac{dy}{dx}$, pronounced ‘dee y over
dee x’ or ‘dee y by dee x’ or even ‘dee y, dee x’.
$\frac{dy}{dx}$ is not a fraction - so you can’t do things like cancel the d’s - just remember that it is the symbol
or notation for the derivative. An alternative notation for the derivative is $y'$.
Key Point 2
The derivative of $y(x)$ is written $\frac{dy}{dx}$ or $y'(x)$ or simply $y'$
Exercises
1. Carry out the procedure above for the function $y = 3x^2$:
(a) Let A be the point $(a, 3a^2)$.
(b) Let B be the point $(a + h, 3(a + h)^2)$.
(c) Find the gradient of the line AB.
(d) Let $h \to 0$ to find the gradient of the curve at A.
2. Carry out the procedure above for the function $y = x^3$:
(a) Let A be the point $(a, a^3)$.
(b) Let B be the point $(a + h, (a + h)^3)$.
(c) Find the gradient of the line AB.
(d) Let $h \to 0$ to find the gradient of the curve at A.
Answers
1. gradient AB = $6a + 3h$, gradient at A = $6a$. So, if $y = 3x^2$, $\frac{dy}{dx} = 6x$.
2. gradient AB = $3a^2 + 3ah + h^2$, gradient at A = $3a^2$. So, if $y = x^3$, $\frac{dy}{dx} = 3x^2$.
4. Differentiation of a general function from first principles
Consider the graph of y = f (x) shown in Figure 7.
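[Two graphs of $y = f(x)$: the first shows the chord through A at $(x, f(x))$ and B at $(x + h, f(x + h))$, a horizontal distance h apart; the second shows the tangent at A]
Figure 7: As $h \to 0$ the chord AB becomes the tangent at A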
Carefully make the following observations:

(a) Point A has coordinates $(x, f(x))$.
(b) Point B has coordinates $(x + h, f(x + h))$.
(c) The straight line AB has gradient
$$\frac{f(x + h) - f(x)}{h}$$
(d) If we let $h \to 0$ we can find the gradient of the graph of $y = f(x)$ at the arbitrary
point A, provided we can evaluate the appropriate limit on h. The resulting limit is the
derivative of f with respect to x and is written $\frac{df}{dx}$ or $f'(x)$.
Key Point 3
Definition of Derivative
Given $y = f(x)$, its derivative is defined as
$$\frac{df}{dx} = \frac{f(x + h) - f(x)}{h} \quad \text{in the limit as } h \text{ tends to } 0.$$
This is written
$$\frac{df}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
In a graphical context, the value of $\frac{df}{dx}$ at A is equal to $\tan\theta$, which is the tangent of the angle that
the tangent line makes with the positive x-axis.
Example 1
Differentiate $f(x) = x^2 + 2x + 3$ from first principles.
Solution
$$\begin{aligned}
\frac{df}{dx} &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{[(x + h)^2 + 2(x + h) + 3] - [x^2 + 2x + 3]}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 + 2x + 2h + 3 - x^2 - 2x - 3}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2 + 2h}{h} \\
&= \lim_{h \to 0} \{2x + h + 2\} \\
&= 2x + 2
\end{aligned}$$
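The limit in Example 1 can be explored numerically by evaluating the difference quotient for smaller and smaller h. This is only an illustration under assumed values: the function name `difference_quotient` and the sample point x = 1.5 are not from the text.

```python
def difference_quotient(f, x, h):
    """The quantity (f(x + h) - f(x)) / h from the definition of the derivative."""
    return (f(x + h) - f(x)) / h

def f(x):
    return x ** 2 + 2 * x + 3

x = 1.5  # illustrative point
for h in [0.1, 0.01, 0.001, 0.0001]:
    print(f"h = {h}: difference quotient = {difference_quotient(f, x, h):.6f}")
print("2x + 2 =", 2 * x + 2)
```

For this function the difference quotient equals 2x + 2 + h, so each printed value differs from 2x + 2 by roughly h.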
Exercises
1. Use the definition of the derivative to find $\frac{df}{dx}$ when
(a) $f(x) = 4x^2$, (b) $f(x) = 2x^3$, (c) $f(x) = 7x + 3$, (d) $f(x) = \frac{1}{x}$.
(Harder: try (e) $f(x) = \sin x$ and use the small angle approximation $\sin\theta \approx \theta$ if θ is small and
measured in radians.)
2. Using your results from Exercise 1 calculate the gradient of the following graphs at the given
points:
(a) $f(x) = 4x^2$ at $x = -2$, (b) $f(x) = 2x^3$ at $x = 2$, (c) $f(x) = 7x + 3$ at $x = -5$,
(d) $f(x) = \frac{1}{x}$ at $x = 1/2$.
3. Find the rate of change of the function $y(x) = \frac{x}{x + 3}$ at $x = 3$ by considering the interval
$x = 3$ to $x = 3 + h$.
Answers
1. (a) $8x$, (b) $6x^2$, (c) 7, (d) $-1/x^2$, (e) $\cos x$.
2. (a) −16, (b) 24, (c) 7, (d) −4
3. 1/12
Using a Table of Derivatives 11.2
Introduction
In Section 11.1 you were introduced to the idea of a derivative and you calculated some derivatives
from first principles. Rather than calculating the derivative of a function from first principles it is
common practice to use a table of derivatives. This Section provides such a table and shows you
how to use it.
Prerequisites
Before starting this Section you should . . .
• understand the meaning of the term ‘derivative’
• understand what is meant by the notation $\frac{dy}{dx}$

Learning Outcomes
On completion you should be able to . . .
• use a table of derivatives to perform differentiation

1. Table of derivatives
Table 1 lists some of the common functions used in engineering and their corresponding derivatives.
Remember that in each case the function in the right-hand column gives the rate of change, or the
gradient of the graph, of the function on the left at a particular value of x.
N.B. The angle must always be in radians when differentiating trigonometric functions.
Table 1
Common functions and their derivatives
(In this table k, n and c are constants)
Function            Derivative
constant            0
$x$                 1
$kx$                $k$
$x^n$               $nx^{n-1}$
$kx^n$              $knx^{n-1}$
$e^x$               $e^x$
$e^{kx}$            $ke^{kx}$
$\ln x$             $1/x$
$\ln kx$            $1/x$
$\sin x$            $\cos x$
$\sin kx$           $k\cos kx$
$\sin(kx + c)$      $k\cos(kx + c)$
$\cos x$            $-\sin x$
$\cos kx$           $-k\sin kx$
$\cos(kx + c)$      $-k\sin(kx + c)$
$\tan x$            $\sec^2 x$
$\tan kx$           $k\sec^2 kx$
$\tan(kx + c)$      $k\sec^2(kx + c)$
In the trigonometric functions the angle is in radians.
Key Point 4
Particularly important is the rule for differentiating powers of x:
If $y = x^n$ then $\frac{dy}{dx} = nx^{n-1}$
For example, if $y = x^3$ then $\frac{dy}{dx} = 3x^2$.
Example 2
Use Table 1 to find $\frac{dy}{dx}$ when y is given by (a) $7x$ (b) 14 (c) $5x^2$ (d) $4x^7$
Solution
(a) We note that $7x$ is of the form $kx$ where $k = 7$. Using Table 1 we then have $\frac{dy}{dx} = 7$.
(b) Noting that 14 is a constant we see that $\frac{dy}{dx} = 0$.
(c) We see that $5x^2$ is of the form $kx^n$, with $k = 5$ and $n = 2$. The derivative, $knx^{n-1}$, is then
$10x^1$, or more simply, $10x$. So if $y = 5x^2$, then $\frac{dy}{dx} = 10x$.
(d) We see that $4x^7$ is of the form $kx^n$, with $k = 4$ and $n = 7$. Hence the derivative, $\frac{dy}{dx}$, is
given by $28x^6$.
Task
Use Table 1 to find $\frac{dy}{dx}$ when y is (a) $\sqrt{x}$ (b) $\frac{5}{x^3}$
(a) Write $\sqrt{x}$ as $x^{1/2}$, and use the result for differentiating $x^n$ with $n = \frac{1}{2}$.
Your solution
Answer
$\frac{dy}{dx} = nx^{n-1} = \frac{1}{2}x^{\frac{1}{2} - 1} = \frac{1}{2}x^{-\frac{1}{2}}$. This may be written as $\frac{1}{2\sqrt{x}}$.
(b) Write $\frac{5}{x^3}$ as $5x^{-3}$ and use the result for differentiating $kx^n$ with $k = 5$ and $n = -3$.
Your solution
Answer
$5(-3)x^{-3-1} = -15x^{-4}$
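The table entry for $kx^n$ is mechanical enough to express as a short function. The sketch below is illustrative only: the name `differentiate_power` and the choice to represent a term by the pair (k, n) are assumptions made for this example.

```python
from fractions import Fraction

def differentiate_power(k, n):
    """Return (coefficient, exponent) of the derivative of k * x**n,
    that is, the pair (k*n, n - 1) from the rule k n x^(n-1)."""
    return (k * n, n - 1)

print(differentiate_power(5, 2))                # 5x^2
print(differentiate_power(4, 7))                # 4x^7
print(differentiate_power(1, Fraction(1, 2)))   # sqrt(x) written as x^(1/2)
print(differentiate_power(5, -3))               # 5/x^3 written as 5x^(-3)
```

Using `Fraction` for the exponent keeps $x^{1/2}$ exact, so the result for the square root reads as coefficient 1/2 and exponent −1/2.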
Although Table 1 is written using x as the independent variable, the Table can be used for any
variable.
Task
Use Table 1 to find
(a) $\frac{dz}{dt}$ given $z = e^t$ (b) $\frac{dp}{dt}$ given $p = e^{8t}$ (c) $\frac{dz}{dy}$ given $z = e^{-3y}$
(a)
Your solution
$\frac{dz}{dt} =$
Answer
From Table 1, if $y = e^x$, then $\frac{dy}{dx} = e^x$. Hence if $z = e^t$ then $\frac{dz}{dt} = e^t$.
(b)
Your solution
$\frac{dp}{dt} =$
Answer
$8e^{8t}$
(c)
Your solution
$\frac{dz}{dy} =$
Answer
$-3e^{-3y}$
Task
Find the derivative, $\frac{dy}{dx}$, when y is (a) $\sin 2x$ (b) $\cos\frac{x}{2}$ (c) $\tan 5x$
(a) Use the result for $\sin kx$ in Table 1, taking $k = 2$:
Your solution
$\frac{dy}{dx} =$
Answer
$2\cos 2x$
(b) Note that $\cos\frac{x}{2}$ is the same as $\cos\frac{1}{2}x$. Use the result for $\cos kx$ in Table 1:
Your solution
$\frac{dy}{dx} =$
Answer
$-\frac{1}{2}\sin\frac{x}{2}$
(c) Use the result for $\tan kx$ in Table 1:
Your solution
$\frac{dy}{dx} =$
Answer
$5\sec^2 5x$
Exercises
1. Find the derivatives of the following functions with respect to x:
(a) $9x^2$ (b) 5 (c) $6x^3$ (d) $-13x^4$
2. Find $\frac{dz}{dt}$ when z is given by:
(a) $\frac{5}{t^3}$ (b) $\sqrt{t^3}$ (c) $5t^{-2}$ (d) $-\frac{3}{2}t^{3/2}$ (e) $\ln 5t$
3. Find the derivative of each of the following with respect to the appropriate variable:
(a) $\sin 5x$ (b) $\cos 4t$ (c) $\tan 3r$ (d) $e^{2v}$ (e) $\frac{1}{e^{3t}}$
4. Find the derivatives of the following with respect to x:
2x x
(a) cos (b) sin(−2x) (c) tan πx (d) e 2 (e) ln 32 x
3
Answers
1. (a) $18x$ (b) 0 (c) $18x^2$ (d) $-52x^3$
2. (a) $-15t^{-4}$ (b) $\frac{3}{2}t^{1/2}$ (c) $-10t^{-3}$ (d) $-\frac{9}{4}t^{1/2}$ (e) $\frac{1}{t}$
3. (a) $5\cos 5x$ (b) $-4\sin 4t$ (c) $3\sec^2 3r$ (d) $2e^{2v}$ (e) $-3e^{-3t}$
4. (a) $-\frac{2}{3}\sin\frac{2x}{3}$ (b) $-2\cos(-2x)$ (c) $\pi\sec^2 \pi x$ (d) $\frac{1}{2}e^{x/2}$ (e) $\frac{1}{x}$
Electrostatic potential
Introduction
The electrostatic potential due to a point charge Q coulombs at a position r (m) from the charge is
given by
$$V = \frac{Q}{4\pi\varepsilon_0 r}$$
where $\varepsilon_0$, the permittivity of free space, ≈ $8.85 \times 10^{-12}$ F m$^{-1}$ and π ≈ 3.14.
The field strength at position r is given by $E = -\frac{dV}{dr}$.
Problem in words
Find the electric field strength at a distance of 5 m from a source with a charge of 1 coulomb.

$$V = \frac{Q}{4\pi\varepsilon_0 r}$$
Substitute for $\varepsilon_0$ and π and use Q = 1, so that
$$V = \frac{1}{4 \times 3.14 \times 8.85 \times 10^{-12}\, r} \approx \frac{9 \times 10^9}{r} = 9 \times 10^9 r^{-1}$$
We need to differentiate V in order to find the electric field strength from the relationship $E = -\frac{dV}{dr}$:
$$E = -\frac{dV}{dr} = -9 \times 10^9 (-r^{-2}) = 9 \times 10^9 r^{-2}$$
When r = 5,
$$E = \frac{9 \times 10^9}{25} = 3.6 \times 10^8 \ (\text{V m}^{-1})$$
Interpretation
The electric field strength is $3.6 \times 10^8$ V m$^{-1}$ at r = 5 m.
Note that the field potential varies with the reciprocal of distance (i.e. inverse linear law with distance)
whereas the field strength obeys an inverse square law with distance.
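The calculation can be reproduced without rounding the constant to $9 \times 10^9$. The following is a sketch only: the function names and the step size `dr` are assumptions made for this illustration, and the numerical derivative is included simply as a cross-check on the differentiated formula.

```python
import math

EPSILON_0 = 8.85e-12  # permittivity of free space in F/m, value quoted in the text

def potential(q, r):
    """Electrostatic potential V = Q / (4 pi epsilon_0 r), in volts."""
    return q / (4 * math.pi * EPSILON_0 * r)

def field_strength(q, r):
    """E = -dV/dr = Q / (4 pi epsilon_0 r^2), in V/m."""
    return q / (4 * math.pi * EPSILON_0 * r ** 2)

E = field_strength(1.0, 5.0)
print(f"E at r = 5 m: {E:.3e} V/m")

# Cross-check against a central-difference estimate of -dV/dr.
dr = 1e-6  # illustrative step size
E_numeric = -(potential(1.0, 5.0 + dr) - potential(1.0, 5.0 - dr)) / (2 * dr)
print(f"numerical -dV/dr: {E_numeric:.3e} V/m")
```

Both values agree with the rounded figure of $3.6 \times 10^8$ V m$^{-1}$ to within about one part in a thousand.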
2. Extending the table of derivatives

We now quote simple rules which enable us to extend the range of functions which we can differentiate.
The first two rules are for differentiating sums or differences of functions. The reader should note
that all of the rules quoted below can be obtained from first principles using the approach outlined
in Section 11.1.
Key Point 5
Rule 1: The derivative of $f(x) + g(x)$ is $\frac{df}{dx} + \frac{dg}{dx}$
Rule 2: The derivative of $f(x) - g(x)$ is $\frac{df}{dx} - \frac{dg}{dx}$
These rules say that to find the derivative of the sum (or difference) of two functions, we simply
calculate the sum (or difference) of the derivatives of each function.
Example 3
Find the derivative of $y = x^6 + x^4$.
Solution
We simply calculate the sum of the derivatives of each separate function:
$$\frac{dy}{dx} = 6x^5 + 4x^3$$
The third rule tells us how to differentiate a multiple of a function. We have already met and applied
particular cases of this rule which appear in Table 1.
Key Point 6
Rule 3: If k is a constant, the derivative of $kf(x)$ is $k\frac{df}{dx}$
This rule tells us that if a function is multiplied by a constant, k, then the derivative is also multiplied
by the same constant, k.
Example 4
Find the derivative of $y = 8e^{2x}$
Solution
Here we are interested in differentiating a multiple of the function $e^{2x}$. We differentiate $e^{2x}$, giving
$2e^{2x}$, and multiply the result by 8. Thus
$$\frac{dy}{dx} = 8 \times 2e^{2x} = 16e^{2x}$$
Example 5
Find the derivative of $y = 6\sin 2x + 3x^2 - 5e^{3x}$
Solution
We differentiate each part of the function in turn.
$$y = 6\sin 2x + 3x^2 - 5e^{3x}$$
$$\frac{dy}{dx} = 6(2\cos 2x) + 3(2x) - 5(3e^{3x}) = 12\cos 2x + 6x - 15e^{3x}$$
Task
Find $\frac{dy}{dx}$ where $y = 7x^5 - 3e^{5x}$.
First find the derivative of $7x^5$:
Your solution
Answer
$7(5x^4) = 35x^4$
Next find the derivative of $3e^{5x}$:
Your solution
Answer
$3(5e^{5x}) = 15e^{5x}$
Combine your results to find the derivative of $7x^5 - 3e^{5x}$:
Your solution
$\frac{dy}{dx} =$
Answer
$35x^4 - 15e^{5x}$
Task
Find $\frac{dy}{dx}$ where $y = 4\cos\frac{x}{2} + 17 - 9x^3$.
First find the derivative of $4\cos\frac{x}{2}$:
Your solution
Answer
$4\left(-\frac{1}{2}\sin\frac{x}{2}\right) = -2\sin\frac{x}{2}$
Next find the derivative of 17:
Your solution
Answer
0
Then find the derivative of $-9x^3$:
Your solution
Answer
$3(-9x^2) = -27x^2$
Finally state the derivative of $y = 4\cos\frac{x}{2} + 17 - 9x^3$:
Your solution
$\frac{dy}{dx} =$
Answer
$-2\sin\frac{x}{2} - 27x^2$
Exercises
1. Find $\frac{dy}{dx}$ when y is given by:
(a) $3x^7 + 8x^3$ (b) $-3x^4 + 2x^{1.5}$ (c) $\frac{9}{x^2} + \frac{14}{x} - 3x$ (d) $\frac{3 + 2x}{4}$ (e) $(2 + 3x)^2$
2. Find the derivative of each of the following functions:
(a) $z(t) = 5\sin t + \sin 5t$ (b) $h(v) = 3\cos 2v - 6\sin\frac{v}{2}$
(c) $m(n) = 4e^{2n} + \frac{2}{e^{2n}} + \frac{n^2}{2}$ (d) $H(t) = \frac{e^{3t}}{2} + 2\tan 2t$ (e) $S(r) = (r^2 + 1)^2 - 4e^{-2r}$
3. Differentiate the following functions.
(a) $A(t) = (3 + e^t)^2$ (b) $B(s) = \pi e^{2s} + \frac{1}{s} + 2\sin \pi s$
(c) $V(r) = \left(1 + \frac{1}{r}\right)^2 + (r + 1)^2$ (d) $M(\theta) = 6\sin 2\theta - 2\cos\frac{\theta}{4} + 2\theta^2$
(e) $H(t) = 4\tan 3t + 3\sin 2t - 2\cos 4t$
Answers
1. (a) $21x^6 + 24x^2$ (b) $-12x^3 + 3x^{0.5}$ (c) $-\frac{18}{x^3} - \frac{14}{x^2} - 3$ (d) $\frac{1}{2}$ (e) $12 + 18x$
2. (a) $z' = 5\cos t + 5\cos 5t$ (b) $h' = -6\sin 2v - 3\cos\frac{v}{2}$ (c) $m' = 8e^{2n} - 4e^{-2n} + n$
(d) $H' = \frac{3e^{3t}}{2} + 4\sec^2 2t$ (e) $S' = 4r^3 + 4r + 8e^{-2r}$
3. (a) $A' = 6e^t + 2e^{2t}$ (b) $B' = 2\pi e^{2s} - \frac{1}{s^2} + 2\pi\cos(\pi s)$
(c) $V' = -\frac{2}{r^2} - \frac{2}{r^3} + 2r + 2$ (d) $M' = 12\cos 2\theta + \frac{1}{2}\sin\frac{\theta}{4} + 4\theta$
(e) $H' = 12\sec^2 3t + 6\cos 2t + 8\sin 4t$
3. Evaluating a derivative
The need to find the rate of change of a function at a particular point occurs often. We do this by
finding the derivative of the function, and then evaluating the derivative at that point. When taking
derivatives of trigonometric functions, any angles must be measured in radians. Consider a function,
$y(x)$. We use the notation $\frac{dy}{dx}(a)$ or $y'(a)$ to denote the derivative of y evaluated at $x = a$. So
$y'(0.5)$ means the value of the derivative of y when $x = 0.5$.
Example 6
Find the value of the derivative of $y = x^3$ where $x = 2$. Interpret your result.
Solution
We have $y = x^3$ and so $\frac{dy}{dx} = 3x^2$.
When $x = 2$, $\frac{dy}{dx} = 3(2)^2 = 12$, that is, $\frac{dy}{dx}(2) = 12$ (equivalently, $y'(2) = 12$).
The derivative is positive when x = 2 and so y is increasing at this point. When x = 2, y is
increasing at a rate of 12 vertical units per horizontal unit.
Electromotive force
Introduction
Potential difference in an electrical circuit is produced by electromotive force (e.m.f.) which is
measured in volts and describes the force that maintains current flow around a closed path. Every
source of continuous electrical energy, including batteries, generators and thermocouples, consists
essentially of an energy converter that produces an e.m.f. An electric current always produces a
magnetic field. So the current i which flows round any closed path produces a magnetic flux φ which
passes through that path. Conversely, if another closed path, i.e. another coil, is placed within the
first path, then the magnetic field due to the first circuit can induce an e.m.f. and hence a current
in the second coil. The simplest closed path is a single loop. More commonly, helical coils, known
as search coils, with known area and number of turns, are used. The induced e.m.f. depends upon
the number of turns in the coil. The search coil is used with a fluxmeter to measure the change of
flux linkage.
Problem in words
A current i is travelling through a single turn loop of radius 1 m. A 4-turn search coil of effective
area 0.03 m$^2$ is placed inside the loop. The magnetic flux in weber (Wb) linking the search coil is
given by:
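$$\phi = \mu_0 \frac{iA}{2r}$$
where r (m) is the radius of the current carrying loop, A (m$^2$) is the area of the search coil and $\mu_0$
is the permeability of free space, $4\pi \times 10^{-7}$ H m$^{-1}$.

Find the e.m.f. (in volts) induced in the search coil, given by $\varepsilon = -N\frac{d\phi}{dt}$ where N is the number
of turns in the search coil, and the current is given by
$$i = 20\sin(20\pi t) + 50\sin(30\pi t)$$

Substitute $i = 20\sin(20\pi t) + 50\sin(30\pi t)$ into $\phi = \mu_0 \frac{iA}{2r}$ and find $\varepsilon = -N\frac{d\phi}{dt}$ when r = 1 m,
A = 0.03 m$^2$, $\mu_0 = 4\pi \times 10^{-7}$ H m$^{-1}$ and N = 4.
$$\phi = \mu_0 \frac{iA}{2r} = \frac{\mu_0 A}{2r}\left(20\sin(20\pi t) + 50\sin(30\pi t)\right)$$
So
$$\frac{d\phi}{dt} = \frac{\mu_0 A}{2r}\left(20\pi \times 20\cos(20\pi t) + 30\pi \times 50\cos(30\pi t)\right) = \frac{\mu_0 A}{2r}\left(400\pi\cos(20\pi t) + 1500\pi\cos(30\pi t)\right)$$
so
$$\varepsilon = -N\frac{d\phi}{dt} = -4\,\frac{\mu_0 A}{2r}\left(400\pi\cos(20\pi t) + 1500\pi\cos(30\pi t)\right)$$
Now, $\mu_0 = 4\pi \times 10^{-7}$, A = 0.03 and r = 1
So
$$\begin{aligned}
\varepsilon &= \frac{-4 \times 4\pi \times 10^{-7} \times 0.03}{2 \times 1}\left(400\pi\cos(20\pi t) + 1500\pi\cos(30\pi t)\right) \\
&= -7.5398 \times 10^{-8}\left(1256.64\cos(20\pi t) + 4712.39\cos(30\pi t)\right) \\
&= -9.475 \times 10^{-5}\cos(20\pi t) - 3.553 \times 10^{-4}\cos(30\pi t)
\end{aligned}$$
Interpretation
The induced e.m.f. is $-9.475 \times 10^{-5}\cos(20\pi t) - 3.553 \times 10^{-4}\cos(30\pi t)$ volts.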
The graphs in Figure 8 show the initial current in the single loop and the e.m.f. induced in the search
coil.
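[Figure 8: two graphs against time from 0 to 1.0: the current i(t), on a vertical scale from −100 to 100, and the induced e.m.f., emf(t) × 10^4, on a vertical scale from −6 to 6]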
Note that the induced e.m.f. does not start at zero, which the initial current does, and has a different
pattern of variation with time.
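The coefficients in the induced e.m.f. can be recomputed directly from the data of the problem. The sketch below is illustrative: the constant and function names are assumptions for this example, and the derivative of the current is entered by hand from the table entry for sin kt rather than computed symbolically.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, H/m
AREA = 0.03                # search coil area, m^2
RADIUS = 1.0               # radius of the current-carrying loop, m
TURNS = 4                  # number of turns in the search coil

def current(t):
    """Loop current i(t) = 20 sin(20 pi t) + 50 sin(30 pi t)."""
    return 20 * math.sin(20 * math.pi * t) + 50 * math.sin(30 * math.pi * t)

def current_rate(t):
    """di/dt, using the table entry for sin(kt)."""
    return (400 * math.pi * math.cos(20 * math.pi * t)
            + 1500 * math.pi * math.cos(30 * math.pi * t))

def emf(t):
    """Induced e.m.f. = -N d(phi)/dt with phi = mu_0 i A / (2 r)."""
    return -TURNS * MU_0 * AREA / (2 * RADIUS) * current_rate(t)

prefactor = -TURNS * MU_0 * AREA / (2 * RADIUS)
coeff_20 = prefactor * 400 * math.pi   # multiplies cos(20 pi t)
coeff_30 = prefactor * 1500 * math.pi  # multiplies cos(30 pi t)
print(f"coefficient of cos(20 pi t): {coeff_20:.4e}")
print(f"coefficient of cos(30 pi t): {coeff_30:.4e}")
print(f"i(0) = {current(0.0)}, emf(0) = {emf(0.0):.4e} V")
```

Because the common prefactor is negative, both cosine coefficients come out negative, and the e.m.f. at t = 0 is non-zero even though the current starts at zero.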
Exercises
1. Calculate the derivative of $y = x^2 + \sin x$ when x = 0.2 radians.
2. Calculate the rate of change of $i(t) = 4\sin 2t + 3t$ when
(a) $t = \frac{\pi}{3}$ (b) t = 0.6 radians
3. Calculate the rate of change of $F(t) = 5\sin t - 3\cos 2t$ when
(a) t = 0 radians (b) t = 1.3 radians
Answers
1. 1.380
2. (a) −1 (b) 5.8989
3. (a) 5 (b) 4.4305

Higher Derivatives 11.3

Introduction
The derivative, $\frac{dy}{dx}$, is more expressly called the first derivative of y. By differentiating the first
derivative, we obtain the second derivative; by differentiating the second derivative we obtain
the third derivative and so on. These second and subsequent derivatives are known as higher
derivatives. Second derivatives in particular occur frequently in engineering contexts.

Prerequisites
Before starting this Section you should . . .
• be able to differentiate standard functions

Learning Outcomes
On completion you should be able to . . .
• obtain higher derivatives


1. The derivative of a derivative

You have already learnt how to calculate the derivative of a function using a table of derivatives and
applying some basic rules. By differentiating the function, $y(x)$, we obtain the derivative, $\frac{dy}{dx}$. By
repeating the process we can obtain higher derivatives.
Example 7
Calculate the first, second and third derivatives of $y = x^4 + 6x^2$.
Solution
The first derivative is $\frac{dy}{dx}$:
first derivative = $4x^3 + 12x$
To obtain the second derivative we differentiate the first derivative.
second derivative = $12x^2 + 12$
The third derivative is found by differentiating the second derivative.
third derivative = $24x + 0 = 24x$
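For a polynomial, repeated differentiation can be carried out on the list of coefficients. This is a minimal sketch: the list representation (constant term first) and the helper name `differentiate` are assumptions made for this illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...],
    meaning c0 + c1*x + c2*x**2 + ...; returns the same representation."""
    return [n * c for n, c in enumerate(coeffs)][1:]

y = [0, 0, 6, 0, 1]             # x^4 + 6x^2
first = differentiate(y)        # coefficients of 4x^3 + 12x
second = differentiate(first)   # coefficients of 12x^2 + 12
third = differentiate(second)   # coefficients of 24x

print(first, second, third)
```

Each call applies the power rule term by term, so calling it three times gives the first, second and third derivatives in turn.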
2. Notation for derivatives

Just as there is a notation for the first derivative so there is a similar notation for higher derivatives.
Consider the function, $y(x)$. We know that the first derivative is $\frac{dy}{dx}$ or $\frac{d}{dx}(y)$ which is the instruction
to differentiate the function $y(x)$. The second derivative is calculated by differentiating the first
derivative, that is
$$\text{second derivative} = \frac{d}{dx}\left(\frac{dy}{dx}\right)$$
So, using a fairly obvious adaptation of our derivative notation, the second derivative is denoted by
$\frac{d^2y}{dx^2}$ and is read as ‘dee two y by dee x squared’. This is often written more concisely as $y''$.
In similar manner, the third derivative is denoted by $\frac{d^3y}{dx^3}$ or $y'''$ and so on. So, referring to Example
7 we could have written
$$\text{first derivative} = \frac{dy}{dx} = 4x^3 + 12x$$
$$\text{second derivative} = \frac{d^2y}{dx^2} = 12x^2 + 12$$
$$\text{third derivative} = \frac{d^3y}{dx^3} = 24x$$
Key Point 7
If $y = y(x)$ then its first, second and third derivatives are denoted by:
$\frac{dy}{dx}$, $\frac{d^2y}{dx^2}$, $\frac{d^3y}{dx^3}$
or
$y'$, $y''$, $y'''$
In most examples we use x to denote the independent variable and y the dependent variable. However,
in many applications, time t is the independent variable. In this case a special notation is used for
derivatives. Derivatives with respect to t are often indicated using a dot notation, so $\frac{dy}{dt}$ can be
written as $\dot{y}$, pronounced ‘y dot’. Similarly, a second derivative with respect to t can be written as
$\ddot{y}$, pronounced ‘y double dot’.
Key Point 8
If $y = y(t)$ then
$\dot{y}$ stands for $\frac{dy}{dt}$, $\ddot{y}$ stands for $\frac{d^2y}{dt^2}$, etc
Task
Calculate $\frac{d^2y}{dt^2}$ and $\frac{d^3y}{dt^3}$ given $y = e^{2t} + \cos t$.
First find $\frac{dy}{dt}$:
Your solution
Answer
$\frac{dy}{dt} = 2e^{2t} - \sin t$
Now obtain the second derivative:
Your solution
$\frac{d^2y}{dt^2} =$
Answer
$4e^{2t} - \cos t$
Finally, obtain the third derivative:
Your solution
$\frac{d^3y}{dt^3} = \frac{d}{dt}\left(\frac{d^2y}{dt^2}\right) =$
Answer
$8e^{2t} + \sin t$
Note that in the last Task we could have used the dot notation and written $\dot{y} = 2e^{2t} - \sin t$,
$\ddot{y} = 4e^{2t} - \cos t$ and $\dddot{y} = 8e^{2t} + \sin t$.
We may need to evaluate higher derivatives at specific points. We use an obvious notation.
The second derivative of $y(x)$, evaluated at say, $x = 2$, is written as $\frac{d^2y}{dx^2}(2)$, or more simply as $y''(2)$.
The third derivative evaluated at $x = -1$ is written as $\frac{d^3y}{dx^3}(-1)$ or $y'''(-1)$.
Task
Given $y(x) = 2\sin x + 3x^2$ find (a) $y'(1)$ (b) $y''(-1)$ (c) $y'''(0)$
First find $y'(x)$, $y''(x)$ and $y'''(x)$:
Your solution
$y'(x) =$    $y''(x) =$    $y'''(x) =$
Answer
$y'(x) = 2\cos x + 6x$    $y''(x) = -2\sin x + 6$    $y'''(x) = -2\cos x$
Now substitute $x = 1$ in $y'(x)$ to obtain $y'(1)$:
Your solution
(a) $y'(1) =$
Answer
$y'(1) = 2\cos 1 + 6(1) = 7.0806$. Remember, in cos 1 the ‘1’ is 1 radian.
Now find $y''(-1)$:
Your solution
(b) $y''(-1) =$
Answer
$y''(-1) = -2\sin(-1) + 6 = 7.6829$
Finally, find $y'''(0)$:
Your solution
(c) $y'''(0) =$
Answer
$y'''(0) = -2\cos 0 = -2$.
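The three evaluations in this Task are easy to confirm numerically. In the sketch below the function names `y1`, `y2` and `y3` are illustrative labels for the first, second and third derivatives; Python's `math.sin` and `math.cos` take their arguments in radians, which matches the convention used here.

```python
import math

def y1(x):
    """First derivative of y = 2 sin x + 3x^2."""
    return 2 * math.cos(x) + 6 * x

def y2(x):
    """Second derivative."""
    return -2 * math.sin(x) + 6

def y3(x):
    """Third derivative."""
    return -2 * math.cos(x)

print(round(y1(1), 4), round(y2(-1), 4), round(y3(0), 4))
```

The printed values match the answers 7.0806, 7.6829 and −2 obtained above.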
Exercises
1. Find $\frac{d^2y}{dx^2}$ where $y(x)$ is defined by:
(a) $3x^2 - e^{2x}$ (b) $\sin 3x + \cos x$ (c) $\sqrt{x}$ (d) $e^x + e^{-x}$ (e) $1 + x + x^2 + \ln x$
2. Find $\frac{d^3y}{dx^3}$ where y is given in Exercise 1.
3. Calculate $\ddot{y}(1)$ where $y(t)$ is given by:
(a) $t(t^2 + 1)$ (b) $\sin(-2t)$ (c) $2e^t + e^{2t}$ (d) $\frac{1}{t}$ (e) $\cos\frac{t}{2}$
4. Calculate $\dddot{y}(-1)$ for the functions given in Exercise 3.
Answers
1. (a) $6 - 4e^{2x}$ (b) $-9\sin 3x - \cos x$ (c) $-\frac{1}{4}x^{-3/2}$ (d) $e^x + e^{-x}$ (e) $2 - \frac{1}{x^2}$
2. (a) $-8e^{2x}$ (b) $-27\cos 3x + \sin x$ (c) $\frac{3}{8}x^{-5/2}$ (d) $e^x - e^{-x}$ (e) $\frac{2}{x^3}$
3. (a) 6 (b) 3.6372 (c) 34.9927 (d) 2 (e) −0.2194
4. (a) 6 (b) −3.3292 (c) 1.8184 (d) −6 (e) −0.0599
Differentiating Products and Quotients 11.4
Introduction
We have seen, in the first three Sections, how standard functions like $x^n$, $e^{ax}$, $\sin ax$, $\cos ax$, $\ln ax$
may be differentiated.
In this Section we see how more complicated functions may be differentiated. We concentrate, for
the moment, on products and quotients of standard functions like $x^n e^{ax}$, $\frac{e^{ax}\ln x}{\sin x}$.
We will see that two simple rules may be consistently employed to obtain the derivatives of such
functions.
Prerequisites
Before starting this Section you should . . .
• be able to differentiate the standard functions: logarithms, polynomials, exponentials, and trigonometric functions
• be able to manipulate algebraic expressions

Learning Outcomes
On completion you should be able to . . .
• differentiate products and quotients of the standard functions
• differentiate a quotient using the product rule

1. Differentiating a product
In previous Sections we have examined the process of differentiating functions. We found how to
obtain the derivative of many commonly occurring functions. These are recorded in the following
table (remember, arguments of trigonometric functions are assumed to be in radians).
Table 2
$y$                 $\frac{dy}{dx}$
$x^n$               $nx^{n-1}$
$\sin ax$           $a\cos ax$
$\cos ax$           $-a\sin ax$
$\tan ax$           $a\sec^2 ax$
$\sec ax$           $a\sec ax\tan ax$
$\ln ax$            $\frac{1}{x}$
$e^{ax}$            $ae^{ax}$
$\cosh ax$          $a\sinh ax$
$\sinh ax$          $a\cosh ax$
In this Section we consider how to differentiate non-standard functions - in particular those which can
be written as the product of standard functions. Being able to differentiate such functions depends
upon the following Key Point.
Key Point 9
Product Rule
If $y = f(x)g(x)$ then $\frac{dy}{dx} = \frac{df}{dx}g(x) + f(x)\frac{dg}{dx}$ (or $y' = f'g + fg'$)
If $y = u.v$ then $\frac{dy}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}$ (or $y' = uv' + vu'$)
These versions are equivalent, and called the product rule.
We shall not prove this result; instead we shall concentrate on its use.
Example 8
Differentiate (a) $y = x^2\sin x$ (b) $y = x\ln x$
Solution
(a) Here $f(x) = x^2$, $g(x) = \sin x$ ∴ $\frac{df}{dx} = 2x$, $\frac{dg}{dx} = \cos x$
and so $\frac{dy}{dx} = 2x(\sin x) + x^2(\cos x) = x(2\sin x + x\cos x)$
(b) Here $f(x) = x$, $g(x) = \ln x$ ∴ $\frac{df}{dx} = 1$, $\frac{dg}{dx} = \frac{1}{x}$
and so $\frac{dy}{dx} = 1.(\ln x) + x.\left(\frac{1}{x}\right) = \ln x + 1$
Task
Determine the derivatives of the following functions (a) $y = e^x\ln x$, (b) $y = \frac{e^{2x}}{x^2}$
(a) Use the product rule:
Your solution
$f(x) =$    $\frac{df}{dx} =$    $g(x) =$    $\frac{dg}{dx} =$
∴ $\frac{dy}{dx} =$
Answer
$\frac{dy}{dx} = e^x\ln x + \frac{e^x}{x}$
(b) Write $y = (x^{-2})e^{2x}$ and then differentiate:
Your solution
$f(x) =$    $\frac{df}{dx} =$    $g(x) =$    $\frac{dg}{dx} =$
∴ $\frac{dy}{dx} =$
Answer
$\frac{dy}{dx} = (-2x^{-3})e^{2x} + x^{-2}(2e^{2x}) = \frac{2e^{2x}}{x^3}(-1 + x)$
The rule for differentiating a product can be extended to any number of products. If, for example,
$y = f(x)g(x)h(x)$ then
$$\begin{aligned}
\frac{dy}{dx} &= \frac{df}{dx}[g(x)h(x)] + f(x)\frac{d}{dx}[g(x)h(x)] \\
&= \frac{df}{dx}g(x)h(x) + f(x)\left[\frac{dg}{dx}h(x) + g(x)\frac{dh}{dx}\right] \\
&= \frac{df}{dx}g(x)h(x) + f(x)\frac{dg}{dx}h(x) + f(x)g(x)\frac{dh}{dx}
\end{aligned}$$
That is, each function in the product is differentiated in turn and the three results added together.
Example 9
If $y = xe^{2x}\sin x$ then find $\frac{dy}{dx}$.
Solution
Here $f(x) = x$, $g(x) = e^{2x}$, $h(x) = \sin x$
$\frac{df}{dx} = 1$, $\frac{dg}{dx} = 2e^{2x}$, $\frac{dh}{dx} = \cos x$
∴ $\frac{dy}{dx} = 1(e^{2x}\sin x) + x(2e^{2x})\sin x + xe^{2x}(\cos x) = e^{2x}(\sin x + 2x\sin x + x\cos x)$
Task
Obtain the first derivative of $y = x^2(\ln x)\sinh x$.
Firstly identify the three functions:
Your solution
$f(x) =$    $g(x) =$    $h(x) =$
Answer
$f(x) = x^2$, $g(x) = \ln x$, $h(x) = \sinh x$
Now find the derivative of each of these functions:
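Your solution
$\frac{df}{dx} =$    $\frac{dg}{dx} =$    $\frac{dh}{dx} =$
Answer
$\frac{df}{dx} = 2x$, $\frac{dg}{dx} = \frac{1}{x}$, $\frac{dh}{dx} = \cosh x$
Finally obtain $\frac{dy}{dx}$:
Your solution
Answer
$$\begin{aligned}
\frac{dy}{dx} &= 2x(\ln x)\sinh x + x^2\left(\frac{1}{x}\right)\sinh x + x^2\ln x(\cosh x) \\
&= 2x\ln x\sinh x + x\sinh x + x^2\ln x\cosh x
\end{aligned}$$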
Task
Find the second derivative of $y = x^2(\ln x)\sinh x$ by differentiating each of the
three terms making up $\frac{dy}{dx}$ found in the previous Task ($2x\ln x\sinh x$, $x\sinh x$,
$x^2\ln x\cosh x$), and finally, simplify your answer by collecting like terms together:
Your solution
$\frac{d}{dx}(2x\ln x\sinh x) =$
$\frac{d}{dx}(x\sinh x) =$
$\frac{d}{dx}(x^2\ln x\cosh x) =$
$\frac{d^2y}{dx^2} =$
Answer
$\frac{d^2y}{dx^2} = (2 + x^2)\ln x\sinh x + 3\sinh x + 2x\cosh x + 4x\ln x\cosh x$
Exercises
1. In each case find the derivative of the function
(a) $y = x\tan x$
(b) $y = x^4\ln(2x)$
(c) $y = \sin^2 x$
(d) $y = e^{2x}\cos 3x$
2. Find the derivatives of:
(a) $y = \frac{x}{\cos x}$
(b) $y = e^x\sin x$
(c) Obtain the derivative of $y = xe^x\tan x$ using the results of parts (a) and (b).
Answers
1. (a) $\frac{dy}{dx} = \tan x + x\sec^2 x$
(b) $\frac{dy}{dx} = 4x^3\ln(2x) + \frac{x^4}{x} = x^3(4\ln(2x) + 1)$
(c) $y = \sin x.\sin x$
∴ $\frac{dy}{dx} = \cos x\sin x + \sin x\cos x = 2\sin x\cos x = \sin 2x$
(d) $\frac{dy}{dx} = (2e^{2x})\cos 3x + e^{2x}(-3\sin 3x) = e^{2x}(2\cos 3x - 3\sin 3x)$
2. (a) $y = x\sec x$ ∴ $\frac{dy}{dx} = \sec x + x\sec x\tan x$
(b) $\frac{dy}{dx} = e^x\sin x + e^x\cos x = e^x(\sin x + \cos x)$
(c) The derivative of $y = xe^x\tan x = (x\sec x)(e^x\sin x)$ is found by applying the product
rule to the results of (a) and (b):
$$\begin{aligned}
\frac{dy}{dx} &= \frac{d}{dx}(x\sec x).e^x\sin x + (x\sec x)\frac{d}{dx}(e^x\sin x) \\
&= (\sec x + x\sec x\tan x)e^x\sin x + x\sec x(e^x)(\sin x + \cos x) \\
&= e^x(x + \tan x + x\tan x + x\tan^2 x)
\end{aligned}$$
2. Differentiating a quotient
In this Section we consider functions of the form $y = \frac{f(x)}{g(x)}$. To find the derivative of such a function
we make use of the following Key Point:
Key Point 10
Quotient Rule
If $y = \frac{f(x)}{g(x)}$ then $\frac{dy}{dx} = \frac{g(x)\frac{df}{dx} - f(x)\frac{dg}{dx}}{[g(x)]^2}$ (or $y' = \frac{gf' - g'f}{g^2}$)
If $y = \frac{u}{v}$ then $\frac{dy}{dx} = \frac{v\frac{du}{dx} - u\frac{dv}{dx}}{v^2}$ (or $y' = \frac{vu' - v'u}{v^2}$)
These are two equivalent versions of the quotient rule.
Example 10
Find the derivative of $y = \frac{\ln x}{x}$
Solution
Here $f(x) = \ln x$ and $g(x) = x$
∴ $\frac{df}{dx} = \frac{1}{x}$ and $\frac{dg}{dx} = 1$
Hence $\frac{dy}{dx} = \frac{x\left(\frac{1}{x}\right) - 1(\ln x)}{[x]^2} = \frac{1 - \ln x}{x^2}$
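A result obtained from the quotient rule can be checked against a numerical estimate of the gradient. The sketch below assumes a central-difference estimate with an illustrative step size h = 1e-6; the function names are hypothetical and chosen for this example.

```python
import math

def y(x):
    return math.log(x) / x

def dy_dx(x):
    """Result of Example 10: (1 - ln x) / x^2."""
    return (1 - math.log(x)) / x ** 2

def central_difference(f, x, h=1e-6):
    """Numerical estimate of the gradient of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [0.5, 1.0, math.e, 4.0]:
    print(f"x = {x:.4f}: formula = {dy_dx(x):.6f}, numerical = {central_difference(y, x):.6f}")
```

At x = e the formula gives zero because ln e = 1, so the graph of y = (ln x)/x has a horizontal tangent there.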
Task
Obtain the derivative of $y = \frac{\sin x}{x^2}$ (a) using the formula for differentiating a
product and (b) using the formula for differentiating a quotient.
(a) Write $y = x^{-2}\sin x$ then use the product rule to find $\frac{dy}{dx}$:
Your solution
Answer
$y = x^{-2}\sin x$ ∴ $\frac{dy}{dx} = (-2x^{-3})\sin x + x^{-2}\cos x$ ∴ $\frac{dy}{dx} = \frac{-2\sin x + x\cos x}{x^3}$
(b) Now use the quotient rule instead to find $\frac{dy}{dx}$:
Your solution
Answer
$y = \frac{\sin x}{x^2}$ ∴ $\frac{dy}{dx} = \frac{x^2(\cos x) - (2x)\sin x}{(x^2)^2} = \frac{x\cos x - 2\sin x}{x^3}$
Exercise
Find the derivatives of the following:
(a) $(2x^3 - 4x^2)(3x^5 + x^2)$
(b) $\frac{2x^3 + 4}{x^2 - 4x + 1}$
(c) $\frac{x^2 + 2x + 1}{x^2 - 2x + 1}$
(d) $(x^2 + 3)(2x - 5)(3x + 2)$
(e) $\frac{(2x + 1)(3x - 1)}{x + 5}$
(f) $(\ln x)\sin x$
(g) $(\ln x)/\sin x$
(h) $e^x/x^2$
(i) $\frac{e^x\sin x}{\cos 2x}$
Answer
(a) $48x^7 - 84x^6 + 10x^4 - 16x^3$
(b) $\frac{2x^4 - 16x^3 + 6x^2 - 8x + 16}{(x^2 - 4x + 1)^2}$
(c) $-\frac{4(x + 1)}{(x - 1)^3}$
(d) $24x^3 - 33x^2 + 16x - 33$
(e) $\frac{6(x^2 + 10x + 1)}{(x + 5)^2}$
(f) $\frac{1}{x}\sin x + (\ln x)\cos x$
(g) $\frac{\frac{1}{x}\sin x - (\ln x)\cos x}{\sin^2 x} = \operatorname{cosec} x\left(\frac{1}{x} - \cot x\ln x\right)$
(h) $\frac{x^2e^x - 2xe^x}{x^4} = (x^{-2} - 2x^{-3})e^x$
(i) $\frac{\cos 2x(e^x\sin x + e^x\cos x) + 2\sin 2x\,e^x\sin x}{\cos^2 2x} = e^x[(\sin x + \cos x)\sec 2x + 2\sin x\sin 2x\sec^2 2x]$

The Chain Rule 11.5
Introduction
In this Section we will see how to obtain the derivative of a composite function (often referred to as
a ‘function of a function’). To do this we use the chain rule. This rule can be used to obtain the
derivatives of functions such as $e^{x^2 + 3x}$ (the exponential function of a polynomial); $\sin(\ln x)$ (the sine
function of the natural logarithm function); $\sqrt{x^3 + 4}$ (the square root function of a polynomial).

Prerequisites
Before starting this Section you should . . .
• be able to differentiate standard functions
• be able to use the product and quotient rule for finding derivatives

Learning Outcomes
On completion you should be able to . . .
• differentiate a function of a function using the chain rule
• differentiate a power function

1. The meaning of a function of a function

When we use a function like $\sin 2x$ or $e^{\ln x}$ or $\sqrt{x^2 + 1}$ we are in fact dealing with a composite
function or function of a function.
$\sin 2x$ is the sine function of $2x$. This is, in fact, how we ‘read’ it:
$\sin 2x$ is read ‘sine of $2x$’
Similarly $e^{\ln x}$ is the exponential function of the logarithm of $x$:
$e^{\ln x}$ is read ‘e to the power of $\ln x$’
Finally $\sqrt{x^2 + 1}$ is also a composite function. It is the square root function of the polynomial $x^2 + 1$:
$\sqrt{x^2 + 1}$ is read as the ‘square root of $(x^2 + 1)$’
When we talk about a function of a function in a general setting we will use the notation f (g(x))
where both f and g are functions.
Example 11
Specify the functions f, g for the composite functions
(a) $\sin 2x$ (b) $\sqrt{x^2 + 1}$ (c) $e^{\ln x}$
Solution
(a) Here f is the sine function and g is the polynomial 2x. We often write:
$f(g) = \sin g$ and $g(x) = 2x$
(b) Here $f(g) = \sqrt{g}$ and $g(x) = x^2 + 1$
(c) Here $f(g) = e^g$ and $g(x) = \ln x$
In each case the original function of x is obtained when g(x) is substituted into f (g).
Task
Specify the functions f, g for the composite functions
(a) $\cos(3x^2 - 1)$ (b) $\sinh(e^x)$ (c) $(x^2 + 3x - 1)^{1/3}$
Your solution
(a)
Answer
$f(g) = \cos g$, $g(x) = 3x^2 - 1$
Your solution
(b)
Answer
$f(g) = \sinh g$, $g(x) = e^x$
Your solution
(c)
Answer
$f(g) = g^{1/3}$, $g(x) = x^2 + 3x - 1$
2. The derivative of a function of a function

To differentiate a function of a function we use the following Key Point:
Key Point 11
The Chain Rule
If y = f (g(x)), that is, a function of a function, then
$\frac{dy}{dx} = \frac{df}{dg} \times \frac{dg}{dx}$
This is called the chain rule.
Example 12
Find the derivatives of the following composite functions using the chain rule and
check the result using other methods.
(a) $(2x^2 - 1)^2$ (b) $\ln e^x$
Solution
(a) Here $y = f(g(x))$ where $f(g) = g^2$ and $g(x) = 2x^2 - 1$. Thus
$\frac{df}{dg} = 2g$ and $\frac{dg}{dx} = 4x$ ∴ $\frac{dy}{dx} = 2g \cdot (4x) = 2(2x^2 - 1)(4x) = 8x(2x^2 - 1)$
This result is easily checked by using the rule for differentiating products:
$y = (2x^2 - 1)(2x^2 - 1)$ so $\frac{dy}{dx} = 4x(2x^2 - 1) + (2x^2 - 1)(4x) = 8x(2x^2 - 1)$ as obtained above.
(b) Here $y = f(g(x))$ where $f(g) = \ln g$ and $g(x) = e^x$. Thus
$\frac{df}{dg} = \frac{1}{g}$ and $\frac{dg}{dx} = e^x$ ∴ $\frac{dy}{dx} = \frac{1}{g} \cdot e^x = \frac{1}{e^x} \cdot e^x = 1$
This is easily checked since, of course,
$y = \ln e^x = x$ and so, obviously, $\frac{dy}{dx} = 1$ as obtained above.
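Both parts of Example 12 can also be checked numerically. The following is a minimal sketch under stated assumptions: the function names `chain_rule` and `central_diff`, the sample point x = 1.5 and the step size are illustrative choices, not part of the Workbook. It multiplies df/dg (evaluated at g(x)) by dg/dx, exactly as in Key Point 11.

```python
import math

def chain_rule(df_dg, g, dg_dx, x):
    """dy/dx for y = f(g(x)): df/dg evaluated at g(x), times dg/dx."""
    return df_dg(g(x)) * dg_dx(x)

def central_diff(func, x, h=1e-6):
    """Finite-difference estimate of func'(x), used only as a cross-check."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 1.5  # arbitrary sample point

# (a) y = (2x^2 - 1)^2 with f(g) = g^2 and g(x) = 2x^2 - 1
slope_a = chain_rule(lambda g: 2 * g, lambda t: 2 * t ** 2 - 1, lambda t: 4 * t, x)
formula_a = 8 * x * (2 * x ** 2 - 1)
check_a = central_diff(lambda t: (2 * t ** 2 - 1) ** 2, x)

# (b) y = ln(e^x) with f(g) = ln g and g(x) = e^x
slope_b = chain_rule(lambda g: 1 / g, math.exp, math.exp, x)

print(slope_a, formula_a, check_a, slope_b)
```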
Task
Obtain the derivatives of the following functions
(a) $(2x^2 - 5x + 3)^9$ (b) $\sin(\cos x)$ (c) $\left(\frac{2x + 1}{2x - 1}\right)^3$
(a) Specify f and g for the first function:
Your solution
f (g) = g(x) =
Answer
$f(g) = g^9$, $g(x) = 2x^2 - 5x + 3$
Now obtain the derivative using the chain rule:
Your solution
Answer
$9(2x^2 - 5x + 3)^8 (4x - 5)$. Can you see how to obtain the derivative without going through the
intermediate stage of specifying f, g?
(b) Specify f and g for the second function:
Your solution
Answer
f (g) = sin g g(x) = cos x
Now use the chain rule to obtain the derivative:

Your solution
Answer
−[cos(cos x)] sin x
(c) Apply the chain rule to the third function:
Your solution
Answer
$-\frac{12(2x + 1)^2}{(2x - 1)^4}$
3. Power functions
An example of a function of a function which often occurs is the so-called power function $[g(x)]^k$
where k is any rational number. This is an example of a function of a function in which
$f(g) = g^k$
Thus, using the chain rule: if $y = [g(x)]^k$ then $\frac{dy}{dx} = \frac{df}{dg} \cdot \frac{dg}{dx} = k\,g^{k-1}\frac{dg}{dx}$.
For example, if $y = (\sin x + \cos x)^{1/3}$ then $\frac{dy}{dx} = \frac{1}{3}(\sin x + \cos x)^{-2/3}(\cos x - \sin x)$.
Task
Find the derivatives of the following power functions
(a) $y = \sin^3 x$ (b) $y = (x^2 + 1)^{1/2}$ (c) $y = (e^{3x})^7$
(a) Note that $\sin^3 x$ is the conventional way of writing $(\sin x)^3$. Now find its derivative:
Your solution
Answer
$\frac{dy}{dx} = 3(\sin x)^2 \cos x$ which we would normally write as $3\sin^2 x \cos x$
(b) Use the function of a function approach again:
Your solution
Answer
$\frac{dy}{dx} = \frac{1}{2}(x^2 + 1)^{-1/2}\, 2x = \frac{x}{\sqrt{x^2 + 1}}$
(c) Use the function of a function approach first, and then look for a quicker way in this case:
Your solution
Answer
$\frac{dy}{dx} = 7(e^{3x})^6 (3e^{3x}) = 21(e^{3x})^7 = 21e^{21x}$
Note that $(e^{3x})^7 = e^{21x}$ ∴ $\frac{dy}{dx} = 21e^{21x}$ directly, a much quicker way.
Exercise
Obtain the derivatives of the following functions:
(a) $\left(\frac{2x + 1}{3x - 1}\right)^4$ (b) $\tan(3x^2 + 2x)$ (c) $\sin^2(3x^2 - 1)$
Answer
(a) $-\frac{20(2x + 1)^3}{(3x - 1)^5}$ (b) $2(3x + 1)\sec^2(3x^2 + 2x)$
(c) $6x \sin(6x^2 - 2)$ (remember $\sin 2x \equiv 2\sin x \cos x$)
Parametric Differentiation 11.6
Introduction
Sometimes the equation of a curve is not given in Cartesian form $y = f(x)$ but in parametric form:
$x = h(t)$, $y = g(t)$. In this Section we see how to calculate the derivative $\frac{dy}{dx}$ from a knowledge of
the so-called parametric derivatives $\frac{dx}{dt}$ and $\frac{dy}{dt}$. We then extend this to the determination of the
second derivative $\frac{d^2y}{dx^2}$.
Parametric functions arise often in particle dynamics in which the parameter t represents the time
and (x(t), y(t)) then represents the position of a particle as it varies with time.

Prerequisites
• be able to plot a curve given in parametric form

Learning Outcomes
On completion you should be able to . . .
• find first and second derivatives when the equation of a curve is given in parametric form

1. Parametric differentiation
In this subsection we consider the parametric approach to describing a curve:
$x = h(t)$, $y = g(t)$ (the parametric equations), $t_0 \le t \le t_1$ (the parametric range)
As various values of t are chosen within the parameter range the corresponding values of x, y are
calculated from the parametric equations. When these points are plotted on an xy plane they trace
out a curve. The Cartesian equation of this curve is obtained by eliminating the parameter t from
the parametric equations. For example, consider the curve:
$x = 2\cos t$, $y = 2\sin t$, $0 \le t \le 2\pi$.
We can eliminate the t variable in an obvious way: square each parametric equation and then add:
$x^2 + y^2 = 4\cos^2 t + 4\sin^2 t = 4$ ∴ $x^2 + y^2 = 4$
which we recognise as the standard equation of a circle with centre at (0, 0) and radius 2.
In a similar fashion the parametric equations
$x = 2t$, $y = 4t^2$, $-\infty < t < \infty$
describe a parabola. This follows since, eliminating the parameter t:
$t = \frac{x}{2}$ ∴ $y = 4\left(\frac{x^2}{4}\right)$ so $y = x^2$
which we recognise as the standard equation of a parabola.
The question we wish to address in this Section is ‘how do we obtain the derivative $\frac{dy}{dx}$ if a curve is
given in parametric form?’ To answer this we note the key result in this area:
Key Point 12
Parametric Differentiation
If x = h(t) and y = g(t) then
$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt}$
We note that this result allows the determination of $\frac{dy}{dx}$ without the need to find y as an explicit
function of x.
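As an illustration of Key Point 12, the sketch below computes dy/dx for the circle x = 2 cos t, y = 2 sin t discussed above and compares it with the slope of the explicit upper semicircle y = √(4 − x²). The function name `parametric_slope`, the sample value t = π/3 and the guard against dx/dt = 0 are assumptions made for this example only.

```python
import math

def parametric_slope(dx_dt, dy_dt, t):
    """dy/dx = (dy/dt) / (dx/dt), valid only where dx/dt is non-zero."""
    xdot = dx_dt(t)
    if xdot == 0:
        raise ZeroDivisionError("dx/dt = 0 here, so dy/dx is not defined by this formula")
    return dy_dt(t) / xdot

# Circle x = 2 cos t, y = 2 sin t: dx/dt = -2 sin t, dy/dt = 2 cos t
t = math.pi / 3  # a point on the upper half of the circle
slope = parametric_slope(lambda s: -2 * math.sin(s), lambda s: 2 * math.cos(s), t)

# Cross-check against the explicit form y = sqrt(4 - x^2), dy/dx = -x / sqrt(4 - x^2)
x = 2 * math.cos(t)
explicit_slope = -x / math.sqrt(4 - x ** 2)
print(slope, explicit_slope)
```

Both values equal −cot t at the chosen point, without y ever being written as a function of x in the parametric calculation.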
Example 13
Determine the equation of the tangent line to the semicircle with parametric equations
$x = \cos t$, $y = \sin t$, $0 \le t \le \pi$
at $t = \pi/4$.
Solution
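The semicircle is drawn in Figure 9. We have also drawn the tangent line at $t = \pi/4$ (or, equivalently,
at $x = \cos\frac{\pi}{4} = \frac{1}{\sqrt{2}}$, $y = \sin\frac{\pi}{4} = \frac{1}{\sqrt{2}}$.)
[Figure 9: the semicircle and its tangent line at the point P $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$, where $t = \pi/4$]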
Now
$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{\cos t}{-\sin t} = -\cot t$.
Thus at $t = \frac{\pi}{4}$ we have $\frac{dy}{dx} = -\cot\frac{\pi}{4} = -1$.
The equation of the tangent line is
y = mx + c
where m is the gradient of the line and c is a constant.
Clearly m = −1 (since, at the point P the line and the circle have the same gradient).

To find c we note that the line passes through the point P with coordinates $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$. Hence
$\frac{1}{\sqrt{2}} = (-1)\frac{1}{\sqrt{2}} + c$ ∴ $c = \frac{2}{\sqrt{2}}$
Finally,
$y = -x + \frac{2}{\sqrt{2}}$
is the equation of the tangent line at the point in question.
We should note, before proceeding, that a derivative with respect to the parameter t is often denoted
by a ‘dot’. Thus
$\frac{dx}{dt} = \dot{x}$, $\frac{dy}{dt} = \dot{y}$, $\frac{d^2x}{dt^2} = \ddot{x}$ etc.
Task
Find the value of $\frac{dy}{dx}$ if $x = 3t$, $y = t^2 - 4t + 1$.
Check your result by finding $\frac{dy}{dx}$ in the normal way.
First find $\frac{dx}{dt}$, $\frac{dy}{dt}$:
Your solution
Answer
$\frac{dx}{dt} = 3$, $\frac{dy}{dt} = 2t - 4$
Now obtain $\frac{dy}{dx}$:
Your solution
Answer
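$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{2t - 4}{3} = \frac{2}{3}t - \frac{4}{3}$,
or, using the ‘dot’ notation, $\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = \frac{2t - 4}{3} = \frac{2}{3}t - \frac{4}{3}$
Now find y explicitly as a function of x by eliminating t, and so find $\frac{dy}{dx}$ directly: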
Your solution
Answer
$t = \frac{x}{3}$ ∴ $y = \frac{x^2}{9} - \frac{4x}{3} + 1$. Finally: $\frac{dy}{dx} = \frac{2x}{9} - \frac{4}{3} = \frac{2t}{3} - \frac{4}{3}$.
Task
Find the value of $\frac{dy}{dx}$ at $t = 2$ if $x = 3t - 4\sin \pi t$, $y = t^2 + t\cos \pi t$, $0 \le t \le 4$
First find $\frac{dx}{dt}$, $\frac{dy}{dt}$:
Your solution
Answer
$\frac{dx}{dt} = 3 - 4\pi\cos \pi t$, $\frac{dy}{dt} = 2t + \cos \pi t - \pi t \sin \pi t$
Now obtain $\frac{dy}{dx}$:
Your solution
Answer
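$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{2t + \cos \pi t - \pi t \sin \pi t}{3 - 4\pi\cos \pi t}$
or, using the dot notation, $\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = \frac{2t + \cos \pi t - \pi t \sin \pi t}{3 - 4\pi\cos \pi t}$
Finally, substitute $t = 2$ to find $\frac{dy}{dx}$ at this value of t.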
Your solution
Answer

$\left.\frac{dy}{dx}\right|_{t=2} = \frac{4 + 1}{3 - 4\pi} = \frac{5}{3 - 4\pi} = -0.523$
2. Higher derivatives
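Having found the first derivative $\frac{dy}{dx}$ using parametric differentiation we now ask how we might
determine the second derivative $\frac{d^2y}{dx^2}$.
By definition:
$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right)$
But
$\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}}$ and so $\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{\dot{y}}{\dot{x}}\right)$
Now $\frac{\dot{y}}{\dot{x}}$ is a function of t so we can change the derivative with respect to x into a derivative with
respect to t since
$\frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dt}\left(\frac{dy}{dx}\right)\frac{dt}{dx}$
from the function of a function rule (Key Point 11 in Section 11.5).
But, differentiating the quotient $\dot{y}/\dot{x}$, we have
$\frac{d}{dt}\left(\frac{\dot{y}}{\dot{x}}\right) = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^2}$ and $\frac{dt}{dx} = \frac{1}{\frac{dx}{dt}} = \frac{1}{\dot{x}}$
so finally:
$\frac{d^2y}{dx^2} = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^3}$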
Key Point 13
If x = h(t), y = g(t) then the first and second derivatives of y with respect to x are:
$\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}}$ and $\frac{d^2y}{dx^2} = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^3}$
Example 14
If the equations of a curve are $x = 2t$, $y = t^2 - 3$, determine $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$.
Solution
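Here $\dot{x} = 2$, $\dot{y} = 2t$ ∴ $\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = \frac{2t}{2} = t$.
Also $\ddot{x} = 0$, $\ddot{y} = 2$ ∴ $\frac{d^2y}{dx^2} = \frac{2(2) - 2t(0)}{(2)^3} = \frac{1}{2}$.
These results can easily be checked since $t = \frac{x}{2}$ and $y = t^2 - 3$ which imply $y = \frac{x^2}{4} - 3$. Therefore
the derivatives can be obtained directly: $\frac{dy}{dx} = \frac{2x}{4} = \frac{x}{2}$ and $\frac{d^2y}{dx^2} = \frac{1}{2}$.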
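The second-derivative formula of Key Point 13 is easy to mistype, so a small numerical check can be reassuring. The sketch below is illustrative: the function name `second_derivative` and the sample value t = 1.7 are assumptions. It evaluates the formula for Example 14 and for the parabola x = 2t, y = 4t² met earlier (for which y = x², so the second derivative should be 2).

```python
def second_derivative(xdot, ydot, xddot, yddot):
    """d2y/dx2 = (xdot*yddot - ydot*xddot) / xdot**3 for a parametric curve."""
    return (xdot * yddot - ydot * xddot) / xdot ** 3

t = 1.7  # arbitrary parameter value

# Example 14: x = 2t, y = t^2 - 3, so xdot = 2, ydot = 2t, xddot = 0, yddot = 2
d2_example = second_derivative(2.0, 2 * t, 0.0, 2.0)

# Parabola x = 2t, y = 4t^2 (i.e. y = x^2): xdot = 2, ydot = 8t, xddot = 0, yddot = 8
d2_parabola = second_derivative(2.0, 8 * t, 0.0, 8.0)

print(d2_example, d2_parabola)
```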
Exercises
1. For the following sets of parametric equations find $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$
(a) $x = 3t^2$, $y = 4t^3$ (b) $x = 4 - t^2$, $y = t^2 + 4t$ (c) $x = t^2 e^t$, $y = t$
2. Find the equation of the tangent line to the curve
$x = 1 + 3\sin t$, $y = 2 - 5\cos t$ at $t = \frac{\pi}{6}$
Answers
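1. (a) $\frac{dy}{dx} = 2t$, $\frac{d^2y}{dx^2} = \frac{1}{3t}$. (b) $\frac{dy}{dx} = -1 - \frac{2}{t}$, $\frac{d^2y}{dx^2} = -\frac{1}{t^3}$
(c) $\frac{dy}{dx} = \frac{e^{-t}}{2t + t^2}$, $\frac{d^2y}{dx^2} = -\frac{e^{-2t}(t^2 + 4t + 2)}{(t + 2)^3 t^3}$
2. $\dot{x} = 3\cos t$, $\dot{y} = +5\sin t$
∴ $\frac{dy}{dx} = \frac{5}{3}\tan t$ ∴ $\left.\frac{dy}{dx}\right|_{t=\pi/6} = \frac{5}{3}\tan\frac{\pi}{6} = \frac{5}{3}\cdot\frac{1}{\sqrt{3}} = \frac{5\sqrt{3}}{9}$
The equation of the tangent line is $y = mx + c$ where $m = \frac{5\sqrt{3}}{9}$.
The line passes through the point $x = 1 + 3\sin\frac{\pi}{6} = 1 + \frac{3}{2}$, $y = 2 - 5\frac{\sqrt{3}}{2}$ and so
$2 - 5\frac{\sqrt{3}}{2} = \frac{5\sqrt{3}}{9}\left(1 + \frac{3}{2}\right) + c$ ∴ $c = 2 - \frac{35\sqrt{3}}{9}$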
Implicit Differentiation 11.7
Introduction
This Section introduces implicit differentiation which is used to differentiate functions expressed in
implicit form (where the variables are found together). Examples are $x^3 + xy + y^2 = 1$, and
$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ which represents an ellipse.


Prerequisites
• be competent in using the chain rule

Learning Outcomes • differentiate functions expressed implicitly


1. Implicit and explicit functions
Equations such as $y = x^2$, $y = \frac{1}{x}$, $y = \sin x$ are said to define y explicitly as a function of x because
the variable y appears alone on one side of the equation.
The equation
yx + y + 1 = x
is not of the form y = f (x) but can be put into this form by simple algebra.
Task
Write y as the subject of
yx + y + 1 = x
Your solution
Answer
We have $y(x + 1) = x - 1$ so
$y = \frac{x - 1}{x + 1}$
We say that y is defined implicitly as a function of x by means of $yx + y + 1 = x$, the actual function
being given explicitly as
$y = \frac{x - 1}{x + 1}$
We note that an equation relating x and y can implicitly define more than one function of x.
For example, if we solve
$x^2 + y^2 = 1$
we obtain $y = \pm\sqrt{1 - x^2}$ so $x^2 + y^2 = 1$ defines implicitly two functions
$f_1(x) = \sqrt{1 - x^2}$, $f_2(x) = -\sqrt{1 - x^2}$
Task
Sketch the graphs of $f_1(x) = \sqrt{1 - x^2}$, $f_2(x) = -\sqrt{1 - x^2}$
(The equation $x^2 + y^2 = 1$ should give you the clue.)
Your solution
Answer
Since x2 + y 2 = 1 is the well-known equation of the circle with centre at the origin and radius 1, it
follows that the graphs of f1 (x) and f2 (x) are the upper and lower halves of this circle.
[Figure: the graphs of $f_1(x) = +\sqrt{1 - x^2}$ (upper half of the circle) and $f_2(x) = -\sqrt{1 - x^2}$ (lower half of the circle), each for $-1 \le x \le 1$]
Sometimes it is difficult or even impossible to solve an equation in x and y to obtain y explicitly in
terms of x.
Examples where explicit expressions for y cannot be obtained are
$\sin(xy) = y$ and $x^2 + \sin y = 2y$
2. Differentiation of implicit functions

Fortunately it is not necessary to obtain y in terms of x in order to differentiate a function defined
implicitly.
Consider the simple equation
xy = 1
Here it is clearly possible to obtain y as the subject of this equation and hence obtain $\frac{dy}{dx}$.
Task
Express y explicitly in terms of x and find $\frac{dy}{dx}$ for the case $xy = 1$.
Your solution
Answer
We have immediately
$y = \frac{1}{x}$ so $\frac{dy}{dx} = -\frac{1}{x^2}$
We now show an alternative way of obtaining $\frac{dy}{dx}$ which does not involve writing y explicitly in terms
of x at the outset. We simply treat y as an (unspecified) function of x.
Hence if $xy = 1$ we obtain
$\frac{d}{dx}(xy) = \frac{d}{dx}(1)$.
The right-hand side differentiates to zero as 1 is a constant. On the left-hand side we must use the
product rule of differentiation:
$\frac{d}{dx}(xy) = x\frac{dy}{dx} + y\frac{dx}{dx} = x\frac{dy}{dx} + y$
Hence $xy = 1$ becomes, after differentiation,
$x\frac{dy}{dx} + y = 0$ or $\frac{dy}{dx} = -\frac{y}{x}$
In this case we can of course substitute $y = \frac{1}{x}$ to obtain
$\frac{dy}{dx} = -\frac{1}{x^2}$
as before.
The method used here is called implicit differentiation and, apart from the final step, it can be
applied even if y cannot be expressed explicitly in terms of x. Indeed, on occasions, it is easier to
differentiate implicitly even if an explicit expression is possible.
Example 15
Obtain the derivative $\frac{dy}{dx}$ where
$x^2 + y = 1 + y^3$
Solution
We begin by differentiating the left-hand side of the equation with respect to x to get:
$\frac{d}{dx}(x^2 + y) = 2x + \frac{dy}{dx}$.
We now differentiate the right-hand side with respect to x, using the chain (or function of a
function) rule to deal with the $y^3$ term:
$\frac{d}{dx}(1 + y^3) = \frac{d}{dx}(1) + \frac{d}{dx}(y^3) = 0 + 3y^2\frac{dy}{dx}$
Now by equating the left-hand side and right-hand side derivatives, we have:
$2x + \frac{dy}{dx} = 3y^2\frac{dy}{dx}$
We can make $\frac{dy}{dx}$ the subject of this equation:
$\frac{dy}{dx} - 3y^2\frac{dy}{dx} = -2x$ which gives $\frac{dy}{dx} = \frac{2x}{3y^2 - 1}$
We note that $\frac{dy}{dx}$ has to be expressed in terms of both x and y. This is quite usual if y cannot be
obtained explicitly in terms of x. Now try this Task requiring implicit differentiation.
Task
Find $\frac{dy}{dx}$ if $2y = x^2 + \sin y$
Note that your answer will be in terms of both y and x.
Your solution
Answer
We have, on differentiating both sides of the equation with respect to x and using the chain rule
on the $\sin y$ term:
$\frac{d}{dx}(2y) = \frac{d}{dx}(x^2) + \frac{d}{dx}(\sin y)$ i.e. $2\frac{dy}{dx} = 2x + \cos y\,\frac{dy}{dx}$ leading to $\frac{dy}{dx} = \frac{2x}{2 - \cos y}$.
We sometimes need to obtain the second derivative $\frac{d^2y}{dx^2}$ for a function defined implicitly.
Example 16
Obtain $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ at the point (4, 2) on the curve defined by the equation
$x^2 - xy - y^2 - 2y = 0$
Solution
Firstly we obtain $\frac{dy}{dx}$ by differentiating the equation implicitly and then evaluate it at (4, 2).
We have $2x - x\frac{dy}{dx} - y - 2y\frac{dy}{dx} - 2\frac{dy}{dx} = 0$ (1)
from which $\frac{dy}{dx} = \frac{2x - y}{x + 2y + 2}$ (2)
so at (4, 2) $\frac{dy}{dx} = \frac{6}{10} = \frac{3}{5}$.
To obtain the second derivative $\frac{d^2y}{dx^2}$ it is easier to use (1) than (2) because the latter is a quotient.
We simplify (1) first:
$2x - y - (x + 2y + 2)\frac{dy}{dx} = 0$ (3)
We will have to use the product rule to differentiate the third term here.
Hence differentiating (3) with respect to x:
$2 - \frac{dy}{dx} - (x + 2y + 2)\frac{d^2y}{dx^2} - \left(1 + 2\frac{dy}{dx}\right)\frac{dy}{dx} = 0$
or
$2 - 2\frac{dy}{dx} - 2\left(\frac{dy}{dx}\right)^2 - (x + 2y + 2)\frac{d^2y}{dx^2} = 0$ (4)
Note carefully that the third term here, $\left(\frac{dy}{dx}\right)^2$, is the square of the first derivative. It should not
be confused with the second derivative denoted by $\frac{d^2y}{dx^2}$.
Finally, at (4, 2) where $\frac{dy}{dx} = \frac{3}{5}$ we obtain from (4): $2 - 2\left(\frac{3}{5}\right) - 2\left(\frac{9}{25}\right) - (4 + 4 + 2)\frac{d^2y}{dx^2} = 0$
from which $\frac{d^2y}{dx^2} = \frac{1}{125}$ at (4, 2).
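The values found in Example 16 can be cross-checked by a different route. The curve equation happens to be a quadratic in y, so (unlike most implicit equations) it can be solved explicitly; the sketch below, which is an illustrative check and not the method of the Example, takes the branch through (4, 2) and estimates its derivatives by finite differences. The function names and step sizes are assumptions.

```python
import math

def y_branch(x):
    """Branch of x^2 - x*y - y^2 - 2*y = 0 through (4, 2).

    Rewritten as y^2 + (x + 2)*y - x^2 = 0 and solved with the quadratic formula.
    """
    return (-(x + 2) + math.sqrt((x + 2) ** 2 + 4 * x ** 2)) / 2

def first_diff(func, x, h=1e-5):
    """Central-difference estimate of the first derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

def second_diff(func, x, h=1e-3):
    """Central-difference estimate of the second derivative."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2

y_at_4 = y_branch(4.0)
slope = first_diff(y_branch, 4.0)
second = second_diff(y_branch, 4.0)
print(y_at_4, slope, second)  # compare with 2, 3/5 and 1/125
```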
Task
This Task involves finding a formula for the curvature of a bent beam. When a
horizontal beam is acted on by forces which bend it, then each small segment of
the beam will be slightly curved and can be regarded as an arc of a circle. The
radius R of that circle is called the radius of curvature of the beam at the point
concerned. If the shape of the beam is described by an equation of the form
$y = f(x)$ then there is a formula for the radius of curvature R which involves only
the first and second derivatives $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$.
Find that equation as follows.
Start with the equation of a circle in the simple implicit form
$x^2 + y^2 = R^2$
and perform implicit differentiation twice. Now use the result of the first implicit
differentiation to find a simple expression for the quantity $1 + (dy/dx)^2$ in terms of
R and y; this can then be used to simplify the result of the second differentiation,
and will lead to a formula for $\frac{1}{R}$ (called the curvature) in terms of $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$.
Your solution
Answer
Differentiating: $x^2 + y^2 = R^2$ gives:
$2x + 2y\frac{dy}{dx} = 0$ (1)
Differentiating again: $2 + 2\left(\frac{dy}{dx}\right)^2 + 2y\frac{d^2y}{dx^2} = 0$ (2)
From (1)
$\frac{dy}{dx} = -\frac{x}{y}$ ∴ $1 + \left(\frac{dy}{dx}\right)^2 = 1 + \frac{x^2}{y^2} = \frac{y^2 + x^2}{y^2} = \left(\frac{R}{y}\right)^2$ (3)
So $1 + \left(\frac{dy}{dx}\right)^2 = \left(\frac{R}{y}\right)^2$.
Thus (2) becomes $2\left(\frac{R}{y}\right)^2 + 2y\frac{d^2y}{dx^2} = 0$ ∴ $\frac{d^2y}{dx^2} = -\frac{R^2}{y^3} = -\frac{1}{R}\left(\frac{R}{y}\right)^3$
so $\frac{d^2y}{dx^2} = -\frac{1}{R}\left(\frac{R}{y}\right)^3$ (4)
Rearranging (4) to make $\frac{1}{R}$ the subject and substituting for $\frac{R}{y}$ from (3) gives the result:
$\frac{1}{R} = -\frac{\frac{d^2y}{dx^2}}{\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2}}$
The equation usually found in textbooks omits the minus sign but the sign indicates whether the
circle is above or below the curve, as you will see by sketching a few examples. When the gradient is
small (as for a slightly deflected horizontal beam), i.e. $\frac{dy}{dx}$ is small, the denominator in the equation
for (1/R) is close to 1, and so the second derivative alone is often used to estimate the radius of
curvature in the theory of bending beams.
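The curvature formula can be tested on the very circle it was derived from. The sketch below is a hedged illustration: the function name `curvature`, the radius R = 2, the point x = 1 and the small-gradient numbers are assumptions chosen for the check. It uses the upper semicircle, for which dy/dx = −x/y and d²y/dx² = −R²/y³ as found above, and then shows how close the small-gradient approximation 1/R ≈ −d²y/dx² is when dy/dx = 0.01.

```python
import math

def curvature(dy_dx, d2y_dx2):
    """Signed curvature 1/R, keeping the minus sign used in the Answer above."""
    return -d2y_dx2 / (1 + dy_dx ** 2) ** 1.5

# Upper semicircle of radius R: y = sqrt(R^2 - x^2)
R, x = 2.0, 1.0
y = math.sqrt(R ** 2 - x ** 2)
one_over_R = curvature(-x / y, -R ** 2 / y ** 3)

# Small-gradient case: the denominator is close to 1
exact_small = curvature(0.01, -0.5)
approx_small = 0.5  # -d2y/dx2 alone

print(one_over_R, exact_small, approx_small)
```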
Contents 12
Applications of Differentiation
12.1 Tangents and Normals
12.2 Maxima and Minima
12.3 The Newton-Raphson Method
12.4 Curvature
12.5 Differentiation of Vectors
12.6 Case Study: Complex Impedance
Learning outcomes
In this Workbook you will learn to apply your knowledge of differentiation to solve some
basic problems connected with curves. First you will learn how to obtain the equation of
the tangent line and the normal line to any point of interest on a curve. Secondly, you will
learn how to find the positions of maxima and minima on a given curve. Thirdly, you
will learn how, given an approximate position of the root of a function, a better estimate
of the position can be obtained using the Newton-Raphson technique. Lastly you will
learn how to characterise how sharply a curve is turning by calculating its curvature.

Tangents and Normals 12.1
Introduction
In this Section we see how the equations of the tangent line and the normal line at a particular point
on the curve y = f (x) can be obtained. The equations of tangent and normal lines are often written
as
y = mx + c, y = nx + d
respectively. We shall show that the product of their gradients m and n is such that mn is −1 which
is the condition for perpendicularity.
Prerequisites
Before starting this Section you should . . .
• understand the geometrical interpretation of a derivative
• know the trigonometric expansions of sin(A + B), cos(A + B)

Learning Outcomes
• obtain the condition that two given lines are perpendicular
• obtain the equation of the tangent line to a curve
• obtain the equation of the normal line to a curve
1. Perpendicular lines
One form for the equation of a straight line is
y = mx + c
where m and c are constants. We remember that m is the gradient of the line and its value is the
tangent of the angle θ that the line makes with the positive x-axis. The constant c is the value
obtained where the line intersects the y-axis. See Figure 1:
[Figure 1: the line $y = mx + c$, cutting the y-axis at c and making an angle θ with the positive x-axis, where $m = \tan\theta$]
If we have a second line, with equation
y = nx + d
then, unless m = n, the two lines will intersect at one point. These are drawn together in Figure 2.
The second line makes an angle ψ with the positive x-axis.
[Figure 2: the lines $y = mx + c$ and $y = nx + d$, making angles θ and ψ respectively with the positive x-axis, where $n = \tan\psi$]
A simple question to ask is “what is the relation between m and n if the lines are perpendicular?” If
the lines are perpendicular, as shown in Figure 3, the angles θ and ψ must satisfy the relation:
$\psi - \theta = 90^\circ$
[Figure 3: two perpendicular lines making angles θ and ψ with the positive x-axis]
This is true since the angles in a triangle add up to $180^\circ$. According to the figure the three angles
are $90^\circ$, θ and $180^\circ - \psi$. Therefore
$180^\circ = 90^\circ + \theta + (180^\circ - \psi)$ implying $\psi - \theta = 90^\circ$
In this special case that the lines are perpendicular or normal to each other the relation between
the gradients m and n is easily obtained. In this deduction we use the following basic trigonometric
relations and identities:
$\sin(A - B) \equiv \sin A\cos B - \cos A\sin B$, $\cos(A - B) \equiv \cos A\cos B + \sin A\sin B$
$\tan A \equiv \frac{\sin A}{\cos A}$, $\sin 90^\circ = 1$, $\cos 90^\circ = 0$
Now
$m = \tan\theta = \tan(\psi - 90^\circ)$ (see Figure 3)
$= \frac{\sin(\psi - 90^\circ)}{\cos(\psi - 90^\circ)} = \frac{-\cos\psi}{\sin\psi} = -\frac{1}{\tan\psi} = -\frac{1}{n}$
So $mn = -1$
Key Point 1
Two straight lines $y = mx + c$, $y = nx + d$ are perpendicular if
$m = -\frac{1}{n}$ or equivalently $mn = -1$
This result assumes that neither of the lines is parallel to the x-axis or to the y-axis, as in such
cases one gradient will be zero and the other infinite.
Exercise
Which of the following pairs of lines are perpendicular?
(a) y = −x + 1, y =x+1
(b) y + x − 1 = 0, y+x−2=0
(c) 2y = 8x + 3, y = −0.25x − 1
Answer
(a) perpendicular (b) not perpendicular (c) perpendicular
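The test mn = −1 from Key Point 1 is simple to automate. The sketch below is illustrative: the function name `are_perpendicular` is an assumption, and the gradients are read off by hand after putting each line of the Exercise in the form y = mx + c (for instance 2y = 8x + 3 has gradient 4). As in the Key Point, lines parallel to the axes are not handled.

```python
import math

def are_perpendicular(m, n):
    """True when the gradients satisfy m*n = -1 (within floating-point tolerance)."""
    return math.isclose(m * n, -1.0)

# Gradients of the line pairs in the Exercise above
pairs = {
    "a": (-1.0, 1.0),    # y = -x + 1 and y = x + 1
    "b": (-1.0, -1.0),   # y = -x + 1 and y = -x + 2
    "c": (4.0, -0.25),   # y = 4x + 1.5 and y = -0.25x - 1
}
results = {label: are_perpendicular(m, n) for label, (m, n) in pairs.items()}
print(results)
```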
2. Tangents and normals to a curve

As we know, the relationship between an independent variable x and a dependent variable y is denoted
by
y = f (x)
As we also know, the geometrical interpretation of this relation takes the form of a curve in an xy
plane as illustrated in Figure 4.
[Figure 4: the curve $y = f(x)$ drawn in the xy plane]
We know how to calculate a value of y given a value of x. We can either do this graphically (which
is inaccurate) or else use the function itself. So, at an x value of x0 the corresponding y value is y0
where
y0 = f (x0 )
Let us examine the curve in the neighbourhood of the point (x0 , y0 ). There are two important
constructions of interest
• the tangent line at (x0 , y0 )
• the normal line at (x0 , y0 )
These are shown in Figure 5.

[Figure 5: the tangent line (at angle θ to the positive x-axis) and the normal line (at angle ψ) to the curve at the point $(x_0, y_0)$]
We note the geometrically obvious fact: the tangent and normal lines at any given point on a curve
are perpendicular to each other.
Task
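The curve $y = x^2$ is drawn below. On this graph draw the tangent line and the
normal line at the point $(x_0 = 1, y_0 = 1)$:
Your solution
[Figure: the curve $y = x^2$ with the point (1, 1) marked]
Answer
[Figure: the curve $y = x^2$ with the tangent line (at angle θ) and the normal line (at angle ψ) drawn through (1, 1)]
From your graph, estimate the values of θ and ψ in degrees. (You will need a protractor.)
Your solution
θ ≈        ψ ≈
Answer
$\theta \approx 63.4^\circ$, $\psi \approx 153.4^\circ$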
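Returning to the curve $y = f(x)$: we know, from the geometrical interpretation of the derivative
that
$\left.\frac{df}{dx}\right|_{x_0} = \tan\theta$
(the notation $\left.\frac{df}{dx}\right|_{x_0}$ means evaluate $\frac{df}{dx}$ at the value $x = x_0$)
Here θ is the angle the tangent line to the curve $y = f(x)$ makes with the positive x-axis. This is
highlighted in Figure 6:
[Figure 6: the curve $y = f(x)$ with its tangent line at $x_0$, making an angle θ with the positive x-axis, where $\left.\frac{df}{dx}\right|_{x_0} = \tan\theta$]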
3. The tangent line to a curve

Let the equation of the tangent line to the curve $y = f(x)$ at the point $(x_0, y_0)$ be:
$y = mx + c$
where m and c are constants to be found. The line just touches the curve $y = f(x)$ at the point
$(x_0, y_0)$ so, at this point both must have the same value for the derivative. That is:
$m = \left.\frac{df}{dx}\right|_{x_0}$
Since we know (in any particular case) $f(x)$ and the value $x_0$ we can readily calculate the value for
m. The value of c is found by using the fact that the tangent line and the curve pass through the
same point $(x_0, y_0)$.
$y_0 = mx_0 + c$ and $y_0 = f(x_0)$
Thus $mx_0 + c = f(x_0)$ leading to $c = f(x_0) - mx_0$
Key Point 2
The equation of the tangent line to the curve $y = f(x)$ at the point $(x_0, y_0)$ is
$y = mx + c$ where $m = \left.\frac{df}{dx}\right|_{x_0}$ and $c = f(x_0) - mx_0$
Alternatively, the equation is $y - y_0 = m(x - x_0)$ where $m = \left.\frac{df}{dx}\right|_{x_0}$ and $y_0 = f(x_0)$
Example 1
Find the equation of the tangent line to the curve $y = x^2$ at the point (1, 1).
Solution
Method 1
Here $f(x) = x^2$ and $x_0 = 1$ thus $\frac{df}{dx} = 2x$ ∴ $m = \left.\frac{df}{dx}\right|_{x_0} = 2$
Also $c = f(x_0) - mx_0 = f(1) - m = 1 - 2 = -1$. The tangent line has equation $y = 2x - 1$.
Method 2
$y_0 = f(x_0) = f(1) = 1^2 = 1$
The tangent line has equation $y - 1 = 2(x - 1)$ → $y = 2x - 1$
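Key Point 2 translates directly into a few lines of code. The following is a minimal sketch, assuming the derivative is supplied by hand as a second function; the name `tangent_line` and the test point x = 3 are illustrative choices. It reproduces Example 1 and confirms that the two forms of the tangent equation give the same y value.

```python
def tangent_line(f, df, x0):
    """Return (m, c) for the tangent y = m*x + c to y = f(x) at x0."""
    m = df(x0)           # gradient of the curve at x0
    c = f(x0) - m * x0   # the line must pass through (x0, f(x0))
    return m, c

# Example 1: y = x^2 at the point (1, 1)
m, c = tangent_line(lambda x: x ** 2, lambda x: 2 * x, 1.0)

# Method 1 form and Method 2 (point-slope) form evaluated at the same x
x = 3.0
y_method1 = m * x + c
y_method2 = 1.0 + m * (x - 1.0)
print(m, c, y_method1, y_method2)
```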
Task
Find the equation of the tangent line to the curve $y = e^x$ at the point $x = 0$. The
curve and the line are displayed in the following figure:
[Figure: the curve $y = e^x$ and its tangent line at $x = 0$]
First specify $x_0$ and f:

Your solution
x0 =
f (x) =
Answer
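$x_0 = 0$, $f(x) = e^x$
Now obtain the values of $\left.\frac{df}{dx}\right|_{x_0}$ and $f(x_0) - mx_0$:
Your solution
$\left.\frac{df}{dx}\right|_{x_0} =$
$f(x_0) - mx_0 =$
Answer
$\frac{df}{dx} = e^x$ ∴ $\left.\frac{df}{dx}\right|_{0} = 1$ and $f(0) - 1(0) = e^0 - 0 = 1$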
Now obtain the equation of the tangent line:

Your solution
y=
Answer
y =x+1
Task
Find the equation of the tangent line to the curve $y = \sin 3x$ at the point $x = \frac{\pi}{4}$
and find where the tangent line intersects the x-axis. See the following figure:
[Figure: the curve $y = \sin 3x$ and its tangent line at $x = \frac{\pi}{4}$]
First specify $x_0$ and f:

Your solution
x0 = f (x) =
Answer
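$x_0 = \frac{\pi}{4}$, $f(x) = \sin 3x$
Now obtain the values of $\left.\frac{df}{dx}\right|_{x_0}$ and $f(x_0) - mx_0$ correct to 2 d.p.:
Your solution
$\left.\frac{df}{dx}\right|_{x_0} =$        $f(x_0) - mx_0 =$
Answer
$\frac{df}{dx} = 3\cos 3x$ ∴ $\left.\frac{df}{dx}\right|_{\pi/4} = 3\cos\frac{3\pi}{4} = -\frac{3}{\sqrt{2}} = -2.12$
and $f\left(\frac{\pi}{4}\right) - \frac{m\pi}{4} = \sin\frac{3\pi}{4} - \left(\frac{-3}{\sqrt{2}}\right)\frac{\pi}{4} = \frac{1}{\sqrt{2}} + \frac{3}{\sqrt{2}}\cdot\frac{\pi}{4} = 2.37$ to 2 d.p.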
Now obtain the equation of the tangent line:

Your solution
y=
Answer
$y = \frac{-3}{\sqrt{2}}x + \frac{1}{4\sqrt{2}}(4 + 3\pi)$ so $y = -2.12x + 2.37$ (to 2 d.p.)
Where does the line intersect the x-axis?
Your solution
x=
Answer
When y = 0 ∴ −2.12x + 2.37 = 0 ∴ x = 1.12 to 2 d.p.
4. The normal line to a curve

We have already noted that, at any point (x0 , y0 ) on a curve y = f (x), the tangent and normal lines
are perpendicular. Thus if the equations of the tangent and normal lines are, respectively
y = mx + c y = nx + d
then $m = -\frac{1}{n}$ or, equivalently, $n = -\frac{1}{m}$.
We have also noted, for the tangent line
$m = \left.\frac{df}{dx}\right|_{x_0}$
so n can easily be obtained. To find d, we again use the fact that the normal line y = nx + d and
the curve have a point in common:
y0 = nx0 + d and y0 = f (x0 )
so nx0 + d = f (x0 ) leading to d = f (x0 ) − nx0 .
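The same steps can be sketched in code. This is an illustrative example only: the name `normal_line` and the choice to raise an error when the tangent is horizontal are assumptions, and the curve y = x² at (1, 1) is reused from Example 1 so that the result can be compared with the tangent gradient m = 2 found there.

```python
def normal_line(f, df, x0):
    """Return (n, d) for the normal y = n*x + d to y = f(x) at x0."""
    m = df(x0)  # gradient of the tangent line
    if m == 0:
        # Horizontal tangent: the normal is the vertical line x = x0,
        # which cannot be written in the form y = n*x + d.
        raise ValueError("tangent is horizontal; the normal line is x = x0")
    n = -1.0 / m         # perpendicularity condition m*n = -1
    d = f(x0) - n * x0   # the normal passes through (x0, f(x0))
    return n, d

n, d = normal_line(lambda x: x ** 2, lambda x: 2 * x, 1.0)
print(n, d)  # gradient and intercept of the normal to y = x^2 at (1, 1)
```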
Task
Find the equation of the normal line to the curve $y = \sin 3x$ at the point $x = \frac{\pi}{4}$.
[The equation of the tangent line was found in the previous Task.]
First find the value of m:
Your solution
$m = \left.\frac{df}{dx}\right|_{\pi/4} =$
Answer
$m = \frac{-3}{\sqrt{2}}$
Hence find the value of n:
Your solution
n =
Answer
$nm = -1$ ∴ $n = \frac{\sqrt{2}}{3}$
The equation of the normal line is $y = \frac{\sqrt{2}}{3}x + d$. Now find the value of d to 2 d.p. (Remember the
normal line must pass through the curve at the point $x = \frac{\pi}{4}$.)
Your solution
Answer
$\frac{\sqrt{2}}{3}\cdot\frac{\pi}{4} + d = \sin\frac{3\pi}{4}$ ∴ $d = \frac{1}{\sqrt{2}} - \frac{\sqrt{2}}{3}\cdot\frac{\pi}{4} \approx 0.34$
Now obtain the equation of the normal line to 2 d.p.:

Your solution
y=
Answer
$y = 0.47x + 0.34$. The curve and the normal line are shown in the following figure:
[Figure: the curve $y = \sin 3x$ and its normal line at $x = \frac{\pi}{4}$]
Task
Find the equation of the normal line to the curve $y = x^3$ at $x = 1$.
First find $f(x)$, $x_0$, $\left.\frac{df}{dx}\right|_{x_0}$, m, n:
Your solution
Answer
$f(x) = x^3$, $x_0 = 1$, $\left.\frac{df}{dx}\right|_{1} = \left.3x^2\right|_{1} = 3$ ∴ $m = 3$ and $n = -\frac{1}{3}$
Now use the property that the normal line $y = nx + d$ and the curve $y = x^3$ pass through the point
(1, 1) to find d and so obtain the equation of the normal line:
Your solution
d =        y =
Answer
$1 = n + d$ ∴ $d = 1 - n = 1 + \frac{1}{3} = \frac{4}{3}$. Thus the equation of the normal line is
$y = -\frac{1}{3}x + \frac{4}{3}$. The curve and the normal line through (1, 1) are shown below:
[Figure: the curve $y = x^3$ and its normal line through (1, 1)]
Exercises
1. Find the equations of the tangent and normal lines to the following curves at the points
indicated
(a) $y = x^4 + 2x^2$, (1, 3)
(b) $y = \sqrt{1 - x^2}$, $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ What would be obtained if the point was (1, 0)?
(c) $y = x^{1/2}$, (1, 1)
2. Find the value of a if the two curves $y = e^{-x}$ and $y = e^{ax}$ are to intersect at right-angles.
Answers

1. (a) $f(x) = x^4 + 2x^2$, $\frac{df}{dx} = 4x^3 + 4x$, $\left.\frac{df}{dx}\right|_{x=1} = 8$
tangent line $y = 8x + c$. This passes through (1, 3) so $y = 8x - 5$
normal line $y = -\frac{1}{8}x + d$. This passes through (1, 3) so $y = -\frac{1}{8}x + \frac{25}{8}$.
(b) $f(x) = \sqrt{1 - x^2}$, $\frac{df}{dx} = \frac{-x}{\sqrt{1 - x^2}}$, $\left.\frac{df}{dx}\right|_{x=\sqrt{2}/2} = -1$
tangent line $y = -x + c$. This passes through $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ so $y = -x + \sqrt{2}$
normal line $y = x + d$. This passes through $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ so $y = x$.
At (1, 0) the tangent line is $x = 1$ and the gradient is infinite (the line is vertical), and
the normal line is $y = 0$.
(c) $f(x) = x^{1/2}$, $\frac{df}{dx} = \frac{1}{2}x^{-1/2}$, $\left.\frac{df}{dx}\right|_{x=1} = \frac{1}{2}$
tangent line: $y = \frac{1}{2}x + c$. This passes through (1, 1) so $y = \frac{1}{2}x + \frac{1}{2}$
normal line: $y = -2x + d$. This passes through (1, 1) so $y = -2x + 3$.
2. The curves will intersect at right-angles if their tangent lines, at the point of intersection, are
perpendicular.
Point of intersection: $e^{-x} = e^{ax}$ i.e. $-x = ax$ ∴ $x = 0$ ($a = -1$ not sensible)
The tangent line to $y = e^{ax}$ is $y = mx + c$ where $m = \left.ae^{ax}\right|_{x=0} = a$
The tangent line to $y = e^{-x}$ is $y = kx + g$ where $k = \left.-e^{-x}\right|_{x=0} = -1$
These two lines are perpendicular if $a(-1) = -1$ i.e. $a = 1$.
[Figure: the curves $y = e^x$ and $y = e^{-x}$]

Maxima and Minima 12.2
Introduction
In this Section we analyse curves in the ‘local neighbourhood’ of a stationary point and, from this
analysis, deduce necessary conditions satisfied by local maxima and local minima. Locating the max-
ima and minima of a function is an important task which arises often in applications of mathematics
to problems in engineering and science. It is a task which can often be carried out using only a
knowledge of the derivatives of the function concerned. The problem breaks into two parts
• finding the stationary points of the given functions
• distinguishing whether these stationary points are maxima, minima or, exceptionally, points of
inflection.
This Section ends with maximum and minimum problems from engineering contexts.

Prerequisites
Before starting this Section you should . . .
• be able to obtain first and second derivatives of simple functions
• be able to find the roots of simple equations

Learning Outcomes
On completion you should be able to . . .
• explain the difference between local and global maxima and minima
• describe how a tangent line changes near a maximum or a minimum
• locate the position of stationary points
• use knowledge of the second derivative to distinguish between maxima and minima
1. Maxima and minima

Consider the curve
y = f (x) a≤x≤b
shown in Figure 7:
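[Figure 7: the curve $y = f(x)$ on $a \le x \le b$, with $f(a)$ and $f(b)$ marked on the y-axis and the points a, $x_0$, $x_1$, b marked on the x-axis]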
By inspection we see that there is no y-value greater than that at x = a (i.e. f (a)) and there is no
value smaller than that at x = b (i.e. f (b)). However, the points on the curve at x0 and x1 merit
comment. It is clear that in the near neighbourhood of x0 all the y-values are greater than the
y-value at x0 and, similarly, in the near neighbourhood of x1 all the y-values are less than the y-value
at x1 .
We say f (x) has a global maximum at x = a and a global minimum at x = b but also has a
local minimum at x = x0 and a local maximum at x = x1 .
Our primary purpose in this Section is to see how we might locate the position of the local maxima
and the local minima for a smooth function f (x).
A stationary point on a curve is one at which the derivative has a zero value. In Figure 8 we have
sketched a curve with a maximum and a curve with a minimum.
y y
x0 x x0 x
Figure 8
By drawing tangent lines to these curves in the near neighbourhood of the local maximum and the
local minimum it is obvious that at these points the tangent line is parallel to the x-axis so that

$\left.\dfrac{df}{dx}\right|_{x_0} = 0$
Section 12.2: Maxima and Minima
Key Point 3
Points on the curve y = f(x) at which $\dfrac{df}{dx} = 0$ are called stationary points of the function.
However, be careful! A stationary point is not necessarily a local maximum or minimum of the
function but may be an exceptional point called a point of inflection, illustrated in Figure 9.
[Figure 9: a curve with a point of inflection at x0]
Example 2
Sketch the curve $y = (x - 2)^2 + 2$ and locate the stationary points on the curve.
Solution
Here $f(x) = (x - 2)^2 + 2$ so $\dfrac{df}{dx} = 2(x - 2)$.
At a stationary point $\dfrac{df}{dx} = 0$ so we have $2(x - 2) = 0$ so x = 2. We conclude that this function
has just one stationary point located at x = 2 (where y = 2).
By sketching the curve y = f(x) it is clear that this stationary point is a local minimum.
[Figure 10: the curve $y = (x - 2)^2 + 2$, with its minimum at (2, 2)]
Task
Locate the position of the stationary points of $f(x) = x^3 - 1.5x^2 - 6x + 10$.
First find $\dfrac{df}{dx}$:
Your solution
$\dfrac{df}{dx} =$
Answer
$\dfrac{df}{dx} = 3x^2 - 3x - 6$
Now locate the stationary points by solving $\dfrac{df}{dx} = 0$:
Your solution
Answer
$3x^2 - 3x - 6 = 3(x + 1)(x - 2) = 0$ so x = −1 or x = 2. When x = −1, f(x) = 13.5 and
when x = 2, f(x) = 0, so the stationary points are (−1, 13.5) and (2, 0). We have, in the figure,
sketched the curve which confirms our deductions.
[Figure: the curve y = f(x), with the point (−1, 13.5) and the values x = −2.5 and x = 2 marked on the x-axis]
Task
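Sketch the curve $y = \cos 2x$, $0.1 \le x \le \dfrac{3\pi}{4}$, and on it locate the position
of the global maximum, global minimum and any local maxima or minima.
Your solution
[Blank axes for the sketch: x-axis marked at 0.1, π/4, π/2 and 3π/4]
Answer
[Figure: the curve y = cos 2x for 0.1 ≤ x ≤ 3π/4, x-axis marked at 0.1, π/4, π/2 and 3π/4, with labels "global maximum", "local maximum" and "local minimum and global minimum"]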
2. Distinguishing between local maxima and minima

We might ask if it is possible to predict when a stationary point is a local maximum, a local minimum
or a point of inflection without the necessity of drawing the curve. To do this we highlight the general
characteristics of curves in the neighbourhood of local maxima and minima.
For example: at a local maximum (located at x0 say) Figure 11 describes the situation:
[Figure 11: a curve f(x) with a local maximum at x0; to the left of the maximum $\dfrac{df}{dx} > 0$ and to the right of the maximum $\dfrac{df}{dx} < 0$]
If we draw a graph of the derivative $\dfrac{df}{dx}$ against x then, near a local maximum, it must take one
of two basic shapes described in Figure 12:
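[Figure 12: two graphs of $\dfrac{df}{dx}$ against x near x0: (a) crossing the x-axis at x0 at an angle α; (b) crossing the x-axis at x0 with α = 180°]
In case (a) $\left.\dfrac{d}{dx}\left(\dfrac{df}{dx}\right)\right|_{x_0} \equiv \tan\alpha < 0$ whilst in case (b) $\left.\dfrac{d}{dx}\left(\dfrac{df}{dx}\right)\right|_{x_0} = 0$
We reach the conclusion that at a stationary point which is a maximum the value of the second
derivative $\dfrac{d^2 f}{dx^2}$ is either negative or zero.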
Near a local minimum the above graphs are inverted. Figure 13 shows a local minimum.
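[Figure 13: a curve f(x) with a local minimum at x0; to the left of the minimum $\dfrac{df}{dx} < 0$ and to the right of the minimum $\dfrac{df}{dx} > 0$]
Figure 14 shows the two possible graphs of the derivative:
[Figure 14: two graphs of $\dfrac{df}{dx}$ against x near x0: (a) crossing the x-axis at x0 at an angle β; (b) crossing the x-axis at x0 with zero slope]
Here, for case (a) $\left.\dfrac{d}{dx}\left(\dfrac{df}{dx}\right)\right|_{x_0} = \tan\beta > 0$ whilst in (b) $\left.\dfrac{d}{dx}\left(\dfrac{df}{dx}\right)\right|_{x_0} = 0$.
In this case we conclude that at a stationary point which is a minimum the value of the second
derivative $\dfrac{d^2 f}{dx^2}$ is either positive or zero.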
For the third possibility for a stationary point - a point of inflection - the graph of f(x) against x
and of $\dfrac{df}{dx}$ against x take one of two forms as shown in Figure 15.
[Figure 15: two cases, each showing f(x) and $\dfrac{df}{dx}$ against x near x0. In the first case $\dfrac{df}{dx} > 0$ both to the left and to the right of x0; in the second case $\dfrac{df}{dx} < 0$ both to the left and to the right of x0.]
For either of these cases $\left.\dfrac{d}{dx}\left(\dfrac{df}{dx}\right)\right|_{x_0} = 0$
The sketches and analysis of the shape of a curve y = f (x) in the near neighbourhood of stationary
points allow us to make the following important deduction:
Key Point 4

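If x0 locates a stationary point of f(x), so that $\left.\dfrac{df}{dx}\right|_{x_0} = 0$, then the stationary point
is a local minimum if $\left.\dfrac{d^2 f}{dx^2}\right|_{x_0} > 0$
is a local maximum if $\left.\dfrac{d^2 f}{dx^2}\right|_{x_0} < 0$
is inconclusive if $\left.\dfrac{d^2 f}{dx^2}\right|_{x_0} = 0$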
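As an illustration only, the test in Key Point 4 can be sketched in a few lines of Python. The function name `classify_stationary_point` and the tolerance `tol` are assumptions made for this sketch and are not part of the Workbook. The test function is the cubic $f(x) = x^3 - 1.5x^2 - 6x + 10$ from the earlier Task, whose second derivative is $6x - 3$.

```python
def classify_stationary_point(second_derivative_value, tol=1e-12):
    """Apply the second derivative test at a point where df/dx = 0.

    `tol` is an assumed numerical tolerance for treating a value as zero.
    """
    if second_derivative_value > tol:
        return "local minimum"
    if second_derivative_value < -tol:
        return "local maximum"
    return "inconclusive"


def d2f(x):
    # Second derivative of f(x) = x^3 - 1.5x^2 - 6x + 10
    return 6 * x - 3


# Stationary points of this cubic were found earlier at x = -1 and x = 2
results = {x0: classify_stationary_point(d2f(x0)) for x0 in (-1.0, 2.0)}
for x0, kind in results.items():
    print(f"x = {x0}: {kind}")
```

When the function returns "inconclusive" the sign of the first derivative on either side of the stationary point has to be examined instead.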
Example 3
Find the stationary points of the function $f(x) = x^3 - 6x$.
Are these stationary points local maxima or local minima?
Solution
$\dfrac{df}{dx} = 3x^2 - 6$. At a stationary point $\dfrac{df}{dx} = 0$ so $3x^2 - 6 = 0$, implying $x = \pm\sqrt{2}$.
Thus f(x) has stationary points at $x = \sqrt{2}$ and $x = -\sqrt{2}$. To decide if these are maxima or minima
we examine the value of the second derivative of f(x) at the stationary points.
$\dfrac{d^2 f}{dx^2} = 6x$ so $\left.\dfrac{d^2 f}{dx^2}\right|_{x=\sqrt{2}} = 6\sqrt{2} > 0$. Hence $x = \sqrt{2}$ locates a local minimum.
Similarly $\left.\dfrac{d^2 f}{dx^2}\right|_{x=-\sqrt{2}} = -6\sqrt{2} < 0$. Hence $x = -\sqrt{2}$ locates a local maximum.
A sketch of the curve confirms this analysis:
[Figure 16: the curve $f(x) = x^3 - 6x$, with $x = -\sqrt{2}$ and $x = \sqrt{2}$ marked on the x-axis]
Task
For the function f (x) = cos 2x, 0.1 ≤ x ≤ 6, find the positions of any local
minima or maxima and distinguish between them.
Calculate the first derivative and locate stationary points:
Your solution
$\dfrac{df}{dx} =$
Stationary points are located at:
Answer
$\dfrac{df}{dx} = -2\sin 2x$.
Hence stationary points are at values of x in the range specified for which sin 2x = 0 i.e. at 2x = π
or 2x = 2π or 2x = 3π (making sure x is within the range 0.1 ≤ x ≤ 6)
∴ Stationary points at $x = \dfrac{\pi}{2}$, $x = \pi$, $x = \dfrac{3\pi}{2}$
Now calculate the second derivative:

Your solution
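$\dfrac{d^2 f}{dx^2} =$
Answer
$\dfrac{d^2 f}{dx^2} = -4\cos 2x$
Finally: evaluate the second derivative at each stationary point and draw appropriate conclusions:
Your solution
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=\pi/2} =$
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=\pi} =$
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=3\pi/2} =$
Answer
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=\pi/2} = -4\cos\pi = 4 > 0$ ∴ $x = \dfrac{\pi}{2}$ locates a local minimum.
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=\pi} = -4\cos 2\pi = -4 < 0$ ∴ $x = \pi$ locates a local maximum.
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=3\pi/2} = -4\cos 3\pi = 4 > 0$ ∴ $x = \dfrac{3\pi}{2}$ locates a local minimum.
[Figure: the curve f(x) = cos 2x for 0.1 ≤ x ≤ 6, with the x-axis marked at 0.1, π/4, π/2, 3π/4, 3π/2 and 6]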
Task
Determine the local maxima and/or minima of the function $y = x^4 - \dfrac{1}{3}x^3$
First obtain the positions of the stationary points:
Your solution
$f(x) = x^4 - \dfrac{1}{3}x^3$, $\dfrac{df}{dx} =$
Thus $\dfrac{df}{dx} = 0$ when:
Answer
$\dfrac{df}{dx} = 4x^3 - x^2 = x^2(4x - 1)$ and $\dfrac{df}{dx} = 0$ when x = 0 or when x = 1/4
Now obtain the value of the second derivatives at the stationary points:
Your solution
$\dfrac{d^2 f}{dx^2} =$ ∴ $\left.\dfrac{d^2 f}{dx^2}\right|_{x=0} =$
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=1/4} =$
Answer
$\dfrac{d^2 f}{dx^2} = 12x^2 - 2x$ ∴ $\left.\dfrac{d^2 f}{dx^2}\right|_{x=0} = 0$, which is inconclusive.
$\left.\dfrac{d^2 f}{dx^2}\right|_{x=1/4} = \dfrac{12}{16} - \dfrac{1}{2} = \dfrac{1}{4} > 0$. Hence $x = \dfrac{1}{4}$ locates a local minimum.
Using this analysis we cannot decide whether the stationary point at x = 0 is a local maximum,
minimum or a point of inflection. However, just to the left of x = 0 the value of $\dfrac{df}{dx}$ (which equals
$x^2(4x - 1)$) is negative whilst just to the right of x = 0 the value of $\dfrac{df}{dx}$ is negative again. Hence
the stationary point at x = 0 is a point of inflection. This is confirmed by sketching the curve as
in Figure 17.
[Figure 17: the curve $f(x) = x^4 - \dfrac{1}{3}x^3$, with x = 1/4 marked on the x-axis and the value −0.0013 marked on the vertical axis]
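The sign argument used above for x = 0 can be sketched numerically. The following Python fragment is an illustration under stated assumptions: the helper name `classify_by_sign_change` and the step `h = 1e-3` are hypothetical choices for this sketch, and the helper assumes the derivative is non-zero at the two sample points either side of the stationary point.

```python
def dfdx(x):
    # First derivative of f(x) = x^4 - x^3/3, i.e. x^2 (4x - 1)
    return 4 * x**3 - x**2


def classify_by_sign_change(deriv, x0, h=1e-3):
    """Classify a stationary point from the sign of df/dx either side of it.

    `h` is an assumed small step; it must be small enough that no other
    stationary point lies within h of x0.
    """
    left, right = deriv(x0 - h), deriv(x0 + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "point of inflection"  # no change of sign


print(classify_by_sign_change(dfdx, 0.0))
print(classify_by_sign_change(dfdx, 0.25))
```

Unlike the second derivative test, this check still gives a verdict when the second derivative vanishes at the stationary point.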
Task
A materials store is to be constructed next to a 3 metre high stone wall (shown
as OA in the cross section in the diagram). The roof (AB) and front (BC) are
to be constructed from corrugated metal sheeting. Only 6 metre length sheets are
available. Each of them is to be cut into two parts such that one part is used for
the roof and the other is used for the front. Find the dimensions x, y of the store
that result in the maximum cross-sectional area. Hence determine the maximum
cross-sectional area.
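[Diagram: cross section of the store. The stone wall OA is 3 m high, the roof AB joins the top of the wall to the top of the front BC, which has height y, and the base OC has length x.]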
Your solution
Answer
Note that the store has the shape of a trapezium. So the cross-sectional area (A) of the store is
given by the formula: Area = average length of parallel sides × distance between parallel sides:
$A = \dfrac{1}{2}(y + 3)x$ (1)
The lengths x and y are related through the fact that AB + BC = 6, where BC = y and
$AB = \sqrt{x^2 + (3 - y)^2}$. Hence $\sqrt{x^2 + (3 - y)^2} + y = 6$. This equation can be rearranged in the
following way:
$\sqrt{x^2 + (3 - y)^2} = 6 - y \Leftrightarrow x^2 + (3 - y)^2 = (6 - y)^2$ i.e. $x^2 + 9 - 6y + y^2 = 36 - 12y + y^2$
which implies that $x^2 + 6y = 27$ (2)
It is necessary to eliminate either x or y from (1) and (2) to obtain an equation in a single variable.
Using x instead of y as the variable will avoid having square roots appearing in the expression for
the cross-sectional area. Hence from Equation (2)
$y = \dfrac{27 - x^2}{6}$ (3)
Substituting for y from Equation (3) into Equation (1) gives
$A = \dfrac{1}{2}\left(\dfrac{27 - x^2}{6} + 3\right)x = \dfrac{1}{2}\left(\dfrac{27 - x^2 + 18}{6}\right)x = \dfrac{1}{12}\left(45x - x^3\right)$ (4)
To find turning points, we evaluate $\dfrac{dA}{dx}$ from Equation (4) to get
$\dfrac{dA}{dx} = \dfrac{1}{12}(45 - 3x^2)$ (5)
Solving the equation $\dfrac{dA}{dx} = 0$ gives $\dfrac{1}{12}(45 - 3x^2) = 0 \Rightarrow 45 - 3x^2 = 0$
Hence $x = \pm\sqrt{15} = \pm 3.8730$. Only x > 0 is of interest, so
$x = \sqrt{15} = 3.8730$ (6)
gives the required turning point.
Check: Differentiating Equation (5) and using the positive x solution (6) gives
$\dfrac{d^2 A}{dx^2} = -\dfrac{6x}{12} = -\dfrac{x}{2} = -\dfrac{3.8730}{2} < 0$
Since the second derivative is negative then the cross-sectional area is a maximum. This is the only
turning point identified for A > 0 and it is identified as a maximum. To find the corresponding
value of y, substitute x = 3.8730 into Equation (3) to get $y = \dfrac{27 - 3.8730^2}{6} = 2.0000$
So the values of x and y that yield the maximum cross-sectional area are 3.8730 m and 2.0000
m respectively. To find the maximum cross-sectional area, substitute x = 3.8730 into Equation
(4) to get
$A = \dfrac{1}{12}(45 \times 3.8730 - 3.8730^3) = 9.6825$
So the maximum cross-sectional area of the store is 9.68 m$^2$ to 2 d.p.
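The result can be checked numerically. The short Python sketch below is illustrative only; the names `front_height`, `area`, `WALL_HEIGHT` and `SHEET_LENGTH` are assumptions introduced for the check. It evaluates the area at $x = \sqrt{15}$, confirms that the roof and front together use one 6 m sheet, and compares the area with its value slightly to either side.

```python
import math

WALL_HEIGHT = 3.0    # height of the stone wall OA, in metres
SHEET_LENGTH = 6.0   # length of one metal sheet, in metres


def front_height(x):
    """Equation (3): the front height y expressed in terms of x."""
    return (27 - x**2) / 6


def area(x):
    """Equation (1) with y eliminated using Equation (3)."""
    return 0.5 * (front_height(x) + WALL_HEIGHT) * x


x_best = math.sqrt(15)
y_best = front_height(x_best)
roof = math.hypot(x_best, WALL_HEIGHT - y_best)  # length of AB

print(f"x = {x_best:.4f} m, y = {y_best:.4f} m")
print(f"roof + front = {roof + y_best:.4f} m (sheet length {SHEET_LENGTH} m)")
print(f"area = {area(x_best):.4f} m^2")
# The area should be smaller a little to either side of x_best
print(area(x_best - 0.1) < area(x_best) > area(x_best + 0.1))
```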
Task
Equivalent resistance in an electrical circuit

Current distributes itself in the wires of an electrical circuit so as to minimise the total power
consumption i.e. the rate at which heat is produced. The power (p) dissipated in an electrical circuit
is given by the product of voltage (v) and current (i) flowing in the circuit, i.e. p = vi. The voltage
across a resistor is the product of current and resistance (r). This means that the power dissipated
in a resistor is given by $p = i^2 r$.
Suppose that an electrical circuit contains three resistors $r_1$, $r_2$, $r_3$ and $i_1$ represents the current
flowing through both $r_1$ and $r_2$, and that $(i - i_1)$ represents the current flowing through $r_3$ (see
diagram):
[Diagram: resistors R1 and R2 in series carrying current $i_1$, in parallel with resistor R3 carrying current $i - i_1$]
(a) Write down an expression for the power dissipated in the circuit:
Your solution
Answer
$p = i_1^2 r_1 + i_1^2 r_2 + (i - i_1)^2 r_3$
(b) Show that the power dissipated is a minimum when $i_1 = \dfrac{r_3}{r_1 + r_2 + r_3}\, i$:
Your solution
Answer
Differentiate result (a) with respect to $i_1$:
$\dfrac{dp}{di_1} = 2i_1 r_1 + 2i_1 r_2 + 2(i - i_1)(-1)r_3$
$= 2i_1(r_1 + r_2 + r_3) - 2ir_3$
This is zero when
$i_1 = \dfrac{r_3}{r_1 + r_2 + r_3}\, i$.
To check if this represents a minimum, differentiate again:
$\dfrac{d^2 p}{di_1^2} = 2(r_1 + r_2 + r_3)$
This is positive, so the previous result represents a minimum.
(c) If R is the equivalent resistance of the circuit, i.e. of $r_1$, $r_2$ and $r_3$, for minimum power dissipation
and the corresponding voltage V across the circuit is given by $V = iR = i_1(r_1 + r_2)$, show that
$R = \dfrac{(r_1 + r_2)r_3}{r_1 + r_2 + r_3}$.
Your solution
Answer
Substituting for $i_1$ in $iR = i_1(r_1 + r_2)$ gives
$iR = \dfrac{r_3(r_1 + r_2)}{r_1 + r_2 + r_3}\, i$.
So
$R = \dfrac{(r_1 + r_2)r_3}{r_1 + r_2 + r_3}$.
Note In this problem R1 and R2 could be replaced by a single resistor. However, treating them as
separate allows the possibility of considering more general situations (variable resistors or temperature
dependent resistors).
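A quick numerical check of results (b) and (c) is sketched below in Python. The resistor values and total current are arbitrary illustrative numbers, not taken from the Workbook, and the function names are assumptions. The check relies on the fact that the minimum power should equal $i^2 R$ when R is the equivalent resistance.

```python
def power(i1, i, r1, r2, r3):
    """Result (a): total power dissipated in the three resistors."""
    return i1**2 * r1 + i1**2 * r2 + (i - i1)**2 * r3


def optimal_branch_current(i, r1, r2, r3):
    """Result (b): the value of i1 that makes the power stationary."""
    return r3 / (r1 + r2 + r3) * i


def equivalent_resistance(r1, r2, r3):
    """Result (c)."""
    return (r1 + r2) * r3 / (r1 + r2 + r3)


# Illustrative values only (ohms and amperes)
i, r1, r2, r3 = 2.0, 10.0, 20.0, 30.0

i1 = optimal_branch_current(i, r1, r2, r3)
p_min = power(i1, i, r1, r2, r3)
R = equivalent_resistance(r1, r2, r3)

print(f"i1 = {i1}, minimum power = {p_min}, i^2 R = {i**2 * R}")
# Moving i1 slightly away from the optimum should increase the power
print(power(i1 - 0.01, i, r1, r2, r3) > p_min < power(i1 + 0.01, i, r1, r2, r3))
```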
Water wheel efficiency
Introduction
A water wheel is constructed with symmetrical curved vanes of angle of curvature θ. Assuming that
friction can be taken as negligible, the efficiency, η, i.e. the ratio of output power to input power, is
calculated as
$\eta = \dfrac{2(V - v)(1 + \cos\theta)v}{V^2}$
where V is the velocity of the jet of water as it strikes the vane, v is the velocity of the vane in the
direction of the jet and θ is constant. Find the ratio, v/V, which gives maximum efficiency and find
the maximum efficiency.
We need to express the efficiency in terms of a single variable so that we can find the maximum
value.
Efficiency $= \dfrac{2(V - v)(1 + \cos\theta)v}{V^2} = 2\,\dfrac{v}{V}\left(1 - \dfrac{v}{V}\right)(1 + \cos\theta)$.
Let η = Efficiency and $x = \dfrac{v}{V}$ then $\eta = 2x(1 - x)(1 + \cos\theta)$.
We must find the value of x which maximises η and we must find the maximum value of η. To do
this we differentiate η with respect to x and solve $\dfrac{d\eta}{dx} = 0$ in order to find the stationary points.
Now $\eta = 2x(1 - x)(1 + \cos\theta) = (2x - 2x^2)(1 + \cos\theta)$
So $\dfrac{d\eta}{dx} = (2 - 4x)(1 + \cos\theta)$
Now $\dfrac{d\eta}{dx} = 0 \Rightarrow 2 - 4x = 0 \Rightarrow x = \dfrac{1}{2}$ and the value of η when $x = \dfrac{1}{2}$ is
$\eta = 2\left(\dfrac{1}{2}\right)\left(1 - \dfrac{1}{2}\right)(1 + \cos\theta) = \dfrac{1}{2}(1 + \cos\theta)$.
This is clearly a maximum not a minimum, but to check we calculate $\dfrac{d^2\eta}{dx^2} = -4(1 + \cos\theta)$ which is
negative which provides confirmation.
Interpretation
Maximum efficiency occurs when $\dfrac{v}{V} = \dfrac{1}{2}$ and the maximum efficiency is given by
$\eta = \dfrac{1}{2}(1 + \cos\theta)$.
Refraction
The problem
A light ray is travelling in a medium (A) at speed $c_A$. The ray encounters an interface with a medium
(B) where the speed of light is $c_B$. Between two fixed points, P in medium A and Q in medium B,
find the path through the interface point O that minimizes the time of light travel (see Figure 18).
Express the result in terms of the angles of incidence and refraction at the interface and the velocities
of light in the two media.
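[Figure 18: Geometry of light rays at an interface. The ray runs from P in medium (A) through the interface point O to Q in medium (B); the lengths a, b, d and x and the angles θA and θB are marked.]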

The solution
The light ray path is shown as POQ in the above figure where O is a point with variable horizontal
position x. The points P and Q are fixed and their positions are determined by the constants a, b, d
indicated in the figure. The total path length can be decomposed as PO + OQ so the total time of
travel T(x) is given by
$T(x) = PO/c_A + OQ/c_B$. (1)
Expressing the distances PO and OQ in terms of the fixed coordinates a, b, d, and in terms of the
unknown position x, Equation (1) becomes
$T(x) = \dfrac{\sqrt{a^2 + x^2}}{c_A} + \dfrac{\sqrt{b^2 + (d - x)^2}}{c_B}$ (2)
It is assumed that the minimum of the travel time is given by the stationary point of T(x) such that
$\dfrac{dT}{dx} = 0$. (3)
Using the chain rule (see Section 11.5) to compute (3) given (2) leads to
$\dfrac{1}{2}\,\dfrac{2x}{c_A\sqrt{a^2 + x^2}} + \dfrac{1}{2}\,\dfrac{2x - 2d}{c_B\sqrt{b^2 + (d - x)^2}} = 0$.
After simplification and rearrangement
$\dfrac{x}{c_A\sqrt{a^2 + x^2}} = \dfrac{d - x}{c_B\sqrt{b^2 + (d - x)^2}}$.
Using the definitions $\sin\theta_A = \dfrac{x}{\sqrt{a^2 + x^2}}$ and $\sin\theta_B = \dfrac{d - x}{\sqrt{b^2 + (d - x)^2}}$ this can be written as
$\dfrac{\sin\theta_A}{c_A} = \dfrac{\sin\theta_B}{c_B}$. (4)
Note that $\theta_A$ and $\theta_B$ are the incidence angles measured from the interface normal as shown in the
figure. Equation (4) can be expressed as
$\dfrac{\sin\theta_A}{\sin\theta_B} = \dfrac{c_A}{c_B}$
which is the well-known law of refraction for geometrical optics and applies to many other kinds
of waves. The ratio $\dfrac{c_A}{c_B}$ is a constant called the refractive index of medium (B) with respect to
medium (A).
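The law of refraction can be checked numerically by locating the stationary point of T(x) directly. The Python sketch below is an illustration under stated assumptions: the geometry and speeds are arbitrary values chosen for the example, and the root of dT/dx is found by repeatedly halving the interval 0 ≤ x ≤ d, which works here because dT/dx is negative at x = 0 and positive at x = d.

```python
import math


def dT_dx(x, a, b, d, c_a, c_b):
    """Derivative of the travel time T(x) in Equation (2)."""
    return x / (c_a * math.hypot(a, x)) - (d - x) / (c_b * math.hypot(b, d - x))


def crossing_point(a, b, d, c_a, c_b, tol=1e-12):
    """Find x with dT/dx = 0 by interval halving on [0, d]."""
    lo, hi = 0.0, d
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dT_dx(mid, a, b, d, c_a, c_b) > 0:
            hi = mid   # stationary point lies to the left of mid
        else:
            lo = mid   # stationary point lies to the right of mid
    return 0.5 * (lo + hi)


# Illustrative values only (arbitrary units)
a, b, d, c_a, c_b = 1.0, 2.0, 3.0, 3.0, 2.0

x = crossing_point(a, b, d, c_a, c_b)
sin_a = x / math.hypot(a, x)
sin_b = (d - x) / math.hypot(b, d - x)

print(f"x = {x:.6f}")
print(f"sin(theta_A)/c_A = {sin_a / c_a:.9f}")
print(f"sin(theta_B)/c_B = {sin_b / c_b:.9f}")
```

The two printed ratios should agree to within the tolerance of the search, which is Equation (4).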
Fluid power transmission
Introduction
Power transmitted through fluid-filled pipes is the basis of hydraulic braking systems and other
hydraulic control systems. Suppose that power associated with a piston motion at one end of a
pipeline is transmitted by a fluid of density ρ moving with positive velocity V along a cylindrical
pipeline of constant cross-sectional area A. Assuming that the loss of power is mainly attributable to
friction and that the friction coefficient f can be taken to be a constant, then the power transmitted,
P is given by
$P = \rho g A(hV - cV^3)$,
where g is the acceleration due to gravity and h is the head which is the height of the fluid above
some reference level (= the potential energy per unit weight of the fluid). The constant $c = \dfrac{4fl}{2gd}$
where l is the length of the pipe and d is the diameter of the pipe. The power transmission efficiency
is the ratio of power output to power input.
Problem in words
Assuming that the head of the fluid, h, is a constant find the value of the fluid velocity, V , which
gives a maximum value for the output power P. Given that the input power is $P_i = \rho g A V h$, find
the maximum power transmission efficiency obtainable.
We are given that $P = \rho g A(hV - cV^3)$ and we want to find its maximum value and hence maximum
efficiency.
To find stationary points for P we solve $\dfrac{dP}{dV} = 0$.
To classify the stationary points we can differentiate again to find the value of $\dfrac{d^2 P}{dV^2}$ at each stationary
point and if this is negative then we have found a local maximum point. The maximum efficiency
is given by the ratio $P/P_i$ at this value of V and where $P_i = \rho g A V h$. Finally we should check that
this is the only maximum in the range of P that is of interest.
$P = \rho g A(hV - cV^3)$
$\dfrac{dP}{dV} = \rho g A(h - 3cV^2)$
$\dfrac{dP}{dV} = 0$ gives $\rho g A(h - 3cV^2) = 0$
$\Rightarrow V^2 = \dfrac{h}{3c} \Rightarrow V = \pm\sqrt{\dfrac{h}{3c}}$ and as V is positive $\Rightarrow V = \sqrt{\dfrac{h}{3c}}$.
To show this is a maximum we differentiate $\dfrac{dP}{dV}$ again giving $\dfrac{d^2 P}{dV^2} = \rho g A(-6cV)$. Clearly this is
negative, or zero if V = 0. Thus $V = \sqrt{\dfrac{h}{3c}}$ gives a local maximum value for P.
We note that P = 0 when $\rho g A(hV - cV^3) = 0$, i.e. when $hV - cV^3 = 0$, so V = 0 or
$V = \sqrt{\dfrac{h}{c}}$. So the maximum at $V = \sqrt{\dfrac{h}{3c}}$ is the only maximum in the range between V = 0 and $V = \sqrt{\dfrac{h}{c}}$.
The efficiency, E, is given by (output power/input power), so here
$E = \dfrac{\rho g A(hV - cV^3)}{\rho g A V h} = 1 - \dfrac{cV^2}{h}$
At $V = \sqrt{\dfrac{h}{3c}}$ then $V^2 = \dfrac{h}{3c}$ and therefore $E = 1 - \dfrac{c\,\frac{h}{3c}}{h} = 1 - \dfrac{1}{3} = \dfrac{2}{3}$ or $66\frac{2}{3}$%.
Interpretation
Maximum power is transmitted through the fluid when the velocity $V = \sqrt{\dfrac{h}{3c}}$ and the maximum
efficiency is $66\frac{2}{3}$%. Note that this result is independent of the friction and the maximum efficiency
is independent of the velocity and (static) pressure in the pipe.
[Figure 19: Graphs of transmitted power P(V) as a function of fluid velocity for two values of the head, h = 2 m and h = 3 m; the values 1.81 and 2.215 are marked at the maxima]
Figure 19 shows the maxima in the power transmission for two different values of the head in an oil
filled pipe (oil density 1100 kg m$^{-3}$) of inner diameter 0.01 m and coefficient of friction 0.01 and
pipe length 1 m.
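The velocities at which the maxima occur can be recomputed from the pipe data quoted for Figure 19. The Python sketch below is illustrative; it assumes g = 9.81 m s$^{-2}$ (the Workbook does not state the value used) and the function names are hypothetical. The computed velocities appear to correspond to the values 1.81 and 2.215 marked on the figure.

```python
import math

G = 9.81  # assumed acceleration due to gravity, m/s^2


def friction_constant(f, length, diameter):
    """The constant c = 4 f l / (2 g d)."""
    return 4 * f * length / (2 * G * diameter)


def optimal_velocity(h, c):
    """Velocity giving maximum transmitted power, V = sqrt(h / (3c))."""
    return math.sqrt(h / (3 * c))


def efficiency(v, h, c):
    """Transmission efficiency E = 1 - c V^2 / h."""
    return 1 - c * v**2 / h


c = friction_constant(f=0.01, length=1.0, diameter=0.01)
optimum = {h: optimal_velocity(h, c) for h in (2.0, 3.0)}

for h, v in optimum.items():
    print(f"h = {h} m: V = {v:.3f} m/s, efficiency = {efficiency(v, h, c):.4f}")
```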
Crank used to drive a piston
Introduction
A crank is used to drive a piston as in Figure 20.
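[Figure 20: Crank used to drive a piston. The crank of length r makes an angle θ and rotates with angular velocity ω; the crankpin C has velocity $v_c$ and centripetal acceleration $a_c = \omega^2 r$; the piston has velocity $v_p$ and acceleration $a_p$.]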

Problem
The angular velocity of the crankshaft is the rate of change of the angle θ, ω = dθ/dt. The piston
moves horizontally with velocity $v_p$ and acceleration $a_p$; r is the length of the crank and l is the length
of the connecting rod. The crankpin performs circular motion with a velocity of $v_c$ and centripetal
acceleration of $\omega^2 r$. The acceleration $a_p$ of the piston varies with θ and is related by
$a_p = \omega^2 r\left(\cos\theta + \dfrac{r\cos 2\theta}{l}\right)$
Find the maximum and minimum values of the acceleration $a_p$ when r = 150 mm and l = 375 mm.
We need to find the stationary values of $a_p = \omega^2 r\left(\cos\theta + \dfrac{r\cos 2\theta}{l}\right)$ when r = 150 mm and l = 375
mm. We do this by solving $\dfrac{da_p}{d\theta} = 0$ and then analysing the stationary points to decide whether they
are a maximum, minimum or point of inflexion.
Mathematical analysis.
$a_p = \omega^2 r\left(\cos\theta + \dfrac{r\cos 2\theta}{l}\right)$ so $\dfrac{da_p}{d\theta} = \omega^2 r\left(-\sin\theta - \dfrac{2r\sin 2\theta}{l}\right)$.
To find the maximum and minimum acceleration we need to solve
$\dfrac{da_p}{d\theta} = 0 \Leftrightarrow \omega^2 r\left(-\sin\theta - \dfrac{2r\sin 2\theta}{l}\right) = 0$.
$\sin\theta + \dfrac{2r}{l}\sin 2\theta = 0 \Leftrightarrow \sin\theta + \dfrac{4r}{l}\sin\theta\cos\theta = 0$

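$\Leftrightarrow \sin\theta\left(1 + \dfrac{4r}{l}\cos\theta\right) = 0$
$\Leftrightarrow \sin\theta = 0$ or $\cos\theta = -\dfrac{l}{4r}$ and as r = 150 mm and l = 375 mm
$\Leftrightarrow \sin\theta = 0$ or $\cos\theta = -\dfrac{5}{8}$
CASE 1: sin θ = 0
If sin θ = 0 then θ = 0 or θ = π. If θ = 0 then cos θ = cos 2θ = 1
so $a_p = \omega^2 r\left(\cos\theta + \dfrac{r\cos 2\theta}{l}\right) = \omega^2 r\left(1 + \dfrac{r}{l}\right) = \omega^2 r\left(1 + \dfrac{2}{5}\right) = \dfrac{7}{5}\omega^2 r$
If θ = π then cos θ = −1, cos 2θ = 1 so
$a_p = \omega^2 r\left(\cos\theta + \dfrac{r\cos 2\theta}{l}\right) = \omega^2 r\left(-1 + \dfrac{r}{l}\right) = \omega^2 r\left(-1 + \dfrac{2}{5}\right) = -\dfrac{3}{5}\omega^2 r$
In order to classify the stationary points, we differentiate $\dfrac{da_p}{d\theta}$ with respect to θ to find the second
derivative:
$\dfrac{d^2 a_p}{d\theta^2} = \omega^2 r\left(-\cos\theta - \dfrac{4r\cos 2\theta}{l}\right) = -\omega^2 r\left(\cos\theta + \dfrac{4r\cos 2\theta}{l}\right)$
At θ = 0 we get $\dfrac{d^2 a_p}{d\theta^2} = -\omega^2 r\left(1 + \dfrac{4r}{l}\right)$ which is negative.
So θ = 0 gives a maximum value and $a_p = \dfrac{7}{5}\omega^2 r$ is the value at the maximum.
At θ = π we get $\dfrac{d^2 a_p}{d\theta^2} = -\omega^2 r\left(-1 + \dfrac{4r}{l}\right) = -\dfrac{3}{5}\omega^2 r$ which is negative.
So θ = π gives a maximum value and $a_p = -\dfrac{3}{5}\omega^2 r$
CASE 2: $\cos\theta = -\dfrac{5}{8}$
If $\cos\theta = -\dfrac{5}{8}$ then $\cos 2\theta = 2\cos^2\theta - 1 = 2\left(\dfrac{5}{8}\right)^2 - 1$ so $\cos 2\theta = -\dfrac{7}{32}$.
$a_p = \omega^2 r\left(\cos\theta + \dfrac{r\cos 2\theta}{l}\right) = \omega^2 r\left(-\dfrac{5}{8} + \left(-\dfrac{7}{32}\right) \times \dfrac{2}{5}\right) = -\dfrac{57}{80}\omega^2 r$.
At $\cos\theta = -\dfrac{5}{8}$ we get $\dfrac{d^2 a_p}{d\theta^2} = \omega^2 r\left(-\cos\theta - \dfrac{4r\cos 2\theta}{l}\right) = \omega^2 r\left(\dfrac{5}{8} + \dfrac{4r}{l} \times \dfrac{7}{32}\right)$ which is positive.
So $\cos\theta = -\dfrac{5}{8}$ gives a minimum value and $a_p = -\dfrac{57}{80}\omega^2 r$
Thus the values of $a_p$ at the stationary points are:-
$\dfrac{7}{5}\omega^2 r$ (maximum), $-\dfrac{3}{5}\omega^2 r$ (maximum) and $-\dfrac{57}{80}\omega^2 r$ (minimum).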
So the overall maximum value is $1.4\omega^2 r = 0.21\omega^2$ and the minimum value is
$-0.7125\omega^2 r = -0.106875\omega^2$ where we have substituted r = 150 mm (= 0.15 m) and l = 375 mm
(= 0.375 m).
Interpretation
The maximum acceleration occurs when θ = 0 and $a_p = 0.21\omega^2$.
The minimum acceleration occurs when $\cos\theta = -\dfrac{5}{8}$ and $a_p = -0.106875\omega^2$.
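These extreme values can be confirmed by brute force. The Python sketch below is illustrative only: it evaluates the dimensionless ratio $a_p/(\omega^2 r)$ on a fine grid of angles over one revolution and compares the largest and smallest values with 7/5 and −57/80. The function name `accel_ratio` and the grid size are assumptions made for the check.

```python
import math


def accel_ratio(theta, r, l):
    """Dimensionless piston acceleration a_p / (omega^2 r)."""
    return math.cos(theta) + (r / l) * math.cos(2 * theta)


r, l = 0.150, 0.375  # crank and connecting rod lengths in metres

# Scan one full revolution on a fine grid of angles
n = 100_000
scan = [accel_ratio(2 * math.pi * k / n, r, l) for k in range(n)]

theta_min = math.acos(-l / (4 * r))  # angle where cos(theta) = -5/8

print(f"max ratio = {max(scan):.6f} (expected 7/5 = 1.4)")
print(f"min ratio = {min(scan):.6f} (expected -57/80 = -0.7125)")
print(f"ratio at cos(theta) = -5/8: {accel_ratio(theta_min, r, l):.6f}")
print(f"ratio at theta = pi: {accel_ratio(math.pi, r, l):.6f}")
```

The value at θ = π is a local maximum only; the scan shows that the overall maximum is the one at θ = 0.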
Exercises
1. Locate the stationary points of the following functions and distinguish among them as maxima,
minima and points of inflection.
(a) $f(x) = x - \ln|x|$. [Remember $\dfrac{d}{dx}(\ln|x|) = \dfrac{1}{x}$]
(b) $f(x) = x^3$
(c) $f(x) = \dfrac{(x - 1)}{(x + 1)(x - 2)}$, $-1 < x < 2$
2. A perturbation in the temperature of a stream leaving a chemical reactor follows a decaying
sinusoidal variation, according to
$T(t) = 5\exp(-at)\sin(\omega t)$
where a and ω are positive constants.
(a) Sketch the variation of temperature with time.
(b) By examining the behaviour of $\dfrac{dT}{dt}$, show that the maximum temperatures occur at times
of $\left(\tan^{-1}\left(\dfrac{\omega}{a}\right) + 2\pi n\right)/\omega$.
Answers
1. (a) $\dfrac{df}{dx} = 1 - \dfrac{1}{x} = 0$ when x = 1
$\dfrac{d^2 f}{dx^2} = \dfrac{1}{x^2}$, $\left.\dfrac{d^2 f}{dx^2}\right|_{x=1} = 1 > 0$
∴ x = 1, y = 1 locates a local minimum.
[Figure: the curve f(x) = x − ln|x|, with x = 1 marked on the x-axis]
(b) $\dfrac{df}{dx} = 3x^2 = 0$ when x = 0, $\dfrac{d^2 f}{dx^2} = 6x = 0$ when x = 0
However, $\dfrac{df}{dx} > 0$ on either side of x = 0 so (0, 0) is a point of inflection.
[Figure: the curve $f(x) = x^3$]
(c) $\dfrac{df}{dx} = \dfrac{(x + 1)(x - 2) - (x - 1)(2x - 1)}{[(x + 1)(x - 2)]^2}$
This is zero when (x + 1)(x − 2) − (x − 1)(2x − 1) = 0 i.e. $x^2 - 2x + 3 = 0$
However, this equation has no real roots (since $b^2 < 4ac$) and so f(x) has no stationary
points. The graph of this function confirms this:
[Figure: the curve f(x) for −1 < x < 2, with x = −1, 1 and 2 marked on the x-axis]
Nevertheless f(x) does have a point of inflection at x = 1, y = 0 as the graph shows,
although at that point $\dfrac{dy}{dx} \ne 0$.
Answer
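2. (a) [Figure: sketch of T against t]
(b) $\dfrac{dT}{dt} = 0$ implies $\tan\omega t = \dfrac{\omega}{a}$, so $\tan\omega t > 0$ and
$\omega t = \tan^{-1}\left(\dfrac{\omega}{a}\right) + k\pi$, k integer
Examination of $\dfrac{d^2 T}{dt^2}$ reveals that only even values of k give $\dfrac{d^2 T}{dt^2} < 0$ for a maximum so
setting k = 2n gives the required answer.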
The Newton-Raphson
Method 12.3
Introduction
This Section is concerned with the problem of “root location”; i.e. finding those values of x which
satisfy an equation of the form f (x) = 0. An initial estimate of the root is found (for example by
drawing a graph of the function). This estimate is then improved using a technique known as the
Newton-Raphson method, which is based upon a knowledge of the tangent to the curve near the
root. It is an “iterative” method in that it can be used repeatedly to continually improve the accuracy
of the root.

Prerequisites
• be able to differentiate simple functions
• be able to sketch graphs

Learning Outcomes
On completion you should be able to . . .
• distinguish between simple and multiple roots
• estimate the root of an equation by drawing a graph
• employ the Newton-Raphson method to improve the accuracy of a root
1. The Newton-Raphson method

We first remind the reader of some basic notation: If f (x) is a given function the value of x for
which f (x) = 0 is called a root of the equation or zero of the function. We also distinguish between
various types of roots: simple roots and multiple roots. Figures 21 - 23 illustrate some common
examples.
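[Figure 21: a curve y = f(x) with a simple root at x0. Figure 22: the curve $y = (x - 1)^2$, with a double root at x = 1. Figure 23: the curve $y = (x - 2)^3$, with a triple root at x = 2.]
More precisely; a root x0 is said to be:
a simple root if $f(x_0) = 0$ and $\left.\dfrac{df}{dx}\right|_{x_0} \ne 0$.
a double root if $f(x_0) = 0$, $\left.\dfrac{df}{dx}\right|_{x_0} = 0$ and $\left.\dfrac{d^2 f}{dx^2}\right|_{x_0} \ne 0$, and so on.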
In this Section we shall concentrate on the location of simple roots of a given function f (x).
Task
Given graphs of the functions (a) $f(x) = x^3 - 3x^2 + 4$, (b) $f(x) = 1 + \sin x$
classify the roots into simple or multiple.
Your solution
(a) $f(x) = x^3 - 3x^2 + 4$: The negative root is:          and the positive root is:
[Figure: graph of $y = x^3 - 3x^2 + 4$, with x = 2 marked on the x-axis]
Answer
The negative root is simple and the positive root is double.
Your solution
(b) $f(x) = 1 + \sin x$: Each root is a          root
[Figure: graph of y = 1 + sin x]
Answer
Each root is a double root.
Section 12.3: The Newton-Raphson Method
2. Finding roots of the equation f (x) = 0
A first investigation into the roots of f (x) might be graphical. Such an analysis will supply information
as to the approximate location of the roots.
Task
Sketch the function
$f(x) = x - 2 + \ln x$, x > 0
and estimate the value of the root.
Your solution
[Blank axes for the sketch: x-axis marked at 1 and 2]
An estimate of the root is:
Answer
[Figure: graph of y = x − 2 + ln x, with x = 1 and x = 2 marked on the x-axis]
A simple root is located near 1.5
One method of obtaining a better approximation is to halve the interval 1 ≤ x ≤ 2 into 1 ≤ x ≤ 1.5
and 1.5 ≤ x ≤ 2 and test the sign of the function at the end-points of these new regions. We find
x       f(x)
1       < 0
1.5     < 0
2       > 0
so a root must lie between x = 1.5 and x = 2 because the sign of f(x) changes between these
values and f(x) is a continuous curve. We can repeat this procedure and divide the interval (1.5, 2)
into the two new intervals (1.5, 1.75) and (1.75, 2) and test again. This time we find
x       f(x)
1.5     < 0
1.75    > 0
2.0     > 0
so a root lies in the interval (1.5, 1.75). It is obvious that proceeding in this way will give a smaller
and smaller interval in which the root must lie. But can we do better than this rather laborious
bisection procedure? In fact there are many ways to improve this numerical search for the root. In
this Section we examine one of the best methods: the Newton-Raphson method.
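Before moving on, the interval-halving (bisection) procedure just described can be sketched in Python for comparison. This is a minimal illustration, not part of the Workbook: the function name `bisect` and the tolerance are assumptions, and the sketch requires the function to change sign over the starting interval.

```python
import math


def f(x):
    return x - 2 + math.log(x)


def bisect(func, lo, hi, tol=1e-6):
    """Halve [lo, hi] repeatedly, keeping the half where func changes sign."""
    if func(lo) * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    steps = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(lo) * func(mid) <= 0:
            hi = mid   # sign change is in the left half
        else:
            lo = mid   # sign change is in the right half
        steps += 1
    return 0.5 * (lo + hi), steps


root, steps = bisect(f, 1.0, 2.0)
print(f"root = {root:.5f} after {steps} halvings")
```

Each halving gains only one binary digit of accuracy, which is why the procedure is described above as laborious.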
To derive the method we examine the general characteristics of a curve in the neighbourhood of a
simple root. Consider Figure 24 showing a function f (x) with a simple root at x = x∗ whose value
is required. Initial analysis has indicated that the root is approximately located at x = x0 . The aim
is to provide a better estimate to the location of the root.
[Figure 24: a curve y = f(x) crossing the x-axis at the root x∗, with the initial estimate x0 marked nearby]
The basic premise of the Newton-Raphson method is the assumption that the curve in the close
neighbourhood of the simple root at x∗ is approximately a straight line. Hence if we draw the
tangent to the curve at x0 , this tangent will intersect the x-axis at a point closer to x∗ than is x0 :
see Figure 25.
[Figure 25: the tangent to the curve at the point R, where x = x0, meets the x-axis at P, where x = x1, at an angle θ; Q is the point on the x-axis below R, so that RQ = f(x0); the root x∗ lies between x1 and x0]
From the geometry of this diagram we see that
$x_1 = x_0 - PQ$
But from the right-angled triangle PQR we have
$\dfrac{RQ}{PQ} = \tan\theta = f'(x_0)$
and so $PQ = \dfrac{RQ}{f'(x_0)} = \dfrac{f(x_0)}{f'(x_0)}$ ∴ $x_1 = x_0 - \dfrac{f(x_0)}{f'(x_0)}$
If f(x) has a simple root near x0 then a closer estimate to the root is x1 where
$x_1 = x_0 - \dfrac{f(x_0)}{f'(x_0)}$
This formula can be used iteratively to get closer and closer to the root, as summarised in Key Point
5:
Key Point 5
Newton-Raphson Method
If f(x) has a simple root near $x_n$ then a closer estimate to the root is $x_{n+1}$ where
$x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$
This is the Newton-Raphson iterative formula. The iteration is begun with an initial estimate
of the root, $x_0$, and continued to find $x_1, x_2, \ldots$ until a suitably accurate estimate of the position
of the root is obtained. This is judged by the convergence of $x_1, x_2, \ldots$ to a fixed value.
Example 4
$f(x) = x - 2 + \ln x$ has a root near x = 1.5. Use the Newton-Raphson method
to obtain a better estimate.
Solution
Here $x_0 = 1.5$, $f(1.5) = -0.5 + \ln(1.5) = -0.0945$
$f'(x) = 1 + \dfrac{1}{x}$ ∴ $f'(1.5) = 1 + \dfrac{1}{1.5} = \dfrac{5}{3}$
Hence using the formula:
$x_1 = 1.5 - \dfrac{(-0.0945)}{(1.6667)} = 1.5567$
The Newton-Raphson formula can be used again: this time beginning with 1.5567 as our estimate:
$x_2 = x_1 - \dfrac{f(x_1)}{f'(x_1)} = 1.5567 - \dfrac{f(1.5567)}{f'(1.5567)} = 1.5567 - \dfrac{\{1.5567 - 2 + \ln(1.5567)\}}{\left\{1 + \dfrac{1}{1.5567}\right\}}$
$= 1.5567 - \dfrac{\{-0.0007\}}{\{1.6424\}} = 1.5571$
This is in fact the correct value of the root to 4 d.p., which calculating x3 would confirm.
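The iteration in Key Point 5 is straightforward to automate. The Python sketch below is a minimal illustration, not a definitive implementation: the function name `newton_raphson`, the stopping tolerance and the iteration limit are assumptions, and the stopping rule (successive estimates agreeing to within `tol`) is one common way of judging convergence.

```python
import math


def newton_raphson(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is below tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x = x - step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


root = newton_raphson(
    f=lambda x: x - 2 + math.log(x),
    f_prime=lambda x: 1 + 1 / x,
    x0=1.5,
)
print(f"root = {root:.4f}")
```

The sketch assumes a simple root, so that the derivative does not vanish near the root; it makes no attempt to guard against division by a very small derivative.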
Task
The function $f(x) = x - \tan x$ has a simple root near x = 4.5. Use one iteration
of the Newton-Raphson method to find a more accurate value for the root.
First find $\dfrac{df}{dx}$:
Your solution
$\dfrac{df}{dx} =$
Answer
$\dfrac{df}{dx} = 1 - \sec^2 x = -\tan^2 x$
Now use the formula $x_1 = x_0 - f(x_0)/f'(x_0)$ with $x_0 = 4.5$ to obtain $x_1$:
Your solution
$f(4.5) = 4.5 - \tan(4.5) =$
$f'(4.5) = 1 - \sec^2(4.5) = -\tan^2(4.5) =$
$x_1 = 4.5 - \dfrac{f(4.5)}{f'(4.5)} =$
Answer
$f(4.5) = -0.1373$, $f'(4.5) = -21.5048$
∴ $x_1 = 4.5 - \dfrac{0.1373}{21.5048} = 4.4936$.
As the value of $x_1$ has changed little from $x_0 = 4.5$ we can expect the root to be 4.49 to 2 d.p.
Task
Sketch the function $f(x) = x^3 - x + 3$ and confirm that there is a simple root
between x = −2 and x = −1. Use $x_0 = -2$ as an initial estimate to obtain the
value to 2 d.p.
First sketch $f(x) = x^3 - x + 3$ and identify a root:
Your solution
[Blank axes for the sketch: x-axis marked at −3, −2, −1, 1 and 2]
Answer
[Figure: graph of $y = x^3 - x + 3$, with the x-axis marked at −3, −2, −1, 1 and 2]
Clearly a simple root lies between x = −2 and x = −1.
Now use one iteration of Newton-Raphson to improve the estimate of the root using $x_0 = -2$:
Your solution
$f(x) =$          $f'(x) =$          $x_0 =$
$x_1 = x_0 - \dfrac{f(x_0)}{f'(x_0)} =$
Answer
$f(x) = x^3 - x + 3$, $f'(x) = 3x^2 - 1$, $x_0 = -2$
∴ $x_1 = -2 - \dfrac{\{-8 + 2 + 3\}}{11} = -2 + \dfrac{3}{11} = -1.727$
Now repeat this process for a second iteration using $x_1 = -1.727$:
Your solution
$x_2 = x_1 - f(x_1)/f'(x_1) =$
Answer
$x_2 = -1.727 - \{-(1.727)^3 + 1.727 + 3\}/\{3(1.727)^2 - 1\}$
$= -1.727 + \{0.424\}/\{7.948\} = -1.674$
Repeat for a third iteration and state the root to 2 d.p.:
Your solution
$x_3 = x_2 - f(x_2)/f'(x_2) =$
Answer
$x_3 = -1.674 - \{-(1.674)^3 + 1.674 + 3\}/\{3(1.674)^2 - 1\}$
$= -1.674 + \{0.017\}/\{7.407\} = -1.672$
We conclude the value of the simple root is −1.67 correct to 2 d.p.
Buckling of a strut
The equation governing the buckling load P of a strut with one end fixed and the other end simply
supported is given by $\tan\mu L = \mu L$ where $\mu = \sqrt{\dfrac{P}{EI}}$, L is the length of the strut and EI is the
flexural rigidity of the strut. For safe design it is important that the load applied to the strut is less
than the lowest buckling load. This equation has no exact solution and we must therefore use the
method described in this Workbook to find the lowest buckling load P.
[Figure 26: a strut of length L under the load P, showing its deflected shape]
We let µL = x and so we need to solve the equation tan x = x. Before starting to apply the Newton-
Raphson iteration we must first obtain an approximate solution by plotting graphs of y = tan x and
y = x using the same axes.
[Figure: graphs of y = tan x and y = x on the same axes, with the x-axis marked at 0, π/2, π and 3π/2]
From the graph it can be seen that the solution is near to but below x = 3π/2 (∼ 4.7). We therefore
start the Newton-Raphson iteration with a value $x_0 = 4.5$.
The equation is rewritten as tan x − x = 0. Let $f(x) = \tan x - x$ then $f'(x) = \sec^2 x - 1 = \tan^2 x$
The Newton-Raphson iteration is $x_{n+1} = x_n - \dfrac{\tan x_n - x_n}{\tan^2 x_n}$, $x_0 = 4.5$
so $x_1 = 4.5 - \dfrac{\tan(4.5) - 4.5}{\tan^2 4.5} = 4.5 - \dfrac{0.137332}{21.504847} = 4.493614$ to 7 sig.fig.
Iterating again with this value:
$x_2 = 4.493614 - \dfrac{\tan(4.493614) - 4.493614}{\tan^2 4.493614} = 4.493614 - \dfrac{0.004132}{20.229717} = 4.493410$ to 7 sig.fig.
So we conclude that the value of x is 4.493 to 4 sig.fig. As $x = \mu L = \sqrt{P/EI}\, L$ we find, after
re-arrangement, that the smallest buckling load is given by $P = 20.19\,\dfrac{EI}{L^2}$.
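The iterates above can be reproduced with a few lines of Python. This is an illustrative sketch: the number of iterations (six) is an arbitrary choice that is more than enough for the estimates to settle, and the variable names are assumptions.

```python
import math


def f(x):
    return math.tan(x) - x


def f_prime(x):
    # d/dx (tan x - x) = sec^2 x - 1 = tan^2 x
    return math.tan(x) ** 2


x = 4.5
iterates = [x]
for _ in range(6):
    x = x - f(x) / f_prime(x)
    iterates.append(x)

for n, value in enumerate(iterates):
    print(f"x_{n} = {value:.6f}")

# Since x = sqrt(P/EI) * L, the buckling load is P = x^2 * EI / L^2
coefficient = x ** 2
print(f"P = {coefficient:.2f} EI / L^2")
```

The starting value matters here: tan x has an asymptote at x = 3π/2, so an initial estimate on the wrong side of it would not lead to this root.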
Exercises
1. By sketching the function f (x) = x − 1 − sin x show that there is a simple root near x = 2.
Use two iterations of the Newton-Raphson method to obtain a better estimate of the root.
2. Obtain an estimation accurate to 2 d.p. of the point of intersection of the curves y = x − 1

and y = cos x.
Answers
1. x0 = 2, x1 = 1.936, x2 = 1.935
2. The curves intersect when x − 1 − cos x = 0. Solve this using the Newton-Raphson method
with initial estimate (say) x0 = 1.2.
The point of intersection is (1.28342, 0.283437) to 6 significant figures.

Curvature 12.4
Introduction
Curvature is a measure of how sharply a curve is turning. At a particular point along the curve a
tangent line can be drawn; this tangent line making an angle ψ with the positive x-axis. Curvature
is then defined as the magnitude of the rate of change of ψ with respect to the measure of length
on the curve - the arc length s. That is

$\text{Curvature} = \left|\dfrac{d\psi}{ds}\right|$
In this Section we examine the concept of curvature and, from its definition, obtain more useful
expressions for curvature when the equation of the curve is expressed either in Cartesian form y = f(x)
or in parametric form x = x(t), y = y(t). We show that a circle has a constant value for the
curvature, which is to be expected, as the tangent line to a circle turns equally quickly irrespective
of the position on the circle. For all curves other than a circle, the curvature will
depend upon position, changing its value as the curve twists and turns.
Prerequisites
• understand the geometrical interpretation of the derivative
• be able to differentiate standard functions
• be able to use the parametric description of a curve

Learning Outcomes
On completion you should be able to . . .
• understand the concept of curvature
• calculate curvature when the curve is defined in Cartesian form or in parametric form

1. Curvature
Curvature is a measure of how quickly a tangent line turns as the contact point moves along a curve.
For example, consider a simple parabola, with equation y = x2 . Its graph is shown in Figure 27.
Figure 27: the parabola $y = x^2$, with the points P, Q and R marked on the curve
It is obvious, geometrically, that the tangent lines to this curve turn ‘more quickly’ between P and
Q than between Q and R. It is the purpose of this Section to give a quantitative measure of this
rate of ‘turning’.
If we change from a parabola to a circle, (centred on the origin, of radius 1), we can again consider
how quickly the tangent lines turn as we move along the curve. See Figure 28. It is immediately
clear that the tangent lines to a circle turn equally quickly no matter where located on the circle.
Figure 28
However, if we consider two circles with the same centre but different radii, as in Figure 29, it is
again obvious that the smaller circle ‘bends’ more tightly than the larger circle and we say it has a
larger curvature. Athletes who run the 200 metres find it easier to run in the outside lanes (where
the curve turns less sharply) than in the inside lanes.
Figure 29: two circles with the same centre, with P and Q on the inner circle and P′ and Q′ on the outer circle
On the two circle diagram (Figure 29) we have drawn tangent lines at P and P′; both lines make
an angle ψ (Greek letter psi) with the positive x-axis. We need to measure how quickly the angle
ψ changes as we move along the curve. As we move from P to Q (inner circle), or from P′ to Q′
(outer circle), the angle ψ changes by the same amount. However, the distance traversed on the
inner circle is less than the distance traversed on the outer circle. This suggests that a measure of
curvature is:
curvature is the magnitude of the rate of change of ψ
with respect to the distance moved along the curve.
We shall denote the curvature by the Greek letter κ (kappa).
So
\[ \kappa = \left| \frac{d\psi}{ds} \right| \]
where s is the measure of arc-length along a curve. This rather odd-looking derivative needs
converting to involve the variable x if the equation of the curve is given in the usual form y = f(x). As
a preliminary we note that
\[ \frac{d\psi}{ds} = \frac{d\psi}{dx} \Big/ \frac{ds}{dx} \]
We now obtain expressions for the derivatives $\dfrac{d\psi}{dx}$ and $\dfrac{ds}{dx}$ in terms of the derivatives of f(x).
Consider Figure 30 below.
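Figure 30: a small triangle on the curve with sides δx and δy and hypotenuse δs, the tangent making an angle ψ with the x-axis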
Small increments in the x- and y-directions have been denoted by δx and δy respectively. The
hypotenuse on this ‘small triangle’ is δs which is the change in arc-length along the curve.
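From Pythagoras’ theorem:
\[ \delta s^2 = \delta x^2 + \delta y^2 \]
so
\[ \left(\frac{\delta s}{\delta x}\right)^2 = 1 + \left(\frac{\delta y}{\delta x}\right)^2 \quad \text{so that} \quad \frac{\delta s}{\delta x} = \sqrt{1 + \left(\frac{\delta y}{\delta x}\right)^2} \]
In the limit as the increments get smaller and smaller, we write this relation in derivative form:
\[ \frac{ds}{dx} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \]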
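However, as y = f(x) is the equation of the curve we obtain
\[ \frac{ds}{dx} = \sqrt{1 + \left(\frac{df}{dx}\right)^2} = \left(1 + [f'(x)]^2\right)^{1/2} \]
We also know the relation between the angle ψ and the derivative $\dfrac{df}{dx}$:
\[ \frac{df}{dx} = \tan\psi \]
so differentiating again:
\[ \frac{d^2 f}{dx^2} = \sec^2\psi\,\frac{d\psi}{dx} = (1 + \tan^2\psi)\,\frac{d\psi}{dx} = \left(1 + [f'(x)]^2\right)\frac{d\psi}{dx} \]
Inverting this relation:
\[ \frac{d\psi}{dx} = \frac{f''(x)}{1 + [f'(x)]^2} \]
and so, finally, the curvature is given by
\[ \kappa = \left|\frac{d\psi}{ds}\right| = \left|\frac{d\psi}{dx} \Big/ \frac{ds}{dx}\right| = \frac{|f''(x)|}{\left(1 + [f'(x)]^2\right)^{3/2}} \]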
Key Point 6
Curvature
At each point on a curve, with equation y = f(x), the tangent line turns at a certain rate.
A measure of this rate of turning is the curvature κ defined by
\[ \kappa = \frac{|f''(x)|}{\left(1 + [f'(x)]^2\right)^{3/2}} \]
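Task
Obtain the curvature of the parabola $y = x^2$.
First calculate the derivatives of f(x):
Your solution
$f(x) = \qquad \dfrac{df}{dx} = \qquad \dfrac{d^2 f}{dx^2} =$
Answer
\[ f(x) = x^2 \qquad \frac{df}{dx} = 2x \qquad \frac{d^2 f}{dx^2} = 2 \]
Now find an expression for the curvature:
Your solution
κ =
Answer
\[ \kappa = \frac{|f''(x)|}{\left[1 + [f'(x)]^2\right]^{3/2}} = \frac{2}{\left[1 + 4x^2\right]^{3/2}} \]
Finally, plot the curvature κ as a function of x:

Your solution
(axes provided: κ against x for −3 ≤ x ≤ 3)
Answer
(plot of κ against x for −3 ≤ x ≤ 3, with greatest value κ = 2 at x = 0)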
The figure above supports what we have already argued:

• Close to x = 0 the parabola turns sharply (near x = 0 the curvature κ is relatively large).
• Further away from x = 0 the curve is more ‘gentle’ (in these regions κ is small).
In general, the curvature κ is a function of position. However, from what we have said earlier, we
expect the curvature to be a constant for a given circle but to increase as the radius of the circle
decreases. This can now be checked directly.
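The behaviour of κ for the parabola can also be tabulated directly from Key Point 6. This is a minimal sketch under stated assumptions: the helper names `curvature_cartesian` and `parabola_curvature` are illustrative, and the derivatives $f'(x) = 2x$, $f''(x) = 2$ are entered by hand rather than computed.

```python
def curvature_cartesian(f1, f2):
    """Curvature |f''| / (1 + f'^2)^(3/2) from derivative values f1 = f', f2 = f''."""
    return abs(f2) / (1.0 + f1 ** 2) ** 1.5


def parabola_curvature(x):
    # For y = x^2 the derivatives are f'(x) = 2x and f''(x) = 2.
    return curvature_cartesian(2.0 * x, 2.0)


# Tabulate the curvature at a few positions along the parabola.
for x in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"x = {x:3.1f}   kappa = {parabola_curvature(x):.4f}")
```

The printed values fall steadily as |x| grows, which is the numerical counterpart of the curve becoming more ‘gentle’ away from the origin.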
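Example 5
Find the curvature of $y = (a^2 - x^2)^{1/2}$ (this is the equation of the upper half of a
circle centred at the origin of radius a).
Solution
Here $f(x) = (a^2 - x^2)^{1/2}$
\[ \frac{df}{dx} = \frac{-x}{(a^2 - x^2)^{1/2}} \qquad \frac{d^2 f}{dx^2} = \frac{-a^2}{(a^2 - x^2)^{3/2}} \]
\[ \therefore \quad 1 + [f'(x)]^2 = 1 + \frac{x^2}{a^2 - x^2} = \frac{a^2}{a^2 - x^2} \]
\[ \therefore \quad \kappa = \frac{\left|\dfrac{-a^2}{(a^2 - x^2)^{3/2}}\right|}{\left[\dfrac{a^2}{a^2 - x^2}\right]^{3/2}} = \frac{a^2}{a^3} = \frac{1}{a} \]
For a circle of radius a, the curvature is constant, with value $\dfrac{1}{a}$.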
The value of κ (at any particular point on the curve, i.e. at a particular value of x) indicates how
sharply the curve is turning. What this result states is that, for a circle, the curvature is inversely
related to the radius. The bigger the radius, the smaller the curvature; precisely what we predicted.
2. Curvature for parametrically defined curves

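An expression for the curvature is also available if the curve is described parametrically:
\[ x = g(t) \qquad y = h(t) \qquad t_0 \le t \le t_1 \]
We remember the basic formulae connecting derivatives
\[ \frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} \qquad \frac{d^2 y}{dx^2} = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^3} \]
where, as usual $\dot{x} \equiv \dfrac{dx}{dt}$, $\ddot{x} \equiv \dfrac{d^2 x}{dt^2}$ etc.
Then
\[ \kappa = \frac{|f''(x)|}{\left\{1 + [f'(x)]^2\right\}^{3/2}} = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{\left|\dot{x}^3\right| \left[1 + \left(\dfrac{\dot{y}}{\dot{x}}\right)^2\right]^{3/2}} = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{\left[\dot{x}^2 + \dot{y}^2\right]^{3/2}} \]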
Key Point 7
The formula for curvature in parametric form is
\[ \kappa = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{\left[\dot{x}^2 + \dot{y}^2\right]^{3/2}} \]
Task
An ellipse is described parametrically by the equations
\[ x = 2\cos t \qquad y = \sin t \qquad 0 \le t \le 2\pi \]
Obtain an expression for the curvature κ and find where the curvature is a maximum
or a minimum.
First find $\dot{x}$, $\dot{y}$, $\ddot{x}$, $\ddot{y}$:

Your solution
$\dot{x} = \qquad \dot{y} = \qquad \ddot{x} = \qquad \ddot{y} =$
Answer
\[ \dot{x} = -2\sin t \qquad \dot{y} = \cos t \qquad \ddot{x} = -2\cos t \qquad \ddot{y} = -\sin t \]
Now find κ:
Your solution
κ =
Answer
\[ \kappa = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{\left[\dot{x}^2 + \dot{y}^2\right]^{3/2}} = \frac{|2\sin^2 t + 2\cos^2 t|}{\left[4\sin^2 t + \cos^2 t\right]^{3/2}} = \frac{2}{\left[1 + 3\sin^2 t\right]^{3/2}} \]
Find maximum and minimum values of κ by inspection of the expression for κ:
Your solution
max κ =          min κ =
Answer
Denominator is max when t = π/2. This gives minimum value of κ = 1/4.
Denominator is min when t = 0. This gives maximum value of κ = 2.
(Figure: the ellipse, with the points of minimum and maximum κ marked)
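The maximum and minimum found by inspection can be confirmed by sampling Key Point 7 around the ellipse. A minimal sketch, assuming hand-entered derivatives of x = 2 cos t, y = sin t; the helper names and the 1-degree sampling step are illustrative choices, not part of the Workbook.

```python
import math


def curvature_parametric(xd, yd, xdd, ydd):
    """Curvature |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2) for a parametric curve."""
    return abs(xd * ydd - yd * xdd) / (xd ** 2 + yd ** 2) ** 1.5


def ellipse_curvature(t):
    # x = 2 cos t, y = sin t, differentiated once and twice with respect to t.
    xd, yd = -2.0 * math.sin(t), math.cos(t)
    xdd, ydd = -2.0 * math.cos(t), -math.sin(t)
    return curvature_parametric(xd, yd, xdd, ydd)


# Sample the curvature once per degree around the ellipse.
samples = [2.0 * math.pi * k / 360 for k in range(360)]
values = [ellipse_curvature(t) for t in samples]
kappa_max, kappa_min = max(values), min(values)
print(kappa_max, kappa_min)
```

The extreme values occur at the ends of the major axis (sharpest turning) and the ends of the minor axis (gentlest turning), in agreement with the answer above.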
Differentiation of
Vectors 12.5
Introduction
The area of mathematics known as vector calculus is used to model mathematically a vast range of
engineering phenomena including electrostatics, electromagnetic fields, air flow around aircraft and
heat flow in nuclear reactors. In this Section we introduce briefly the differential calculus of vectors.
Prerequisites
• have a knowledge of vectors, in Cartesian form
• be able to calculate the scalar and vector products of two vectors
• be able to differentiate and integrate scalar functions

Learning Outcomes
• differentiate vectors

1. Differentiation of vectors
Consider Figure 31.
Figure 31: the position vector r of a point P on a curve C
If r represents the position vector of an object which is moving along a curve C, then the position
vector will be dependent upon the time, t. We write r = r(t) to show the dependence upon time.
Suppose that the object is at the point P , with position vector r at time t and at the point Q, with
position vector r(t + δt), at the later time t + δt, as shown in Figure 32.
Figure 32: the position vectors r(t) and r(t + δt) of the points P and Q, and the displacement vector $\overrightarrow{PQ}$
Then $\overrightarrow{PQ}$ represents the displacement vector of the object during the interval of time δt. The length
of the displacement vector represents the distance travelled, and its direction gives the direction of
motion. The average velocity during the time from t to t + δt is defined as the displacement vector
divided by the time interval δt, that is,
\[ \text{average velocity} = \frac{\overrightarrow{PQ}}{\delta t} = \frac{\mathbf{r}(t + \delta t) - \mathbf{r}(t)}{\delta t} \]
If we now take the limit as the interval of time δt tends to zero then the expression on the right
hand side is the derivative of r with respect to t. Not surprisingly we refer to this derivative as
the instantaneous velocity, v. By its very construction we see that the velocity vector is always
tangential to the curve as the object moves along it. We have:
\[ \mathbf{v} = \lim_{\delta t \to 0} \frac{\mathbf{r}(t + \delta t) - \mathbf{r}(t)}{\delta t} = \frac{d\mathbf{r}}{dt} \]
Now, since the x and y coordinates of the object depend upon time, we can write the position vector
r in Cartesian coordinates as:
r(t) = x(t)i + y(t)j
Therefore,
r(t + δt) = x(t + δt)i + y(t + δt)j
so that,
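\[ \mathbf{v}(t) = \lim_{\delta t \to 0} \frac{x(t + \delta t)\mathbf{i} + y(t + \delta t)\mathbf{j} - x(t)\mathbf{i} - y(t)\mathbf{j}}{\delta t} \]
\[ = \lim_{\delta t \to 0} \left\{ \frac{x(t + \delta t) - x(t)}{\delta t}\,\mathbf{i} + \frac{y(t + \delta t) - y(t)}{\delta t}\,\mathbf{j} \right\} \]
\[ = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} \]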
This is often abbreviated to v = ṙ = ẋi + ẏj, using notation for derivatives with respect to time.
So we see that the velocity vector is the derivative of the position vector with respect to time. This
result generalizes in an obvious way to three dimensions as summarized in the following Key Point.
Key Point 8
Given r(t) = x(t)i + y(t)j + z(t)k

then the velocity vector is
v = ṙ(t) = ẋ(t)i + ẏ(t)j + ż(t)k
The magnitude of the velocity vector gives the speed of the object.
We can define the acceleration vector in a similar way, as the rate of change (i.e. the derivative) of
the velocity with respect to the time:
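\[ \mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d^2\mathbf{r}}{dt^2} = \ddot{\mathbf{r}} = \ddot{x}\,\mathbf{i} + \ddot{y}\,\mathbf{j} + \ddot{z}\,\mathbf{k} \]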
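Example 6
If $\mathbf{w} = 3t^2\,\mathbf{i} + \cos 2t\,\mathbf{j}$, find
(a) $\dfrac{d\mathbf{w}}{dt}$   (b) $\left|\dfrac{d\mathbf{w}}{dt}\right|$   (c) $\dfrac{d^2\mathbf{w}}{dt^2}$
Solution
(a) If $\mathbf{w} = 3t^2\,\mathbf{i} + \cos 2t\,\mathbf{j}$, then differentiation with respect to t yields: $\dfrac{d\mathbf{w}}{dt} = 6t\,\mathbf{i} - 2\sin 2t\,\mathbf{j}$
(b) $\left|\dfrac{d\mathbf{w}}{dt}\right| = \sqrt{(6t)^2 + (-2\sin 2t)^2} = \sqrt{36t^2 + 4\sin^2 2t}$
(c) $\dfrac{d^2\mathbf{w}}{dt^2} = 6\,\mathbf{i} - 4\cos 2t\,\mathbf{j}$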
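The component-by-component differentiation in Example 6 can be cross-checked against a finite-difference estimate. This is a minimal sketch only: vectors are represented as plain tuples of components, and the helper `central_difference`, the step size h and the test time t = 0.7 are all illustrative assumptions.

```python
import math


def w(t):
    # Components of w = 3t^2 i + cos(2t) j.
    return (3.0 * t ** 2, math.cos(2.0 * t))


def dw_dt(t):
    # Answer (a): 6t i - 2 sin(2t) j.
    return (6.0 * t, -2.0 * math.sin(2.0 * t))


def d2w_dt2(t):
    # Answer (c): 6 i - 4 cos(2t) j.
    return (6.0, -4.0 * math.cos(2.0 * t))


def central_difference(vec, t, h=1e-5):
    """Estimate the derivative of a vector function one component at a time."""
    ahead, behind = vec(t + h), vec(t - h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(ahead, behind))


t = 0.7
numeric_first = central_difference(w, t)
numeric_second = central_difference(dw_dt, t)
speed = math.hypot(*dw_dt(t))  # answer (b): magnitude of dw/dt
print(numeric_first, dw_dt(t))
print(numeric_second, d2w_dt2(t))
print(speed)
```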
It is possible to differentiate more complicated expressions involving vectors provided certain rules
are adhered to as summarized in the following Key Point.
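Key Point 9
If w and z are vectors and c is a scalar, all these being functions of time t, then:
\[ \frac{d}{dt}(\mathbf{w} + \mathbf{z}) = \frac{d\mathbf{w}}{dt} + \frac{d\mathbf{z}}{dt} \]
\[ \frac{d}{dt}(c\,\mathbf{w}) = c\,\frac{d\mathbf{w}}{dt} + \frac{dc}{dt}\,\mathbf{w} \]
\[ \frac{d}{dt}(\mathbf{w} \cdot \mathbf{z}) = \mathbf{w} \cdot \frac{d\mathbf{z}}{dt} + \frac{d\mathbf{w}}{dt} \cdot \mathbf{z} \]
\[ \frac{d}{dt}(\mathbf{w} \times \mathbf{z}) = \mathbf{w} \times \frac{d\mathbf{z}}{dt} + \frac{d\mathbf{w}}{dt} \times \mathbf{z} \]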
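Example 7
If $\mathbf{w} = 3t\,\mathbf{i} - t^2\,\mathbf{j}$ and $\mathbf{z} = 2t^2\,\mathbf{i} + 3\,\mathbf{j}$, verify the result
\[ \frac{d}{dt}(\mathbf{w} \cdot \mathbf{z}) = \mathbf{w} \cdot \frac{d\mathbf{z}}{dt} + \frac{d\mathbf{w}}{dt} \cdot \mathbf{z} \]
Solution
$\mathbf{w} \cdot \mathbf{z} = (3t\,\mathbf{i} - t^2\,\mathbf{j}) \cdot (2t^2\,\mathbf{i} + 3\,\mathbf{j}) = 6t^3 - 3t^2$.
Therefore
\[ \frac{d}{dt}(\mathbf{w} \cdot \mathbf{z}) = 18t^2 - 6t \qquad (1) \]
Also $\dfrac{d\mathbf{w}}{dt} = 3\,\mathbf{i} - 2t\,\mathbf{j}$ and $\dfrac{d\mathbf{z}}{dt} = 4t\,\mathbf{i}$
so
\[ \mathbf{w} \cdot \frac{d\mathbf{z}}{dt} + \mathbf{z} \cdot \frac{d\mathbf{w}}{dt} = (3t\,\mathbf{i} - t^2\,\mathbf{j}) \cdot (4t\,\mathbf{i}) + (2t^2\,\mathbf{i} + 3\,\mathbf{j}) \cdot (3\,\mathbf{i} - 2t\,\mathbf{j}) \]
\[ = 12t^2 + 6t^2 - 6t = 18t^2 - 6t \qquad (2) \]
We have verified $\dfrac{d}{dt}(\mathbf{w} \cdot \mathbf{z}) = \mathbf{w} \cdot \dfrac{d\mathbf{z}}{dt} + \dfrac{d\mathbf{w}}{dt} \cdot \mathbf{z}$ since (1) is the same as (2).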
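Example 8
If $\mathbf{w} = 3t\,\mathbf{i} - t^2\,\mathbf{j}$ and $\mathbf{z} = 2t^2\,\mathbf{i} + 3\,\mathbf{j}$, verify the result
\[ \frac{d}{dt}(\mathbf{w} \times \mathbf{z}) = \mathbf{w} \times \frac{d\mathbf{z}}{dt} + \frac{d\mathbf{w}}{dt} \times \mathbf{z} \]
Solution
\[ \mathbf{w} \times \mathbf{z} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3t & -t^2 & 0 \\ 2t^2 & 3 & 0 \end{vmatrix} = (9t + 2t^4)\,\mathbf{k} \quad \text{implying} \quad \frac{d}{dt}(\mathbf{w} \times \mathbf{z}) = (9 + 8t^3)\,\mathbf{k} \qquad (1) \]
\[ \mathbf{w} \times \frac{d\mathbf{z}}{dt} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3t & -t^2 & 0 \\ 4t & 0 & 0 \end{vmatrix} = 4t^3\,\mathbf{k} \qquad (2) \]
\[ \frac{d\mathbf{w}}{dt} \times \mathbf{z} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -2t & 0 \\ 2t^2 & 3 & 0 \end{vmatrix} = (9 + 4t^3)\,\mathbf{k} \qquad (3) \]
We can see that (1) is the same as (2) + (3) as required.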
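The two product rules verified in Examples 7 and 8 can also be checked numerically at a chosen time. A minimal sketch, assuming vectors stored as 3-tuples with a zero k-component; the helpers `dot`, `cross` and `add` and the sample time t = 1.5 are illustrative, not part of the Workbook.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(p + q for p, q in zip(a, b))


def w(t):
    return (3.0 * t, -t ** 2, 0.0)      # w = 3t i - t^2 j


def z(t):
    return (2.0 * t ** 2, 3.0, 0.0)     # z = 2t^2 i + 3 j


def dw(t):
    return (3.0, -2.0 * t, 0.0)         # dw/dt = 3 i - 2t j


def dz(t):
    return (4.0 * t, 0.0, 0.0)          # dz/dt = 4t i


t = 1.5
dot_rule = dot(w(t), dz(t)) + dot(dw(t), z(t))
cross_rule = add(cross(w(t), dz(t)), cross(dw(t), z(t)))

# Compare with the directly differentiated products from the Examples.
print(dot_rule, 18 * t ** 2 - 6 * t)
print(cross_rule, (0.0, 0.0, 9 + 8 * t ** 3))
```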
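Exercises
1. If $\mathbf{r} = 3t\,\mathbf{i} + 2t^2\,\mathbf{j} + t^3\,\mathbf{k}$, find (a) $\dfrac{d\mathbf{r}}{dt}$ (b) $\dfrac{d^2\mathbf{r}}{dt^2}$
2. Given $\mathbf{B} = te^{-t}\,\mathbf{i} + \cos t\,\mathbf{j}$ find (a) $\dfrac{d\mathbf{B}}{dt}$ (b) $\dfrac{d^2\mathbf{B}}{dt^2}$
3. If $\mathbf{r} = 4t^2\,\mathbf{i} + 2t\,\mathbf{j} - 7\,\mathbf{k}$ evaluate $\mathbf{r}$ and $\dfrac{d\mathbf{r}}{dt}$ when t = 1.
4. If $\mathbf{w} = t^3\,\mathbf{i} - 7t\,\mathbf{k}$ and $\mathbf{z} = (2 + t)\,\mathbf{i} + t^2\,\mathbf{j} - 2\,\mathbf{k}$
(a) find $\mathbf{w} \cdot \mathbf{z}$, (b) find $\dfrac{d\mathbf{w}}{dt}$, (c) find $\dfrac{d\mathbf{z}}{dt}$, (d) show that $\dfrac{d}{dt}(\mathbf{w} \cdot \mathbf{z}) = \mathbf{w} \cdot \dfrac{d\mathbf{z}}{dt} + \dfrac{d\mathbf{w}}{dt} \cdot \mathbf{z}$
5. Given $\mathbf{r} = \sin t\,\mathbf{i} + \cos t\,\mathbf{j}$
(a) find $\dot{\mathbf{r}}$, (b) find $\ddot{\mathbf{r}}$, (c) find $|\mathbf{r}|$
(d) Show that the position vector $\mathbf{r}$ and velocity vector $\dot{\mathbf{r}}$ are perpendicular.
Answers
1. (a) $3\,\mathbf{i} + 4t\,\mathbf{j} + 3t^2\,\mathbf{k}$ (b) $4\,\mathbf{j} + 6t\,\mathbf{k}$
2. (a) $(-te^{-t} + e^{-t})\,\mathbf{i} - \sin t\,\mathbf{j}$ (b) $e^{-t}(t - 2)\,\mathbf{i} - \cos t\,\mathbf{j}$
3. $4\,\mathbf{i} + 2\,\mathbf{j} - 7\,\mathbf{k}$, $8\,\mathbf{i} + 2\,\mathbf{j}$
4. (a) $t(t^3 + 2t^2 + 14)$ (b) $3t^2\,\mathbf{i} - 7\,\mathbf{k}$ (c) $\mathbf{i} + 2t\,\mathbf{j}$
5. (a) $\cos t\,\mathbf{i} - \sin t\,\mathbf{j}$ (b) $-\sin t\,\mathbf{i} - \cos t\,\mathbf{j}$ (c) 1 (d) Follows by showing $\mathbf{r} \cdot \dot{\mathbf{r}} = 0$.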
Case Study:
Complex Impedance 12.6
Electronic Filters
Electronic filters are used widely, for example in audio equipment to correct for imperfections in
microphones or loudspeakers, or to introduce special effects. The purpose of a filter is to produce
an alternating current (a.c.) output voltage that varies with the frequency of the input voltage. A
filter must have at least one component which has an impedance that varies with frequency. The
impedance is given by the time dependent ratio of ‘voltage across the component’ to ‘current through
the component’. This means that a filter must contain at least one inductance or capacitance. An
inductor consists of a large number of coils of wire. When the current i flowing through an inductor
changes, the associated magnetic field changes and produces a voltage v across the inductor which
is proportional to the rate of change of the current. The constant of proportionality (inductance)
is given the symbol L.
In electronics, it is usual to use lower case symbols for the time varying quantities. The standard
representations for a.c. electronic signals are
\[ v = V_0 e^{j\omega t} \quad \text{and} \quad i = I_0 e^{j\omega t} \]
where $V_0$ is the (real) amplitude of the a.c. voltage and $I_0$ is the (real) amplitude of the a.c. current
and $j = \sqrt{-1}$.
Figure 33: (a) an inductor (b) a capacitor

An inductor (see Figure 33) gives rise to an a.c. voltage
\[ v = L\frac{di}{dt} = j\omega L i \]
Hence $v/i = j\omega L$ is the impedance of the inductor. The purely imaginary quantity, $j\omega L$, is called
the reactance of the inductor. Usually a coil of wire forming an inductor also has resistance but
this can be designed or assumed to be negligible. A capacitor consists of two conducting plates
separated by a thin insulator. The charge (q) on the plates is proportional to the voltage (v) between
the plates. The constant of proportionality (capacitance) is given the symbol C. So q = Cv. The
current (i) into the capacitor is equal to the rate of change of the charge on the capacitor i.e.
\[ i = \frac{dq}{dt} = C\frac{dv}{dt} = j\omega C v. \]
Hence, for a capacitor, the impedance $Z_C = v/i = 1/j\omega C$. This purely imaginary quantity is also a
reactance. Because of Ohm’s law (v = iR), a resistance R provides a constant (real) contribution
of R to the impedance of a circuit. If two resistors R1 and R2 are in series the same current passes
through both of them and the combined resistance is R1 + R2 . In the circuit shown in Figure 34
(consider the left-hand representation of this circuit first but note that the right-hand version is
equivalent), the input voltage across both resistors and the output voltage across R2 are related by
\[ v_{in} = i(R_1 + R_2) \quad \text{and} \quad v_{out} = iR_2 \quad \text{so} \quad \frac{v_{out}}{v_{in}} = \frac{R_2}{R_1 + R_2}. \]
Such a circuit is called a potential divider.
Figure 34: Two representations of a potential divider circuit

Now consider this circuit with the resistor R2 replaced by a capacitor C as in Figure 35.
Figure 35: Low pass filter circuit containing a resistor and a capacitor
If $R_1$ is replaced by R and $R_2$ by $Z_C = 1/j\omega C$, in the relevant expression for the potential divider
circuit, then
\[ \frac{v_{out}}{v_{in}} = \frac{1/j\omega C}{R + 1/j\omega C} = \frac{1}{1 + j\omega RC} \]
The square of the magnitude of the voltage ratio is given by multiplying the existing complex expression
by its complex conjugate, i.e.
\[ \left|\frac{v_{out}}{v_{in}}\right|^2 = \frac{1}{(1 + j\omega RC)(1 - j\omega RC)} = \frac{1}{1 + \omega^2 R^2 C^2} \]
Figure 36 shows a plot of the magnitude of the voltage ratio as a function of ω, i.e. the frequency
response for R = 10 Ω and C = 1 µF (i.e. $10^{-6}$ F). Note that the magnitude of the output voltage
is close to that of the input voltage at low frequencies but decreases rapidly as frequency increases.
This is an ideal low pass filter response.
Figure 36: Frequency response of a low pass filter (output voltage/input voltage against angular frequency in rad/s, up to $1 \times 10^6$ rad/s)
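The low pass response can be evaluated directly from the complex voltage ratio $1/(1 + j\omega RC)$ using complex arithmetic. This is a minimal sketch: the function name `low_pass_gain` is an illustrative assumption, and the default component values are those quoted for Figure 36.

```python
import math


def low_pass_gain(omega, R=10.0, C=1e-6):
    """Magnitude of v_out / v_in = 1 / (1 + j omega R C) for the RC circuit."""
    ratio = 1.0 / complex(1.0, omega * R * C)
    return abs(ratio)


# Tabulate the response over the frequency range shown in Figure 36.
for omega in (0.0, 1e4, 1e5, 2e5, 5e5, 1e6):
    print(f"omega = {omega:9.0f} rad/s   |v_out/v_in| = {low_pass_gain(omega):.4f}")
```

At $\omega = 1/(RC) = 10^5$ rad/s the magnitude has fallen to $1/\sqrt{2}$ of its low-frequency value, a convenient marker for where the filter starts to attenuate.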

Engineering problem stated in words
Figure 37: An LC filter circuit

Plot the frequency response of the LC filter circuit shown in Figure 37 if R = 10 Ω,
L = 0.1 mH (i.e. $10^{-4}$ H) and C = 1 µF. After plotting the response for two values of R below 10
Ω, comment on the way in which the response varies as R varies. Identify the frequency for which
the response is maximum.
Engineering problem expressed mathematically
(a) Noting that the resistor and inductor are in series, replace $R_1$ by $(R + j\omega L)$ and $R_2$ by
$1/j\omega C$ in the equation $\dfrac{v_{out}}{v_{in}} = \dfrac{R_2}{R_1 + R_2}$
(b) Derive an expression for $\left|\dfrac{v_{out}}{v_{in}}\right|^2$
(c) Hence plot $\left|\dfrac{v_{out}}{v_{in}}\right|$ as a function of ω for R = 10 Ω.
(d) Plot $\left|\dfrac{v_{out}}{v_{in}}\right|$ for two further values of R < 10 Ω (e.g. 5 Ω and 2 Ω).
(e) Find an expression for the value of $\omega = \omega_{res}$ at which $\left|\dfrac{v_{out}}{v_{in}}\right|$ is maximum.
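(a) The substitutions $R_1 \to (R + j\omega L)$ and $R_2 \to 1/j\omega C$ in the equation $\dfrac{v_{out}}{v_{in}} = \dfrac{R_2}{R_1 + R_2}$ yield
\[ \frac{v_{out}}{v_{in}} = \frac{1/j\omega C}{R + j\omega L + 1/j\omega C} = \frac{1}{1 - \omega^2 LC + j\omega RC} \]
(b) Multiplying by the complex conjugate of the denominator
\[ \left|\frac{v_{out}}{v_{in}}\right|^2 = \frac{1}{(1 - \omega^2 LC + j\omega RC)(1 - \omega^2 LC - j\omega RC)} = \frac{1}{(1 - \omega^2 LC)^2 + \omega^2 R^2 C^2} \]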

(c) See the solid line in Figure 38.
Figure 38: Frequency response of LC filter (output voltage/input voltage against angular frequency in rad/s, for R = 2 Ω, R = 5 Ω and R = 10 Ω)

(d) See the broken lines in Figure 38.
There is a peak in the voltage output, which can exceed the voltage input by a considerable
amount. It is particularly noticeable for small values of the resistance and decreases as
the resistance increases.
(e) $\left|\dfrac{v_{out}}{v_{in}}\right|$ will be maximum when the first term in the denominator is zero (the other term
is always positive for ω > 0) i.e. when
\[ \omega = \omega_{res} = \frac{1}{\sqrt{LC}} \quad \text{or} \quad f_{res} = \frac{\omega_{res}}{2\pi} = \frac{1}{2\pi\sqrt{LC}} \]
The corresponding frequency is known as the resonant frequency of the circuit.
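The response of the LC filter can be evaluated from the expression in part (a) in the same way. A minimal sketch, assuming the component values of the problem statement; the function name `lc_gain` is illustrative. It evaluates the magnitude at $\omega_{res}$, where the first term of the denominator vanishes and the expression in (b) reduces to $1/(\omega_{res} RC)$.

```python
import math


def lc_gain(omega, R, L=1e-4, C=1e-6):
    """Magnitude of v_out / v_in = 1 / (1 - omega^2 L C + j omega R C)."""
    return abs(1.0 / complex(1.0 - omega ** 2 * L * C, omega * R * C))


L_value, C_value = 1e-4, 1e-6
omega_res = 1.0 / math.sqrt(L_value * C_value)   # resonant angular frequency
f_res = omega_res / (2.0 * math.pi)              # resonant frequency in Hz

gains_at_resonance = {R: lc_gain(omega_res, R) for R in (2.0, 5.0, 10.0)}
print(omega_res, f_res)
print(gains_at_resonance)
```

Halving the resistance doubles the magnitude at $\omega_{res}$, which matches the taller peaks for the smaller resistances described in (d).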
Additional comment
The resonant behaviour depicted in Figure 38 is found in certain vibrating systems as well as electronic
circuits. This gives rise to an electrical analogy for such mechanical systems and will be explored
further after 19 on differential equations.
Contents 13
Integration
13.1 Basic Concepts of Integration 2
13.2 Definite Integrals 14
13.3 The Area Bounded by a Curve 24
13.4 Integration by Parts 33
13.5 Integration by Substitution and Using Partial Fractions 40
13.6 Integration of Trigonometric Functions 48
Learning outcomes
In this Workbook you will learn about integration and about some of the common techniques
employed to obtain integrals. You will learn that integration is the inverse operation to
differentiation and will also appreciate the distinction between a definite and an indefinite
integral. You will understand how a definite integral is related to the area under a curve.
You will understand how to use the technique of integration by parts to obtain integrals
involving the product of functions. You will also learn how to use partial fractions and
trigonometric identities in integration.
Basic Concepts
of Integration 13.1
Introduction
When a function f(x) is known we can differentiate it to obtain its derivative $\dfrac{df}{dx}$. The reverse process
is to obtain the function f(x) from knowledge of its derivative. This process is called integration.
Applications of integration are numerous and some of these will be explored in subsequent Sections.
First, what is important is to practise basic techniques and learn a variety of methods for integrating
functions.

Prerequisites
• thoroughly understand the various techniques of differentiation

Learning Outcomes
On completion you should be able to . . .
• evaluate simple integrals by reversing the process of differentiation
• use a table of integrals
• explain the need for a constant of integration when finding indefinite integrals
• use the rules for finding integrals of sums of functions and constant multiples of functions
1. Integration as differentiation in reverse

Suppose we differentiate the function $y = x^2$. We obtain $\dfrac{dy}{dx} = 2x$. Integration reverses this process
and we say that the integral of 2x is $x^2$. Pictorially we can regard this as shown in Figure 1:
Figure 1: differentiating $x^2$ gives 2x; integrating 2x gives $x^2$
The situation is just a little more complicated because there are lots of functions we can differentiate
to give 2x. Here are some of them: x2 + 4, x2 − 15, x2 + 0.5
All these functions have the same derivative, 2x, because when we differentiate the constant term we
obtain zero. Consequently, when we reverse the process, we have no idea what the original constant
term might have been. So we include in our answer an unknown constant, c say, called the constant
of integration. We state that the integral of 2x is x2 + c.
When we want to differentiate a function, y(x), we use the notation $\dfrac{d}{dx}$ as an instruction to
differentiate, and write $\dfrac{d}{dx}\big(y(x)\big)$. In a similar way, when we want to integrate a function we use a special
notation: $\int y(x)\,dx$.
The symbol for integration, $\int$, is known as an integral sign. To integrate 2x we write
\[ \int 2x\,dx = x^2 + c \]
Here $\int$ is the integral sign, the term 2x is called the integrand, c is the constant of integration, and there must always be a
term of the form dx.
Note that along with the integral sign there is a term of the form dx, which must always be written,
and which indicates the variable involved, in this case x. We say that 2x is being integrated with
respect to x . The function being integrated is called the integrand. Technically, integrals of this
sort are called indefinite integrals, to distinguish them from definite integrals which are dealt with
subsequently. When you find an indefinite integral your answer should always contain a constant of
integration.
Exercises
1. (a) Write down the derivatives of each of: $x^3$, $x^3 + 17$, $x^3 - 21$
(b) Deduce that $\int 3x^2\,dx = x^3 + c$.
2. Explain why, when finding an indefinite integral, a constant of integration is always needed.
Answers
1. (a) $3x^2$, $3x^2$, $3x^2$ (b) Whatever the constant, it is zero when differentiated.
2. Any constant will disappear (i.e. become zero) when differentiated so one must be reintroduced
to reverse the process.
2. A table of integrals
We could use a table of derivatives to find integrals, but the more common ones are usually found
in a ‘Table of Integrals’ such as that shown below. You could check the entries in this table using
your knowledge of differentiation. Try this for yourself.
Table 1: Integrals of Common Functions

function f(x)            indefinite integral $\int f(x)\,dx$
constant, k              $kx + c$
$x$                      $\frac{1}{2}x^2 + c$
$x^2$                    $\frac{1}{3}x^3 + c$
$x^n$                    $\dfrac{x^{n+1}}{n+1} + c, \quad n \neq -1$
$x^{-1}$ (or $\frac{1}{x}$)      $\ln|x| + c$
$\cos x$                 $\sin x + c$
$\sin x$                 $-\cos x + c$
$\cos kx$                $\frac{1}{k}\sin kx + c$
$\sin kx$                $-\frac{1}{k}\cos kx + c$
$\tan kx$                $\frac{1}{k}\ln|\sec kx| + c$
$e^x$                    $e^x + c$
$e^{-x}$                 $-e^{-x} + c$
$e^{kx}$                 $\frac{1}{k}e^{kx} + c$
When dealing with the trigonometric functions the variable x must always be measured in radians
and not degrees. Note that the fourth entry in the Table, for xn , is valid for any value of n, positive
or negative, whole number or fractional, except n = −1. When n = −1 use the fifth entry in the
Table.
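The suggestion to check the entries of Table 1 by differentiation can be carried out numerically: differentiating each tabulated integral should return the function in the left-hand column. A minimal sketch only; the helper `derivative`, the sample values k = 3, n = 2.5 and the test point x = 0.4 (in radians) are illustrative assumptions.

```python
import math


def derivative(F, x, h=1e-6):
    """Central-difference estimate of dF/dx."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


k, n = 3.0, 2.5

# Each row: (label, function f, tabulated integral F without the constant c).
table = [
    ("x^n",    lambda x: x ** n,          lambda x: x ** (n + 1) / (n + 1)),
    ("1/x",    lambda x: 1.0 / x,         lambda x: math.log(abs(x))),
    ("cos kx", lambda x: math.cos(k * x), lambda x: math.sin(k * x) / k),
    ("sin kx", lambda x: math.sin(k * x), lambda x: -math.cos(k * x) / k),
    ("tan kx", lambda x: math.tan(k * x),
               lambda x: math.log(abs(1.0 / math.cos(k * x))) / k),
    ("e^kx",   lambda x: math.exp(k * x), lambda x: math.exp(k * x) / k),
]

x0 = 0.4
errors = {label: abs(derivative(F, x0) - f(x0)) for label, f, F in table}
for label, err in errors.items():
    print(f"{label:7s} |dF/dx - f| = {err:.2e}")
```

The constant of integration plays no part in the check because it differentiates to zero.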
Example 1
Use Table 1 to find the indefinite integral of $x^7$: that is, find $\int x^7\,dx$
Solution
From Table 1 note that $\int x^n\,dx = \dfrac{x^{n+1}}{n+1} + c$. In words, this states that to integrate a power
of x, increase the power by 1, and then divide the result by the new power. With n = 7 we find
\[ \int x^7\,dx = \frac{1}{8}x^8 + c \]
Example 2
Find the indefinite integral of cos 5x: that is, find $\int \cos 5x\,dx$
Solution
From Table 1 note that $\int \cos kx\,dx = \dfrac{\sin kx}{k} + c$
With k = 5 we find $\int \cos 5x\,dx = \dfrac{1}{5}\sin 5x + c$
In Table 1 the independent variable is always given as x. However, with a little imagination you will
be able to use it when other independent variables are involved.
Example 3
Find $\int \cos 5t\,dt$
Solution
We integrated cos 5x in the previous example. Now the independent variable is t, so simply use
Table 1 and replace every x with a t. With k = 5 we find
\[ \int \cos 5t\,dt = \frac{1}{5}\sin 5t + c \]
It follows immediately that, for example,
\[ \int \cos 5\omega\,d\omega = \frac{1}{5}\sin 5\omega + c, \qquad \int \cos 5u\,du = \frac{1}{5}\sin 5u + c \quad \text{and so on.} \]
Example 4
Find the indefinite integral of $\dfrac{1}{x}$: that is, find $\int \dfrac{1}{x}\,dx$
Solution
This integral deserves special mention. You may be tempted to try to write the integrand as $x^{-1}$
and use the fourth row of Table 1. However, the formula $\int x^n\,dx = \dfrac{x^{n+1}}{n+1} + c$ is not valid when
n = −1 as Table 1 makes clear. This is because we can never divide by zero. Look to the fifth
entry of Table 1 and you will see $\int x^{-1}\,dx = \ln|x| + c$.
Example 5
Find $\int 12\,dx$ and $\int 12\,dt$
Solution
In this Example we are integrating a constant, 12. Using Table 1 we find
\[ \int 12\,dx = 12x + c \qquad \text{Similarly} \qquad \int 12\,dt = 12t + c. \]
Task
Find $\int t^4\,dt$
Your solution
Answer
\[ \int t^4\,dt = \frac{1}{5}t^5 + c. \]
Task
Find $\int \dfrac{1}{x^5}\,dx$ using the laws of indices to write the integrand as $x^{-5}$ and then use
Table 1:
Your solution
Answer
\[ -\frac{1}{4}x^{-4} + c = -\frac{1}{4x^4} + c. \]
Task
Find $\int e^{-2x}\,dx$ using the entry in Table 1 for integrating $e^{kx}$:
Your solution
Answer
With k = −2, we have $\int e^{-2x}\,dx = -\dfrac{1}{2}e^{-2x} + c$.
Exercises
1. Integrate each of the following functions with respect to x:
(a) $x^9$, (b) $x^{1/2}$, (c) $x^{-3}$, (d) $1/x^4$, (e) 4, (f) $\sqrt{x}$, (g) $e^{4x}$
2. Find (a) $\int t^2\,dt$, (b) $\int 6\,dt$, (c) $\int \sin 3t\,dt$, (d) $\int e^{7t}\,dt$.
Answers
1. (a) $\frac{1}{10}x^{10} + c$, (b) $\frac{2}{3}x^{3/2} + c$, (c) $-\frac{1}{2}x^{-2} + c$, (d) $-\frac{1}{3}x^{-3} + c$, (e) $4x + c$,
(f) same as (b), (g) $\frac{1}{4}e^{4x} + c$
2. (a) $\frac{1}{3}t^3 + c$, (b) $6t + c$, (c) $-\frac{1}{3}\cos 3t + c$, (d) $\frac{1}{7}e^{7t} + c$
3. Some rules of integration
To enable us to find integrals of a wider range of functions than those normally given in a table of
integrals we can make use of the following rules.
The integral of k f(x) where k is a constant

A constant factor in an integral can be moved outside the integral sign as follows:
Key Point 1
\[ \int k\,f(x)\,dx = k\int f(x)\,dx \]
Example 6
Find the indefinite integral of $11x^2$: that is, find $\int 11x^2\,dx$
Solution
\[ \int 11x^2\,dx = 11\int x^2\,dx = 11\left(\frac{x^3}{3} + c\right) = \frac{11x^3}{3} + K \quad \text{where K is a constant.} \]
Example 7
Find the indefinite integral of −5 cos x; that is, find $\int -5\cos x\,dx$
Solution
\[ \int -5\cos x\,dx = -5\int \cos x\,dx = -5(\sin x + c) = -5\sin x + K \quad \text{where K is a constant.} \]
The integral of f(x) + g(x) and of f(x) − g(x)
When we wish to integrate the sum or difference of two functions, we integrate each term separately
as follows:
Key Point 2
\[ \int [\,f(x) + g(x)\,]\,dx = \int f(x)\,dx + \int g(x)\,dx \]
\[ \int [\,f(x) - g(x)\,]\,dx = \int f(x)\,dx - \int g(x)\,dx \]
Example 8
Find $\int (x^3 + \sin x)\,dx$
Solution
\[ \int (x^3 + \sin x)\,dx = \int x^3\,dx + \int \sin x\,dx = \frac{1}{4}x^4 - \cos x + c \]
Note that only a single constant of integration is needed.
Task
Find $\int (3t^4 + \sqrt{t})\,dt$
Use Key Points 1 and 2:
Your solution
Answer
\[ \frac{3}{5}t^5 + \frac{2}{3}t^{3/2} + c \]
Task
The hyperbolic sine and cosine functions, sinh x and cosh x, are defined as follows:
\[ \sinh x = \frac{e^x - e^{-x}}{2} \qquad \cosh x = \frac{e^x + e^{-x}}{2} \]
Note that they are combinations of the exponential functions $e^x$ and $e^{-x}$.
Find the indefinite integrals of sinh x and cosh x.
Your solution
$\int \sinh x\,dx = \int \dfrac{e^x - e^{-x}}{2}\,dx =$
$\int \cosh x\,dx = \int \dfrac{e^x + e^{-x}}{2}\,dx =$
Answer
\[ \int \sinh x\,dx = \frac{1}{2}\int e^x\,dx - \frac{1}{2}\int e^{-x}\,dx = \frac{1}{2}e^x + \frac{1}{2}e^{-x} + c = \frac{1}{2}\left(e^x + e^{-x}\right) + c = \cosh x + c. \]
Similarly $\int \cosh x\,dx = \sinh x + c$.
Further rules for finding more complicated integrals are dealt with in subsequent Sections.
Electrostatic charge
Introduction
Electrostatic charge is important both where it is wanted, as in the electrostatic precipitator plate
systems used for cleaning gases, and where it is unwanted, such as when charge builds up on moving
belts. This Example is concerned with a charged object with a particular idealised shape - a sphere.
However, similar analytical calculations can be carried out for certain other shapes and numerical
methods can be used for more complicated shapes.
The electric field at all points inside and outside a charged sphere is given by
\[ E(r) = \frac{Qr}{4\pi\varepsilon_0 a^3} \quad \text{if } r < a \qquad (1a) \]
\[ E(r) = \frac{Q}{4\pi\varepsilon_0 r^2} \quad \text{if } r \ge a \qquad (1b) \]
where $\varepsilon_0$ is the permittivity of free space, Q is the total charge, a is the radius of the sphere, and r
is the radial distance between the centre of the sphere and a point of observation (see Figure 2).
Figure 2: Geometry and symbols associated with the charged sphere
The electric field associated with electrostatic charge has a scalar potential. The electric field defined
by (1a) and (1b) shows only a radial dependence of position. Therefore, the electric scalar potential
V(r) is related to the field E(r) by
\[ E(r) = -\frac{dV}{dr}. \qquad (2) \]
Problem in words
A sphere is charged with a uniform density of charge and no other charge is present outside the
sphere in space. Determine the variation of electric potential with distance from the centre of the
sphere.

Determine the electric scalar potential as a function of r, V (r), by integrating (2).
Equation (2) yields V (r) as the negative of the indefinite integral of E(r).
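\[ -\int dV = \int E(r)\,dr. \qquad (3) \]
Using (1a) and (1b) with (3) leads to
\[ V(r) = -\frac{Q}{4\pi\varepsilon_0 a^3}\int r\,dr \quad \text{if } r < a \qquad (4a) \]
\[ V(r) = -\frac{Q}{4\pi\varepsilon_0}\int \frac{dr}{r^2} \quad \text{if } r \ge a \qquad (4b) \]
Using the facts that $\int r\,dr = r^2/2 + c_1$ and $\int \dfrac{dr}{r^2} = -\dfrac{1}{r} + c_2$,
(4a) and (4b) become
\[ V(r) = -\frac{Qr^2}{8\pi\varepsilon_0 a^3} + c_1 \quad \text{if } r < a \qquad (5a) \]
\[ V(r) = \frac{Q}{4\pi\varepsilon_0 r} + c_2 \quad \text{if } r \ge a \qquad (5b) \]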
The integration constant $c_2$ can be determined by assuming that the electric potential is zero at an
infinite distance from the sphere:
\[ \lim_{r \to \infty} [V(r)] = 0 \;\Rightarrow\; \lim_{r \to \infty} \left[\frac{Q}{4\pi\varepsilon_0 r} + c_2\right] = 0 \;\Rightarrow\; c_2 = 0. \]
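The constant $c_1$ can be determined by assuming that the potential is continuous at r = a.
From equation (5a)
\[ V(a) = -\frac{Qa^2}{8\pi\varepsilon_0 a^3} + c_1 \]
From equation (5b)
\[ V(a) = \frac{Q}{4\pi\varepsilon_0 a} \]
Hence
\[ c_1 = \frac{Q}{4\pi\varepsilon_0 a} + \frac{Q}{8\pi\varepsilon_0 a} = \frac{3Q}{8\pi\varepsilon_0 a}. \]
Substituting for $c_1$ in (5a), the electric potential for all space is:
\[ V(r) = \frac{Q}{4\pi\varepsilon_0}\left(\frac{3a^2 - r^2}{2a^3}\right) \quad \text{if } r < a \]
\[ V(r) = \frac{Q}{4\pi\varepsilon_0 r} \quad \text{if } r \ge a \]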
Interpretation
The potential of the electrostatic field outside a charged sphere varies inversely with distance from
the centre of the sphere. Inside the sphere, the electrostatic potential varies with the square of the
distance from the centre.
An Engineering Exercise in 29.3 derives the corresponding expressions for the variation of the
electrostatic field and an Engineering Exercise in 27.4 calculates the potential energy due to
the charged sphere.
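The final expressions for V(r) can be checked against equations (1a), (1b) and (2): the potential should be continuous at r = a and its negative derivative should reproduce the field. A minimal sketch; the charge Q = 1 nC, the radius a = 0.1 m and the function names are hypothetical values chosen only for illustration, and the derivative is estimated by a central difference.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space in F/m


def field(r, Q, a):
    """Electric field from equations (1a) and (1b)."""
    if r < a:
        return Q * r / (4.0 * math.pi * EPS0 * a ** 3)
    return Q / (4.0 * math.pi * EPS0 * r ** 2)


def potential(r, Q, a):
    """Electric potential obtained after fixing c1 and c2."""
    if r < a:
        return Q * (3.0 * a ** 2 - r ** 2) / (8.0 * math.pi * EPS0 * a ** 3)
    return Q / (4.0 * math.pi * EPS0 * r)


def minus_dV_dr(r, Q, a, h=1e-6):
    """Central-difference estimate of -dV/dr, to compare with E(r)."""
    return -(potential(r + h, Q, a) - potential(r - h, Q, a)) / (2.0 * h)


Q, a = 1e-9, 0.1  # hypothetical charge (C) and sphere radius (m)
for r in (0.05, 0.2):
    print(r, field(r, Q, a), minus_dV_dr(r, Q, a))
print(potential(a - 1e-9, Q, a), potential(a, Q, a))
```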
Exercises
1. Find $\int (2x - e^x)\,dx$
2. Find $\int 3e^{2x}\,dx$
3. Find $\int \frac{1}{3}(x + \cos 2x)\,dx$
4. Find $\int 7x^{-2}\,dx$
5. Find $\int (x + 3)^2\,dx$, (be careful!)
Answers
1. $x^2 - e^x + c$
2. $\frac{3}{2}e^{2x} + c$
3. $\frac{1}{6}x^2 + \frac{1}{6}\sin 2x + c$
4. $-\dfrac{7}{x} + c$
5. $\frac{1}{3}x^3 + 3x^2 + 9x + c$

Definite Integrals 13.2

Introduction
When you were first introduced to integration as the reverse of differentiation, the integrals you dealt
with were indefinite integrals. The result of finding an indefinite integral is usually a function plus a
constant of integration. In this Section we introduce definite integrals, so called because the result
will be a definite answer, usually a number, with no constant of integration. Definite integrals have
many applications, for example in finding areas bounded by curves, and finding volumes of solids.

Prerequisites
Before starting this Section you should . . .
• understand integration as the reverse of differentiation
• be able to use a table of integrals

Learning Outcomes
On completion you should be able to . . .
• find simple definite integrals
• handle some integrals involving an infinite limit of integration

1. Definite integrals
We saw in the previous Section that $\int f(x)\,dx = F(x) + c$ where F(x) is that function which, when
differentiated, gives f(x). That is, $\dfrac{dF}{dx} = f(x)$. For example,
\[ \int \sin(3x)\,dx = -\frac{\cos(3x)}{3} + c \]
Here, f(x) = sin(3x) and $F(x) = -\frac{1}{3}\cos(3x)$. We now consider a definite integral which is simply
an indefinite integral but with numbers written to the upper and lower right of the integral sign. The
quantity
\[ \int_a^b f(x)\,dx \]
is called the definite integral of f(x) from a to b. The numbers a and b are known as the lower
limit and upper limit respectively of the integral. We define
\[ \int_a^b f(x)\,dx = F(b) - F(a) \]
so that a definite integral is usually a number. The meaning of a definite integral will be developed
in later Sections. For the present we concentrate on the process of evaluating definite integrals.
2. Evaluating definite integrals

When you evaluate a definite integral the result will usually be a number. To see how to evaluate a
definite integral consider the following Example.
Example 9
Find the definite integral of $x^2$ from 1 to 4; that is, find $\int_1^4 x^2\,dx$
Solution
\[ \int x^2\,dx = \tfrac{1}{3}x^3 + c \]
Here $f(x) = x^2$ and $F(x) = \frac{x^3}{3}$. Thus, according to our definition
\[ \int_1^4 x^2\,dx = F(4) - F(1) = \frac{4^3}{3} - \frac{1^3}{3} = 21 \]
Writing $F(b) - F(a)$ each time we calculate a definite integral becomes laborious so we replace this
difference by the shorthand notation $\big[F(x)\big]_a^b$. Thus
\[ \big[F(x)\big]_a^b \equiv F(b) - F(a) \]
Thus, from now on, we shall write
\[ \int_a^b f(x)\,dx = \big[F(x)\big]_a^b \]
so that, for example
\[ \int_1^4 x^2\,dx = \left[\frac{x^3}{3}\right]_1^4 = \frac{4^3}{3} - \frac{1^3}{3} = 21 \]
Example 10
Find the definite integral of $\cos x$ from 0 to $\frac{\pi}{2}$; that is, find $\int_0^{\pi/2} \cos x\,dx$.
Solution
Since $\int \cos x\,dx = \sin x + c$ then
\[ \int_0^{\pi/2} \cos x\,dx = \big[\sin x\big]_0^{\pi/2} = \sin\frac{\pi}{2} - \sin 0 = 1 - 0 = 1 \]
Always remember that if you use a calculator to evaluate any trigonometric functions, you must
work in radian mode.
Task
Find the definite integral of $x^2 + 1$ from 1 to 2; that is, find $\int_1^2 (x^2 + 1)\,dx$
First perform the integration:
Your solution
Answer
\[ \left[\tfrac{1}{3}x^3 + x\right]_1^2 \]
Now insert the limits of integration, the upper limit first, and hence evaluate the integral:
Your solution
Answer
\[ \left(\tfrac{8}{3} + 2\right) - \left(\tfrac{1}{3} + 1\right) = \tfrac{10}{3} \text{ or } 3.333 \text{ (3 d.p.)} \]
Task
Find $\int_2^1 (x^2 + 1)\,dx$.
This Task is very similar to the previous Task. Note the limits have been interchanged:
Your solution
Answer
\[ \left[\tfrac{1}{3}x^3 + x\right]_2^1 = \left(\tfrac{1}{3} + 1\right) - \left(\tfrac{8}{3} + 2\right) = -\tfrac{10}{3} \]
Note from these two Tasks that interchanging the limits of integration changes the sign of the
answer.
Key Point 3
If you interchange the limits, you must change the sign:
\[ \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \]
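The definition $F(b) - F(a)$ and Key Point 3 can be checked with exact arithmetic. The following is a minimal sketch, assuming the integrand $x^2 + 1$ from the two Tasks above; the helper names `definite_integral` and `F` are illustrative only and do not come from the Workbook.

```python
from fractions import Fraction

def definite_integral(F, a, b):
    """Evaluate a definite integral from an antiderivative F as F(b) - F(a)."""
    return F(b) - F(a)

def F(x):
    """An antiderivative of x**2 + 1, namely x**3/3 + x (exact fractions)."""
    x = Fraction(x)
    return x**3 / 3 + x

forward = definite_integral(F, 1, 2)    # lower limit 1, upper limit 2
backward = definite_integral(F, 2, 1)   # the same limits interchanged

print(forward)
print(backward)
```

The two printed values differ only in sign, which is the behaviour stated in Key Point 3.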
Task
When a spring is fixed at one end and stretched at the free end it exerts a restoring
force that is proportional to the displacement of the free end. The constant
of proportionality $k$ N m$^{-1}$ is known as the stiffness of the spring. Calculate
the work done in stretching a spring with stiffness $k$ from displacement $x_1$ m to
displacement $x_2$ m ($x_2 > x_1$) given that the work done ($W$) is the product of
force and displacement.
Your solution
Answer
The restoring force varies during the displacement. So the work done during the extension cannot
be determined from a single simple product.
Consider a small element $\Delta x$ of the extension beyond an arbitrary displacement $x$. The element is
sufficiently small that the force during the displacement can be regarded as constant and equal to
the force at displacement $x$, which is $kx$. So the work done $\Delta W$ in extending the spring from displacement
$x$ to displacement $x + \Delta x$ is approximately $kx\,\Delta x$.
Using the idea of integration as a limit of a sum, in this case as $\Delta x$ tends to zero,
\[ W = \int_{x_1}^{x_2} kx\,dx = \left[\tfrac{1}{2}kx^2\right]_{x_1}^{x_2} = \tfrac{1}{2}k\left(x_2^2 - x_1^2\right) \]
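The limit-of-a-sum argument can be illustrated numerically. The sketch below adds up the small contributions $kx\,\Delta x$ and compares the total with $\frac{1}{2}k(x_2^2 - x_1^2)$. The stiffness and displacements are arbitrary illustrative values (assumptions, not data from the Task), and the function names are hypothetical.

```python
def work_by_summation(k, x1, x2, n):
    """Approximate the work by summing k*x*dx over n small elements."""
    dx = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + i * dx        # displacement at the start of the element
        total += k * x * dx    # force kx treated as constant over the element
    return total

def work_exact(k, x1, x2):
    """Work from the definite integral: k*(x2**2 - x1**2)/2."""
    return 0.5 * k * (x2**2 - x1**2)

# Illustrative values only (assumptions): k in N/m, displacements in m
k, x1, x2 = 200.0, 0.1, 0.3
for n in (10, 100, 1000, 10000):
    print(n, work_by_summation(k, x1, x2, n), work_exact(k, x1, x2))
```

As the number of elements grows (so that each element shrinks), the summed value approaches the value given by the integral.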
Exercises
1. Evaluate (a) $\int_0^1 x^2\,dx$, (b) $\int_2^3 \frac{1}{x^2}\,dx$, (c) $\int_1^2 e^x\,dx$, (d) $\int_{-1}^1 (1 + t^2)\,dt$
2. Find (a) $\int_0^{\pi/3} \cos 2x\,dx$, (b) $\int_0^{\pi} \sin x\,dx$, (c) $\int_1^3 e^{2t}\,dt$
Answers
1. (a) $\frac{1}{3}$ (b) $\frac{1}{6}$ (c) $e^2 - e^1 = 4.671$ (d) 2.667
2. (a) $\sqrt{3}/4 = 0.4330$ (b) 2 (c) 198.019
Torsion of a mild-steel bar
Introduction
For materials such as mild-steel, the relationship between applied shear stress and shear strain
(deformation) can be described as follows.
• For small values of the shear strain, the shear stress ($\tau$) and shear strain ($\omega$) are proportional
to one another, i.e.
\[ \omega = \frac{1}{G} \times \tau \qquad (1) \]
(where $G$ is the shear modulus). This is known as elastic behaviour.
• There is a maximum shear stress that the material is capable of supporting. If the shear
strain is increased further, the shear stress remains roughly constant. This is known as plastic
behaviour.
Figure 3 summarises the relationship between shear stress and shear strain; the point $(\omega_Y, \tau_Y)$ is
known as the yield point.
[Figure 3: shear stress $\tau$ (vertical axis) against shear strain $\omega$ (horizontal axis), with $\tau_Y$ and $\omega_Y$ marked at the yield point]
Now suppose that one end of a bar of circular cross section is twisted through an angle $\theta$, then the
shear strain on the surface is given by
\[ \omega_S = \frac{R\theta}{L} \qquad (2) \]
(where $R$ and $L$ are the radius and length of the bar respectively), while the shear strain, at a distance
$r$ from the central core, is given by
\[ \omega = \frac{r\theta}{L} \qquad (3) \]
The torque transmitted by a bar is given by the integral
\[ T = 2\pi \int_0^R r^2\,\tau(r)\,dr \qquad (4) \]
As the shear strain is a function of distance from the central axis of the bar, it may be that the shear
strain on the surface is greater than the critical shear strain $\omega_Y$. In this scenario the shear stress is
given by
\[ \tau = \begin{cases} \dfrac{\tau_Y}{\omega_Y}\,\omega & \omega \le \omega_Y \\[2ex] \tau_Y & \omega > \omega_Y \end{cases} \qquad (5) \]
i.e. the regions near the central axis exhibit elasticity, but in those regions near the surface the elastic
limit has been exceeded and the metal exhibits plasticity (see Figure 4).
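[Figure 4: cross section of the bar of radius $R$, showing the elastic zone out to radius $r_e$ and the plastic zone beyond it, together with a plot of $\tau$ against $r$ from 0 to $R$ with $\tau_Y$ and $r_e$ marked]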
Problem in words
Find an expression for the torque transmitted by a bar as a function of the angle θ through which
one end is turned.
Using Equations (3) to (5), find a formula for T in terms of the variable θ.
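Substituting (3) into (5)
\[ \tau = \begin{cases} \dfrac{\tau_Y}{\omega_Y}\dfrac{r\theta}{L} & \dfrac{r\theta}{L} \le \omega_Y \\[2ex] \tau_Y & \dfrac{r\theta}{L} > \omega_Y \end{cases}
\;=\; \begin{cases} \dfrac{\tau_Y}{\omega_Y}\dfrac{r\theta}{L} & r \le \dfrac{L\,\omega_Y}{\theta} = r_e \\[2ex] \tau_Y & r > \dfrac{L\,\omega_Y}{\theta} = r_e \end{cases} \]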
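For small values of $\theta$, $r_e \ge R$ so that the whole of the bar will be in the elastic region, i.e.
\[ \tau = \frac{\tau_Y\, r\,\theta}{\omega_Y L} \]
Now (4) becomes
\[ T = 2\pi \int_0^R r^2\,\frac{\tau_Y\, r\,\theta}{\omega_Y L}\,dr = 2\pi\,\frac{\tau_Y\theta}{\omega_Y L}\int_0^R r^3\,dr = 2\pi\,\frac{\tau_Y\theta}{\omega_Y L}\left[\frac{r^4}{4}\right]_0^R = \frac{\pi\,\tau_Y\,\theta}{2\,\omega_Y L}\,R^4 \qquad (6) \]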
i.e. the torque is directly proportional to the twist, θ.
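For larger $\theta$, $r_e < R$, so that (4) becomes
\begin{align*}
T &= 2\pi \int_0^{r_e} r^2\,\frac{\tau_Y\, r\,\theta}{\omega_Y L}\,dr + 2\pi \int_{r_e}^R r^2\,\tau_Y\,dr \\
  &= 2\pi\,\frac{\tau_Y\theta}{\omega_Y L}\int_0^{r_e} r^3\,dr + 2\pi\,\tau_Y \int_{r_e}^R r^2\,dr \\
  &= 2\pi\,\frac{\tau_Y\theta}{\omega_Y L}\left[\frac{r^4}{4}\right]_0^{r_e} + 2\pi\,\tau_Y\left[\frac{r^3}{3}\right]_{r_e}^R \\
  &= \frac{\pi\,\tau_Y\,\theta}{2\,\omega_Y L}\,r_e^4 + \frac{2\pi}{3}\,\tau_Y\left(R^3 - r_e^3\right)
\end{align*}
But $r_e = L\,\omega_Y/\theta$, so
\begin{align*}
T &= \frac{\pi\,\tau_Y\,\theta}{2\,\omega_Y L}\,\frac{L^4\omega_Y^4}{\theta^4} + \frac{2\pi}{3}\,\tau_Y R^3 - \frac{2\pi}{3}\,\tau_Y\,\frac{L^3\omega_Y^3}{\theta^3} \\
  &= \frac{2\pi}{3}\,\tau_Y R^3 + \pi\left(\frac{1}{2}\tau_Y - \frac{2}{3}\tau_Y\right)\frac{L^3\omega_Y^3}{\theta^3} \\
  &= \frac{2\pi}{3}\,\tau_Y R^3 - \frac{\pi}{6}\,\tau_Y\,\frac{L^3\omega_Y^3}{\theta^3} \qquad (7)
\end{align*}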
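Equation (6) will apply when $r_e \ge R$, i.e. $(L\,\omega_Y/\theta) \ge R$ or $\theta \le (L\,\omega_Y/R)$, so that combining (6)
and (7) gives overall
\[ T = \begin{cases} \dfrac{\pi\,\tau_Y\,\theta}{2\,\omega_Y L}\,R^4 & \theta \le \dfrac{L\,\omega_Y}{R} \\[2ex] \dfrac{2\pi}{3}\,\tau_Y R^3 - \dfrac{\pi}{6}\,\tau_Y\,\dfrac{L^3\omega_Y^3}{\theta^3} & \theta > \dfrac{L\,\omega_Y}{R} \end{cases} \qquad (8) \]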
Interpretation and further comment
At the critical value of $\theta$, i.e. when the outer edge begins to exhibit plasticity, both formulae in (8)
give
\[ T_{\text{crit}} = \frac{\pi}{2}\,\tau_Y R^3 \]
Furthermore, the first derivatives are both
\[ \frac{dT}{d\theta} = \frac{\pi\,\tau_Y R^4}{2\,\omega_Y L} \]
i.e. the curves join smoothly.
The second derivatives, though, are not equal (zero in one case). In the theoretical limit as $\theta \to \infty$
\[ T = \frac{2\pi}{3}\,\tau_Y R^3 \]
so this is the total torsional torque which can be carried by the bar. (The critical torque above is
three-quarters of this value.) However, clearly $\theta \to \infty$ is merely a theoretical limit since the bar
would, in fact, shear at a finite value of $\theta$.
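The piecewise result (8) and the comments on it can be explored numerically. The following is a minimal sketch: the function name `torque` and all material and geometry values are illustrative assumptions, not data from the Workbook.

```python
import math

def torque(theta, tau_y, omega_y, R, L):
    """Torque from Equation (8) for a twist angle theta > 0 (radians)."""
    theta_crit = L * omega_y / R
    if theta <= theta_crit:
        # whole bar elastic: Equation (6)
        return math.pi * tau_y * theta * R**4 / (2 * omega_y * L)
    # outer region plastic: Equation (7)
    return (2 * math.pi / 3) * tau_y * R**3 - (math.pi / 6) * tau_y * (L * omega_y / theta)**3

# Illustrative values only (assumptions): yield stress in Pa, yield strain, radius and length in m
tau_y, omega_y, R, L = 150e6, 2e-3, 0.01, 0.5

theta_crit = L * omega_y / R
t_crit = torque(theta_crit, tau_y, omega_y, R, L)
t_just_above = torque(theta_crit * (1 + 1e-9), tau_y, omega_y, R, L)
t_limit = (2 * math.pi / 3) * tau_y * R**3   # theoretical limit as theta grows

print("critical angle:", theta_crit)
print("torque at and just above the critical angle:", t_crit, t_just_above)
print("ratio of critical torque to limiting torque:", t_crit / t_limit)
```

The two branches agree at the critical angle, and the printed ratio reflects the three-quarters relationship noted in the text.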
3. Some integrals with infinite limits

On occasions, and notably when dealing with Laplace and Fourier transforms, you will come across
integrals in which one of the limits is infinite. We avoid a rigorous treatment of such cases here and
instead give some commonly occurring examples.
Example 11
Find the definite integral of $e^{-x}$ from 0 to $\infty$; that is, find $\int_0^\infty e^{-x}\,dx$.
Solution
The integral is found in the normal way: $\int_0^\infty e^{-x}\,dx = \big[-e^{-x}\big]_0^\infty$
There is no difficulty in evaluating the square bracket at the lower limit. We obtain simply $-e^{-0} = -1$.
At the upper limit we must examine the behaviour of $-e^{-x}$ as $x$ gets infinitely large. This is
where it is important that you are familiar with the properties of the exponential function. If you
refer to the graph (Figure 5) you will see that as $x$ tends to infinity $e^{-x}$ tends to zero.
Consequently the contribution to the integral from the upper limit is zero. So
\begin{align*}
\int_0^\infty e^{-x}\,dx &= \big[-e^{-x}\big]_0^\infty \\
  &= (-e^{-\infty}) - (-e^{-0}) \\
  &= (0) - (-e^{-0}) \\
  &= 1
\end{align*}
[Figure 5: graph of $e^{-x}$ against $x$]
Thus the value of $\int_0^\infty e^{-x}\,dx$ is 1.
Another way of achieving this result is as follows:

We change the infinite limit to a finite limit, $b$, say and then examine the behaviour of the integral
as $b$ tends to infinity, written as
\[ \int_0^\infty e^{-x}\,dx = \lim_{b\to\infty} \int_0^b e^{-x}\,dx \]
Now,
\[ \int_0^b e^{-x}\,dx = \big[-e^{-x}\big]_0^b = \left(-e^{-b}\right) - \left(-e^{-0}\right) = -e^{-b} + 1 \]
Then as $b$ tends to infinity $-e^{-b}$ tends to zero, and the resulting integral has the value 1, as before.
Many integrals having infinite limits cannot be evaluated in a simple way like this, and many cannot
be evaluated at all. Fortunately, most of the integrals you will meet will exhibit the sort of behaviour
seen in the last example.
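The limiting process in Example 11 can be watched numerically. This short sketch, in which the function name `integral_to_b` is an illustrative choice, evaluates $-e^{-b} + 1$ for increasing values of the finite upper limit $b$.

```python
import math

def integral_to_b(b):
    """Value of the integral of exp(-x) from 0 to b, i.e. -exp(-b) + 1."""
    return -math.exp(-b) + 1

# The values creep up towards 1 as the upper limit b is made larger
for b in (1, 5, 10, 20, 50):
    print(b, integral_to_b(b))
```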
Exercise
Evaluate (a) $\int_1^\infty e^{-x}\,dx$ (b) $\int_0^\infty e^{-2x}\,dx$ (c) $\int_2^\infty e^{-3x}\,dx$ (d) $\int_1^\infty \frac{4}{t^2}\,dt$
Answer
(a) $e^{-1} \approx 0.368$ (b) $\frac{1}{2}$ (c) $\frac{1}{3}e^{-6} = 0.0008$ (4 d.p.) (d) 4
The Area Bounded
by a Curve 13.3
Introduction
One of the important applications of integration is to find the area bounded by a curve. Often such
an area can have a physical significance like the work done by a motor, or the distance travelled by
a vehicle. In this Section we explain how such an area is calculated.
Prerequisites
Before starting this Section you should . . .
• understand integration as the reverse of differentiation
• be able to use a table of integrals
• be able to evaluate definite integrals
• be able to sketch graphs of common functions including polynomials, simple
rational functions, exponential functions and trigonometric functions

Learning Outcomes
On completion you should be able to . . .
• find the area bounded by a curve and the x-axis
• find the area between two curves

1. Calculating the area under a curve

Let us denote the area under y = f (x) between a fixed point a and a variable point x by A(x):
[Figure 6: the curve $y = f(x)$, with the area $A(x)$ between $a$ and $x$ marked]
A(x) is clearly a function of x since as the upper limit changes so does the area. How does the area
change if we change the upper limit by a very small amount δx? See Figure 7 below.
[Figure 7: the curve $y = f(x)$, with the shaded strip of area $A(x + \delta x) - A(x)$ and height $f(x)$ between $x$ and $x + \delta x$]
To a good approximation the change in the area is:
\[ A(x + \delta x) - A(x) \approx f(x)\,\delta x \]
[This is because the shaded area is approximately a rectangle with base $\delta x$ and height $f(x)$.] This
approximation gets better and better as $\delta x$ gets smaller and smaller. Rearranging gives:
\[ f(x) \approx \frac{A(x + \delta x) - A(x)}{\delta x} \]
Clearly, in the limit as $\delta x \to 0$ we have
\[ f(x) = \lim_{\delta x \to 0} \frac{A(x + \delta x) - A(x)}{\delta x} \]
But this limit on the right-hand side is the derivative of $A(x)$ with respect to $x$ so
\[ f(x) = \frac{dA(x)}{dx} \]
Thus $A(x)$ is an indefinite integral of $f(x)$ and we can therefore write:
\[ A(x) = \int f(x)\,dx \]
Now the area under the curve from $a$ to $b$ is clearly $A(b) - A(a)$. But remembering our shorthand
notation for this difference, introduced in the last Section we have, finally
\[ A(b) - A(a) \equiv \big[A(x)\big]_a^b = \int_a^b f(x)\,dx \]
We conclude that the area under the curve y = f (x) from a to b is given by the definite integral of
f (x) from a to b.
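The key step in this argument, that the change in area divided by $\delta x$ approaches $f(x)$, can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the area function is approximated by a midpoint-rule sum (a technique not used in the text), and $f(x) = \cos x$ with $a = 0$ is an arbitrary choice.

```python
import math

def area(f, a, x, n=10000):
    """Approximate A(x), the area under f from a to x, by the midpoint rule."""
    h = (x - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = math.cos      # illustrative choice of f (assumption)
a, x = 0.0, 1.0

# (A(x + dx) - A(x)) / dx should move closer to f(x) as dx shrinks
for dx in (0.1, 0.01, 0.001):
    ratio = (area(f, a, x + dx) - area(f, a, x)) / dx
    print(dx, ratio, f(x))
```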
2. The area bounded by a curve lying above the x-axis
Consider the graph of the function $y = f(x)$ shown in Figure 8. Suppose we are interested in
calculating the area underneath the graph and above the x-axis, between the points where $x = a$
and $x = b$. When such an area lies entirely above the x-axis, as is clearly the case here, this area is
given by the definite integral $\int_a^b f(x)\,dx$.
[Figure 8: the curve $y = f(x)$, with the required area between $x = a$ and $x = b$ shaded]
Key Point 4
The area under the curve $y = f(x)$, between $x = a$ and $x = b$ is given by $\int_a^b f(x)\,dx$
when the curve lies entirely above the x-axis between $a$ and $b$.
Example 12
Calculate the area bounded by $y = x^{-1}$ and the x-axis, between $x = 1$ and $x = 4$.
Solution
Below is a graph of $y = x^{-1}$. The area required is shaded; it lies entirely above the x-axis.
[Figure 9: graph of $y = \frac{1}{x}$, with the required area between $x = 1$ and $x = 4$ shaded]
\[ \text{area} = \int_1^4 \frac{1}{x}\,dx = \big[\ln |x|\big]_1^4 = \ln 4 - \ln 1 = \ln 4 = 1.386 \text{ (3 d.p.)} \]
Task
Find the area bounded by the curve y = sin x and the x-axis between x = 0 and
x = π. (The required area is shown in the figure. Note that it lies entirely above
the x-axis.)
[Figure: graph of $y = \sin x$, with the required area between $x = 0$ and $x = \pi$ shaded]
Your solution
Answer
\[ \int_0^\pi \sin x\,dx = \big[-\cos x\big]_0^\pi = 2. \]
Task
Find the area under $f(x) = e^{2x}$ from $x = 1$ to $x = 3$ given that the exponential
function $e^{2x}$ is always positive.
Your solution
Answer
\[ \text{area} = \int_1^3 e^{2x}\,dx = \left[\tfrac{1}{2}e^{2x}\right]_1^3 = 198 \text{ to 3 significant figures.} \]
Example 13
The figure shows the graphs of $y = \sin x$ and $y = \cos x$ for $0 \le x \le \frac{1}{2}\pi$. The two
graphs intersect at the point where $x = \frac{1}{4}\pi$. Find the shaded area.
[Figure 10: graphs of $y = \cos x$ and $y = \sin x$ for $0 \le x \le \frac{\pi}{2}$, with the required area shaded]
Solution
To find the shaded area we could calculate the area under the graph of $y = \sin x$ for $x$ between
0 and $\frac{1}{4}\pi$, and subtract this from the area under the graph of $y = \cos x$ between the same limits.
Alternatively the two processes can be combined into one and we can write
\begin{align*}
\text{shaded area} &= \int_0^{\pi/4} (\cos x - \sin x)\,dx \\
  &= \big[\sin x + \cos x\big]_0^{\pi/4} \\
  &= \left(\sin\tfrac{1}{4}\pi + \cos\tfrac{1}{4}\pi\right) - (\sin 0 + \cos 0) \\
  &= \left(\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\right) - (0 + 1) = \frac{2}{\sqrt{2}} - 1 = \sqrt{2} - 1
\end{align*}
So the numeric value of the integral is $\sqrt{2} - 1 = 0.414$ to 3 d.p. (Alternatively you can use your
calculator to obtain this result directly by evaluating $\sin\frac{\pi}{4}$ and $\cos\frac{\pi}{4}$.)
Exercises
In each question you should check that the required area lies entirely above the horizontal axis.
1. Find the area under the curve $y = 7x^2$ and above the x-axis between $x = 2$ and $x = 5$.
2. Find the area bounded by the curve $y = x^3$ and the x-axis between $x = 0$ and $x = 2$.
3. Find the area bounded by the curve $y = 3t^2$ and the t-axis between $t = -3$ and $t = 3$.
4. Find the area under $y = x^{-2}$ between $x = 1$ and $x = 10$.
Answer
1. 273, 2. 4, 3. 54, 4. 0.9.
3. The area bounded by a curve, not entirely above the x-axis

Figure 11 shows a graph of $y = -x^2 + 1$.
[Figure 11: graph of $y = -x^2 + 1$, with the required area between $x = 1$ and $x = 2$ shaded]
The shaded area is bounded by the x-axis and the curve, but lies entirely below the x-axis. Let us
evaluate the integral $\int_1^2 (-x^2 + 1)\,dx$.
\begin{align*}
\int_1^2 (-x^2 + 1)\,dx &= \left[-\frac{x^3}{3} + x\right]_1^2 \\
  &= \left(-\frac{2^3}{3} + 2\right) - \left(-\frac{1^3}{3} + 1\right) \\
  &= -\frac{7}{3} + 1 = -\frac{4}{3}
\end{align*}
The evaluation of the area yields a negative quantity. There is, of course, no such thing as a negative
area. The area is actually $\frac{4}{3}$, and the negative sign is an indication that the area lies below the x-axis.
(However, in applications of integration such as work/energy or distance travelled in a given direction
negative values can be meaningful.)
If an area contains parts both above and below the horizontal axis, care must be taken when calcu-
lating this area. It is necessary to determine which parts of the graph lie above the horizontal axis
and which lie below. Separate integrals need to be calculated for each ‘piece’ of the graph. This idea
is illustrated in the next Example.
Example 14
Find the total area enclosed by the curve $y = x^3 - 5x^2 + 4x$ and the x-axis between
$x = 0$ and $x = 3$.
Solution
We need to determine which parts of the graph lie above and which lie below the x-axis. To do this
it is helpful to consider where the graph cuts the x-axis. So we consider the function $x^3 - 5x^2 + 4x$
and look for its zeros
\[ x^3 - 5x^2 + 4x = x(x^2 - 5x + 4) = x(x - 1)(x - 4) \]
So the graph cuts the x-axis when $x = 0$, $x = 1$ and $x = 4$. Also, when $x$ is large and positive,
$y$ is large and positive since the term involving $x^3$ dominates. When $x$ is large and negative, $y$ is
large and negative for the same reason. With this information we can sketch a graph showing the
required area:
[Figure 12: graph of $y = x^3 - 5x^2 + 4x$, with the required area between $x = 0$ and $x = 3$ shaded]
From the graph we see that the required area lies partly above the x-axis (when $0 \le x \le 1$) and
partly below (when $1 \le x \le 3$). So we evaluate the integral in two parts: Firstly:
\[ \int_0^1 (x^3 - 5x^2 + 4x)\,dx = \left[\frac{x^4}{4} - \frac{5x^3}{3} + \frac{4x^2}{2}\right]_0^1 = \left(\frac{1}{4} - \frac{5}{3} + 2\right) - (0) = \frac{7}{12} \]
This is the part of the required area which lies above the x-axis. Secondly:
\begin{align*}
\int_1^3 (x^3 - 5x^2 + 4x)\,dx &= \left[\frac{x^4}{4} - \frac{5x^3}{3} + \frac{4x^2}{2}\right]_1^3 \\
  &= \left(\frac{81}{4} - \frac{135}{3} + 18\right) - \left(\frac{1}{4} - \frac{5}{3} + 2\right) = -\frac{22}{3}
\end{align*}
This represents the part of the required area which lies below the x-axis. The actual area is $\frac{22}{3}$.
Combining the results of the two separate calculations we can find the total area bounded by the
curve:
\[ \text{area} = \frac{7}{12} + \frac{22}{3} = \frac{95}{12} \]
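The difference between the total area and the value of the single integral from 0 to 3 is easy to see with exact arithmetic. This is a minimal sketch; the function name `F` and the variable names are illustrative choices.

```python
from fractions import Fraction

def F(x):
    """An antiderivative of x**3 - 5*x**2 + 4*x (exact fractions)."""
    x = Fraction(x)
    return x**4 / 4 - 5 * x**3 / 3 + 2 * x**2

above = F(1) - F(0)                      # piece above the x-axis
below = F(3) - F(1)                      # piece below the x-axis (negative)
total_area = abs(above) + abs(below)     # area counts both pieces as positive
signed_integral = F(3) - F(0)            # single integral lets the pieces cancel

print(above, below, total_area, signed_integral)
```

Integrating straight through from 0 to 3 gives the signed value, which is not the enclosed area; splitting at the zero $x = 1$ is what makes the area calculation correct.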
Task
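(a) Sketch the graph of $y = \sin 2x$ for $0 \le x \le \pi$.
(b) Find the total area bounded by the curve and the x-axis between $x = \frac{1}{3}\pi$
and $x = \frac{3}{4}\pi$.
(a) Sketch the graph and indicate the required area noting where the graph crosses the x-axis:
Your solution
Answer
[Figure: graph of $y = \sin 2x$ for $0 \le x \le \pi$, crossing the x-axis at $x = \frac{\pi}{2}$, with the required area between $x = \frac{\pi}{3}$ and $x = \frac{3\pi}{4}$ shaded]
(b) Perform the integration in two parts to obtain the required area:
Your solution
Answer
\[ \int_{\pi/3}^{\pi/2} \sin 2x\,dx = \frac{1}{4} \quad\text{and}\quad \int_{\pi/2}^{3\pi/4} \sin 2x\,dx = -\frac{1}{2}. \]
The required area is $\frac{1}{4} + \frac{1}{2} = \frac{3}{4}$.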
Exercises
1. Find the total area enclosed between the x-axis and the curve $y = x^3$ between $x = -1$ and
$x = 1$.
2. Find the area under $y = \cos 2t$ from $t = 0$ to $t = 0.5$.
3. Find the area enclosed by $y = 4 - x^2$ and the x axis
(a) from $x = 0$ to $x = 2$, (b) from $x = -2$ to $x = 1$, (c) from $x = 1$ to $x = 3$.
4. Calculate the area enclosed by the curve $y = x^3$ and the line $y = x$.
5. Find the area bounded by $y = e^x$, the y-axis and the line $x = 2$.
6. Find the area enclosed between $y = x(x - 1)(x - 2)$ and the x axis.
Answers
1. 0.5 2. 0.4207 3. (a) $\frac{16}{3}$, (b) 9, (c) 4 4. 0.5 5. $e^2 - 1$ 6. $\frac{1}{2}$

Integration by Parts 13.4

Introduction
Integration by Parts is a technique for integrating products of functions. In this Section you will learn
to recognise when it is appropriate to use the technique and have the opportunity to practise using
it for finding both definite and indefinite integrals.
Prerequisites
Before starting this Section you should . . .
• understand what is meant by definite and indefinite integrals
• be able to use a table of integrals
• be able to differentiate and integrate a range of common functions

Learning Outcomes
On completion you should be able to . . .
• decide when it is appropriate to use the method known as integration by parts
• apply the formula for integration by parts to definite and indefinite integrals
• perform integration by parts repeatedly if appropriate
1. Indefinite integration
The technique known as integration by parts is used to integrate a product of two functions, such
as in these two examples:
(i) $\int e^{2x} \sin 3x\,dx$ (ii) $\int_0^1 x^3 e^{-2x}\,dx$
Note that in the first example, the integrand is the product of the functions $e^{2x}$ and $\sin 3x$, and in
the second example the integrand is the product of the functions $x^3$ and $e^{-2x}$. Note also that we
can change the order of the terms in the product if we wish and write
(i) $\int (\sin 3x)\,e^{2x}\,dx$ (ii) $\int_0^1 e^{-2x} x^3\,dx$
What you must never do is integrate each term in the product separately and then multiply - the
integral of a product is not the product of the separate integrals. However, it is often possible to
find integrals involving products using the method of integration by parts - you can think of this as
a product rule for integrals.
The integration by parts formula states:
Key Point 5
Integration by Parts for Indefinite Integrals
For indefinite integrals, given functions f (x) and g(x):
\[ \int f \cdot g\,dx = f \cdot \int g\,dx - \int \left(\frac{df}{dx} \cdot \int g\,dx\right) dx \]
Alternatively, given functions $u$ and $v$:
\[ \int u\,\frac{dv}{dx}\,dx = u\,v - \int v\,\frac{du}{dx}\,dx \]
Study the formula carefully and note the following observations. Firstly, to apply the formula we must
be able to differentiate the function $f$ to find $\frac{df}{dx}$, and we must be able to integrate the function, $g$.
Secondly the formula replaces one integral, the one on the left, with a different integral, that on the
far right. The intention is that the latter, whilst it may look more complicated in the formula above,
is simpler to evaluate. Consider the following Example:
Example 15
Find the integral of the product of $x$ with $\sin x$; that is, find $\int x \sin x\,dx$.
Solution
Compare the required integral with the formula for integration by parts: we choose
\[ f = x \quad\text{and}\quad g = \sin x \]
It follows that
\[ \frac{df}{dx} = 1 \quad\text{and}\quad \int g\,dx = \int \sin x\,dx = -\cos x \]
(When integrating $g$ there is no need to worry about a constant of integration. When you become
confident with the method, you may like to think about why this is the case.)
Applying the formula we obtain
\begin{align*}
\int x \sin x\,dx &= f \cdot \int g\,dx - \int \left(\frac{df}{dx} \cdot \int g\,dx\right) dx \\
  &= x(-\cos x) - \int 1\,(-\cos x)\,dx \\
  &= -x \cos x + \int \cos x\,dx = -x \cos x + \sin x + c
\end{align*}
Task
Find $\int (5x + 1) \cos 2x\,dx$.
Let $f = 5x + 1$ and $g = \cos 2x$. Now calculate $\frac{df}{dx}$ and $\int g\,dx$:
Your solution
Answer
$\frac{df}{dx} = 5$ and $\int \cos 2x\,dx = \frac{1}{2} \sin 2x$.
Substitute these results into the formula for integration by parts and complete the Task:
Your solution
Answer
\[ (5x + 1)\left(\tfrac{1}{2} \sin 2x\right) - \int 5\left(\tfrac{1}{2} \sin 2x\right) dx = \tfrac{1}{2}(5x + 1) \sin 2x + \tfrac{5}{4} \cos 2x + c \]
Sometimes it is necessary to apply the formula more than once, as the next Example shows.
Example 16
Find $\int 2x^2 e^{-x}\,dx$
Solution
We let $f = 2x^2$ and $g = e^{-x}$. Then $\frac{df}{dx} = 4x$ and $\int g\,dx = -e^{-x}$
Using the formula for integration by parts we find
\[ \int 2x^2 e^{-x}\,dx = 2x^2(-e^{-x}) - \int 4x(-e^{-x})\,dx = -2x^2 e^{-x} + \int 4x e^{-x}\,dx \]
We now need to find $\int 4x e^{-x}\,dx$ using integration by parts again. We get
\begin{align*}
\int 4x e^{-x}\,dx &= 4x(-e^{-x}) - \int 4(-e^{-x})\,dx \\
  &= -4x e^{-x} + \int 4 e^{-x}\,dx = -4x e^{-x} - 4 e^{-x}
\end{align*}
Altogether we have
\[ \int 2x^2 e^{-x}\,dx = -2x^2 e^{-x} - 4x e^{-x} - 4 e^{-x} + c = -2e^{-x}(x^2 + 2x + 2) + c \]
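Since integration reverses differentiation, a result found by parts can be checked by differentiating it. The sketch below does this numerically for Examples 15 and 16, using a central-difference estimate of the derivative (an illustrative technique that the text does not use); the helper name `derivative` and the sample points are assumptions.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference estimate of dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

checks = [
    # (result found by parts, original integrand)
    (lambda x: -x * math.cos(x) + math.sin(x),
     lambda x: x * math.sin(x)),                       # Example 15
    (lambda x: -2 * math.exp(-x) * (x**2 + 2 * x + 2),
     lambda x: 2 * x**2 * math.exp(-x)),               # Example 16
]

# Largest mismatch between the derivative of each result and its integrand
worst = max(abs(derivative(F, x) - f(x))
            for F, f in checks
            for x in (0.5, 1.0, 2.0))
print("largest mismatch:", worst)
```

A mismatch close to zero supports the two antiderivatives; it is a spot check at a few points rather than a proof.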
Exercises
In some questions below it will be necessary to apply integration by parts more than once.
1. Find (a) $\int x \sin(2x)\,dx$, (b) $\int t e^{3t}\,dt$, (c) $\int x \cos x\,dx$.
2. Find $\int (x + 3) \sin x\,dx$.
3. By writing $\ln x$ as $1 \times \ln x$ find $\int \ln x\,dx$.
4. Find (a) $\int \tan^{-1} x\,dx$, (b) $\int -7x \cos 3x\,dx$, (c) $\int 5x^2 e^{3x}\,dx$,
5. Find (a) $\int x \cos kx\,dx$, where $k$ is a constant (b) $\int z^2 \cos kz\,dz$, where $k$ is a constant.
6. Find (a) $\int t e^{-st}\,dt$ where $s$ is a constant, (b) $\int t^2 e^{-st}\,dt$ where $s$ is a constant.
Answers
1. (a) $\frac{1}{4} \sin 2x - \frac{1}{2} x \cos 2x + c$, (b) $e^{3t}\left(\frac{1}{3}t - \frac{1}{9}\right) + c$, (c) $\cos x + x \sin x + c$
2. $-(x + 3) \cos x + \sin x + c$.
3. $x \ln x - x + c$.
4. (a) $x \tan^{-1} x - \frac{1}{2} \ln(x^2 + 1) + c$, (b) $-\frac{7}{9} \cos 3x - \frac{7}{3} x \sin 3x + c$, (c) $\frac{5}{27} e^{3x}(9x^2 - 6x + 2) + c$,
5. (a) $\frac{\cos kx}{k^2} + \frac{x \sin kx}{k} + c$, (b) $\frac{2z \cos kz}{k^2} + \frac{z^2 \sin kz}{k} - \frac{2 \sin kz}{k^3} + c$.
6. (a) $\frac{-e^{-st}(st + 1)}{s^2} + c$, (b) $\frac{-e^{-st}(s^2 t^2 + 2st + 2)}{s^3} + c$.
2. Definite integration
When dealing with definite integrals the relevant formula is as follows:
Key Point 6
Integration by Parts for Definite Integrals
For definite integrals, given functions f (x) and g(x):
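\[ \int_a^b f \cdot g\,dx = \left[f \cdot \int g\,dx\right]_a^b - \int_a^b \left(\frac{df}{dx} \cdot \int g\,dx\right) dx \]
Alternatively, given functions $u$ and $v$:
\[ \int_a^b u\,\frac{dv}{dx}\,dx = \big[u\,v\big]_a^b - \int_a^b v\,\frac{du}{dx}\,dx \]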
Example 17
Find $\int_0^2 x e^x\,dx$.
Solution
We let $f = x$ and $g = e^x$. Then $\frac{df}{dx} = 1$ and $\int g\,dx = e^x$. Using integration by parts we obtain
\[ \int_0^2 x e^x\,dx = \big[x e^x\big]_0^2 - \int_0^2 1 \cdot e^x\,dx = 2e^2 - \big[e^x\big]_0^2 = 2e^2 - [e^2 - 1] = e^2 + 1 \quad\text{(or 8.389 to 3 d.p.)} \]
Sometimes it is necessary to apply the formula more than once as the next Example shows.
Example 18
Find the definite integral of $x^2 e^x$ from 0 to 2; that is, find $\int_0^2 x^2 e^x\,dx$.
Solution
We let $f = x^2$ and $g = e^x$. Then $\frac{df}{dx} = 2x$ and $\int g\,dx = e^x$. Using integration by parts:
\[ \int_0^2 x^2 e^x\,dx = \big[x^2 e^x\big]_0^2 - \int_0^2 2x e^x\,dx = 4e^2 - 2\int_0^2 x e^x\,dx \]
The remaining integral must be integrated by parts also but we have just done this in the example
above. So $\int_0^2 x^2 e^x\,dx = 4e^2 - 2[e^2 + 1] = 2e^2 - 2 = 12.778$ (3 d.p.)
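The values obtained in Examples 17 and 18 can be cross-checked against a direct numerical estimate. The sketch below uses composite Simpson's rule, which is not part of the text and is introduced here only as an independent check; the function name `simpson` is an illustrative choice.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n strips (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

first = simpson(lambda x: x * math.exp(x), 0, 2)       # Example 17
second = simpson(lambda x: x**2 * math.exp(x), 0, 2)   # Example 18

print(first, math.e**2 + 1)
print(second, 2 * math.e**2 - 2)
```

Each printed pair should agree to many decimal places, which supports the results found by parts.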
Task
Find $\int_0^{\pi/4} (4 - 3x) \sin x\,dx$.
What are your choices for $f$, $g$?
Your solution
Answer
Take $f = 4 - 3x$ and $g = \sin x$.
Now complete the integral:
Your solution
$\int_0^{\pi/4} (4 - 3x) \sin x\,dx =$
Answer
\begin{align*}
\int_0^{\pi/4} (4 - 3x) \sin x\,dx &= \big[(4 - 3x)(-\cos x)\big]_0^{\pi/4} - \int_0^{\pi/4} 3 \cos x\,dx \\
  &= \big[(4 - 3x)(-\cos x)\big]_0^{\pi/4} - \big[3 \sin x\big]_0^{\pi/4} \\
  &= 0.716 \text{ to 3 d.p.}
\end{align*}
Exercises
1. Evaluate the following: (a) $\int_0^1 x \cos 2x\,dx$, (b) $\int_0^{\pi/2} x \sin 2x\,dx$, (c) $\int_{-1}^1 t e^{2t}\,dt$
2. Find $\int_1^2 (x + 2) \sin x\,dx$
3. Find $\int_0^1 (x^2 - 3x + 1) e^x\,dx$
Answers
1. (a) 0.1006, (b) $\pi/4 = 0.7854$, (c) 1.9488.
2. 3.3533.
3. $-0.5634$.
Integration by
Substitution and Using
Partial Fractions 13.5

Introduction
The first technique described here involves making a substitution to simplify an integral. We let
a new variable equal a complicated part of the function we are trying to integrate. Choosing the
correct substitution often requires experience. This skill develops with practice.
Often the technique of partial fractions can be used to write an algebraic fraction as the sum of simpler
fractions. On occasions this means that we can then integrate a complicated algebraic fraction. We
shall explore this approach in the second half of the section.
Prerequisites
Before starting this Section you should . . .
• be able to find a number of simple definite and indefinite integrals
• be familiar with the technique of expressing an algebraic fraction as the sum of its partial
fractions

Learning Outcomes
On completion you should be able to . . .
• make simple substitutions in order to find definite and indefinite integrals
• understand the technique used for evaluating integrals of the form $\int \frac{f'(x)}{f(x)}\,dx$
• use partial fractions to express an algebraic fraction in a simpler form and integrate it
1. Making a substitution
The technique described here involves making a substitution in order to simplify an integral. We let
a new variable, u say, equal a more complicated part of the function we are trying to integrate. The
choice of which substitution to make often relies upon experience: don’t worry if at first you cannot
see an appropriate substitution. This skill develops with practice. However, it is not simply a matter
of changing the variable - care must be taken with the differential form dx as we shall see. The
technique is illustrated in the following Example.
Example 19
Find $\int (3x + 5)^6\,dx$.
Solution
First look at the function we are trying to integrate: $(3x + 5)^6$. It looks quite complicated to
integrate. Suppose we introduce a new variable, $u$, such that $u = 3x + 5$. Doing this means that
the function we must integrate becomes $u^6$. Would you not agree that this looks a much simpler
function to integrate than $(3x + 5)^6$? There is a slight complication however. The new function of
$u$ must be integrated with respect to $u$ and not with respect to $x$. This means that we must take
care of the term $dx$ correctly.
Long Method: $u = 3x + 5$ so $\frac{du}{dx} = 3$, or $\frac{dx}{du} = \frac{1}{3}$
\begin{align*}
\text{Let } I = \int (3x + 5)^6\,dx &= \int u^6\,dx && \text{(substituting for } 3x + 5\text{)} \\
  &= \int u^6\,\frac{dx}{du}\,du && \text{(to change from } x \text{ to } u\text{)} \\
  &= \int u^6 \cdot \frac{1}{3}\,du && \text{(substituting for } \tfrac{dx}{du}\text{)} \\
  &= \frac{1}{3} \int u^6\,du = \frac{u^7}{21} + \text{constant}
\end{align*}
Short Method: $u = 3x + 5$ so $\frac{du}{dx} = 3$, so $dx = \frac{1}{3}\,du$
\[ \text{Let } I = \int (3x + 5)^6\,dx = \int u^6\,dx = \int u^6 \cdot \frac{1}{3}\,du = \frac{1}{3} \int u^6\,du = \frac{u^7}{21} + \text{constant} \]
To finish off we must rewrite this answer in terms of the original variable $x$ and replace $u$ by $3x + 5$:
\[ \int (3x + 5)^6\,dx = \frac{(3x + 5)^7}{21} + c \]
In practice the short method is generally used but mathematicians don’t like to separate the ‘dx’
from the ‘du’ as in the statement ‘$dx = \frac{1}{3}\,du$’ as it is meaningless mathematically (but it works!). In
the future we will use the short method, with apologies to the mathematicians!
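The result of Example 19 can be confirmed without any substitution by expanding $(3x + 5)^6$ and integrating term by term. The sketch below does this exactly over the interval from 0 to 1, which is an arbitrary choice made for illustration; the variable and function names are likewise illustrative.

```python
from fractions import Fraction
from math import comb

# Binomial expansion: the coefficient of x**k in (3x + 5)**6
coeffs = [comb(6, k) * 3**k * 5**(6 - k) for k in range(7)]

# Integrate term by term from 0 to 1: x**k contributes 1/(k + 1)
term_by_term = sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))

def by_substitution(x):
    """The antiderivative (3x + 5)**7 / 21 found in Example 19."""
    return Fraction((3 * x + 5)**7, 21)

via_substitution = by_substitution(1) - by_substitution(0)

# What results if the factor 1/3 from dx = (1/3) du is forgotten
forgot_factor = Fraction(8**7 - 5**7, 7)

print(term_by_term, via_substitution, forgot_factor)
```

The first two values coincide, while the third is three times too large, which shows why the term $dx$ has to be handled with care.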
Task
By making the substitution $u = \sin x$ find $\int \cos x \sin^2 x\,dx$
You are given the substitution $u = \sin x$. Find $\frac{du}{dx}$:
Your solution
Answer
\[ \frac{du}{dx} = \cos x \]
Now make the substitution, simplify the result, and finally perform the integration:
Your solution
Answer
$\int \cos x \sin^2 x\,dx$ simplifies to $\int u^2\,du$. The final answer is $\frac{1}{3} \sin^3 x + c$.
Exercise
Use suitable substitutions to find
(a) $\int (4x + 1)^7\,dx$ (b) $\int t^2 \sin(t^3 + 1)\,dt$
(Hint: you need to simplify $\sin(t^3 + 1)$)
Answer
(a) $\frac{(4x + 1)^8}{32} + c$ (b) $-\frac{\cos(t^3 + 1)}{3} + c$
2. Substitution and definite integration

If you are dealing with definite integrals (ones with limits of integration) you must be particularly
careful when you substitute. Consider the following example.
Example 20
Find the definite integral $\int_2^3 t \sin(t^2)\,dt$ by making the substitution $u = t^2$.
Solution
Note that if $u = t^2$ then $\frac{du}{dt} = 2t$ so that $dt = \frac{du}{2t}$. We find
\[ \int_{t=2}^{t=3} t \sin(t^2)\,dt = \int_{t=2}^{t=3} t \sin u\,\frac{du}{2t} = \frac{1}{2} \int_{t=2}^{t=3} \sin u\,du \]
An important point to note is that the limits of integration are limits on the variable $t$, not $u$. To
emphasise this they have been written explicitly as $t = 2$ and $t = 3$. When we integrate with respect
to the variable $u$, the limits must be written in terms of $u$. From the substitution $u = t^2$, note that
when $t = 2$ then $u = 4$ and when $t = 3$ then $u = 9$ so the integral becomes
\[ \frac{1}{2} \int_{u=4}^{u=9} \sin u\,du = \frac{1}{2} \big[-\cos u\big]_4^9 = \frac{1}{2}(-\cos 9 + \cos 4) = 0.129 \text{ to 3 d.p.} \]
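The need to convert the limits can be demonstrated numerically. In the sketch below the original integral in $t$ and the transformed integral in $u$ are both estimated with composite Simpson's rule (an illustrative numerical method, not one used in the text) and compared with the closed-form value; the names are hypothetical.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n strips (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

in_t = simpson(lambda t: t * math.sin(t**2), 2, 3)    # limits on t: 2 to 3
in_u = 0.5 * simpson(math.sin, 4, 9)                  # limits on u = t**2: 4 to 9
wrong_limits = 0.5 * simpson(math.sin, 2, 3)          # u-integral with the t limits left unchanged
closed_form = 0.5 * (-math.cos(9) + math.cos(4))

print(in_t, in_u, closed_form, wrong_limits)
```

The first three values agree, whereas leaving the limits as 2 and 3 in the $u$-integral gives a different number.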
Exercise
Use suitable substitutions to find (a) $\int_1^2 (2x + 3)^7\,dx$, (b) $\int_0^1 3t^2 e^{t^3}\,dt$.
Answer
(a) $u = 2x + 3$ is suitable; $3.359 \times 10^5$ to 4 sig. figs. (b) 1.718 to 3 d.p.
3. Integrals giving rise to logarithms
Example 21
Find $\int \frac{3x^2 + 1}{x^3 + x + 2}\,dx$
Solution
Let us consider what happens when we make the substitution $z = x^3 + x + 2$. Note that
\[ \frac{dz}{dx} = 3x^2 + 1 \quad\text{so that we can write}\quad dz = (3x^2 + 1)\,dx \]
Then
\[ \int \frac{3x^2 + 1}{x^3 + x + 2}\,dx = \int \frac{1}{z}\,dz = \ln |z| + c = \ln |x^3 + x + 2| + c \]
Note that in the last Example, the numerator of the integrand (3x2 + 1) is the derivative of the
denominator (x3 + x + 2). The result is the logarithm of the denominator. This is a special case of
the following rule:
Key Point 7
\[ \int \frac{f'(x)}{f(x)}\,dx = \ln |f(x)| + c \]
Note that it is the modulus of f (x) in the answer.
Task
Write down, purely by inspection, the following integrals:
(a) $\int \frac{1}{x + 1}\,dx$, (b) $\int \frac{2x}{x^2 + 8}\,dx$, (c) $\int \frac{1}{x - 3}\,dx$.
Hint: In each case the numerator of the integrand is the derivative of the denominator.
Your solution
(a) (b) (c)
Answer
(a) $\ln |x + 1| + c$, (b) $\ln |x^2 + 8| + c$, (c) $\ln |x - 3| + c$
Task
Evaluate the definite integral $\int_2^4 \frac{3t^2 + 2t}{t^3 + t^2 + 1}\,dt$.
Your solution
Answer
\[ \big[\ln |t^3 + t^2 + 1|\big]_2^4 = \ln 81 - \ln 13 = 1.83 \]
Sometimes it is necessary to make slight adjustments to the integrand to obtain a form for which
the rule in Key Point 7 is suitable. Consider the next Example.
Example 22
Find the indefinite integral $\int \frac{x^2}{x^3 + 1}\,dx$.
Solution
In this Example the derivative of the denominator is $3x^2$ whereas the numerator is just $x^2$. We
adjust the numerator as follows:
\[ \int \frac{x^2}{x^3 + 1}\,dx = \frac{1}{3} \int \frac{3x^2}{x^3 + 1}\,dx \]
and integrate by the rule to get $\frac{1}{3} \ln |x^3 + 1| + c$
Note that the sort of procedure in the last Example is only possible because we can move constant
factors through the integral sign. It would be wrong to try to move terms involving the variable x
in a similar way.
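Key Point 7 and the constant-factor adjustment of Example 22 can be checked numerically over a definite range. The sketch below compares the logarithm rule with a midpoint-rule estimate; the interval from 0 to 1 and the helper names `midpoint` and `log_rule` are illustrative assumptions, and the rule is only applied where $f(x)$ does not vanish.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def log_rule(f, a, b):
    """Integral of f'(x)/f(x) from a to b via Key Point 7 (f nonzero on [a, b])."""
    return math.log(abs(f(b))) - math.log(abs(f(a)))

def denominator(x):
    return x**3 + 1

numeric = midpoint(lambda x: x**2 / denominator(x), 0.0, 1.0)
by_rule = log_rule(denominator, 0.0, 1.0) / 3   # the factor 1/3 from Example 22

print(numeric, by_rule)
```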
Exercise
Write down the result of finding the following integrals.
(a) $\int \frac{1}{x}\,dx$, (b) $\int \frac{2t}{t^2 + 1}\,dt$, (c) $\int \frac{1}{2x + 5}\,dx$, (d) $\int \frac{2}{3x - 2}\,dx$.
Answer
(a) $\ln |x| + c$, (b) $\ln |t^2 + 1| + c$, (c) $\frac{1}{2} \ln |2x + 5| + c$, (d) $\frac{2}{3} \ln |3x - 2| + c$.
4. Integration using partial fractions
Sometimes expressions which at first sight look impossible to integrate using the techniques already
met may in fact be integrated by first expressing them as simpler partial fractions, and then using
the techniques described earlier in this Section. Consider the following Task.
Task
Express $\frac{23 - x}{(x - 5)(x + 4)}$ as the sum of its partial fractions.
Hence find $\int \frac{23 - x}{(x - 5)(x + 4)}\,dx$
First produce the partial fractions. Write the fraction in the form $\frac{A}{x - 5} + \frac{B}{x + 4}$ and find $A$, $B$.
Your solution
Answer
$A = 2$, $B = -3$
Now integrate each term separately:
Your solution
$\int \frac{23 - x}{(x - 5)(x + 4)}\,dx = \int \frac{A}{x - 5}\,dx + \int \frac{B}{x + 4}\,dx =$
Answer
$2 \ln |x - 5| - 3 \ln |x + 4| + c$
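The partial fractions in this Task can be verified with exact arithmetic. The sketch below finds $A$ and $B$ by evaluating at the roots of the denominator factors (one common approach, assumed here rather than taken from the text) and then confirms the decomposition at a few sample points; all names are illustrative.

```python
from fractions import Fraction

def original(x):
    """The fraction (23 - x) / ((x - 5)(x + 4)) evaluated exactly."""
    x = Fraction(x)
    return (23 - x) / ((x - 5) * (x + 4))

def partial(x, A, B):
    """The decomposition A/(x - 5) + B/(x + 4) evaluated exactly."""
    x = Fraction(x)
    return A / (x - 5) + B / (x + 4)

# 23 - x = A(x + 4) + B(x - 5): put x = 5 to find A, and x = -4 to find B
A = Fraction(23 - 5, 5 + 4)
B = Fraction(23 + 4, -4 - 5)

sample_points = (0, 1, 7, -10)   # any values other than 5 and -4
matches = all(original(x) == partial(x, A, B) for x in sample_points)

print(A, B, matches)
```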
Exercises
By expressing the following in partial fractions, evaluate each integral:
Z
1
1. 3
dx
x +x
13x − 4
Z
2. dx
6x2 − x − 2
Z
1
3. dx
(x + 1)(x − 5)
Z
2x
4. dx
(x − 1)2 (x + 1)
Answers
1. $\ln|x| - \frac{1}{2}\ln|x^2+1| + c$
2. $\frac{3}{2}\ln|2x+1| + \frac{2}{3}\ln|3x-2| + c$
3. $\frac{1}{6}\ln|x-5| - \frac{1}{6}\ln|x+1| + c$
4. $-\frac{1}{2}\ln|x+1| + \frac{1}{2}\ln|x-1| - \dfrac{1}{x-1} + c$
Integration of
Trigonometric
Functions 13.6

Introduction
Integrals involving trigonometric functions are commonplace in engineering mathematics. This is
especially true when modelling waves and alternating current circuits. When the root-mean-square
(rms) value of a waveform or signal is to be calculated, you will often find that this results in an integral
of the form
$$\int \sin^2 t\,dt$$
In this Section you will learn how such integrals can be evaluated.
Prerequisites
Before starting this Section you should . . .
• be able to find a number of simple definite and indefinite integrals
• be familiar with standard trigonometric identities

Learning Outcomes
On completion you should be able to . . .
• use trigonometric identities to write integrands in alternative forms to enable them to be integrated

1. Integration of trigonometric functions

Simple integrals involving trigonometric functions have already been dealt with in Section 13.1. See
what you can remember:
Task
Write down the following integrals:
(a) $\int \sin x\,dx$, (b) $\int \cos x\,dx$, (c) $\int \sin 2x\,dx$, (d) $\int \cos 2x\,dx$
Your solution
(a) (b)
(c) (d)
Answer
(a) $-\cos x + c$, (b) $\sin x + c$, (c) $-\frac{1}{2}\cos 2x + c$, (d) $\frac{1}{2}\sin 2x + c$.
The basic rules from which these results can be derived are summarised here:
Key Point 8
$$\int \sin kx\,dx = -\frac{\cos kx}{k} + c \qquad\qquad \int \cos kx\,dx = \frac{\sin kx}{k} + c$$
In engineering applications it is often necessary to integrate functions involving powers of the trigonometric functions such as
$$\int \sin^2 x\,dx \quad\text{or}\quad \int \cos^2 \omega t\,dt$$
Note that these integrals cannot be obtained directly from the formulas in Key Point 8 above.
However, by making use of trigonometric identities, the integrands can be re-written in an alternative
form. It is often not clear which identities are useful and each case needs to be considered individually.
Experience and practice are essential. Work through the following Task.
Task
Use the trigonometric identity $\sin^2\theta \equiv \frac{1}{2}(1 - \cos 2\theta)$ to express the integral $\int \sin^2 x\,dx$ in an alternative form and hence evaluate it.
(a) First use the identity:

Your solution
$\int \sin^2 x\,dx =$
Answer
The integral can be written $\int \frac{1}{2}(1 - \cos 2x)\,dx$.
Note that the trigonometric identity is used to convert a power of sin x into a function involving
cos 2x which can be integrated directly using Key Point 8.
(b) Now evaluate the integral:
Your solution
Answer
$$\frac{1}{2}\left[x - \frac{1}{2}\sin 2x + c\right] = \frac{1}{2}x - \frac{1}{4}\sin 2x + K \quad\text{where } K = c/2.$$
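The result for $\int \sin^2 x\,dx$ can be checked numerically by comparing a definite integral computed from the antiderivative with a direct approximation. The sketch below uses the midpoint rule, which is a choice made here for illustration rather than a method from this Section; the limits 0 and 1.5 are arbitrary.

```python
import math

def midpoint_rule(f, a, b, n=10_000):
    """Approximate the integral of f from a to b using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def antiderivative(x):
    # Result of the Task with the constant K taken as 0
    return 0.5 * x - 0.25 * math.sin(2 * x)

a, b = 0.0, 1.5  # arbitrary illustrative limits
numeric = midpoint_rule(lambda x: math.sin(x) ** 2, a, b)
exact = antiderivative(b) - antiderivative(a)
print(numeric, exact)
```

The two printed values should agree to several decimal places.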
Task
Use the trigonometric identity $\sin 2x \equiv 2\sin x\cos x$ to find $\int \sin x\cos x\,dx$
(a) First use the identity:

Your solution
$\int \sin x\cos x\,dx =$
Answer
The integrand can be written as $\frac{1}{2}\sin 2x$
(b) Now evaluate the integral:
Your solution
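Answer
$$\int \sin x\cos x\,dx = \int \frac{1}{2}\sin 2x\,dx = -\frac{1}{4}\cos 2x + c$$
In particular, over the interval from $0$ to $2\pi$:
$$\int_0^{2\pi} \sin x\cos x\,dx = \left[-\frac{1}{4}\cos 2x\right]_0^{2\pi} = -\frac{1}{4}\cos 4\pi + \frac{1}{4}\cos 0 = -\frac{1}{4} + \frac{1}{4} = 0$$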
This result is one example of what are called orthogonality relations.
Magnetic flux
Introduction
The magnitude of the magnetic flux density on the axis of a solenoid, as in Figure 13, can be found
by the integral:
$$B = \int_{\beta_1}^{\beta_2} \frac{\mu_0 n I}{2}\sin\beta\,d\beta$$
where $\mu_0$ is the permeability of free space ($\approx 4\pi \times 10^{-7}$ H m$^{-1}$), $n$ is the number of turns per unit length and $I$ is
the current.
Figure 13: A solenoid and angles $\beta_1$, $\beta_2$ defining its extent
Problem in words
Predict the magnetic flux in the middle of a long solenoid.
We assume that the solenoid is so long that β1 ≈ 0 and β2 ≈ π so that
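$$B = \int_{\beta_1}^{\beta_2} \frac{\mu_0 n I}{2}\sin\beta\,d\beta \approx \int_0^{\pi} \frac{\mu_0 n I}{2}\sin\beta\,d\beta$$
The factor $\dfrac{\mu_0 n I}{2}$ can be taken outside the integral, i.e.
$$B = \frac{\mu_0 n I}{2}\int_0^{\pi} \sin\beta\,d\beta = \frac{\mu_0 n I}{2}\Big[-\cos\beta\Big]_0^{\pi} = \frac{\mu_0 n I}{2}(-\cos\pi + \cos 0) = \frac{\mu_0 n I}{2}(-(-1) + 1) = \mu_0 n I$$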
Interpretation
The magnitude of the magnetic flux density at the midpoint of the axis of a long solenoid is predicted
to be approximately $\mu_0 n I$, i.e. proportional to the number of turns per unit length and proportional to the current
flowing in the solenoid.
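The effect of the long-solenoid approximation can be explored by evaluating the integral for finite angles. The sketch below assumes that $n$ is measured in turns per metre and uses hypothetical values for $n$, the current and the angles; only the formula itself comes from the text.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # permeability of free space, H/m

def flux_density(n, current, beta1, beta2):
    """B = (mu0 n I / 2) * [-cos(beta)] evaluated between beta1 and beta2 (radians)."""
    return 0.5 * MU_0 * n * current * (math.cos(beta1) - math.cos(beta2))

# Hypothetical values: 1000 turns per metre carrying 2 A
n, current = 1000, 2.0
long_limit = MU_0 * n * current
ideal = flux_density(n, current, 0.0, math.pi)
finite = flux_density(n, current, math.radians(10), math.radians(170))
print(ideal, long_limit, finite)
```

For angles that fall short of 0 and π the computed value is slightly below $\mu_0 n I$, as expected for a solenoid of finite length.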
2. Orthogonality relations
In general two functions f (x), g(x) are said to be orthogonal to each other over an interval a ≤ x ≤ b
if
$$\int_a^b f(x)g(x)\,dx = 0$$
It follows from the previous Task that sin x and cos x are orthogonal to each other over the interval
$0 \le x \le 2\pi$. This is also true over any interval $\alpha \le x \le \alpha + 2\pi$ (e.g. $\pi/2 \le x \le 5\pi/2$, or
$-\pi \le x \le \pi$).
More generally there is a whole set of orthogonality relations involving these trigonometric functions
on intervals of length 2π (i.e. over one period of both sin x and cos x). These relations are useful
in connection with a widely used technique in engineering, known as Fourier analysis, where we
represent periodic functions in terms of an infinite series of sines and cosines called a Fourier series.
(This subject is covered in Workbook 23.)
We shall demonstrate the orthogonality property
$$I_{mn} = \int_0^{2\pi} \sin mx\,\sin nx\,dx = 0$$
where $m$ and $n$ are integers such that $m \neq n$.

The secret is to use a trigonometric identity to convert the integrand into a form that can be readily
integrated.
You may recall the identity
$$\sin A\sin B \equiv \frac{1}{2}\big(\cos(A - B) - \cos(A + B)\big)$$
It follows, putting $A = mx$ and $B = nx$, that provided $m \neq n$
$$I_{mn} = \frac{1}{2}\int_0^{2\pi} [\cos(m - n)x - \cos(m + n)x]\,dx = \frac{1}{2}\left[\frac{\sin(m - n)x}{(m - n)} - \frac{\sin(m + n)x}{(m + n)}\right]_0^{2\pi} = 0$$
because $(m - n)$ and $(m + n)$ will be integers and $\sin(\text{integer} \times 2\pi) = 0$. Of course $\sin 0 = 0$.

Why does the case m = n have to be excluded from the analysis? (left to the reader to figure out!)
The corresponding orthogonality relation for cosines
$$J_{mn} = \int_0^{2\pi} \cos mx\,\cos nx\,dx = 0$$
follows by use of a similar identity to that just used. Here again $m$ and $n$ are integers such that $m \neq n$.
Example 23
Use the identity $\sin A\cos B \equiv \frac{1}{2}\big(\sin(A + B) + \sin(A - B)\big)$ to show that
$$K_{mn} = \int_0^{2\pi} \sin mx\,\cos nx\,dx = 0 \qquad m \text{ and } n \text{ integers}, \ m \neq n.$$
Solution
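$$\begin{aligned}
K_{mn} &= \frac{1}{2}\int_0^{2\pi} [\sin(m + n)x + \sin(m - n)x]\,dx \\
&= \frac{1}{2}\left[-\frac{\cos(m + n)x}{(m + n)} - \frac{\cos(m - n)x}{(m - n)}\right]_0^{2\pi} \\
&= -\frac{1}{2}\left[\frac{\cos(m + n)2\pi - 1}{(m + n)} + \frac{\cos(m - n)2\pi - 1}{(m - n)}\right] = 0
\end{aligned}$$
(recalling that $\cos(\text{integer} \times 2\pi) = 1$)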
Task
Derive the orthogonality relation
$$K_{mn} = \int_0^{2\pi} \sin mx\,\cos nx\,dx = 0 \qquad m \text{ and } n \text{ integers}, \ m = n$$
Hint: You will need to use a different trigonometric identity to that used in Example 23.
Your solution
Answer
Putting $m = n \neq 0$, and then using the identity $\sin 2A \equiv 2\sin A\cos A$ we get
$$\begin{aligned}
K_{mm} &= \int_0^{2\pi} \sin mx\,\cos mx\,dx = \frac{1}{2}\int_0^{2\pi} \sin 2mx\,dx \\
&= \frac{1}{2}\left[-\frac{\cos 2mx}{2m}\right]_0^{2\pi} = -\frac{1}{4m}(\cos 4m\pi - \cos 0) = -\frac{1}{4m}(1 - 1) = 0
\end{aligned}$$
Putting $m = n = 0$ gives $K_{00} = \int_0^{2\pi} \sin 0\,\cos 0\,dx = 0$.
Note that the particular case $m = n = 1$ was considered earlier in this Section.
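The orthogonality relations derived above can be illustrated numerically. The following sketch approximates $I_{mn}$, $J_{mn}$ and $K_{mn}$ with the midpoint rule (a method chosen here for convenience, not one used in this Section); the function names and the sample values of $m$ and $n$ are assumptions made for the illustration.

```python
import math

def midpoint_rule(f, a, b, n=20_000):
    """Approximate the integral of f from a to b using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def I_mn(m, n):
    # Integral of sin(mx) sin(nx) over one period
    return midpoint_rule(lambda x: math.sin(m * x) * math.sin(n * x), 0.0, 2 * math.pi)

def J_mn(m, n):
    # Integral of cos(mx) cos(nx) over one period
    return midpoint_rule(lambda x: math.cos(m * x) * math.cos(n * x), 0.0, 2 * math.pi)

def K_mn(m, n):
    # Integral of sin(mx) cos(nx) over one period
    return midpoint_rule(lambda x: math.sin(m * x) * math.cos(n * x), 0.0, 2 * math.pi)

for m, n in [(1, 2), (2, 5)]:
    print(m, n, I_mn(m, n), J_mn(m, n), K_mn(m, n))
print(3, 3, K_mn(3, 3))  # K is also zero when m = n
```

Each printed integral should be zero to within rounding error.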
3. Reduction formulae
You have seen earlier in this Workbook how to integrate $\sin x$ and $\sin^2 x$ (which is $(\sin x)^2$). Applications sometimes arise which involve integrating higher powers of $\sin x$ or $\cos x$. It is possible, as
we now show, to obtain a reduction formula to aid in this task.
Task
Given $I_n = \int \sin^n(x)\,dx$ write down the integrals represented by $I_2$, $I_3$, $I_{10}$
Your solution
I2 = I3 = I10 =
Answer
$I_2 = \int \sin^2 x\,dx \qquad I_3 = \int \sin^3 x\,dx \qquad I_{10} = \int \sin^{10} x\,dx$
To obtain a reduction formula for $I_n$ we write
$$\sin^n x = \sin^{n-1}(x)\,\sin x$$
and use integration by parts.
Task
In the notation used earlier in this Workbook for integration by parts (Key Point 5, page 31) put $f = \sin^{n-1} x$ and $g = \sin x$ and evaluate $\dfrac{df}{dx}$ and $\int g\,dx$.
Your solution
Answer
$\dfrac{df}{dx} = (n - 1)\sin^{n-2} x\,\cos x$ (using the chain rule of differentiation),
$\int g\,dx = \int \sin x\,dx = -\cos x$
Now use the integration by parts formula on $\int \sin^{n-1} x\,\sin x\,dx$. [Do not attempt to evaluate the
second integral that you obtain.]
Your solution
Answer
$$\begin{aligned}
\int \sin^{n-1} x\,\sin x\,dx &= \sin^{n-1}(x)\int g\,dx - \int \frac{df}{dx}\left(\int g\,dx\right)dx \\
&= \sin^{n-1}(x)(-\cos x) + (n - 1)\int \sin^{n-2} x\,\cos^2 x\,dx
\end{aligned}$$
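We now need to evaluate $\int \sin^{n-2} x\,\cos^2 x\,dx$. Putting $\cos^2 x = 1 - \sin^2 x$ this integral becomes:
$$\int \sin^{n-2}(x)\,dx - \int \sin^n(x)\,dx$$
But this is expressible as $I_{n-2} - I_n$ so finally, using this and the result from the last Task we have
$$I_n = \int \sin^{n-1}(x)\,\sin x\,dx = \sin^{n-1}(x)(-\cos x) + (n - 1)(I_{n-2} - I_n)$$
Collecting the terms in $I_n$ on the left-hand side gives $n I_n = -\sin^{n-1}(x)\cos x + (n - 1)I_{n-2}$,
from which we get Key Point 9: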
Key Point 9
Reduction Formula
Given $I_n = \int \sin^n x\,dx$
$$I_n = -\frac{1}{n}\sin^{n-1}(x)\cos x + \frac{n - 1}{n}I_{n-2}$$
This is our reduction formula for $I_n$. It enables us, for example, to evaluate $I_6$ in terms of $I_4$, then
$I_4$ in terms of $I_2$ and $I_2$ in terms of $I_0$ where
$$I_0 = \int \sin^0 x\,dx = \int 1\,dx = x.$$
Task
Use the reduction formula in Key Point 9 with n = 2 to find I2 .
Your solution
Answer
$$I_2 = -\frac{1}{2}[\sin x\cos x] + \frac{1}{2}I_0 = -\frac{1}{2}\left[\frac{1}{2}\sin 2x\right] + \frac{x}{2} + c$$
i.e. $\int \sin^2 x\,dx = -\frac{1}{4}\sin 2x + \frac{x}{2} + c$
as obtained earlier by a different technique.
Task
Use the reduction formula in Key Point 9 to obtain $I_6 = \int \sin^6 x\,dx$.
Firstly obtain I6 in terms of I4 , then I4 in terms of I2 :
Your solution
Answer
Using Key Point 9 with $n = 6$ gives $I_6 = -\frac{1}{6}\sin^5 x\cos x + \frac{5}{6}I_4$.
Then, using Key Point 9 again with $n = 4$, gives $I_4 = -\frac{1}{4}\sin^3 x\cos x + \frac{3}{4}I_2$
Now substitute for I2 from the previous Task to obtain I4 and hence I6 .
Your solution
Answer
$$I_4 = -\frac{1}{4}\sin^3 x\cos x - \frac{3}{16}\sin 2x + \frac{3}{8}x + \text{constant}$$
$$\therefore\ I_6 = -\frac{1}{6}\sin^5 x\cos x - \frac{5}{24}\sin^3 x\cos x - \frac{5}{32}\sin 2x + \frac{5}{16}x + \text{constant}$$
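Because the reduction formula in Key Point 9 expresses $I_n$ in terms of $I_{n-2}$, it translates naturally into a recursive function. The sketch below is one possible implementation, with the constant of integration taken as zero and with $I_0 = x$ and $I_1 = -\cos x$ as the starting values; the function names are illustrative.

```python
import math

def sin_power_antiderivative(n, x):
    """Evaluate I_n = integral of sin^n(x) dx at x using Key Point 9 (constant taken as 0)."""
    if n == 0:
        return x
    if n == 1:
        return -math.cos(x)
    # I_n = -(1/n) sin^(n-1)(x) cos(x) + ((n-1)/n) I_(n-2)
    return (-(math.sin(x) ** (n - 1)) * math.cos(x)
            + (n - 1) * sin_power_antiderivative(n - 2, x)) / n

def closed_form_I6(x):
    # The expression for I_6 obtained in the Task above
    s, c = math.sin(x), math.cos(x)
    return -s**5 * c / 6 - 5 * s**3 * c / 24 - 5 * math.sin(2 * x) / 32 + 5 * x / 16

x = 1.2  # arbitrary sample point
print(sin_power_antiderivative(6, x), closed_form_I6(x))
```

The recursive value and the closed form obtained in the Task should agree at any sample point.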
Definite integrals can also be readily evaluated using the reduction formula in Key Point 9. For
example,
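$$I_n = \int_0^{\pi/2} \sin^n x\,dx \quad\text{so}\quad I_{n-2} = \int_0^{\pi/2} \sin^{n-2} x\,dx$$
We obtain, immediately
$$I_n = -\frac{1}{n}\Big[\sin^{n-1}(x)\cos x\Big]_0^{\pi/2} + \frac{n - 1}{n}I_{n-2}$$
or, since $\cos\dfrac{\pi}{2} = \sin 0 = 0$, $\quad I_n = \dfrac{(n - 1)}{n}I_{n-2}$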
This simple easy-to-use formula is well known and is called Wallis’ formula.
Key Point 10
Reduction Formula - Wallis’ Formula
Given $I_n = \int_0^{\pi/2} \sin^n x\,dx$ or $I_n = \int_0^{\pi/2} \cos^n x\,dx$
$$I_n = \frac{(n - 1)}{n}I_{n-2}$$
Task
If $I_n = \int_0^{\pi/2} \sin^n x\,dx$ calculate $I_1$ and then use Wallis’ formula, without further
integration, to obtain $I_3$ and $I_5$.
Your solution
Answer
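$$I_1 = \int_0^{\pi/2} \sin x\,dx = \Big[-\cos x\Big]_0^{\pi/2} = 1$$
Then using Wallis’ formula with $n = 3$ and $n = 5$ respectively
$$I_3 = \int_0^{\pi/2} \sin^3 x\,dx = \frac{2}{3}I_1 = \frac{2}{3} \times 1 = \frac{2}{3}$$
$$I_5 = \int_0^{\pi/2} \sin^5 x\,dx = \frac{4}{5}I_3 = \frac{4}{5} \times \frac{2}{3} = \frac{8}{15}$$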
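Wallis’ formula can be applied repeatedly until $I_1 = 1$ or $I_0 = \pi/2$ is reached. The sketch below carries out this repetition with exact fractions; the function names are illustrative, and the value $I_0 = \pi/2$ is the integral of 1 from 0 to $\pi/2$.

```python
import math
from fractions import Fraction

def wallis_factor(n):
    """Return (factor, base) such that I_n = factor * I_base, where base is 0 or 1."""
    factor = Fraction(1)
    while n >= 2:
        factor *= Fraction(n - 1, n)  # one application of I_n = ((n-1)/n) I_(n-2)
        n -= 2
    return factor, n

def wallis_value(n):
    """Numerical value of the integral of sin^n x (or cos^n x) from 0 to pi/2."""
    factor, base = wallis_factor(n)
    base_value = math.pi / 2 if base == 0 else 1.0
    return float(factor) * base_value

for n in (3, 5, 6):
    print(n, wallis_factor(n), wallis_value(n))
```

For odd $n$ the result is a pure fraction, while for even $n$ the fraction multiplies $\pi/2$.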
Task
The total power $P$ of an antenna is given by
$$P = \int_0^{\pi} \frac{\eta L^2 I^2 \pi}{4\lambda^2}\sin^3\theta\,d\theta$$
where $\eta$, $\lambda$, $I$ are constants as is the length $L$ of antenna. Using the reduction
formula for $\int \sin^n x\,dx$ in Key Point 9, obtain $P$.
Your solution
Answer
Ignoring the constants for the moment, consider
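$$I_3 = \int_0^{\pi} \sin^3\theta\,d\theta \quad\text{which we will reduce to } I_1 \text{ and evaluate.}$$
$$I_1 = \int_0^{\pi} \sin\theta\,d\theta = \Big[-\cos\theta\Big]_0^{\pi} = 2$$
so by the reduction formula with $n = 3$
$$I_3 = -\frac{1}{3}\Big[\sin^2\theta\cos\theta\Big]_0^{\pi} + \frac{2}{3}I_1 = 0 + \frac{2}{3} \times 2 = \frac{4}{3}$$
We now consider the actual integral with all the constants.
Hence $P = \dfrac{\eta L^2 I^2 \pi}{4\lambda^2}\displaystyle\int_0^{\pi} \sin^3\theta\,d\theta = \dfrac{\eta L^2 I^2 \pi}{4\lambda^2} \times \dfrac{4}{3}$, so $P = \eta\dfrac{L^2 I^2 \pi}{3\lambda^2}$.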
A similar reduction formula to that in Key Point 9 can be obtained for $\int \cos^n x\,dx$ (see Exercise 5
at the end of this Workbook). In particular if
$$J_n = \int_0^{\pi/2} \cos^n x\,dx \quad\text{then}\quad J_n = \frac{(n - 1)}{n}J_{n-2}$$
i.e. Wallis’ formula is the same for $\cos^n x$ as for $\sin^n x$.
4. Harder trigonometric integrals
The following seemingly innocent integrals are examples, important in engineering, of trigonometric
integrals that cannot be evaluated as indefinite integrals in terms of elementary functions:
(a) $\int \sin(x^2)\,dx$ and $\int \cos(x^2)\,dx$. These are called Fresnel integrals.
(b) $\int \dfrac{\sin x}{x}\,dx$. This is called the Sine integral.
Definite integrals of this type, which are what normally arise in applications, have to be evaluated
by approximate numerical methods.
Fresnel integrals with limits arise in wave and antenna theory and the Sine integral with limits in
filter theory.
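It is useful sometimes to be able to visualize the definite integral. For example consider
$$F(t) = \int_0^t \frac{\sin x}{x}\,dx \qquad t > 0$$
Clearly, $F(0) = \displaystyle\int_0^0 \frac{\sin x}{x}\,dx = 0$. Recall the graph of $\dfrac{\sin x}{x}$ against $x$, $x > 0$:
Figure 14 (graph of $\dfrac{\sin x}{x}$ against $x$, with $t$, $\pi$ and $2\pi$ marked on the $x$-axis)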
For any positive value of $t$, $F(t)$ is the shaded area shown (the area interpretation of a definite integral
was covered earlier in this Workbook). As $t$ increases from 0 to $\pi$, it follows that $F(t)$ increases from
0 to a maximum value
$$F(\pi) = \int_0^{\pi} \frac{\sin x}{x}\,dx$$
whose value could be determined numerically (it is actually about 1.85). As $t$ further increases from
$\pi$ to $2\pi$ the value of $F(t)$ will decrease to a local minimum at $2\pi$ because the curve $\dfrac{\sin x}{x}$ is below
the $x$-axis between $\pi$ and $2\pi$. Note that the area between the curve and the $x$-axis is counted as negative in
this application wherever the curve lies below the axis.
Continuing to argue in this way we can obtain the shape of the $F(t)$ graph in Figure 15 (can you
see why the oscillations decrease in amplitude?):
Figure 15 (graph of $F(t)$ against $t$, with the maximum value 1.85 at $t = \pi$ and the level $\dfrac{\pi}{2}$ marked)
The result $\displaystyle\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}$ is clearly illustrated in the graph (you are not expected to know
how this result is obtained). Methods for solving such problems are dealt with in Workbook 31.
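The behaviour of $F(t)$ described above can be reproduced with a simple numerical approximation. The sketch below uses the composite Simpson's rule, which is a choice made here for illustration and not a method developed in this Section; the integrand is given the value 1 at $x = 0$, its limiting value there, and the number of strips is an arbitrary assumption.

```python
import math

def sin_x_over_x(x):
    # sin(x)/x, using the limiting value 1 at x = 0
    return 1.0 if x == 0.0 else math.sin(x) / x

def sine_integral(t, n=2000):
    """Approximate F(t), the integral of sin(x)/x from 0 to t, by Simpson's rule (n even)."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = sin_x_over_x(0.0) + sin_x_over_x(t)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * sin_x_over_x(k * h)
    return total * h / 3

for t in (math.pi, 2 * math.pi, 3 * math.pi, 200.0):
    print(t, sine_integral(t))
print(math.pi / 2)
```

The printed values rise to about 1.85 at $t = \pi$, fall at $t = 2\pi$, and settle towards $\pi/2$ for large $t$.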
Exercises
You will need to refer to a Table of Trigonometric Identities to answer these questions.
1. Find (a) $\int \cos^2 x\,dx$ (b) $\int_0^{\pi/2} \cos^2 t\,dt$ (c) $\int (\cos^2\theta + \sin^2\theta)\,d\theta$
2. Use the identity $\sin(A + B) + \sin(A - B) \equiv 2\sin A\cos B$ to find $\int \sin 3x\cos 2x\,dx$
3. Find $\int (1 + \tan^2 x)\,dx$.
4. The mean square value of a function f (t) over the interval t = a to t = b is defined to be
$$\frac{1}{b - a}\int_a^b (f(t))^2\,dt$$
b−a a
Find the mean square value of f (t) = sin t over the interval t = 0 to t = 2π.
5. (a) Show that the reduction formula for $J_n = \int \cos^n x\,dx$ is
$$J_n = \frac{1}{n}\cos^{n-1}(x)\sin x + \frac{(n - 1)}{n}J_{n-2}$$
(b) Using the reduction formula in (a) show that
$$\int \cos^5 x\,dx = \frac{1}{5}\cos^4 x\sin x + \frac{4}{15}\cos^2 x\sin x + \frac{8}{15}\sin x$$
(c) Show that if $J_n = \int_0^{\pi/2} \cos^n x\,dx$, then $J_n = \left(\dfrac{n - 1}{n}\right)J_{n-2}$ (Wallis’ formula).
(d) Using Wallis’ formula show that $\int_0^{\pi/2} \cos^6 x\,dx = \dfrac{5}{32}\pi$.
Answers
1. (a) $\frac{1}{2}x + \frac{1}{4}\sin 2x + c$ (b) $\pi/4$ (c) $\theta + c$.
2. $-\frac{1}{10}\cos 5x - \frac{1}{2}\cos x + c$.
3. $\tan x + c$.
4. $\frac{1}{2}$.
Contents 14
Applications of
Integration 1
14.1 Integration as the Limit of a Sum 2
14.2 The Mean Value and the Root-Mean-Square Value 10
14.3 Volumes of Revolution 20
14.4 Lengths of Curves and Surfaces of Revolution 27
Learning outcomes
In this Workbook you will learn to interpret an integral as the limit of a sum. You will learn
how to apply this approach to the meaning of an integral to calculate important attributes
of a curve: the area under the curve, the length of a curve segment, the volume and
surface area obtained when a segment of a curve is rotated about an axis. Other quantities
of interest which can also be calculated using integration are the position of the centre of
mass of a plane lamina and the moment of inertia of a lamina about an axis. You will also
learn how to determine the mean value of an integral.
Integration as the
Limit of a Sum 14.1
Introduction
In Workbook 13, integration was introduced as the reverse of differentiation. A more rigorous treatment
would show that integration is a process of adding or ‘summation’. By viewing integration from this
perspective it is possible to apply the techniques of integration to finding areas, volumes, centres of
gravity and many other important quantities.
The content of this Section is important because it is here that integration is defined more carefully. A
thorough understanding of the process involved is essential if you need to apply integration techniques
to practical problems.

Prerequisites
Before starting this Section you should . . .
• be able to calculate definite integrals

Learning Outcomes
On completion you should be able to . . .
• explain integration as the limit of a sum
• evaluate the limit of a sum in simple cases

1. The limit of a sum

Figure 1: The area under a curve

Consider the graph of the positive function y(x) shown in Figure 1. Suppose we are interested in
finding the area under the graph between x = a and x = b. One way in which this area can be
approximated is to divide it into a number of rectangles of equal width, find the area of each
rectangle, and then add up all these individual rectangular areas. This is illustrated in Figure 2a,
which shows the area divided into n rectangles (with some small discrepancies at the tops), and
Figure 2b which shows the dimensions of a typical rectangle which is located at x = xk .
Figure 2: (a) The area approximated by $n$ rectangles (b) A typical rectangle, of width $\delta x$ and height $y(x_k)$
We wish to find an expression for the area under a curve based on the sum of many rectangles.
Firstly, we note that the distance from $x = a$ to $x = b$ is $b - a$. In Figure 2a the area has been
divided into $n$ rectangles. If $n$ rectangles span the distance from $a$ to $b$ the width of each rectangle
is $\dfrac{b - a}{n}$.
It is conventional to label the width of each rectangle as $\delta x$, i.e. $\delta x = \dfrac{b - a}{n}$. We label the $x$
coordinates at the left-hand side of the rectangles as $x_1$, $x_2$ up to $x_n$ (here $x_1 = a$ and $x_{n+1} = b$). A
typical rectangle, the $k$th rectangle, is shown in Figure 2b. Note that its height is $y(x_k)$, so its area
is $y(x_k) \times \delta x$.
The sum of the areas of all $n$ rectangles is then
$$y(x_1)\delta x + y(x_2)\delta x + y(x_3)\delta x + \cdots + y(x_n)\delta x$$
which we write concisely using sigma notation as
$$\sum_{k=1}^{n} y(x_k)\,\delta x$$
This quantity gives us an estimate of the area under the curve but it is not exact. To improve the
estimate we must take a large number of very thin rectangles. So, what we want to find is the value
of this sum when $n$ tends to infinity and $\delta x$ tends to zero. We write this value as
$$\lim_{n \to \infty} \sum_{k=1}^{n} y(x_k)\,\delta x$$
The lower and upper limits on the sum correspond to the first rectangle and last rectangle where
$x = a$ and $x = b$ respectively and so we can write this limit in the equivalent form
$$\lim_{\delta x \to 0} \sum_{x=a}^{x=b} y(x)\,\delta x \qquad (1)$$
Here, as the number of rectangles increases without bound we drop the subscript $k$ from $x_k$ and write
$y(x)$ which is the value of $y$ at a ‘typical’ value of $x$. If this sum can actually be found, it is called
the definite integral of $y(x)$, from $x = a$ to $x = b$ and it is written $\int_a^b y(x)\,dx$. You are already
familiar with the technique for evaluating definite integrals which was studied in Section 13.2.
Therefore we have the following definition:
Key Point 1
The definite integral $\displaystyle\int_a^b y(x)\,dx$ is defined as $\displaystyle\lim_{\delta x \to 0} \sum_{x=a}^{x=b} y(x)\,\delta x$
Note that the quantity δx represents the thickness of a small but finite rectangle. When we have
taken the limit as δx tends to zero to obtain the integral, we write dx, which reminds us of the
variable of integration.
This process of dividing an area into very small regions, performing a calculation on each region, and
then adding the results by means of an integral is very important. This will become apparent when
finding volumes, centres of gravity, moments of inertia etc in the following Sections where similar
procedures are followed.
Example 1
The area under the graph of $y = x^2$ between $x = 0$ and $x = 1$ is to be found by
approximating it by a large number of thin rectangles and finding the limit of the
sum of their areas. From Equation (1) this is $\displaystyle\lim_{\delta x \to 0} \sum_{x=0}^{x=1} y(x)\,\delta x$. Write down the
integral which this sum defines and evaluate it to obtain the area under the curve.
Solution
The limit of the sum defines the integral $\int_0^1 y(x)\,dx$. Here $y = x^2$ and so
$$\int_0^1 x^2\,dx = \left[\frac{x^3}{3}\right]_0^1 = \frac{1}{3}$$
To show that the process of taking the limit of a sum actually works we investigate the problem in
detail. We use the idea of the limit of a sum to find the area under the graph of $y = x^2$ between
$x = 0$ and $x = 1$, as illustrated in Figure 3.
Figure 3: The area under $y = x^2$ is approximated by a number of thin rectangles
Task
Refer to the diagram in Figure 3 to help you answer the questions below.
If the interval between x = 0 and x = 1 is divided into n rectangles what is the width of each
rectangle?
Your solution
Answer
1/n
Mark this on the diagram. What is the x coordinate at the left-hand side of the first rectangle ?
Your solution
Answer
0
What is the x coordinate at the left-hand side of the second rectangle ?
Your solution
Answer
1/n
What is the x coordinate at the left-hand side of the third rectangle ?

Your solution
Answer
2/n
Mark these coordinates on the diagram.

What is the x coordinate at the left-hand side of the kth rectangle ?
Your solution
Answer
(k − 1)/n
Given that $y = x^2$, what is the $y$ coordinate at the left-hand side of the $k$th rectangle ?
Your solution
Answer
$\left(\dfrac{k - 1}{n}\right)^2$
The area of the kth rectangle is its height × its width. Write down the area of the kth rectangle:
Your solution
Answer
$$\left(\frac{k - 1}{n}\right)^2 \times \frac{1}{n} = \frac{(k - 1)^2}{n^3}$$
To find the total area $A_n$ of the $n$ rectangles we must add up all these individual rectangular areas:
$$A_n = \sum_{k=1}^{n} \frac{(k - 1)^2}{n^3}$$
This sum can be simplified and then calculated as follows. You will need to make use of the formulas
for the sum of the first $n$ integers, and the sum of the squares of the first $n$ integers:
$$\sum_{k=1}^{n} 1 = n, \qquad \sum_{k=1}^{n} k = \frac{1}{2}n(n + 1), \qquad \sum_{k=1}^{n} k^2 = \frac{1}{6}n(n + 1)(2n + 1)$$
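Then, the total area of the rectangles is given by
$$\begin{aligned}
A_n &= \sum_{k=1}^{n} \frac{(k - 1)^2}{n^3} \\
&= \frac{1}{n^3}\sum_{k=1}^{n} (k - 1)^2 \\
&= \frac{1}{n^3}\sum_{k=1}^{n} (k^2 - 2k + 1) \\
&= \frac{1}{n^3}\left(\sum_{k=1}^{n} k^2 - 2\sum_{k=1}^{n} k + \sum_{k=1}^{n} 1\right) \\
&= \frac{1}{n^3}\left(\frac{n}{6}(n + 1)(2n + 1) - 2\,\frac{n}{2}(n + 1) + n\right) \\
&= \frac{1}{n^2}\left(\frac{(n + 1)(2n + 1)}{6} - (n + 1) + 1\right) \\
&= \frac{1}{n^2}\left(\frac{(n + 1)(2n + 1)}{6} - n\right) \\
&= \frac{1}{6n^2}\left(2n^2 - 3n + 1\right) = \frac{1}{3} - \frac{1}{2n} + \frac{1}{6n^2}
\end{aligned}$$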
Note that this is a formula for the exact total area of the $n$ rectangles. It is an estimate of the area
under the graph of $y = x^2$. However, as $n$ gets larger, the terms $\dfrac{1}{2n}$ and $\dfrac{1}{6n^2}$ become small and will
eventually tend to zero. If we let $n$ tend to infinity we obtain the exact answer of $\dfrac{1}{3}$.
The required area is $\dfrac{1}{3}$. It has been found as the limit of a sum and of course agrees with that
calculated by integration.
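The way in which the rectangle sum approaches $\frac{1}{3}$ can be seen by computing it for increasing values of $n$. The following sketch adds up the rectangle areas directly, using exact fractions, and compares the total with the formula derived above; the function names are illustrative.

```python
from fractions import Fraction

def rectangle_sum(n):
    """Total area of n left-hand rectangles under y = x^2 between x = 0 and x = 1."""
    width = Fraction(1, n)
    return sum(Fraction(k - 1, n) ** 2 * width for k in range(1, n + 1))

def closed_form(n):
    """The formula A_n = 1/3 - 1/(2n) + 1/(6n^2) derived in the text."""
    return Fraction(1, 3) - Fraction(1, 2 * n) + Fraction(1, 6 * n * n)

for n in (1, 10, 100, 1000):
    print(n, float(rectangle_sum(n)), float(closed_form(n)))
```

The two columns agree exactly for every $n$, and both approach $\frac{1}{3}$ from below as $n$ increases.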
In the calculations which follow in subsequent Sections the need to evaluate complicated limits like
this is avoided by performing the integration using the techniques of Workbook 13. Nevertheless it will
sometimes be necessary to go through the process of dividing a region into small sections, performing
a calculation on each section and then adding the results, in order to formulate the integral required.
When numerical methods of integration are studied (Workbook 31) this summation method will prove
fundamental.
Pulley belt tension
Problem
Consider that a belt is partially wound around a pulley so that there is a difference in the tension
either side of the pulley (see Figure 4). The pulley will be stationary as long as the friction between
belt and pulley is sufficient. The frictional force on the pulley will depend on the extent of the contact
between belt and pulley i.e. on the angle θ shown in Figure 4. Given that the tensions on either side
of the belt are T2 and T1 and that the coefficient of friction between belt and pulley is µ, find an
expression for T2 in terms of T1 , µ and θ.
Solution
Consider a small element of the belt, at angle θ where the tension is T . Changing the angle by a
small amount ∆θ changes the tension from T to T + ∆T .
Figure 4 (a belt wound around a pulley of radius $R$, showing $\theta$, $\Delta\theta$, $T_1$ and $T_2$)
Take moments about the centre of the pulley, denoting the radius of the pulley by $R$ and assuming
that the frictional force is $\mu T$ per unit length. For the pulley to remain stationary,
$$R\,\Delta\theta\,\mu T = R(T + \Delta T) - RT \quad\text{or}\quad \Delta\theta = \frac{\Delta T}{\mu T}.$$
Using integration as the limit of a sum,

$$\theta = \int_{T_1}^{T_2} \frac{dT}{\mu T} = \frac{1}{\mu}\Big[\ln T\Big]_{T_1}^{T_2} = \frac{1}{\mu}\ln\frac{T_2}{T_1}. \quad\text{So } T_2 = T_1 e^{\mu\theta}.$$
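The limit-of-a-sum idea behind this result can be illustrated by stepping round the pulley in many small angles $\Delta\theta$ and increasing the tension by $\Delta T = \mu T\,\Delta\theta$ at each step. The sketch below does this and compares the outcome with $T_1 e^{\mu\theta}$; the numerical values of $T_1$, $\mu$ and $\theta$ are hypothetical.

```python
import math

def tension_by_steps(t1, mu, theta, steps=100_000):
    """Accumulate the tension over many small angle increments using dT = mu * T * d_theta."""
    d_theta = theta / steps
    tension = t1
    for _ in range(steps):
        tension += mu * tension * d_theta
    return tension

# Hypothetical values: 100 N on the slack side, mu = 0.3, half a turn of contact
t1, mu, theta = 100.0, 0.3, math.pi
stepwise = tension_by_steps(t1, mu, theta)
exact = t1 * math.exp(mu * theta)
print(stepwise, exact)
```

As the number of steps is increased the stepwise value approaches the exponential formula.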
Exercises
1. Find the area under $y = x + 1$ from $x = 0$ to $x = 10$ using the limit of a sum.
2. Find the area under $y = 3x^2$ from $x = 0$ to $x = 2$ using the limit of a sum.
3. Write down, but do not evaluate, the integral defined by the limit as $\delta x \to 0$, or $\delta t \to 0$ of the
following sums:
(a) $\displaystyle\sum_{x=0}^{x=1} x^3\,\delta x$, (b) $\displaystyle\sum_{x=0}^{x=4} 4\pi x^2\,\delta x$, (c) $\displaystyle\sum_{t=0}^{t=1} t^3\,\delta t$, (d) $\displaystyle\sum_{x=0}^{x=1} 6mx^2\,\delta x$.
Answers
1. 60,
2. 8,
3. (a) $\int_0^1 x^3\,dx$, (b) $4\pi\int_0^4 x^2\,dx$, (c) $\int_0^1 t^3\,dt$, (d) $\int_0^1 6mx^2\,dx$.
The Mean Value and
the Root-Mean-Square
Value 14.2
Introduction
Currents and voltages often vary with time and engineers may wish to know the mean value of such
a current or voltage over some particular time interval. The mean value of a time-varying function
is defined in terms of an integral. An associated quantity is the root-mean-square (r.m.s.) value. For
example, the r.m.s. value of a current is used in the calculation of the power dissipated by a resistor.

Prerequisites
Before starting this Section you should . . .
• be able to calculate definite integrals
• be familiar with a table of trigonometric identities

Learning Outcomes
On completion you should be able to . . .
• calculate the mean value of a function
• calculate the root-mean-square value of a function

1. Average value of a function

Suppose a time-varying function $f(t)$ is defined on the interval $a \le t \le b$. The area, $A$, under the
graph of $f(t)$ is given by the integral $A = \int_a^b f(t)\,dt$. This is illustrated in Figure 5.
Figure 5: (a) the area under the curve from $t = a$ to $t = b$ (b) the area under the curve and the area of the rectangle are equal
In Figure 5(b) we have also drawn a rectangle with base spanning the interval $a \le t \le b$ and which
has the same area as that under the curve. Suppose the height of the rectangle is $m$. Then
$$\text{area of rectangle} = \text{area under curve} \ \Rightarrow\ m(b - a) = \int_a^b f(t)\,dt \ \Rightarrow\ m = \frac{1}{b - a}\int_a^b f(t)\,dt$$
The value of m is the mean value of the function across the interval a ≤ t ≤ b.
Key Point 2
The mean value of a function $f(t)$ in the interval $a \le t \le b$ is $\dfrac{1}{b - a}\displaystyle\int_a^b f(t)\,dt$
The mean value depends upon the interval chosen. If the values of a or b are changed, then the
mean value of the function across the interval from a to b will in general change as well.
Example 2
Find the mean value of $f(t) = t^2$ over the interval $1 \le t \le 3$.
Solution
Using Key Point 2 with $a = 1$ and $b = 3$ and $f(t) = t^2$
$$\text{mean value} = \frac{1}{b - a}\int_a^b f(t)\,dt = \frac{1}{3 - 1}\int_1^3 t^2\,dt = \frac{1}{2}\left[\frac{t^3}{3}\right]_1^3 = \frac{13}{3}$$
Task
Find the mean value of $f(t) = t^2$ over the interval $2 \le t \le 5$.
Use Key Point 2 with a = 2 and b = 5 to write down the required integral:
Your solution
mean value =
Answer
$$\frac{1}{5 - 2}\int_2^5 t^2\,dt$$
Now evaluate the integral:

Your solution
mean value =
Answer
$$\frac{1}{5 - 2}\int_2^5 t^2\,dt = \frac{1}{3}\left[\frac{t^3}{3}\right]_2^5 = \frac{1}{3}\left(\frac{125}{3} - \frac{8}{3}\right) = \frac{117}{9} = 13$$
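Key Point 2 can be turned into a short numerical routine: approximate the integral and divide by the width of the interval. The sketch below uses the midpoint rule as the approximation (an illustrative choice) and reproduces the two mean values found above; the function name is an assumption.

```python
def mean_value(f, a, b, n=10_000):
    """Approximate the mean value of f over [a, b]: (1/(b-a)) * integral of f from a to b."""
    h = (b - a) / n
    integral = h * sum(f(a + (k + 0.5) * h) for k in range(n))
    return integral / (b - a)

def square(t):
    return t * t

mean_1_to_3 = mean_value(square, 1.0, 3.0)
mean_2_to_5 = mean_value(square, 2.0, 5.0)
print(mean_1_to_3, mean_2_to_5)
```

The results should be close to $13/3$ and $13$ respectively, showing how the mean value depends on the interval chosen.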
Sonic boom
Introduction
Impulsive signals are described by their peak amplitudes and their duration. Another quantity of
interest is the total energy of the impulse. The effect of a blast wave from an explosion on structures,
for example, is related to its total energy. This Example looks at the calculation of the energy on a
sonic boom. Sonic booms are caused when an aircraft travels faster than the speed of sound in air. An
idealized sonic-boom pressure waveform is shown in Figure 6 where the instantaneous sound pressure
p(t) is plotted versus time t. This wave type is often called an N-wave because it resembles the
shape of the letter N. The energy in a sound wave is proportional to the square of the sound pressure.
Figure 6: An idealized sonic-boom pressure waveform
Problem in words
Calculate the energy in an ideal N-wave sonic boom in terms of its peak pressure, its duration and
the density and sound speed in air.
Represent the positive peak pressure by P0 and the duration by T . The total acoustic energy E
carried across unit area normal to the sonic-boom wave front during time T is defined by
$$E = \langle p(t)^2 \rangle\, T/\rho c \qquad (1)$$
where $\rho$ is the air density, $c$ the speed of sound and the time average of $[p(t)]^2$ is
$$\langle p(t)^2 \rangle = \frac{1}{T}\int_0^T p(t)^2\,dt \qquad (2)$$
(a) Find an appropriate expression for p(t).
(b) Hence show that $E$ can be expressed in terms of $P_0$, $T$, $\rho$ and $c$ as $E = \dfrac{T P_0^2}{3\rho c}$.
(a) The interval of integration needed to compute (2) is [0, T ]. Therefore it is necessary to find an
expression for p(t) only in this interval. Figure 6 shows that, in this interval, the dependence of the
sound pressure p on the variable t is linear, i.e. p(t) = at + b.
From Figure 6 also $p(0) = P_0$ and $p(T) = -P_0$. The constants $a$ and $b$ are determined from these
conditions.
At $t = 0$, $a \times 0 + b = P_0$ implies that $b = P_0$.
At $t = T$, $a \times T + b = -P_0$ implies that $a = -2P_0/T$.
Consequently, the sound pressure in the interval $[0, T]$ may be written $p(t) = \dfrac{-2P_0}{T}t + P_0$.
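(b) This expression for $p(t)$ may be used to compute the integral (2)
$$\begin{aligned}
\frac{1}{T}\int_0^T p(t)^2\,dt &= \frac{1}{T}\int_0^T \left(\frac{-2P_0}{T}t + P_0\right)^2 dt = \frac{1}{T}\int_0^T \left(\frac{4P_0^2}{T^2}t^2 - \frac{4P_0^2}{T}t + P_0^2\right) dt \\
&= \frac{1}{T}\left[\frac{4P_0^2}{3T^2}t^3 - \frac{2P_0^2}{T}t^2 + P_0^2 t\right]_0^T \\
&= \frac{P_0^2}{T}\left(\frac{4}{3T^2}T^3 - \frac{2}{T}T^2 + T - 0\right) = P_0^2/3.
\end{aligned}$$
Hence, from Equation (1), the total acoustic energy $E$ carried across unit area normal to the sonic-boom wave front during time $T$ is $E = \dfrac{T P_0^2}{3\rho c}$.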
Interpretation
The energy in an N-wave is given by a third of the sound intensity corresponding to the peak pressure
multiplied by the duration.
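The N-wave result can be checked by evaluating the time average in Equation (2) numerically for a particular wave. In the sketch below the peak pressure, duration, air density and sound speed are hypothetical values chosen only for illustration, and the midpoint rule is used for the integral.

```python
def n_wave_pressure(t, p0, duration):
    # Linear fall from +p0 at t = 0 to -p0 at t = duration
    return p0 * (1.0 - 2.0 * t / duration)

def mean_square_pressure(p0, duration, n=10_000):
    """Approximate the time average of p(t)^2 over [0, duration] by the midpoint rule."""
    h = duration / n
    total = sum(n_wave_pressure((k + 0.5) * h, p0, duration) ** 2 for k in range(n))
    return total * h / duration

# Hypothetical values: 100 Pa peak, 0.2 s duration, air at 1.2 kg/m^3 and 343 m/s
p0, duration, rho, c = 100.0, 0.2, 1.2, 343.0
mean_square = mean_square_pressure(p0, duration)
energy = mean_square * duration / (rho * c)
print(mean_square, p0**2 / 3)
print(energy, duration * p0**2 / (3 * rho * c))
```

Each pair of printed values should agree closely, in line with $\langle p(t)^2 \rangle = P_0^2/3$.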
Exercises
1. Calculate the mean value of the given functions across the specified interval.
(a) $f(t) = 1 + t$ across $[0, 2]$
(b) $f(x) = 2x - 1$ across $[-1, 1]$
(c) $f(t) = t^2$ across $[0, 1]$
(d) $f(t) = t^2$ across $[0, 2]$
(e) $f(z) = z^2 + z$ across $[1, 3]$
2. Calculate the mean value of the given functions over the specified interval.
(a) $f(x) = x^3$ across $[1, 3]$
(b) $f(x) = \dfrac{1}{x}$ across $[1, 2]$
(c) $f(t) = \sqrt{t}$ across $[0, 2]$
(d) $f(z) = z^3 - 1$ across $[-1, 1]$
3. Calculate the mean value of the following:
(a) $f(t) = \sin t$ across $\left[0, \frac{\pi}{2}\right]$
(b) $f(t) = \sin t$ across $[0, \pi]$
(c) $f(t) = \sin \omega t$ across $[0, \pi]$
(d) $f(t) = \cos t$ across $\left[0, \frac{\pi}{2}\right]$
(e) $f(t) = \cos t$ across $[0, \pi]$
(f) $f(t) = \cos \omega t$ across $[0, \pi]$
(g) $f(t) = \sin \omega t + \cos \omega t$ across $[0, 1]$
4. Calculate the mean value of the following functions:
(a) $f(t) = \sqrt{t + 1}$ across $[0, 3]$
(b) $f(t) = e^t$ across $[-1, 1]$
(c) $f(t) = 1 + e^t$ across $[-1, 1]$
Answers
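1. (a) 2 (b) $-1$ (c) $\dfrac{1}{3}$ (d) $\dfrac{4}{3}$ (e) $\dfrac{19}{3}$
2. (a) 10 (b) 0.6931 (c) 0.9428 (d) $-1$
3. (a) $\dfrac{2}{\pi}$ (b) $\dfrac{2}{\pi}$ (c) $\dfrac{1}{\pi\omega}[1 - \cos(\pi\omega)]$ (d) $\dfrac{2}{\pi}$ (e) 0 (f) $\dfrac{\sin(\pi\omega)}{\pi\omega}$ (g) $\dfrac{1 + \sin\omega - \cos\omega}{\omega}$
4. (a) $\dfrac{14}{9}$ (b) 1.1752 (c) 2.1752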
2. Root-mean-square value of a function

If $f(t)$ is defined on the interval $a \le t \le b$, the mean-square value is given by the expression:
$$\frac{1}{b - a}\int_a^b [f(t)]^2\,dt$$
This is simply the mean value of $[f(t)]^2$ over the given interval.
The related quantity, the root-mean-square (r.m.s.) value, is given by the following formula.
Key Point 3
Root-Mean-Square Value
$$\text{r.m.s. value} = \sqrt{\frac{1}{b - a}\int_a^b [f(t)]^2\,dt}$$
The r.m.s. value depends upon the interval chosen. If the values of a or b are changed, then the
r.m.s. value of the function across the interval from a to b will in general change as well. Note that
when finding an r.m.s. value the function must be squared before it is integrated.
Example 3
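Find the r.m.s. value of $f(t) = t^2$ across the interval from $t = 1$ to $t = 3$.
Solution
$$\text{r.m.s.} = \sqrt{\frac{1}{b - a}\int_a^b [f(t)]^2\,dt} = \sqrt{\frac{1}{3 - 1}\int_1^3 [t^2]^2\,dt} = \sqrt{\frac{1}{2}\int_1^3 t^4\,dt} = \sqrt{\frac{1}{2}\left[\frac{t^5}{5}\right]_1^3} \approx 4.92$$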
Example 4
Calculate the r.m.s value of f (t) = sin t across the interval 0 ≤ t ≤ 2π.
Solution
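Here $a = 0$ and $b = 2\pi$ so r.m.s. $= \sqrt{\dfrac{1}{2\pi}\displaystyle\int_0^{2\pi} \sin^2 t\,dt}$.
The integral of $\sin^2 t$ is performed by using trigonometrical identities to rewrite it in the alternative
form $\frac{1}{2}(1 - \cos 2t)$. This technique was described in Section 13.6.
$$\text{r.m.s. value} = \sqrt{\frac{1}{2\pi}\int_0^{2\pi} \frac{(1 - \cos 2t)}{2}\,dt} = \sqrt{\frac{1}{4\pi}\left[t - \frac{\sin 2t}{2}\right]_0^{2\pi}} = \sqrt{\frac{1}{4\pi}(2\pi)} = \sqrt{\frac{1}{2}} = 0.707$$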
Thus the r.m.s value is 0.707 to 3 d.p.
In the previous Example the amplitude of the sine wave was 1, and the r.m.s. value was 0.707. In
general, if the amplitude of a sine wave is A, its r.m.s value is 0.707A.
Key Point 4
The r.m.s value of any sinusoidal waveform taken across an interval of width equal to one
period is 0.707 × amplitude of the waveform.
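Key Point 3 can be evaluated numerically in the same way as a mean value, remembering to square the function before integrating and to take the square root at the end. The sketch below uses the midpoint rule (an illustrative choice) and checks Example 3, Example 4 and the statement in Key Point 4 for a sine wave of amplitude 5, a value picked arbitrarily.

```python
import math

def rms_value(f, a, b, n=10_000):
    """Approximate the r.m.s. value of f over [a, b]: sqrt of the mean of f(t)^2."""
    h = (b - a) / n
    integral_of_square = h * sum(f(a + (k + 0.5) * h) ** 2 for k in range(n))
    return math.sqrt(integral_of_square / (b - a))

rms_t_squared = rms_value(lambda t: t * t, 1.0, 3.0)               # Example 3
rms_sine = rms_value(math.sin, 0.0, 2 * math.pi)                   # Example 4
rms_scaled = rms_value(lambda t: 5 * math.sin(t), 0.0, 2 * math.pi)
print(rms_t_squared, rms_sine, rms_scaled)
```

The first two values should be close to 4.92 and 0.707, and the third close to $0.707 \times 5$.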
Electrodynamic meters
Introduction
A dynamometer or electrodynamic meter is an analogue instrument that can measure d.c. current
or a.c. current up to a frequency of 2 kHz. A typical dynamometer is shown in Figure 7.
It consists of a circular dynamic coil positioned in a magnetic field produced by two wound circular
stator coils connected in series with each other. The torque T on the moving coil depends upon the
mutual inductance between the coils given by:
$$T = I_1 I_2 \frac{dM}{d\theta}$$
where $I_1$ is the current in the fixed coil, $I_2$ the current in the moving coil and θ is the angle between
the coils. When the same current flows in both coils ($I_1 = I_2$) the torque is therefore proportional to the square of the current. If the current is alternating
the moving coil is unable to follow the current and the pointer position is related to the mean value
of the square of the current. The scale can be suitably graduated so that the pointer position shows
the square root of this value, i.e. the r.m.s. current.
Figure 7: An electrodynamic meter (labelled parts: scale, pointer, moving coil, spring, fixed stator coils)

Problem in words
A dynamometer is in a circuit in series with a 400 Ω resistor, a rectifying device and a 240 V r.m.s
alternating sinusoidal power supply. The rectifier resists current with a resistance of 200 Ω in one
direction and a resistance of 1 kΩ in the opposite direction. Calculate the reading indicated on the
meter.
Mathematical Statement of the problem
We know from Key Point 4 in the text that the r.m.s. value of any sinusoidal waveform taken
across an interval equal to one period is 0.707 × amplitude of the waveform, where 0.707 is an
approximation of $\frac{1}{\sqrt{2}}$. This allows us to state that the amplitude of the sinusoidal power supply will
be:
\[ V_{\text{peak}} = \frac{V_{\text{rms}}}{1/\sqrt{2}} = \sqrt{2}\,V_{\text{rms}} \]
In this case the r.m.s power supply is 240 V so we have

\[ V_{\text{peak}} = 240 \times \sqrt{2} = 339.4 \text{ V} \]
During the part of the cycle where the voltage of the power supply is positive the rectifier behaves as
a resistor with resistance of 200 Ω and this is combined with the 400 Ω resistance to give a resistance
of 600 Ω in total. Using Ohm’s law
\[ V = IR \quad\Rightarrow\quad I = \frac{V}{R} \]
As $V = V_{\text{peak}}\sin(\theta)$, where θ = ωt, ω is the angular frequency and t is time, we find that during
the positive part of the cycle
\[ I_{\text{rms}}^2 = \frac{1}{2\pi}\int_0^{\pi} \left(\frac{339.4\sin(\theta)}{600}\right)^2 d\theta \]
During the part of the cycle where the voltage of the power supply is negative the rectifier behaves
as a resistor with resistance of 1 kΩ and this is combined with the 400 Ω resistance to give 1400 Ω
in total.
So we find that during the negative part of the cycle

\[ I_{\text{rms}}^2 = \frac{1}{2\pi}\int_{\pi}^{2\pi} \left(\frac{339.4\sin(\theta)}{1400}\right)^2 d\theta \]
Therefore over an entire cycle

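\[ I_{\text{rms}}^2 = \frac{1}{2\pi}\int_0^{\pi} \left(\frac{339.4\sin(\theta)}{600}\right)^2 d\theta + \frac{1}{2\pi}\int_{\pi}^{2\pi} \left(\frac{339.4\sin(\theta)}{1400}\right)^2 d\theta \]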
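We can calculate this value to find $I_{\text{rms}}^2$ and therefore $I_{\text{rms}}$.
\[ I_{\text{rms}}^2 = \frac{1}{2\pi}\int_0^{\pi} \left(\frac{339.4\sin(\theta)}{600}\right)^2 d\theta + \frac{1}{2\pi}\int_{\pi}^{2\pi} \left(\frac{339.4\sin(\theta)}{1400}\right)^2 d\theta \]
\[ I_{\text{rms}}^2 = \frac{339.4^2}{2\pi \times 10000}\left[\int_0^{\pi} \frac{\sin^2(\theta)}{36}\,d\theta + \int_{\pi}^{2\pi} \frac{\sin^2(\theta)}{196}\,d\theta\right] \]
Substituting the trigonometric identity $\sin^2(\theta) \equiv \frac{1 - \cos(2\theta)}{2}$ we get
\[ I_{\text{rms}}^2 = \frac{339.4^2}{4\pi \times 10000}\left[\int_0^{\pi} \frac{1 - \cos(2\theta)}{36}\,d\theta + \int_{\pi}^{2\pi} \frac{1 - \cos(2\theta)}{196}\,d\theta\right] \]
\[ = \frac{339.4^2}{4\pi \times 10000}\left(\left[\frac{\theta}{36} - \frac{\sin(2\theta)}{72}\right]_0^{\pi} + \left[\frac{\theta}{196} - \frac{\sin(2\theta)}{392}\right]_{\pi}^{2\pi}\right) \]
\[ = \frac{339.4^2}{4\pi \times 10000}\left(\frac{\pi}{36} + \frac{\pi}{196}\right) = 0.0946875 \text{ A}^2 \]
$I_{\text{rms}} = 0.31$ A to 2 d.p.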
Interpretation
The reading on the meter would be 0.31 A.
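The meter reading can be cross-checked by integrating the square of the current numerically over one cycle. This is an illustrative sketch only; the variable names and the midpoint rule are assumptions. It uses the unrounded peak voltage $240\sqrt{2}$, so the mean square differs slightly from the value obtained with the rounded 339.4 V.

```python
import math

V_RMS = 240.0        # supply r.m.s. voltage (V)
R_SERIES = 400.0     # series resistor (ohms)
R_FORWARD = 200.0    # rectifier resistance in one direction (ohms)
R_REVERSE = 1000.0   # rectifier resistance in the opposite direction (ohms)

v_peak = math.sqrt(2) * V_RMS

def current(theta):
    """Instantaneous current by Ohm's law; resistance depends on the sign of V."""
    v = v_peak * math.sin(theta)
    r = R_SERIES + (R_FORWARD if v >= 0 else R_REVERSE)
    return v / r

n = 20_000
h = 2 * math.pi / n
# Mean value of the squared current over one full cycle (midpoint rule).
mean_square = sum(current((i + 0.5) * h) ** 2 for i in range(n)) * h / (2 * math.pi)
i_rms = math.sqrt(mean_square)
print(round(i_rms, 2))
```

To 2 d.p. this agrees with the reading of 0.31 A found above.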
Exercises
1. Calculate the r.m.s values of the given functions across the specified interval.
(a) $f(t) = 1 + t$ across [0, 2]
(b) $f(x) = 2x - 1$ across [−1, 1]
(c) $f(t) = t^2$ across [0, 1]
(d) $f(t) = t^2$ across [0, 2]
(e) $f(z) = z^2 + z$ across [1, 3]
2. Calculate the r.m.s values of the given functions over the specified interval.
(a) $f(x) = x^3$ across [1, 3]
(b) $f(x) = \frac{1}{x}$ across [1, 2]
(c) $f(t) = \sqrt{t}$ across [0, 2]
(d) $f(z) = z^3 - 1$ across [−1, 1]
3. Calculate the r.m.s values of the following:
(a) $f(t) = \sin t$ across $\left[0, \frac{\pi}{2}\right]$
(b) $f(t) = \sin t$ across [0, π]
(c) $f(t) = \sin \omega t$ across [0, π]
(d) $f(t) = \cos t$ across $\left[0, \frac{\pi}{2}\right]$
(e) $f(t) = \cos t$ across [0, π]
(f) $f(t) = \cos \omega t$ across [0, π]
(g) $f(t) = \sin \omega t + \cos \omega t$ across [0, 1]
4. Calculate the r.m.s values of the following functions:
(a) $f(t) = \sqrt{t + 1}$ across [0, 3]
(b) $f(t) = e^t$ across [−1, 1]
(c) $f(t) = 1 + e^t$ across [−1, 1]
Answers
1. (a) 2.0817 (b) 1.5275 (c) 0.4472 (d) 1.7889 (e) 6.9666
2. (a) 12.4957 (b) 0.7071 (c) 1 (d) 1.0690
3. (a) 0.7071 (b) 0.7071 (c) $\sqrt{\frac{1}{2} - \frac{\sin \pi\omega \cos \pi\omega}{2\pi\omega}}$
(d) 0.7071 (e) 0.7071 (f) $\sqrt{\frac{1}{2} + \frac{\sin \pi\omega \cos \pi\omega}{2\pi\omega}}$ (g) $\sqrt{1 + \frac{\sin^2 \omega}{\omega}}$
4. (a) 1.5811 (b) 1.3466 (c) 2.2724

Volumes of Revolution 14.3
Introduction
In this Section we show how the concept of integration as the limit of a sum, introduced in Section
14.1, can be used to find volumes of solids formed when curves are rotated around the x or y axis.


Prerequisites
• understand integration as the limit of a sum

Learning Outcomes • calculate volumes of revolution


1. Volumes generated by rotating curves about the x-axis

Figure 8 shows a graph of the function y = 2x for x between 0 and 3.
Figure 8: A graph of the function y = 2x, for 0 ≤ x ≤ 3
Imagine rotating the line y = 2x by one complete revolution (360° or 2π radians) around the x-axis.
The surface so formed is the surface of a cone as shown in Figure 9. Such a three-dimensional shape
is known as a solid of revolution. We now discuss how to obtain the volumes of such solids of
revolution.
Figure 9: When the line y = 2x is rotated around the axis, a solid is generated
Task
Find the volume of the cone generated by rotating y = 2x, for 0 ≤ x ≤ 3, around
the x-axis, as shown in Figure 9.
In order to find the volume of this solid we assume that it is composed of lots of
thin circular discs all aligned perpendicular to the x-axis, such as that shown in
the diagram. From the diagram below we note that a typical disc has radius y,
which in this case equals 2x, and thickness δx.
Figure: The cone is divided into a number of thin circular discs.
The volume of a circular disc is the circular area multiplied by the thickness.
Write down an expression for the volume of this typical disc:
Your solution
Answer
$\pi(2x)^2\,\delta x = 4\pi x^2\,\delta x$
To find the total volume we must sum the contributions from all discs and find the limit of this sum
as the number of discs tends to infinity and δx tends to zero. That is
\[ \lim_{\delta x \to 0} \sum_{x=0}^{x=3} 4\pi x^2\,\delta x \]
This is the definition of a definite integral. Write down the corresponding integral:
Your solution
Answer
\[ \int_0^3 4\pi x^2\,dx \]
Find the required volume by performing the integration:

Your solution
Answer
\[ \left[\frac{4\pi x^3}{3}\right]_0^3 = 36\pi \]
Task
A graph of the function $y = x^2$ for x between 0 and 4 is shown in the diagram. The
graph is rotated around the x-axis to produce the solid shown. Find its volume.
Figure: The solid of revolution is divided into a number of thin circular discs.
As in the previous Task, the solid is considered to be composed of lots of circular discs of radius y
(which in this example is equal to $x^2$), and thickness δx.
Write down the volume of each disc:
Your solution
Answer
$\pi(x^2)^2\,\delta x = \pi x^4\,\delta x$
Write down the expression which represents summing the volumes of all such discs:
Your solution
Answer
\[ \sum_{x=0}^{x=4} \pi x^4\,\delta x \]
Write down the integral which results from taking the limit of the sum as δx → 0:
Your solution
Answer
\[ \int_0^4 \pi x^4\,dx \]
Perform the integration to find the volume of the solid:

Your solution
Answer
\[ \frac{4^5\pi}{5} = 204.8\pi \]
Task
In general, suppose the graph of y(x) between x = a and x = b is rotated about
the x-axis, and the solid so formed is considered to be composed of lots of circular
discs of thickness δx.
Write down an expression for the radius of a typical disc:

Your solution
Answer
y
Write down an expression for the volume of a typical disc:
Your solution
Answer
$\pi y^2\,\delta x$
The total volume is found by summing these individual volumes and taking the limit as δx tends to
zero:
\[ \lim_{\delta x \to 0} \sum_{x=a}^{x=b} \pi y^2\,\delta x \]
Write down the definite integral which this sum defines:

Your solution
Answer
\[ \int_a^b \pi y^2\,dx \]
Key Point 5
If the graph of y(x), between x = a and x = b, is rotated about the x-axis the volume of the solid
formed is
\[ \int_a^b \pi y^2\,dx \]
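The limit-of-a-sum idea behind Key Point 5 can be imitated directly by adding up the volumes of many thin discs. This is a sketch only: the function name `volume_about_x` and the number of discs are assumptions, and each disc radius is sampled at the middle of its slice.

```python
import math

def volume_about_x(y, a, b, n=10_000):
    """Sum pi * y(x)**2 * dx over n thin discs between x = a and x = b."""
    dx = (b - a) / n
    return sum(math.pi * y(a + (i + 0.5) * dx) ** 2 for i in range(n)) * dx

cone = volume_about_x(lambda x: 2 * x, 0.0, 3.0)    # exact integral: 36*pi
solid = volume_about_x(lambda x: x ** 2, 0.0, 4.0)  # exact integral: 204.8*pi
print(cone / math.pi, solid / math.pi)
```

As the number of discs grows the two sums approach 36π and 204.8π, the values found in the two Tasks above.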
Exercises
1. Find the volume of the solid formed when that part of the curve $y = x^2$ between
x = 1 and x = 2 is rotated about the x-axis.
2. The parabola $y^2 = 4x$ for 0 ≤ x ≤ 1, is rotated around the x-axis. Find the volume of the
solid formed.
Answers 1. 31π/5, 2. 2π.
2. Volumes generated by rotating curves about the y-axis

We can obtain a different solid of revolution by rotating a curve around the y-axis instead of around
the x-axis. See Figure 10.
Figure 10: A solid generated by rotation around the y-axis

To find the volume of this solid it is divided into a number of circular discs as before, but this time
the discs are horizontal. The radius of a typical disc is x and its thickness is δy. The volume of the
disc will be $\pi x^2\,\delta y$.
The total volume is found by summing these individual volumes and taking the limit as δy → 0. If
the lower and upper limits on y are c and d, we obtain for the volume:
\[ \lim_{\delta y \to 0} \sum_{y=c}^{y=d} \pi x^2\,\delta y \quad\text{which is the definite integral}\quad \int_c^d \pi x^2\,dy \]
Key Point 6
If the graph of y(x), between y = c and y = d, is rotated about the y-axis the volume of the solid
formed is
\[ \int_c^d \pi x^2\,dy \]
Task
Find the volume generated when the graph of $y = x^2$ between x = 0 and x = 1
is rotated around the y-axis.
Using Key Point 6 write down the required integral:

Your solution
Answer
\[ \int_0^1 \pi x^2\,dy \]
This integral can be written entirely in terms of y, using the fact that $y = x^2$ to eliminate x. Do
this now, and then evaluate the integral:
Your solution
Answer
\[ \int_0^1 \pi x^2\,dy = \int_0^1 \pi y\,dy = \left[\frac{\pi y^2}{2}\right]_0^1 = \frac{\pi}{2} \]
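The same disc-summing idea works for rotation about the y-axis, with horizontal discs of radius x and thickness δy. A minimal sketch, assuming a hypothetical helper `volume_about_y` that takes x as a function of y:

```python
import math

def volume_about_y(x_of_y, c, d, n=10_000):
    """Sum pi * x(y)**2 * dy over n thin horizontal discs between y = c and y = d."""
    dy = (d - c) / n
    return sum(math.pi * x_of_y(c + (j + 0.5) * dy) ** 2 for j in range(n)) * dy

# For y = x**2 with x >= 0 we have x = sqrt(y); x from 0 to 1 means y from 0 to 1.
vol = volume_about_y(math.sqrt, 0.0, 1.0)
print(vol, math.pi / 2)
```

The two printed values agree closely, matching the answer π/2 obtained in the Task.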
Exercises
1. The curve $y = x^2$ for 1 < x < 2 is rotated about the y-axis. Find the volume of the solid
formed.
2. The line y = 2 − 2x for 0 ≤ x ≤ 2 is rotated around the y-axis. Find the volume of revolution.
Answers
1. $\frac{15\pi}{2}$  2. $\frac{16\pi}{3}$.
Lengths of Curves and
Surfaces of Revolution 14.4

Introduction
Integration can be used to find the length of a curve and the area of the surface generated when a
curve is rotated around an axis. In this Section we state and use formulae for doing this.

Prerequisites
• be able to calculate definite integrals

Learning Outcomes
On completion you should be able to . . .
• find the length of curves
• find the area of the surface generated when a curve is rotated about an axis

1. The length of a curve
To find the length of a curve in the xy plane we first divide the curve into a large number of pieces.
We measure (or, at least, approximate) the length of each piece and then by an obvious summation
process obtain an estimate for the length of the curve. Theoretically, we allow the number of pieces
to increase without bound, implying that the length of each piece will tend to zero. In this limit the
summation process becomes an integration process.
Figure 11 (the curve y(x) between x = a and x = b, with a small triangle of base δx and height δy marked)
Figure 11 shows the portion of the curve y(x) between x = a and x = b. A small piece of this curve
has been selected and can be considered as the hypotenuse of a triangle with base δx and height
δy. (Here δx and δy are intended to be ‘small’ so that the curved segment can be regarded as a
straight segment.)
Using Pythagoras’ theorem, the length of the hypotenuse is:
\[ \sqrt{\delta x^2 + \delta y^2} = \sqrt{1 + \left(\frac{\delta y}{\delta x}\right)^2}\;\delta x \]
By summing all such contributions between x = a and x = b, and letting δx → 0 we obtain an
expression for the total length of the curve:
\[ \lim_{\delta x \to 0} \sum_{x=a}^{x=b} \sqrt{1 + \left(\frac{\delta y}{\delta x}\right)^2}\;\delta x \]
But we already know how to write such an expression in terms of an integral. We obtain the following
result:
Key Point 7
Given a curve with equation y = f (x), then the length of the curve between the points where x = a
and x = b is given by the formula:
\[ \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \]
Because of the complicated form of the integrand, and in particular the square root, integrals of this
type are often difficult to calculate. In practice, approximate numerical methods rather than exact
methods are normally needed to perform the integration. We shall first illustrate the application of
the formula in Key Point 7 by a problem which could be calculated in a much simpler way, before
looking at some harder problems.
Example 5
Find the length of the curve y = 3x + 2 between x = 1 and x = 5.
Solution
In this Example, the curve is in fact a straight line segment, and its length could be obtained using
Pythagoras’ theorem without the need for integration.
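Notice from the formula in Key Point 7 that it is necessary to find $\frac{dy}{dx}$, which in this case is 3.
Applying the formula we find
\[ \begin{aligned} \text{length of curve} &= \int_1^5 \sqrt{1 + (3)^2}\,dx \\ &= \int_1^5 \sqrt{10}\,dx \\ &= \left[\sqrt{10}\,x\right]_1^5 \\ &= (5 - 1)\sqrt{10} = 4\sqrt{10} = 12.65 \text{ to 2 d.p.} \end{aligned} \]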
Thus the length of the curve y = 3x + 2 between the points where x = 1 and x = 5 is 12.65 units.
Task
Find the length of the curve y = cosh x between x = 0 and x = 2 shown in the
diagram.
Figure: the curve y = cosh x for 0 ≤ x ≤ 2
First write down $\frac{dy}{dx}$:
Your solution
$\frac{dy}{dx} =$
Answer
$\frac{dy}{dx} = \sinh x$
Hence write down the required integral:
Your solution
Answer
\[ \int_0^2 \sqrt{1 + \sinh^2 x}\,dx \]
This integral can be evaluated by making use of the hyperbolic identity $\cosh^2 x - \sinh^2 x \equiv 1$.
Write down the integral which results after applying this identity:
Your solution
Answer
\[ \int_0^2 \cosh x\,dx \]
Perform the integration to find the required length:

Your solution
Answer
\[ \Big[\sinh x\Big]_0^2 = 3.63 \text{ to 2 d.p.} \]
0
Thus the length of y = cosh x between x = 0 and x = 2 is 3.63 units.
The next Task is more complicated still and requires the use of a hyperbolic substitution and
knowledge of the hyperbolic identities.
Task
Find the length of the curve $y = x^2$ between x = 0 and x = 3.
Given $y = x^2$ then $\frac{dy}{dx} = 2x$. Use this result and apply the formula in Key Point 7 to obtain the
integral required:
Your solution
Answer
\[ \int_0^3 \sqrt{1 + 4x^2}\,dx \]
Make the substitution $x = \frac{1}{2}\sinh u$, giving $\frac{dx}{du} = \frac{1}{2}\cosh u$, to obtain an integral in terms of u:
Your solution
Answer
\[ \int_0^{\sinh^{-1} 6} \sqrt{1 + \sinh^2 u}\;\frac{1}{2}\cosh u\,du \]
Use the hyperbolic identity $\cosh^2 u - \sinh^2 u \equiv 1$ to eliminate $\sinh^2 u$:

Your solution
Answer
\[ \frac{1}{2}\int_0^{\sinh^{-1} 6} \cosh^2 u\,du \]
Use the hyperbolic identity $\cosh^2 u \equiv \frac{1}{2}(\cosh 2u + 1)$ to rewrite the integrand in terms of cosh 2u:
Your solution
Answer
\[ \frac{1}{4}\int_0^{\sinh^{-1} 6} (\cosh 2u + 1)\,du \]
Finally, perform the integration to complete the calculation:
Your solution
Answer
\[ \frac{1}{4}\int_0^{\sinh^{-1} 6} (\cosh 2u + 1)\,du = \frac{1}{4}\left[\frac{\sinh 2u}{2} + u\right]_0^{\sinh^{-1} 6} = 9.75 \text{ to 2 d.p.} \]
Thus the length of the curve $y = x^2$ between x = 0 and x = 3 is 9.75 units.
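As the text notes, arc-length integrals are usually evaluated numerically in practice. The sketch below applies the midpoint rule to the integrand of Key Point 7 for the three curves treated in this Section; the helper name `arc_length` and the number of strips are assumptions made for illustration.

```python
import math

def arc_length(dydx, a, b, n=10_000):
    """Approximate the integral of sqrt(1 + (dy/dx)**2) from a to b (midpoint rule)."""
    dx = (b - a) / n
    return sum(math.sqrt(1 + dydx(a + (i + 0.5) * dx) ** 2) for i in range(n)) * dx

line = arc_length(lambda x: 3.0, 1.0, 5.0)        # y = 3x + 2, dy/dx = 3
catenary = arc_length(math.sinh, 0.0, 2.0)        # y = cosh x, dy/dx = sinh x
parabola = arc_length(lambda x: 2 * x, 0.0, 3.0)  # y = x**2, dy/dx = 2x
print(round(line, 2), round(catenary, 2), round(parabola, 2))
```

The rounded values reproduce the results 12.65, 3.63 and 9.75 obtained analytically above.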
Exercises
1. Find the length of the line y = 2x + 7 between x = 1 and x = 3 using the technique of this
Section. Verify your result from your knowledge of the straight line.
2. Find the length of $y = x^{3/2}$ between x = 0 and x = 5.
3. Calculate the length of the curve $y^2 = 4x^3$ between x = 0 and x = 2, in the first quadrant.
Answers
1. $2\sqrt{5} \approx 4.47$. The distance is from (1, 9) to (3, 13) along the line. This is given using Pythagoras’
theorem as $\sqrt{2^2 + 4^2} = \sqrt{20} = 2\sqrt{5}$.
2. 12.41
3. 6.06 (first quadrant only).
2. The area of a surface of revolution

In Section 14.3 we found an expression for the volume of a solid of revolution. Here we consider the
more complicated problem of formulating an expression for the surface area of a solid of revolution.
Figure 12 (the curve y(x) between x = a and x = b, with a small ribbon at (x, y) of base δx and height δy marked)
Figure 12 shows the portion of the curve y(x) between x = a and x = b which is rotated around
the x axis through 360°. A small disc, of thickness δx, of the solid of revolution has been selected.
Its radius is y and so its circumference has length 2πy. (As usual we assume δx is ‘small’ so that
the curved part of y(x) representing the hypotenuse of the highlighted ‘triangle’ can be regarded
as straight.) This surface ‘ribbon’, shown shaded, has a length 2πy and a width $\sqrt{(\delta x)^2 + (\delta y)^2}$
and so its area is, to a good approximation, $2\pi y\sqrt{(\delta x)^2 + (\delta y)^2}$. We now let δx → 0 to obtain the
result in Key Point 8:
Key Point 8
Given a curve with equation y = f(x), then the surface area of the solid generated by rotating that
part of the curve between the points where x = a and x = b around the x axis is given by the
formula:
\[ \text{area of surface} = \int_a^b 2\pi y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \]
Task
Find the area of the surface generated when the part of the curve $y = x^3$ between
x = 0 and x = 4 is rotated around the x axis.
Using Key Point 8 write down the integral:

Your solution
Answer
\[ \text{area} = \int_a^b 2\pi y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = \int_0^4 2\pi x^3 \sqrt{1 + (3x^2)^2}\,dx = \int_0^4 2\pi x^3 \sqrt{1 + 9x^4}\,dx \]
Use the substitution $u = 1 + 9x^4$ so $\frac{du}{dx} = 36x^3$ to write down the integral in terms of u:
Your solution
Answer
\[ \frac{\pi}{18}\int_1^{2305} \sqrt{u}\,du \]
Perform the integration:

Your solution
Answer
\[ \frac{\pi}{18}\left[\frac{2u^{3/2}}{3}\right]_1^{2305} \]
Apply the limits of integration to find the area:
Your solution
Answer
\[ \frac{\pi}{27}\left((2305)^{3/2} - 1\right) \]
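The closed-form answer can be compared with a direct numerical evaluation of the formula in Key Point 8. This is a sketch under stated assumptions: the helper `surface_area_about_x` and its midpoint-rule implementation are illustrative and not part of the Workbook.

```python
import math

def surface_area_about_x(y, dydx, a, b, n=20_000):
    """Approximate the integral of 2*pi*y*sqrt(1 + (dy/dx)**2) from a to b."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx  # midpoint of the i-th ribbon
        total += 2 * math.pi * y(x) * math.sqrt(1 + dydx(x) ** 2)
    return total * dx

numeric = surface_area_about_x(lambda x: x ** 3, lambda x: 3 * x ** 2, 0.0, 4.0)
exact = math.pi / 27 * (2305 ** 1.5 - 1)
print(numeric, exact)
```

The two values agree to many significant figures, which supports the substitution and limits used in the Task.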
Exercises
1. The line y = x between x = 0 and x = 1 is rotated around the x axis.
(a) Find the area of the surface generated.

(b) Verify this result by finding the curved surface area of the corresponding cone. (The
curved surface area of a cone of radius r and slant height ℓ is πrℓ.)
2. Find the area of the surface generated when $y = \sqrt{x}$ in the interval 1 ≤ x ≤ 2 is rotated about
the x axis.
Answers
1. $\pi\sqrt{2}$
2. 8.28
Contents 15
Applications of
Integration 2
15.1 Integration of Vectors 2
15.2 Calculating Centres of Mass 5
15.3 Moment of Inertia 24
Learning outcomes
In this Workbook you will learn to interpret an integral as the limit of a sum. You will learn
how to apply this approach to the meaning of an integral to calculate important attributes
of a curve: the area under the curve, the length of a curve segment, the volume and
surface area obtained when a segment of a curve is rotated about an axis. Other quantities
of interest which can also be calculated using integration are the position of the centre of
mass of a plane lamina and the moment of inertia of a lamina about an axis. You will also
learn how to determine the mean value of an integral.
Integration of
Vectors 15.1
Introduction
The area known as vector calculus is used to model mathematically a vast range of engineering
phenomena including electrostatics, electromagnetic fields, air flow around aircraft and heat flow in
nuclear reactors. In this Section we introduce briefly the integral calculus of vectors.
Prerequisites
Before starting this Section you should . . .
• have a knowledge of vectors, in Cartesian form
• be able to calculate the scalar product of two vectors
• be able to calculate the vector product of two vectors
• be able to integrate scalar functions

Learning Outcomes
• integrate vectors


1. Integration of vectors
If a vector depends upon time t, it is often necessary to integrate it with respect to time. Recall that
i, j and k are constant vectors and must be treated thus in any integration. Hence the integral,
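\[ I = \int \big(f(t)\mathbf{i} + g(t)\mathbf{j} + h(t)\mathbf{k}\big)\,dt \]
is evaluated as three scalar integrals i.e.
\[ I = \left(\int f(t)\,dt\right)\mathbf{i} + \left(\int g(t)\,dt\right)\mathbf{j} + \left(\int h(t)\,dt\right)\mathbf{k} \]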
Example 1
If $\mathbf{r} = 3t\,\mathbf{i} + t^2\,\mathbf{j} + (1 + 2t)\,\mathbf{k}$, evaluate $\int_0^1 \mathbf{r}\,dt$.
Solution
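\[ \int_0^1 \mathbf{r}\,dt = \left(\int_0^1 3t\,dt\right)\mathbf{i} + \left(\int_0^1 t^2\,dt\right)\mathbf{j} + \left(\int_0^1 (1 + 2t)\,dt\right)\mathbf{k} \]
\[ = \left[\frac{3t^2}{2}\right]_0^1 \mathbf{i} + \left[\frac{t^3}{3}\right]_0^1 \mathbf{j} + \left[t + t^2\right]_0^1 \mathbf{k} = \frac{3}{2}\mathbf{i} + \frac{1}{3}\mathbf{j} + 2\mathbf{k} \]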
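Because i, j and k are constant, integrating a vector reduces to integrating each component separately. A minimal sketch of Example 1 follows; the helper names `integrate` and `integrate_vector` are hypothetical, and a vector is represented simply as a tuple of three component functions.

```python
def integrate(f, a, b, n=10_000):
    """Approximate a scalar definite integral with the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def integrate_vector(components, a, b):
    """Integrate each Cartesian component separately; returns (i, j, k) parts."""
    return tuple(integrate(f, a, b) for f in components)

# r = 3t i + t^2 j + (1 + 2t) k from Example 1
r = (lambda t: 3 * t, lambda t: t ** 2, lambda t: 1 + 2 * t)
result = integrate_vector(r, 0.0, 1.0)
print(result)
```

The three components come out close to 3/2, 1/3 and 2, as in the Solution.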
Trajectories
To simplify the modelling of the path of a body projected from a fixed point we usually ignore any
air resistance and effects due to the wind. Once this initial model is understood other variables and
effects can be introduced into the model.
A particle is projected from a point O with velocity u and an angle θ above the horizontal as shown
in Figure 1.
Figure 1 (a particle projected from O with velocity u at an angle θ above the horizontal x-axis)
The only force acting on the particle in flight is gravity acting downwards, so if m is the mass of the
projectile and taking axes as shown, the force due to gravity is −mgj. Now using Newton’s second
law (rate of change of momentum is equal to the applied force) we have
\[ \frac{d(m\mathbf{v})}{dt} = -mg\,\mathbf{j} \]
Cancelling the common factor m and integrating we have
v(t) = −gtj + c where c is a constant vector.
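However, velocity is the rate of change of position: $\mathbf{v}(t) = \frac{d\mathbf{r}}{dt}$ so
\[ \frac{d\mathbf{r}}{dt} = -gt\,\mathbf{j} + \mathbf{c} \]
Integrating once more:
\[ \mathbf{r}(t) = -\frac{1}{2}gt^2\,\mathbf{j} + \mathbf{c}t + \mathbf{d} \]
where d is another constant vector.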
2
The values of these constant vectors may be determined by using the initial conditions in this
problem: when t = 0 then r = 0 and v = u. Imposing these initial conditions gives
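$\mathbf{d} = \mathbf{0}$ and $\mathbf{c} = u\cos\theta\,\mathbf{i} + u\sin\theta\,\mathbf{j}$ where u is the magnitude of $\mathbf{u}$. This gives
\[ \mathbf{r}(t) = ut\cos\theta\,\mathbf{i} + \left(ut\sin\theta - \frac{1}{2}gt^2\right)\mathbf{j}. \]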
The interested reader might try to show why the path of the particle is a parabola.
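One route to the parabola is to eliminate t. Since $x = ut\cos\theta$, we have $t = x/(u\cos\theta)$ (for $\cos\theta \neq 0$), and substituting into the j component gives $y = x\tan\theta - \frac{g x^2}{2u^2\cos^2\theta}$, which is quadratic in x. The sketch below checks this numerically; the values of g, u and θ are assumed purely for illustration.

```python
import math

G = 9.81                  # assumed gravitational acceleration (m/s^2)
U = 20.0                  # assumed launch speed (m/s)
THETA = math.radians(30)  # assumed launch angle

def position(t):
    """Components of r(t) from the formula derived above."""
    x = U * t * math.cos(THETA)
    y = U * t * math.sin(THETA) - 0.5 * G * t ** 2
    return x, y

def parabola_height(x):
    """Height predicted by eliminating t: a quadratic in x."""
    return x * math.tan(THETA) - G * x ** 2 / (2 * U ** 2 * math.cos(THETA) ** 2)

# Largest mismatch between the trajectory and the parabola at sample times.
max_gap = max(abs(position(t)[1] - parabola_height(position(t)[0]))
              for t in (0.0, 0.25, 0.5, 1.0, 1.5, 2.0))
print(max_gap)
```

The mismatch is at the level of floating-point rounding, so every sampled point of the trajectory lies on the same parabola.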
Exercises
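1. Given $\mathbf{r} = 3\sin t\,\mathbf{i} - \cos t\,\mathbf{j} + (2 - t)\,\mathbf{k}$, evaluate $\int_0^{\pi} \mathbf{r}\,dt$.
2. Given $\mathbf{v} = \mathbf{i} - 3\mathbf{j} + \mathbf{k}$, evaluate:
(a) $\int_0^1 \mathbf{v}\,dt$, (b) $\int_0^2 \mathbf{v}\,dt$
3. The vector, a, is defined by $\mathbf{a} = t^2\,\mathbf{i} + e^{-t}\,\mathbf{j} + t\,\mathbf{k}$. Evaluate
(a) $\int_0^1 \mathbf{a}\,dt$, (b) $\int_2^3 \mathbf{a}\,dt$, (c) $\int_1^4 \mathbf{a}\,dt$
4. Let a and b be two three-dimensional vectors. Is the following result true?
\[ \int_{t_1}^{t_2} \mathbf{a}\,dt \times \int_{t_1}^{t_2} \mathbf{b}\,dt = \int_{t_1}^{t_2} \mathbf{a} \times \mathbf{b}\,dt \]
where × denotes the vector product.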

Answers
1. 6i + 1.348k
2. (a) i − 3j + k (b) 2i − 6j + 2k
3. (a) 0.333i + 0.632j + 0.5k (b) 6.333i + 0.0855j + 2.5k (c) 21i + 0.3496j + 7.5k
4. No.
Calculating
Centres of Mass 15.2
Introduction
In this Section we show how the idea of integration as the limit of a sum can be used to find the
centre of mass of an object such as a thin plate (like a sheet of metal). Such a plate is also known
as a lamina. An understanding of the term moment is necessary and so this concept is introduced.


Prerequisites

Learning Outcomes • calculate the position of the centre of mass

of a variety of simple plane shapes

1. The centre of mass of a collection of point masses
Suppose we have a collection of masses located at a number of known points along a line. The
centre of mass is the point where, for many purposes, all the mass can be assumed to be located.
For example, if two objects each of mass m are placed at distances 1 and 2 units from a point O, as
shown in Figure 2a, then the total mass, 2m, might be assumed to be concentrated at distance 1.5
units as shown in Figure 2b. This is the point where we could imagine placing a pivot to achieve a
perfectly balanced system.
Figure 2: Equivalent position of the centre of mass of the objects in (a) is shown in (b)
To think of this another way, if a pivot is placed at the origin O, as on a see-saw, then the two
masses at x = 1 and x = 2 together have the same turning effect or moment as a single mass
2m located at x = 1.5. This is illustrated in Figure 3.
Figure 3: The single object of mass 2m has the same turning effect as the two objects each of mass m
Before we can calculate the position of the centre of mass of a collection of masses it is important
to define the term ‘moment’ more precisely. Given a mass M located a distance d from O, as shown
in Figure 4, its moment about O is defined to be
moment = M × d
Figure 4: The moment of the mass M about O is M × d
In words, the moment of the mass about O, is the mass multiplied by its distance from O. The units
of moment will therefore be kg m if the mass is measured in kilogrammes and the distance in metres.
(N.B. Unless specified otherwise these will be the units we shall always use.)
Task
Calculate the moment of the mass about O in each of the following cases.
(a) a mass of 5 kg located 8 m from O
(b) a mass of 5 kg located 10 m from O
Your solution
(a) (b)
Answer
(a) 40 kg m (b) 50 kg m
Intuition tells us that a large moment corresponds to a large turning effect. A mass placed 8 metres
from the origin has a smaller turning effect than the same mass placed 10 metres from the origin.
This is, of course, why it is easier to rock a see-saw by pushing it at a point further from the pivot.
Our intuition also tells us the side of the pivot on which the masses are placed is important. Those
placed to the left of the pivot have a turning effect opposite to those placed to the right.
For a collection of masses the moment of the total mass located at the centre of mass is equal to the
sum of the moments of the individual masses. This definition enables us to calculate the position of
the centre of mass. It is conventional to label the x coordinate of the centre of mass as x̄, pronounced
‘x bar’.
Key Point 1
The moment of the total mass located at the centre of mass is equal to the sum of the moments
of the individual masses.
Task
Objects of mass m and 3m are placed at the locations shown in diagram (a). Find
the distance x̄ of the centre of mass from the origin O as illustrated in diagram
(b).
Diagram (a): mass m at distance 6 and mass 3m at distance 10 from O. Diagram (b): a single mass 4m at distance x̄ from O.
First calculate the sum of the individual moments:

Your solution
Answer
6 × m + 10 × 3m = 36m
The moment of the total mass about O is 4m × x̄.

The moment of the total mass is equal to the sum of the moments of the individual masses. Write
down and solve the equation satisfied by x̄:
Your solution
Answer
36m = 4mx̄, so x̄ = 9
So the centre of mass is located a distance 9 units along the x-axis. Note that it is closer to the
position of the mass 3m than to the position of the mass m (actually in the ratio 3 : 1).
Example 2
Obtain an equation for the location of the centre of mass of two objects of masses
m1 and m2 :
(a) located at distances x1 and x2 respectively, as shown in Figure 5(a)
(b) positioned on opposite sides of the origin as shown in Figure 5(b)
Figure 5: (a) masses $m_1$ and $m_2$ on the same side of O at distances $x_1$ and $x_2$; (b) $m_1$ and $m_2$ on opposite sides of O. In each case the centre of mass is at x̄.
Referring to Figure 5(a) we first write down an expression for the sum of the individual moments:
\[ m_1 x_1 + m_2 x_2 \]
The total mass is $m_1 + m_2$ and the moment of the total mass is $(m_1 + m_2) \times \bar{x}$.
The moment of the total mass is equal to the sum of the moments of the individual masses. The
equation satisfied by x̄ is
\[ (m_1 + m_2)\bar{x} = m_1 x_1 + m_2 x_2 \quad\text{so}\quad \bar{x} = \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2} \]
For the second case, as depicted in Figure 5(b), the mass $m_1$ positioned on the left-hand side has a
turning effect opposite to that of the mass $m_2$ positioned on the right-hand side. To take account
of this difference we use a minus sign when determining the moment of $m_1$ about the origin. This
gives a total moment
\[ -(m_1 x_1) + (m_2 x_2) \]
leading to
\[ (-m_1 x_1 + m_2 x_2) = (m_1 + m_2)\bar{x} \quad\text{so}\quad \bar{x} = \frac{-m_1 x_1 + m_2 x_2}{m_1 + m_2} \]
However, this is precisely what would have been obtained if, when working out the moment of a
mass, we use its coordinate (which takes account of sign) rather than using its distance from the
origin.
The formula obtained in Example 2 can be generalised very easily to deal with the general situation of
n masses, $m_1, m_2, \ldots, m_n$ located at coordinate positions $x_1, x_2, \ldots, x_n$ and is given in Key Point
2.
Key Point 2
The centre of mass of individual masses $m_1, m_2, \ldots, m_n$ located at positions $x_1, x_2, \ldots, x_n$ is
\[ \bar{x} = \frac{\displaystyle\sum_{i=1}^{n} m_i x_i}{\displaystyle\sum_{i=1}^{n} m_i} \]
Task
Calculate the centre of mass of the 4 masses distributed as shown below.
Diagram: masses of 9, 1, 5 and 2 are located at x = −1, x = 2, x = 6 and x = 8 respectively on an axis marked from −1 to 9.
Use Key Point 2 to calculate x̄:
Your solution
x̄ =
Answer
\[ \bar{x} = \frac{(9)(-1) + (1)(2) + (5)(6) + (2)(8)}{9 + 1 + 5 + 2} = \frac{39}{17} \]
The centre of mass is located a distance $\frac{39}{17} \approx 2.29$ units along the x-axis from O.
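Key Point 2 translates directly into a short calculation. The sketch below is illustrative only: the function name `centre_of_mass` is hypothetical, and exact fractions are used so that the result 39/17 appears without rounding.

```python
from fractions import Fraction

def centre_of_mass(masses, positions):
    """Sum of moments m_i * x_i divided by the total mass (Key Point 2)."""
    total_moment = sum(m * x for m, x in zip(masses, positions))
    return Fraction(total_moment, sum(masses))

# Masses 9, 1, 5, 2 at coordinates -1, 2, 6, 8, as in the Task above.
x_bar = centre_of_mass([9, 1, 5, 2], [-1, 2, 6, 8])
print(x_bar, round(float(x_bar), 2))
```

Using coordinates rather than distances means the mass to the left of O automatically contributes a negative moment.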
Distribution of particles in a plane

If the particles are distributed in a plane then the position of the centre of mass can be calculated in
a similar way.
Figure 6: These masses are distributed throughout the xy plane
Now we must consider the moments of the individual masses taken about the x-axis and about the
y-axis. For example, in Figure 6, mass mi has a moment mi yi about the x-axis and a moment mi xi
about the y-axis. Now we define the centre of mass at that point (x̄, ȳ) such that the total mass
M = m1 + m2 + . . . mn placed at this point would have the same moment about each of the axes
as the sum of the individual moments of the particles about these axes.
Key Point 3
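The centre of mass of $m_1, m_2, \ldots, m_n$ located at $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ has coordinates
(x̄, ȳ) where
\[ \bar{x} = \frac{\displaystyle\sum_{i=1}^{n} m_i x_i}{\displaystyle\sum_{i=1}^{n} m_i}, \qquad \bar{y} = \frac{\displaystyle\sum_{i=1}^{n} m_i y_i}{\displaystyle\sum_{i=1}^{n} m_i} \]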
Task
Masses of 5 kg, 3 kg and 9 kg are located at the points with coordinates (−1, 1),
(4, 3), and (8, 7) respectively. Find the coordinates of their centre of mass.
Use Key Point 3:
Your solution
x̄ =
ȳ =
Answer
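\[ \bar{x} = \frac{\displaystyle\sum_{i=1}^{3} m_i x_i}{\displaystyle\sum_{i=1}^{3} m_i} = \frac{5(-1) + 3(4) + 9(8)}{5 + 3 + 9} = \frac{79}{17} \approx 4.65 \]
\[ \bar{y} = \frac{5(1) + 3(3) + 9(7)}{17} = 4.53. \]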
Hence the centre of mass is located at the point (4.65, 4.53).
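The two formulas of Key Point 3 can be evaluated together. A minimal sketch, assuming a hypothetical helper `centre_of_mass_2d` that takes a list of masses and a list of (x, y) points:

```python
def centre_of_mass_2d(masses, points):
    """Apply Key Point 3: moments about each axis divided by the total mass."""
    total = sum(masses)
    x_bar = sum(m * x for m, (x, _) in zip(masses, points)) / total
    y_bar = sum(m * y for m, (_, y) in zip(masses, points)) / total
    return x_bar, y_bar

# Masses of 5 kg, 3 kg and 9 kg at (-1, 1), (4, 3) and (8, 7).
x_bar, y_bar = centre_of_mass_2d([5, 3, 9], [(-1, 1), (4, 3), (8, 7)])
print(round(x_bar, 2), round(y_bar, 2))
```

The rounded coordinates match the point (4.65, 4.53) found in the Task.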
Exercises
1. Find the x coordinate of the centre of mass of 5 identical masses placed at x = 2, x = 5,
x = 7, x = 9, x = 12.
2. Derive the formula for ȳ given in Key Point 3.
Answer 1. x̄ = 7
2. Finding the centre of mass of a plane uniform lamina
In the previous Section we calculated the centre of mass of several individual point masses. We are
now interested in finding the centre of mass of a thin sheet of material, such as a plane sheet of
metal, called a lamina. The mass is not now located at individual points. Rather, it is distributed
continuously over the lamina. In what follows we assume that the mass is distributed uniformly over
the lamina and you will see how integration as the limit of a sum is used to find the centre of mass.
Figure 6 shows a lamina where the centre of mass has been marked at point G with coordinates (x̄, ȳ).
If the total mass of the lamina is M then the moments about the y- and x-axes are respectively M x̄
and M ȳ. Our approach to locating the position of G, i.e. finding x̄ and ȳ, is to divide the lamina
into many small pieces, find the mass of each piece, and calculate the moment of each piece about
the axes. The sum of the moments of the individual pieces about the y-axis must then be equal to
M x̄ and the sum of the moments of the individual pieces about the x-axis must equal M ȳ.
Figure 6: The centre of mass of the lamina is located at G(x̄, ȳ)
There are no formulae which can be memorized for finding the centre of mass of a lamina because
of the wide variety of possible shapes. Instead you should be familiar with the general technique for
deriving the centre of mass.
An important preliminary concept is ‘mass per unit area’ which we now introduce.
Mass per unit area

Suppose we have a uniform lamina and select a piece of the lamina which has area equal to one
unit. Let ρ stand for the mass of such a piece. Then ρ is called the mass per unit area. The mass
of any other piece can be expressed in terms of ρ. For example, an area of 2 units must have mass
2ρ, an area of 3 units must have mass 3ρ, and so on. Any portion of the lamina which has area A
has mass ρA.
Key Point 4
If a lamina has mass per unit area, ρ, then the mass of part of the lamina having area A is Aρ.
We will investigate the calculation of centre of mass through the following Tasks.
Task
Consider the plane sheet, or lamina, shown below. Find the location of its centre
of mass (x̄, ȳ). (By symmetry the centre of mass of this lamina lies on the x-axis.)
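Figure: a lamina symmetric about the x-axis, bounded above by the line y = 3x for 0 ≤ x ≤ 1, with its centre of mass G(x̄, ȳ) marked on the x-axis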
(a) First inspect the figure and note the symmetry of the lamina. Purely from the symmetry, what
must be the y coordinate, ȳ, of the centre of mass ?
Your solution
Answer
ȳ = 0 since the centre of mass must lie on the x-axis
(b) Let ρ stand for the mass per unit area of the lamina. The total area is 3 units. The total mass
is therefore 3ρ. Its moment about the y-axis is 3ρx̄.
To find x̄ first divide the lamina into a large number of thin vertical slices. In the figure below a
typical slice has been highlighted. Note that the slice has been drawn from the point P on the line
y = 3x. The point P has coordinates (x, y). The thickness of the slice is δx.
Figure: A typical slice of this sheet has been shaded.

Assuming that the slice is rectangular in shape, write down its area:
Your solution
Answer
2yδx
(c) Writing ρ as the mass per unit area, write down the mass of the slice:
Your solution
Answer
(2yδx)ρ
(d) The centre of mass of this slice lies on the x-axis. So the slice can be assumed to be a point
mass, 2yρδx, located a distance x from O.
Write down the moment of the mass of the slice about the y-axis:
Your solution
Answer
(2yδx)ρx
(e) By adding up contributions from all such slices in the lamina we obtain the sum of the moments
of the individual masses:
\[ \sum_{x=0}^{x=1} 2\rho x y\,\delta x \]
The limits on the sum are chosen so that all slices are included.
Write down the integral defined by letting δx → 0:
Your solution
Answer
\[ \int_{x=0}^{x=1} 2\rho x y\,dx \]
(f) Noting that y = 3x, express the integrand in terms of x and evaluate it:
Your solution
Answer
\[ \int_0^1 6\rho x^2\,dx = \left[2\rho x^3\right]_0^1 = 2\rho \]
(g) Calculate x̄ and hence find the centre of mass of the lamina:
Your solution
Answer
The sum of the individual moments, 2ρ, must equal the moment of the total mass acting at the centre of mass, so 3ρx̄ = 2ρ, giving x̄ = 2/3. The coordinates of the centre of mass are therefore (2/3, 0).
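The slicing argument of this Task can be checked numerically. The following Python sketch is an illustration only and is not part of the original Task; the function name `centroid_x` and the number of slices are assumptions made for the example. It adds up the areas and moments of thin vertical slices of height 2y = 6x and divides the total moment by the total area (the factor ρ cancels).

```python
def centroid_x(half_height, x_min, x_max, n=100_000):
    """Approximate x-bar for a uniform lamina symmetric about the x-axis.

    half_height(x) is the height y of the upper boundary at position x,
    so each vertical slice has area 2*y*dx (assumed rectangular).
    """
    dx = (x_max - x_min) / n
    area = 0.0
    moment = 0.0
    for i in range(n):
        x = x_min + (i + 0.5) * dx          # mid-point of the slice
        slice_area = 2.0 * half_height(x) * dx
        area += slice_area                  # total area (mass / rho)
        moment += x * slice_area            # moment about the y-axis / rho
    return moment / area


# Lamina of the Task: upper boundary y = 3x for 0 <= x <= 1
x_bar = centroid_x(lambda x: 3.0 * x, 0.0, 1.0)
print(round(x_bar, 4))
```

As the number of slices increases the printed value settles close to 2/3, in agreement with the result found by integration.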
Task
Find the centre of mass of the plane lamina shown below.
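[Figure: triangular lamina bounded by the line y = x, the x-axis and the line x = 2, with a typical vertical strip of width δx drawn up to the point (x, y).]
The coordinates of x̄ and ȳ must be calculated separately.

Stage 1: To calculate x̄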
(a) Let ρ equal the mass per unit area. Write down the total area, the total mass, and its moment
about the y-axis:
Your solution
Answer
2, 2ρ, 2ρx̄
(b) To calculate x̄ the lamina is divided into thin slices; a typical slice is shown in the figure above.
We assume that the shaded slice is rectangular, which is a reasonable approximation.
Write down the height of the typical strip shown in the figure, its area, and its mass:
Your solution
Answer
y, yδx, (yδx)ρ
(c) Write down the moment about the y-axis of the typical strip:
Your solution
Answer
(yδx)ρx
(d) The sum of the moments of all strips is

\[ \sum_{x=0}^{x=2} \rho x y\,\delta x \]
Write down the integral which follows as δx → 0:

Your solution
Answer
\[ \int_0^2 \rho x y\,dx \]
(e) In this example, y = x because the line y = x defines the upper limit of each strip (and hence
its height). Substitute this value for y in the integral, and evaluate it:
Your solution
Answer
\[ \int_0^2 \rho x^2\,dx = \frac{8\rho}{3} \]
(f) Equating the sum of individual moments and the total moment gives 2ρx̄ = 8ρ/3. Deduce x̄:
Your solution
Answer
x̄ = 4/3
We will illustrate two alternative ways of calculating ȳ.
Stage 2: To calculate ȳ using vertical strips
[Figure: the same triangular lamina, bounded by the line y = x, the x-axis and the line x = 2, with a typical vertical strip of width δx drawn up to the point (x, y).]
(a) Referring to the figure again, which we repeat here, the centre of mass of the slice must lie half way along its length, that is, its y coordinate is y/2. Assume that all the mass of the slice, yρδx, acts at this point. Then its moment about the x-axis is yρδx × (y/2). Adding contributions from all slices gives the sum
\[ \sum_{x=0}^{x=2} \frac{y^2 \rho}{2}\,\delta x \]
(b) Write down the integral which is defined as δx → 0:
Your solution
Answer
\[ \int_{x=0}^{2} \frac{\rho y^2}{2}\,dx \]
(c) We can write the above as
\[ \rho \int_{x=0}^{2} \frac{y^2}{2}\,dx \]
and in this example y = x, so the integral becomes
\[ \rho \int_{x=0}^{2} \frac{x^2}{2}\,dx \]
Evaluate this.
Your solution
Answer
4ρ/3
(d) This is the sum of the individual moments about the x-axis and must equal the moment of the
total mass about the x-axis, which is 2ρȳ since the total mass is 2ρ. Therefore
2ρȳ = 4ρ/3 from which ȳ = 2/3
(e) Finally deduce ȳ and state the coordinates of the centre of mass:
Your solution
Answer
ȳ = 2/3 and the coordinates of the centre of mass are (4/3, 2/3)
Stage 3: To calculate ȳ using horizontal strips

(a) This time the lamina is divided into a number of horizontal slices; a typical slice is shown below.
[Figure: the same triangular lamina with a typical horizontal slice of thickness δy, running from the point (x, y) on the line y = x to the edge x = 2. A typical horizontal slice is shaded.]

The length of the typical slice shown is 2 − x.
Write down its area, its mass and its moment about the x-axis:
Your solution
Answer
(2 − x)δy, ρ(2 − x)δy, ρ(2 − x)yδy
(b) Write down the expression for the sum of all such moments and the corresponding integral as
δy → 0.
Your solution
Answer
\[ \sum_{y=0}^{y=2} \rho(2 - x)y\,\delta y, \qquad \int_0^2 \rho(2 - x)y\,dy \]
(c) Now, since y = x the integral can be written entirely in terms of y as
\[ \int_0^2 \rho(2 - y)y\,dy \]
Evaluate the integral and hence find ȳ:

Your solution
Answer
4ρ/3. As before the total mass is 2ρ, and its moment about the x-axis is 2ρȳ. Hence
2ρȳ = 4ρ/3, from which ȳ = 2/3, which was the result obtained before in Stage 2.
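Stages 2 and 3 should give the same value of ȳ. A minimal Python sketch of both slicings follows, for illustration only; the function names and the number of strips are assumptions, and ρ is omitted because it cancels when the total moment is divided by the total mass.

```python
def y_bar_vertical(n=100_000):
    """y-bar for the triangle under y = x, 0 <= x <= 2, using vertical strips."""
    dx = 2.0 / n
    moment = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        y = x                              # height of the strip
        moment += (y * dx) * (y / 2.0)     # strip area times height of its centre
    return moment / 2.0                    # total area of the lamina is 2


def y_bar_horizontal(n=100_000):
    """y-bar for the same triangle using horizontal strips of length 2 - x."""
    dy = 2.0 / n
    moment = 0.0
    for j in range(n):
        y = (j + 0.5) * dy
        x = y                              # left end of the strip lies on y = x
        moment += ((2.0 - x) * dy) * y     # strip area times its distance from the x-axis
    return moment / 2.0


print(round(y_bar_vertical(), 4), round(y_bar_horizontal(), 4))
```

Both printed values should be close to 2/3, the result obtained analytically in Stages 2 and 3.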
Task
Find the position of the centre of mass of a uniform semi-circular lamina of radius
a, shown below.
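[Figure: semi-circular lamina of radius a lying above the x-axis, bounded by the circle x² + y² = a², with a typical horizontal strip of thickness δy drawn through the point (x, y). A typical horizontal strip is shaded.]
The equation of a circle with centre the origin and radius a is x² + y² = a².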

By symmetry x̄ = 0. However it is necessary to calculate ȳ.
(a) The lamina is divided into a number of horizontal strips and a typical strip is shown. Assume
that each strip is rectangular. Writing the mass per unit area as ρ, state the area and the mass of
the strip:
Your solution
Answer
2xδy, 2xρδy
(b) Write down the moment of the mass about the x-axis:
Your solution
Answer
2xρyδy
(c) Write down the expression representing the sum of the moments of all strips and the corresponding
integral obtained as δy → 0:
Your solution
Answer
\[ \sum_{y=0}^{y=a} 2x\rho y\,\delta y, \qquad \int_0^a 2x\rho y\,dy \]
(d) Now since x² + y² = a² we have x = √(a² − y²) and the integral becomes:
\[ \int_0^a 2\rho y\sqrt{a^2 - y^2}\,dy \]
Evaluate this integral by making the substitution u² = a² − y² to obtain the total moment.
Your solution
Answer
2ρa³/3
(e) The total area is half that of a circle of radius a, that is ½πa². The total mass is ½πa²ρ and its
moment is ½πa²ρȳ.
Deduce ȳ:
Your solution
Answer
\[ \tfrac{1}{2}\pi a^2 \rho\,\bar{y} = \frac{2\rho a^3}{3} \quad\text{from which}\quad \bar{y} = \frac{4a}{3\pi} \]
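The result ȳ = 4a/(3π) can be compared with a direct sum over horizontal strips. The Python sketch below is an illustration under stated assumptions: the function name, the radius a = 2 and the number of strips are chosen for the example only, and the total area ½πa² is taken from the Task rather than summed.

```python
import math


def semicircle_y_bar(a, n=200_000):
    """Approximate y-bar of a uniform semi-circular lamina of radius a."""
    dy = a / n
    moment = 0.0
    for j in range(n):
        y = (j + 0.5) * dy
        x = math.sqrt(a * a - y * y)       # half-length of the strip
        moment += (2.0 * x * dy) * y       # strip area times distance from the x-axis
    area = 0.5 * math.pi * a * a           # half the area of a circle
    return moment / area


a = 2.0                                    # illustrative radius
approx = semicircle_y_bar(a)
exact = 4.0 * a / (3.0 * math.pi)
print(approx, exact)
```

The two printed numbers should agree to several decimal places.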
Suspended cable
Introduction
A cable of constant line density is suspended between two vertical poles of equal height such that it
takes the shape of a curve, y = 6 cosh(x/6). The origin of the curve is a point mid-way between the
feet of the poles and y is the height above the ground. If the cable is 600 metres long show that the
distance between the poles is 55 metres to the nearest metre. Find the height of the centre of mass
of the cable above the ground to the nearest metre.
We can draw a picture of the cable as in Figure 7 where A and B denote the end points.
[Figure 7: the cable hanging between its end points A and B, at x = −d and x = d, with its centre of mass G(0, ȳ) on the y-axis above the origin O.]
Figure 7
For the first part of this problem we use the result found in Workbook 14 that the distance along a curve
y = f(x) from x = a to x = b is given by
\[ s = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \]
where in this case we are given y = 6 cosh(x/6) and therefore dy/dx = sinh(x/6).
If we take the distance between the poles to be 2d then the values of x in this integration go from
−d to +d. So we need to find d such that:
\[ 600 = \int_{-d}^{d} \sqrt{1 + \sinh^2\left(\frac{x}{6}\right)}\,dx. \qquad (1) \]
For the second part of this problem we need to find the centre of mass of the cable. From the
symmetry of the problem we know that the centre of mass must lie on the y-axis. To find the height
of the centre of mass we need to take each section of the cable and consider the moment about the
x-axis through the origin. A section of the cable has mass ρδs where ρ is the line density of the cable
and δs is the length of a small section of the cable.
So the total moment about the x-axis will be
\[ \sum_{x=-d}^{x=d} \rho y\,\delta s. \]
Taking the limit as δs → 0 and using the fact that
\[ \delta s = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,\delta x \]
we find the moment about the x-axis to be
\[ \rho \int_{-d}^{d} y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx. \]
This must equal the moment of a single point mass, equal to the total mass of the cable, placed at
its centre of mass. As the length of the cable is 600 metres then the mass of the cable is 600ρ and
we have
\[ 600\rho\,\bar{y} = \rho \int_{-d}^{d} y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \]
Dividing both sides of this equation by ρ we get:
\[ 600\,\bar{y} = \int_{-d}^{d} y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \]
where we have already established the value of d from Equation (1) so we can solve this equation to
find ȳ.
Mathematical analysis
We need to find d so that
\[ 600 = \int_{-d}^{d} \sqrt{1 + \sinh^2\left(\frac{x}{6}\right)}\,dx \]
Rearranging the hyperbolic identity cosh²(u) − sinh²(u) ≡ 1 we obtain √(1 + sinh²(u)) = cosh(u)
so the integral becomes
\[ \int_{-d}^{d} \cosh\left(\frac{x}{6}\right) dx = \left[6\sinh\left(\frac{x}{6}\right)\right]_{-d}^{+d} = 6\left(\sinh\frac{d}{6} - \sinh\left(-\frac{d}{6}\right)\right) = 12\sinh\frac{d}{6} \]
so
12 sinh(d/6) = 600 and d = 6 sinh⁻¹(50).
Using the log identity for the sinh⁻¹ function:
sinh⁻¹(x) ≡ ln(x + √(x² + 1))
we find that d = 27.63 m so the distance between the poles is 55 m to the nearest metre.
To find the height of the centre of mass above the ground we use
\[ 600\,\bar{y} = \int_{-d}^{d} y \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \]
Substituting y = 6 cosh(x/6) and therefore √(1 + (dy/dx)²) = √(1 + sinh²(x/6)) = cosh(x/6) we get
\[ \int_{-d}^{d} 6\cosh\left(\frac{x}{6}\right)\cosh\left(\frac{x}{6}\right) dx = 6\int_{-d}^{d} \cosh^2\left(\frac{x}{6}\right) dx \]
From the hyperbolic identities we know that cosh²(x) ≡ ½(cosh(2x) + 1)
so this integral becomes
\[ 3\int_{-d}^{d} \left(\cosh\frac{x}{3} + 1\right) dx = \left[9\sinh\frac{x}{3} + 3x\right]_{-d}^{+d} = 18\sinh\frac{d}{3} + 6d \]
So we have that 600ȳ = 18 sinh(d/3) + 6d
From the first part of this problem we found that d = 27.63 so substituting for d we find ȳ = 150
metres to the nearest metre.
Interpretation
We have found that the two vertical poles holding the cable have a distance between them of 55
metres and the height of the centre of mass of the cable above the ground is 150 metres.
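The two closed-form results of the analysis, d = 6 sinh⁻¹(50) and 600ȳ = 18 sinh(d/3) + 6d, can be evaluated directly. The short Python sketch below is offered only as a check of the arithmetic; the variable names are assumptions, and it uses the standard-library functions `math.asinh` and `math.sinh`.

```python
import math

cable_length = 600.0                      # metres, given in the problem

# From 12 sinh(d/6) = 600
d = 6.0 * math.asinh(cable_length / 12.0)
span = 2.0 * d                            # distance between the poles

# From 600 * y_bar = 18 sinh(d/3) + 6d
y_bar = (18.0 * math.sinh(d / 3.0) + 6.0 * d) / cable_length

print(round(d, 2), round(span), round(y_bar))
```

Rounded to the nearest metre, the computed span and height should reproduce the values quoted in the Interpretation.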
Exercise
Find the centre of mass of a lamina bounded by y² = 4x, for 0 ≤ x ≤ 9.
Answer (27/5, 0).

Moment of Inertia 15.3
Introduction
In this Section we show how integration is used to calculate moments of inertia. These are essential
for an understanding of the dynamics of rotating bodies such as flywheels.


Prerequisites

Learning Outcomes
• calculate the moment of inertia of a number of simple plane bodies
1. Introduction
Figure 8: A lamina rotating about an axis through O

Figure 8 shows a lamina which is allowed to rotate about an axis perpendicular to the plane of the
lamina and through O. The moment of inertia about this axis is a measure of how difficult it is
to rotate the lamina. It plays the same role for rotating bodies that the mass of an object plays
when dealing with motion in a line. An object with large mass needs a large force to achieve a given
acceleration. Similarly, an object with large moment of inertia needs a large turning force to achieve
a given angular acceleration. Thus knowledge of the moments of inertia of laminas and of solid
bodies is essential for understanding their rotational properties.
In this Section we show how the idea of integration as the limit of a sum can be used to find the
moment of inertia of a lamina.
2. Calculating the moment of inertia

Suppose a lamina is divided into a large number of small pieces or elements. A typical element is
shown in Figure 9.
[Figure: a typical element of mass δm at a distance r from O.]
Figure 9: The moment of inertia of the small element is δm r²

The element has mass δm, and is located a distance r from the axis through O. The moment of
inertia of this small piece about the given axis is defined to be δm r², that is, the mass multiplied by
the square of its distance from the axis of rotation. To find the total moment of inertia we sum the
individual contributions to give
\[ \sum r^2\,\delta m \]
where the sum must be taken in such a way that all parts of the lamina are included. As δm → 0
we obtain the following integral as the definition of moment of inertia:
Key Point 5
\[ \text{moment of inertia}\quad I = \int r^2\,dm \]
where the limits of integration are chosen so that the entire lamina is included.
The unit of moment of inertia is kg m².
We shall illustrate how the moment of inertia is actually calculated in practice, in the following Tasks.
Task
Calculate the moment of inertia about the y-axis of the square lamina of mass M
and width b, shown below. (The moment of inertia about the y-axis is a measure
of the resistance to rotation around this axis.)
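[Figure: a square lamina of side b centred on the origin O, extending from −b/2 to b/2 in both the x and y directions, with a typical vertical strip of width δx at distance x from the y-axis. A square lamina rotating about the y-axis.]
Let the mass per unit area of the lamina be ρ. Then, because its total area is b², its total mass
M is b²ρ. Imagine that the lamina has been divided into a large number of thin vertical strips. A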
typical strip is shown in the figure above. The strips are chosen in this way because each point on a
particular strip is approximately the same distance from the axis of rotation (the y-axis).
(a) Referring to the figure, write down the width of each strip:
Your solution
Answer
δx
(b) Write down the area of the strip:

Your solution
Answer
bδx
(c) With ρ as the mass per unit area write down the mass of the strip:
Your solution
Answer
ρbδx
(d) The distance of the strip from the y-axis is x. Write down its moment of inertia:
Your solution
Answer
ρbx² δx
(e) Adding contributions from all strips gives the expression Σ ρbx² δx, where the sum must be such
that the entire lamina is included. As δx → 0 the sum defines an integral. Write down this integral:
Your solution
Answer
\[ I = \int_{-b/2}^{b/2} \rho b x^2\,dx \]
(f) Note that the limits on the integral have been chosen so that the whole lamina is included. Then
\[ I = \rho b \int_{-b/2}^{b/2} x^2\,dx \]
Evaluate this integral:

Your solution
I =
Answer
\[ I = \rho b \left[\frac{x^3}{3}\right]_{-b/2}^{b/2} = \frac{\rho b^4}{12} \]
(g) Write down an expression for M in terms of b and ρ:
Your solution
Answer
M = b²ρ
(h) Finally, write an expression for I in terms of M and b:
Your solution
Answer
I = Mb²/12
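The limit-of-a-sum idea behind this Task can be tried out numerically. The Python sketch below is illustrative only: the function name, the values M = 3 and b = 2, and the number of strips are assumptions chosen for the example. It adds the contributions ρbx² δx of thin vertical strips and compares the total with Mb²/12.

```python
def square_lamina_inertia(M, b, n=100_000):
    """Sum rho*b*x^2*dx over vertical strips of a square lamina of side b."""
    rho = M / b**2                         # mass per unit area
    dx = b / n
    total = 0.0
    for i in range(n):
        x = -b / 2 + (i + 0.5) * dx        # distance of the strip from the y-axis
        total += rho * b * x**2 * dx       # strip mass times distance squared
    return total


M, b = 3.0, 2.0                            # illustrative mass and side length
I_numeric = square_lamina_inertia(M, b)
I_formula = M * b**2 / 12
print(I_numeric, I_formula)
```

With these values both numbers should be very close to 1 kg m².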
Task
Find the moment of inertia of a circular disc of mass M and radius a about an
axis passing through its centre and perpendicular to the disc.
[Figure: a circular disc of radius a with centre O. A circular disc rotating about an axis through O.]
The figure above shows the disc lying in the plane of the paper. Because of the circular symmetry
the disc is divided into concentric rings of width δr. A typical ring is shown below. Note that each
point on the ring is approximately the same distance from the axis of rotation.
[Figure: the disc of radius a with a typical ring of radius r and width δr. The lamina is divided into many circular rings.]

The ring has radius r and inner circumference 2πr. Imagine cutting the ring and opening it up. Its
area will be approximately that of a long thin rectangle of length 2πr and width δr. Given that ρ is
the mass per unit area write down an expression for the mass of the ring:
Your solution
Answer
2πrρδr
The moment of inertia of the ring about O is its mass multiplied by the square of its distance from
the axis of rotation. This is (2πrρδr) × r² = 2πr³ρδr.
The contribution from all rings must be summed. This gives rise to the sum
\[ \sum_{r=0}^{r=a} 2\pi r^3 \rho\,\delta r \]
Note the way that the limits have been chosen so that all rings are included in the sum. As δr → 0
the limit of the sum defines the integral
\[ \int_0^a 2\rho\pi r^3\,dr \]
Evaluate this integral to give the moment of inertia I:

Your solution
Answer
\[ I = \left[\frac{2\rho\pi r^4}{4}\right]_0^a = \frac{\rho\pi a^4}{2} \]
Write down the radius and area of the whole disc:

Your solution
Answer
a, πa²
With ρ as the mass per unit area, write down the mass of the disc M :
Your solution
M=
Answer
M = πa²ρ
Finally express I in terms of M and a:
Your solution
I=
Answer
I = Ma²/2
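The ring decomposition can be checked in the same way as the strip decomposition of the square lamina. In the Python sketch below the function name, the values M = 2 and a = 3, and the number of rings are assumptions made for illustration; each ring contributes 2πr³ρ δr.

```python
import math


def disc_inertia(M, a, n=100_000):
    """Sum 2*pi*r^3*rho*dr over concentric rings of a disc of radius a."""
    rho = M / (math.pi * a**2)             # mass per unit area
    dr = a / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr                 # radius of the ring
        ring_mass = 2.0 * math.pi * r * rho * dr
        total += ring_mass * r**2          # ring mass times distance squared
    return total


M, a = 2.0, 3.0                            # illustrative mass and radius
I_numeric = disc_inertia(M, a)
I_formula = M * a**2 / 2
print(I_numeric, I_formula)
```

The summed value should agree closely with Ma²/2, which is 9 kg m² for these values.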
Exercises
1. The moment of inertia about a diameter of a sphere of radius 1 m and mass 1 kg is found by
evaluating the integral
\[ \frac{3}{8}\int_{-1}^{1} (1 - x^2)^2\,dx. \]
Show that this moment of inertia is 0.4 kg m².
2. Find the moment of inertia of the square lamina below about one of its sides.
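[Figure: the square lamina of side b centred on the origin O, extending from −b/2 to b/2 in both directions, with a typical vertical strip of width δx at position x.]
3. Calculate the moment of inertia of a uniform thin rod of mass M and length ℓ about a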
perpendicular axis of rotation at its end.
4. Calculate the moment of inertia of the rod in Exercise 3 about an axis through its centre and
perpendicular to the rod.
5. The parallel axis theorem states that the moment of inertia about any axis is equal to the
moment of inertia about a parallel axis through the centre of mass, plus the mass of the body
× the square of the distance between the two axes. Verify this theorem for the rod in Exercise
3 and Exercise 4.
6. The perpendicular axis theorem applies to a lamina lying in the xy plane. It states that the
moment of inertia of the lamina about the z-axis is equal to the sum of the moments of inertia
about the x- and y-axes. Suppose that a thin circular disc of mass M and radius a lies in the
xy plane and the z axis passes through its centre. The moment of inertia of the disc about
this axis is ½Ma².
(a) Use this theorem to find the moment of inertia of the disc about the x and y axes.
(b) Use the parallel axis theorem to find the moment of inertia of the disc about a tangential
axis parallel to the plane of the disc.
Answers
2. Mb²/3.   3. (1/3)Mℓ².   4. (1/12)Mℓ².
6. (a) The moments of inertia about the x and y axes must be the same by symmetry, and are equal
to 0.25 Ma².
(b) 1.25 Ma².
Contents 16
Sequences and Series
16.1 Sequences and Series 2
16.2 Infinite Series 13
16.3 The Binomial Series 26
16.4 Power Series 32
16.5 Maclaurin and Taylor Series 40
Learning outcomes
In this Workbook you will learn about sequences and series. You will learn about arithmetic
and geometric series and also about infinite series. You will learn how to test for the
convergence of an infinite series. You will then learn about power series; in particular you
will study the binomial series. Finally you will apply your knowledge of power series
to the process of finding series expansions of functions of a single variable. You will be
able to find the Maclaurin and Taylor series expansions of simple functions about a point
of interest.

Sequences and Series 16.1
Introduction
In this Section we develop the ground work for later Sections on infinite series and on power series.
We begin with simple sequences of numbers and with finite series of numbers. We introduce the
summation notation for the description of series. Finally, we consider arithmetic and geometric series
and obtain expressions for the sum of n terms of both types of series.

Prerequisites
Before starting this Section you should . . .
• understand and be able to use the basic rules of algebra
• be able to find limits of algebraic expressions

Learning Outcomes
• check if a sequence of numbers is convergent
• use the summation notation to specify series
• recognise arithmetic and geometric series and find their sums
1. Introduction
A sequence is any succession of numbers. For example the sequence
1, 1, 2, 3, 5, 8, . . .
which is known as the Fibonacci sequence, is formed by adding two consecutive terms together to
obtain the next term. The numbers in this sequence continually increase without bound and we say
this sequence diverges. An example of a convergent sequence is the harmonic sequence
1, 1/2, 1/3, 1/4, . . .
Here we see the magnitude of these numbers continually decrease and it is obvious that the sequence
converges to the number zero. The related alternating harmonic sequence
1, −1/2, 1/3, −1/4, . . .
is also convergent to the number zero. Whether or not a sequence is convergent is often easy to
deduce by graphing the individual terms. The diagrams in Figure 1 show how the individual terms
of the harmonic and alternating harmonic sequences behave as the number of terms increases.
[Figure 1: two plots of term value against term number (1 to 5). Harmonic: the terms 1, 1/2, 1/3, 1/4, . . . decrease towards zero. Alternating harmonic: the terms 1, −1/2, 1/3, −1/4, . . . alternate in sign with decreasing magnitude.]
Figure 1
Task
Graph the sequence:
1, −1, 1, −1, . . .
Is this convergent?
Your solution
Answer
[Graph: term value against term number (1 to 5); the terms alternate between 1 and −1.]
Not convergent.
The terms in the sequence do not converge to a particular value. The value oscillates.
A general sequence is denoted by

a_1, a_2, . . . , a_n, . . .
in which a_1 is the first term, a_2 is the second term and a_n is the nth term in the sequence. For
example, in the harmonic sequence
a_1 = 1, a_2 = 1/2, . . . , a_n = 1/n
whilst for the alternating harmonic sequence the nth term is:
\[ a_n = \frac{(-1)^{n+1}}{n} \]
in which (−1)^n = +1 if n is an even number and (−1)^n = −1 if n is an odd number.
Key Point 1
The sequence a_1, a_2, . . . , a_n, . . . is said to be convergent if the limit of a_n as n increases
can be found. (Mathematically we say that $\lim_{n\to\infty} a_n$ exists.)
If the sequence is not convergent it is said to be divergent.
Task
Verify that the following sequence is convergent
\[ \frac{3}{1\times 2},\ \frac{4}{2\times 3},\ \frac{5}{3\times 4},\ \ldots \]
First find the expression for the nth term:

Your solution
Answer
\[ a_n = \frac{n+2}{n(n+1)} \]
Now find the limit of an as n increases:
Your solution
Answer
\[ \frac{n+2}{n(n+1)} = \frac{1 + \frac{2}{n}}{n+1} \to \frac{1}{n+1} \to 0 \quad\text{as } n \text{ increases} \]
Hence the sequence is convergent.
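Tabulating a few terms gives a quick, informal indication of this behaviour (it is not a proof of convergence). The Python sketch below is an illustration; the function name `a` and the sample values of n are assumptions.

```python
def a(n):
    """nth term of the sequence 3/(1x2), 4/(2x3), 5/(3x4), ..."""
    return (n + 2) / (n * (n + 1))


# Print the term for increasingly large n
for n in (1, 2, 3, 10, 100, 1000, 10_000):
    print(n, a(n))
```

The printed terms decrease steadily towards zero, consistent with the limit found above.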
2. Arithmetic and geometric progressions

Consider the sequences:
1, 4, 7, 10, . . . and 3, 1, −1, −3, . . .
In both, any particular term is obtained from the previous term by the addition of a constant value (3
and −2 respectively). Each of these sequences is said to be an arithmetic sequence or arithmetic
progression and has general form:
a, a + d, a + 2d, a + 3d, . . . , a + (n − 1)d, . . .
in which a, d are given numbers. In the first example above a = 1, d = 3 whereas, in the second
example, a = 3, d = −2. The difference between any two successive terms of a given arithmetic
sequence gives the value of d which is called the common difference.
Two sequences which are not arithmetic sequences are:
1, 2, 4, 8, . . .
−1, −1/3, −1/9, −1/27, . . .
In each case a particular term is obtained from the previous term by multiplying by a constant factor
(2 and 1/3 respectively). Each is an example of a geometric sequence or geometric progression
with the general form:
a, ar, ar², ar³, . . .
where ‘a’ is the first term and r is called the common ratio, being the ratio of two successive terms.
In the first geometric sequence above a = 1, r = 2 and in the second geometric sequence a = −1,
r = 1/3.
Task
Find a, d for the arithmetic sequence 3, 9, 15, . . .
Your solution
a= d=
Answer
a = 3, d = 6
Task
Find a, r for the geometric sequence 8, 8/7, 8/49, . . .
Your solution
a= r=
Answer
a = 8, r = 1/7
Task
Write out the first four terms of the geometric sequence with a = 4, r = −2.
Your solution
Answer
4, −8, 16, −32, . . .
The reader should note that many sequences (for example the harmonic sequences) are neither
arithmetic nor geometric.
3. Series
A series is the sum of the terms of a sequence. For example, the harmonic series is
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots \]
and the alternating harmonic series is
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
The summation notation

If we consider a general sequence
a_1, a_2, . . . , a_n, . . .
then the sum of the first k terms a_1 + a_2 + a_3 + · · · + a_k is concisely denoted by $\sum_{p=1}^{k} a_p$.
That is,
\[ a_1 + a_2 + a_3 + \cdots + a_k = \sum_{p=1}^{k} a_p \]
When we encounter the expression $\sum_{p=1}^{k} a_p$ we let the index ‘p’ in the term a_p take, in turn, the values
1, 2, . . . , k and then add all these terms together. So, for example
\[ \sum_{p=1}^{3} a_p = a_1 + a_2 + a_3 \qquad \sum_{p=2}^{7} a_p = a_2 + a_3 + a_4 + a_5 + a_6 + a_7 \]
Note that p is a dummy index; any letter could be used as the index. For example $\sum_{i=1}^{6} a_i$ and
$\sum_{m=1}^{6} a_m$ each represent the same collection of terms: a_1 + a_2 + a_3 + a_4 + a_5 + a_6.
In order to be able to use this ‘summation notation’ we need to obtain a suitable expression for the
‘typical term’ in the series. For example, the finite series
1² + 2² + · · · + k²
may be written as $\sum_{p=1}^{k} p^2$ since the typical term is clearly p² in which p = 1, 2, 3, . . . , k in turn.
In the same way
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots - \frac{1}{16} = \sum_{p=1}^{16} \frac{(-1)^{p+1}}{p} \]
since an expression for the typical term in this alternating harmonic series is a_p = (−1)^{p+1}/p.
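The summation notation corresponds closely to a loop over the index p. As an informal illustration, the Python sketch below evaluates the two sums just described with the built-in `sum` and exact fractions from the standard library; the function names are assumptions introduced for this example.

```python
from fractions import Fraction


def sum_of_squares(k):
    """1^2 + 2^2 + ... + k^2, i.e. the sum of p^2 for p = 1, ..., k."""
    return sum(p**2 for p in range(1, k + 1))


def alternating_harmonic(k):
    """Sum of (-1)^(p+1) / p for p = 1, ..., k, as an exact fraction."""
    return sum(Fraction((-1) ** (p + 1), p) for p in range(1, k + 1))


print(sum_of_squares(3))                   # 1 + 4 + 9
print(alternating_harmonic(4))             # 1 - 1/2 + 1/3 - 1/4
print(float(alternating_harmonic(16)))     # the 16-term series above
```

Note that `range(1, k + 1)` runs over p = 1, 2, . . . , k, matching the lower and upper limits written on the Σ sign.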
Task
Write in summation form the series
\[ \frac{1}{1\times 2} + \frac{1}{2\times 3} + \frac{1}{3\times 4} + \cdots + \frac{1}{21\times 22} \]
First find an expression for the typical term, “the pth term”:
Your solution
a_p =
Answer
\[ a_p = \frac{1}{p(p+1)} \]
Now write the series in summation form:

Your solution
Answer
\[ \frac{1}{1\times 2} + \frac{1}{2\times 3} + \cdots + \frac{1}{21\times 22} = \sum_{p=1}^{21} \frac{1}{p(p+1)} \]
Task
Write out all the terms of the series $\sum_{p=1}^{5} \frac{(-1)^p}{(p+1)^2}$.
Give p the values 1, 2, 3, 4, 5 in the typical term $\frac{(-1)^p}{(p+1)^2}$:
Your solution
Answer
\[ -\frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \frac{1}{5^2} - \frac{1}{6^2}. \]
4. Summing series
The arithmetic series

Consider the finite arithmetic series with 14 terms
1 + 3 + 5 + · · · + 23 + 25 + 27
A simple way of working out the value of the sum is to create a second series which is the first written
in reverse order. Thus we have two series, each with the same value A:
A = 1 + 3 + 5 + · · · + 23 + 25 + 27
and
A = 27 + 25 + 23 + · · · + 5 + 3 + 1
Now, adding the terms of these series in pairs
2A = 28 + 28 + 28 + · · · + 28 + 28 + 28 = 28 × 14 = 392 so A = 196.
We can use this approach to find the sum of n terms of a general arithmetic series.
If
A = [a] + [a + d] + [a + 2d] + · · · + [a + (n − 2)d] + [a + (n − 1)d]
then again simply writing the terms in reverse order:
A = [a + (n − 1)d] + [a + (n − 2)d] + · · · + [a + 2d] + [a + d] + [a]
Adding these two identical equations together we have
2A = [2a + (n − 1)d] + [2a + (n − 1)d] + · · · + [2a + (n − 1)d]
That is, every one of the n terms on the right-hand side has the same value: [2a + (n − 1)d]. Hence
\[ 2A = n[2a + (n-1)d] \quad\text{so}\quad A = \frac{1}{2}\,n[2a + (n-1)d]. \]
Key Point 2
The arithmetic series
[a] + [a + d] + [a + 2d] + · · · + [a + (n − 1)d]
having n terms has sum A where:

\[ A = \frac{1}{2}\,n[2a + (n-1)d] \]
As an example
1 + 3 + 5 + · · · + 27 has a = 1, d = 2, n = 14
So A = 1 + 3 + · · · + 27 = (14/2)[2 + (13)2] = 196.
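The formula of Key Point 2 is easy to test against a term-by-term sum. The Python sketch below is illustrative; the two function names are assumptions introduced for the comparison.

```python
def arithmetic_sum(a, d, n):
    """Sum of n terms of a, a+d, a+2d, ... using A = n[2a + (n-1)d]/2."""
    return n * (2 * a + (n - 1) * d) / 2


def arithmetic_sum_direct(a, d, n):
    """The same sum obtained by adding the n terms one at a time."""
    return sum(a + k * d for k in range(n))


# The worked example: 1 + 3 + 5 + ... + 27 has a = 1, d = 2, n = 14
print(arithmetic_sum(1, 2, 14), arithmetic_sum_direct(1, 2, 14))
```

Both approaches should give 196 for the worked example.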
The geometric series

We can also sum a general geometric series.
Let
\[ G = a + ar + ar^2 + \cdots + ar^{n-1} \]
be a geometric series having exactly n terms. To obtain the value of G in a more convenient form
we first multiply through by the common ratio r:
\[ rG = ar + ar^2 + ar^3 + \cdots + ar^n \]
Now, writing the two series together:
\[ G = a + ar + ar^2 + \cdots + ar^{n-1} \]
\[ rG = ar + ar^2 + ar^3 + \cdots + ar^{n-1} + ar^n \]
Subtracting the second expression from the first we see that all terms on the right-hand side cancel
out, except for the first term of the first expression and the last term of the second expression so
that
\[ G - rG = (1 - r)G = a - ar^n \]
Hence (assuming r ≠ 1)
\[ G = \frac{a(1 - r^n)}{1 - r} \]
(Of course, if r = 1 the geometric series simplifies to a simple arithmetic series with d = 0 and has
sum G = na.)
Key Point 3
The geometric series
\[ a + ar + ar^2 + \cdots + ar^{n-1} \]
having n terms has sum G where
\[ G = \frac{a(1 - r^n)}{1 - r}, \ \text{if } r \neq 1 \quad\text{and}\quad G = na, \ \text{if } r = 1 \]
Task
Find the sum of each of the following series:
(a) 1 + 2 + 3 + 4 + · · · + 100
(b) 1/2 + 1/6 + 1/18 + 1/54 + 1/162 + 1/486
(a) In this arithmetic series state the values of a, d, n:

Your solution
a= d= n=
Answer
a = 1, d = 1, n = 100.
Now find the sum:

Your solution
1 + 2 + 3 + · · · + 100 =
Answer
1 + 2 + 3 + · · · + 100 = 50(2 + 99) = 50(101) = 5050.
(b) In this geometric series state the values of a, r, n:

Your solution
a= r= n=
Answer
a = 1/2, r = 1/3, n = 6
Now find the sum:

Your solution
1/2 + 1/6 + 1/18 + 1/54 + 1/162 + 1/486 =
Answer
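\[ \frac{1}{2} + \frac{1}{6} + \cdots + \frac{1}{486} = \frac{\frac{1}{2}\left(1 - \left(\frac{1}{3}\right)^6\right)}{1 - \frac{1}{3}} = \frac{3}{4}\left(1 - \left(\frac{1}{3}\right)^6\right) = 0.74897 \]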
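Part (b) can be confirmed with exact arithmetic. The Python sketch below applies the formula of Key Point 3 using the standard-library `Fraction` type; the function name `geometric_sum` is an assumption made for this illustration.

```python
from fractions import Fraction


def geometric_sum(a, r, n):
    """Sum of the n terms a + ar + ... + ar^(n-1), as in Key Point 3."""
    if r == 1:
        return n * a
    return a * (1 - r**n) / (1 - r)


a, r, n = Fraction(1, 2), Fraction(1, 3), 6
G = geometric_sum(a, r, n)
direct = sum(a * r**k for k in range(n))   # add the six terms one at a time

print(G, direct, round(float(G), 5))
```

The formula and the term-by-term sum give the same fraction, whose decimal value rounds to 0.74897.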
Exercises
1. Which of the following sequences is convergent?
(a) sin(π/2), sin(2π/2), sin(3π/2), sin(4π/2), . . .
(b) sin(π/2)/(π/2), sin(2π/2)/(2π/2), sin(3π/2)/(3π/2), sin(4π/2)/(4π/2), . . .
2. Write the following series in summation form:
(a) \[ \frac{\ln 1}{2\times 1} + \frac{\ln 3}{3\times 2} + \frac{\ln 5}{4\times 3} + \cdots + \frac{\ln 27}{15\times 14} \]
(b) \[ -\frac{1}{2\times(1 + (100)^2)} + \frac{1}{3\times(1 - (200)^2)} - \frac{1}{4\times(1 + (300)^2)} + \cdots + \frac{1}{9\times(1 - (800)^2)} \]
3. Write out the first three terms and the last term of the following series:
(a) $\sum_{p=1}^{17} \frac{3^{p-1}}{p!\,(18 - p)}$   (b) $\sum_{p=4}^{17} \frac{(-p)^{p+1}}{p(2 + p)}$
4. Sum the series:

(a) −5 − 1 + 3 + 7 + . . . + 27
(b) −5 − 9 − 13 − 17 − . . . − 37
(c) 1/2 − 1/6 + 1/18 − 1/54 + 1/162 − 1/486
Answers
1. (a) no; this sequence is 1, 0, −1, 0, 1, . . . which does not converge.
(b) yes; this sequence is 1/(π/2), 0, −1/(3π/2), 0, 1/(5π/2), . . . which converges to zero.
2. (a) $\sum_{p=1}^{14} \frac{\ln(2p - 1)}{(p + 1)(p)}$   (b) $\sum_{p=1}^{8} \frac{(-1)^p}{(p + 1)(1 + (-1)^{p+1} p^2\, 10^4)}$
3. (a) 1/17, 3/(2!(16)), 3²/(3!(15)), . . . , 3¹⁶/17!   (b) −4⁵/((4)(6)), 5⁶/((5)(7)), −6⁷/((6)(8)), . . . , 17¹⁸/((17)(19))
4. (a) This is an arithmetic series with a = −5, d = 4, n = 9. A = 99
(b) This is an arithmetic series with a = −5, d = −4, n = 9. A = −189
(c) This is a geometric series with a = 1/2, r = −1/3, n = 6. G ≈ 0.3745

Infinite Series 16.2

Introduction
We extend the concept of a finite series, met in Section 16.1, to the situation in which the number
of terms increases without bound. We define what is meant by an infinite series being convergent
by considering the partial sums of the series. As prime examples of infinite series we examine the
harmonic and the alternating harmonic series and show that the former is divergent and the latter is
convergent.
We consider various tests for the convergence of series, in particular we introduce the ratio test which
is a test applicable to series of positive terms. Finally we define the meaning of the terms absolute
and conditional convergence.
Prerequisites
• be able to use the summation notation Σ
• be familiar with the properties of limits
• be able to use inequalities

Learning Outcomes
On completion you should be able to . . .
• use the alternating series test on infinite series
• use the ratio test on infinite series
• understand the terms absolute and conditional convergence
1. Introduction
Many of the series considered in Section 16.1 were examples of finite series in that they all involved
the summation of a finite number of terms. When the number of terms in the series increases without
bound we refer to the sum as an infinite series. Of particular concern with infinite series is whether
they are convergent or divergent. For example, the infinite series
1 + 1 + 1 + 1 + ···
is clearly divergent because the sum of the first n terms increases without bound as more and more
terms are taken. It is less clear as to whether the harmonic and alternating harmonic series:
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots \qquad\qquad 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
converge or diverge. Indeed you may be surprised to find that the first is divergent and the second is
convergent. What we shall do in this Section is to consider some simple convergence tests for infinite
series. Although we all have an intuitive idea as to the meaning of convergence of an infinite series
we must be more precise in our approach. We need a definition for convergence which we can apply
rigorously.
First, using an obvious extension of the notation we have used for a finite sum of terms, we denote
the infinite series:
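\[ a_1 + a_2 + a_3 + \cdots + a_p + \cdots \quad\text{by the expression}\quad \sum_{p=1}^{\infty} a_p \]
where a_p is an expression for the pth term in the series. So, as examples:
\[ 1 + 2 + 3 + \cdots = \sum_{p=1}^{\infty} p \qquad\text{since the pth term is } a_p \equiv p \]
\[ 1^2 + 2^2 + 3^2 + \cdots = \sum_{p=1}^{\infty} p^2 \qquad\text{since the pth term is } a_p \equiv p^2 \]
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \sum_{p=1}^{\infty} \frac{(-1)^{p+1}}{p} \qquad\text{here } a_p \equiv \frac{(-1)^{p+1}}{p} \]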
Consider the infinite series:

\[ a_1 + a_2 + \cdots + a_p + \cdots = \sum_{p=1}^{\infty} a_p \]
We consider the sequence of partial sums, S_1, S_2, . . . , of this series where
S_1 = a_1
S_2 = a_1 + a_2
. . .
S_n = a_1 + a_2 + · · · + a_n
That is, S_n is the sum of the first n terms of the infinite series. If the limit of the sequence
S_1, S_2, . . . , S_n, . . . can be found; that is
\[ \lim_{n\to\infty} S_n = S \quad\text{(say)} \]
then we define the sum of the infinite series to be S:
\[ S = \sum_{p=1}^{\infty} a_p \]
and we say “the series converges to S”. Another way of stating this is to say that
\[ \sum_{p=1}^{\infty} a_p = \lim_{n\to\infty} \sum_{p=1}^{n} a_p \]
Key Point 4
Convergence of Infinite Series
An infinite series $\sum_{p=1}^{\infty} a_p$ is convergent if the sequence of partial sums
S_1, S_2, S_3, . . . , S_k, . . . in which $S_k = \sum_{p=1}^{k} a_p$ is convergent
Divergence condition for an infinite series

An almost obvious requirement that an infinite series should be convergent is that the individual
terms in the series should get smaller and smaller. This leads to the following Key Point:
Key Point 5
The condition:
a_p → 0 as p increases (mathematically $\lim_{p\to\infty} a_p = 0$)
is a necessary condition for the convergence of the series $\sum_{p=1}^{\infty} a_p$.
It is not possible for an infinite series to be convergent unless this condition holds.
Task
Which of the following series cannot be convergent?
(a) 1/2 + 2/3 + 3/4 + · · ·
(b) 1 + 1/2 + 1/3 + 1/4 + · · ·
(c) 1 − 1/2 + 1/3 − 1/4 + · · ·
In each case, use the condition from Key Point 5:

Your solution
(a) a_p =          lim_{p→∞} a_p =
Answer
\[ a_p = \frac{p}{p+1}, \qquad \lim_{p\to\infty} \frac{p}{p+1} = 1 \]
Since this limit is not zero, the series is divergent.
Your solution
(b) a_p =          lim_{p→∞} a_p =
Answer
\[ a_p = \frac{1}{p}, \qquad \lim_{p\to\infty} a_p = 0 \]
So this series may be convergent. Whether it is or not requires further testing.
Your solution
(c) a_p =          lim_{p→∞} a_p =
Answer
\[ a_p = \frac{(-1)^{p+1}}{p}, \qquad \lim_{p\to\infty} a_p = 0 \]
so again this series may be convergent.
Divergence of the harmonic series

The harmonic series:
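\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots \]
has a general term a_n = 1/n which clearly gets smaller and smaller as n → ∞. However, surprisingly,
the series is divergent. Its divergence is demonstrated by showing that the harmonic series is greater
than another series which is obviously divergent. We do this by grouping the terms of the harmonic
series in a particular way:
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots \equiv 1 + \left(\frac{1}{2}\right) + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \cdots \]
Now
\[ \frac{1}{3} + \frac{1}{4} > \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \]
\[ \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2} \]
\[ \frac{1}{9} + \frac{1}{10} + \frac{1}{11} + \frac{1}{12} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \frac{1}{16} > 8 \times \frac{1}{16} = \frac{1}{2} \]
and so on. Hence the harmonic series satisfies:
\[ 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \cdots > 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots \]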
The right-hand side of this inequality is clearly divergent so the harmonic series is divergent.
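The grouping argument implies that the partial sum of the first $2^k$ terms is at least $1 + k/2$, so the partial sums grow without bound, although very slowly. The sketch below checks this numerically; the function name is a hypothetical helper introduced for illustration.

```python
def harmonic_partial_sum(n):
    """Sum of the first n terms 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1 / k for k in range(1, n + 1))

# Compare S_(2^k) with the lower bound 1 + k/2 from the grouping argument.
for k in range(1, 11):
    n = 2**k
    print(n, round(harmonic_partial_sum(n), 4), 1 + k / 2)
```

Each doubling of the number of terms adds more than 1/2 to the sum, which is exactly what the bracketed groups show.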
Convergence of the alternating harmonic series

As with the harmonic series we shall group the terms of the alternating harmonic series, this time to
display its convergence.
The alternating harmonic series is:
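\[ S = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots \]
This series may be re-grouped in two distinct ways.
1st re-grouping
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots = 1 - \left(\frac{1}{2} - \frac{1}{3}\right) - \left(\frac{1}{4} - \frac{1}{5}\right) - \left(\frac{1}{6} - \frac{1}{7}\right) - \cdots \]
Each term in brackets is positive since $\frac{1}{2} > \frac{1}{3}$, $\frac{1}{4} > \frac{1}{5}$ and so on. So we easily conclude that $S < 1$
since we are subtracting only positive numbers from 1.
2nd re-grouping
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \left(\frac{1}{5} - \frac{1}{6}\right) + \cdots \]
Again, each term in brackets is positive since $1 > \frac{1}{2}$, $\frac{1}{3} > \frac{1}{4}$, $\frac{1}{5} > \frac{1}{6}$ and so on.
So we can also argue that $S > \frac{1}{2}$ since we are adding only positive numbers to the value of the first
term, $\frac{1}{2}$. The conclusion that is forced upon us is that
\[ \frac{1}{2} < S < 1 \]
so the alternating series is convergent, with its sum, $S$, lying in the range $\frac{1}{2}$ to 1. (Strictly, a bound on its own does not prove convergence. The two re-groupings also show that the partial sums with an even number of terms increase, those with an odd number of terms decrease, and the difference between consecutive partial sums tends to zero, so both approach the same limit $S$.) It will be shown in
Section 16.5 that $S = \ln 2 \simeq 0.693$.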
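The bounds on the alternating harmonic series, and the quoted value $\ln 2$, can be checked numerically. This is a minimal sketch with a hypothetical helper name; it illustrates the result rather than proving it.

```python
import math

def alternating_harmonic_partial_sum(n):
    """Partial sum 1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (p + 1) / p for p in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    print(n, alternating_harmonic_partial_sum(n))
print("ln 2 =", math.log(2))
```

The partial sums approach $\ln 2$ slowly: the error after $n$ terms is roughly $1/(2n)$.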
HELM (2006): 17
2. General tests for convergence
The techniques we have applied to analyse the harmonic and the alternating harmonic series are
‘one-off’: they cannot be applied to infinite series in general. However, there are many tests that
can be used to determine the convergence properties of infinite series. Of the large number available
we shall consider only two such tests in detail.
The alternating series test

An alternating series is a special type of series in which the sign changes from one term to the next.
It has the form
\[ a_1 - a_2 + a_3 - a_4 + \cdots \]
(in which each $a_i$, $i = 1, 2, 3, \ldots$ is a positive number)
Examples are:
(a) $1 - 1 + 1 - 1 + 1 \cdots$
(b) $\dfrac{1}{3} - \dfrac{2}{4} + \dfrac{3}{5} - \dfrac{4}{6} + \cdots$
(c) $1 - \dfrac{1}{2} + \dfrac{1}{3} - \dfrac{1}{4} + \cdots$.
For series of this type there is a simple criterion for convergence:
Key Point 6
The Alternating Series Test
The alternating series
\[ a_1 - a_2 + a_3 - a_4 + \cdots \]
(in which each $a_i$, $i = 1, 2, 3, \ldots$ is a positive number) is convergent if
• the terms continually decrease:
\[ a_1 > a_2 > a_3 > \ldots \]
• the terms decrease to zero:
\[ a_p \to 0 \ \text{ as } p \text{ increases} \quad \Bigl(\text{mathematically } \lim_{p\to\infty} a_p = 0\Bigr) \]
Task
Which of the following series are convergent?
(a) $\displaystyle\sum_{p=1}^{\infty} (-1)^p \frac{(2p-1)}{(2p+1)}$          (b) $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^{p+1}}{p^2}$
(a) First, write out the series:

Your solution
Answer
\[ -\frac{1}{3} + \frac{3}{5} - \frac{5}{7} + \cdots \]
Now examine the series for convergence:
Your solution
Answer
\[ \frac{(2p-1)}{(2p+1)} = \frac{\left(1 - \dfrac{1}{2p}\right)}{\left(1 + \dfrac{1}{2p}\right)} \to 1 \ \text{ as } p \text{ increases.} \]
Since the individual terms of the series do not converge to zero this is therefore a divergent series.
(b) Apply the procedure used in (a) to problem (b):

Your solution
Answer
This series $1 - \dfrac{1}{2^2} + \dfrac{1}{3^2} - \dfrac{1}{4^2} + \cdots$ is an alternating series of the form $a_1 - a_2 + a_3 - a_4 + \cdots$ in
which $a_p = \dfrac{1}{p^2}$. The $a_p$ sequence is a decreasing sequence since $1 > \dfrac{1}{2^2} > \dfrac{1}{3^2} > \ldots$
Also $\lim_{p\to\infty} \dfrac{1}{p^2} = 0$. Hence the series is convergent by the alternating series test.
3. The ratio test
This test, which is one of the most useful and widely used convergence tests, applies only to series
of positive terms.
Key Point 7
The Ratio Test

Let $\sum_{p=1}^{\infty} a_p$ be a series of positive terms such that, as $p$ increases, the limit of $\dfrac{a_{p+1}}{a_p}$ equals
a number $\lambda$. That is $\lim_{p\to\infty} \dfrac{a_{p+1}}{a_p} = \lambda$.
It can be shown that:

• if $\lambda > 1$, then $\sum_{p=1}^{\infty} a_p$ diverges
• if $\lambda < 1$, then $\sum_{p=1}^{\infty} a_p$ converges
• if $\lambda = 1$, then $\sum_{p=1}^{\infty} a_p$ may converge or diverge.
That is, the test is inconclusive in this case.
Example 1
Use the ratio test to examine the convergence of the series
(a) $1 + \dfrac{1}{2!} + \dfrac{1}{3!} + \dfrac{1}{4!} + \cdots$          (b) $1 + x + x^2 + x^3 + \cdots$
Solution
(a) The general term in this series is $\dfrac{1}{p!}$ i.e.
\[ 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots = \sum_{p=1}^{\infty} \frac{1}{p!} \qquad a_p = \frac{1}{p!} \quad \therefore \quad a_{p+1} = \frac{1}{(p+1)!} \]
and the ratio
\[ \frac{a_{p+1}}{a_p} = \frac{p!}{(p+1)!} = \frac{p(p-1)\ldots(3)(2)(1)}{(p+1)p(p-1)\ldots(3)(2)(1)} = \frac{1}{(p+1)} \]
\[ \therefore \quad \lim_{p\to\infty} \frac{a_{p+1}}{a_p} = \lim_{p\to\infty} \frac{1}{(p+1)} = 0 \]
Since $0 < 1$ the series is convergent. In fact, it will be easily shown, using the techniques outlined
in 16.5, that
\[ 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots = e - 1 \approx 1.718 \]
(b) Here we must assume that $x > 0$ since we can only apply the ratio test to a series of positive
terms.
Now
\[ 1 + x + x^2 + x^3 + \cdots = \sum_{p=1}^{\infty} x^{p-1} \]
so that
\[ a_p = x^{p-1}, \qquad a_{p+1} = x^p \]
and
\[ \lim_{p\to\infty} \frac{a_{p+1}}{a_p} = \lim_{p\to\infty} \frac{x^p}{x^{p-1}} = \lim_{p\to\infty} x = x \]
Thus, using the ratio test we deduce that (if $x$ is a positive number) this series will only converge
if $x < 1$.
We will see in Section 16.4 that
\[ 1 + x + x^2 + x^3 + \cdots = \frac{1}{1-x} \qquad \text{provided } 0 < x < 1. \]
Task
Use the ratio test to examine the convergence of the series:
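\[ \frac{1}{\ln 3} + \frac{8}{(\ln 3)^2} + \frac{27}{(\ln 3)^3} + \cdots \]
First, find the general term of the series:

Your solution
$a_p =$
Answer
\[ \frac{1}{\ln 3} + \frac{8}{(\ln 3)^2} + \cdots = \sum_{p=1}^{\infty} \frac{p^3}{(\ln 3)^p} \quad \text{so} \quad a_p = \frac{p^3}{(\ln 3)^p} \]
Now find $a_{p+1}$:

Your solution
$a_{p+1} =$
Answer
\[ a_{p+1} = \frac{(p+1)^3}{(\ln 3)^{p+1}} \]
Finally, obtain $\lim_{p\to\infty} \dfrac{a_{p+1}}{a_p}$:
Your solution
$\dfrac{a_{p+1}}{a_p} =$          $\therefore \ \lim_{p\to\infty} \dfrac{a_{p+1}}{a_p} =$
Answer
\[ \frac{a_{p+1}}{a_p} = \left(\frac{p+1}{p}\right)^3 \frac{1}{(\ln 3)}. \quad \text{Now} \quad \left(\frac{p+1}{p}\right)^3 = \left(1 + \frac{1}{p}\right)^3 \to 1 \ \text{ as } p \text{ increases} \]
\[ \therefore \quad \lim_{p\to\infty} \frac{a_{p+1}}{a_p} = \frac{1}{(\ln 3)} < 1 \]
Hence this is a convergent series.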
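The limiting ratio found in this Task can be checked numerically. The sketch below evaluates $a_{p+1}/a_p$ for increasing $p$; the function names `term` and `ratio` are hypothetical helpers, not part of the text.

```python
import math

def term(p):
    """General term a_p = p**3 / (ln 3)**p of the series in the Task."""
    return p**3 / math.log(3) ** p

def ratio(p):
    """Ratio a_(p+1) / a_p of successive terms."""
    return term(p + 1) / term(p)

for p in (1, 10, 100, 1000):
    print(p, ratio(p))
print("1 / ln 3 =", 1 / math.log(3))
```

For small $p$ the ratio exceeds 1, so the early terms actually grow; the ratio test depends only on the limiting value $1/\ln 3 \approx 0.91$, which is less than 1.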
Note that in all of these Examples and Tasks we have decided upon the convergence or divergence of
various series; we have not been able to use the tests to discover what actual number the convergent
series converges to.
4. Absolute and conditional convergence

The ratio test applies to series of positive terms. Indeed this is true of many related tests for
convergence. However, as we have seen, not all series are series of positive terms. To apply the ratio
test such series must first be converted into series of positive terms. This is easily done. Consider
two series $\sum_{p=1}^{\infty} a_p$ and $\sum_{p=1}^{\infty} |a_p|$. The latter series, obviously directly related to the first, is a series of
positive terms.
Using imprecise language, it is harder for the second series to converge than it is for the first, since,
in the first, some of the terms may be negative and cancel out part of the contribution from the
positive terms. No such cancellations can take place in the second series since they are all positive
terms. Thus it is plausible that if $\sum_{p=1}^{\infty} |a_p|$ converges so does $\sum_{p=1}^{\infty} a_p$. This leads to the following
definitions.
Key Point 8
Conditional Convergence and Absolute Convergence
A convergent series $\sum_{p=1}^{\infty} a_p$ is said to be conditionally convergent if $\sum_{p=1}^{\infty} |a_p|$ is divergent.
A convergent series $\sum_{p=1}^{\infty} a_p$ is said to be absolutely convergent if $\sum_{p=1}^{\infty} |a_p|$ is convergent.
For example, the alternating harmonic series:

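\[ \sum_{p=1}^{\infty} \frac{(-1)^{p+1}}{p} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
is conditionally convergent since the series of positive terms (the harmonic series):
\[ \sum_{p=1}^{\infty} \left| \frac{(-1)^{p+1}}{p} \right| \equiv \sum_{p=1}^{\infty} \frac{1}{p} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots \]
is divergent.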
Task
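Show that the series $-\dfrac{1}{2!} + \dfrac{1}{4!} - \dfrac{1}{6!} + \cdots$ is absolutely convergent.
First, find the general term of the series:

Your solution
$-\dfrac{1}{2!} + \dfrac{1}{4!} - \dfrac{1}{6!} + \cdots = \displaystyle\sum_{p=1}^{\infty} (\qquad)$          $\therefore \ a_p \equiv$
Answer
\[ -\frac{1}{2!} + \frac{1}{4!} - \frac{1}{6!} + \cdots = \sum_{p=1}^{\infty} \frac{(-1)^p}{(2p)!} \quad \therefore \quad a_p \equiv \frac{(-1)^p}{(2p)!} \]
Write down an expression for the related series of positive terms:

Your solution
$\dfrac{1}{2!} + \dfrac{1}{4!} + \dfrac{1}{6!} + \cdots = \displaystyle\sum_{p=1}^{\infty} (\qquad)$          $\therefore \ a_p =$
Answer
\[ \sum_{p=1}^{\infty} \frac{1}{(2p)!} \quad \text{so} \quad a_p = \frac{1}{(2p)!} \]
Now use the ratio test to examine the convergence of this series:
Your solution
$p$th term $=$          $(p+1)$th term $=$
Answer
$p$th term $= \dfrac{1}{(2p)!}$          $(p+1)$th term $= \dfrac{1}{(2(p+1))!}$
Find $\lim_{p\to\infty} \dfrac{(p+1)\text{th term}}{p\text{th term}}$:
Your solution
$\lim_{p\to\infty} \dfrac{(p+1)\text{th term}}{p\text{th term}} =$
Answer
\[ \frac{(2p)!}{(2(p+1))!} = \frac{2p(2p-1)\ldots 1}{(2p+2)(2p+1)2p(2p-1)\ldots 1} = \frac{1}{(2p+2)(2p+1)} \to 0 \ \text{ as } p \text{ increases.} \]
So the series of positive terms is convergent by the ratio test. Hence $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^p}{(2p)!}$ is absolutely
convergent.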
Exercises
1. Which of the following alternating series are convergent?
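(a) $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^p \ln(3)}{p}$     (b) $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^{p+1}}{p^2+1}$     (c) $\displaystyle\sum_{p=1}^{\infty} \frac{p \sin\left((2p+1)\dfrac{\pi}{2}\right)}{(p+100)}$
2. Use the ratio test to examine the convergence of the series:

(a) $\displaystyle\sum_{p=1}^{\infty} \frac{e^4}{(2p+1)^{p+1}}$     (b) $\displaystyle\sum_{p=1}^{\infty} \frac{p^3}{p!}$     (c) $\displaystyle\sum_{p=1}^{\infty} \frac{1}{\sqrt{p}}$
(d) $\displaystyle\sum_{p=1}^{\infty} \frac{1}{(0.3)^p}$     (e) $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^{p+1}}{3^p}$
3. For what values of $x$ are the following series absolutely convergent?

(a) $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^p x^p}{p}$     (b) $\displaystyle\sum_{p=1}^{\infty} \frac{(-1)^p x^p}{p!}$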
Answers
1. (a) convergent, (b) convergent, (c) divergent
2. (a) $\lambda = 0$ so convergent
(b) $\lambda = 0$ so convergent
(c) $\lambda = 1$ so test is inconclusive. However, since $\dfrac{1}{p^{1/2}} > \dfrac{1}{p}$ then the given series is divergent
by comparison with the harmonic series.
(d) $\lambda = 10/3$ so divergent, (e) Not a series of positive terms so the ratio test cannot be
applied.
3. (a) The related series of positive terms is $\displaystyle\sum_{p=1}^{\infty} \frac{|x|^p}{p}$. For this series, using the ratio test we find
$\lambda = |x|$ so the original series is absolutely convergent if $|x| < 1$.
(b) The related series of positive terms is $\displaystyle\sum_{p=1}^{\infty} \frac{|x|^p}{p!}$. For this series, using the ratio test we
find $\lambda = 0$ (irrespective of the value of $x$) so the original series is absolutely convergent for
all values of $x$.

The Binomial Series 16.3

Introduction
In this Section we examine an important example of an infinite series, the binomial series:
\[ 1 + px + \frac{p(p-1)}{2!}x^2 + \frac{p(p-1)(p-2)}{3!}x^3 + \cdots \]
We show that this series is only convergent if $|x| < 1$ and that in this case the series sums to the
value $(1+x)^p$. As a special case of the binomial series we consider the situation when $p$ is a positive
integer $n$. In this case the infinite series reduces to a finite series and we obtain, by replacing $x$ with
$\dfrac{b}{a}$, the binomial theorem:
\[ (b+a)^n = b^n + nb^{n-1}a + \frac{n(n-1)}{2!}b^{n-2}a^2 + \cdots + a^n. \]
Finally, we use the binomial series to obtain various polynomial expressions for $(1+x)^p$ when $x$ is
‘small’.
Prerequisites
• understand the factorial notation
• have knowledge of the ratio test for convergence of infinite series
• understand the use of inequalities
Learning Outcomes
On completion you should be able to . . .
• recognise and use the binomial series
• state and use the binomial theorem
• use the binomial series to obtain numerical approximations
1. The binomial series

A very important infinite series which occurs often in applications and in algebra has the form:
\[ 1 + px + \frac{p(p-1)}{2!}x^2 + \frac{p(p-1)(p-2)}{3!}x^3 + \cdots \]
in which $p$ is a given number and $x$ is a variable. By using the ratio test it can be shown that this
series converges, irrespective of the value of $p$, as long as $|x| < 1$. In fact, as we shall see in Section
16.5 the given series converges to the value $(1+x)^p$ as long as $|x| < 1$.
Key Point 9
The Binomial Series
\[ (1+x)^p = 1 + px + \frac{p(p-1)}{2!}x^2 + \frac{p(p-1)(p-2)}{3!}x^3 + \cdots \qquad |x| < 1 \]
The binomial theorem can be obtained directly from the binomial series if $p$ is chosen to be a positive
integer (here we need not demand that $|x| < 1$ as the series is now finite and so is always convergent
irrespective of the value of $x$). For example, with $p = 2$ we obtain
\[ (1+x)^2 = 1 + 2x + \frac{2(1)}{2}x^2 + 0 + 0 + \cdots = 1 + 2x + x^2 \]
as is well known.
With $p = 3$ we get
\[ (1+x)^3 = 1 + 3x + \frac{3(2)}{2}x^2 + \frac{3(2)(1)}{3!}x^3 + 0 + 0 + \cdots = 1 + 3x + 3x^2 + x^3 \]
Generally if $p = n$ (a positive integer) then
\[ (1+x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots + x^n \]
which is a form of the binomial theorem. If $x$ is replaced by $\dfrac{b}{a}$ then
\[ \left(1 + \frac{b}{a}\right)^n = 1 + n\left(\frac{b}{a}\right) + \frac{n(n-1)}{2!}\left(\frac{b}{a}\right)^2 + \cdots + \left(\frac{b}{a}\right)^n \]
Now multiplying both sides by $a^n$ we have the following Key Point:
Key Point 10
The Binomial Theorem
If n is a positive integer then the expansion of (a + b) raised to the power n is given by:
\[ (a+b)^n = a^n + na^{n-1}b + \frac{n(n-1)}{2!}a^{n-2}b^2 + \cdots + b^n \]
This is known as the binomial theorem.
Task
Use the binomial theorem to obtain (a) $(1+x)^7$ (b) $(a+b)^4$
(a) Here $n = 7$:
Your solution
$(1+x)^7 =$
Answer
$(1+x)^7 = 1 + 7x + 21x^2 + 35x^3 + 35x^4 + 21x^5 + 7x^6 + x^7$
(b) Here $n = 4$:
Your solution
$(a+b)^4 =$
Answer
$(a+b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$.
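The coefficient $\dfrac{n(n-1)\cdots(n-k+1)}{k!}$ appearing in the binomial theorem equals the binomial coefficient "n choose k", which Python's standard library provides as `math.comb`. A minimal sketch (the helper name `binomial_coefficients` is an illustrative assumption):

```python
from math import comb

def binomial_coefficients(n):
    """Coefficients of a^n, a^(n-1) b, ..., b^n in the expansion of (a + b)^n."""
    return [comb(n, k) for k in range(n + 1)]

print(binomial_coefficients(7))  # [1, 7, 21, 35, 35, 21, 7, 1]
print(binomial_coefficients(4))  # [1, 4, 6, 4, 1]
```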
Task
Given that $x$ is so small that $x^3$ and higher powers may be ignored in comparison
to lower order terms, find a quadratic approximation of $(1-x)^{\frac{1}{2}}$ and check
the accuracy of your approximation for $x = 0.1$.
First expand $(1-x)^{\frac{1}{2}}$ using the binomial series with $p = \dfrac{1}{2}$ and with $x$ replaced by $(-x)$:
Your solution
$(1-x)^{\frac{1}{2}} =$
Answer
\[ (1-x)^{\frac{1}{2}} = 1 - \frac{1}{2}x + \frac{\frac{1}{2}\left(-\frac{1}{2}\right)}{2}x^2 - \frac{\frac{1}{2}\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{6}x^3 + \cdots \]
Now obtain the quadratic approximation:

Your solution
$(1-x)^{\frac{1}{2}} \simeq$
Answer
\[ (1-x)^{\frac{1}{2}} \simeq 1 - \frac{1}{2}x - \frac{1}{8}x^2 \]
Now check on the validity of the approximation by choosing $x = 0.1$:
Your solution
Answer
On the left-hand side we have
$(0.9)^{\frac{1}{2}} = 0.94868$ to 5 d.p. obtained by calculator
whereas, using the quadratic expansion:
\[ (0.9)^{\frac{1}{2}} \approx 1 - \frac{1}{2}(0.1) - \frac{1}{8}(0.1)^2 = 1 - 0.05 - (0.00125) = 0.94875. \]
so the error is only 0.00007.
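The same check can be carried out for any exponent $p$ by summing the binomial series term by term. The sketch below builds each coefficient from the previous one; the function name and its arguments are illustrative assumptions.

```python
import math

def binomial_series(p, x, n_terms):
    """Partial sum of 1 + p x + p(p-1)/2! x^2 + ... using n_terms terms."""
    total = 0.0
    coeff = 1.0                      # coefficient of x^0
    for k in range(n_terms):
        total += coeff * x**k
        coeff *= (p - k) / (k + 1)   # gives p(p-1)...(p-k)/(k+1)!
    return total

x = 0.1
quadratic = binomial_series(0.5, -x, 3)   # 1 - x/2 - x^2/8
exact = math.sqrt(1 - x)
print(quadratic, exact, quadratic - exact)
```

When $p$ is a positive integer the coefficients become zero after the $x^p$ term, so the same function reproduces the finite binomial theorem.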
What we have done in this last Task is to replace (or approximate) the function $(1-x)^{\frac{1}{2}}$ by the simpler
(polynomial) function $1 - \dfrac{1}{2}x - \dfrac{1}{8}x^2$ which is reasonable provided $x$ is very small. This approximation
is well illustrated geometrically by drawing the curves $y = (1-x)^{\frac{1}{2}}$ and $y = 1 - \dfrac{1}{2}x - \dfrac{1}{8}x^2$. The two
curves very nearly coincide when $x$ is ‘small’. See Figure 2:
[Figure 2: the curves $y = (1-x)^{\frac{1}{2}}$ and $y = 1 - \frac{1}{2}x - \frac{1}{8}x^2$ plotted against $x$]
Task
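Obtain a cubic approximation of $\dfrac{1}{(2+x)}$. Check your approximation for accuracy
using appropriate values of $x$.
First write the term $\dfrac{1}{(2+x)}$ in a form suitable for the binomial series (refer to Key Point 9):
Your solution
$\dfrac{1}{(2+x)} =$
Answer
\[ \frac{1}{2+x} = \frac{1}{2\left(1 + \dfrac{x}{2}\right)} = \frac{1}{2}\left(1 + \frac{x}{2}\right)^{-1} \]
Now expand using the binomial series with $p = -1$ and $\dfrac{x}{2}$ instead of $x$, to include terms up to $x^3$:
Your solution
$\dfrac{1}{2}\left(1 + \dfrac{x}{2}\right)^{-1} =$
Answer
\[
\begin{aligned}
\frac{1}{2}\left(1 + \frac{x}{2}\right)^{-1} &= \frac{1}{2}\left\{ 1 + (-1)\frac{x}{2} + \frac{(-1)(-2)}{2!}\left(\frac{x}{2}\right)^2 + \frac{(-1)(-2)(-3)}{3!}\left(\frac{x}{2}\right)^3 + \cdots \right\} \\
&= \frac{1}{2} - \frac{x}{4} + \frac{x^2}{8} - \frac{x^3}{16} + \cdots
\end{aligned}
\]
State the range of $x$ for which the binomial series of $\left(1 + \dfrac{x}{2}\right)^{-1}$ is valid:
Your solution
The series is valid if
Answer
valid as long as $\left|\dfrac{x}{2}\right| < 1$ i.e. $|x| < 2$ or $-2 < x < 2$
Choose $x = 0.1$ to check the accuracy of your approximation:

Your solution
$\dfrac{1}{2}\left(1 + \dfrac{0.1}{2}\right)^{-1} =$
$\dfrac{1}{2} - \dfrac{0.1}{4} + \dfrac{0.01}{8} - \dfrac{0.001}{16} =$
Answer
\[ \frac{1}{2}\left(1 + \frac{0.1}{2}\right)^{-1} = 0.47619 \ \text{ to 5 d.p.} \]
\[ \frac{1}{2} - \frac{0.1}{4} + \frac{0.01}{8} - \frac{0.001}{16} = 0.4761875. \]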
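Figure 3 below illustrates the close correspondence (when $x$ is ‘small’) between the curves
$y = \dfrac{1}{2}\left(1 + \dfrac{x}{2}\right)^{-1}$ and $y = \dfrac{1}{2} - \dfrac{x}{4} + \dfrac{x^2}{8} - \dfrac{x^3}{16}$.
[Figure 3: the curves $y = (2+x)^{-1}$ and $y = \frac{1}{2} - \frac{x}{4} + \frac{x^2}{8} - \frac{x^3}{16}$ plotted against $x$]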
Exercises
1. Determine the expansion of each of the following
(a) $(a+b)^3$, (b) $(1-x)^5$, (c) $(1+x^2)^{-1}$, (d) $(1-x)^{1/3}$.
2. Obtain a cubic approximation (valid if $x$ is small) of the function $(1+2x)^{3/2}$.
Answers
1. (a) $(a+b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$
(b) $(1-x)^5 = 1 - 5x + 10x^2 - 10x^3 + 5x^4 - x^5$
(c) $(1+x^2)^{-1} = 1 - x^2 + x^4 - x^6 + \cdots$
(d) $(1-x)^{1/3} = 1 - \dfrac{1}{3}x - \dfrac{1}{9}x^2 - \dfrac{5}{81}x^3 + \cdots$
2. $(1+2x)^{3/2} = 1 + 3x + \dfrac{3}{2}x^2 - \dfrac{1}{2}x^3 + \cdots$

Power Series 16.4
Introduction
In this Section we consider power series. These are examples of infinite series where each term
contains a variable, $x$, raised to a non-negative integer power. We use the ratio test to obtain the radius
of convergence $R$ of the power series and state the important result that the series is absolutely
convergent if $|x| < R$, divergent if $|x| > R$ and may or may not be convergent if $x = \pm R$. Finally,
we extend the work to apply to general power series when the variable $x$ is replaced by $(x - x_0)$.
Prerequisites
Before starting this Section you should . . .
• have knowledge of infinite series and of the ratio test
• have knowledge of inequalities and of the factorial notation.
Learning Outcomes
• explain what a power series is
• obtain the radius of convergence for a power series
• explain what a general power series is
1. Power series
A power series is simply a sum of terms each of which contains a variable raised to a non-negative
integer power. To illustrate:
\[ x - x^3 + x^5 - x^7 + \cdots \]
\[ 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \]
are examples of power series. In 16.3 we encountered an important example of a power series,
the binomial series:
\[ 1 + px + \frac{p(p-1)}{2!}x^2 + \frac{p(p-1)(p-2)}{3!}x^3 + \cdots \]
which, as we have already noted, represents the function $(1+x)^p$ as long as the variable $x$ satisfies
$|x| < 1$.
A power series has the general form
\[ b_0 + b_1 x + b_2 x^2 + \cdots = \sum_{p=0}^{\infty} b_p x^p \]
where $b_0, b_1, b_2, \cdots$ are constants. Note that, in the summation notation, we have chosen to start
the series at $p = 0$. This is to ensure that the power series can include a constant term $b_0$ since
$x^0 = 1$.
The convergence, or otherwise, of a power series, clearly depends upon the value of $x$ chosen. For
example, the power series
\[ 1 + \frac{x}{2} + \frac{x^2}{3} + \frac{x^3}{4} + \cdots \]
is convergent if $x = -1$ (for then it is the alternating harmonic series) and divergent if $x = +1$ (for
then it is the harmonic series).
2. The radius of convergence

The most important statement one can make about a power series is that there exists a number, R,
called the radius of convergence, such that if |x| < R the power series is absolutely convergent and
if |x| > R the power series is divergent. At the two points x = −R and x = R the power series
may be convergent or divergent.
Key Point 11
Convergence of Power Series

For a power series $\sum_{p=0}^{\infty} b_p x^p$ with radius of convergence $R$:
• the series converges absolutely if $|x| < R$
• the series diverges if $|x| > R$
• the series may be convergent or divergent at $x = \pm R$
[Number line: divergent for $x < -R$, convergent for $-R < x < R$, divergent for $x > R$]
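For any particular power series $\sum_{p=0}^{\infty} b_p x^p$ the value of $R$ can be obtained using the ratio test. We
know, from the ratio test that $\sum_{p=0}^{\infty} b_p x^p$ is absolutely convergent if
\[ \lim_{p\to\infty} \frac{|b_{p+1}x^{p+1}|}{|b_p x^p|} = \lim_{p\to\infty} \left|\frac{b_{p+1}}{b_p}\right| |x| < 1 \quad \text{implying} \quad |x| < \lim_{p\to\infty} \left|\frac{b_p}{b_{p+1}}\right| \quad \text{and so} \quad R = \lim_{p\to\infty} \left|\frac{b_p}{b_{p+1}}\right|. \]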
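The formula $R = \lim_{p\to\infty} |b_p / b_{p+1}|$ can be explored numerically by evaluating the ratio at a single large value of $p$. This is only a rough sketch: the helper names are hypothetical, the two coefficient sequences are those used in Example 2 and the Task of this Section, and a finite $p$ gives an estimate, not the limit itself.

```python
from fractions import Fraction

def radius_estimate(b, p):
    """Estimate R = lim |b_p / b_(p+1)| by evaluating the ratio at one large p."""
    return abs(Fraction(b(p)) / Fraction(b(p + 1)))

def b_harmonic_type(p):
    """Coefficients b_p = 1/(p+1) of the series 1 + x/2 + x^2/3 + ..."""
    return Fraction(1, p + 1)

def b_geometric_type(p):
    """Coefficients b_p = 1/3^p of the series 1 + x/3 + x^2/3^2 + ..."""
    return Fraction(1, 3**p)

print(float(radius_estimate(b_harmonic_type, 10**6)))   # close to 1
print(float(radius_estimate(b_geometric_type, 50)))     # 3.0
```

Exact rational arithmetic is used so that very small coefficients such as $1/3^{50}$ do not lose precision.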
Example 2
(a) Find the radius of convergence of the series
\[ 1 + \frac{x}{2} + \frac{x^2}{3} + \frac{x^3}{4} + \cdots \]
(b) Investigate what happens at the end-points $x = -1$, $x = +1$ of the region of
absolute convergence.
Solution
(a) Here
\[ 1 + \frac{x}{2} + \frac{x^2}{3} + \frac{x^3}{4} + \cdots = \sum_{p=0}^{\infty} \frac{x^p}{p+1} \]
so
\[ b_p = \frac{1}{p+1} \quad \therefore \quad b_{p+1} = \frac{1}{p+2} \]
In this case,
\[ R = \lim_{p\to\infty} \left|\frac{p+2}{p+1}\right| = 1 \]
so the given series is absolutely convergent if $|x| < 1$ and is divergent if $|x| > 1$.
(b) At $x = +1$ the series is $1 + \frac{1}{2} + \frac{1}{3} + \cdots$ which is divergent (the harmonic series). However, at
$x = -1$ the series is $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ which is convergent (the alternating harmonic series).
Finally, therefore, the series
\[ 1 + \frac{x}{2} + \frac{x^2}{3} + \frac{x^3}{4} + \cdots \]
is convergent if $-1 \le x < 1$.
Task
Find the range of values of x for which the following power series converges:
\[ 1 + \frac{x}{3} + \frac{x^2}{3^2} + \frac{x^3}{3^3} + \cdots \]
First find the coefficient of $x^p$:

Your solution
$b_p =$
Answer
\[ b_p = \frac{1}{3^p} \]
Now find $R$, the radius of convergence:
Your solution
$R = \lim_{p\to\infty} \left|\dfrac{b_p}{b_{p+1}}\right| =$
Answer
\[ R = \lim_{p\to\infty} \left|\frac{b_p}{b_{p+1}}\right| = \lim_{p\to\infty} \left|\frac{3^{p+1}}{3^p}\right| = \lim_{p\to\infty} (3) = 3. \]
When $x = \pm 3$ the series is clearly divergent. Hence the series is convergent only if $-3 < x < 3$.
3. Properties of power series
Let P1 and P2 represent two power series with radii of convergence R1 and R2 respectively. We can
combine P1 and P2 together by addition and multiplication. We find the following properties:
Key Point 12
If $P_1$ and $P_2$ are power series with respective radii of convergence $R_1$ and $R_2$ then the sum $(P_1 + P_2)$
and the product $(P_1 P_2)$ are each power series with radius of convergence at least the smaller of
$R_1$ and $R_2$.
Power series can also be differentiated and integrated on a term by term basis:
Key Point 13
If P1 is a power series with radius of convergence R1 then
\[ \frac{d}{dx}(P_1) \quad \text{and} \quad \int (P_1)\, dx \]
are each power series with radius of convergence $R_1$
Example 3
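Using the known result that $(1+x)^p = 1 + px + \dfrac{p(p-1)}{2!}x^2 + \cdots \quad |x| < 1$,
choose $p = \dfrac{1}{2}$ and by differentiating obtain the power series expression for $(1+x)^{-\frac{1}{2}}$.
Solution
\[ (1+x)^{\frac{1}{2}} = 1 + \frac{x}{2} + \frac{\frac{1}{2}\left(-\frac{1}{2}\right)}{2!}x^2 + \frac{\frac{1}{2}\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{3!}x^3 + \cdots \]
Differentiating both sides:
\[ \frac{1}{2}(1+x)^{-\frac{1}{2}} = \frac{1}{2} + \frac{1}{2}\left(-\frac{1}{2}\right)x + \frac{\frac{1}{2}\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2}x^2 + \cdots \]
Multiplying through by 2:
\[ (1+x)^{-\frac{1}{2}} = 1 - \frac{1}{2}x + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2}x^2 + \cdots \]
This result can, of course, be obtained directly from the expansion for $(1+x)^p$ with $p = -\frac{1}{2}$.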
Task
Using the known result that
\[ \frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots \qquad |x| < 1, \]
(a) Find an expression for $\ln(1+x)$
(b) Use the expression to obtain an approximation to $\ln(1.1)$
(a) Integrate both sides of $\dfrac{1}{1+x} = 1 - x + x^2 - \cdots$ and so deduce an expression for $\ln(1+x)$:
Your solution
$\displaystyle\int \frac{dx}{1+x} =$
$\displaystyle\int (1 - x + x^2 - \cdots)\, dx =$
Answer
$\displaystyle\int \frac{dx}{1+x} = \ln(1+x) + c$ where $c$ is a constant of integration,
$\displaystyle\int (1 - x + x^2 - \cdots)\, dx = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + k$ where $k$ is a constant of integration.
So we conclude $\ln(1+x) + c = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \cdots + k$ if $|x| < 1$
Choosing $x = 0$ shows that $c = k$ so they cancel from this equation.
(b) Now choose $x = 0.1$ to approximate $\ln(1 + 0.1)$ using terms up to cubic:
Your solution
$\ln(1.1) = 0.1 - \dfrac{(0.1)^2}{2} + \dfrac{(0.1)^3}{3} - \cdots \simeq$
Answer
$\ln(1.1) \simeq 0.0953$ which is easily checked by calculator.
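The calculator check mentioned in the answer can also be scripted. The sketch below sums the series $x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$ and compares the cubic approximation with the library value; the function name `ln1p_series` is an illustrative assumption.

```python
import math

def ln1p_series(x, n_terms):
    """Partial sum x - x^2/2 + x^3/3 - ... with n_terms terms (for |x| < 1)."""
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, n_terms + 1))

approx = ln1p_series(0.1, 3)
print(approx)          # cubic approximation to ln(1.1)
print(math.log(1.1))   # library value for comparison
```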
HELM (2006): 37
4. General power series
A general power series has the form
\[ b_0 + b_1(x - x_0) + b_2(x - x_0)^2 + \cdots = \sum_{p=0}^{\infty} b_p (x - x_0)^p \]
Exactly the same considerations apply to this general power series as apply to the ‘special’ series
$\sum_{p=0}^{\infty} b_p x^p$ except that the variable $x$ is replaced by $(x - x_0)$. The radius of convergence of the general
series is obtained in the same way:
\[ R = \lim_{p\to\infty} \left|\frac{b_p}{b_{p+1}}\right| \]
and the interval of convergence is now shifted to have centre at $x = x_0$ (see Figure 4 below). The
series is absolutely convergent if $|x - x_0| < R$, diverges if $|x - x_0| > R$ and may or may not converge
if $|x - x_0| = R$.
[Figure 4: number line showing the interval of convergence from $x_0 - R$ to $x_0 + R$, centred at $x_0$]
Task
Find the radius of convergence of the general power series
1 − (x − 1) + (x − 1)2 − (x − 1)3 + · · ·
First find an expression for the general term:

Your solution
$1 - (x-1) + (x-1)^2 - (x-1)^3 + \cdots = \displaystyle\sum_{p=0}^{\infty}$
Answer
$\displaystyle\sum_{p=0}^{\infty} (x-1)^p (-1)^p$ so $b_p = (-1)^p$
Now obtain the radius of convergence:

Your solution
$\lim_{p\to\infty} \left|\dfrac{b_p}{b_{p+1}}\right| =$          $\therefore \ R =$
Answer
\[ \lim_{p\to\infty} \left|\frac{b_p}{b_{p+1}}\right| = \lim_{p\to\infty} \left|\frac{(-1)^p}{(-1)^{p+1}}\right| = 1. \]
Hence $R = 1$, so the series is absolutely convergent if $|x - 1| < 1$.
Finally, decide on the convergence at $|x - 1| = 1$ (i.e. at $x - 1 = -1$ and $x - 1 = 1$ i.e. $x = 0$ and
$x = 2$):
Your solution
Answer
At $x = 0$ the series is $1 + 1 + 1 + \cdots$ which diverges and at $x = 2$ the series is $1 - 1 + 1 - 1 \cdots$
which also diverges. Thus the given series only converges if $|x - 1| < 1$ i.e. $0 < x < 2$.
[Number line: interval of convergence $0 < x < 2$]
Exercises
1. From the result $\dfrac{1}{1-x} = 1 + x + x^2 + x^3 + \ldots$, $|x| < 1$
(a) Find an expression for $\ln(1-x)$
(b) Use this expression to obtain an approximation to $\ln(0.9)$ to 4 d.p.
2. Find the radius of convergence of the general power series $1 - (x+2) + (x+2)^2 - (x+2)^3 + \ldots$
3. Find the range of values of $x$ for which the power series $1 + \dfrac{x}{4} + \dfrac{x^2}{4^2} + \dfrac{x^3}{4^3} + \ldots$ converges.
4. By differentiating the series for $(1+x)^{1/3}$ find the power series for $(1+x)^{-2/3}$ and state its
radius of convergence.
5. (a) Find the radius of convergence of the series $1 + \dfrac{x}{3} + \dfrac{x^2}{4} + \dfrac{x^3}{5} + \ldots$
(b) Investigate what happens at the points $x = -1$ and $x = +1$
Answers
1. $\ln(1-x) = -x - \dfrac{x^2}{2} - \dfrac{x^3}{3} - \dfrac{x^4}{4} - \ldots$     $\ln(0.9) \approx -0.1054$ (4 d.p.)
2. $R = 1$. Series converges if $-3 < x < -1$. If $x = -1$ series diverges. If $x = -3$ series diverges.
3. Series converges if $-4 < x < 4$.
4. $(1+x)^{-2/3} = 1 - \dfrac{2}{3}x + \dfrac{5}{9}x^2 + \ldots$ valid for $|x| < 1$.
5. (a) $R = 1$. (b) At $x = +1$ series diverges. At $x = -1$ series converges.
Maclaurin and Taylor
Series 16.5
Introduction
In this Section we examine how functions may be expressed in terms of power series. This is an
extremely useful way of expressing a function since (as we shall see) we can then replace ‘complicated’
functions by ‘simple’ polynomials. The only requirement (of any significance) is that the
‘complicated’ function should be smooth; this means that at a point of interest, it must be possible
to differentiate the function as often as we please.
Prerequisites
• have knowledge of power series and of the ratio test
• be able to differentiate simple functions
• be familiar with the rules for combining power series
Learning Outcomes
• find the Maclaurin and Taylor series expansions of given functions
• find Maclaurin expansions of functions by combining known power series together
• find Maclaurin expansions by using differentiation and integration
1. Maclaurin and Taylor series

As we shall see, many functions can be represented by power series. In fact we have already seen in
earlier Sections examples of such a representation:
\[ \frac{1}{1-x} = 1 + x + x^2 + \cdots \qquad |x| < 1 \]
\[ \ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots \qquad -1 < x \le 1 \]
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \qquad \text{all } x \]
The first two examples show that, as long as we constrain $x$ to lie within the domain $|x| < 1$
(or, equivalently, $-1 < x < 1$), then in the first case $\dfrac{1}{1-x}$ has the same numerical value as
$1 + x + x^2 + \cdots$ and in the second case $\ln(1+x)$ has the same numerical value as $x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \cdots$.
In the third example we see that $e^x$ has the same numerical value as $1 + x + \dfrac{x^2}{2!} + \cdots$ but in this
case there is no restriction to be placed on the value of $x$ since this power series converges for all
values of $x$. Figure 5 shows this situation geometrically. As more and more terms are used from the
series $1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$ the resulting polynomial is a better and better approximation to the curve representing $e^x$. In (a) we
show the linear approximation to $e^x$. In (b) and (c) we show, respectively, the quadratic and cubic
approximations.
[Figure 5: Linear, quadratic and cubic approximations to $e^x$: (a) $1 + x$, (b) $1 + x + \frac{x^2}{2!}$, (c) $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}$]

These power series representations are extremely important, from many points of view. Numerically,
we can simply replace the function $\dfrac{1}{1-x}$ by the quadratic expression $1 + x + x^2$ as long as $x$ is
so small that powers of $x$ greater than or equal to 3 can be ignored in comparison to quadratic
terms. This approach can be used to approximate more complicated functions in terms of simpler
polynomials. Our aim now is to see how these power series expansions are obtained.
2. The Maclaurin series
Consider a function f (x) which can be differentiated at x = 0 as often as we please. For example
ex , cos x, sin x would fit into this category but |x| would not.
Let us assume that f (x) can be represented by a power series in x:
\[ f(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + b_4 x^4 + \cdots = \sum_{p=0}^{\infty} b_p x^p \]
where $b_0, b_1, b_2, \ldots$ are constants to be determined.

If we substitute $x = 0$ then, clearly $f(0) = b_0$
The other constants can be determined by further differentiating and, on each differentiation,
substituting $x = 0$. For example, differentiating once:
\[ f'(x) = 0 + b_1 + 2b_2 x + 3b_3 x^2 + 4b_4 x^3 + \cdots \]
so, putting $x = 0$, we have $f'(0) = b_1$.
Continuing to differentiate:
\[ f''(x) = 0 + 2b_2 + 3(2)b_3 x + 4(3)b_4 x^2 + \cdots \]
so
\[ f''(0) = 2b_2 \quad \text{or} \quad b_2 = \frac{1}{2}f''(0) \]
Further:
\[ f'''(x) = 3(2)b_3 + 4(3)(2)b_4 x + \cdots \quad \text{so} \quad f'''(0) = 3(2)b_3 \quad \text{implying} \quad b_3 = \frac{1}{3(2)}f'''(0) \]
Continuing in this way we easily find that (remembering that $0! = 1$)
\[ b_n = \frac{1}{n!}f^{(n)}(0) \qquad n = 0, 1, 2, \ldots \]
where $f^{(n)}(0)$ means the value of the $n$th derivative at $x = 0$ and $f^{(0)}(0)$ means $f(0)$.
Bringing all these results together we have:
Key Point 14
Maclaurin Series
If f (x) can be differentiated as often as required:
\[ f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \frac{x^3}{3!}f'''(0) + \cdots = \sum_{p=0}^{\infty} \frac{x^p}{p!}f^{(p)}(0) \]
This is called the Maclaurin expansion of $f(x)$.
Example 4
Find the Maclaurin expansion of cos x.
Solution
Here f (x) = cos x and, differentiating a number of times:
\[ f(x) = \cos x, \quad f'(x) = -\sin x, \quad f''(x) = -\cos x, \quad f'''(x) = \sin x \quad \text{etc.} \]
Evaluating each of these at $x = 0$:
\[ f(0) = 1, \quad f'(0) = 0, \quad f''(0) = -1, \quad f'''(0) = 0 \quad \text{etc.} \]
Substituting into $f(x) = f(0) + xf'(0) + \dfrac{x^2}{2!}f''(0) + \dfrac{x^3}{3!}f'''(0) + \cdots$, gives:
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \]
The reader should confirm (by finding the radius of convergence) that this series is convergent for
all values of $x$. The approximation of $\cos x$ by the first few terms of its Maclaurin series
is shown in Figure 6.
[Figure 6: Linear, quadratic and cubic approximations to $\cos x$; the labelled curves are $\cos x$, $1 - \frac{x^2}{2!}$ and $1 - \frac{x^2}{2!} + \frac{x^4}{4!}$]
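The improvement obtained by taking more terms of the Maclaurin series for $\cos x$ can be seen numerically. The sketch below sums the first few non-zero terms and compares the result with the library value; the function name and the test point $x = 1$ are illustrative assumptions.

```python
import math

def cos_maclaurin(x, n_terms):
    """Partial sum 1 - x^2/2! + x^4/4! - ... using n_terms non-zero terms."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(n_terms))

for n_terms in (1, 2, 3, 6):
    print(n_terms, cos_maclaurin(1.0, n_terms), math.cos(1.0))
```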
Task
Find the Maclaurin expansion of ln(1 + x).
(Note that we cannot find a Maclaurin expansion of the function ln x since ln x
does not exist at x = 0 and so cannot be differentiated at x = 0.)
Find the first four derivatives of $f(x) = \ln(1+x)$:

Your solution
$f'(x) =$          $f''(x) =$          $f'''(x) =$          $f''''(x) =$
Answer
\[ f'(x) = \frac{1}{1+x}, \quad f''(x) = \frac{-1}{(1+x)^2}, \quad f'''(x) = \frac{2}{(1+x)^3}, \]
generally: $f^{(n)}(x) = \dfrac{(-1)^{n+1}(n-1)!}{(1+x)^n}$
Now obtain $f(0)$, $f'(0)$, $f''(0)$, $f'''(0)$:

Your solution
$f(0) =$          $f'(0) =$          $f''(0) =$          $f'''(0) =$
Answer
$f(0) = 0$, $f'(0) = 1$, $f''(0) = -1$, $f'''(0) = 2$,
generally: $f^{(n)}(0) = (-1)^{n+1}(n-1)!$
Hence, obtain the Maclaurin expansion of $\ln(1+x)$:

Your solution
$\ln(1+x) =$
Answer
$\ln(1+x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \cdots + \dfrac{(-1)^{n+1}}{n}x^n + \cdots$ (This was obtained in Section 16.4, page 37.)
Now obtain the radius of convergence and consider the situation at the boundary values:
Your solution
Radius of convergence $R =$
Answer
$R = 1$. Also at $x = 1$ the series is convergent (alternating harmonic series) and at $x = -1$ the
series is divergent. Hence this Maclaurin expansion is only valid if $-1 < x \le 1$.
The geometrical closeness of the polynomial terms with the function $\ln(1+x)$ for $-1 < x \le 1$ is
displayed in Figure 7:
[Figure 7: Linear, quadratic and cubic approximations to $\ln(1+x)$: $x$, $x - \frac{x^2}{2}$ and $x - \frac{x^2}{2} + \frac{x^3}{3}$]
Note that when $x = 1$, $\ln 2 = 1 - \dfrac{1}{2} + \dfrac{1}{3} - \dfrac{1}{4} + \cdots$ so the alternating harmonic series converges to
$\ln 2 \simeq 0.693$, as stated in Section 16.2, page 17.
The Maclaurin expansion of a product of two functions, $f(x)g(x)$, is obtained by multiplying together
the Maclaurin expansions of $f(x)$ and of $g(x)$ and collecting like terms together. The product series
will have a radius of convergence at least as large as the smaller of the two separate radii of convergence.
Example 5
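Find the Maclaurin expansion of $e^x \ln(1+x)$.
Solution
Here, instead of finding the derivatives of $f(x) = e^x \ln(1+x)$, we can more simply multiply together
the Maclaurin expansions for $e^x$ and $\ln(1+x)$ which we already know:
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \qquad \text{all } x \]
and
\[ \ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots \qquad -1 < x \le 1 \]
The resulting power series will only be convergent if $-1 < x \le 1$. Multiplying:
\[
\begin{aligned}
e^x \ln(1+x) &= \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)\left(x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots\right) \\
&= x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \\
&\quad + x^2 - \frac{x^3}{2} + \frac{x^4}{3} + \cdots \\
&\quad + \frac{x^3}{2} - \frac{x^4}{4} \cdots \\
&\quad + \frac{x^4}{6} + \cdots \\
&= x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{3x^5}{40} + \cdots \qquad -1 < x \le 1
\end{aligned}
\]
(You must take care not to miss relevant terms when carrying through the multiplication.)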
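Collecting like terms by hand is error-prone, so it can help to check a product of series mechanically. The sketch below multiplies two truncated series stored as coefficient lists, using exact fractions; the function name `multiply_truncated` and the truncation degree are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def multiply_truncated(a, b, degree):
    """Multiply two series given as coefficient lists, keeping powers up to x^degree."""
    product = [Fraction(0)] * (degree + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= degree:
                product[i + j] += ai * bj
    return product

N = 5
# Coefficients of x^0 .. x^N for e^x and ln(1 + x).
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N + 1)]
log_coeffs = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N + 1)]

print(multiply_truncated(exp_coeffs, log_coeffs, N))
# Coefficients 0, 1, 1/2, 1/3, 0, 3/40 of x^0 .. x^5
```

The zero coefficient of $x^4$ explains why the expansion jumps from $\frac{x^3}{3}$ to $\frac{3x^5}{40}$.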
Task
Find the Maclaurin expansion of cos2 x up to powers of x4 . Hence write down
the expansion of sin2 x to powers of x6 .
First, write down the expansion of cos x:

Your solution
cos x =
Answer
x2 x4
cos x = 1 − + + ···
2! 4!
Now, by multiplication, find the expansion of cos2 x:
Your solution
cos2 x =
Answer
x2 x4 x2 x4

2
cos x = 1 − + ··· 1− + ···
2! 4! 2! 4!
x2 x4 x2 x4 x4 x4 2x6
= (1 − + · · · ) + (− + · · · ) + ( · · · ) + · · · = 1 − x2 + − ···
2! 4! 2! 4 4! 3 45
Now obtain the expansion of $\sin^2 x$ using a suitable trigonometric identity:
Your solution
$\sin^2 x =$
Answer
$$\sin^2 x = 1 - \cos^2 x = 1 - \left(1 - x^2 + \frac{x^4}{3} - \frac{2x^6}{45} + \cdots\right) = x^2 - \frac{x^4}{3} + \frac{2x^6}{45} + \cdots$$
As an alternative approach the reader could obtain the power series expansion for $\cos^2 x$ by using the trigonometric identity $\cos^2 x \equiv \frac{1}{2}(1 + \cos 2x)$.
Example 6
Find the Maclaurin expansion of tanh x up to powers of $x^5$.
Solution
The first three derivatives of f(x) = tanh x are
$$f'(x) = \operatorname{sech}^2 x, \qquad f''(x) = -2\operatorname{sech}^2 x \tanh x, \qquad f'''(x) = 4\operatorname{sech}^2 x \tanh^2 x - 2\operatorname{sech}^4 x, \quad \cdots$$
giving f(0) = 0, f′(0) = 1, f″(0) = 0, f‴(0) = −2, ···
This leads directly to the Maclaurin expansion
$$\tanh x = x - \frac{1}{3}x^3 + \frac{2}{15}x^5 + \cdots$$
Example 7
The relationship between the wavelength, L, the wave period, T, and the water depth, d, for a surface wave in water is given by:
$$L = \frac{gT^2}{2\pi}\tanh\left(\frac{2\pi d}{L}\right)$$
In a particular case the wave period was 10 s and the water depth was 6.1 m. Taking the acceleration due to gravity, g, as 9.81 m s⁻², determine the wavelength.
[Hint: Use the series expansion for tanh x developed in Example 6.]
Solution
Substituting for the wave period, water depth and g we get
$$L = \frac{9.81 \times 10^2}{2\pi}\tanh\left(\frac{2\pi \times 6.1}{L}\right) = \frac{490.5}{\pi}\tanh\left(\frac{12.2\pi}{L}\right)$$
The series expansion of tanh x is given by
$$\tanh x = x - \frac{x^3}{3} + \frac{2x^5}{15} + \cdots$$
Using the series expansion of tanh x we can approximate the equation as
$$L = \frac{490.5}{\pi}\left\{\frac{12.2\pi}{L} - \frac{1}{3}\left(\frac{12.2\pi}{L}\right)^3 + \cdots\right\}$$
Multiplying through by $\pi L^3$ the equation becomes
$$\pi L^4 = 490.5 \times 12.2\pi L^2 - \frac{490.5}{3} \times (12.2\pi)^3$$
This equation can be rewritten as $L^4 - 5984.1L^2 + 2930198 = 0$
Solving this as a quadratic in $L^2$ we get L = 74 m.
Using Newton-Raphson iteration this can be further refined to give a wave length of 73.9 m.
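The quadratic-in-$L^2$ step of Example 7 can be reproduced numerically. This is only a sketch of that one step, assuming the two-term truncation of the tanh series used in the Solution; the variable names are illustrative, and the Newton-Raphson refinement mentioned above is not attempted here.

```python
import math

g, T, d = 9.81, 10.0, 6.1        # gravity (m s^-2), wave period (s), water depth (m)

k = g * T**2 / (2 * math.pi)      # 490.5 / pi
a = 2 * math.pi * d               # 12.2 * pi

# Truncated equation: L = k * (a/L - a^3 / (3 L^3)),
# i.e. L^4 - p L^2 + q = 0 with the coefficients below.
p = k * a
q = k * a**3 / 3

# Treat as a quadratic in L^2 and take the larger root, where the
# argument a/L of tanh is smaller and the truncation is more reasonable.
L_squared = (p + math.sqrt(p * p - 4 * q)) / 2
wavelength = math.sqrt(L_squared)

print(f"p = {p:.1f}, q = {q:.0f}, L = {wavelength:.1f} m")
```

The coefficients agree with 5984.1 and 2930198 in the text, and the root rounds to the 74 m quoted there.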
3. Differentiation of Maclaurin series
We have already noted that, by the binomial series,
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots \qquad |x| < 1$$
Thus, with x replaced by −x
$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots \qquad |x| < 1$$
We have previously obtained the Maclaurin expansion of ln(1 + x):
$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \qquad -1 < x \le 1$$
Now, we differentiate both sides with respect to x:
$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots$$
This result matches that found from the binomial series and demonstrates that the Maclaurin expansion of a function f(x) may be differentiated term by term to give a series which will be the Maclaurin expansion of $\dfrac{df}{dx}$.
As we noted in Section 16.4 the derived series will have the same radius of convergence as the original series.
Task
Find the Maclaurin expansion of $(1 - x)^{-3}$ and state its radius of convergence.
First write down the expansion of $(1 - x)^{-1}$:

Your solution
$\dfrac{1}{1-x} =$
Answer
$$\frac{1}{1-x} = 1 + x + x^2 + \cdots \qquad |x| < 1$$
Now, by differentiation, obtain the expansion of $\dfrac{1}{(1-x)^2}$:
Your solution
$\dfrac{1}{(1-x)^2} = \dfrac{d}{dx}\left(\dfrac{1}{1-x}\right) =$
Answer
$$\frac{1}{(1-x)^2} = \frac{d}{dx}\left(1 + x + x^2 + \cdots\right) = 1 + 2x + 3x^2 + 4x^3 + \cdots$$
Differentiate again to obtain the expansion of $(1 - x)^{-3}$:

Your solution
$\dfrac{1}{(1-x)^3} = \dfrac{1}{2}\dfrac{d}{dx}\left(\dfrac{1}{(1-x)^2}\right) = \dfrac{1}{2}\,[\qquad\qquad] =$
Answer
$$\frac{1}{(1-x)^3} = \frac{1}{2}\frac{d}{dx}\left(\frac{1}{(1-x)^2}\right) = \frac{1}{2}\left[2 + 6x + 12x^2 + 20x^3 + \cdots\right] = 1 + 3x + 6x^2 + 10x^3 + \cdots$$
Finally state its radius of convergence:
Your solution
Answer
The final series, $1 + 3x + 6x^2 + 10x^3 + \cdots$, has radius of convergence R = 1 since the original series has this radius of convergence. This can also be found directly using the formula $R = \lim_{n\to\infty}\left|\dfrac{b_n}{b_{n+1}}\right|$ and using the fact that the coefficient of the nth term is $b_n = \frac{1}{2}n(n + 1)$.
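The term-by-term differentiation used in this Task can be sketched on coefficient lists. The helper `differentiate` below is a hypothetical name introduced for illustration; it assumes a series is stored as its list of coefficients, lowest power first.

```python
from fractions import Fraction

def differentiate(coeffs):
    """Term-by-term derivative: sum a_n x^n  ->  sum n a_n x^(n-1)."""
    return [n * a for n, a in enumerate(coeffs)][1:]

N = 8
geometric = [Fraction(1)] * N                    # 1/(1-x) = 1 + x + x^2 + ...
first = differentiate(geometric)                 # series for 1/(1-x)^2
second = [c / 2 for c in differentiate(first)]   # series for 1/(1-x)^3

print("1/(1-x)^2:", [str(c) for c in first[:4]])
print("1/(1-x)^3:", [str(c) for c in second[:4]])

# ratio b_n / b_(n+1) of successive coefficients of the final series
ratios = [float(second[n] / second[n + 1]) for n in range(len(second) - 1)]
print("ratios:", ratios)
```

The ratios of successive coefficients increase towards 1, which is in keeping with the radius of convergence R = 1 stated in the Answer.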
4. The Taylor series

The Taylor series is a generalisation of the Maclaurin series, being a power series developed in powers of $(x - x_0)$ rather than in powers of x. Thus
Key Point 15
Taylor Series
If the function f(x) can be differentiated as often as required at $x = x_0$ then:
$$f(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2!}f''(x_0) + \cdots$$
This is called the Taylor series of f(x) about the point $x_0$.
The reader will see that the Maclaurin expansion is the Taylor expansion obtained if $x_0$ is chosen to be zero.
Task
Obtain the Taylor series expansion of $\dfrac{1}{1-x}$ about x = 2. (That is, find a power series in powers of (x − 2).)
First, obtain the first three derivatives and the nth derivative of $f(x) = \dfrac{1}{1-x}$:
Your solution
$f'(x) =$ \qquad $f''(x) =$ \qquad $f'''(x) =$ \qquad $f^{(n)}(x) =$
Answer
$$f'(x) = \frac{1}{(1-x)^2}, \quad f''(x) = \frac{2}{(1-x)^3}, \quad f'''(x) = \frac{6}{(1-x)^4}, \quad \cdots \quad f^{(n)}(x) = \frac{n!}{(1-x)^{n+1}}$$
Now evaluate these derivatives at $x_0 = 2$:

Your solution
$f'(2) =$ \qquad $f''(2) =$ \qquad $f'''(2) =$ \qquad $f^{(n)}(2) =$
Answer
$$f'(2) = 1, \quad f''(2) = -2, \quad f'''(2) = 6, \quad f^{(n)}(2) = (-1)^{n+1}\, n!$$
Hence, write down the Taylor expansion of $f(x) = \dfrac{1}{1-x}$ about x = 2:
Your solution
$\dfrac{1}{1-x} =$
Answer
$$\frac{1}{1-x} = -1 + (x - 2) - (x - 2)^2 + (x - 2)^3 + \cdots + (-1)^{n+1}(x - 2)^n + \cdots$$
Exercises
1. Show that the series obtained in the last Task is convergent if |x − 2| < 1.
2. Sketch the linear, quadratic and cubic approximations to $\dfrac{1}{1-x}$ obtained from the series in the last Task and compare to $\dfrac{1}{1-x}$.
Answer
2. In the following diagrams some of the terms from the Taylor series are plotted to compare with $\dfrac{1}{1-x}$.
[Three panels, each plotting $y = \dfrac{1}{1-x}$ for x between 1 and 2 against, in turn, $-1 + (x - 2)$, $-1 + (x - 2) - (x - 2)^2$ and $-1 + (x - 2) - (x - 2)^2 + (x - 2)^3$]
Contents 17
Conics and Polar Coordinates
17.1 Conic Sections
17.2 Polar Coordinates
17.3 Parametric Curves
Learning outcomes
In this Workbook you will learn about some of the most important curves in the whole of mathematics, the conic sections: the ellipse, the parabola and the hyperbola. You will learn how to recognise these curves and how to describe them in Cartesian and in polar form. In the final block you will learn how to describe curves using a parametric approach and, in particular, how the conic sections are described in parametric form.

Conic Sections 17.1
Introduction
The conic sections (or conics) - the ellipse, the parabola and the hyperbola - play an important
role both in mathematics and in the application of mathematics to engineering. In this Section we
look in detail at the equations of the conics in both standard form and general form.
Although there are various ways that can be used to define a conic, we concentrate in this Section on
defining conics using Cartesian coordinates (x, y). However, at the end of this Section we examine
an alternative way to obtain the conics.
Prerequisites
• be able to factorise simple algebraic expressions
• be able to change the subject in simple algebraic equations
• be able to complete the square in quadratic expressions
Learning Outcomes
• understand how conics are obtained as curves of intersection of a double-cone with a plane
• state the standard form of the equations of the ellipse, the parabola and the hyperbola
• classify quadratic expressions in x, y in terms of conics
1. The ellipse, parabola and hyperbola

Mathematicians, engineers and scientists encounter numerous functions in their work: polynomials, trigonometric and hyperbolic functions amongst them. However, throughout the history of science one group of functions, the conics, arises time and time again, not only in the development of mathematical theory but also in practical applications. The conics were first studied by the Greek mathematician Apollonius more than 200 years BC.
Essentially, the conics form that class of curves which are obtained when a double cone is intersected by a plane. There are three main types: the ellipse, the parabola and the hyperbola. From the ellipse we obtain the circle as a special case, and from the hyperbola we obtain the rectangular hyperbola as a special case. These curves are illustrated in Figures 1 and 2.
Circle: obtained by intersection of a plane perpendicular to the cone-axis with the cone. (A degenerate case is a single point.)
As the plane of intersection tilts the other conics are obtained:
Ellipse: obtained by a plane which is not perpendicular to the cone-axis, but which cuts the cone in a closed curve. Various ellipses are obtained as the plane continues to rotate.
Figure 1: Circle and ellipse (the diagrams label the cone-axis, the generator lines and the plane of intersection)
Parabola: obtained when the plane is parallel to the generator of the cone. Different parabolas are obtained as the point P moves along a generator.
Hyperbola: obtained when the plane intersects both parts of the cone. The rectangular hyperbola is obtained when the plane is parallel to the cone-axis. (A degenerate case is two straight lines.)
Figure 2: Parabola and hyperbola (the diagrams label the cone-axis, a generator line and the point P)
The ellipse
We are all aware that the paths followed by the planets around the sun are elliptical. However, more
generally the ellipse occurs in many areas of engineering. The standard form of an ellipse is shown
in Figure 3.
Figure 3: the ellipse in standard form, showing the major and minor axes, the intercepts ±a on the x-axis and ±b on the y-axis, the foci at x = ±ae and the directrices at x = ±a/e
If a > b (as in Figure 3) then the x-axis is called the major-axis and the y-axis is called the minor-axis. On the other hand if b > a then the y-axis is called the major-axis and the x-axis is then the minor-axis. Two points inside the ellipse are of importance; these are the foci. If a > b these are located at coordinate positions ±ae (or at ±be if b > a) on the major-axis, with e, called the eccentricity, given by
$$e^2 = 1 - \frac{b^2}{a^2} \quad (b < a) \qquad \text{or by} \qquad e^2 = 1 - \frac{a^2}{b^2} \quad (a < b)$$
The foci of an ellipse have the property that if light rays are emitted from one focus then on reflection at the elliptic curve they pass through the other focus.
Key Point 1
The standard Cartesian equation of the ellipse with its centre at the origin is
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
This ellipse has intercepts on the x-axis at x = ±a and on the y-axis at y = ±b. The curve is also symmetrical about both axes. The curve reduces to a circle in the special case in which a = b.
Example 1
(a) Sketch the ellipse $\dfrac{x^2}{4} + \dfrac{y^2}{9} = 1$ (b) Find the eccentricity e
(c) Locate the positions of the foci.
Solution
(a) We can calculate the values of y as x changes from 0 to 2:
x    0      0.30   0.60   0.90   1.20   1.50   1.80   2
y    3      2.97   2.86   2.68   2.40   1.98   1.31   0
From this table of values, and using the symmetry of the curve, a sketch can be drawn (see Figure 4). Here b = 3 and a = 2 so the y-axis is the major axis and the x-axis is the minor axis.
(b) $e^2 = 1 - a^2/b^2 = 1 - 4/9 = 5/9$ ∴ $e = \sqrt{5}/3$
(c) Since b > a and $be = \sqrt{5}$, the foci are located at $\pm\sqrt{5}$ on the y-axis.
Figure 4: the ellipse of Example 1, with intercepts ±2 on the x-axis and ±3 on the y-axis and foci at $(0, \pm\sqrt{5})$
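The calculations of Example 1 can be repeated for any a and b. The sketch below assumes an ellipse centred at the origin with axes along the coordinate axes; `ellipse_summary` is a hypothetical helper name introduced here, not notation from the Workbook.

```python
import math

def ellipse_summary(a, b):
    """Eccentricity and foci of x^2/a^2 + y^2/b^2 = 1 (centre at the origin)."""
    major, minor = max(a, b), min(a, b)
    e = math.sqrt(1 - (minor / major) ** 2)   # e^2 = 1 - (minor/major)^2
    c = major * e                              # distance of each focus from the centre
    # foci lie on the major axis: the x-axis if a > b, otherwise the y-axis
    foci = [(-c, 0.0), (c, 0.0)] if a > b else [(0.0, -c), (0.0, c)]
    return e, foci

a, b = 2, 3
e, foci = ellipse_summary(a, b)

# table of values for the upper-right quarter of the curve
table = [(x, round(b * math.sqrt(1 - x**2 / a**2), 2))
         for x in (0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2)]

print("eccentricity:", e)
print("foci:", foci)
print("table:", table)
```

With a = 2 and b = 3 this reproduces the table of values, the eccentricity $\sqrt{5}/3$ and the foci on the y-axis found in the Solution.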
Key Point 1 gives the equation of the ellipse with its centre at the origin. If the centre of the ellipse has coordinates (α, β) and still has its axes parallel to the x- and y-axes the standard equation becomes
$$\frac{(x - \alpha)^2}{a^2} + \frac{(y - \beta)^2}{b^2} = 1.$$
Task
Consider the points A and B with Cartesian coordinates (c, 0) and (−c, 0) respectively. A curve has the property that for every point P on it the sum of the distances PA and PB is a constant (which we will call 2a). Derive the Cartesian form of the equation of the curve and show that it is an ellipse.
[Figure: the point P(x, y), with B and A on the x-axis at distance c either side of the origin O]
Your solution
Answer
We use Pythagoras's theorem to work out the distances PA and PB:
Let $R_1 = PB = [(x + c)^2 + y^2]^{1/2}$ and let $R_2 = PA = [(c - x)^2 + y^2]^{1/2}$
We now take the given equation $R_1 + R_2 = 2a$ and multiply both sides by $R_1 - R_2$. The quantity $R_1^2 - R_2^2$ on the left is calculated to be 4cx, and $2a(R_1 - R_2)$ is on the right. We thus obtain a pair of equations:
$$R_1 + R_2 = 2a \qquad \text{and} \qquad R_1 - R_2 = \frac{2cx}{a}$$
Adding these equations together gives $R_1 = a + \dfrac{cx}{a}$ and squaring this equation gives
$$x^2 + c^2 + 2cx + y^2 = a^2 + \frac{c^2x^2}{a^2} + 2cx$$
Simplifying:
$$x^2\left(1 - \frac{c^2}{a^2}\right) + y^2 = a^2 - c^2 \qquad \text{whence} \qquad \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1$$
This is the standard equation of an ellipse if we set $b^2 = a^2 - c^2$, which is the traditional equation which relates the two semi-axis lengths a and b to the distance c of the foci from the centre of the ellipse.
The foci A and B have optical properties; a beam of light travelling from A along AP and undergoing a mirror reflection from the ellipse at P will return along the path PB to the other focus B.
The circle
The circle is a special case of the ellipse; it occurs when a = b = r so the equation becomes
$$\frac{x^2}{r^2} + \frac{y^2}{r^2} = 1 \qquad \text{or, more commonly} \qquad x^2 + y^2 = r^2$$
Here, the centre of the circle is located at the origin (0, 0) and the radius of the circle is r. If the centre of the circle is at a point (α, β) then the equation takes the form:
$$(x - \alpha)^2 + (y - \beta)^2 = r^2$$
Key Point 2
The equation of a circle with centre at (α, β) and radius r is $(x - \alpha)^2 + (y - \beta)^2 = r^2$
Task
Write down the equations of the five circles (A to E) below:
[Figure: the five circles A to E drawn on x-y axes, with their centres and radii marked]
Your solution
Answer
A $(x - 1)^2 + (y - 1)^2 = 1$
B $(x - 3)^2 + (y - 1)^2 = 1$
C $(x + 0.5)^2 + (y + 2)^2 = 1$
D $(x - 2)^2 + (y + 2)^2 = (0.5)^2$
E $(x + 0.5)^2 + (y - 2.5)^2 = 1$
Example 2
Show that the expression
$$x^2 + y^2 - 2x + 6y + 6 = 0$$
represents the equation of a circle. Find its centre and radius.
Solution
We shall see later how to recognise this as the equation of a circle simply by examination of the coefficients of the quadratic terms $x^2$, $y^2$ and xy. However, in the present example we will use the process of completing the square, for x and for y, to show that the expression can be written in standard form.
Now $x^2 + y^2 - 2x + 6y + 6 \equiv x^2 - 2x + y^2 + 6y + 6$.
Also,
$$x^2 - 2x \equiv (x - 1)^2 - 1 \qquad \text{and} \qquad y^2 + 6y \equiv (y + 3)^2 - 9.$$
Hence we can write
$$x^2 + y^2 - 2x + 6y + 6 \equiv (x - 1)^2 - 1 + (y + 3)^2 - 9 + 6 = 0$$
or, taking the free constants to the right-hand side:
$$(x - 1)^2 + (y + 3)^2 = 4.$$
By comparing this with the standard form we conclude this represents the equation of a circle with centre at (1, −3) and radius 2.
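Completing the square as in Example 2 follows the same pattern for every circle, so it can be sketched as a short function. The sketch assumes the equation has first been divided through so that the coefficients of $x^2$ and $y^2$ are both 1; the name `circle_from_general` and the parameter names are illustrative only.

```python
import math

def circle_from_general(D, E, F):
    """Centre and radius of x^2 + y^2 + D x + E y + F = 0 by completing the square."""
    cx, cy = -D / 2, -E / 2            # (x - cx)^2 + (y - cy)^2 = cx^2 + cy^2 - F
    r_squared = cx**2 + cy**2 - F
    if r_squared <= 0:
        raise ValueError("the equation does not describe a circle of positive radius")
    return (cx, cy), math.sqrt(r_squared)

# Example 2: x^2 + y^2 - 2x + 6y + 6 = 0
centre, radius = circle_from_general(-2, 6, 6)
print(centre, radius)
```

For an equation such as $2x^2 + 2y^2 + 4x + 1 = 0$, divide by 2 first and call `circle_from_general(2, 0, 0.5)`.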
Task
Find the centre and radius of each of the following circles:
(a) $x^2 + y^2 - 4x - 6y = -12$ (b) $2x^2 + 2y^2 + 4x + 1 = 0$
Your solution
Answer
(a) centre: (2, 3) radius 1 (b) centre: (−1, 0) radius $\sqrt{2}/2$.
A circle-cutting machine
Introduction
A cutting machine creates circular holes in a piece of sheet-metal by starting at the centre of the
circle and cutting its way outwards until a hole of the correct radius exists. However, prior to cutting,
the circle is characterised by three points on its circumference, rather than by its centre and radius.
Therefore, it is necessary to be able to find the centre and radius of a circle given three points that
it passes through.
Problem in words
Given three points on the circumference of a circle, find its centre and radius
(a) for three general points
(b) (i) for (−6, 5), (−3, 6) and (2, 1) (ii) for (−0.7, 0.6), (5.9, 1.4) and (0.8, −2.8)
where coordinates are in cm.
A circle passes through the three points. Find the centre $(x_0, y_0)$ and radius R of this circle when the three circumferential points are
(a) $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$
(b) (i) (−6, 5), (−3, 6) and (2, 1)
(ii) (−0.7, 0.6), (5.9, 1.4) and (0.8, −2.8)
Measurements are in centimetres; give answers correct to 2 decimal places.
(a) The equation of a circle with centre at $(x_0, y_0)$ and radius R is
$$(x - x_0)^2 + (y - y_0)^2 = R^2$$
and, if this passes through the 3 points $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ then
$$(x_1 - x_0)^2 + (y_1 - y_0)^2 = R^2 \qquad (1)$$
$$(x_2 - x_0)^2 + (y_2 - y_0)^2 = R^2 \qquad (2)$$
$$(x_3 - x_0)^2 + (y_3 - y_0)^2 = R^2 \qquad (3)$$
Eliminating the $R^2$ term between (1) and (2) gives
$$(x_1 - x_0)^2 + (y_1 - y_0)^2 = (x_2 - x_0)^2 + (y_2 - y_0)^2$$
so that
$$x_1^2 - 2x_0x_1 + y_1^2 - 2y_0y_1 = x_2^2 - 2x_0x_2 + y_2^2 - 2y_0y_2 \qquad (4)$$
Similarly, eliminating $R^2$ between (1) and (3) gives
$$x_1^2 - 2x_0x_1 + y_1^2 - 2y_0y_1 = x_3^2 - 2x_0x_3 + y_3^2 - 2y_0y_3 \qquad (5)$$
Re-arranging (4) and (5) gives a system of two equations in $x_0$ and $y_0$.
$$2(x_2 - x_1)x_0 + 2(y_2 - y_1)y_0 = x_2^2 + y_2^2 - x_1^2 - y_1^2 \qquad (6)$$
$$2(x_3 - x_1)x_0 + 2(y_3 - y_1)y_0 = x_3^2 + y_3^2 - x_1^2 - y_1^2 \qquad (7)$$
Multiplying (6) by $(y_3 - y_1)$, and multiplying (7) by $(y_2 - y_1)$, subtracting and re-arranging gives
$$x_0 = \frac{1}{2}\left(\frac{(y_3 - y_1)(x_2^2 + y_2^2) + (y_1 - y_2)(x_3^2 + y_3^2) + (y_2 - y_3)(x_1^2 + y_1^2)}{x_2y_3 - x_3y_2 + x_3y_1 - x_1y_3 + x_1y_2 - x_2y_1}\right) \qquad (8)$$
while a similar procedure gives
$$y_0 = \frac{1}{2}\left(\frac{(x_1 - x_3)(x_2^2 + y_2^2) + (x_2 - x_1)(x_3^2 + y_3^2) + (x_3 - x_2)(x_1^2 + y_1^2)}{x_2y_3 - x_3y_2 + x_3y_1 - x_1y_3 + x_1y_2 - x_2y_1}\right) \qquad (9)$$
Knowing $x_0$ and $y_0$, the radius R can be found from
$$R = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2} \qquad (10)$$
(or alternatively using $x_2$ and $y_2$ (or $x_3$ and $y_3$) as appropriate).
Equations (8), (9) and (10) can now be used to analyse the two particular circles above.
(i) Here $x_1 = -6$ cm, $y_1 = 5$ cm, $x_2 = -3$ cm, $y_2 = 6$ cm, $x_3 = 2$ cm and $y_3 = 1$ cm, so that
$$x_2y_3 - x_3y_2 + x_3y_1 - x_1y_3 + x_1y_2 - x_2y_1 = -3 - 12 + 10 + 6 - 36 + 15 = -20$$
and
$$x_1^2 + y_1^2 = 61 \qquad x_2^2 + y_2^2 = 45 \qquad x_3^2 + y_3^2 = 5$$
From (8)
$$x_0 = \frac{1}{2}\left(\frac{-4 \times 45 + (-1) \times 5 + 5 \times 61}{-20}\right) = \frac{-180 - 5 + 305}{-40} = -3$$
while (9) gives
$$y_0 = \frac{1}{2}\left(\frac{-8 \times 45 + 3 \times 5 + 5 \times 61}{-20}\right) = \frac{-360 + 15 + 305}{-40} = 1$$
The radius can be found from (10)
$$R = \sqrt{(-6 - (-3))^2 + (5 - 1)^2} = \sqrt{25} = 5$$
so that the circle has centre at (−3, 1) and a radius of 5 cm.
(ii) Now $x_1 = -0.7$ cm, $y_1 = 0.6$ cm, $x_2 = 5.9$ cm, $y_2 = 1.4$ cm, $x_3 = 0.8$ cm and $y_3 = -2.8$ cm, so that
$$x_2y_3 - x_3y_2 + x_3y_1 - x_1y_3 + x_1y_2 - x_2y_1 = -16.52 - 1.12 + 0.48 - 1.96 - 0.98 - 3.54 = -23.64$$
and
$$x_1^2 + y_1^2 = 0.85 \qquad x_2^2 + y_2^2 = 36.77 \qquad x_3^2 + y_3^2 = 8.48$$
so from (8)
$$x_0 = \frac{1}{2}\left(\frac{-125.018 - 6.784 + 3.57}{-23.64}\right) = \frac{-128.232}{-47.28} = 2.7121827$$
and from (9)
$$y_0 = \frac{1}{2}\left(\frac{-55.155 + 55.968 - 4.335}{-23.64}\right) = \frac{-3.522}{-47.28} = 0.0744924$$
and from (10)
$$R = \sqrt{(-0.7 - 2.7121827)^2 + (0.6 - 0.0744924)^2} = \sqrt{11.9191490} = 3.4524121$$
so that, to 2 d.p., the circle has centre at (2.71, 0.07) and a radius of 3.45 cm.
Mathematical comment
Note that the expression
$$x_2y_3 - x_3y_2 + x_3y_1 - x_1y_3 + x_1y_2 - x_2y_1$$
appears in the denominator for both $x_0$ and $y_0$. If this expression is equal to zero, the calculation will break down. Geometrically, this corresponds to the three points being in a straight line so that no circle can be drawn, or not all points being distinct so no unique circle is defined.
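Equations (8), (9) and (10) translate directly into a short routine. The following is a sketch only: the function name `circle_through` and the tolerance used to detect a zero denominator are assumptions made for illustration.

```python
import math

def circle_through(p1, p2, p3):
    """Centre (x0, y0) and radius R of the circle through three points, using (8)-(10)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    s1, s2, s3 = x1**2 + y1**2, x2**2 + y2**2, x3**2 + y3**2
    denom = x2 * y3 - x3 * y2 + x3 * y1 - x1 * y3 + x1 * y2 - x2 * y1
    if math.isclose(denom, 0.0, abs_tol=1e-12):
        # the case discussed in the Mathematical comment
        raise ValueError("points are collinear or not distinct")
    x0 = ((y3 - y1) * s2 + (y1 - y2) * s3 + (y2 - y3) * s1) / (2 * denom)   # equation (8)
    y0 = ((x1 - x3) * s2 + (x2 - x1) * s3 + (x3 - x2) * s1) / (2 * denom)   # equation (9)
    R = math.hypot(x1 - x0, y1 - y0)                                          # equation (10)
    return (x0, y0), R

centre_i, R_i = circle_through((-6, 5), (-3, 6), (2, 1))
centre_ii, R_ii = circle_through((-0.7, 0.6), (5.9, 1.4), (0.8, -2.8))
print(centre_i, R_i)
print(centre_ii, R_ii)
```

The two calls correspond to cases (i) and (ii) above and can be compared with the hand calculations.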
The web-flange junction
Introduction
In problems of torsion, the torsion constant, J, which is a function of the shape and structure of the element under consideration, is an important quantity.
A common beam section is the thick I-section shown here, for which the torsion constant is given by
$$J = 2J_1 + J_2 + 2\alpha D^4$$
where the $J_1$ and $J_2$ terms refer to the flanges and web respectively, and the $D^4$ term refers to the web-flange junction. In fact
$$\alpha = \min\left\{\frac{t_f}{t_w}, \frac{t_w}{t_f}\right\}\left(0.15 + 0.1\frac{r}{t_f}\right)$$
where $t_f$ and $t_w$ are the thicknesses of the flange and web respectively, and r is the radius of the concave circle element between the flange and the web. D is the diameter of the circle of the web-flange junction.
[Figure: the I-section, with the two flanges and the web labelled]
[Figure: the web-flange junction, showing the circle of diameter D, the flange thickness $t_f$, the web thickness $t_w$ and the radius r]
As D occurs in the form $D^4$, the torsion constant is very sensitive to it. Calculation of D is therefore a crucial part of the calculation of J.
Problem in words
Find D, the diameter of the circle within the web-flange junction as a function of the other dimensions of the structural element.
(a) Find D, the diameter of the circle, in terms of $t_f$ and $t_w$ (the thicknesses of the flange and the web respectively) in the case where r = 0. When $t_f = 3$ cm and $t_w = 2$ cm, find D.
(b) For r ≠ 0, find D in terms of $t_f$, $t_w$ and r. In the special case where $t_f = 3$ cm, $t_w = 2$ cm and r = 0.4 cm, find D.
(a) Consider a co-ordinate system based on the midpoint of the outer surface of the flange.
[Figure: the circle of radius R touching the outer surface of the flange at the origin and passing through the corner point A]
The centre of the circle will lie at (0, −R) where R is the radius of the circle, i.e. R = D/2.
The equation of the circle is
$$x^2 + (y + R)^2 = R^2 \qquad (1)$$
In addition, the circle passes through the 'corner' at point A $(t_w/2, -t_f)$, so
$$\left(\frac{t_w}{2}\right)^2 + (-t_f + R)^2 = R^2 \qquad (2)$$
On expanding
$$\frac{t_w^2}{4} + t_f^2 - 2Rt_f + R^2 = R^2$$
giving
$$2Rt_f = \frac{t_w^2}{4} + t_f^2 \quad \Rightarrow \quad R = \frac{(t_w^2/4) + t_f^2}{2t_f} = \frac{t_w^2}{8t_f} + \frac{t_f}{2}$$
so that
$$D = 2R = \frac{t_w^2}{4t_f} + t_f \qquad (3)$$
Setting $t_f = 3$ cm, $t_w = 2$ cm gives
$$D = \frac{2^2}{4 \times 3} + 3 = 3.33 \text{ cm}$$
(b) Again using a co-ordinate system based on the mid-point of the outer surface of the flange, consider now the case r ≠ 0.
[Figure: the circle of radius R touching the outer surface of the flange at the origin, with the centre B of the concave circle element of radius r]
Point B $(t_w/2 + r, -t_f - r)$ lies, not on the circle described by (1), but on the slightly larger circle with the same centre, and radius R + r. The equation of this circle is
$$x^2 + (y + R)^2 = (R + r)^2 \qquad (4)$$
Putting the co-ordinates of point B into equation (4) gives
$$\left(\frac{t_w}{2} + r\right)^2 + (-t_f - r + R)^2 = (R + r)^2 \qquad (5)$$
which, on expanding gives
$$\frac{t_w^2}{4} + t_wr + r^2 + t_f^2 + r^2 + R^2 + 2t_fr - 2t_fR - 2rR = R^2 + 2rR + r^2$$
Cancelling and gathering terms gives
$$\frac{t_w^2}{4} + t_wr + r^2 + t_f^2 + 2t_fr = 4rR + 2t_fR = 2R(2r + t_f)$$
so that
$$2R = D = \frac{(t_w^2/4) + t_wr + r^2 + t_f^2 + 2t_fr}{2r + t_f}$$
so
$$D = \frac{t_w^2 + 4t_wr + 4r^2 + 4t_f^2 + 8t_fr}{8r + 4t_f} \qquad (6)$$
Now putting $t_f = 3$ cm, $t_w = 2$ cm and r = 0.4 cm makes
$$D = \frac{2^2 + (4 \times 2 \times 0.4) + (4 \times 0.4^2) + (4 \times 3^2) + (8 \times 3 \times 0.4)}{(8 \times 0.4) + (4 \times 3)} = \frac{53.44}{15.2} = 3.52 \text{ cm}$$
Interpretation
Note that setting r = 0 in Equation (6) recovers the special case of r = 0 given by equation (3).
The value of D is now available to be used in calculations of the torsion constant, J.
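Equation (6) is easy to evaluate for other section dimensions. A minimal sketch follows, assuming all lengths are given in the same unit; the function name `junction_diameter` is a hypothetical label, not notation from the Workbook.

```python
def junction_diameter(t_f, t_w, r=0.0):
    """Diameter D of the web-flange junction circle, from equation (6).

    t_f: flange thickness, t_w: web thickness, r: radius of the concave
    circle element (r = 0 reduces to equation (3)).
    """
    numerator = t_w**2 + 4 * t_w * r + 4 * r**2 + 4 * t_f**2 + 8 * t_f * r
    return numerator / (8 * r + 4 * t_f)

D_sharp = junction_diameter(3.0, 2.0)          # case (a), r = 0
D_fillet = junction_diameter(3.0, 2.0, 0.4)    # case (b), r = 0.4 cm
print(f"{D_sharp:.2f} cm, {D_fillet:.2f} cm")
```

Because J depends on $D^4$, even the modest change in D between the two cases alters the junction term of J noticeably.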
The parabola
The standard form of the parabola is shown in Figure 5. Here the x-axis is the line of symmetry of
the parabola.
Figure 5: the parabola in standard form, with focus at (a, 0) and directrix x = −a
Key Point 3
The standard equation of the parabola with focus at (a, 0) is
$$y^2 = 4ax$$
It can be shown that light rays parallel to the x-axis will, on reflection from the parabolic curve, come
together at the focus. This is an important property and is used in the construction of some kinds
of telescopes, satellite dishes and car headlights.
Task
Sketch the curve $y^2 = 8x$. Find the position of the focus and confirm its light-focusing property.
Your solution
Answer
This is a standard parabola ($y^2 = 4ax$) with a = 2. Thus the focus is located at coordinate position (2, 0).
[Figure: the parabola $y^2 = 8x$ with its focus at (2, 0); a ray parallel to the x-axis makes equal angles θ with the tangent before and after reflection]
If your sketch is sufficiently accurate you should find that light-rays (lines) parallel to the x-axis when reflected off the parabolic surface pass through the focus. (Draw a tangent at the point of reflection and ensure that the angle of incidence (θ say) is the same as the angle of reflection.)
By changing the equation of the parabola slightly we can change the position of the parabola along the x-axis. See Figure 6.
Figure 6: Parabola $y^2 = 4a(x - b)$ with vertex at x = b (the curves shown are $y^2 = 4a(x + 1)$, $y^2 = 4ax$ and $y^2 = 4a(x - 3)$, with vertices at x = −1, 0 and 3)

We can also have parabolas where the y-axis is the line of symmetry (see Figure 7). In this case the standard equation is
$$x^2 = 4ay \qquad \text{or} \qquad y = \frac{x^2}{4a}$$
Figure 7: a parabola symmetrical about the y-axis, with focus at (0, a)
Task
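Sketch the curves $y^2 = x$ and $x^2 = 2(y - 3)$.
Your solution
Answer
[Figure: the parabola $y^2 = x$, symmetrical about the x-axis with vertex at the origin, and the parabola $x^2 = 2(y - 3)$, symmetrical about the y-axis with vertex at (0, 3)]
The focus of the parabola $y^2 = 4a(x - b)$ is located at coordinate position (a + b, 0). Changing the value of a changes the convexity of the parabola (see Figure 8).
Figure 8: the parabolas $y^2 = 3x$, $y^2 = 2x$ and $y^2 = x$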
The hyperbola
The standard form of the hyperbola is shown in Figure 9(a).
This has standard equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
The eccentricity, e, is defined by
$$e^2 = 1 + \frac{b^2}{a^2} \qquad (e > 1)$$
Figure 9: (a) the hyperbola in standard form, crossing the x-axis at x = ±a, with foci at x = ±ae and its asymptotes; (b) reflection of light at the hyperbola
Note the change in sign compared to the equivalent expressions for the ellipse. The lines $y = \pm\dfrac{b}{a}x$ are asymptotes to the hyperbola (these are the lines which each branch of the hyperbola approaches as x → ±∞).
If light is emitted from one focus then on hitting the hyperbolic curve it is reflected in such a way as to appear to be coming from the other focus. See Figure 9(b). The hyperbola has fewer uses in applications than the other conic sections and so we will not dwell here on its properties.
Key Point 4
The standard equation of the hyperbola with foci at (±ae, 0) is
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \qquad \text{with eccentricity } e \text{ given by} \qquad e^2 = 1 + \frac{b^2}{a^2} \quad (e > 1)$$
General conics
The conics we have considered above - the ellipse, the parabola and the hyperbola - have all been presented in standard form: their axes are parallel to either the x- or y-axis. However, conics may be rotated to any angle with respect to the axes: they clearly remain conics, but what equations do they have?
It can be shown that the equation of any conic can be described by the quadratic expression
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$$
where A, B, C, D, E, F are constants.
If not all of A, B, C are zero (and F is a suitable number) the graph of this equation is
(i) an ellipse if $B^2 < 4AC$ (circle if A = C and B = 0)
(ii) a parabola if $B^2 = 4AC$
(iii) a hyperbola if $B^2 > 4AC$
Example 3
Classify each of the following equations as ellipse, parabola or hyperbola:
(a) $x^2 + 2xy + 3y^2 + x - 1 = 0$
(b) $x^2 + 2xy + y^2 - 3y + 7 = 0$
(c) $2x^2 + xy + 2y^2 - 2x + 3y = 6$
(d) $3x^2 + 2x - 5y + 3y^2 - 10 = 0$
Solution
(a) Here A = 1, B = 2, C = 3 ∴ $B^2 < 4AC$. This is an ellipse.
(b) Here A = 1, B = 2, C = 1 ∴ $B^2 = 4AC$. This is a parabola.
(c) Here A = 2, B = 1, C = 2 ∴ $B^2 < 4AC$; also A = C but B ≠ 0. This is an ellipse.
(d) Here A = 3, B = 0, C = 3 ∴ $B^2 < 4AC$. Also A = C and B = 0. This is a circle.
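The classification rule can be written as a small function. This is a sketch under the same proviso as the text: it looks only at A, B and C, so it assumes the remaining coefficients are such that the conic is not degenerate. The name `classify_conic` is illustrative.

```python
def classify_conic(A, B, C):
    """Classify A x^2 + B xy + C y^2 + D x + E y + F = 0 from its quadratic terms only."""
    if A == 0 and B == 0 and C == 0:
        raise ValueError("no quadratic terms: the equation is not a conic")
    discriminant = B * B - 4 * A * C
    if discriminant < 0:
        # a circle is the special ellipse with A = C and no xy term
        return "circle" if (A == C and B == 0) else "ellipse"
    if discriminant == 0:
        return "parabola"
    return "hyperbola"

# the four equations of Example 3, as (A, B, C)
example_3 = {"a": (1, 2, 3), "b": (1, 2, 1), "c": (2, 1, 2), "d": (3, 0, 3)}
for label, coefficients in example_3.items():
    print(label, classify_conic(*coefficients))
```

Exact equality tests on the discriminant are appropriate for integer coefficients like these; for measured (floating-point) coefficients a tolerance would be needed.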
Task
Classify each of the following conics:
(a) $x^2 - 2xy - 3y^2 + x - 1 = 0$
(b) $2x^2 + xy - y^2 - 2x + 3y = 0$
(c) $4x^2 - y + 3 = 0$
(d) $-x^2 - xy - y^2 + 3x = 0$
(e) $2x^2 + 2y^2 - x + 3y = 7$
Your solution
Answer
(a) A = 1, B = −2, C = −3, $B^2 > 4AC$ ∴ hyperbola
(b) A = 2, B = 1, C = −1, $B^2 > 4AC$ ∴ hyperbola
(c) A = 4, B = 0, C = 0, $B^2 = 4AC$ ∴ parabola
(d) A = −1, B = −1, C = −1, $B^2 < 4AC$, A = C, B ≠ 0 ∴ ellipse
(e) A = 2, B = 0, C = 2, $B^2 < 4AC$, A = C and B = 0 ∴ circle
Exercises
1. The equation $9x^2 + 4y^2 - 36x + 24y - 1 = 0$ represents an ellipse. Find its centre, the semi-major and semi-minor axes and the coordinate positions of the foci.
2. Find the equation of a circle of radius 3 which has its centre at (−1, 2.2)
3. Find the centre and radius of the circle $x^2 + y^2 - 2x - 2y - 5 = 0$
4. Find the position of the focus of the parabola $y^2 - x + 3 = 0$
5. Classify each of the following conics
(a) $x^2 + 2x - y - 3 = 0$
(b) $8x^2 + 12xy + 17y^2 - 20 = 0$
(c) $x^2 + xy - 1 = 0$
(d) $4x^2 - y^2 - 4y = 0$
(e) $6x^2 + 9y^2 - 24x - 54y + 51 = 0$
6. An asteroid has an elliptical orbit around the Sun. The major axis is of length $5 \times 10^8$ km. If the distance between the foci is $4 \times 10^8$ km find the equation of the orbit.
Answers
1. centre: (2, −3), semi-major 3, semi-minor 2, foci: $(2, -3 \pm \sqrt{5})$
2. $(x + 1)^2 + (y - 2.2)^2 = 9$
3. centre: (1, 1) radius $\sqrt{7}$
4. $y^2 = (x - 3)$ ∴ 4a = 1 so a = 1/4, and b = 3. Hence focus is at coordinate position (13/4, 0).
5. (a) parabola with vertex (−1, −4)
(b) ellipse
(c) hyperbola
(d) hyperbola
(e) ellipse with centre (2, 3)
6. $9x^2 + 25y^2 = 5.625 \times 10^{17}$
Polar Coordinates 17.2
Introduction
In this Section we extend the use of polar coordinates. These were first introduced in 2.8. They
were also used in the discussion on complex numbers in 10.2. We shall examine the application
of polars to the description of curves, particularly conics. Some curves, spirals for example, which
are very difficult to describe in terms of Cartesian coordinates (x, y) are relatively easily defined in
polars [r, θ].
Prerequisites
Before starting this Section you should . . .
• be familiar with Cartesian coordinates
• be familiar with trigonometric functions and how to manipulate them
• be able to simplify algebraic expressions and manipulate algebraic fractions
Learning Outcomes
On completion you should be able to . . .
• understand how Cartesian coordinates and polar coordinates are related
• find the polar form of a curve given in Cartesian form
• recognise some conics given in polar form
1. Polar Coordinates
In this Section we consider the application of polar coordinates to the description of curves; in
particular, to conics.
If the Cartesian coordinates of a point P are (x, y) then P can be located on a Cartesian plane as
indicated in Figure 10.
Figure 10: the point P located (a) by its Cartesian coordinates (x, y) and (b) by its distance from the origin O and the angle θ
However, the same point P can be located by using polar coordinates r, θ where r is the distance
of P from the origin and θ is the angle, measured anti-clockwise, that the line OP makes when
measured from the positive x-direction. See Figure 10(b). In this Section we shall denote the polar
coordinates of a point by using square brackets.
From Figure 10 it is clear that Cartesian and polar coordinates are directly related. The relations are
noted in the following Key Point.
Key Point 5
If (x, y) are the Cartesian coordinates and [r, θ] the polar coordinates of a point P then
$$x = r\cos\theta \qquad y = r\sin\theta$$
and, equivalently,
$$r = +\sqrt{x^2 + y^2} \qquad \tan\theta = \frac{y}{x}$$
From these relations we see that it is a straightforward matter to calculate (x, y) given [r, θ]. However,
some care is needed (particularly with the determination of θ) if we want to calculate [r, θ] from (x, y).
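The care needed with θ arises because tan θ = y/x does not by itself identify the quadrant. One common way to handle this in code is the two-argument arctangent, sketched below; the function names are illustrative, and returning θ in the range 0 ≤ θ < 2π is a choice made here, not a requirement of the text.

```python
import math

def to_cartesian(r, theta):
    """(x, y) from polar coordinates [r, theta], theta in radians."""
    return r * math.cos(theta), r * math.sin(theta)

def to_polar(x, y):
    """[r, theta] from (x, y), with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    # atan2 uses the signs of both x and y, so the quadrant is correct
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta

r1, theta1 = to_polar(1, -1)               # fourth quadrant
r2, theta2 = to_polar(-1, math.sqrt(3))    # second quadrant
print(r1, theta1 / math.pi)
print(r2, theta2 / math.pi)
```

A plain `math.atan(y / x)` would return the same angle for (1, −1) and (−1, 1), and would fail when x = 0, which is the pitfall the text warns about.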
Example 4
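On a Cartesian plane locate points P, Q, R, S which have their locations specified by polar coordinates $[2, \frac{\pi}{2}]$, $[2, \frac{3\pi}{2}]$, $[3, \frac{\pi}{6}]$, $[\sqrt{2}, \pi]$ respectively.
Solution
Figure 11: P at distance 2 along the positive y-axis, Q at distance 2 along the negative y-axis, R at distance 3 at 30° to the positive x-axis, and S at distance $\sqrt{2}$ along the negative x-axis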
Task
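Two points P, Q have polar coordinates $[3, \frac{\pi}{3}]$ and $[2, \frac{5\pi}{6}]$ respectively. By locating these points on a Cartesian plane find their equivalent Cartesian coordinates.
Your solution
Answer
$$P: \left(3\cos\frac{\pi}{3},\ 3\sin\frac{\pi}{3}\right) \equiv \left(\frac{3}{2},\ \frac{3\sqrt{3}}{2}\right)$$
$$Q: \left(-2\cos\frac{\pi}{6},\ 2\sin\frac{\pi}{6}\right) \equiv \left(\frac{-2\sqrt{3}}{2},\ 1\right)$$
[Figure: P at distance 3 from the origin at angle π/3 to the positive x-axis; Q at distance 2 from the origin at angle π/6 to the negative x-axis]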
The polar coordinates of a point are not unique. So, the polar coordinates [a, θ] and [a, φ] represent the same point in the Cartesian plane provided θ and φ differ by an integer multiple of 2π. See Figure 12.
Figure 12: the same point P, at distance a from the origin, located by the angles θ, θ + 2π and θ + 2kπ
For example, the polar coordinates $[2, \frac{\pi}{3}]$, $[2, \frac{7\pi}{3}]$, $[2, -\frac{5\pi}{3}]$ all represent the same point in the Cartesian plane.
Key Point 6
By convention, we measure the positive angle θ in an anti-clockwise direction.
The angle −φ is interpreted as the angle φ measured in a clockwise direction.
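Figure 13: a positive angle θ measured anti-clockwise from the positive x-axis, and an angle φ measured clockwise from it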
Exercises
1. The Cartesian coordinates of P, Q are (1, −1) and $(-1, \sqrt{3})$. What are their equivalent polar coordinates?
2. Locate the points P, Q, R with polar coordinates $[1, \frac{\pi}{3}]$, $[2, \frac{7\pi}{3}]$, $[2, \frac{10\pi}{3}]$. What do you notice?
Answer
1. $(1, -1) \to [\sqrt{2}, 7\pi/4]$ \qquad $(-1, \sqrt{3}) \to [2, 2\pi/3]$
2. All these points lie on a straight line through the origin.
2. Simple curves in polar coordinates

We are used to describing the equations of curves in Cartesian variables x, y. Thus $x^2 + y^2 = 1$ represents a circle, centre the origin, and of radius 1, and $y = 2x^2$ is the equation of a parabola whose axis is the y-axis and with vertex located at the origin. (In colloquial terms the vertex is the 'sharp end' of a conic.) We can convert these equations into polar form by using the relations x = r cos θ, y = r sin θ.
Example 5
Find the polar coordinate form of
(a) the circle $x^2 + y^2 = 1$ (b) the parabola $y = 2x^2$.
Solution
(a) Using x = r cos θ, y = r sin θ in the expression $x^2 + y^2 = 1$ we have
$$(r\cos\theta)^2 + (r\sin\theta)^2 = 1 \qquad \text{or} \qquad r^2(\cos^2\theta + \sin^2\theta) = 1$$
giving $r^2 = 1$. We simplify this to r = 1 (since r = −1 is invalid being a negative distance). Of course we might have guessed this answer since the relation r = 1 states that every point on the curve is a constant distance 1 away from the origin.
(b) Repeating the approach used in (a) for $y = 2x^2$ we obtain:
$$r\sin\theta = 2(r\cos\theta)^2 \qquad \text{i.e.} \qquad r\sin\theta - 2r^2\cos^2\theta = 0$$
Therefore $r(\sin\theta - 2r\cos^2\theta) = 0$. Either r = 0 (which is a single point, the origin, and is clearly not a parabola) or
$$\sin\theta - 2r\cos^2\theta = 0 \qquad \text{giving, finally} \qquad r = \frac{1}{2}\tan\theta\sec\theta.$$
This is the polar equation of this particular parabola, $y = 2x^2$.
Task
Sketch the curves
(a) y = cos x (b) y = π/3 (c) y = x
Your solution
Answer
[Figure: (a) the cosine curve y = cos x; (b) the horizontal line y = π/3; (c) the straight line y = x through the origin]
Task
Sketch the curve r = cos θ.
First complete the table of values. Enter values to 2 d.p. and work in radians:
Your solution
π 2π 3π 4π 5π 6π
θ 0 6 6 6 6 6 6
r
Answer
θ   0      π/6    2π/6   3π/6   4π/6    5π/6    6π/6
r   1.00   0.87   0.50   0.00   −0.50   −0.87   −1.00
You will see that the values of θ for π/2 < θ < 3π/2 give rise to negative values of r (and hence
are invalid).
Now sketch the curve:
Your solution
Answer
The curve is a circle: centre (1/2, 0), radius 1/2. [Figure: the circle, passing through the origin and the point x = 1 on the x-axis.]
Task
Sketch the curve θ = π/3.
Your solution
Answer
Radial line passing through the origin at angle π/3 to the positive x-axis.
Task
Sketch the curve r = θ.
Your solution
Answer
[Figure: a spiral starting at the origin and winding outwards as θ increases.]
3. Standard conics in polar coordinates
In the previous Section we merely stated the standard equations of the conics using Cartesian coordi-
nates. Here we consider an alternative definition of a conic and use this different approach to obtain
the equations of the standard conics in polar form. Consider a straight line x = −d (this will be the
directrix of the conic) and let e be the eccentricity of the conic (e is a positive real number). It can
be shown that the set of points P in the (x, y) plane which satisfy the condition
$\dfrac{\text{distance of } P \text{ from origin}}{\text{perpendicular distance from } P \text{ to the line}} = e$
is a conic with eccentricity e. In particular, it is an ellipse if e < 1, a parabola if e = 1 and a
hyperbola if e > 1. See Figure 14.
Figure 14: the point P with polar coordinates [r, θ]; its perpendicular distance from the directrix x = −d is d + r cos θ.
We can obtain the polar coordinate form of this conic in a straightforward manner. If P has polar
coordinates [r, θ] then the relation above gives
$\dfrac{r}{d + r\cos\theta} = e$ or $r = e(d + r\cos\theta)$
Thus, solving for r: $r = \dfrac{ed}{1 - e\cos\theta}$
This is the equation of the conic.
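The defining ratio can be checked numerically for an ellipse, a parabola and a hyperbola. The sketch below is illustrative only: the function names, the value d = 2 and the sample angles (chosen so that r > 0 in every case) are assumptions.

```python
import math

def conic_r(theta, e, d):
    """Polar radius of the conic r = e*d / (1 - e*cos(theta))."""
    return e * d / (1 - e * math.cos(theta))

def focus_directrix_ratio(theta, e, d):
    """Distance from the origin divided by distance from the line x = -d."""
    r = conic_r(theta, e, d)
    x = r * math.cos(theta)
    return r / (x + d)

d = 2.0
for e in (0.5, 1.0, 2.0):                 # ellipse, parabola, hyperbola
    ratios = [focus_directrix_ratio(theta, e, d) for theta in (1.5, 2.5, 3.5)]
    print(e, ratios)                      # each ratio should reproduce e
```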
In all of these conics it can be shown that one of the foci is located at the origin. See Figure 15 in
which the pertinent details of the conics are highlighted.
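Figure 15: the three conics. For the ellipse (e < 1) the vertices are at [ed/(1 + e), π] and [ed/(1 − e), 0]; for the parabola (e = 1) the vertex is at [d/2, π]; for the hyperbola (e > 1) the vertex shown is at [ed/(1 + e), π].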
Task
Sketch the ellipse $r = \dfrac{4}{2 - \cos\theta}$ and locate the coordinates of its vertices.
Your solution
Answer
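Here
$r = \dfrac{4}{2 - \cos\theta} = \dfrac{2}{1 - \frac{1}{2}\cos\theta}$ so $e = \dfrac{1}{2}$
Then
$de = 2$, $\dfrac{de}{1+e} = \dfrac{2}{3/2} = \dfrac{4}{3}$ and $\dfrac{de}{1-e} = \dfrac{2}{1/2} = 4$
[Figure: the ellipse, crossing the x-axis at x = −4/3 and x = 4.]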
Exercises
1. Sketch the polar curves (a) $r = \dfrac{1}{1 - \cos\theta}$ (b) $r = e^{-\theta}$ (c) $r = \dfrac{6}{3 - \cos\theta}$.
2. Find the polar form of the following curves given in Cartesian form:
(a) $y^2 = 1 + 2x$ (b) 2xy = 1
3. Find the Cartesian form of the following curves given in polar form
(a) $r = \dfrac{2}{\sin\theta + 2\cos\theta}$ (b) r = 3 cos θ
Do you recognise these equations?
Answers
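1.
(a) parabola, e = 1, d = 1. [Figure: parabola with vertex at x = −1/2.]
(b) decreasing spiral
(c) $r = \dfrac{2}{1 - \frac{1}{3}\cos\theta}$, an ellipse since $e = \frac{1}{3} < 1$. Also de = 2. [Figure: ellipse crossing the x-axis at x = −3/2 and x = 3.]
2.
(a) $r^2\sin^2\theta = 1 + 2r\cos\theta$ ∴ $r = \dfrac{\cos\theta + 1}{1 - \cos^2\theta} = \dfrac{1}{1 - \cos\theta}$
(b) $2r^2\cos\theta\sin\theta = 1$ ∴ $r^2 = \operatorname{cosec} 2\theta$
3.
(a) r(sin θ + 2 cos θ) = 2 ∴ y + 2x = 2 which is a straight line
(b) r = 3 cos θ ∴ $\sqrt{x^2 + y^2} = \dfrac{3x}{\sqrt{x^2 + y^2}}$ ∴ $x^2 + y^2 = 3x$
in standard form: $\left(x - \frac{3}{2}\right)^2 + y^2 = \frac{9}{4}$, i.e. a circle, centre $\left(\frac{3}{2}, 0\right)$ with radius $\frac{3}{2}$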

Parametric Curves 17.3
Introduction
In this Section we examine yet another way of defining curves - the parametric description. We shall
see that this is, in some ways, far more useful than either the Cartesian description or the polar form.
Although we shall only study planar curves (curves lying in a plane) the parametric description can
be easily generalised to the description of spatial curves which twist and turn in three dimensional
space.
Prerequisites
• be familiar with Cartesian coordinates
• be familiar with trigonometric and hyperbolic functions and be able to manipulate them
• be able to differentiate simple functions
• be able to locate turning points and distinguish between maxima and minima.
Learning Outcomes
• sketch planar curves given in parametric form
• understand how the same curve can be described using different parameterisations
• recognise some conics given in parametric form
1. Parametric curves
Here we explore the use of a parameter t in the description of curves. We shall see that it has some
advantages over the more usual Cartesian description. We start with a simple example.
Example 6
Plot the curve
x = 2 cos t, y = 3 sin t (the parametric equations of the curve)
0 ≤ t ≤ π/2 (the parameter range)
Solution
The approach to sketching the curve is straightforward. We simply give the parameter t various
values as it ranges through 0 → π/2 and, for each value of t, calculate corresponding values of (x, y)
which are then plotted on a Cartesian xy plane. The values of t and the corresponding values of x, y
are recorded in the following table:
t   0   π/20   2π/20   3π/20   4π/20   5π/20   6π/20   7π/20   8π/20   9π/20   10π/20
x   2   1.98   1.90    1.78    1.62    1.41    1.18    0.91    0.62    0.31    0
y   0   0.47   0.93    1.36    1.76    2.12    2.43    2.67    2.85    2.96    3
Plotting the (x, y) coordinates gives the curve in Figure 16.

Figure 16: the plotted curve, running from (2, 0) at t = 0 to (0, 3) at t = π/2, with the points t = 3π/20 and t = 7π/20 marked.
The curve in Figure 16 resembles part of an ellipse. This can be verified by eliminating t from the
parametric equations to obtain an expression involving x, y only. If we divide the first parametric
equation by 2 and the second by 3, square both and add we obtain
$\left(\dfrac{x}{2}\right)^2 + \left(\dfrac{y}{3}\right)^2 = \cos^2 t + \sin^2 t \equiv 1$ i.e. $\dfrac{x^2}{4} + \dfrac{y^2}{9} = 1$
which we easily recognise as an ellipse whose major-axis is the y-axis. Also, as t ranges from 0 → π/2,
x = 2 cos t decreases from 2 → 0, and y = 3 sin t increases from 0 → 3. We conclude that the
parametric equations x = 2 cos t, y = 3 sin t together with the parametric range 0 ≤ t ≤ π/2 describe
that part of the ellipse $\dfrac{x^2}{4} + \dfrac{y^2}{9} = 1$ in the positive quadrant. On the curve in Figure 16 we have
used an arrow to indicate the direction that we move along the curve as t increases from its initial
value 0.
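The table in Example 6 can be regenerated, and the ellipse equation confirmed, with a short script. This is a sketch using only the standard library; the variable names are chosen for illustration.

```python
import math

# Parameter values t = 0, pi/20, ..., 10*pi/20
ts = [k * math.pi / 20 for k in range(11)]

# Rounded coordinates, as in the table of Example 6
xs = [round(2 * math.cos(t), 2) for t in ts]
ys = [round(3 * math.sin(t), 2) for t in ts]

# Unrounded points should satisfy x^2/4 + y^2/9 = 1
residuals = [abs((2 * math.cos(t)) ** 2 / 4 + (3 * math.sin(t)) ** 2 / 9 - 1)
             for t in ts]

print(xs)
print(ys)
print(max(residuals))
```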
Task
Plot the curve x = t + 1, $y = 2t^2 - 3$, 0 ≤ t ≤ 1
Do you recognise this curve as a conic section?
First construct a table of (x, y) values as t ranges from 0 → 1:

Your solution
t 0 0.25 0.5 0.75 1
x
y
Answer
t 0 0.25 0.5 0.75 1
x 1 1.25 1.5 1.75 2
y −3 −2.88 −2.5 −1.88 −1
Now plot the points on a Cartesian plane:

Your solution
Answer
[Figure: the five points plotted in the xy plane, from (1, −3) at t = 0 to (2, −1) at t = 1.]
Now eliminate the t-variable from x = t + 1, $y = 2t^2 - 3$ to obtain the xy form of the curve:
Your solution
Answer
$y = 2x^2 - 4x - 1$ which is the equation of a parabola.
Example 7
Sketch the curve $x = t^2 + 1$, $y = 2t^4 - 3$, 0 ≤ t ≤ 1
Solution
This is very similar to the previous Task (except for $t^4$ replacing $t^2$ in the expression for y and $t^2$
replacing t in the expression for x). The corresponding table of values is
t   0    0.25    0.5     0.75    1
x   1    1.06    1.25    1.56    2
y   −3   −2.99   −2.88   −2.37   −1
Figure 17: the five points plotted in the xy plane, from (1, −3) at t = 0 to (2, −1) at t = 1.
We see that this is identical to the curve drawn previously. This is confirmed by eliminating the
t-parameter from the expressions defining x, y. Here $t^2 = x - 1$ so $y = 2(x - 1)^2 - 3$ which is
the same as obtained in the last Task. The main difference is that particular values of t locate (in
general) different (x, y) points on the curve for the two parametric representations.
We conclude that a given curve in the xy plane can have many (in fact infinitely many) parametric
descriptions.
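The following sketch illustrates this conclusion for the two parameterisations met so far: both place every point on the parabola y = 2x² − 4x − 1, yet the same value of t generally picks out different points. The function names `param_a` and `param_b` are assumptions made for illustration.

```python
def param_a(t):
    """Parameterisation from the Task: x = t + 1, y = 2t^2 - 3."""
    return t + 1, 2 * t**2 - 3

def param_b(t):
    """Parameterisation from Example 7: x = t^2 + 1, y = 2t^4 - 3."""
    return t**2 + 1, 2 * t**4 - 3

def on_parabola(x, y, tol=1e-12):
    """True if (x, y) lies on y = 2x^2 - 4x - 1."""
    return abs(y - (2 * x**2 - 4 * x - 1)) < tol

ts = [0, 0.25, 0.5, 0.75, 1]
for t in ts:
    print(t, param_a(t), param_b(t))
```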
Task
Show that the two parametric representations below describe the same curve.
(a) x = cos t, y = sin t, 0 ≤ t ≤ π/2
(b) x = t, $y = \sqrt{1 - t^2}$, 0 ≤ t ≤ 1
Eliminate t from the parametric equations in (a):
Your solution
Answer
$x^2 + y^2 = \cos^2 t + \sin^2 t = 1$
Eliminate t from the parametric equations in (b):
Your solution
Answer
$y = \sqrt{1 - x^2}$ ∴ $y^2 = 1 - x^2$ or $x^2 + y^2 = 1$
What do you conclude?
Your solution
Answer
Both parametric descriptions represent (part of) a circle centred at the origin of radius 1.
2. General parametric form

We will assume that any curve in the xy plane may be written in parametric form:
x = g(t), y = h(t) (the parametric equations of the curve)
$t_0 \le t \le t_1$ (the parameter range)
in which g(t), h(t) are given functions of t and the parameter t ranges over the values $t_0 \to t_1$. As
we give values to t within this range then corresponding values of x, y are calculated from x = g(t),
y = h(t) which can then be plotted on an xy plane.
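In 12.3, we discovered how to obtain the derivative $\dfrac{dy}{dx}$ from a knowledge of the parametric
derivatives $\dfrac{dy}{dt}$ and $\dfrac{dx}{dt}$. We found
$\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt}$ and $\dfrac{d^2y}{dx^2} = \left(\dfrac{d^2y}{dt^2}\dfrac{dx}{dt} - \dfrac{dy}{dt}\dfrac{d^2x}{dt^2}\right) \div \left(\dfrac{dx}{dt}\right)^3$
Note that derivatives with respect to the parameter t are often denoted by a dot:
$\dfrac{dx}{dt} \equiv \dot{x}$, $\dfrac{dy}{dt} \equiv \dot{y}$, $\dfrac{d^2x}{dt^2} \equiv \ddot{x}$ etc
so that
$\dfrac{dy}{dx} = \dfrac{\dot{y}}{\dot{x}}$ and $\dfrac{d^2y}{dx^2} = \dfrac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^3}$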
Knowledge of the derivative is sometimes useful in curve sketching.
Example 8
Sketch the curve $x = t^3 + 3t^2 + 2t$, $y = 3 - 2t - t^2$, −3 ≤ t ≤ 1.
Solution
$x = t^3 + 3t^2 + 2t = t(t + 2)(t + 1)$, $y = 3 - 2t - t^2 = -(t + 3)(t - 1)$
so that x = 0 when t = 0, −1, −2 and y = 0 when t = −3, 1. We calculate the values of x, y at
various values of t:
t   −3   −2.50   −2   −1.50   −1   −0.50   0   0.50
x   −6   −1.88   0    0.38    0    −0.38   0   1.88
y   0    1.75    3    3.75    4    3.75    3   1.75
We see that t = −2 and t = 0 give rise to the same coordinate values for (x, y). This represents a
double-point in the curve, which is one where the curve crosses itself. Now
$\dfrac{dx}{dt} = 3t^2 + 6t + 2$, $\dfrac{dy}{dt} = -2 - 2t$ ∴ $\dfrac{dy}{dx} = \dfrac{-2(1 + t)}{3t^2 + 6t + 2}$
so there is a turning point when t = −1. The reader is urged to calculate $\dfrac{d^2y}{dx^2}$ and to show that
this is negative when t = −1 (i.e. at x = 0, y = 4), indicating a maximum there. (The reader
should check that vertical tangents, where dx/dt = 0, occur at t = −0.42 and t = −1.58, to 2 d.p.)
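The two checks left to the reader can be carried out numerically. The sketch below applies the parametric formulas for dy/dx and d²y/dx² from subsection 2 to this curve and solves dx/dt = 0 with the quadratic formula; the function names are assumptions made for illustration.

```python
import math

# Parametric derivatives of x = t^3 + 3t^2 + 2t, y = 3 - 2t - t^2
def dx(t):  return 3 * t**2 + 6 * t + 2
def dy(t):  return -2 - 2 * t
def d2x(t): return 6 * t + 6
def d2y(t): return -2.0

def dydx(t):
    """dy/dx = (dy/dt) / (dx/dt)."""
    return dy(t) / dx(t)

def d2ydx2(t):
    """d2y/dx2 = (y'' x' - y' x'') / (x')^3, primes denoting d/dt."""
    return (d2y(t) * dx(t) - dy(t) * d2x(t)) / dx(t) ** 3

# Vertical tangents: roots of dx/dt = 3t^2 + 6t + 2 = 0
disc = math.sqrt(6**2 - 4 * 3 * 2)
vertical = sorted([(-6 - disc) / 6, (-6 + disc) / 6])

print(dydx(-1), d2ydx2(-1))
print([round(t, 2) for t in vertical])
```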
We can now make a reasonable sketch of the curve:
Figure 18: sketch of the curve from t = −3 at (−6, 0) to t = 1 at (6, 0), showing the maximum at t = −1, the double point at t = −2, 0 and the direction of increasing t.
3. Standard forms of conic sections in parametric form

We have seen above that, given a curve in the xy plane, there is no unique way of representing it
in parametric form. However, for some commonly occurring curves, particularly the conics, there are
accepted standard parametric equations.
The parabola
The standard parametric equations for a parabola are: $x = at^2$, $y = 2at$
Clearly, we have $t = \dfrac{y}{2a}$ and by eliminating t we get $x = a\dfrac{y^2}{4a^2}$ or $y^2 = 4ax$ which we recognise
as the standard Cartesian description of a parabola. As an illustration, Figure 19 shows the curve
with a = 2 and −1 ≤ t ≤ 2.3
Figure 19: the parabola, with the points t = −1, 0, 1 and 2 marked.
The ellipse
Here, the standard equations are x = a cos t, y = b sin t
Again, eliminating t (dividing the first equation by a, the second by b, squaring and adding) we have
$\left(\dfrac{x}{a}\right)^2 + \left(\dfrac{y}{b}\right)^2 = \cos^2 t + \sin^2 t \equiv 1$ or, in more familiar form: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$.
If we choose the range for t as 0 ≤ t ≤ 7π/4 the following segment of the ellipse is obtained.
Figure 20: the segment of the ellipse, with the points t = 0, π/4, π/2, 3π/4, π, 5π/4, 3π/2 and 7π/4 marked.
Here we note that (except in the special case when a = b, giving a circle) the parameter t is not the
angle that the radial line makes with the positive x-axis.
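A quick numerical illustration of this remark: the sketch below computes the actual polar angle of the point (a cos t, b sin t) and compares it with t. The helper name and the sample values a = 2, b = 3, t = π/4 are assumptions made for illustration.

```python
import math

def polar_angle_on_ellipse(a, b, t):
    """Polar angle of the point (a cos t, b sin t), reduced to [0, 2*pi)."""
    return math.atan2(b * math.sin(t), a * math.cos(t)) % (2 * math.pi)

t = math.pi / 4
angle_ellipse = polar_angle_on_ellipse(2, 3, t)  # a != b: angle differs from t
angle_circle = polar_angle_on_ellipse(2, 2, t)   # a == b: angle equals t

print(t, angle_ellipse, angle_circle)
```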
In the study of the orbits of planets and satellites it is often preferable to use plane polar coordinates
(r, θ) to treat the problem. In these coordinates an ellipse has an equation of the form $\dfrac{1}{r} = A + B\cos\theta$,
with A and B positive numbers such that B < A. Not only is there a difference in the
equations on passing from Cartesian to polar coordinates; there is also a change in the origin of
coordinates. The polar coordinate equation is using a focus of the ellipse as the origin. In the
Cartesian description the foci are two points at ±e along the x-axis, where e obeys the equation
$e^2 = a^2 - b^2$, if we assume that b < a i.e. we choose the long axis of the ellipse as the x-axis. This
problem gives some practice at algebraic manipulation and also indicates some shortcuts which can
be made once the mathematics of the ellipse has been understood.
Example 9
An ellipse is described in plane polar coordinates by the equation
$\dfrac{1}{r} = 2 + \cos\theta$
Convert the equation to Cartesian form. [Hint: remember that x = r cos θ.]
Solution
Multiplying the given equation by r and then using x = r cos θ gives the results
1 = 2r + x so that 2r = 1 − x
We now square the second equation, remembering that $r^2 = x^2 + y^2$. We now have
$4(x^2 + y^2) = (1 - x)^2 = 1 + x^2 - 2x$ so that $3x^2 + 2x + 4y^2 = 1$
We now recall the method of completing the square, which allows us to set
$3x^2 + 2x = 3\left(x^2 + \dfrac{2x}{3}\right) = 3\left[\left(x + \dfrac{1}{3}\right)^2 - \dfrac{1}{9}\right]$
Putting this result into the equation and collecting terms leads to the final result
$\dfrac{\left(x + \frac{1}{3}\right)^2}{a^2} + \dfrac{y^2}{b^2} = 1$ with $a = \dfrac{2}{3}$ and $b = \sqrt{\dfrac{1}{3}}$.
This is the standard Cartesian form for the equation of an ellipse but we must remember that we
started from a polar equation with a focus of the ellipse as origin. The presence of the term x + 1/3
in the equation above actually tells us that the focus being used as origin (at x = 0) was a distance
of 1/3 to the right of the centre of the ellipse, which lies at x = −1/3.
The preceding piece of algebra was necessary in order to convince us that the original equation in
plane polar coordinates does represent an ellipse. However, now that we are convinced of this we
can go back and try to extract information in a more speedy way from the equation in its original
(r, θ) form.
Try setting θ = 0 and θ = π in the equation
$\dfrac{1}{r} = 2 + \cos\theta$
We find that at θ = 0 we have r = 1/3 while at θ = π we have r = 1. These r values correspond to
the two ends of the ellipse, so the long axis has a total length 1 + 1/3 = 4/3. This tells us that a = 2/3,
exactly as found by our longer algebraic derivation. We can further deduce that the focus acting as
origin must be at a distance of 1/3 from the centre of the ellipse in order to lead to the two r values
at θ = 0 and θ = π. If we now use the equation $e^2 = a^2 - b^2$ mentioned earlier then we find that
$\dfrac{1}{9} = \dfrac{4}{9} - b^2$, so that $b = \sqrt{\dfrac{1}{3}}$, as obtained by our lengthy algebra.
The hyperbola
The standard equations are x = a cosh t, y = b sinh t.
In this case, to eliminate t we use the identity $\cosh^2 t - \sinh^2 t \equiv 1$ giving rise to the equation of the
hyperbola in Cartesian form:
$\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = 1$.
In Figure 21 we have chosen a parameter range −1 ≤ t ≤ 2.
Figure 21: the right-hand branch of the hyperbola for −1 ≤ t ≤ 2, with the points t = −1, −0.5, 0, 0.5, 1, 1.5 and 2 marked.
To obtain the complete right-hand branch the parameter range −∞ < t < ∞ must be used. These parametric
equations only give the right-hand branch of the hyperbola (since cosh t ≥ 1). To obtain the left-hand branch we would
use x = −a cosh t, y = b sinh t
Exercises
1. In the following sketch the given parametric curves. Also, eliminate the parameter to give the
Cartesian equation in x and y.
(a) x = t, y = 2 − t, 0 ≤ t ≤ 1 (b) x = 2 − t, y = t + 1, 0 ≤ t < ∞
(c) x = 2/t, y = t − 2, 0 < t < 3 (d) x = 3 sin(πt/2), y = 4 cos(πt/2), −1 ≤ t ≤ 0.5
2. Find the tangent line to the parametric curve $x = t^2 - t$, $y = t^2 + t$ at the point where t = 1.
3. For each of the following curves expressed in parametric form obtain expressions for $\dfrac{dy}{dx}$ and
$\dfrac{d^2y}{dx^2}$ and use this information to help make a sketch.
(a) $x = t^2 - 2t$, $y = t^2 - 4t$
(b) $x = t^3 - 3t - 2$, $y = t^2 - t - 2$
Answers
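1. (a) y = 2 − x [Figure: line segment starting at (0, 2) when t = 0.]
(b) y = 3 − x [Figure: half-line starting at (2, 1) when t = 0.]
(c) $y = \dfrac{2}{x} - 2$ ∴ x(y + 2) = 2 [Figure: sketch of the curve for 0 < t < 3.]
(d) $\dfrac{x^2}{9} + \dfrac{y^2}{16} = 1$ [Figure: arc of the ellipse from t = −1 to t = 0.5.]
2. $\dfrac{dy}{dt} = 2t + 1$, $\dfrac{dx}{dt} = 2t - 1$
∴ $\dfrac{dy}{dx} = \dfrac{2t + 1}{2t - 1}$; when t = 1 then $\dfrac{dy}{dx} = 3$
when t = 1, x = 0, y = 2
∴ tangent line is y = 3x + 2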
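3. (a) $\dfrac{dy}{dt} = 2t - 4$, $\dfrac{dx}{dt} = 2t - 2$
$\dfrac{d^2y}{dt^2} = 2$, $\dfrac{d^2x}{dt^2} = 2$
$\dfrac{dy}{dx} = \dfrac{2t - 4}{2t - 2} = \dfrac{t - 2}{t - 1}$, $\dfrac{d^2y}{dx^2} = \dfrac{[(2t - 2) - (2t - 4)]\,2}{8(t - 1)^3} = \dfrac{1}{2(t - 1)^3}$
[Figure: the curve passes through the origin at t = 0, through (0, −4) at t = 2 and through (8, 0) at t = 4.]
(b) $x = (t - 2)(t^2 + 2t + 1) = (t - 2)(t + 1)^2$
y = (t + 1)(t − 2)
$\dfrac{dy}{dt} = 2t - 1$, $\dfrac{dx}{dt} = 3t^2 - 3$
$\dfrac{d^2y}{dt^2} = 2$, $\dfrac{d^2x}{dt^2} = 6t$
$\dfrac{d^2y}{dx^2} = \dfrac{2(3t^2 - 3) - (2t - 1)6t}{(3t^2 - 3)^3} = \dfrac{-6t^2 + 6t - 6}{27(t^2 - 1)^3}$
[Figure: the curve passes through the origin at t = −1 and again at t = 2.]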
Contents 18
Functions of
Several Variables
18.1 Functions of Several Variables 2
18.2 Partial Derivatives 8
18.3 Stationary Points 21
18.4 Errors and Percentage Change 30
Learning outcomes
In this Workbook you will learn about functions of two or more variables. You will learn that
a function of two variables can be interpreted as a surface. You will learn how to sketch
simple surfaces. You will learn what a partial derivative is and how the partial derivative
of any order may be found. As an important application of partial differentiation you will
learn how to locate the turning points of functions of several variables. In particular, for
functions of two variables, you will learn how to distinguish between maxima and minima
points. Finally you will apply your knowledge to the topic of error analysis.
Functions of
Several Variables 18.1
Introduction
A function of a single variable y = f (x) is interpreted graphically as a planar curve. In this Section
we generalise the concept to functions of more than one variable. We shall see that a function of
two variables z = f (x, y) can be interpreted as a surface. Functions of two or more variables often
arise in engineering and in science and it is important to be able to deal with such functions with
confidence and skill. We see in this Section how to sketch simple surfaces. In later Sections we shall
examine how to determine the rate of change of f (x, y) with respect to x and y and also how to
obtain the optimum values of functions of several variables.

Prerequisites
Before starting this Section you should . . .
• understand the Cartesian coordinates (x, y, z) of three-dimensional space.
• be able to sketch simple 2D curves
Learning Outcomes
• understand the mathematical description of a surface
• sketch simple surfaces
• use the notation for a function of several variables
2 HELM (2006):
Workbook 18: Functions of Several Variables
1. Functions of several variables

We know that f (x) is used to represent a function of one variable: the input variable is x and the
output is the value f (x). Here x is the independent variable and y = f (x) is the dependent
variable.
Suppose we consider a function with two independent input variables x and y, for example
f (x, y) = x + 2y + 3.
If we specify values for x and y then we have a single value f (x, y). For example, if x = 3 and
y = 1 then f (x, y) = 3 + 2 + 3 = 8. We write f (3, 1) = 8.
Task
Find the values of f (2, 1), f (−1, −3) and f (0, 0) for the following functions.
(a) $f(x, y) = x^2 + y^2 + 1$ (b) $f(x, y) = 2x + xy + y^3$
Your solution
Answer
(a) $f(2, 1) = 2^2 + 1^2 + 1 = 6$; $f(-1, -3) = (-1)^2 + (-3)^2 + 1 = 11$; f(0, 0) = 1
(b) f(2, 1) = 4 + 2 + 1 = 7; f(−1, −3) = −2 + 3 − 27 = −26; f(0, 0) = 0
In a similar way we can define a function of three independent variables. Let these variables be x, y
and u and the function f (x, y, u).
Example 1
Given $f(x, y, u) = x^2 + yu + 2$, find f(0, 1, 0), f(−1, −1, 2).
Solution
$f(0, 1, 0) = 0^2 + 1 \times 0 + 2 = 2$; f(−1, −1, 2) = 1 − 2 + 2 = 1
Task
(a) Find f (2, −1, 1) for f (x, y, u) = xy + yu + ux.
(b) Evaluate $f(x, y, u, t) = x^2 - y^2 - u^2 - 2t$ when x = 1, y = −2, u = 3, t = 1.
Your solution
Answer
(a) f(2, −1, 1) = 2 × (−1) + (−1) × 1 + 1 × 2 = −1
(b) $f(1, -2, 3, 1) = 1^2 - (-2)^2 - 3^2 - 2 \times 1 = -14$ (this is a function of 4 independent variables).
2. Functions of two variables

The aim of this Section is to enable the reader to gain confidence in dealing with functions of several
variables. In order to do this we often concentrate on functions of just two variables. The latter
have an easy geometrical interpretation and we can therefore use our geometrical intuition to help
understand the meaning of much of the mathematics associated with such functions. We begin by
reminding the reader of the Cartesian coordinate system used to locate points in three dimensions.
A point P is located by specifying its Cartesian coordinates (a, b, c) defined in Figure 1.
Figure 1: the point P with Cartesian coordinates (a, b, c), measured along the x, y and z axes respectively.
Within this 3-dimensional space we can consider simple surfaces. Perhaps the simplest is the plane.
From 9.6 on vectors we recall the general equation of a plane:
Ax + By + Cz = D
where A, B, C, D are constants. This plane intersects the x-axis (where y = z = 0) at the point
(D/A, 0, 0), intersects the y-axis (where x = z = 0) at the point (0, D/B, 0) and the z-axis
(where x = y = 0) at the point (0, 0, D/C). See Figure 2 where the dotted lines are hidden from
view behind the plane which passes through three points marked on the axes.
Figure 2: the plane, cutting the axes at x = D/A, y = D/B and z = D/C.
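The intercepts can be computed and checked directly. The sketch below assumes A, B and C are all non-zero; the function names and the sample plane 2x + 3y + 4z = 12 are chosen purely for illustration.

```python
def plane_intercepts(A, B, C, D):
    """Axis intercepts of the plane Ax + By + Cz = D (A, B, C all non-zero)."""
    return (D / A, 0.0, 0.0), (0.0, D / B, 0.0), (0.0, 0.0, D / C)

def on_plane(point, A, B, C, D, tol=1e-12):
    """True if the point satisfies Ax + By + Cz = D."""
    x, y, z = point
    return abs(A * x + B * y + C * z - D) < tol

intercepts = plane_intercepts(2, 3, 4, 12)
for p in intercepts:
    print(p, on_plane(p, 2, 3, 4, 12))
```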
There are some special cases of note.
• B = C = 0, A ≠ 0.
Here the plane is x = D/A. This plane (for any given values of D and A) is parallel to the zy
plane a distance D/A units from it. See Figure 3a.
• A = 0, C = 0, B ≠ 0
Here the plane is y = D/B and is parallel to the zx plane at a distance D/B units from it.
See Figure 3b.
• A = 0, B = 0, C ≠ 0
Here the plane is z = D/C which is parallel to the xy plane a distance D/C units from it.
See Figure 3c.
Figure 3: (a) the plane x = D/A; (b) the plane y = D/B; (c) the plane z = D/C.
Planes are particularly simple examples of surfaces. Generally, a surface is described by a relation
connecting the three variables x, y, z. In the case of the plane this relation is linear Ax+By+Cz = D.
In some cases, as we have seen, one or two variables may be absent from the relation. In three
dimensions such a relation still defines a surface, for example z = 0 defines the plane of the x- and
y-axes.
Although any relation connecting x, y, z defines a surface, by convention, one of the variables (usually
z) is chosen as the dependent variable and the other two therefore are independent variables. For
the case of a plane Ax + By + Cz = D (and C ≠ 0) we would write, for example,
$z = \dfrac{1}{C}(D - Ax - By)$
Generally a surface is defined by a relation of the form
z = f (x, y)
where the expression on the right is any relation involving two variables x, y.
Sketching surfaces
A plane is relatively easy to sketch since it is flat: all we need to know about it is where it intersects
the three coordinate axes. For more general surfaces what we do is to sketch curves (like contours)
which lie on the surface. If we draw enough of these curves our ‘eye’ will naturally interpret the shape
of the surface.
Let us see, for example, how we sketch $z = x^2 + y^2$.
Firstly we confirm that $z = x^2 + y^2$ is a surface since this is a relation connecting the three coordinate
variables x, y, z. In the standard notation our function of two variables is $f(x, y) = x^2 + y^2$. To sketch
the surface we fix one of the variables at a constant value.
• Fix x at value $x_0$.
From our discussion above we remember that $x = x_0$ is the equation of a plane parallel to the zy
plane. In this case our relation becomes:
$z = x_0^2 + y^2$
Since z is now a function of a single variable y, with $x_0^2$ held constant, this relation: $z = x_0^2 + y^2$
defines a curve which lies in the plane $x = x_0$.
In Figure 4(a) we have drawn this curve (a parabola). Now by changing the value chosen for x0
we will obtain a sequence of curves, each a parabola, lying in a different plane, and each being a
part of the surface we are trying to sketch. In Figure 4(b) we have drawn some of the curves of this
sequence.
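Figure 4: (a) the parabola $z = x_0^2 + y^2$ lying in the plane $x = x_0$; (b) several such parabolas for different values of $x_0$.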
What we have done is to slice the surface by planes parallel to the zy plane. Each slice intersects
the surface in a curve. In this case we have not yet plotted enough curves to accurately visualise the
surface so we need to draw other surface curves.
• Fix y at value $y_0$
Here $y = y_0$ (the equation of a plane parallel to the zx plane). In this case the surface becomes
$z = x^2 + y_0^2$
Again z is a function of a single variable (since $y_0$ is fixed) and describes a curve: again the curve is
a parabola, but this time residing on the plane $y = y_0$. For each different $y_0$ we choose, a different
parabola is obtained, each lying on the surface $z = x^2 + y^2$. Some of these curves have been sketched
in Figure 5(a). These have then been combined with the curves of Figure 4(b) to produce Figure 5(b).
Figure 5: (a) parabolas lying in planes $y = y_0$; (b) these curves combined with those of Figure 4(b).
We now have an idea of what the surface defined by $z = x^2 + y^2$ looks like but to complete the
picture we draw a final sequence of curves.
• Fix z at value $z_0$.
We have $z = z_0$ (the equation of a plane parallel to the xy plane). In this case the surface becomes
$z_0 = x^2 + y^2$
But this is the equation of a circle centred on x = 0, y = 0 of radius $\sqrt{z_0}$. (Clearly we must choose
$z_0 \ge 0$ because $x^2 + y^2$ cannot be negative.) As we vary $z_0$ we obtain different circles, each lying on
a different plane $z = z_0$. In Figure 6 we have combined the circles with the curves of Figure 5(b) to
obtain a good visualisation of the surface $z = x^2 + y^2$.
Figure 6: the circles combined with the curves of Figure 5(b).
(Technically the surface is called a paraboloid, obtained by rotating a parabola about the z-axis.)
With the wide availability of sophisticated graphics packages the need to be able to sketch a surface
is not as important as once it was. However, we urge the reader to attempt simple surface sketching
in the initial stages of this study as it will enhance understanding of functions of two variables.

Partial Derivatives 18.2
Introduction
When a function of more than one independent input variable changes because of changes in one or
more of the input variables, it is important to calculate the change in the function itself. This can
be investigated by holding all but one of the variables constant and finding the rate of change of the
function with respect to the one remaining variable. This process is called partial differentiation. In
this Section we show how to carry out the process.

Prerequisites
• understand the principle of differentiating a function of one variable
Learning Outcomes
On completion you should be able to . . .
• understand the concept of partial differentiation
• differentiate a function partially with respect to each of its variables in turn
• evaluate first partial derivatives
• carry out successive partial differentiations
• formulate second partial derivatives
1. First partial derivatives

The x partial derivative
For a function of a single variable, y = f(x), changing the independent variable x leads to a
corresponding change in the dependent variable y. The rate of change of y with respect to x is
given by the derivative, written $\dfrac{df}{dx}$. A similar situation occurs with functions of more than one
variable. For clarity we shall concentrate on functions of just two variables.
In the relation z = f (x, y) the independent variables are x and y and the dependent variable z.
We have seen in Section 18.1 that as x and y vary the z-value traces out a surface. Now both of the
variables x and y may change simultaneously inducing a change in z. However, rather than consider
this general situation, to begin with we shall hold one of the independent variables fixed. This is
equivalent to moving along a curve obtained by intersecting the surface by one of the coordinate
planes.
Consider $f(x, y) = x^3 + 2x^2y + y^2 + 2x + 1$.
Suppose we keep y constant and vary x; then what is the rate of change of the function f?
Suppose we hold y at the value 3 then
$f(x, 3) = x^3 + 6x^2 + 9 + 2x + 1 = x^3 + 6x^2 + 2x + 10$
In effect, we now have a function of x only. If we differentiate it with respect to x we obtain the
expression:
$3x^2 + 12x + 2$.
We say that f has been partially differentiated with respect to x. We denote the partial derivative
of f with respect to x by $\dfrac{\partial f}{\partial x}$ (to be read as ‘partial dee f by dee x’). In this example, when y = 3:
$\dfrac{\partial f}{\partial x} = 3x^2 + 12x + 2$.
In the same way if y is held at the value 4 then $f(x, 4) = x^3 + 8x^2 + 16 + 2x + 1 = x^3 + 8x^2 + 2x + 17$
and so, for this value of y
$\dfrac{\partial f}{\partial x} = 3x^2 + 16x + 2$.
Now if we return to the original formulation
$f(x, y) = x^3 + 2x^2y + y^2 + 2x + 1$
and treat y as a constant then the process of partial differentiation with respect to x gives
$\dfrac{\partial f}{\partial x} = 3x^2 + 4xy + 0 + 2 + 0 = 3x^2 + 4xy + 2$.
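The idea of holding y fixed while varying x translates directly into a numerical check. The sketch below approximates ∂f/∂x by a central difference in x only and compares it with the formula just obtained; the function names and the step size h are assumptions made for illustration.

```python
def f(x, y):
    return x**3 + 2 * x**2 * y + y**2 + 2 * x + 1

def fx_exact(x, y):
    """The partial derivative 3x^2 + 4xy + 2 obtained in the text."""
    return 3 * x**2 + 4 * x * y + 2

def partial_x(func, x, y, h=1e-6):
    """Central-difference estimate of the x partial derivative; y is held fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

for y in (3, 4):
    print(y, partial_x(f, 1.5, y), fx_exact(1.5, y))
```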
Key Point 1
The Partial Derivative of f with respect to x

For a function of two variables z = f(x, y) the partial derivative of f with respect to x is denoted
by $\dfrac{\partial f}{\partial x}$ and is obtained by differentiating f(x, y) with respect to x in the usual way but treating
the y-variable as if it were a constant.
Alternative notations for $\dfrac{\partial f}{\partial x}$ are $f_x(x, y)$ or $f_x$ or $\dfrac{\partial z}{\partial x}$.
Example 2
Find $\dfrac{\partial f}{\partial x}$ for (a) $f(x, y) = x^3 + x + y^2 + y$, (b) $f(x, y) = x^2y + xy^3$.
Solution
(a) $\dfrac{\partial f}{\partial x} = 3x^2 + 1 + 0 + 0 = 3x^2 + 1$ (b) $\dfrac{\partial f}{\partial x} = 2x \times y + 1 \times y^3 = 2xy + y^3$
The y partial derivative

For functions of two variables f(x, y) the x and y variables are on the same footing, so what we have
done for the x-variable we can do for the y-variable. We can thus imagine keeping the x-variable
fixed and determining the rate of change of f as y changes. This rate of change is denoted by $\dfrac{\partial f}{\partial y}$.
Key Point 2
The Partial Derivative of f with respect to y
For a function of two variables z = f(x, y) the partial derivative of f with respect to y is denoted
by $\dfrac{\partial f}{\partial y}$ and is obtained by differentiating f(x, y) with respect to y in the usual way but treating
the x-variable as if it were a constant.
Alternative notations for $\dfrac{\partial f}{\partial y}$ are $f_y(x, y)$ or $f_y$ or $\dfrac{\partial z}{\partial y}$.
Returning to $f(x, y) = x^3 + 2x^2y + y^2 + 2x + 1$ once again, we therefore obtain:
$\dfrac{\partial f}{\partial y} = 0 + 2x^2 \times 1 + 2y + 0 + 0 = 2x^2 + 2y$.
Example 3
Find $\dfrac{\partial f}{\partial y}$ for (a) $f(x, y) = x^3 + x + y^2 + y$ (b) $f(x, y) = x^2y + xy^3$
Solution
(a) $\dfrac{\partial f}{\partial y} = 0 + 0 + 2y + 1 = 2y + 1$ (b) $\dfrac{\partial f}{\partial y} = x^2 \times 1 + x \times 3y^2 = x^2 + 3xy^2$
We can calculate the partial derivative of f with respect to x and the value of $\dfrac{\partial f}{\partial x}$ at a specific point
e.g. x = 1, y = −2.
Example 4
Find $f_x(1, -2)$ and $f_y(-3, 2)$ for $f(x, y) = x^2 + y^3 + 2xy$.
[Remember $f_x$ means $\dfrac{\partial f}{\partial x}$ and $f_y$ means $\dfrac{\partial f}{\partial y}$.]
Solution
$f_x(x, y) = 2x + 2y$, so $f_x(1, -2) = 2 - 4 = -2$; $f_y(x, y) = 3y^2 + 2x$, so $f_y(-3, 2) = 12 - 6 = 6$
Task
Given $f(x, y) = 3x^2 + 2y^2 + xy^3$ find $f_x(1, -2)$ and $f_y(-1, -1)$.
First find expressions for $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$:
Your solution
$\dfrac{\partial f}{\partial x} =$ $\dfrac{\partial f}{\partial y} =$
Answer
$\dfrac{\partial f}{\partial x} = 6x + y^3$, $\dfrac{\partial f}{\partial y} = 4y + 3xy^2$
Now calculate $f_x(1, -2)$ and $f_y(-1, -1)$:
Your solution
$f_x(1, -2) =$ $f_y(-1, -1) =$
Answer
$f_x(1, -2) = 6 \times 1 + (-2)^3 = -2$; $f_y(-1, -1) = 4 \times (-1) + 3(-1) \times 1 = -7$
Functions of several variables

As we have seen, a function of two variables f(x, y) has two partial derivatives, $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$. In an
exactly analogous way a function of three variables f(x, y, u) has three partial derivatives $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$
and $\dfrac{\partial f}{\partial u}$, and so on for functions of more than three variables. Each partial derivative is obtained in
the same way as stated in Key Point 3:
Key Point 3
The Partial Derivatives of f(x, y, u, v, w, . . . )
For a function of several variables z = f(x, y, u, v, w, . . . ) the partial derivative of f with respect
to v (say) is denoted by $\dfrac{\partial f}{\partial v}$ and is obtained by differentiating f(x, y, u, v, w, . . . ) with respect to
v in the usual way but treating all the other variables as if they were constants.
Alternative notations for $\dfrac{\partial f}{\partial v}$ when z = f(x, y, u, v, w, . . . ) are $f_v(x, y, u, v, w, \ldots)$ and $f_v$ and $\dfrac{\partial z}{\partial v}$.
Task
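Find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial u}$ for $f(x, y, u, v) = x^2 + xy^2 + y^2u^3 - 7uv^4$
Your solution
$\dfrac{\partial f}{\partial x} =$ $\dfrac{\partial f}{\partial u} =$
Answer
$\dfrac{\partial f}{\partial x} = 2x + y^2 + 0 + 0 = 2x + y^2$; $\dfrac{\partial f}{\partial u} = 0 + 0 + y^2 \times 3u^2 - 7v^4 = 3y^2u^2 - 7v^4$.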
Task
The pressure, P , for one mole of an ideal gas is related to its absolute temperature,
T , and specific volume, v, by the equation
P v = RT
where R is the gas constant.
Obtain simple expressions for
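(a) the coefficient of thermal expansion, α, defined by:
$\alpha = \dfrac{1}{v}\left(\dfrac{\partial v}{\partial T}\right)_P$
(b) the isothermal compressibility, $\kappa_T$, defined by:
$\kappa_T = -\dfrac{1}{v}\left(\dfrac{\partial v}{\partial P}\right)_T$
Your solution
(a)
Answer
$v = \dfrac{RT}{P}$ ⇒ $\left(\dfrac{\partial v}{\partial T}\right)_P = \dfrac{R}{P}$
so $\alpha = \dfrac{1}{v}\left(\dfrac{\partial v}{\partial T}\right)_P = \dfrac{R}{Pv} = \dfrac{1}{T}$
Your solution
(b)
Answer
$v = \dfrac{RT}{P}$ ⇒ $\left(\dfrac{\partial v}{\partial P}\right)_T = -\dfrac{RT}{P^2}$
so $\kappa_T = -\dfrac{1}{v}\left(\dfrac{\partial v}{\partial P}\right)_T = \dfrac{RT}{vP^2} = \dfrac{1}{P}$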
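Both results can be checked numerically by differencing v = RT/P in one variable while holding the other fixed. In the sketch below the value R = 8.314 J/(mol K), the state T = 300 K, P = 101325 Pa and the step sizes are all assumptions chosen for illustration.

```python
R = 8.314  # gas constant in J/(mol K), approximate value assumed for this sketch

def v(T, P):
    """Volume of one mole of ideal gas from P v = R T."""
    return R * T / P

def alpha(T, P, h=1e-3):
    """(1/v) (dv/dT) at constant P, by central difference in T."""
    return (v(T + h, P) - v(T - h, P)) / (2 * h) / v(T, P)

def kappa_T(T, P, h=1.0):
    """-(1/v) (dv/dP) at constant T, by central difference in P."""
    return -(v(T, P + h) - v(T, P - h)) / (2 * h) / v(T, P)

T0, P0 = 300.0, 101325.0
print(alpha(T0, P0), 1 / T0)    # these two values should agree closely
print(kappa_T(T0, P0), 1 / P0)  # and so should these
```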
Exercises
1. For the following functions find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$
(a) f(x, y) = x + 2y + 3
(b) $f(x, y) = x^2 + y^2$
(c) $f(x, y) = x^3 + xy + y^3$
(d) $f(x, y) = x^4 + xy^3 + 2x^3y^2$
(e) f(x, y, z) = xy + yz
2. For the functions of Exercise 1 (a) to (d) find $f_x(1, 1)$, $f_x(-1, -1)$, $f_y(1, 2)$, $f_y(2, 1)$.
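Answers
1. (a) $\dfrac{\partial f}{\partial x} = 1$, $\dfrac{\partial f}{\partial y} = 2$
(b) $\dfrac{\partial f}{\partial x} = 2x$, $\dfrac{\partial f}{\partial y} = 2y$
(c) $\dfrac{\partial f}{\partial x} = 3x^2 + y$, $\dfrac{\partial f}{\partial y} = x + 3y^2$
(d) $\dfrac{\partial f}{\partial x} = 4x^3 + y^3 + 6x^2y^2$, $\dfrac{\partial f}{\partial y} = 3xy^2 + 4x^3y$
(e) $\dfrac{\partial f}{\partial x} = y$, $\dfrac{\partial f}{\partial y} = x + z$
2.
        f_x(1, 1)   f_x(−1, −1)   f_y(1, 2)   f_y(2, 1)
(a)     1           1             2           2
(b)     2           −2            4           2
(c)     4           2             13          5
(d)     11          1             20          38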
2. Second partial derivatives

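Performing two successive partial differentiations of $f(x, y)$ with respect to $x$ (holding $y$ constant) is denoted by $\dfrac{\partial^2 f}{\partial x^2}$ (or $f_{xx}(x, y)$) and is defined by
$$\frac{\partial^2 f}{\partial x^2} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right)$$
For functions of two or more variables, other second-order partial derivatives can be obtained as well as $\dfrac{\partial^2 f}{\partial x^2}$. Most obvious is the second derivative of $f(x, y)$ with respect to $y$, which is denoted by $\dfrac{\partial^2 f}{\partial y^2}$ (or $f_{yy}(x, y)$) and is defined as:
$$\frac{\partial^2 f}{\partial y^2} \equiv \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right)$$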
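Example 5
Find $\dfrac{\partial^2 f}{\partial x^2}$ and $\dfrac{\partial^2 f}{\partial y^2}$ for $f(x, y) = x^3 + x^2y^2 + 2y^3 + 2x + y$.
Solution
$$\frac{\partial f}{\partial x} = 3x^2 + 2xy^2 + 0 + 2 + 0 = 3x^2 + 2xy^2 + 2$$
$$\frac{\partial^2 f}{\partial x^2} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = 6x + 2y^2 + 0 = 6x + 2y^2.$$
$$\frac{\partial f}{\partial y} = 0 + x^2 \times 2y + 6y^2 + 0 + 1 = 2x^2y + 6y^2 + 1$$
$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = 2x^2 + 12y.$$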
We can use the alternative notation when evaluating derivatives.
Example 6
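Find $f_{xx}(-1, 1)$ and $f_{yy}(2, -2)$ for $f(x, y) = x^3 + x^2y^2 + 2y^3 + 2x + y$.
Solution
From Example 5, $f_{xx} = 6x + 2y^2$ and $f_{yy} = 2x^2 + 12y$, so
$f_{xx}(-1, 1) = 6 \times (-1) + 2 \times (1)^2 = -4$.
$f_{yy}(2, -2) = 2 \times (2)^2 + 12 \times (-2) = -16$.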
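The two values in Example 6 can be confirmed with a second-difference approximation. This is only a sketch: the helper names and the step size `h` are assumptions made for illustration.

```python
def f(x, y):
    # Function from Examples 5 and 6
    return x**3 + x**2 * y**2 + 2 * y**3 + 2 * x + y


def second_partial_x(func, x, y, h=1e-4):
    # Central second difference in x, with y held constant
    return (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2


def second_partial_y(func, x, y, h=1e-4):
    # Central second difference in y, with x held constant
    return (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2


fxx = second_partial_x(f, -1.0, 1.0)   # compare with f_xx(-1, 1) = -4
fyy = second_partial_y(f, 2.0, -2.0)   # compare with f_yy(2, -2) = -16
print(fxx, fyy)
```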
Mixed second derivatives
It is possible to carry out a partial differentiation of f (x, y) with respect to x followed by a partial
differentiation with respect to y (or vice-versa). The results are examples of mixed derivatives. We
must be careful with the notation here.
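We use $\dfrac{\partial^2 f}{\partial x\,\partial y}$ to mean ‘differentiate first with respect to $y$ and then with respect to $x$’ and we use $\dfrac{\partial^2 f}{\partial y\,\partial x}$ to mean ‘differentiate first with respect to $x$ and then with respect to $y$’:
i.e.
$$\frac{\partial^2 f}{\partial x\,\partial y} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \quad\text{and}\quad \frac{\partial^2 f}{\partial y\,\partial x} \equiv \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right).$$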
(This explains why the order is opposite of what we expect - the derivative ‘operates on the left’.)
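Example 7
For $f(x, y) = x^3 + 2x^2y^2 + y^3$ find $\dfrac{\partial^2 f}{\partial x\,\partial y}$.
Solution
$\dfrac{\partial f}{\partial y} = 4x^2y + 3y^2$; $\dfrac{\partial^2 f}{\partial x\,\partial y} = 8xy$
The remaining possibility is to differentiate first with respect to $x$ and then with respect to $y$, i.e. $\dfrac{\partial}{\partial y}\left(\dfrac{\partial f}{\partial x}\right)$.
For the function in Example 7, $\dfrac{\partial f}{\partial x} = 3x^2 + 4xy^2$ and $\dfrac{\partial^2 f}{\partial y\,\partial x} = 8xy$. Notice that for this function
$$\frac{\partial^2 f}{\partial x\,\partial y} \equiv \frac{\partial^2 f}{\partial y\,\partial x}.$$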
This equality of mixed derivatives is true for all functions which you are likely to meet in your studies.
To evaluate a mixed derivative we can use the alternative notation. To evaluate $\dfrac{\partial^2 f}{\partial x\,\partial y}$ we write $f_{yx}(x, y)$ to indicate that the first differentiation is with respect to $y$. Similarly, $\dfrac{\partial^2 f}{\partial y\,\partial x}$ is denoted by $f_{xy}(x, y)$.
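The equality of the two mixed derivatives for the function of Example 7 can be illustrated numerically by nesting two first-difference operators in either order. This is a sketch under assumed helper names (`d_dx`, `d_dy`) and an assumed step size.

```python
def f(x, y):
    # Function from Example 7
    return x**3 + 2 * x**2 * y**2 + y**3


def d_dx(func, h=1e-4):
    # Returns a function approximating the partial derivative in x
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)


def d_dy(func, h=1e-4):
    # Returns a function approximating the partial derivative in y
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)


f_yx = d_dx(d_dy(f))   # differentiate first in y, then in x
f_xy = d_dy(d_dx(f))   # differentiate first in x, then in y

x0, y0 = 1.0, 2.0
print(f_yx(x0, y0), f_xy(x0, y0), 8 * x0 * y0)
```

All three printed numbers should be close to 16, the value of $8xy$ at $(1, 2)$.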
Example 8
Find $f_{yx}(1, 2)$ for the function $f(x, y) = x^3 + 2x^2y^2 + y^3$.
Solution
$f_y = 4x^2y + 3y^2$ and $f_{yx} = 8xy$ so $f_{yx}(1, 2) = 8 \times 1 \times 2 = 16$.
Task
Find $f_{xx}(1, 2)$, $f_{yy}(-2, -1)$, $f_{xy}(3, 3)$ for $f(x, y) \equiv x^3 + 3x^2y^2 + y^2$.
Your solution
Answer
$f_x = 3x^2 + 6xy^2$; $f_y = 6x^2y + 2y$
$f_{xx} = 6x + 6y^2$; $f_{yy} = 6x^2 + 2$; $f_{xy} = f_{yx} = 12xy$
$f_{xx}(1, 2) = 6 + 24 = 30$; $f_{yy}(-2, -1) = 26$; $f_{xy}(3, 3) = 108$
The ideal gas law and Redlich-Kwong equation
Introduction
In Chemical Engineering it is often necessary to be able to relate the pressure, volume and temperature of a gas. One relevant equation is the ideal gas law
$$PV = nRT \qquad (1)$$
where $P$ is pressure, $V$ is volume, $n$ is the number of moles of gas, $T$ is temperature and $R$ is the ideal gas constant ($= 8.314$ J mol$^{-1}$ K$^{-1}$, when all quantities are in S.I. units). The ideal gas law has been in use since 1834, although its special cases at constant temperature (Boyle’s Law, 1662) and constant pressure (Charles’ Law, 1787) had been in use many decades previously.
While the ideal gas law is adequate in many circumstances, it has been superseded by many other laws where, in general, simplicity is weighed against accuracy. One such law is the Redlich-Kwong equation
$$P = \frac{RT}{V - b} - \frac{a}{\sqrt{T}\,V(V + b)} \qquad (2)$$
where, in addition to the variables in the ideal gas law, the extra parameters $a$ and $b$ are dependent upon the particular gas under consideration.
Clearly, in both equations the temperature, pressure and volume will be positive. Additionally, the
Redlich-Kwong equation is only valid for values of volume greater than the parameter b - in practice
however, this is not a limitation, since the gas would condense to a liquid before this point was
reached.
Problem in words
Show that for both Equations (1) and (2)
(a) for constant temperature, the pressure decreases as the volume increases
(Note : in the Redlich-Kwong equation, assume that T is large.)
(b) for constant volume, the pressure increases as the temperature increases.

For both Equations (1) and (2), and for the allowed ranges of the variables, show that
(a) $\dfrac{\partial P}{\partial V} < 0$ for $T$ = constant
(b) $\dfrac{\partial P}{\partial T} > 0$ for $V$ = constant
Assume that $T$ is sufficiently large so that terms in $T^{-1/2}$ may be neglected when compared to terms in $T$.
1. Ideal gas law

This can be rearranged as
$$P = \frac{nRT}{V}$$
so that
(i) at constant temperature
$$\frac{\partial P}{\partial V} = \frac{-nRT}{V^2} < 0 \quad\text{as all quantities are positive}$$
(ii) for constant volume
$$\frac{\partial P}{\partial T} = \frac{nR}{V} > 0 \quad\text{as all quantities are positive}$$
2. Redlich-Kwong equation
$$P = \frac{RT}{V - b} - \frac{a}{\sqrt{T}\,V(V + b)} = RT(V - b)^{-1} - aT^{-1/2}(V^2 + Vb)^{-1}$$
so that
(i) at constant temperature
$$\frac{\partial P}{\partial V} = -RT(V - b)^{-2} + aT^{-1/2}(V^2 + Vb)^{-2}(2V + b)$$
which, for large $T$, can be approximated by
$$\frac{\partial P}{\partial V} \approx \frac{-RT}{(V - b)^2} < 0 \quad\text{as all quantities are positive}$$
(ii) for constant volume
$$\frac{\partial P}{\partial T} = R(V - b)^{-1} + \frac{1}{2}aT^{-3/2}(V^2 + Vb)^{-1} > 0 \quad\text{as all quantities are positive}$$
Interpretation
In practice, the restriction on $T$ is not severe, and regions in which $\dfrac{\partial P}{\partial V} < 0$ does not apply are those in which the gas is close to liquefying and, therefore, the entire Redlich-Kwong equation no longer applies.
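The signs of the two Redlich-Kwong derivatives can be explored numerically. The sketch below codes the analytic expressions for $\partial P/\partial V$ and $\partial P/\partial T$ and evaluates them at one state. The values of `a` and `b`, and the state `V`, `T`, are purely illustrative assumptions and do not describe any particular gas.

```python
R = 8.314  # ideal gas constant, J mol^-1 K^-1

# Illustrative parameter values only (assumed, not taken from the text)
a = 3.2      # Redlich-Kwong parameter a
b = 3.0e-5   # Redlich-Kwong parameter b


def pressure(V, T):
    # Redlich-Kwong equation (2)
    return R * T / (V - b) - a / (T**0.5 * V * (V + b))


def dP_dV(V, T):
    # Analytic partial derivative with respect to V at constant T
    return (-R * T * (V - b)**-2
            + a * T**-0.5 * (V**2 + V * b)**-2 * (2 * V + b))


def dP_dT(V, T):
    # Analytic partial derivative with respect to T at constant V
    return R / (V - b) + 0.5 * a * T**-1.5 / (V**2 + V * b)


V, T = 1.0e-3, 500.0   # a state with V > b and fairly large T (assumed)
print(dP_dV(V, T) < 0, dP_dT(V, T) > 0)
```

Trying smaller values of `T` or values of `V` close to `b` gives a feel for where the large-$T$ approximation stops being adequate.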
Exercises
1. For the following functions find $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$, $\dfrac{\partial^2 f}{\partial x\,\partial y}$, $\dfrac{\partial^2 f}{\partial y\,\partial x}$.
(a) $f(x, y) = x + 2y + 3$
(b) $f(x, y) = x^2 + y^2$
(c) $f(x, y) = x^3 + xy + y^3$
(d) $f(x, y) = x^4 + xy^3 + 2x^3y^2$
(e) $f(x, y, z) = xy + yz$
2. For the functions of Exercise 1 (a) to (d) find $f_{xx}(1, -3)$, $f_{yy}(-2, -2)$, $f_{xy}(-1, 1)$.
3. For the following functions find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial t}$
(a) $f(x, t) = x\sin(tx) + x^2t$ (b) $f(x, t, z) = zxt - e^{xt}$ (c) $f(x, t) = 3\cos(t + x^2)$
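Answers
1. (a) $\dfrac{\partial^2 f}{\partial x^2} = 0 = \dfrac{\partial^2 f}{\partial y^2} = \dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x}$
(b) $\dfrac{\partial^2 f}{\partial x^2} = 2 = \dfrac{\partial^2 f}{\partial y^2}$; $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x} = 0$
(c) $\dfrac{\partial^2 f}{\partial x^2} = 6x$, $\dfrac{\partial^2 f}{\partial y^2} = 6y$; $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x} = 1$.
(d) $\dfrac{\partial^2 f}{\partial x^2} = 12x^2 + 12xy^2$, $\dfrac{\partial^2 f}{\partial y^2} = 6xy + 4x^3$, $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x} = 3y^2 + 12x^2y$
(e) $\dfrac{\partial^2 f}{\partial x^2} = \dfrac{\partial^2 f}{\partial y^2} = 0$; $\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x} = 1$
2.
        f_xx(1, −3)   f_yy(−2, −2)   f_xy(−1, 1)
(a)     0             0              0
(b)     2             2              0
(c)     6             −12            1
(d)     120           −8             15
3. (a) $\dfrac{\partial f}{\partial x} = \sin(tx) + xt\cos(tx) + 2xt$; $\dfrac{\partial^2 f}{\partial t\,\partial x} = \dfrac{\partial^2 f}{\partial x\,\partial t} = 2x\cos(tx) - x^2t\sin(tx) + 2x$
(b) $\dfrac{\partial f}{\partial x} = zt - te^{xt}$; $\dfrac{\partial^2 f}{\partial t\,\partial x} = \dfrac{\partial^2 f}{\partial x\,\partial t} = z - e^{xt} - txe^{xt}$
(c) $\dfrac{\partial f}{\partial x} = -6x\sin(t + x^2)$; $\dfrac{\partial^2 f}{\partial t\,\partial x} = \dfrac{\partial^2 f}{\partial x\,\partial t} = -6x\cos(t + x^2)$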

Stationary Points 18.3
Introduction
The calculation of the optimum value of a function of two variables is a common requirement in many
areas of engineering, for example in thermodynamics. Unlike the case of a function of one variable
we have to use more complicated criteria to distinguish between the various types of stationary point.

Prerequisites
Before starting this Section you should . . .
• understand the idea of a function of two variables
• be able to work out partial derivatives

Learning Outcomes
On completion you should be able to . . .
• identify local maximum points, local minimum points and saddle points on the surface $z = f(x, y)$
• use first partial derivatives to locate the stationary points of a function $f(x, y)$
• use second partial derivatives to determine the nature of a stationary point
1. The stationary points of a function of two variables
Figure 7 shows a computer generated picture of the surface defined by the function $z = x^3 + y^3 - 3x - 3y$, where both $x$ and $y$ take values in the interval $[-1.8, 1.8]$.
[Figure 7: surface plot of $z$ against $x$ and $y$, with the points A, B, C and D marked]
There are four features of particular interest on the surface. At point A there is a local maximum,
at B there is a local minimum, and at C and D there are what are known as saddle points.
At A the surface is at its greatest height in the immediate neighbourhood. If we move on the surface
from A we immediately lose height no matter in which direction we travel. At B the surface is at its
least height in the neighbourhood. If we move on the surface from B we immediately gain height,
no matter in which direction we travel.
The features at C and D are quite different. In some directions as we move away from these points
along the surface we lose height whilst in others we gain height. The similarity in shape to a horse’s
saddle is evident.
At each point P of a smooth surface one can draw a unique plane which touches the surface there.
This plane is called the tangent plane at P . (The tangent plane is a natural generalisation of
the tangent line which can be drawn at each point of a smooth curve.) In Figure 7 at each of
the points A, B, C, D the tangent plane to the surface is horizontal at the point of interest. Such
points are thus known as stationary points of the function. In the next subsections we show how to locate stationary points and how to determine their nature using partial differentiation of the function $f(x, y)$.
Task
In Figures 8 and 9 what are the features at A and B?
[Figure 8: surface plot with the points A and B marked]
[Figure 9: surface plot with the points A and B marked]
Your solution
Answer
Figure 8 A is a saddle point, B is a local minimum.
Figure 9 A is a local maximum, B is a saddle point.
2. Location of stationary points

As we said in the previous subsection, the tangent plane to the surface $z = f(x, y)$ is horizontal at a stationary point. A condition which guarantees that the function $f(x, y)$ will have a stationary point at a point $(x_0, y_0)$ is that, at that point, both $f_x = 0$ and $f_y = 0$ simultaneously.
Task
Verify that $(0, 2)$ is a stationary point of the function $f(x, y) = 8x^2 + 6y^2 - 2y^3 + 5$ and find the stationary value $f(0, 2)$.
First, find $f_x$ and $f_y$:
Your solution
Answer
$f_x = 16x$; $f_y = 12y - 6y^2$
Now find the values of these partial derivatives at $x = 0$, $y = 2$:
Your solution
Answer
$f_x = 0$, $f_y = 24 - 24 = 0$
Hence $(0, 2)$ is a stationary point.
The stationary value is $f(0, 2) = 0 + 24 - 16 + 5 = 13$
Example 9
Find a second stationary point of $f(x, y) = 8x^2 + 6y^2 - 2y^3 + 5$.
Solution
$f_x = 16x$ and $f_y \equiv 6y(2 - y)$. From this we note that $f_x = 0$ when $x = 0$, and $f_y = 0$ when $y = 0$, so $x = 0$, $y = 0$, i.e. $(0, 0)$ is a second stationary point of the function.
It is important when solving the simultaneous equations fx = 0 and fy = 0 to find stationary points
not to miss any solutions. A useful tip is to factorise the left-hand sides and consider systematically
all the possibilities.
Example 10
Locate the stationary points of
$$f(x, y) = x^4 + y^4 - 36xy$$
Solution
First we write down the partial derivatives of $f(x, y)$
$$\frac{\partial f}{\partial x} = 4x^3 - 36y = 4(x^3 - 9y), \qquad \frac{\partial f}{\partial y} = 4y^3 - 36x = 4(y^3 - 9x)$$
Now we solve the equations $\dfrac{\partial f}{\partial x} = 0$ and $\dfrac{\partial f}{\partial y} = 0$:
$$x^3 - 9y = 0 \qquad \text{(i)}$$
$$y^3 - 9x = 0 \qquad \text{(ii)}$$
From (ii) we obtain:
$$x = \frac{y^3}{9} \qquad \text{(iii)}$$
Now substitute from (iii) into (i)
$$\frac{y^9}{9^3} - 9y = 0$$
$\Rightarrow\ y^9 - 9^4y = 0$
$\Rightarrow\ y(y^8 - 3^8) = 0$ (removing the common factor)
$\Rightarrow\ y(y^4 - 3^4)(y^4 + 3^4) = 0$ (using the difference of two squares)
We therefore obtain, as the only solutions:
$y = 0$ or $y^4 - 3^4 = 0$ (since $y^4 + 3^4$ is never zero)
The last equation implies:
$(y^2 - 9)(y^2 + 9) = 0$ (using the difference of two squares)
$\therefore\ y^2 = 9$ and $y = \pm 3$.
Now, using (iii): when $y = 0$, $x = 0$; when $y = 3$, $x = 3$; and when $y = -3$, $x = -3$.
The stationary points are $(0, 0)$, $(-3, -3)$ and $(3, 3)$.
Task
Locate the stationary points of
$$f(x, y) = x^3 + y^2 - 3x - 6y - 1.$$
First find the partial derivatives of $f(x, y)$:
Your solution
Answer
$\dfrac{\partial f}{\partial x} = 3x^2 - 3$, $\dfrac{\partial f}{\partial y} = 2y - 6$
Now solve simultaneously the equations $\dfrac{\partial f}{\partial x} = 0$ and $\dfrac{\partial f}{\partial y} = 0$:
Your solution
Answer
$3x^2 - 3 = 0$ and $2y - 6 = 0$.
Hence $x^2 = 1$ and $y = 3$, giving stationary points at $(1, 3)$ and $(-1, 3)$.
3. The nature of a stationary point

We state, without proof, a relatively simple test to determine the nature of a stationary point, once
located. If the surface is very flat near the stationary point then the test will not be sensitive enough
to determine the nature of the point. The test is dependent upon the values of the second order
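derivatives: $f_{xx}$, $f_{yy}$, $f_{xy}$ and also upon a combination of second order derivatives denoted by $D$ where
$$D \equiv \frac{\partial^2 f}{\partial x^2} \times \frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2,$$
which is also expressible as $D \equiv f_{xx}f_{yy} - (f_{xy})^2$.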
The test is as follows:
Key Point 4
Test to Determine the Nature of Stationary Points
1. At each stationary point work out the three second order partial derivatives.
2. Calculate the value of $D = f_{xx}f_{yy} - (f_{xy})^2$ at each stationary point.
Then, test each stationary point in turn:
3. If $D < 0$ the stationary point is a saddle point.
If $D > 0$ and $\dfrac{\partial^2 f}{\partial x^2} > 0$ the stationary point is a local minimum.
If $D > 0$ and $\dfrac{\partial^2 f}{\partial x^2} < 0$ the stationary point is a local maximum.
If $D = 0$ then the test is inconclusive (we need an alternative test).
Example 11
The function $f(x, y) = x^4 + y^4 - 36xy$ has stationary points at $(0, 0)$, $(-3, -3)$, $(3, 3)$. Use Key Point 4 to determine the nature of each stationary point.
Solution
We have $\dfrac{\partial f}{\partial x} = f_x = 4x^3 - 36y$ and $\dfrac{\partial f}{\partial y} = f_y = 4y^3 - 36x$.
Then $\dfrac{\partial^2 f}{\partial x^2} = f_{xx} = 12x^2$, $\dfrac{\partial^2 f}{\partial y^2} = f_{yy} = 12y^2$, $\dfrac{\partial^2 f}{\partial x\,\partial y} = f_{yx} = -36$.
A tabular presentation is useful for calculating $D = f_{xx}f_{yy} - (f_{xy})^2$:
Derivatives   Point (0, 0)   Point (−3, −3)   Point (3, 3)
f_xx          0              108              108
f_yy          0              108              108
f_xy          −36            −36              −36
D             < 0            > 0              > 0
$(0, 0)$ is a saddle point; $(-3, -3)$ and $(3, 3)$ are both local minima.
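The test of Key Point 4 is mechanical enough to be written as a short routine. The sketch below is one possible arrangement, with the function name `classify` chosen here for illustration; it is applied to the three stationary points of Example 11.

```python
def classify(fxx, fyy, fxy):
    """Apply the second-derivative test of Key Point 4 to the values of
    the second partial derivatives at a stationary point."""
    D = fxx * fyy - fxy**2
    if D < 0:
        return "saddle point"
    if D > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "inconclusive"


# Partial derivatives of f(x, y) = x^4 + y^4 - 36xy (Examples 10 and 11)
def fx(x, y):
    return 4 * x**3 - 36 * y


def fy(x, y):
    return 4 * y**3 - 36 * x


def fxx(x, y):
    return 12 * x**2


def fyy(x, y):
    return 12 * y**2


def fxy(x, y):
    return -36


results = {}
for point in [(0, 0), (-3, -3), (3, 3)]:
    # Each point should make both first partial derivatives vanish
    assert fx(*point) == 0 and fy(*point) == 0
    results[point] = classify(fxx(*point), fyy(*point), fxy(*point))
    print(point, results[point])
```

The routine returns "inconclusive" when $D = 0$, in which case some other argument is needed to decide the nature of the point.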
Task
Determine the nature of the stationary points of $f(x, y) = x^3 + y^2 - 3x - 6y - 1$, which are $(1, 3)$ and $(-1, 3)$.
Write down the three second partial derivatives:
Your solution
Answer
$f_{xx} = 6x$, $f_{yy} = 2$, $f_{xy} = 0$.
Now complete the table below and determine the nature of the stationary points:
Your solution
Derivatives   Point (1, 3)   Point (−1, 3)
f_xx
f_yy
f_xy
Answer
Derivatives   Point (1, 3)   Point (−1, 3)
f_xx          6              −6
f_yy          2              2
f_xy          0              0
D             > 0            < 0
State the nature of each stationary point:
Your solution
Answer
$(1, 3)$ is a local minimum; $(-1, 3)$ is a saddle point.
For most functions the procedures described above enable us to distinguish between the various types
of stationary point. However, note the following example, in which these procedures fail.
Given $f(x, y) = x^4 + y^4 + 2x^2y^2$.
$$\frac{\partial f}{\partial x} = 4x^3 + 4xy^2, \qquad \frac{\partial f}{\partial y} = 4y^3 + 4x^2y,$$
$$\frac{\partial^2 f}{\partial x^2} = 12x^2 + 4y^2, \qquad \frac{\partial^2 f}{\partial y^2} = 12y^2 + 4x^2, \qquad \frac{\partial^2 f}{\partial x\,\partial y} = 8xy$$
Location: The stationary points are located where $\dfrac{\partial f}{\partial x} = \dfrac{\partial f}{\partial y} = 0$, that is, where $4x^3 + 4xy^2 = 0$ and $4y^3 + 4x^2y = 0$. A simple factorisation implies $4x(x^2 + y^2) = 0$ and $4y(y^2 + x^2) = 0$. The only solution which satisfies both equations is $x = y = 0$ and therefore the only stationary point is $(0, 0)$.
Nature: Unfortunately, all the second partial derivatives are zero at $(0, 0)$ and therefore $D = 0$, so the test, as described in Key Point 4, fails to give us the necessary information.
However, in this example it is easy to see that the stationary point is in fact a local minimum. This could be confirmed by using a computer generated graph of the surface near the point $(0, 0)$. Alternatively, we observe $x^4 + y^4 + 2x^2y^2 \equiv (x^2 + y^2)^2$ so $f(x, y) \geq 0$, the only point where $f(x, y) = 0$ being the stationary point. This is therefore a local (and global) minimum.
Exercises
Determine the nature of the stationary points of the function in each case:
1. $f(x, y) = 8x^2 + 6y^2 - 2y^3 + 5$
2. $f(x, y) = x^3 + 15x^2 - 20y^2 + 10$
3. $f(x, y) = 4 - x^2 - xy - y^2$
4. $f(x, y) = 2x^2 + y^2 + 3xy - 3y - 5x + 8$
5. $f(x, y) = (x^2 + y^2)^2 - 2(x^2 - y^2) + 1$
6. $f(x, y) = x^4 + y^4 + 2x^2y^2 + 2x^2 + 2y^2 + 1$
Answers
1. $(0, 0)$ local minimum, $(0, 2)$ saddle point.
2. $(0, 0)$ saddle point, $(-10, 0)$ local maximum.
3. $(0, 0)$ local maximum.
4. $(-1, 3)$ saddle point.
5. $(0, 0)$ saddle point, $(1, 0)$ local minimum, $(-1, 0)$ local minimum.
6. $f(x, y) \equiv (x^2 + y^2 + 1)^2$, local minimum at $(0, 0)$.
Errors and Percentage Change 18.4
Introduction
When one variable is related to several others by a functional relationship it is possible to estimate
the percentage change in that variable caused by given percentage changes in the other variables.
For example, if the values of the input variables of a function are measured and the measurements
are in error, due to limits on the precision of measurement, then we can use partial differentiation to
estimate the effect that these errors have on the forecast value of the output.

Prerequisites
• understand the definition of partial derivatives and be able to find them

Learning Outcomes
On completion you should be able to . . .
• calculate small errors in a function of more than one variable
• calculate approximate values for absolute error, relative error and percentage relative error
1. Approximations using partial derivatives

Functions of two variables
We saw in 16.5 how to expand a function of a single variable $f(x)$ in a Taylor series:
$$f(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2!}f''(x_0) + \ldots$$
This can be written in the following alternative form (by replacing $x - x_0$ by $h$ so that $x = x_0 + h$):
$$f(x_0 + h) = f(x_0) + hf'(x_0) + \frac{h^2}{2!}f''(x_0) + \ldots$$
This expansion can be generalised to functions of two or more variables:
$$f(x_0 + h, y_0 + k) \simeq f(x_0, y_0) + hf_x(x_0, y_0) + kf_y(x_0, y_0)$$
where, assuming $h$ and $k$ to be small, we have ignored higher-order terms involving powers of $h$ and $k$. We define $\delta f$ to be the change in $f(x, y)$ resulting from small changes to $x_0$ and $y_0$, denoted by $h$ and $k$ respectively. Thus:
$$\delta f = f(x_0 + h, y_0 + k) - f(x_0, y_0)$$
and so $\delta f \simeq hf_x(x_0, y_0) + kf_y(x_0, y_0)$. Using the notation $\delta x$ and $\delta y$ instead of $h$ and $k$ for small increments in $x$ and $y$ respectively we may write
$$\delta f \simeq \delta x\, f_x(x_0, y_0) + \delta y\, f_y(x_0, y_0)$$
Finally, using the more common notation for partial derivatives, we write
$$\delta f \simeq \frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial y}\delta y.$$
Informally, the term $\delta f$ is referred to as the absolute error in $f(x, y)$ resulting from errors $\delta x$, $\delta y$ in the variables $x$ and $y$ respectively. Other measures of error are used. For example, the relative error in a variable $f$ is defined as $\dfrac{\delta f}{f}$ and the percentage relative error is $\dfrac{\delta f}{f} \times 100$.
Key Point 5
Measures of Error
If $\delta f$ is the change in $f$ at $(x_0, y_0)$ resulting from small changes $h$, $k$ to $x_0$ and $y_0$ respectively, then
$\delta f = f(x_0 + h, y_0 + k) - f(x_0, y_0)$, and
The absolute error in $f$ is $\delta f$.
The relative error in $f$ is $\dfrac{\delta f}{f}$.
The percentage relative error in $f$ is $\dfrac{\delta f}{f} \times 100$.
Note that to determine the error numerically we need to know not only the actual values of δx and
δy but also the values of x and y at the point of interest.
Example 12
Estimate the absolute error for the function $f(x, y) = x^2 + x^3y$
Solution
$f_x = 2x + 3x^2y$; $f_y = x^3$.
Then $\delta f \simeq (2x + 3x^2y)\delta x + x^3\delta y$
Task
Estimate the absolute error for $f(x, y) = x^2y^2 + x + y$ at the point $(-1, 2)$ if $\delta x = 0.1$ and $\delta y = 0.025$. Compare the estimate with the exact value of the error.
First find $f_x$ and $f_y$:
Your solution
$f_x =$          $f_y =$
Answer
$f_x = 2xy^2 + 1$, $f_y = 2x^2y + 1$
Now obtain an expression for the absolute error:
Your solution
Answer
$\delta f \simeq (2xy^2 + 1)\delta x + (2x^2y + 1)\delta y$
Now obtain the estimated value of the absolute error at the point of interest:
Your solution
Answer
$\delta f \simeq (2xy^2 + 1)\delta x + (2x^2y + 1)\delta y = (-7)(0.1) + (5)(0.025) = -0.575$.
Finally compare the estimate with the exact value:
Your solution
Answer
The actual error is calculated from
$\delta f = f(x_0 + \delta x, y_0 + \delta y) - f(x_0, y_0) = f(-0.9, 2.025) - f(-1, 2) = -0.5534937$.
We see that there is a reasonably close correspondence between the two values.
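The comparison in this Task is easy to reproduce. The sketch below evaluates the first-order estimate and the actual change for the same function and increments; the variable names are chosen here for illustration.

```python
def f(x, y):
    return x**2 * y**2 + x + y


def fx(x, y):
    return 2 * x * y**2 + 1


def fy(x, y):
    return 2 * x**2 * y + 1


x0, y0 = -1.0, 2.0
dx, dy = 0.1, 0.025

estimate = fx(x0, y0) * dx + fy(x0, y0) * dy   # first-order estimate of the change
exact = f(x0 + dx, y0 + dy) - f(x0, y0)        # actual change in f

print(estimate, exact)
```

Halving `dx` and `dy` should reduce the gap between the two values by roughly a factor of four, because the neglected terms are of second order in the increments.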
Functions of three or more variables

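If $f$ is a function of several variables $x, y, u, v, \ldots$ the error induced in $f$ as a result of making small errors $\delta x, \delta y, \delta u, \delta v, \ldots$ in $x, y, u, v, \ldots$ is found by a simple generalisation of the expression for two variables given above:
$$\delta f \simeq \frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial y}\delta y + \frac{\partial f}{\partial u}\delta u + \frac{\partial f}{\partial v}\delta v + \ldots$$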
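Example 13
Suppose that the area of triangle $ABC$ is to be calculated by measuring two sides and the included angle. Call the sides $b$ and $c$ and the angle $A$.
Then the area $S$ of the triangle is given by $S = \dfrac{1}{2}bc\sin A$.
Now suppose that the side $b$ is measured as 4.00 m, $c$ as 3.00 m and $A$ as $30^\circ$. Suppose also that the measurements of the sides could be in error by as much as $\pm 0.005$ m and of the angle by $\pm 0.01^\circ$. Calculate the likely maximum error induced in $S$ as a result of the errors in the sides and angle.
Solution
Here $S$ is a function of three variables $b$, $c$, $A$. We calculate $S = \dfrac{1}{2} \times 4 \times 3 \times \dfrac{1}{2} = 3$ m$^2$.
Now $\dfrac{\partial S}{\partial b} = \dfrac{1}{2}c\sin A$, $\dfrac{\partial S}{\partial c} = \dfrac{1}{2}b\sin A$ and $\dfrac{\partial S}{\partial A} = \dfrac{1}{2}bc\cos A$, so
$$\delta S \simeq \frac{\partial S}{\partial b}\delta b + \frac{\partial S}{\partial c}\delta c + \frac{\partial S}{\partial A}\delta A = \frac{1}{2}c\sin A\,\delta b + \frac{1}{2}b\sin A\,\delta c + \frac{1}{2}bc\cos A\,\delta A.$$
Here $|\delta b|_{\max} = |\delta c|_{\max} = 0.005$ and $|\delta A|_{\max} = \dfrac{\pi}{180} \times 0.01$ ($A$ must be measured in radians).
Substituting these values we see that the maximum error in the calculated value of $S$ is given by the approximation
$$|\delta S|_{\max} \simeq \left(\frac{1}{2} \times 3 \times \frac{1}{2} \times 0.005\right) + \left(\frac{1}{2} \times 4 \times \frac{1}{2} \times 0.005\right) + \left(\frac{1}{2} \times 4 \times 3 \times \frac{\sqrt{3}}{2} \times \frac{\pi}{180} \times 0.01\right) \simeq 0.0097 \text{ m}^2$$
Hence the estimated value of $S$ is in error by up to about $\pm 0.01$ m$^2$.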
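The arithmetic of Example 13 can be reproduced as follows. This is a sketch only; note in particular that the angle and its error are converted to radians before use, as the Solution requires.

```python
import math

b, c = 4.00, 3.00            # measured sides, m
A = math.radians(30.0)       # included angle, converted to radians
db = dc = 0.005              # maximum error in each side, m
dA = math.radians(0.01)      # maximum error in the angle, rad

S = 0.5 * b * c * math.sin(A)

# Adding the magnitudes of the three contributions gives the worst case
dS_max = (abs(0.5 * c * math.sin(A)) * db
          + abs(0.5 * b * math.sin(A)) * dc
          + abs(0.5 * b * c * math.cos(A)) * dA)

print(round(S, 4), round(dS_max, 4))
```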
Measuring the height of a building
The height $h$ of a building is estimated from (i) the known horizontal distance $x$ between the point of observation $M$ and the foot of the building and (ii) the elevation angle $\theta$ between the horizontal and the line joining the point of observation to the top of the building (see Figure 10). If the measured horizontal distance is $x = 150$ m and the elevation angle is $\theta = 40^\circ$, estimate the error in measured building height due to an error of $0.1^\circ$ in the measurement of the angle of elevation.
[Figure 10: Geometry of the measurement, showing the observation point $M$, the horizontal distance $x$ and the elevation angle $\theta$]
The variables $x$, $\theta$, and $h$ are related by
$$\tan\theta = h/x$$
or
$$x\tan\theta = h. \qquad (1)$$
The error in $h$ resulting from a measurement error in $\theta$ can be deduced by differentiating (1):
$$\frac{d(x\tan\theta)}{d\theta} = \frac{dh}{d\theta} \quad\Rightarrow\quad \tan\theta\,\frac{dx}{d\theta} + x\,\frac{d(\tan\theta)}{d\theta} = \frac{dh}{d\theta}.$$
This can be written
$$\tan\theta\,\frac{dx}{d\theta} + x\sec^2\theta = \frac{dh}{d\theta}. \qquad (2)$$
Equation (2) gives the relationship among the small variations in variables $x$, $h$ and $\theta$. Since $x$ is assumed to be without error and independent of $\theta$, $\dfrac{dx}{d\theta} = 0$ and equation (2) becomes
$$x\sec^2\theta = \frac{dh}{d\theta}. \qquad (3)$$
Equation (3) can be considered to relate the error in building height $\delta h$ to the error in angle $\delta\theta$:
$$\frac{\delta h}{\delta\theta} \simeq x\sec^2\theta.$$
It is given that $x = 150$ m.
The elevation angle $\theta = 40^\circ$ can be converted to radians, i.e. $\theta = 40\pi/180$ rad $= 2\pi/9$ rad.
Then the error in angle $\delta\theta = 0.1^\circ$ needs to be expressed in radians for consistency of the units in (3).
So $\delta\theta = 0.1\pi/180$ rad $= \pi/1800$ rad. Hence, from Equation (3)
$$\delta h = 150 \times \frac{\pi}{1800 \times \cos^2(2\pi/9)} \approx 0.45 \text{ m}.$$
So the error in building height resulting from an error in elevation angle of $0.1^\circ$ is about 0.45 m.
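The estimate can be compared with the change in height obtained by evaluating $x\tan\theta$ directly at the two angles. The following sketch does both; the variable names are illustrative.

```python
import math

x = 150.0                    # horizontal distance, m
theta = math.radians(40.0)   # elevation angle, rad
dtheta = math.radians(0.1)   # error in the angle, rad

# First-order estimate: dh = x * sec^2(theta) * dtheta
dh_estimate = x * dtheta / math.cos(theta)**2

# Direct evaluation of the change in h = x tan(theta)
dh_exact = x * math.tan(theta + dtheta) - x * math.tan(theta)

print(round(dh_estimate, 3), round(dh_exact, 3))
```

The two values differ only in the third decimal place, which suggests that the linear approximation is adequate for an angular error of this size.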
Task
Estimate the maximum error in $f(x, y) = x^2 + y^2 + xy$ at the point $x = 2$, $y = 3$ if maximum errors $\pm 0.01$ and $\pm 0.02$ are made in $x$ and $y$ respectively.
First find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$:
Your solution
$\dfrac{\partial f}{\partial x} =$          $\dfrac{\partial f}{\partial y} =$
Answer
$\dfrac{\partial f}{\partial x} = 2x + y$; $\dfrac{\partial f}{\partial y} = 2y + x$.
For $x = 2$ and $y = 3$ calculate the value of $f(x, y)$:
Your solution
Answer
$f(2, 3) = 2^2 + 3^2 + 2 \times 3 = 19$.
Now since the error in the measured value of $x$ is $\pm 0.01$ and in $y$ is $\pm 0.02$ we have $|\delta x|_{\max} = 0.01$, $|\delta y|_{\max} = 0.02$. Write down an expression to approximate to $|\delta f|_{\max}$:
Your solution
Answer
$|\delta f|_{\max} \simeq |(2x + y)|\,|\delta x|_{\max} + |(2y + x)|\,|\delta y|_{\max}$
Calculate $|\delta f|_{\max}$ at the point $x = 2$, $y = 3$ and give bounds for $f(2, 3)$:
Your solution
Answer
$|\delta f|_{\max} \simeq (2 \times 2 + 3) \times 0.01 + (2 \times 3 + 2) \times 0.02 = 0.07 + 0.16 = 0.23$.
Hence we quote $f(2, 3) = 19 \pm 0.23$, which can be expressed as $18.77 \leq f(2, 3) \leq 19.23$
2. Relative error and percentage relative error
Two other measures of error can be obtained from a knowledge of the expression for the absolute error. As mentioned earlier, the relative error in $f$ is $\dfrac{\delta f}{f}$ and the percentage relative error is $\dfrac{\delta f}{f} \times 100\,\%$.
Suppose that $f(x, y) = x^2 + y^2 + xy$ then
$$\delta f \simeq \frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial y}\delta y = (2x + y)\delta x + (2y + x)\delta y$$
The relative error is
$$\frac{\delta f}{f} \simeq \frac{1}{f}\frac{\partial f}{\partial x}\delta x + \frac{1}{f}\frac{\partial f}{\partial y}\delta y = \frac{(2x + y)}{x^2 + y^2 + xy}\delta x + \frac{(2y + x)}{x^2 + y^2 + xy}\delta y$$
The actual value of the relative error can be obtained if the actual errors of the independent variables and the values of $x$ and $y$ at the point of interest are known. In the special case where the function is a combination of powers of the input variables then there is a short cut to finding the relative error in the value of the function. For example, if $f(x, y, u) = \dfrac{x^2y^4}{u^3}$ then
$$\frac{\partial f}{\partial x} = \frac{2xy^4}{u^3}, \qquad \frac{\partial f}{\partial y} = \frac{4x^2y^3}{u^3}, \qquad \frac{\partial f}{\partial u} = -\frac{3x^2y^4}{u^4}$$
Hence
$$\delta f \simeq \frac{2xy^4}{u^3}\delta x + \frac{4x^2y^3}{u^3}\delta y - \frac{3x^2y^4}{u^4}\delta u$$
Finally,
$$\frac{\delta f}{f} \simeq \frac{2xy^4}{u^3} \times \frac{u^3}{x^2y^4}\delta x + \frac{4x^2y^3}{u^3} \times \frac{u^3}{x^2y^4}\delta y - \frac{3x^2y^4}{u^4} \times \frac{u^3}{x^2y^4}\delta u$$
Cancelling down the fractions,
$$\frac{\delta f}{f} \simeq 2\frac{\delta x}{x} + 4\frac{\delta y}{y} - 3\frac{\delta u}{u} \qquad (1)$$
so that
rel. error in $f$ $\simeq$ $2 \times$ (rel. error in $x$) $+\ 4 \times$ (rel. error in $y$) $-\ 3 \times$ (rel. error in $u$).
Note that if we write
$$f(x, y, u) = x^2y^4u^{-3}$$
we see that the coefficients of the relative errors on the right-hand side of (1) are the powers of the appropriate variable.
To find the percentage relative error we simply multiply the relative error by 100.
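The short cut for products of powers can be tested against a direct evaluation of the relative change. In the sketch below the point and the small relative errors are arbitrary assumed values, chosen only to illustrate the rule for $f = x^2y^4u^{-3}$.

```python
def f(x, y, u):
    return x**2 * y**4 / u**3


x, y, u = 2.0, 1.5, 3.0                    # arbitrary point (assumed)
rel_x, rel_y, rel_u = 1e-4, -2e-4, 3e-4    # small relative errors (assumed)

# Short cut: weight each relative error by the power of its variable
rel_f_rule = 2 * rel_x + 4 * rel_y - 3 * rel_u

# Direct evaluation of the relative change in f
f0 = f(x, y, u)
f1 = f(x * (1 + rel_x), y * (1 + rel_y), u * (1 + rel_u))
rel_f_direct = (f1 - f0) / f0

print(rel_f_rule, rel_f_direct)
```

The two results agree to within second-order terms in the relative errors, which is the accuracy expected of a first-order approximation.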
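Task
If $f = \dfrac{x^3y}{u^2}$ and $x$, $y$, $u$ are subject to percentage relative errors of 1%, −1% and 2% respectively find the approximate percentage relative error in $f$.
First find $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$ and $\dfrac{\partial f}{\partial u}$:
Your solution
$\dfrac{\partial f}{\partial x} =$          $\dfrac{\partial f}{\partial y} =$          $\dfrac{\partial f}{\partial u} =$
Answer
$\dfrac{\partial f}{\partial x} = \dfrac{3x^2y}{u^2}$, $\dfrac{\partial f}{\partial y} = \dfrac{x^3}{u^2}$, $\dfrac{\partial f}{\partial u} = -\dfrac{2x^3y}{u^3}$.
Now write down an expression for $\delta f$
Your solution
$\delta f \simeq$
Answer
$$\delta f \simeq \frac{3x^2y}{u^2}\delta x + \frac{x^3}{u^2}\delta y - \frac{2x^3y}{u^3}\delta u$$
Hence write down an expression for the percentage relative error in $f$:
Your solution
Answer
$$\frac{\delta f}{f} \times 100 \simeq \frac{3x^2y}{u^2} \times \frac{u^2}{x^3y}\delta x \times 100 + \frac{x^3}{u^2} \times \frac{u^2}{x^3y}\delta y \times 100 - \frac{2x^3y}{u^3} \times \frac{u^2}{x^3y}\delta u \times 100$$
Finally, calculate the value of the percentage relative error:
Your solution
Answer
$$\frac{\delta f}{f} \times 100 \simeq 3\frac{\delta x}{x} \times 100 + \frac{\delta y}{y} \times 100 - 2\frac{\delta u}{u} \times 100 = 3(1) - 1 - 2(2) = -2\%$$
Note that $f = x^3yu^{-2}$.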
Error in power to a load resistance
Introduction
The power required by an electrical circuit depends upon its components. However, the specification of the rating of the individual components is subject to some uncertainty. This Example concerns the calculation of the error in the power required by a circuit shown in Figure 11 given a formula for the power, the values of the individual components and the percentage errors in them.
Problem in words
The power delivered to the load resistance $R_L$ for the circuit shown in Figure 11 is given by
$$P = \frac{25R_L}{(R + R_L)^2}$$
[Figure 11: Circuit with a load resistance $R_L$ and a 2000 Ω resistance]
If $R = 2000\ \Omega$ and $R_L = 1000\ \Omega$ with a maximum possible error of 5% in either, find $P$ and estimate the maximum error in $P$.
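We can calculate $P$ by substituting $R = 2000$ and $R_L = 1000$ into $P = \dfrac{25R_L}{(R + R_L)^2}$.
We need to calculate the absolute errors in $R$ and $R_L$ and use these in the approximation $\delta P \approx \dfrac{\partial P}{\partial R}\delta R + \dfrac{\partial P}{\partial R_L}\delta R_L$ to calculate the error in $P$.
At $R = 2000$ and $R_L = 1000$
$$P = \frac{25 \times 1000}{(1000 + 2000)^2} = \frac{25}{9000} = \frac{25}{9} \times 10^{-3} \approx 2.78 \times 10^{-3} \text{ watts}.$$
A 5% error in each resistance gives $|\delta R|_{\max} = \dfrac{5}{100} \times 2000 = 100$ and $|\delta R_L|_{\max} = \dfrac{5}{100} \times 1000 = 50$, and
$$|\delta P|_{\max} \approx \left|\frac{\partial P}{\partial R}\right| |\delta R|_{\max} + \left|\frac{\partial P}{\partial R_L}\right| |\delta R_L|_{\max}$$
We need to calculate the values of the partial derivatives at $R = 2000$ and $R_L = 1000$.
$$P = \frac{25R_L}{(R + R_L)^2} = 25R_L(R + R_L)^{-2}$$
$$\frac{\partial P}{\partial R} = -50R_L(R + R_L)^{-3}$$
$$\frac{\partial P}{\partial R_L} = 25(R + R_L)^{-2} - 50R_L(R + R_L)^{-3}$$
So
$$\frac{\partial P}{\partial R}(2000, 1000) = -50(1000)(3000)^{-3} = \frac{-50}{1000^2 \times 27} = -\frac{50}{27} \times 10^{-6}$$
$$\frac{\partial P}{\partial R_L}(2000, 1000) = 25(3000)^{-2} - 50(1000)(3000)^{-3} = \left(\frac{25}{9} - \frac{50}{27}\right) \times 10^{-6} = \frac{75 - 50}{27} \times 10^{-6} = \frac{25}{27} \times 10^{-6}$$
Substituting these values into $|\delta P|_{\max} \approx \left|\dfrac{\partial P}{\partial R}\right| |\delta R|_{\max} + \left|\dfrac{\partial P}{\partial R_L}\right| |\delta R_L|_{\max}$ we get:
$$|\delta P|_{\max} = \frac{50}{27} \times 10^{-6} \times 100 + \frac{25}{27} \times 10^{-6} \times 50 = \left(\frac{5000}{27} + \frac{25 \times 50}{27}\right) \times 10^{-6} \approx 2.315 \times 10^{-4}$$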
Interpretation
At $R = 2000$ and $R_L = 1000$, $P$ will be $2.78 \times 10^{-3}$ W and, assuming 5% errors in the values of the resistors, the error in $P \approx \pm 2.315 \times 10^{-4}$ W. This represents about 8.3% error. So the error in the power is greater than that in the individual components.
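The figures in this Example can be recomputed with a few lines of code. The sketch below uses the formula for $P$ and its two partial derivatives as derived above; the function names are chosen here for illustration.

```python
def power(R, RL):
    return 25 * RL / (R + RL)**2


def dP_dR(R, RL):
    return -50 * RL * (R + RL)**-3


def dP_dRL(R, RL):
    return 25 * (R + RL)**-2 - 50 * RL * (R + RL)**-3


R, RL = 2000.0, 1000.0
dR_max, dRL_max = 0.05 * R, 0.05 * RL   # 5% of each resistance

P = power(R, RL)
# Worst case: add the magnitudes of the two contributions
dP_max = abs(dP_dR(R, RL)) * dR_max + abs(dP_dRL(R, RL)) * dRL_max

print(P, dP_max, 100 * dP_max / P)
```

The last printed value is the percentage error in $P$, which can be compared with the 5% error assumed for each resistor.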
Exercises
1. The sides of a right-angled triangle enclosing the right-angle are measured as 6 m and 8 m. The maximum errors in each measurement are $\pm 0.1$ m. Find the maximum error in the calculated area.
2. In Exercise 1, the angle opposite the 8 m side is calculated from $\tan\theta = 8/6$ as $\theta = 53^\circ 8'$. Calculate the approximate maximum error in that angle.
3. If $v = \sqrt{\dfrac{3x}{y}}$ find the maximum percentage error in $v$ due to errors of 1% in $x$ and 3% in $y$.
4. If $n = \dfrac{1}{2L}\sqrt{\dfrac{E}{d}}$ and $L$, $E$ and $d$ can be measured correct to within 1%, how accurate is the calculated value of $n$?
5. The area of a segment of a circle which subtends an angle $\theta$ is given by $A = \dfrac{1}{2}r^2(\theta - \sin\theta)$. The radius $r$ is measured with a percentage error of +0.2% and $\theta$ is measured as $45^\circ$ with an error of $+0.1^\circ$. Find the percentage error in the calculated area.
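Answers
1. $A = \dfrac{1}{2}xy$; $\delta A \approx \dfrac{\partial A}{\partial x}\delta x + \dfrac{\partial A}{\partial y}\delta y$; $\delta A \approx \dfrac{y}{2}\delta x + \dfrac{x}{2}\delta y$
Maximum error $= \left|\dfrac{y}{2}\delta x\right| + \left|\dfrac{x}{2}\delta y\right| = 0.7$ m$^2$.
2. $\theta = \tan^{-1}\dfrac{y}{x}$ so $\delta\theta = \dfrac{\partial\theta}{\partial x}\delta x + \dfrac{\partial\theta}{\partial y}\delta y = -\dfrac{y}{x^2 + y^2}\delta x + \dfrac{x}{x^2 + y^2}\delta y$
Maximum error in $\theta$ is $\left|\dfrac{-8}{6^2 + 8^2}(0.1)\right| + \left|\dfrac{6}{6^2 + 8^2}(0.1)\right| = 0.014$ rad. This is $0.80^\circ$.
3. Take logarithms of both sides: $\ln v = \dfrac{1}{2}\ln 3 + \dfrac{1}{2}\ln x - \dfrac{1}{2}\ln y$ so $\dfrac{\delta v}{v} \approx \dfrac{\delta x}{2x} - \dfrac{\delta y}{2y}$
Maximum percentage error in $v = \left|\dfrac{\delta x}{2x}\right| + \left|-\dfrac{\delta y}{2y}\right| = \dfrac{1}{2}\% + \dfrac{3}{2}\% = 2\%$.
4. Take logarithms of both sides:
$\ln n = -\ln 2 - \ln L + \dfrac{1}{2}\ln E - \dfrac{1}{2}\ln d$ so $\dfrac{\delta n}{n} = -\dfrac{\delta L}{L} + \dfrac{\delta E}{2E} - \dfrac{\delta d}{2d}$
Maximum percentage error in $n = \left|-\dfrac{\delta L}{L}\right| + \left|\dfrac{\delta E}{2E}\right| + \left|-\dfrac{\delta d}{2d}\right| = 1\% + \dfrac{1}{2}\% + \dfrac{1}{2}\% = 2\%$.
5. $A = \dfrac{1}{2}r^2(\theta - \sin\theta)$ so $\dfrac{\delta A}{A} = \dfrac{2\delta r}{r} + \dfrac{1 - \cos\theta}{\theta - \sin\theta}\delta\theta$
$$\frac{\delta A}{A} = 2(0.2)\% + \left(\frac{1 - \dfrac{1}{\sqrt{2}}}{\dfrac{\pi}{4} - \dfrac{1}{\sqrt{2}}}\right)\frac{\pi}{1800} \times 100\% = (0.4 + 0.65)\% = 1.05\%$$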
Contents 19
Differential Equations
19.1 Modelling with Differential Equations
19.2 First Order Differential Equations
19.3 Second Order Differential Equations
19.4 Applications of Differential Equations
Learning outcomes
In this Workbook you will learn what a differential equation is and how to recognise some
of the basic different types. You will learn how to apply some common techniques used
to obtain general solutions of differential equations and how to fit initial or boundary
conditions to obtain a unique solution. You will appreciate how differential equations
arise in applications and you will gain some experience in applying your knowledge to
model a number of engineering problems using differential equations.
Modelling with Differential Equations 19.1

Introduction
Many models of engineering systems involve the rate of change of a quantity. There is thus a need
to incorporate derivatives into the mathematical model. These mathematical models are examples
of differential equations.
Accompanying the differential equation will be one or more conditions that let us obtain a unique
solution to a particular problem. Often we solve the differential equation first to obtain a general
solution; then we apply the conditions to obtain the unique solution. It is important to know which
conditions must be specified in order to obtain a unique solution.

Prerequisites
• be able to differentiate (11)
• be able to integrate (13)

Learning Outcomes
• understand the use of differential equations in modelling engineering systems
• identify the order and type of a differential equation
• recognise the nature of a general solution
• determine the nature of the appropriate additional conditions which will give a unique solution to the equation
1. Case study: Newton’s law of cooling

When a hot liquid is placed in a cooler environment, experimental observation shows that its temperature decreases to approximately that of its surroundings. A typical graph of the temperature of the liquid plotted against time is shown in Figure 1.
[Figure 1: temperature of the liquid plotted against time, levelling off towards the surrounding temperature]
After an initially rapid decrease the temperature changes progressively less rapidly and eventually the
curve appears to ‘flatten out’.
Newton’s law of cooling states that the rate of cooling of liquid is proportional to the difference
between its temperature and the temperature of its environment (the ambient temperature). To
convert this into mathematics, let $t$ be the time elapsed (in seconds, s), $\theta$ the temperature of the
liquid (°C), and $\theta_0$ the temperature of the liquid at the start ($t = 0$). The temperature of the
surroundings is denoted by $\theta_s$.
Task
Write down the mathematical equation which is equivalent to Newton’s law of
cooling and state the accompanying condition.
First, find an expression for the rate of cooling, and an expression for the difference between the
liquid’s temperature and that of the environment:
Your solution
Answer
The rate of cooling is the rate of change of temperature with time: $\dfrac{d\theta}{dt}$.
The temperature difference is $\theta - \theta_s$.
Now formulate Newton’s law of cooling:
Your solution
Answer
You should obtain $\dfrac{d\theta}{dt} \propto (\theta - \theta_s)$ or, equivalently, $\dfrac{d\theta}{dt} = -k(\theta - \theta_s)$. Here $k$ is a positive
constant of proportion and the negative sign is present because $(\theta - \theta_s)$ is positive, whereas
$\dfrac{d\theta}{dt}$ must be negative, since $\theta$ decreases with time. The units of $k$ are s$^{-1}$. The accompanying
condition is $\theta = \theta_0$ at $t = 0$, which simply states the temperature of the liquid when the cooling begins.
In the above Task we call $t$ the independent variable and $\theta$ the dependent variable. Since the condition
is given at $t = 0$ we refer to it as an initial condition. For future reference, the solution of the above
differential equation which satisfies the initial condition is $\theta = \theta_s + (\theta_0 - \theta_s)e^{-kt}$.
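As a quick numerical check of the solution quoted above, the following Python sketch evaluates θ = θs + (θ0 − θs)e^(−kt) and estimates dθ/dt by a central difference to confirm that dθ/dt + k(θ − θs) is (numerically) zero. The values chosen for θ0, θs and k are illustrative assumptions, not data from the text.

```python
import math

def theta(t, theta0, theta_s, k):
    """Closed-form solution theta = theta_s + (theta0 - theta_s) * exp(-k t)."""
    return theta_s + (theta0 - theta_s) * math.exp(-k * t)

def residual(t, theta0, theta_s, k, h=1e-5):
    """d(theta)/dt + k (theta - theta_s), derivative estimated by a central difference."""
    dtheta_dt = (theta(t + h, theta0, theta_s, k) - theta(t - h, theta0, theta_s, k)) / (2 * h)
    return dtheta_dt + k * (theta(t, theta0, theta_s, k) - theta_s)

# Illustrative values only (assumed): liquid at 90 °C, surroundings at 20 °C, k = 0.01 per second.
THETA0, THETA_S, K = 90.0, 20.0, 0.01

for t in (0.0, 60.0, 300.0):
    temp = theta(t, THETA0, THETA_S, K)
    ok = abs(residual(t, THETA0, THETA_S, K)) < 1e-6
    print(t, round(temp, 3), ok)
```

The initial condition is recovered at t = 0, and the temperature approaches θs as t grows, in line with the ‘flattening out’ described for Figure 1.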
2. The general solution of a differential equation

Consider the equation $y = Ae^{2x}$ where $A$ is an arbitrary constant. If we differentiate it we obtain
$$\frac{dy}{dx} = 2Ae^{2x}$$
and so, since $y = Ae^{2x}$, we obtain
$$\frac{dy}{dx} = 2y.$$
Thus a differential equation satisfied by $y$ is
$$\frac{dy}{dx} = 2y.$$
Note that we have eliminated the arbitrary constant.
Now consider the equation
$$y = A\cos 3x + B\sin 3x$$
where $A$ and $B$ are arbitrary constants. Differentiating, we obtain
$$\frac{dy}{dx} = -3A\sin 3x + 3B\cos 3x.$$
Differentiating a second time gives
$$\frac{d^2y}{dx^2} = -9A\cos 3x - 9B\sin 3x.$$
The right-hand side is simply $(-9)$ times the expression for $y$. Hence $y$ satisfies the differential
equation
$$\frac{d^2y}{dx^2} = -9y.$$
Task
Find a differential equation satisfied by y = A cosh 2x + B sinh 2x where A and
B are arbitrary constants.
Your solution
Answer
Differentiating once we obtain $\dfrac{dy}{dx} = 2A\sinh 2x + 2B\cosh 2x$.
Differentiating a second time we obtain $\dfrac{d^2y}{dx^2} = 4A\cosh 2x + 4B\sinh 2x$.
Hence $\dfrac{d^2y}{dx^2} = 4y$.
We have seen that an expression including one arbitrary constant required one differentiation to
obtain a differential equation which eliminated the arbitrary constant. Where two constants were
present, two differentiations were required. Is the converse true? For example, would a differential
equation involving $\dfrac{dy}{dx}$ as the only derivative have a general solution with one arbitrary constant, and
would a differential equation which had $\dfrac{d^2y}{dx^2}$ as the highest derivative produce a general solution with
two arbitrary constants? The answer is, usually, yes.
Task
Integrate twice the differential equation
$$\frac{d^2y}{dx^2} = \frac{w}{2}(\ell x - x^2),$$
where $w$ and $\ell$ are constants, to find a general solution for $y$.
Your solution
Answer
Integrating once: $\dfrac{dy}{dx} = \dfrac{w}{2}\left(\dfrac{\ell x^2}{2} - \dfrac{x^3}{3}\right) + A$ where $A$ is an arbitrary constant (of integration).
Integrating again: $y = \dfrac{w}{2}\left(\dfrac{\ell x^3}{6} - \dfrac{x^4}{12}\right) + Ax + B$ where $B$ is a second arbitrary constant.
Consider the simple differential equation
$$\frac{dy}{dx} = 2x.$$
On integrating, we obtain the general solution
$$y = x^2 + C$$
where $C$ is an arbitrary constant. As $C$ varies we get different solutions, each of which belongs to
the family of solutions. Figure 2 shows some examples.
[Figure 2: the curves $y = x^2 + C$ for $C = 1, 0, -1, -3$]
It can be shown that no two members of this family of graphs ever meet and that through each
point in the $x$-$y$ plane passes one, and only one, of these graphs. Hence if we specify the boundary
condition $y = 2$ when $x = 0$, written $y(0) = 2$, then using $y = x^2 + C$:
$$2 = 0 + C \quad\text{so that}\quad C = 2$$
and $y = x^2 + 2$ is the unique solution.
Task
Find the unique solution of the differential equation $\dfrac{dy}{dx} = 3x^2$ which satisfies the
condition $y(1) = 4$.
Your solution
Answer
You should obtain $y = x^3 + 3$ since, by a single integration, we have $y = x^3 + C$, where $C$ is an
arbitrary constant. Now when $x = 1$, $y = 4$ so that $4 = 1 + C$. Hence $C = 3$ and the unique
solution is $y = x^3 + 3$.
Example 1
Solve the differential equation $\dfrac{d^2y}{dx^2} = 6x$ subject to the conditions
(a) $y(0) = 2$ and $y(1) = 3$
(b) $y(0) = 2$ and $y(1) = 5$
(c) $y(0) = 2$ and $\dfrac{dy}{dx} = 1$ at $x = 0$.
Solution
(a) Integrating the differential equation once produces $\dfrac{dy}{dx} = 3x^2 + A$. The general solution is found
by integrating a second time to give $y = x^3 + Ax + B$, where $A$ and $B$ are arbitrary constants.
Imposing the conditions $y(0) = 2$ and $y(1) = 3$: at $x = 0$ we have $y = 2 = 0 + 0 + B = B$ so that
$B = 2$, and at $x = 1$ we have $y = 3 = 1 + A + B = 1 + A + 2$. Therefore $A = 0$ and the solution is
$y = x^3 + 2$.
(b) Here the second condition is $y(1) = 5$ so at $x = 1$
$$y = 5 = 1 + A + 2 \quad\text{so that}\quad A = 2$$
and the solution in this case is
$$y = x^3 + 2x + 2.$$
(c) Here the second condition is $\dfrac{dy}{dx} = 1$ at $x = 0$, i.e. $y'(0) = 1$.
Then since $\dfrac{dy}{dx} = 3x^2 + A$, putting $x = 0$ we get:
$$\frac{dy}{dx} = 1 = 0 + A$$
so that $A = 1$ and the solution in this case is $y = x^3 + x + 2$.
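The fitting of the two constants in Example 1 can be sketched in a few lines of Python. The function name `solve_example` and its `kind` argument are hypothetical helpers introduced only for this illustration; the sketch assumes the general solution y = x³ + Ax + B found above.

```python
def solve_example(y0, second, kind):
    """Fit A and B in y = x**3 + A*x + B, the general solution of y'' = 6x.

    y0 is the value y(0); `second` is either y(1) (kind="y1")
    or y'(0) (kind="dy0").
    """
    B = y0                      # y(0) = 0 + 0 + B
    if kind == "y1":
        A = second - 1 - B      # y(1) = 1 + A + B
    elif kind == "dy0":
        A = second              # y'(0) = 0 + A
    else:
        raise ValueError("kind must be 'y1' or 'dy0'")
    return A, B

print(solve_example(2, 3, "y1"))   # case (a)
print(solve_example(2, 5, "y1"))   # case (b)
print(solve_example(2, 1, "dy0"))  # case (c)
```

Each case uses the same general solution; only the pair of conditions, and hence the values of A and B, changes.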
3. Classifying differential equations
When solving differential equations (either analytically or numerically) it is important to be able to
recognise the various kinds that can arise. We therefore need to introduce some terminology which
will help us to distinguish one kind of differential equation from another.
• An ordinary differential equation (ODE) is any relation between a function of a single

variable and its derivatives. (All differential equations studied in this workbook are ordinary.)
• The order of a differential equation is the order of the highest derivative in the equation.
• A differential equation is linear if the dependent variable and its derivatives occur to the first
power only and if there are no products involving the dependent variable or its derivatives.
Example 2
Classify the differential equations specifying the order and type (linear/non-linear)
(a) $\dfrac{d^2y}{dx^2} - \dfrac{dy}{dx} = x^2$
(b) $\dfrac{d^2x}{dt^2} = \left(\dfrac{dx}{dt}\right)^3 + 3x$
(c) $\dfrac{dx}{dt} - x = t^2$
(d) $\dfrac{dy}{dt} + \cos y = 0$
(e) $\dfrac{dy}{dt} + y^2 = 4$
Solution
(a) Second order, linear.
(b) Second order, non-linear (because of the cubic term).
(c) First order, linear.
(d) First order, non-linear (because of the $\cos y$ term).
(e) First order, non-linear (because of the $y^2$ term).
Note that in (a) the independent variable is $x$ whereas in the other cases it is $t$.
In (a), (d) and (e) the dependent variable is $y$ and in (b) and (c) it is $x$.
Exercises
1. In this RL circuit the switch is closed at $t = 0$ and a constant voltage $E$ is applied.
[Circuit diagram: resistance $R$ and inductance $L$ in series with the voltage source $E$, carrying current $i$]
The voltage across the resistor is $iR$ where $i$ is the current flowing in the circuit and $R$ is
the (constant) resistance. The voltage across the inductance is $L\dfrac{di}{dt}$ where $L$ is the constant
inductance.
Kirchhoff’s law of voltages states that the applied voltage is the sum of the other voltages in
the circuit. Write down a differential equation for the current $i$ and state the initial condition.
2. The diagram below shows the graph of $i$ against $t$ (from Exercise 1). What information does
this graph convey?
[Graph of $i$ against $t$, with the value $E/R$ marked on the $i$ axis]
3. In the LCR circuit below the voltage across the capacitor is $q/C$ where $q$ is the charge on the
capacitor, and $C$ is the capacitance. Note that $\dfrac{dq}{dt} = i$. Find a differential equation for $i$ and
write down the initial conditions if the initial charge is zero and the switch is closed at $t = 0$.
[Circuit diagram: $L$, $C$ and $R$ in series with the voltage source $E$, carrying current $i$]
4. Find differential equations satisfied by
(a) $y = A\cos 4x + B\sin 4x$
(b) $x = Ae^{-2t}$
(c) $y = A\sin x + B\sinh x + C\cos x + D\cosh x$ (harder)
5. Find the family of solutions of the differential equation $\dfrac{dy}{dx} = -2x$. Sketch the curves of some
members of the family on the same axes. What is the solution if $y(1) = 3$?
6. (a) Find the general solution of the differential equation $y'' = 12x^2$.
(b) Find the solution which satisfies $y(0) = 2$, $y(1) = 8$.
(c) Find the solution which satisfies $y(0) = 1$, $y'(0) = -2$.
7. Classify the differential equations
(a) $\dfrac{d^2x}{dt^2} + 3\dfrac{dx}{dt} = x$
(b) $\dfrac{d^3y}{dx^3} = \left(\dfrac{dy}{dx}\right)^2 + \dfrac{dy}{dx}$
(c) $\dfrac{dy}{dx} + y = \sin x$
(d) $\dfrac{d^2y}{dx^2} + y\dfrac{dy}{dx} = 2$.
Answers
1. $L\dfrac{di}{dt} + Ri = E$; $i = 0$ at $t = 0$.
2. Current increases rapidly at first, then less rapidly and tends to the value $\dfrac{E}{R}$, which is what
it would be in the absence of $L$.
3. $L\dfrac{d^2q}{dt^2} + R\dfrac{dq}{dt} + \dfrac{q}{C} = E$; $q = 0$ and $i = \dfrac{dq}{dt} = 0$ at $t = 0$.
4. (a) $\dfrac{d^2y}{dx^2} = -16y$ (b) $\dfrac{dx}{dt} = -2x$ (c) $\dfrac{d^4y}{dx^4} = y$
5. $y = -x^2 + C$
[Sketch: the curves $y = -x^2 + C$ for $C = 1, 0, -1$]
If $3 = -1 + C$ then $C = 4$ and $y = -x^2 + 4$.
6. (a) $y = x^4 + Ax + B$
(b) When $x = 0$, $y = 2 = B$; hence $B = 2$. When $x = 1$, $y = 8 = 1 + A + B = 3 + A$;
hence $A = 5$ and $y = x^4 + 5x + 2$.
(c) When $x = 0$, $y = 1 = B$. Hence $B = 1$; $\dfrac{dy}{dx} = y' = 4x^3 + A$, so at $x = 0$, $y' = -2 = A$.
Therefore $y = x^4 - 2x + 1$.
7. (a) Second order, linear
(b) Third order, non-linear (squared term)
(c) First order, linear
(d) Second order, non-linear (product term)
First Order
Differential Equations 19.2
Introduction
Separation of variables is a technique commonly used to solve first order ordinary differential
equations. It is so-called because we rearrange the equation to be solved such that all terms involving
the dependent variable appear on one side of the equation, and all terms involving the independent
variable appear on the other. Integration completes the solution. Not all first order equations can be
rearranged in this way so this technique is not always appropriate. Further, it is not always possible
to perform the integration even if the variables are separable.
In this Section you will learn how to decide whether the method is appropriate, and how to apply it
in such cases.
An exact first order differential equation is one which can be solved by simply integrating both sides.
Only very few first order differential equations are exact. You will learn how to recognise these and
solve them. Some others may be converted simply to exact equations and that is also considered.
Whilst exact differential equations are few and far between, an important class of differential equations
can be converted into exact equations by multiplying through by a function known as the integrating
factor for the equation. In the last part of this Section you will learn how to decide whether an
equation is capable of being transformed into an exact equation, how to determine the integrating
factor, and how to obtain the solution of the original equation.

Prerequisites
Before starting this Section you should . . .
• understand what is meant by a differential equation (Section 19.1)

Learning Outcomes
On completion you should be able to . . .
• explain what is meant by separating the variables of a first order differential equation
• determine whether a first order differential equation is separable
• solve a variety of equations using the separation of variables technique
Section 19.2: First Order Differential Equations
1. Separating the variables in first order ODEs
In this Section we consider differential equations which can be written in the form
$$\frac{dy}{dx} = f(x)\,g(y)$$
Note that the right-hand side is a product of a function of $x$, and a function of $y$. Examples of such
equations are
$$\frac{dy}{dx} = x^2y^3, \qquad \frac{dy}{dx} = y^2\sin x \qquad\text{and}\qquad \frac{dy}{dx} = y\ln x$$
Not all first order equations can be written in this form. For example, it is not possible to rewrite
the equation
$$\frac{dy}{dx} = x^2 + y^3$$
in the form
$$\frac{dy}{dx} = f(x)\,g(y)$$
Task
Determine which of the following differential equations can be written in the form
$$\frac{dy}{dx} = f(x)\,g(y)$$
If possible, rewrite each equation in this form.
(a) $\dfrac{dy}{dx} = \dfrac{x^2}{y^2}$, (b) $\dfrac{dy}{dx} = 4x^2 + 2y^2$, (c) $y\dfrac{dy}{dx} + 3x = 7$
Your solution
Answer
(a) $\dfrac{dy}{dx} = x^2 \times \dfrac{1}{y^2}$, (b) cannot be written in the stated form,
(c) Reformulating gives $\dfrac{dy}{dx} = (7 - 3x) \times \dfrac{1}{y}$ which is in the required form.
The variables involved in differential equations need not be $x$ and $y$. Any symbols for variables may
be used. Other first order differential equations are
$$\frac{dz}{dt} = te^z, \qquad \frac{d\theta}{dt} = -\theta \qquad\text{and}\qquad \frac{dv}{dr} = v\left(\frac{1}{r^2}\right)$$
Given a differential equation in the form
$$\frac{dy}{dx} = f(x)\,g(y)$$
we can divide through by $g(y)$ to obtain
$$\frac{1}{g(y)}\frac{dy}{dx} = f(x)$$
If we now integrate both sides of this equation with respect to $x$ we obtain
$$\int \frac{1}{g(y)}\frac{dy}{dx}\,dx = \int f(x)\,dx$$
that is
$$\int \frac{1}{g(y)}\,dy = \int f(x)\,dx$$
We have separated the variables because the left-hand side contains only the variable $y$, and the
right-hand side contains only the variable $x$. We can now try to integrate each side separately. If
we can actually perform the required integrations we will obtain a relationship between $y$ and $x$.
Examples of this process are given in the next subsection.
Key Point 1
Method of Separation of Variables
The solution of the equation
$$\frac{dy}{dx} = f(x)\,g(y)$$
may be found from separating the variables and integrating:
$$\int \frac{1}{g(y)}\,dy = \int f(x)\,dx$$
2. Applying the method of separation of variables to ODEs
Example 3
Use the method of separation of variables to solve the differential equation
$$\frac{dy}{dx} = \frac{3x^2}{y}$$
Solution
The equation already has the form
$$\frac{dy}{dx} = f(x)\,g(y)$$
where
$$f(x) = 3x^2 \quad\text{and}\quad g(y) = 1/y.$$
Dividing both sides by $g(y)$ we find
$$y\frac{dy}{dx} = 3x^2$$
Integrating both sides with respect to $x$ gives
$$\int y\frac{dy}{dx}\,dx = \int 3x^2\,dx$$
that is
$$\int y\,dy = \int 3x^2\,dx$$
Note that the left-hand side is an integral involving just $y$; the right-hand side is an integral involving
just $x$. After integrating both sides with respect to the stated variables we find
$$\tfrac{1}{2}y^2 = x^3 + c$$
where $c$ is a constant of integration. (You might think that there would be a constant on the
left-hand side too. You are quite right but the two constants can be combined into a single constant
and so we need only write one.)
We now have a relationship between $y$ and $x$ as required. Often it is sufficient to leave your answer
in this form but you may also be required to obtain an explicit relation for $y$ in terms of $x$. In this
particular case
$$y^2 = 2x^3 + 2c$$
so that
$$y = \pm\sqrt{2x^3 + 2c}$$
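One way to gain confidence in the result of Example 3 is to integrate dy/dx = 3x²/y numerically and compare with the explicit formula. The sketch below uses a classical fourth-order Runge-Kutta step, which is not part of the text and is swapped in purely as a checking tool; the initial condition y(0) = 1 is an assumed value chosen for illustration (it corresponds to c = 1/2 and the positive square root).

```python
import math

def rk4(f, x0, y0, x_end, n):
    """Classical fourth-order Runge-Kutta integration of dy/dx = f(x, y)."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def rhs(x, y):
    """Right-hand side of the equation in Example 3."""
    return 3 * x**2 / y

def exact(x, y0):
    """Explicit solution y = +sqrt(2x^3 + 2c), with c fixed by y(0) = y0 > 0."""
    c = 0.5 * y0**2          # from (1/2) y^2 = x^3 + c at x = 0
    return math.sqrt(2 * x**3 + 2 * c)

numeric = rk4(rhs, 0.0, 1.0, 1.0, 1000)   # assumed initial condition y(0) = 1
print(numeric, exact(1.0, 1.0))
```

The two printed values should agree to many decimal places, which supports both the separation step and the choice of sign for this initial condition.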
Task
Use the method of separation of variables to solve the differential equation
$$\frac{dy}{dx} = \frac{\cos x}{\sin 2y}$$
First separate the variables so that terms involving $y$ and $\dfrac{dy}{dx}$ appear on the left, and terms involving
$x$ appear on the right:
Your solution
Answer
You should have obtained
$$\sin 2y\,\frac{dy}{dx} = \cos x$$
Now reformulate both sides as integrals:
Your solution
Answer
$$\int \sin 2y\,\frac{dy}{dx}\,dx = \int \cos x\,dx \quad\text{that is}\quad \int \sin 2y\,dy = \int \cos x\,dx$$
Now integrate both sides:
Your solution
Answer
$$-\tfrac{1}{2}\cos 2y = \sin x + c$$
Finally, rearrange to obtain an expression for $y$ in terms of $x$:
Your solution
Answer
$$y = \tfrac{1}{2}\cos^{-1}(D - 2\sin x) \quad\text{where}\quad D = -2c$$
Exercises
1. Solve the equation
$$\frac{dy}{dx} = \frac{e^{-x}}{y}.$$
2. Solve the following equation subject to the condition $y(0) = 1$:
$$\frac{dy}{dx} = 3x^2e^{-y}$$
3. Find the general solution of the following equations:
(a) $\dfrac{dy}{dx} = 3$, (b) $\dfrac{dy}{dx} = \dfrac{6\sin x}{y}$
4. (a) Find the general solution of the equation
$$\frac{dx}{dt} = t(x - 2).$$
(b) Find the particular solution which satisfies the condition $x(0) = 5$.
5. Some equations which do not appear to be separable can be made so by means of a suitable
substitution. By means of the substitution $z = y/x$ solve the equation
$$\frac{dy}{dx} = \frac{y^2}{x^2} + \frac{y}{x} + 1$$
6. The equation
$$iR + L\frac{di}{dt} = E$$
where $R$, $L$ and $E$ are constants arises in electrical circuit theory. This equation can be
solved by separation of variables. Find the solution which satisfies the condition $i(0) = 0$.
Answers
1. $y = \pm\sqrt{D - 2e^{-x}}$.
2. $y = \ln(x^3 + e)$.
3. (a) $y = 3x + C$, (b) $\tfrac{1}{2}y^2 = C - 6\cos x$.
4. (a) $x = 2 + Ae^{t^2/2}$, (b) $x = 2 + 3e^{t^2/2}$.
5. $z = \tan(\ln Dx)$ so that $y = x\tan(\ln Dx)$.
6. $i = \dfrac{E}{R}\left(1 - e^{-t/\tau}\right)$ where $\tau = L/R$.
3. Exact equations
Consider the differential equation
$$\frac{dy}{dx} = 3x^2$$
By direct integration we find that the general solution of this equation is
$$y = x^3 + C$$
where $C$ is, as usual, an arbitrary constant of integration.
Next, consider the differential equation
$$\frac{d}{dx}(yx) = 3x^2.$$
Again, by direct integration we find that the general solution is
$$yx = x^3 + C.$$
We now divide this equation by $x$ to obtain
$$y = x^2 + \frac{C}{x}.$$
The differential equation $\dfrac{d}{dx}(yx) = 3x^2$ is called an exact equation. It can effectively be solved by
integrating both sides.
Task
Solve the equations (a) $\dfrac{dy}{dx} = 5x^4$ (b) $\dfrac{d}{dx}(x^3y) = 5x^4$
Your solution
(a) y = (b) y =
Answer
(a) $y = x^5 + C$ (b) $x^3y = x^5 + C$ so that $y = x^2 + \dfrac{C}{x^3}$.
If we consider examples of this kind in a more general setting we obtain the following Key Point:
Key Point 2
The solution of the equation
$$\frac{d}{dx}\bigl(f(x)\cdot y\bigr) = g(x)$$
is
$$f(x)\cdot y = \int g(x)\,dx \quad\text{or}\quad y = \frac{1}{f(x)}\int g(x)\,dx$$
4. Solving exact equations

As we have seen, the differential equation $\dfrac{d}{dx}(yx) = 3x^2$ has solution $y = x^2 + C/x$. In the solution,
$x^2$ is called the definite part and $C/x$ is called the indefinite part (containing the arbitrary constant
of integration). If we take the definite part of this solution, i.e. $y_d = x^2$, then
$$\frac{d}{dx}(y_d\cdot x) = \frac{d}{dx}(x^2\cdot x) = \frac{d}{dx}(x^3) = 3x^2.$$
Hence $y_d = x^2$ is a solution of the differential equation.
Now if we take the indefinite part of the solution, i.e. $y_i = C/x$, then
$$\frac{d}{dx}(y_i\cdot x) = \frac{d}{dx}\left(\frac{C}{x}\cdot x\right) = \frac{d}{dx}(C) = 0.$$
It is always the case that the general solution of an exact equation is in two parts: a definite part
$y_d(x)$ which is a solution of the differential equation and an indefinite part $y_i(x)$ which satisfies a
simpler version of the differential equation in which the right-hand side is zero.
Task
(a) Solve the equation
$$\frac{d}{dx}(y\cos x) = \cos x$$
(b) Verify that the indefinite part of the solution satisfies the equation
$$\frac{d}{dx}(y\cos x) = 0.$$
(a) Integrate both sides of the first differential equation:
Your solution
Answer
$$y\cos x = \int \cos x\,dx = \sin x + C \quad\text{leading to}\quad y = \tan x + C\sec x$$
(b) Substitute for $y$ in the indefinite part (i.e. the part which contains the arbitrary constant) in the
second differential equation:
Your solution
Answer
The indefinite part of the solution is $y_i = C\sec x$ and so $y_i\cos x = C$ and
$$\frac{d}{dx}(y_i\cos x) = \frac{d}{dx}(C) = 0$$
5. Recognising an exact equation

The equation $\dfrac{d}{dx}(yx) = 3x^2$ is exact, as we have seen. If we expand the left-hand side of this
equation (i.e. differentiate the product) we obtain
$$x\frac{dy}{dx} + y.$$
Hence the equation
$$x\frac{dy}{dx} + y = 3x^2$$
must be exact, but it is not so obvious that it is exact as in the original form. This leads to the
following Key Point:
Key Point 3
The equation
$$f(x)\frac{dy}{dx} + y\,f'(x) = g(x)$$
is exact. It can be re-written as
$$\frac{d}{dx}\bigl(y\,f(x)\bigr) = g(x) \quad\text{so that}\quad y\,f(x) = \int g(x)\,dx$$
Example 4
Solve the equation
$$x^3\frac{dy}{dx} + 3x^2y = x$$
Solution
Comparing this equation with the form in Key Point 3 we see that $f(x) = x^3$ and $g(x) = x$. Hence
the equation can be written
$$\frac{d}{dx}(yx^3) = x$$
which has solution
$$yx^3 = \int x\,dx = \tfrac{1}{2}x^2 + C.$$
Therefore
$$y = \frac{1}{2x} + \frac{C}{x^3}.$$
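The answer to Example 4 can be checked by substituting it back into the original equation. The Python sketch below does this numerically, estimating dy/dx with a central difference; the sample values of x and C are arbitrary choices made for illustration.

```python
def y(x, C):
    """Solution found in Example 4: y = 1/(2x) + C/x**3."""
    return 1 / (2 * x) + C / x**3

def lhs(x, C, h=1e-6):
    """x^3 y' + 3 x^2 y, with y' estimated by a central difference."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return x**3 * dy + 3 * x**2 * y(x, C)

# The left-hand side should reproduce g(x) = x for any value of the constant C.
for C in (-2.0, 0.0, 1.5):
    for x in (0.5, 1.0, 3.0):
        assert abs(lhs(x, C) - x) < 1e-6

print("x^3 y' + 3x^2 y = x holds at the sample points")
```

Because the check passes for several values of C, it also illustrates that the C/x³ term contributes nothing to the left-hand side.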
Task
Solve the equation $\sin x\,\dfrac{dy}{dx} + y\cos x = \cos x$.
Your solution
Answer
You should obtain $y = 1 + C\,\mathrm{cosec}\,x$ since, here, $f(x) = \sin x$ and $g(x) = \cos x$. Then
$$\frac{d}{dx}(y\sin x) = \cos x \quad\text{and}\quad y\sin x = \int \cos x\,dx = \sin x + C$$
Finally $y = 1 + C\,\mathrm{cosec}\,x$.
Exercises
1. Solve the equation $\dfrac{d}{dx}(yx^2) = x^3$.
2. Solve the equation $\dfrac{d}{dx}(ye^x) = e^{2x}$ given the condition $y(0) = 2$.
3. Solve the equation $e^{2x}\dfrac{dy}{dx} + 2e^{2x}y = x^2$.
4. Show that the equation $x^2\dfrac{dy}{dx} + 2xy = x^3$ is exact and obtain its solution.
5. Show that the equation $x^2\dfrac{dy}{dx} + 3xy = x^3$ is not exact.
Multiply the equation by $x$ and show that the resulting equation is exact and obtain its solution.
Answers
1. $y = \dfrac{x^2}{4} + \dfrac{C}{x^2}$.
2. $y = \tfrac{1}{2}e^x + \tfrac{3}{2}e^{-x}$.
3. $y = \left(\tfrac{1}{3}x^3 + C\right)e^{-2x}$.
4. $y = \tfrac{1}{4}x^2 + \dfrac{C}{x^2}$.
5. $y = \tfrac{1}{5}x^2 + \dfrac{C}{x^3}$.
5 x
6. The integrating factor

The equation
$$x^2\frac{dy}{dx} + 3xy = x^3$$
is not exact. However, if we multiply it by $x$ we obtain the equation
$$x^3\frac{dy}{dx} + 3x^2y = x^4.$$
This can be re-written as
$$\frac{d}{dx}(x^3y) = x^4$$
which is an exact equation with solution
$$x^3y = \int x^4\,dx$$
so
$$x^3y = \tfrac{1}{5}x^5 + C$$
and hence
$$y = \tfrac{1}{5}x^2 + \frac{C}{x^3}.$$
The function by which we multiplied the given differential equation in order to make it exact is called
an integrating factor. In this example the integrating factor is simply $x$.
Task
Which of the following differential equations can be made exact by multiplying by $x^2$?
(a) $\dfrac{dy}{dx} + \dfrac{2}{x}y = 4$
(b) $x\dfrac{dy}{dx} + 3y = x^2$
(c) $\dfrac{1}{x}\dfrac{dy}{dx} - \dfrac{1}{x^2}y = x$
(d) $\dfrac{1}{x}\dfrac{dy}{dx} + \dfrac{1}{x^2}y = 3$.
Where possible, write the exact equation in the form $\dfrac{d}{dx}(f(x)\,y) = g(x)$.
Your solution
Answer
(a) Yes. $x^2\dfrac{dy}{dx} + 2xy = 4x^2$ becomes $\dfrac{d}{dx}(x^2y) = 4x^2$.
(b) Yes. $x^3\dfrac{dy}{dx} + 3x^2y = x^4$ becomes $\dfrac{d}{dx}(x^3y) = x^4$.
(c) No. This equation is already exact as it can be written in the form $\dfrac{d}{dx}\left(\dfrac{1}{x}y\right) = x$.
(d) Yes. $x\dfrac{dy}{dx} + y = 3x^2$ becomes $\dfrac{d}{dx}(xy) = 3x^2$.
7. Finding the integrating factor for linear ODEs

The differential equation governing the current $i$ in a circuit with inductance $L$ and resistance $R$ in
series subject to an applied electromotive force $E\cos\omega t$, where $E$ and $\omega$ are constants, is
$$L\frac{di}{dt} + Ri = E\cos\omega t \qquad (1)$$
This is an example of a linear differential equation in which $i$ is the dependent variable and $t$ is
the independent variable. The general standard form of a linear first order differential equation is
normally written with ‘y’ as the dependent variable and with ‘x’ as the independent variable and
arranged so that the coefficient of $\dfrac{dy}{dx}$ is 1. That is, it takes the form:
$$\frac{dy}{dx} + f(x)\,y = g(x) \qquad (2)$$
in which $f(x)$ and $g(x)$ are functions of $x$.
Comparing (1) and (2), $x$ is replaced by $t$ and $y$ by $i$ to produce $\dfrac{di}{dt} + f(t)\,i = g(t)$. The function
$f(t)$ is the coefficient of the dependent variable in the differential equation. We shall describe the
method of finding the integrating factor for (1) and then generalise it to a linear differential equation
written in standard form.
Step 1 Write the differential equation in standard form i.e. with the coefficient of the derivative
equal to 1. Here we need to divide through by $L$:
$$\frac{di}{dt} + \frac{R}{L}i = \frac{E}{L}\cos\omega t.$$
Step 2 Integrate the coefficient of the dependent variable (that is, $f(t) = R/L$) with respect to
the independent variable (that is, $t$), ignoring the constant of integration:
$$\int \frac{R}{L}\,dt = \frac{R}{L}t.$$
Step 3 Take the exponential of the function obtained in Step 2.
This is the integrating factor (I.F.):
$$\text{I.F.} = e^{Rt/L}.$$
This leads to the following Key Point on integrating factors:
Key Point 4
The linear differential equation (written in standard form)
$$\frac{dy}{dx} + f(x)\,y = g(x) \quad\text{has an integrating factor}\quad \text{I.F.} = \exp\left(\int f(x)\,dx\right)$$
Task
Find the integrating factors for the equations
(a) $x\dfrac{dy}{dx} + 2xy = xe^{-2x}$ (b) $t\dfrac{di}{dt} + 2ti = te^{-2t}$ (c) $\dfrac{dy}{dx} - (\tan x)\,y = 1$.
Your solution
Answer
(a) Step 1 Divide by $x$ to obtain $\dfrac{dy}{dx} + 2y = e^{-2x}$
Step 2 The coefficient of the dependent variable is 2, hence $\displaystyle\int 2\,dx = 2x$
Step 3 I.F. $= e^{2x}$
(b) The only difference from (a) is that $i$ replaces $y$ and $t$ replaces $x$. Hence I.F. $= e^{2t}$.
(c) Step 1 This is already in the standard form.
Step 2 $\displaystyle\int -\tan x\,dx = \int \frac{-\sin x}{\cos x}\,dx = \ln\cos x$.
Step 3 I.F. $= e^{\ln\cos x} = \cos x$
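The three steps of Key Point 4 can also be carried out numerically when the integral of f(x) is awkward to find by hand. The sketch below is one possible approach, not the method of the text: it approximates the integral with composite Simpson's rule from an assumed reference point `x_ref` (changing `x_ref` only rescales the integrating factor by a constant, which is the numerical counterpart of ignoring the constant of integration). The function name and parameters are hypothetical.

```python
import math

def integrating_factor(f, x, x_ref=0.0, n=1000):
    """exp of the integral of f from x_ref to x, by composite Simpson's rule (n even)."""
    h = (x - x_ref) / n
    total = f(x_ref) + f(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(x_ref + i * h)
    return math.exp(total * h / 3)

# Cases (a) and (c) of the Task, evaluated at the arbitrary point x = 0.7.
if_const = integrating_factor(lambda x: 2.0, 0.7)           # compare with e^{2x}
if_tan = integrating_factor(lambda x: -math.tan(x), 0.7)    # compare with cos x

print(if_const, math.exp(1.4))
print(if_tan, math.cos(0.7))
```

With `x_ref = 0` the numerical values coincide with the closed forms found in the Task answers, because both e^{2x} and cos x equal 1 at x = 0.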
8. Solving equations via the integrating factor

Having found the integrating factor for a linear equation we now proceed to solve the equation.
Returning to the differential equation, written in standard form:
$$\frac{di}{dt} + \frac{R}{L}i = \frac{E}{L}\cos\omega t$$
for which the integrating factor is
$$e^{Rt/L}$$
we multiply the equation by the integrating factor to obtain
$$e^{Rt/L}\frac{di}{dt} + \frac{R}{L}e^{Rt/L}\,i = \frac{E}{L}e^{Rt/L}\cos\omega t$$
At this stage the left-hand side of this equation can always be simplified as follows:
$$\frac{d}{dt}\left(e^{Rt/L}\,i\right) = \frac{E}{L}e^{Rt/L}\cos\omega t.$$
Now this is in the form of an exact differential equation and so we can integrate both sides to obtain
the solution:
$$e^{Rt/L}\,i = \frac{E}{L}\int e^{Rt/L}\cos\omega t\,dt.$$
All that remains is to complete the integral on the right-hand side. Using the method of integration
by parts we find
$$\int e^{Rt/L}\cos\omega t\,dt = \frac{L}{L^2\omega^2 + R^2}\left[\omega L\sin\omega t + R\cos\omega t\right]e^{Rt/L}$$
Hence
$$e^{Rt/L}\,i = \frac{E}{L^2\omega^2 + R^2}\left[\omega L\sin\omega t + R\cos\omega t\right]e^{Rt/L} + C.$$
Finally
$$i = \frac{E}{L^2\omega^2 + R^2}\left[\omega L\sin\omega t + R\cos\omega t\right] + Ce^{-Rt/L}$$
is the solution to the original differential equation (1). Note that, as we should expect for the solution
to a first order differential equation, it contains a single arbitrary constant $C$.
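The solution just obtained can be checked by substituting it back into equation (1). The following Python sketch does so numerically with a central-difference estimate of di/dt. The component values and the value of the arbitrary constant are illustrative assumptions, not values from the text.

```python
import math

def current(t, L, R, E, w, C):
    """General solution quoted in the text for L di/dt + R i = E cos(wt)."""
    steady = E / (L**2 * w**2 + R**2) * (w * L * math.sin(w * t) + R * math.cos(w * t))
    return steady + C * math.exp(-R * t / L)

def residual(t, L, R, E, w, C, h=1e-6):
    """L di/dt + R i - E cos(wt), with di/dt estimated by a central difference."""
    di_dt = (current(t + h, L, R, E, w, C) - current(t - h, L, R, E, w, C)) / (2 * h)
    return L * di_dt + R * current(t, L, R, E, w, C) - E * math.cos(w * t)

# Illustrative values only (assumed): L in henries, R in ohms, E in volts, w in rad/s.
PARAMS = dict(L=0.5, R=10.0, E=5.0, w=50.0, C=0.3)

worst = max(abs(residual(t, **PARAMS)) for t in (0.0, 0.01, 0.1, 1.0))
print(worst)  # should be very close to zero
```

The term containing C decays as t increases, so for large t only the sinusoidal part of the current remains, whatever the initial condition.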
Task
Using the integrating factors found in the earlier Task, find the
general solutions to the differential equations
(a) $x^2\dfrac{dy}{dx} + 2x^2y = x^2e^{-2x}$ (b) $t^2\dfrac{di}{dt} + 2t^2i = t^2e^{-2t}$ (c) $\dfrac{dy}{dx} - (\tan x)\,y = 1$.
Your solution
Answer
(a) The standard form is $\dfrac{dy}{dx} + 2y = e^{-2x}$ for which the integrating factor is $e^{2x}$. Multiplying through by it gives
$$e^{2x}\frac{dy}{dx} + 2e^{2x}y = 1$$
i.e. $\dfrac{d}{dx}(e^{2x}y) = 1$ so that $e^{2x}y = x + C$,
leading to $y = (x + C)e^{-2x}$.
(b) The general solution is $i = (t + C)e^{-2t}$ as this problem is the same as (a) with different variables.
(c) The equation is in standard form and the integrating factor is $\cos x$. Then
$$\frac{d}{dx}(\cos x\;y) = \cos x \quad\text{so that}\quad \cos x\;y = \int \cos x\,dx = \sin x + C$$
giving $y = \tan x + C\sec x$.
An RC circuit with a single frequency input
Introduction
The components in electrical circuits containing resistance, inductance and capacitance can be chosen
so that the circuit filters out certain frequencies from the input. A particular kind of filter circuit
consists of a resistor and capacitor in series and acts as a high cut (or low pass) filter. The high cut
frequency is defined to be the frequency at which the magnitude of the voltage across the capacitor
(the output voltage) is $1/\sqrt{2}$ of the magnitude of the input voltage.
Problem in words
Calculate the high cut frequency for an RC circuit that is subjected to a single frequency input of angular
frequency $\omega$ and magnitude $v_i$.
(a) Find the steady state solution of the equation
$$R\frac{dq}{dt} + \frac{q}{C} = v_ie^{j\omega t}$$
and hence find the magnitude of
(i) the voltage across the capacitor $v_c = \dfrac{q}{C}$
(ii) the voltage across the resistor $v_R = R\dfrac{dq}{dt}$
(b) Using the impedance method of 12.6 confirm your results to part (a) by calculating
(i) the voltage across the capacitor $v_c$
(ii) the voltage across the resistor $v_R$
in response to a single frequency of angular frequency $\omega$ and magnitude $v_i$.
(c) For the case where $R = 1$ kΩ and $C = 1$ µF, find the ratio $\dfrac{|v_c|}{|v_i|}$ and complete the table below
$\omega$            10    $10^2$    $10^3$    $10^4$    $10^5$    $10^6$
$|v_c|/|v_i|$
(d) Explain why the table results show that an RC circuit acts as a high-cut filter and find the value
of the high-cut frequency, defined as $f_{hc} = \omega_{hc}/2\pi$, such that $\dfrac{|v_c|}{|v_i|} = \dfrac{1}{\sqrt{2}}$.

We need to find a particular solution to the differential equation $R\dfrac{dq}{dt} + \dfrac{q}{C} = v_ie^{j\omega t}$.
This will give us the steady state solution for the charge $q$. Using this we can find $v_c = \dfrac{q}{C}$ and
$v_R = R\dfrac{dq}{dt}$. These should give the same result as the values calculated by considering the impedances
in the circuit. Finally we can calculate $\dfrac{|v_c|}{|v_i|}$ and fill in the table of values as required and find the
high-cut frequency from $\dfrac{|v_c|}{|v_i|} = \dfrac{1}{\sqrt{2}}$ and $f_{hc} = \omega_{hc}/2\pi$.
Mathematical solution
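(a) To find a particular solution, we try a function of the form $q = c_0e^{j\omega t}$ which means that
$$\frac{dq}{dt} = j\omega c_0e^{j\omega t}.$$
Substituting into $R\dfrac{dq}{dt} + \dfrac{q}{C} = v_ie^{j\omega t}$ we get
$$Rj\omega c_0e^{j\omega t} + \frac{c_0e^{j\omega t}}{C} = v_ie^{j\omega t} \;\Rightarrow\; Rj\omega c_0 + \frac{c_0}{C} = v_i$$
$$\Rightarrow\; c_0 = \frac{v_i}{Rj\omega + \frac{1}{C}} = \frac{Cv_i}{RCj\omega + 1} \;\Rightarrow\; q = \frac{Cv_i}{RCj\omega + 1}e^{j\omega t}$$
Thus
(i) $v_c = \dfrac{q}{C} = \dfrac{v_i}{RCj\omega + 1}e^{j\omega t}$ and (ii) $v_R = R\dfrac{dq}{dt} = \dfrac{RCv_ij\omega}{RCj\omega + 1}e^{j\omega t}$
(b) We use the impedance to determine the voltage across each of the elements. The applied voltage
is a single frequency of angular frequency $\omega$ and magnitude $v_i$ such that $V = v_ie^{j\omega t}$.
For an RC circuit, the impedance of the circuit is $Z = Z_R + Z_c$ where $Z_R$ is the impedance of the
resistor, $R$, and $Z_c$ is the impedance of the capacitor, $Z_c = -\dfrac{j}{\omega C}$.
Therefore $Z = R - \dfrac{j}{\omega C}$.
The current can be found using $v = Zi$ giving
$$v_ie^{j\omega t} = \left(R - \frac{j}{\omega C}\right)i \;\Rightarrow\; i = \frac{v_ie^{j\omega t}}{R - \frac{j}{\omega C}}$$
We can now use $v_c = Z_ci$ and $v_R = Z_Ri$ giving
(i) $v_c = \dfrac{q}{C} = -\dfrac{j}{\omega C} \times \dfrac{v_i}{R - \frac{j}{\omega C}}e^{j\omega t} = \dfrac{v_i}{RCj\omega + 1}e^{j\omega t}$
(ii) $v_R = \dfrac{Rv_i}{R - \frac{j}{\omega C}}e^{j\omega t} = \dfrac{RCv_ij\omega}{RCj\omega + 1}e^{j\omega t}$
which confirms the result in part (a) found by solving the differential equation.
(c) When $R = 1000$ Ω and $C = 10^{-6}$ F
$$v_c = \frac{v_i}{RCj\omega + 1}e^{j\omega t} = \frac{v_i}{10^{-3}j\omega + 1}e^{j\omega t}$$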

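So
$$\frac{|v_c|}{|v_i|} = \left|\frac{1}{10^{-3}j\omega + 1}\right|\,|e^{j\omega t}| = \frac{1}{|10^{-3}j\omega + 1|} = \frac{1}{\sqrt{10^{-6}\omega^2 + 1}}$$
Table 1: Values of $|v_c|/|v_i|$ for a range of values of $\omega$
$\omega$            10         $10^2$    $10^3$    $10^4$    $10^5$       $10^6$
$|v_c|/|v_i|$       0.99995    0.995     0.707     0.0995    0.0099995    0.001
(d) Table 1 shows that an RC circuit can be used as a high-cut filter because for low values of $\omega$,
$\dfrac{|v_c|}{|v_i|}$ is approximately 1 and for high values of $\omega$, $\dfrac{|v_c|}{|v_i|}$ is approximately 0. So the circuit will filter out high
frequency values.
$$\frac{|v_c|}{|v_i|} = \frac{1}{\sqrt{2}} \quad\text{when}\quad \frac{1}{\sqrt{10^{-6}\omega^2 + 1}} = \frac{1}{\sqrt{2}} \;\Leftrightarrow\; 10^{-6}\omega^2 + 1 = 2 \;\Leftrightarrow\; 10^{-6}\omega^2 = 1 \;\Leftrightarrow\; \omega^2 = 10^6$$
As we are considering $\omega$ to be a positive frequency, $\omega = 1000$.
So $f_{hc} = \dfrac{\omega_{hc}}{2\pi} = \dfrac{1000}{2\pi} \approx 159$ Hz.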
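The entries of Table 1 and the high-cut frequency can be recomputed directly from the complex expression for v_c/v_i using Python's built-in complex numbers. This is only a sketch for checking the arithmetic; the function name `gain` is a hypothetical label for the ratio |v_c|/|v_i|.

```python
import math

R = 1000.0   # resistance in ohms (1 kΩ, as in part (c))
C = 1e-6     # capacitance in farads (1 µF, as in part (c))

def gain(omega):
    """|v_c| / |v_i| = |1 / (R C j omega + 1)| for angular frequency omega."""
    return abs(1 / (R * C * 1j * omega + 1))

for omega in (10, 1e2, 1e3, 1e4, 1e5, 1e6):
    print(f"{omega:>9.0f}  {gain(omega):.7f}")

# The gain equals 1/sqrt(2) when (R C omega)^2 = 1, i.e. omega_hc = 1/(R C).
omega_hc = 1 / (R * C)
f_hc = omega_hc / (2 * math.pi)
print(round(f_hc, 1), "Hz")
```

Writing the condition as ω_hc = 1/(RC) makes it clear how the high-cut frequency would move if different component values were chosen.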
Interpretation
We have shown that for an RC circuit finding the steady state solution of the differential equation
with a single frequency input voltage yields the same result for $\dfrac{|v_c|}{|v_i|}$ and $\dfrac{|v_R|}{|v_i|}$ as found by working
with the complex impedances for the circuit.
An RC circuit can be used as a high-cut filter and in the case where $R = 1$ kΩ, $C = 1$ µF we found
the high-cut frequency to be at approximately 159 Hz.
This means that the circuit will pass frequencies less than this value and remove frequencies greater
than this value.
Exercises
1. Solve the equation $x^2\dfrac{dy}{dx} + xy = 1$.
2. Find the solution of the equation $x\dfrac{dy}{dx} - y = x$ subject to the condition $y(1) = 2$.
3. Find the general solution of the equation $\dfrac{dy}{dt} + (\tan t)\,y = \cos t$.
4. Solve the equation $\dfrac{dy}{dt} + (\cot t)\,y = \sin t$.
5. The temperature $\theta$ (measured in degrees) of a body immersed in an atmosphere of varying
temperature is given by $\dfrac{d\theta}{dt} + 0.1\theta = 5 - 2.5t$. Find the temperature at time $t$ if $\theta = 60$ °C
when $t = 0$.
6. In an LR circuit with applied voltage $E = 10(1 - e^{-0.1t})$ the current $i$ is given by
$$L\frac{di}{dt} + Ri = 10(1 - e^{-0.1t}).$$
If the initial current is $i_0$ find $i$ subsequently.
Answers
1. $y = \dfrac{1}{x}\ln x + \dfrac{C}{x}$
2. $y = x\ln x + 2x$
3. $y = (t + C)\cos t$
4. $y = \left(\tfrac{1}{2}t - \tfrac{1}{4}\sin 2t + C\right)\mathrm{cosec}\,t$
5. $\theta = 300 - 25t - 240e^{-0.1t}$
6. $i = \dfrac{10}{R} - \left(\dfrac{100}{10R - L}\right)e^{-0.1t} + \left[i_0 + \dfrac{10L}{R(10R - L)}\right]e^{-Rt/L}$
Second Order
Differential Equations 19.3
Introduction
In this Section we start to learn how to solve second order differential equations of a particular type:
those that are linear and have constant coefficients. Such equations are used widely in the modelling
of physical phenomena, for example, in the analysis of vibrating systems and the analysis of electrical
circuits.
The solution of these equations is achieved in stages. The first stage is to find what is called a
‘complementary function’. The second stage is to find a ‘particular integral’. Finally, the complementary
function and the particular integral are combined to form the general solution.

Prerequisites
Before starting this Section you should . . .
• understand what is meant by a differential equation
• understand complex numbers (Workbook 10)

Learning Outcomes
On completion you should be able to . . .
• recognise a linear, constant coefficient equation
• understand what is meant by the terms ‘auxiliary equation’ and ‘complementary function’
• find the complementary function when the auxiliary equation has real, equal or complex roots
1. Constant coefficient second order linear ODEs

We now proceed to study those second order linear equations which have constant coefficients. The
general form of such an equation is:
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x) \qquad (3)$$
where $a$, $b$, $c$ are constants. The homogeneous form of (3) is the case when $f(x) \equiv 0$:
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0 \qquad (4)$$
To find the general solution of (3), it is first necessary to solve (4). The general solution of (4) is
called the complementary function and will always contain two arbitrary constants. We will denote
this solution by $y_{cf}$.
The technique for finding the complementary function is described in this Section.
Task
State which of the following are constant coefficient equations.
State which are homogeneous.
(a) $\frac{d^2y}{dx^2} + 4\frac{dy}{dx} + 3y = e^{-2x}$ (b) $x\frac{d^2y}{dx^2} + 2y = 0$
(c) $\frac{d^2x}{dt^2} + 3\frac{dx}{dt} + 7x = 0$ (d) $\frac{d^2y}{dx^2} + 4\frac{dy}{dx} + 4y = 0$
Your solution
(a)
(b)
(c)
(d)
Answer
(a) is constant coefficient and is not homogeneous.
(b) is homogeneous but not constant coefficient as the coefficient of $\frac{d^2y}{dx^2}$ is x, a variable.
(c) is constant coefficient and homogeneous. In this example the dependent variable is x.
(d) is constant coefficient and homogeneous.
Note: A complementary function is the general solution of a homogeneous, linear differential equation.
Section 19.3: Second Order Differential Equations
2. Finding the complementary function
To find the complementary function we must make use of the following property.
If y1 (x) and y2 (x) are any two (linearly independent) solutions of a linear, homogeneous second order
differential equation then the general solution ycf (x), is
ycf (x) = Ay1 (x) + By2 (x)
where A, B are constants.
We see that the second order linear ordinary differential equation has two arbitrary constants in its
general solution. The functions y1 (x) and y2 (x) are linearly independent if one is not a multiple
of the other.
Example 5
Verify that $y_1 = e^{4x}$ and $y_2 = e^{2x}$ both satisfy the constant coefficient linear
homogeneous equation:
$\frac{d^2y}{dx^2} - 6\frac{dy}{dx} + 8y = 0$
Write down the general solution of this equation.
Solution
When $y_1 = e^{4x}$, differentiation yields:
$\frac{dy_1}{dx} = 4e^{4x}$ and $\frac{d^2y_1}{dx^2} = 16e^{4x}$
Substitution into the left-hand side of the ODE gives $16e^{4x} - 6(4e^{4x}) + 8e^{4x}$, which equals 0, so
that $y_1 = e^{4x}$ is indeed a solution.
Similarly if $y_2 = e^{2x}$, then
$\frac{dy_2}{dx} = 2e^{2x}$ and $\frac{d^2y_2}{dx^2} = 4e^{2x}$.
Substitution into the left-hand side of the ODE gives $4e^{2x} - 6(2e^{2x}) + 8e^{2x}$, which equals 0, so that
$y_2 = e^{2x}$ is also a solution of the ODE. Now $e^{2x}$ and $e^{4x}$ are linearly independent functions,
so, from the property stated above we have:
$y_{cf}(x) = Ae^{4x} + Be^{2x}$ is the general solution of the ODE.
Example 6
Find values of k so that $y = e^{kx}$ is a solution of:
$\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = 0$
Hence state the general solution.
Solution
As suggested we try a solution of the form $y = e^{kx}$. Differentiating we find
$\frac{dy}{dx} = ke^{kx}$ and $\frac{d^2y}{dx^2} = k^2e^{kx}$.
Substitution into the given equation yields:
$k^2e^{kx} - ke^{kx} - 6e^{kx} = 0$ that is $(k^2 - k - 6)e^{kx} = 0$
The only way this equation can be satisfied for all values of x is if
$k^2 - k - 6 = 0$
that is, $(k - 3)(k + 2) = 0$ so that k = 3 or k = −2. That is to say, if $y = e^{kx}$ is to be a solution
of the differential equation, k must be either 3 or −2. We therefore have found two solutions:
$y_1(x) = e^{3x}$ and $y_2(x) = e^{-2x}$
These are linearly independent and therefore the general solution is
$y_{cf}(x) = Ae^{3x} + Be^{-2x}$
The equation $k^2 - k - 6 = 0$ for determining k is called the auxiliary equation.
Task
By substituting $y = e^{kx}$, find values of k so that y is a solution of
$\frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = 0$
Hence, write down two solutions, and the general solution of this equation.
First find the auxiliary equation:
Your solution
Answer
$k^2 - 3k + 2 = 0$
Now solve the auxiliary equation and write down the general solution:
Your solution
Answer
The auxiliary equation can be factorised as $(k - 1)(k - 2) = 0$ and so the required values of k are
1 and 2. The two solutions are $y = e^{x}$ and $y = e^{2x}$. The general solution is
$y_{cf}(x) = Ae^{x} + Be^{2x}$
Example 7
Find the auxiliary equation of the differential equation:
$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0$
Solution
We try a solution of the form $y = e^{kx}$ so that
$\frac{dy}{dx} = ke^{kx}$ and $\frac{d^2y}{dx^2} = k^2e^{kx}$.
Substitution into the given differential equation yields:
$ak^2e^{kx} + bke^{kx} + ce^{kx} = 0$ that is $(ak^2 + bk + c)e^{kx} = 0$
Since this equation is to be satisfied for all values of x, then
$ak^2 + bk + c = 0$
is the required auxiliary equation.
Key Point 5
The auxiliary equation of $a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0$ is $ak^2 + bk + c = 0$ where $y = e^{kx}$
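As a quick numerical check on Key Point 5, the sketch below solves the auxiliary equation $ak^2 + bk + c = 0$ with the quadratic formula. The function name `auxiliary_roots` is illustrative and is not part of the Workbook. Complex arithmetic is used so that the same code covers real, equal and complex roots.

```python
import cmath


def auxiliary_roots(a, b, c):
    """Return both roots of a*k**2 + b*k + c = 0 as complex numbers."""
    if a == 0:
        raise ValueError("a must be non-zero for a second order equation")
    root_disc = cmath.sqrt(b * b - 4 * a * c)  # real or purely imaginary
    return (-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a)


# Example 6: y'' - y' - 6y = 0 has auxiliary equation k^2 - k - 6 = 0
k1, k2 = auxiliary_roots(1, -1, -6)
print(k1, k2)
```

For Example 6 the two values returned correspond to k = 3 and k = −2, returned as complex numbers with zero imaginary part.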
Task
Write down, but do not solve, the auxiliary equations of the following:
(a) $\frac{d^2y}{dx^2} + \frac{dy}{dx} + y = 0$, (b) $2\frac{d^2y}{dx^2} + 7\frac{dy}{dx} - 3y = 0$
(c) $4\frac{d^2y}{dx^2} + 7y = 0$, (d) $\frac{d^2y}{dx^2} + \frac{dy}{dx} = 0$
Your solution
(a)
(b)
(c)
(d)
Answer
(a) $k^2 + k + 1 = 0$ (b) $2k^2 + 7k - 3 = 0$ (c) $4k^2 + 7 = 0$ (d) $k^2 + k = 0$
Solving the auxiliary equation gives the values of k which we need to find the complementary function.
Clearly the nature of the roots will depend upon the values of a, b and c.
Case 1 If $b^2 > 4ac$ the roots will be real and distinct. The two values of k thus obtained, $k_1$ and
$k_2$, will allow us to write down two independent solutions: $y_1(x) = e^{k_1x}$ and $y_2(x) = e^{k_2x}$, and so
the general solution of the differential equation will be:
$y(x) = Ae^{k_1x} + Be^{k_2x}$
Key Point 6
If the auxiliary equation has real, distinct roots $k_1$ and $k_2$, the complementary function will be:
$y_{cf}(x) = Ae^{k_1x} + Be^{k_2x}$
Case 2 On the other hand, if $b^2 = 4ac$ the two roots of the auxiliary equation will be equal and this
method will therefore only yield one independent solution. In this case, special treatment is required.
Case 3 If $b^2 < 4ac$ the two roots of the auxiliary equation will be complex, that is, $k_1$ and $k_2$
will be complex numbers. The procedure for dealing with such cases will become apparent in the
following examples.
Example 8
Find the general solution of: $\frac{d^2y}{dx^2} + 3\frac{dy}{dx} - 10y = 0$
Solution
By letting $y = e^{kx}$, so that $\frac{dy}{dx} = ke^{kx}$ and $\frac{d^2y}{dx^2} = k^2e^{kx}$
the auxiliary equation is found to be: $k^2 + 3k - 10 = 0$ and so $(k - 2)(k + 5) = 0$
so that k = 2 and k = −5. Thus there exist two solutions: $y_1 = e^{2x}$ and $y_2 = e^{-5x}$.
We can write the general solution as: $y = Ae^{2x} + Be^{-5x}$
Example 9
Find the general solution of: $\frac{d^2y}{dx^2} + 4y = 0$
Solution
As before, let $y = e^{kx}$ so that $\frac{dy}{dx} = ke^{kx}$ and $\frac{d^2y}{dx^2} = k^2e^{kx}$.
The auxiliary equation is easily found to be: $k^2 + 4 = 0$ that is, $k^2 = -4$ so that $k = \pm 2i$, that is,
we have complex roots. The two independent solutions of the equation are thus
$y_1(x) = e^{2ix}$, $y_2(x) = e^{-2ix}$
so that the general solution can be written in the form $y(x) = Ae^{2ix} + Be^{-2ix}$.
However, in cases such as this, it is usual to rewrite the solution in the following way.
Recall that Euler’s relations give: $e^{2ix} = \cos 2x + i\sin 2x$ and $e^{-2ix} = \cos 2x - i\sin 2x$
so that $y(x) = A(\cos 2x + i\sin 2x) + B(\cos 2x - i\sin 2x)$.
If we now relabel the constants such that A + B = C and Ai − Bi = D we can write the general
solution in the form:
$y(x) = C\cos 2x + D\sin 2x$
Note: In Example 8 we have expressed the solution as y = . . . whereas in Example 9 we have

expressed it as y(x) = . . . . Either will do.
Example 10
Given $ay'' + by' + cy = 0$, write down the auxiliary equation. If the roots of the
auxiliary equation are complex (one root will always be the complex conjugate of
the other) and are denoted by $k_1 = \alpha + \beta i$ and $k_2 = \alpha - \beta i$ show that the general
solution is:
$y(x) = e^{\alpha x}(A\cos\beta x + B\sin\beta x)$
Solution
Substitution of $y = e^{kx}$ into the differential equation yields $(ak^2 + bk + c)e^{kx} = 0$ and so the auxiliary
equation is:
$ak^2 + bk + c = 0$
If $k_1 = \alpha + \beta i$, $k_2 = \alpha - \beta i$ then the general solution is
$y = Ce^{(\alpha+\beta i)x} + De^{(\alpha-\beta i)x}$
where C and D are arbitrary constants.
Using the laws of indices this is rewritten as:
$y = Ce^{\alpha x}e^{\beta ix} + De^{\alpha x}e^{-\beta ix} = e^{\alpha x}(Ce^{\beta ix} + De^{-\beta ix})$
Then, using Euler’s relations, we obtain:
$y = e^{\alpha x}(C\cos\beta x + Ci\sin\beta x + D\cos\beta x - Di\sin\beta x)$
$= e^{\alpha x}\{(C + D)\cos\beta x + (Ci - Di)\sin\beta x\}$
Writing A = C + D and B = Ci − Di, we find the required solution:
$y = e^{\alpha x}(A\cos\beta x + B\sin\beta x)$
Key Point 7
If the auxiliary equation has complex roots, $\alpha + \beta i$ and $\alpha - \beta i$, then the complementary function
is:
$y_{cf} = e^{\alpha x}(A\cos\beta x + B\sin\beta x)$
Task
Find the general solution of $y'' + 2y' + 4y = 0$.
Write down the auxiliary equation:

Your solution
Answer
$k^2 + 2k + 4 = 0$
Find the complex roots of the auxiliary equation:
Your solution
Answer
$k = -1 \pm \sqrt{3}\,i$
Using Key Point 7 with $\alpha = -1$ and $\beta = \sqrt{3}$ write down the general solution:
Your solution
Answer
$y = e^{-x}(A\cos\sqrt{3}x + B\sin\sqrt{3}x)$
Key Point 8
If the auxiliary equation has two equal roots, k, the complementary function is:
$y_{cf} = (A + Bx)e^{kx}$
Example 11
The auxiliary equation of $ay'' + by' + cy = 0$ is $ak^2 + bk + c = 0$. Suppose this
equation has equal roots $k = k_1$ and $k = k_1$. Verify that $y = xe^{k_1x}$ is a solution
of the differential equation.
Solution
We have: $y = xe^{k_1x}$, $y' = e^{k_1x}(1 + k_1x)$, $y'' = e^{k_1x}(k_1^2x + 2k_1)$
Substitution into the left-hand side of the differential equation yields:
$e^{k_1x}\{a(k_1^2x + 2k_1) + b(1 + k_1x) + cx\} = e^{k_1x}\{(ak_1^2 + bk_1 + c)x + 2ak_1 + b\}$
But $ak_1^2 + bk_1 + c = 0$ since $k_1$ satisfies the auxiliary equation. Also,
$k_1 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
but since the roots are equal, then $b^2 - 4ac = 0$ hence $k_1 = -b/2a$. So $2ak_1 + b = 0$. Hence
$e^{k_1x}\{(ak_1^2 + bk_1 + c)x + 2ak_1 + b\} = e^{k_1x}\{(0)x + 0\} = 0$. We conclude that $y = xe^{k_1x}$ is a solution
of $ay'' + by' + cy = 0$ when the roots of the auxiliary equation are equal. This illustrates Key Point 8.
Example 12
Obtain the general solution of the equation: $\frac{d^2y}{dx^2} + 8\frac{dy}{dx} + 16y = 0$.
Solution
As before, a trial solution of the form $y = e^{kx}$ yields an auxiliary equation $k^2 + 8k + 16 = 0$. This
equation factorizes so that $(k + 4)(k + 4) = 0$ and we obtain equal roots, that is, k = −4 (twice).
If we proceed as before, writing $y_1(x) = e^{-4x}$, $y_2(x) = e^{-4x}$, it is clear that the two solutions are not
independent. We need to find a second independent solution. Using the result summarised in Key
Point 8, we conclude that the second independent solution is $y_2 = xe^{-4x}$. The general solution is
then:
$y(x) = (A + Bx)e^{-4x}$
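The three cases summarised in Key Points 6, 7 and 8 can be collected into one routine. The sketch below is only an illustration: the name `complementary_function` and the tuple it returns are assumptions made here, and the exact test for equal roots is reliable only when the coefficients are integers or otherwise exactly represented.

```python
import math


def complementary_function(a, b, c):
    """Classify the roots of a*k**2 + b*k + c = 0 (a must be non-zero).

    Returns ("distinct", k1, k2), ("equal", k) or ("complex", alpha, beta).
    The test disc == 0 assumes exactly represented coefficients.
    """
    disc = b * b - 4 * a * c
    if disc > 0:    # Key Point 6: A e^{k1 x} + B e^{k2 x}
        r = math.sqrt(disc)
        return ("distinct", (-b + r) / (2 * a), (-b - r) / (2 * a))
    if disc == 0:   # Key Point 8: (A + B x) e^{k x}
        return ("equal", -b / (2 * a))
    # Key Point 7: e^{alpha x} (A cos(beta x) + B sin(beta x))
    return ("complex", -b / (2 * a), math.sqrt(-disc) / (2 * abs(a)))


print(complementary_function(1, 3, -10))  # Example 8
print(complementary_function(1, 8, 16))   # Example 12
print(complementary_function(1, 2, 4))    # Task: y'' + 2y' + 4y = 0
```

The three calls reproduce the root types found in Example 8, Example 12 and the Task preceding Key Point 8.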
Exercises
1. Obtain the general solutions, that is, the complementary functions, of the following equations:
(a) $\frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = 0$ (b) $\frac{d^2y}{dx^2} + 7\frac{dy}{dx} + 6y = 0$ (c) $\frac{d^2x}{dt^2} + 5\frac{dx}{dt} + 6x = 0$
(d) $\frac{d^2y}{dt^2} + 2\frac{dy}{dt} + y = 0$ (e) $\frac{d^2y}{dx^2} - 4\frac{dy}{dx} + 4y = 0$ (f) $\frac{d^2y}{dt^2} + \frac{dy}{dt} + 8y = 0$
(g) $\frac{d^2y}{dx^2} - 2\frac{dy}{dx} + y = 0$ (h) $\frac{d^2y}{dt^2} + \frac{dy}{dt} + 5y = 0$ (i) $\frac{d^2y}{dx^2} + \frac{dy}{dx} - 2y = 0$
(j) $\frac{d^2y}{dx^2} + 9y = 0$ (k) $\frac{d^2y}{dx^2} - 2\frac{dy}{dx} = 0$ (l) $\frac{d^2x}{dt^2} - 16x = 0$
2. Find the auxiliary equation for the differential equation $L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{1}{C}i = 0$
Hence write down the complementary function.
3. Find the complementary function of the equation $\frac{d^2y}{dx^2} + \frac{dy}{dx} + y = 0$
Answers
1. (a) $y = Ae^{x} + Be^{2x}$
(b) $y = Ae^{-x} + Be^{-6x}$
(c) $x = Ae^{-2t} + Be^{-3t}$
(d) $y = Ae^{-t} + Bte^{-t}$
(e) $y = Ae^{2x} + Bxe^{2x}$
(f) $y = e^{-0.5t}(A\cos 2.78t + B\sin 2.78t)$
(g) $y = Ae^{x} + Bxe^{x}$
(h) $y = e^{-0.5t}(A\cos 2.18t + B\sin 2.18t)$
(i) $y = Ae^{-2x} + Be^{x}$
(j) $y = A\cos 3x + B\sin 3x$
(k) $y = A + Be^{2x}$
(l) $x = Ae^{4t} + Be^{-4t}$
2. $Lk^2 + Rk + \frac{1}{C} = 0$, $i(t) = Ae^{k_1t} + Be^{k_2t}$, $k_1, k_2 = \frac{1}{2L}\left(-R \pm \sqrt{\frac{R^2C - 4L}{C}}\right)$
3. $e^{-x/2}\left(A\cos\frac{\sqrt{3}}{2}x + B\sin\frac{\sqrt{3}}{2}x\right)$
3. The particular integral

Given a second order ODE
$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x)$,
a particular integral is any function, yp (x), which satisfies the equation. That is, any function
which when substituted into the left-hand side, results in the expression on the right-hand side.
Task
Show that
$y = -\frac{1}{4}e^{2x}$
is a particular integral of
$\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = e^{2x}$ (1)
Starting with $y = -\frac{1}{4}e^{2x}$, find $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$:
Your solution
Answer
$\frac{dy}{dx} = -\frac{1}{2}e^{2x}$, $\frac{d^2y}{dx^2} = -e^{2x}$
Now substitute these into the ODE and simplify to check it satisfies the equation:
Your solution
Answer
Substitution yields $-e^{2x} - \left(-\frac{1}{2}e^{2x}\right) - 6\left(-\frac{1}{4}e^{2x}\right)$ which simplifies to $e^{2x}$, the same as the right-hand
side.
Therefore $y = -\frac{1}{4}e^{2x}$ is a particular integral and we write (attaching a subscript p):
$y_p(x) = -\frac{1}{4}e^{2x}$
Task
State what is meant by a particular integral.
Your solution
Answer
A particular integral is any solution of an inhomogeneous differential equation.
4. Finding a particular integral

In the previous subsection we explained what is meant by a particular integral. Now we look at a
simple method to find a particular integral. In fact our method is rather crude. It involves trial and
error and educated guesswork. We try solutions which are of the same general form as the f (x) on
the right-hand side.
Example 13
Find a particular integral of the equation
$\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = e^{2x}$
Solution
We shall attempt to find a solution of the inhomogeneous problem by trying a function of the same
form as that on the right-hand side of the ODE. In particular, let us try $y(x) = Ae^{2x}$, where A is a
constant that we shall now determine. If $y(x) = Ae^{2x}$ then
$\frac{dy}{dx} = 2Ae^{2x}$ and $\frac{d^2y}{dx^2} = 4Ae^{2x}$.
Substitution in the ODE gives:
$4Ae^{2x} - 2Ae^{2x} - 6Ae^{2x} = e^{2x}$
that is,
$-4Ae^{2x} = e^{2x}$
To ensure that y is a solution, we require $-4A = 1$, that is, $A = -\frac{1}{4}$.
Therefore the particular integral is $y_p(x) = -\frac{1}{4}e^{2x}$.
In Example 13 we chose a trial solution Ae2x of the same form as the ODE’s right-hand side. Table
2 provides a summary of the trial solutions which should be tried for various forms of the right-hand
side.
Table 2: Trial solutions to find the particular integral

     f(x)                               Trial solution
(1)  constant term c                    constant term k
(2)  linear, ax + b                     Ax + B
(3)  polynomial in x of degree r:       polynomial in x of degree r:
     $ax^r + \cdots + bx + c$           $Ax^r + \cdots + Bx + k$
(4)  $a\cos kx$                         $A\cos kx + B\sin kx$
(5)  $a\sin kx$                         $A\cos kx + B\sin kx$
(6)  $ae^{kx}$                          $Ae^{kx}$
(7)  $ae^{-kx}$                         $Ae^{-kx}$
Task
By trying a solution of the form $y = \alpha e^{-x}$ find a particular integral of the equation
$\frac{d^2y}{dx^2} + \frac{dy}{dx} - 2y = 3e^{-x}$
Substitute $y = \alpha e^{-x}$ into the given equation to find α, and hence find the particular integral:
Your solution
Answer
$\alpha = -\frac{3}{2}$; $y_p(x) = -\frac{3}{2}e^{-x}$
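For an exponential right-hand side, substituting the trial solution $Ae^{qx}$ into $ay'' + by' + cy = pe^{qx}$ gives $A(aq^2 + bq + c) = p$. The sketch below applies this with exact fractions. The function name and the symbols p and q are introduced here for illustration only, and integer coefficients are assumed.

```python
from fractions import Fraction


def exponential_particular_integral(a, b, c, p, q):
    """Coefficient A such that A e^(q x) satisfies a y'' + b y' + c y = p e^(q x).

    Assumes integer a, b, c, p, q.
    """
    denom = a * q * q + b * q + c  # auxiliary polynomial evaluated at k = q
    if denom == 0:
        raise ValueError("e^(q x) is in the complementary function; this trial fails")
    return Fraction(p, denom)


print(exponential_particular_integral(1, -1, -6, 1, 2))   # Example 13
print(exponential_particular_integral(1, 1, -2, 3, -1))   # Task above
```

The two calls reproduce the coefficients −1/4 of Example 13 and −3/2 of the Task. The error branch corresponds to the situation where the right-hand side already appears in the complementary function, which needs a different trial solution.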
Example 14
Obtain a particular integral of the equation: $\frac{d^2y}{dx^2} - 6\frac{dy}{dx} + 8y = x$.
Solution
In Example 13 and the last Task, we found that a fruitful approach for a second order ODE was
to assume a solution in the same form as that on the right-hand side. Suppose we assume a
solution $y(x) = \alpha x$ and proceed to determine α. This approach will actually fail, but let us see
why. If $y(x) = \alpha x$ then $\frac{dy}{dx} = \alpha$ and $\frac{d^2y}{dx^2} = 0$. Substitution into the differential equation yields
$0 - 6\alpha + 8\alpha x = x$.
Comparing coefficients of x: $8\alpha x = x$ so $\alpha = \frac{1}{8}$
Comparing constants: $-6\alpha = 0$ so $\alpha = 0$
We have a contradiction! Clearly a particular integral of the form αx is not possible. The problem
arises because differentiation of the term αx produces constant terms which are unbalanced on the
right-hand side. So, we try a solution of the form $y(x) = \alpha x + \beta$ with α, β constants. This is
consistent with the recommendation in Table 2. Proceeding as before $\frac{dy}{dx} = \alpha$, $\frac{d^2y}{dx^2} = 0$.
Substitution in the differential equation now gives:
$0 - 6\alpha + 8(\alpha x + \beta) = x$
Equating coefficients of x and then equating constant terms we find:
$8\alpha = 1$ (1)
$-6\alpha + 8\beta = 0$ (2)
From (1), $\alpha = \frac{1}{8}$ and then from (2) $\beta = \frac{3}{32}$.
The required particular integral is $y_p(x) = \frac{1}{8}x + \frac{3}{32}$.
Task
Find a particular integral for the equation:
$\frac{d^2y}{dx^2} - 6\frac{dy}{dx} + 8y = 3\cos x$
First decide on an appropriate form for the trial solution, referring to Table 2 (page 43) if necessary:
Your solution
Answer
From Table 2, y = A cos x + B sin x, A and B constants.
Now find $\frac{dy}{dx}$ and $\frac{d^2y}{dx^2}$ and substitute into the differential equation:
Your solution
Answer
Differentiating, we find:
$\frac{dy}{dx} = -A\sin x + B\cos x$, $\frac{d^2y}{dx^2} = -A\cos x - B\sin x$
Substitution into the differential equation gives:
(−A cos x − B sin x) − 6(−A sin x + B cos x) + 8(A cos x + B sin x) = 3 cos x
Equate coefficients of cos x:
Your solution
Answer
7A − 6B = 3
Also, equate coefficients of sin x:
Your solution
Answer
7B + 6A = 0
Solve these two equations in A and B simultaneously to find values for A and B, and hence obtain
the particular integral:
Your solution
Answer
$A = \frac{21}{85}$, $B = -\frac{18}{85}$, $y_p(x) = \frac{21}{85}\cos x - \frac{18}{85}\sin x$
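The pair of simultaneous equations in this Task has the same structure for any equation $ay'' + by' + cy = p\cos\omega x$. Writing $m = c - a\omega^2$ and $n = b\omega$, the cosine terms give $mA + nB = p$ and the sine terms give $-nA + mB = 0$. The sketch below solves that pair exactly. The function name and the symbols p, m and n are assumptions introduced here, and integer inputs are assumed.

```python
from fractions import Fraction


def cosine_particular_integral(a, b, c, p, w):
    """Return (A, B) so that A cos(w x) + B sin(w x) satisfies
    a y'' + b y' + c y = p cos(w x). Assumes integer inputs."""
    m = c - a * w * w   # contribution of the y'' and y terms
    n = b * w           # contribution of the y' term
    det = m * m + n * n
    if det == 0:
        raise ValueError("cos(w x) is in the complementary function; this trial fails")
    # cos terms: m*A + n*B = p;  sin terms: -n*A + m*B = 0
    return Fraction(p * m, det), Fraction(p * n, det)


A, B = cosine_particular_integral(1, -6, 8, 3, 1)
print(A, B)
```

For the Task above, m = 7 and n = −6, so the two equations are 7A − 6B = 3 and 7B + 6A = 0, as found by hand.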
5. Finding the general solution of a second order linear inhomogeneous ODE
The general solution of a second order linear inhomogeneous equation is the sum of its particular
integral and the complementary function. In subsection 2 you learned how to find a
complementary function, and in subsection 4 you learned how to find a particular integral.
We now put these together to find the general solution.
Example 15
Find the general solution of $\frac{d^2y}{dx^2} + 3\frac{dy}{dx} - 10y = 3x^2$
Solution
The complementary function was found in Example 8 to be $y_{cf} = Ae^{2x} + Be^{-5x}$.
The particular integral is found by trying a solution of the form $y = ax^2 + bx + c$, so that
$\frac{dy}{dx} = 2ax + b$, $\frac{d^2y}{dx^2} = 2a$
Substituting into the differential equation gives
$2a + 3(2ax + b) - 10(ax^2 + bx + c) = 3x^2$
Comparing constants: $2a + 3b - 10c = 0$
Comparing x terms: $6a - 10b = 0$
Comparing $x^2$ terms: $-10a = 3$
So $a = -\frac{3}{10}$, $b = -\frac{9}{50}$, $c = -\frac{57}{500}$, $y_p(x) = -\frac{3}{10}x^2 - \frac{9}{50}x - \frac{57}{500}$.
Thus the general solution is $y = y_p(x) + y_{cf}(x) = -\frac{3}{10}x^2 - \frac{9}{50}x - \frac{57}{500} + Ae^{2x} + Be^{-5x}$
Key Point 9
The general solution of a second order constant coefficient ordinary differential equation
$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x)$ is $y = y_p + y_{cf}$
being the sum of the particular integral and the complementary function.
$y_p$ contains no arbitrary constants; $y_{cf}$ contains two arbitrary constants.
An LC circuit with sinusoidal input
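The differential equation governing the flow of current in a series LC circuit when subject to an
applied voltage $v(t) = V_0\sin\omega t$ is $L\frac{d^2i}{dt^2} + \frac{1}{C}i = \omega V_0\cos\omega t$
Figure 3: series LC circuit with applied voltage v, inductor L, capacitor C and current i
Obtain its general solution.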
Solution
The homogeneous equation is $L\frac{d^2i_{cf}}{dt^2} + \frac{i_{cf}}{C} = 0$.
Letting $i_{cf} = e^{kt}$ we find the auxiliary equation is $Lk^2 + \frac{1}{C} = 0$ so that $k = \pm i/\sqrt{LC}$. Therefore,
the complementary function is:
$i_{cf} = A\cos\frac{t}{\sqrt{LC}} + B\sin\frac{t}{\sqrt{LC}}$ where A and B are arbitrary constants.
To find a particular integral try $i_p = E\cos\omega t + F\sin\omega t$, where E, F are constants. We find:
$\frac{di_p}{dt} = -\omega E\sin\omega t + \omega F\cos\omega t$, $\frac{d^2i_p}{dt^2} = -\omega^2E\cos\omega t - \omega^2F\sin\omega t$
Substitution into the inhomogeneous equation yields:
$L(-\omega^2E\cos\omega t - \omega^2F\sin\omega t) + \frac{1}{C}(E\cos\omega t + F\sin\omega t) = \omega V_0\cos\omega t$
Equating coefficients of $\sin\omega t$ gives: $-\omega^2LF + (F/C) = 0$.
Equating coefficients of $\cos\omega t$ gives: $-\omega^2LE + (E/C) = \omega V_0$.
Therefore, provided $\omega^2LC \neq 1$, F = 0 and $E = CV_0\omega/(1 - \omega^2LC)$. Hence the particular integral is
$i_p = \frac{CV_0\omega}{1 - \omega^2LC}\cos\omega t$.
Finally, the general solution is:
$i = i_{cf} + i_p = A\cos\frac{t}{\sqrt{LC}} + B\sin\frac{t}{\sqrt{LC}} + \frac{CV_0\omega}{1 - \omega^2LC}\cos\omega t$
6. Inhomogeneous term in the complementary function

Occasionally you will come across a differential equation $a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x)$ for which the
inhomogeneous term, f(x), forms part of the complementary function. One such example is the
equation
$\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = e^{3x}$
It is straightforward to check that the complementary function is $y_{cf} = Ae^{3x} + Be^{-2x}$. Note that the
first of these terms has the same form as the inhomogeneous term, $e^{3x}$, on the right-hand side of the
differential equation.
You should verify for yourself that trying a particular integral of the form $y_p(x) = \alpha e^{3x}$ will not work
in a case like this. Can you see why?
Instead, try a particular integral of the form $y_p(x) = \alpha xe^{3x}$. Verify that
$\frac{dy_p}{dx} = \alpha e^{3x}(3x + 1)$ and $\frac{d^2y_p}{dx^2} = \alpha e^{3x}(9x + 6)$.
Substitute these expressions into the differential equation to find $\alpha = \frac{1}{5}$.
Finally, the particular integral is $y_p(x) = \frac{1}{5}xe^{3x}$ and so the general solution to the differential equation
is:
$y = Ae^{3x} + Be^{-2x} + \frac{1}{5}xe^{3x}$
This shows a generally effective method: where the inhomogeneous term f(x) appears in the
complementary function, use as a trial particular integral x times what would otherwise be used.
Key Point 10
When solving
$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x)$
if the inhomogeneous term f(x) appears in the complementary function use as a trial particular
integral x times what would otherwise be used.
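For an exponential right-hand side $pe^{qx}$ where q is a root of the auxiliary equation, the identity derived in Example 11 shows that substituting $\alpha xe^{qx}$ leaves $\alpha(2aq + b)e^{qx}$ on the left-hand side, so $\alpha = p/(2aq + b)$. The sketch below encodes this. The function name and the symbols p and q are illustrative assumptions, and integer inputs are assumed.

```python
from fractions import Fraction


def x_times_exponential_coefficient(a, b, c, p, q):
    """Return alpha so that alpha x e^(q x) satisfies a y'' + b y' + c y = p e^(q x),
    assuming q is a simple root of a k^2 + b k + c = 0 and all inputs are integers."""
    if a * q * q + b * q + c != 0:
        raise ValueError("q is not a root of the auxiliary equation; try A e^(q x)")
    slope = 2 * a * q + b  # what remains once the x e^(q x) terms cancel
    if slope == 0:
        raise ValueError("q is a repeated root; multiply the trial by x again")
    return Fraction(p, slope)


# y'' - y' - 6y = e^{3x}, where e^{3x} is part of the complementary function
alpha = x_times_exponential_coefficient(1, -1, -6, 1, 3)
print(alpha)
```

The second error branch covers the case where x times the usual trial is itself in the complementary function, as happens in Exercise 7(b) below.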
Exercises
1. Find the general solution of the following equations:
(a) $\frac{d^2x}{dt^2} - 2\frac{dx}{dt} - 3x = 6$ (b) $\frac{d^2y}{dx^2} + 5\frac{dy}{dx} + 4y = 8$ (c) $\frac{d^2y}{dt^2} + 5\frac{dy}{dt} + 6y = 2t$
(d) $\frac{d^2x}{dt^2} + 11\frac{dx}{dt} + 30x = 8t$ (e) $\frac{d^2y}{dx^2} + 2\frac{dy}{dx} + 3y = 2\sin 2x$ (f) $\frac{d^2y}{dt^2} + \frac{dy}{dt} + y = 4\cos 3t$
(g) $\frac{d^2y}{dx^2} + 9y = 4e^{8x}$ (h) $\frac{d^2x}{dt^2} - 16x = 9e^{6t}$
2. Find a particular integral for the equation $\frac{d^2x}{dt^2} - 3\frac{dx}{dt} + 2x = 5e^{3t}$
3. Find a particular integral for the equation $\frac{d^2x}{dt^2} - x = 4e^{-2t}$
4. Obtain the general solution of $y'' - y' - 2y = 6$
5. Obtain the general solution of the equation $\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = 10\cos 2x$
Find the particular solution satisfying $y(0) = 1$, $\frac{dy}{dx}(0) = 0$
6. Find a particular integral for the equation $\frac{d^2y}{dx^2} + \frac{dy}{dx} + y = 1 + x$
7. Find the general solution of
(a) $\frac{d^2x}{dt^2} - 6\frac{dx}{dt} + 5x = 3$ (b) $\frac{d^2x}{dt^2} - 2\frac{dx}{dt} + x = e^{t}$
Answers
1. (a) $x = Ae^{-t} + Be^{3t} - 2$ (b) $y = Ae^{-x} + Be^{-4x} + 2$ (c) $y = Ae^{-2t} + Be^{-3t} + \frac{1}{3}t - \frac{5}{18}$
(d) $x = Ae^{-6t} + Be^{-5t} + 0.267t - 0.0978$
(e) $y = e^{-x}[A\sin\sqrt{2}x + B\cos\sqrt{2}x] - \frac{8}{17}\cos 2x - \frac{2}{17}\sin 2x$
(f) $y = e^{-0.5t}(A\cos 0.866t + B\sin 0.866t) - 0.438\cos 3t + 0.164\sin 3t$
(g) $y = A\cos 3x + B\sin 3x + 0.0548e^{8x}$ (h) $x = Ae^{4t} + Be^{-4t} + \frac{9}{20}e^{6t}$
2. $x_p = 2.5e^{3t}$
3. $x_p = \frac{4}{3}e^{-2t}$
4. $y = Ae^{2x} + Be^{-x} - 3$
5. $y = Ae^{-2x} + Be^{-x} + \frac{3}{2}\sin 2x - \frac{1}{2}\cos 2x$; $y = \frac{3}{2}e^{-2x} + \frac{3}{2}\sin 2x - \frac{1}{2}\cos 2x$
6. $y_p = x$
7. (a) $x = Ae^{t} + Be^{5t} + \frac{3}{5}$
(b) $x = Ae^{t} + Bte^{t} + \frac{1}{2}t^2e^{t}$
Applications of
Differential Equations 19.4
Introduction
Sections 19.2 and 19.3 have introduced several techniques for solving commonly occurring first-order
and second-order ordinary differential equations. In this Section we solve a number of these equations
which model engineering systems.
Prerequisites
Before starting this Section you should . . .
• understand what is meant by a differential equation
• be familiar with the terminology associated with differential equations: order, dependent
variable and independent variable
• be able to integrate standard functions

Learning Outcomes
On completion you should be able to . . .
• recognise and solve first-order ordinary differential equations, modelling simple
electrical circuits, projectile motion and Newton’s law of cooling
• recognise and solve second-order ordinary differential equations with constant
coefficients modelling free electrical and mechanical oscillations
• recognise and solve second-order ordinary differential equations with constant
coefficients modelling forced electrical and mechanical oscillations
Section 19.4: Applications of Differential Equations
1. Modelling with first-order equations
Applying Newton’s law of cooling
In Section 19.1 we introduced Newton’s law of cooling. The model equation is
$\frac{d\theta}{dt} = -k(\theta - \theta_s)$, $\theta = \theta_0$ at $t = 0$. (5)
where $\theta = \theta(t)$ is the temperature of the cooling object at time t, $\theta_s$ is the temperature of the
environment (assumed constant), k is a thermal constant related to the object and $\theta_0$ is the initial
temperature of the object.
Task
Solve this initial value problem:
$\frac{d\theta}{dt} = -k(\theta - \theta_s)$, $\theta = \theta_0$ at $t = 0$
Separate the variables to obtain an equation connecting two integrals:

Your solution
Answer
$\int\frac{d\theta}{\theta - \theta_s} = -\int k\,dt$
Now integrate both sides of this equation:

Your solution
Answer
ln(θ − θs ) = −kt + C where C is constant
Apply the initial condition and take exponentials to obtain a formula for θ:
Your solution
Answer
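$\ln(\theta_0 - \theta_s) = C$. Hence $\ln(\theta - \theta_s) = -kt + \ln(\theta_0 - \theta_s)$ so that $\ln(\theta - \theta_s) - \ln(\theta_0 - \theta_s) = -kt$
Thus, rearranging and inverting, we find:
$\ln\left(\frac{\theta - \theta_s}{\theta_0 - \theta_s}\right) = -kt$ ∴ $\frac{\theta - \theta_s}{\theta_0 - \theta_s} = e^{-kt}$ giving $\theta = \theta_s + (\theta_0 - \theta_s)e^{-kt}$.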
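The solution $\theta = \theta_s + (\theta_0 - \theta_s)e^{-kt}$ is easy to evaluate numerically. In the sketch below the function name and the sample values (an object at 90 degrees cooling in surroundings at 20 degrees with k = 0.1) are hypothetical and chosen only for illustration.

```python
import math


def temperature(t, theta0, theta_s, k):
    """Newton's law of cooling: theta_s + (theta0 - theta_s) e^(-k t)."""
    return theta_s + (theta0 - theta_s) * math.exp(-k * t)


# Hypothetical values: theta0 = 90, theta_s = 20, k = 0.1
for t in (0, 10, 50):
    print(t, temperature(t, 90.0, 20.0, 0.1))
```

The values start at $\theta_0$ when t = 0 and move steadily towards $\theta_s$ as t grows, in line with the initial condition and the form of the solution.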
The graph of θ against t for θ = θs + (θ0 − θs )e−kt is shown in Figure 4 below.

Figure 4: graph of θ against t, with $\theta_0$ and $\theta_s$ marked on the θ axis
We see that as time increases (t → ∞), then the temperature of the object cools down to that of
the environment, that is: θ → θs .
We could have solved (5) by the integrating factor method, which you are now asked to do.
Task
We can write the equation for Newton’s law of cooling (5) as
$\frac{d\theta}{dt} + k\theta = k\theta_s$, $\theta = \theta_0$ at $t = 0$ (6)
State the integrating factor for this equation:

Your solution
Answer
$e^{\int k\,dt} = e^{kt}$ is the integrating factor.
Multiplying (6) by this factor we find that
$e^{kt}\frac{d\theta}{dt} + ke^{kt}\theta = k\theta_se^{kt}$ or, rearranging, $\frac{d}{dt}(e^{kt}\theta) = k\theta_se^{kt}$
Now integrate this equation and apply the initial condition:
Your solution
Answer
Integration produces $e^{kt}\theta = \theta_se^{kt} + C$, where C is an arbitrary constant. Then, applying the initial
condition: when t = 0, $\theta_0 = \theta_s + C$ so that $C = \theta_0 - \theta_s$ gives the same result as before:
$\theta = \theta_s + (\theta_0 - \theta_s)e^{-kt}$.
Modelling electrical circuits
Another application of first-order differential equations arises in the modelling of electrical circuits.
In Section 19.1 the differential equation for the RL circuit in Figure 5 below was shown to be
$L\frac{di}{dt} + Ri = E$
in which the initial condition is i = 0 at t = 0.
Figure 5: RL circuit with resistor R, inductor L, current i and source E
First we write this equation in standard form $\left\{\frac{dy}{dx} + P(x)y = Q(x)\right\}$ and obtain the integrating
factor.
Dividing the differential equation through by L gives
$\frac{di}{dt} + \frac{R}{L}i = \frac{E}{L}$
which is now in standard form. The integrating factor is $e^{\int\frac{R}{L}dt} = e^{Rt/L}$.
Multiplying the equation in standard form by the integrating factor gives
$e^{Rt/L}\frac{di}{dt} + e^{Rt/L}\frac{R}{L}i = \frac{E}{L}e^{Rt/L}$
or, rearranging,
$\frac{d}{dt}(e^{Rt/L}i) = \frac{E}{L}e^{Rt/L}$.
Now we integrate both sides and apply the initial condition to obtain the solution.
Integrating the differential equation gives:
$e^{Rt/L}i = \frac{E}{R}e^{Rt/L} + C$
where C is a constant so
$i = \frac{E}{R} + Ce^{-Rt/L}$
Applying the initial condition i = 0 when t = 0 gives
$0 = \frac{E}{R} + C$
so that $C = -\frac{E}{R}$.
Finally, $i = \frac{E}{R}(1 - e^{-Rt/L})$.
Note that as $t \to \infty$, $i \to \frac{E}{R}$ so as t increases the effect of the inductor diminishes to zero.
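To see how quickly the current approaches E/R, the solution $i = \frac{E}{R}(1 - e^{-Rt/L})$ can be tabulated. The component values below (E = 10, R = 5, L = 2) are hypothetical and serve only as an illustration.

```python
import math


def rl_current(t, E, R, L):
    """Current in the RL circuit with i = 0 at t = 0: (E/R)(1 - e^(-R t / L))."""
    return (E / R) * (1.0 - math.exp(-R * t / L))


E, R, L = 10.0, 5.0, 2.0   # hypothetical values
tau = L / R                # time at which the exponent equals -1
for t in (0.0, tau, 5 * tau):
    print(t, rl_current(t, E, R, L))
```

At t = L/R the exponent equals −1, so the current has reached the fraction $1 - e^{-1}$ (about 63%) of its limiting value E/R.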
Task
A spherical pill with volume V and surface area S is swallowed and slowly dissolves
in the stomach, releasing an active component. In one model it is assumed that
the capsule dissolves in the stomach acids such that the rate of change in volume,
$\frac{dV}{dt}$, is directly proportional to the pill’s surface area.
(a) Show that $\frac{dV}{dt} = -kV^{2/3}$ where k is a positive real constant and solve this
given that $V = V_0$ at t = 0.
(b) Experimental measurements indicate that for a 4 mm pill, half of the volume
has dissolved after 3 hours. Find the rate constant k (m s$^{-1}$).
(c) Estimate the time required for 95% of the pill to dissolve.
(a) First write down the formulae for volume of a sphere (V ) and surface area of a sphere (S) and
so express S in terms of V by eliminating r:
Your solution
Answer
$V = \frac{4}{3}\pi r^3$, $S = 4\pi r^2$
From the V equation $r = \left(\frac{3V}{4\pi}\right)^{1/3}$ so $S = (36\pi)^{1/3}V^{2/3} = kV^{2/3}$ for constant k.
Now write down the differential equation modelling the solution:

Your solution
Answer
$\frac{dV}{dt} = -kV^{2/3}$ (negative to represent a decrease with time)
Using the condition V = V0 when t = 0, solve the differential equation:
Your solution
Answer
Solving by separation of variables gives
$V^{1/3} = \frac{1}{3}(C - kt)$
and setting $V = V_0$ when t = 0 means
$V_0 = \left(\frac{1}{3}C\right)^3$ so $C = 3V_0^{1/3}$ and the solution is
$V = \left(V_0^{1/3} - \frac{kt}{3}\right)^3$
(b) Impose the condition that half the volume has dissolved after 3 hours to find k:
Your solution
Answer
$V = \left(V_0^{1/3} - \frac{kt}{3}\right)^3$
and when t = 3, $V = \frac{V_0}{2}$ so
$\left(\frac{V_0}{2}\right)^{1/3} = V_0^{1/3} - k$ and so $k = V_0^{1/3}(1 - (0.5)^{1/3})$
(c) First write down the solution to the differential equation inserting the value of k obtained in (b)
and then use it to estimate the time to 95% dissolving:
Your solution
Answer
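$V = \left(V_0^{1/3} - V_0^{1/3}(1 - (0.5)^{1/3})\frac{t}{3}\right)^3$ i.e. $V = V_0\left(1 - (1 - (0.5)^{1/3})\frac{t}{3}\right)^3$
When 95% dissolved $V = 0.05V_0$ so
$0.05V_0 = V_0\left(1 - (1 - (0.5)^{1/3})\frac{t}{3}\right)^3$ so $(0.05)^{1/3} = 1 - (1 - (0.5)^{1/3})\frac{t}{3}$
so
$t = 3\left(\frac{1 - (0.05)^{1/3}}{1 - (0.5)^{1/3}}\right) \approx 9.185 \approx$ 9 hr 11 min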
2. Modelling free mechanical oscillations

Consider the following schematic diagram of a shock absorber:
Figure 6: schematic of a shock absorber showing the mass, spring and dashpot
The equation of motion can be described in terms of the vertical displacement x of the mass.
Let m be the mass, $k\frac{dx}{dt}$ the damping force resulting from the dashpot and nx the restoring force
resulting from the spring. Here, k and n are constants.
Then the equation of motion is
$m\frac{d^2x}{dt^2} = -k\frac{dx}{dt} - nx$.
Suppose that the mass is displaced a distance $x_0$ initially and released from rest. Then at t = 0,
$x = x_0$ and $\frac{dx}{dt} = 0$. Writing the differential equation in standard form gives
$m\frac{d^2x}{dt^2} + k\frac{dx}{dt} + nx = 0$.
We shall see that the nature of the oscillations described by this differential equation depends crucially
upon the relative values of the mechanical constants m, k and n. This will be explored in subsequent
Tasks.
Task
Find and solve the auxiliary equation of the differential equation
$m\frac{d^2x}{dt^2} + k\frac{dx}{dt} + nx = 0$.
Your solution
Answer
Putting $x = e^{\lambda t}$, the auxiliary equation is $m\lambda^2 + k\lambda + n = 0$.
Hence $\lambda = \frac{-k \pm \sqrt{k^2 - 4mn}}{2m}$.
The value of k controls the amount of damping in the system. We explore the solution for various
values of k.
Case 1: No damping
If k = 0 then there is no damping. We expect, in this case, that once motion has started it will
continue for ever. The motion that ensues is called simple harmonic motion. In this case we have
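$\lambda = \frac{\pm\sqrt{-4mn}}{2m}$, that is, $\lambda = \pm\sqrt{\frac{n}{m}}\,i$ where $i^2 = -1$.
and the solution for the displacement x is:
$x = A\cos\left(\sqrt{\frac{n}{m}}\,t\right) + B\sin\left(\sqrt{\frac{n}{m}}\,t\right)$ where A, B are arbitrary constants.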
Task
Impose the initial conditions $x = x_0$ and $\frac{dx}{dt} = 0$ at t = 0 to find the unique
solution to the ODE:
Your solution
Answer
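$\frac{dx}{dt} = -\sqrt{\frac{n}{m}}\,A\sin\left(\sqrt{\frac{n}{m}}\,t\right) + \sqrt{\frac{n}{m}}\,B\cos\left(\sqrt{\frac{n}{m}}\,t\right)$
When t = 0, $\frac{dx}{dt} = 0$ so that $\sqrt{\frac{n}{m}}\,B = 0$ so that B = 0.
Therefore $x = A\cos\left(\sqrt{\frac{n}{m}}\,t\right)$.
Imposing the remaining initial condition: when t = 0, $x = x_0$ so that $x_0 = A$ and finally:
$x = x_0\cos\left(\sqrt{\frac{n}{m}}\,t\right)$.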
Case 2: Light damping

If $k^2 - 4mn < 0$, i.e. $k^2 < 4mn$ then the roots of the auxiliary equation are complex:
$\lambda_1 = \frac{-k + i\sqrt{4mn - k^2}}{2m}$, $\lambda_2 = \frac{-k - i\sqrt{4mn - k^2}}{2m}$
Then, after some rearrangement:
$x = e^{-kt/2m}[A\cos pt + B\sin pt]$ in which $p = \sqrt{4mn - k^2}/2m$.
Task
If m = 1, n = 1 and k = 1 find λ1 and λ2 and then find the solution for the
displacement x.
Your solution
Answer
$\lambda = \frac{-1 \pm i\sqrt{4 - 1}}{2} = -1/2 \pm i\sqrt{3}/2$. Hence $x = e^{-t/2}\left[A\cos\frac{\sqrt{3}}{2}t + B\sin\frac{\sqrt{3}}{2}t\right]$.
Impose the initial conditions $x = x_0$, $\frac{dx}{dt} = 0$ at t = 0 to find the arbitrary constants and hence find
the solution to the ODE:
Your solution
Answer
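Differentiating, we obtain
$\frac{dx}{dt} = -\frac{1}{2}e^{-t/2}\left[A\cos\frac{\sqrt{3}}{2}t + B\sin\frac{\sqrt{3}}{2}t\right] + e^{-t/2}\left[-\frac{\sqrt{3}}{2}A\sin\frac{\sqrt{3}}{2}t + \frac{\sqrt{3}}{2}B\cos\frac{\sqrt{3}}{2}t\right]$
At t = 0,
$x = x_0 = A$ (i)
$\frac{dx}{dt} = 0 = -\frac{1}{2}A + \frac{\sqrt{3}}{2}B$ (ii)
Solving (i) and (ii) we obtain
$A = x_0$, $B = \frac{\sqrt{3}}{3}x_0$ then $x = x_0e^{-t/2}\left[\cos\frac{\sqrt{3}}{2}t + \frac{\sqrt{3}}{3}\sin\frac{\sqrt{3}}{2}t\right]$.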
The graph of x against t is shown in Figure 7. This is the case of light damping. As the damping in
the system decreases (i.e. k → 0) the number of oscillations (in a given time interval) will increase.
In many mechanical systems these oscillations are usually unwanted and the designer would choose a
value of k to either reduce them or to eliminate them altogether. For the choice $k^2 = 4mn$, known
as the critical damping case, all the oscillations are absent.
Figure 7: graph of $x = x_0e^{-t/2}\left[\cos\frac{\sqrt{3}}{2}t + \frac{\sqrt{3}}{3}\sin\frac{\sqrt{3}}{2}t\right]$ against t, starting from $x_0$, with the curve $x_0e^{-t/2}$ also shown
Case 3: Heavy damping

If $k^2 - 4mn > 0$, i.e. $k^2 > 4mn$, then there are two real roots of the auxiliary equation, $\lambda_1$ and $\lambda_2$:
$\lambda_1 = \frac{-k + \sqrt{k^2 - 4mn}}{2m}$, $\lambda_2 = \frac{-k - \sqrt{k^2 - 4mn}}{2m}$
Then
$x = Ae^{\lambda_1t} + Be^{\lambda_2t}$.
Task
If m = 1, n = 1 and k = 2.5 find λ1 and λ2 and then find the solution for the
displacement x.
Your solution
Answer
$\lambda = \frac{-2.5 \pm \sqrt{6.25 - 4}}{2} = -1.25 \pm 0.75$
Hence $\lambda_1, \lambda_2 = -0.5, -2$ and so $x = Ae^{-0.5t} + Be^{-2t}$
Impose the initial conditions $x = x_0$, $\frac{dx}{dt} = 0$ at t = 0 to find the arbitrary constants and hence find
the solution to the ODE.
Your solution
Answer
Differentiating, we obtain
$\frac{dx}{dt} = -0.5Ae^{-0.5t} - 2Be^{-2t}$
At t = 0,
$x = x_0 = A + B$ (i)
$\frac{dx}{dt} = 0 = -0.5A - 2B$ (ii)
Solving (i) and (ii) we obtain $A = \frac{4}{3}x_0$, $B = -\frac{1}{3}x_0$ then $x = \frac{1}{3}x_0(4e^{-0.5t} - e^{-2t})$.
The graph of x against t is shown below. This is the case of heavy damping.
[Graph of x against t, with $x_0$ marked on the x axis]
Other cases are dealt with in the Exercises at the end of the Section.
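The cases discussed in this subsection depend only on the sign of $k^2 - 4mn$. The sketch below names the case for given constants. The function name and the returned labels are assumptions made here, and the equality test for critical damping is reliable only for exactly represented inputs.

```python
def damping_case(m, k, n):
    """Classify m x'' + k x' + n x = 0 (m, n > 0, k >= 0) by the sign of k^2 - 4mn."""
    if k == 0:
        return "no damping"        # Case 1: simple harmonic motion
    disc = k * k - 4 * m * n
    if disc < 0:
        return "light damping"     # Case 2: complex roots, decaying oscillation
    if disc == 0:
        return "critical damping"  # equal roots, no oscillation
    return "heavy damping"         # Case 3: two real roots


for k in (0, 1, 2, 2.5):
    print(k, damping_case(1, k, 1))
```

With m = n = 1 the values k = 1 and k = 2.5 are the two Tasks worked above, and k = 2 is the critical value at which $k^2 = 4mn$.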
3. Modelling forced mechanical oscillations

Suppose now that the mass is subject to a force f (t) after the initial disturbance. Then the equation
of motion is
$$m\frac{d^2x}{dt^2} + k\frac{dx}{dt} + nx = f(t)$$
Consider the case $f(t) = F\cos\omega t$, that is, an oscillatory force of magnitude F and angular frequency
ω. Choosing specific values for the constants in the model: m = n = 1, k = 0, and ω = 2 we find
$$\frac{d^2x}{dt^2} + x = F\cos 2t$$
Task
Find the complementary function for the differential equation
$$\frac{d^2x}{dt^2} + x = F\cos 2t$$
Your solution
Answer
The homogeneous equation is
$$\frac{d^2x}{dt^2} + x = 0$$
with auxiliary equation $\lambda^2 + 1 = 0$. Hence the complementary function is
$$x_{cf} = A\cos t + B\sin t.$$
Now find a particular integral for the differential equation:

Your solution
Answer
Try $x_p = C\cos 2t + D\sin 2t$ so that $\frac{d^2x_p}{dt^2} = -4C\cos 2t - 4D\sin 2t$. Substituting into the differential
equation gives
$$(-4C + C)\cos 2t + (-4D + D)\sin 2t \equiv F\cos 2t.$$
Comparing coefficients gives $-3C = F$ and $-3D = 0$ so that $D = 0$, $C = -\frac{1}{3}F$ and
$x_p = -\frac{1}{3}F\cos 2t$. The general solution of the differential equation is therefore
$$x = x_p + x_{cf} = -\frac{1}{3}F\cos 2t + A\cos t + B\sin t.$$
Finally, apply the initial conditions to find the solution for the displacement x:
Your solution
Answer
We need to determine the derivative and apply the initial conditions:
$$\frac{dx}{dt} = \frac{2}{3}F\sin 2t - A\sin t + B\cos t.$$
At $t = 0$: $x = x_0 = -\frac{1}{3}F + A$ and $\frac{dx}{dt} = 0 = B$.
Hence $B = 0$ and $A = x_0 + \frac{1}{3}F$.
Then
$$x = -\frac{1}{3}F\cos 2t + \left(x_0 + \frac{1}{3}F\right)\cos t.$$
The graph of x against t is shown below.
[Graph of x against t, starting at $x_0$]
If the angular frequency ω of the applied force is nearly equal to that of the free oscillation the
phenomenon of beats occurs. If the angular frequencies are equal we get the phenomenon of
resonance. Note that we can eliminate resonance by introducing damping into the system.
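The displacement found in the Task above can be checked numerically. The sketch below is an illustration only: the values F = 1 and x0 = 1 and the helper names are assumptions made here, and the derivatives are estimated by central differences rather than computed exactly.

```python
import math

def x(t, F=1.0, x0=1.0):
    """Displacement from the Answer: -(F/3) cos 2t + (x0 + F/3) cos t."""
    return -F / 3 * math.cos(2 * t) + (x0 + F / 3) * math.cos(t)

def residual(t, F=1.0, x0=1.0, h=1e-3):
    """x'' + x - F cos 2t, with x'' estimated by a central difference."""
    d2x = (x(t + h, F, x0) - 2 * x(t, F, x0) + x(t - h, F, x0)) / (h * h)
    return d2x + x(t, F, x0) - F * math.cos(2 * t)

def velocity(t, F=1.0, x0=1.0, h=1e-6):
    """dx/dt estimated by a central difference."""
    return (x(t + h, F, x0) - x(t - h, F, x0)) / (2 * h)

# Initial conditions, then the size of the residual at a few sample times
print(x(0.0), velocity(0.0))
print(max(abs(residual(t)) for t in (0.3, 1.0, 2.5, 7.0)))
```

A residual close to zero at the sample times is consistent with the solution satisfying the differential equation; it is a spot check, not a proof.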
4. Modelling forces on beams
Shear force and bending moment of a beam
Introduction
The beam is a fundamental part of most structures we see around us. It may be used in many ways
depending on how its ends are fixed. One end may be rigidly fixed and the other free (called
cantilevered) or both ends may be resting on supports (called simply supported). Other combinations
are possible. There are three basic quantities of interest in the deformation of beams: the deflection,
the shear force and the bending moment.
For a beam which is supporting a load of w (measured in N m−1 and which may represent the
self-weight of the beam or may be an external load), the shear force is denoted by S and measured
in N and the bending moment is denoted by M and measured in N m.
The quantities M , S and w are related by
$$\frac{dM}{dz} = S \qquad (1)$$
and
$$\frac{dS}{dz} = -w \qquad (2)$$
where z measures the position along the beam. If one of the quantities is known, the others can be
calculated from the Equations (1) and (2). In words, the shear force is the derivative
(with respect to position) of the bending moment and the load is the negative of the derivative of the shear force.
Alternatively, the shear force is the negative of the integral (with respect to position) of the load and
the bending moment is the integral of the shear force. The negative sign in Equation (2) reflects the
fact that the load is normally measured positively in the downward direction while a positive shear
force refers to an upward force.
Problem posed in words
A beam is fixed rigidly at one end and free to move at the other end (like a diving board). It only
has to support its own weight. Find the shear force and the bending moment along its length.
A uniform beam of length L supports its own weight $w_0$ (a constant). At one end (z = 0), the
beam is fixed rigidly while the other end (z = L) is free to move. Find the shear force S and the
bending moment M as functions of z.
As w is a constant, Equation (2) gives
$$S = -\int w\,dz = -\int w_0\,dz = -w_0 z + C.$$
At the free end (z = L), the shear force S = 0 so $C = w_0 L$ giving
$$S = w_0(L - z)$$
This expression can be substituted into Equation (1) to give
$$M = \int S\,dz = \int w_0(L - z)\,dz = \int (w_0 L - w_0 z)\,dz = w_0 Lz - \frac{w_0}{2}z^2 + K.$$
Once again, M = 0 at the free end z = L so K is given by $K = -\frac{w_0}{2}L^2$. Thus
$$M = w_0 Lz - \frac{w_0}{2}z^2 - \frac{w_0}{2}L^2$$
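As a rough check on the two expressions just derived, the following Python sketch evaluates S and M for illustrative values of $w_0$ and L (both chosen here, not taken from the text) and confirms numerically that they are consistent with Equations (1) and (2) and with the free-end conditions. The function names are hypothetical.

```python
def shear_force(z, w0, L):
    """S = w0 (L - z) for the cantilever carrying its own weight."""
    return w0 * (L - z)

def bending_moment(z, w0, L):
    """M = w0 L z - (w0/2) z^2 - (w0/2) L^2."""
    return w0 * L * z - w0 * z**2 / 2 - w0 * L**2 / 2

def derivative(f, z, h=1e-5):
    """Central-difference estimate of df/dz."""
    return (f(z + h) - f(z - h)) / (2 * h)

w0, L = 3.0, 2.0   # illustrative values only

# dM/dz should match S, and dS/dz should match -w0, at any interior point
z = 0.7
print(derivative(lambda u: bending_moment(u, w0, L), z), shear_force(z, w0, L))
print(derivative(lambda u: shear_force(u, w0, L), z), -w0)
# Both quantities vanish at the free end z = L
print(shear_force(L, w0, L), bending_moment(L, w0, L))
```

Evaluating `bending_moment(0, w0, L)` gives the value at the fixed end, which by the formula above is $-\frac{w_0}{2}L^2$.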
The diagrams in Figure 8 show the load w (Figure 8a), the shear force S (Figure 8b) and the bending
moment M (Figure 8c) as functions of position z.
[Figure 8 panels: (a) Load (w) against Position (z), with $w = w_0$; (b) Shear Force (S) against Position (z), with $S = w_0 L$ at z = 0; (c) Bending Moment (M) against Position (z)]
Figure 8: The loading (a), shear force (b) and bending moment (c) as functions of position z
Interpretation
The beam deforms (as we might have expected) with the shear force and bending moments having
maximum values at the fixed end and minimum (zero) values at the free end. You can easily
experience this for yourselves: simply hold a wooden plank (not too heavy) at one end with both
hands so that it is horizontal. As you try this with planks of increasing length (and hence weight)
you will find it increasingly difficult to support the weight of the plank (this is the shear force) and
increasingly difficult to keep the plank horizontal (this is the bending moment).
This mathematical model is an excellent description of real beams.
Deflection of a uniformly loaded beam
Introduction
A uniformly loaded beam of length L is supported at both ends as shown in Figure 9. The deflection
y(x) is a function of horizontal position x and obeys the ordinary differential equation (ODE)
$$\frac{d^4y}{dx^4}(x) = \frac{1}{EI}q(x) \qquad (1)$$
where E is Young’s modulus, I is the moment of inertia and q(x) is the load per unit length at
point x. We assume in this problem that q(x) = q a constant. The boundary conditions are (i) no
deflection at x = 0 and x = L (ii) no curvature of the beam at x = 0 and x = L.
[Diagram labels: Load q, Beam, Ground, deflection y(x), axes x and y, length L]
Figure 9: The bending beam, parameters involved in the mathematical formulation

Problem in words
Find the deflection of a beam, supported so that there is no deflection and no curvature of the
beam at its ends, subject to a uniformly distributed load, as a function of position along the beam.

Find the equation of the curve y(x) assumed by the bending beam that satisfies the ODE (1). Use
the coordinate system shown in Figure 9 where the origin is at the left extremity of the beam. In
this coordinate system, the boundary conditions, which require that there is no deflection at x = 0
and x = L, and that there is no curvature of the beam at x = 0 and x = L, are
(a) $y(0) = 0$
(b) $y(L) = 0$
(c) $\left.\frac{d^2y}{dx^2}\right|_{x=0} = 0$
(d) $\left.\frac{d^2y}{dx^2}\right|_{x=L} = 0$
Note that $\frac{dy(x)}{dx}$ and $\frac{d^2y(x)}{dx^2}$ are respectively the slope and (for small slopes) the curvature of the curve at
point (x, y).
Integrating Equation (1) leads to:
$$EI\frac{d^3y}{dx^3}(x) = qx + A \qquad (2)$$
Integrating a second time:
$$EI\frac{d^2y}{dx^2}(x) = qx^2/2 + Ax + B \qquad (3)$$
Integrating a third time:
$$EI\frac{dy}{dx}(x) = qx^3/6 + Ax^2/2 + Bx + C \qquad (4)$$
Integrating a fourth time:
$$EIy(x) = qx^4/24 + Ax^3/6 + Bx^2/2 + Cx + D. \qquad (5)$$
The boundary conditions (a) to (d) enable determination of the constants of integration A, B, C, D.
Indeed, the boundary condition (a), y(0) = 0, and Equation (5) give
$$EIy(0) = q \times (0)^4/24 + A \times (0)^3/6 + B \times (0)^2/2 + C \times (0) + D = 0$$
which yields D = 0.
The boundary condition (b), y(L) = 0, and Equation (5) give
$$EIy(L) = qL^4/24 + AL^3/6 + BL^2/2 + CL + D = 0.$$
Using the newly found value for D one writes
$$qL^4/24 + AL^3/6 + BL^2/2 + CL = 0 \qquad (6)$$
The boundary condition (c), $\frac{d^2y}{dx^2}(0) = 0$, which expresses the absence of curvature at x = 0, and
Equation (3) give
$$EI\frac{d^2y}{dx^2}(0) = q \times (0)^2/2 + A \times (0) + B = 0$$
which yields B = 0. The boundary condition (d), $\frac{d^2y}{dx^2}(L) = 0$, and Equation (3) give
$$EI\frac{d^2y}{dx^2}(L) = qL^2/2 + AL = 0$$
which yields A = −qL/2. The expressions for A, B, D are introduced in Equation (6) to find the last
unknown constant C. This leads to $qL^4/24 - qL^4/12 + CL = 0$ or $C = qL^3/24$. Finally, Equation
(5) and the values of constants lead to the solution
$$y(x) = [qx^4/24 - qLx^3/12 + qL^3x/24]/EI. \qquad (7)$$
Interpretation
The predicted deflection is zero at both ends as required, and you may check that it is symmetrical
about the centre of the beam by switching to the coordinate system (X, Y ) with L/2 − x = X
and y = Y and verifying that the deflection Y (X) is symmetrical about the vertical axis, i.e.
Y (X) = Y (−X).
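The boundary conditions and the symmetry claim can be spot-checked numerically. The sketch below evaluates Equation (7) for illustrative values of q, L and EI (chosen here, not given in the text), estimates the second derivative by a central difference, and compares deflections at equal distances either side of the centre. The function names are hypothetical.

```python
def deflection(x, q, L, EI):
    """Equation (7): y(x) = [q x^4/24 - q L x^3/12 + q L^3 x/24] / EI."""
    return (q * x**4 / 24 - q * L * x**3 / 12 + q * L**3 * x / 24) / EI

def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

q, L, EI = 2.0, 4.0, 10.0   # illustrative values only

def y(x):
    return deflection(x, q, L, EI)

print(y(0.0), y(L))                                       # conditions (a), (b)
print(second_derivative(y, 0.0), second_derivative(y, L))  # conditions (c), (d)
print(y(L / 2 - 1.3), y(L / 2 + 1.3))                      # symmetry about x = L/2
print(y(L / 2), 5 * q * L**4 / (384 * EI))                 # midspan deflection
```

The last line compares the midspan value with $5qL^4/(384EI)$, which follows from substituting x = L/2 into Equation (7).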
Exercises
1. In an RC circuit (a resistor and a capacitor in series) the applied emf is a constant E. Given
that $\frac{dq}{dt} = i$ where q is the charge in the capacitor, i the current in the circuit, R the resistance
and C the capacitance the equation for the circuit is
$$Ri + \frac{q}{C} = E.$$
If the initial charge is zero find the charge subsequently.
2. If the voltage in the RC circuit is E = E0 cos ωt find the charge and the current at time t.
3. An object is projected from the Earth’s surface. What is the least velocity (the escape velocity)
of projection in order to escape the gravitational field, ignoring air resistance?
The equation of motion is
$$mv\frac{dv}{dx} = -mg\frac{R^2}{x^2}$$
where the mass of the object is m, its distance from the centre of the Earth is x and the radius
of the Earth is R.
4. The radial stress p at distance r from the axis of a thick cylinder subjected to internal pressure
is given by $p + r\frac{dp}{dr} = A - p$ where A is a constant. If $p = p_0$ at the inner wall ($r = r_1$) and
is negligible (p = 0) at the outer wall ($r = r_2$) find an expression for p.
5. The equation for an LCR circuit with applied voltage E is
$$L\frac{di}{dt} + Ri + \frac{1}{C}q = E.$$
By differentiating this equation find the solution for q(t) and i(t) if L = 1, R = 100, $C = 10^{-4}$
and E = 1000 given that q = 0 and i = 0 at t = 0.
6. Consider the free vibration problem in Section 19.4 subsection 2 (page 57) when m = 1, n = 1
and k = 2 (critical damping).
Find the solution for x(t).
7. Repeat Exercise 6 for the case m = 1, n = 1 and k = 1.5 (light damping)
8. Consider the forced vibration problem in Section 19.4 subsection 2 with m = 1, n = 25, k =
8, E = sin 3t, x0 = 0 with an initial velocity of 3.
9. This refers to the Task on page 55 concerning modelling the dissolving of a pill in the stomach.
An alternative model supposes that the pill is very rapidly permeated by stomach acids and the
small granules contained in the capsule dissolve individually. In this case, the rate of change
of volume is assumed to be directly proportional to the volume. Using the experimental data
given in the Task, estimate the time for 95% of the pill to dissolve, based on this alternative
model, and compare results.
Answers
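1. Use the equation in the form $R\frac{dq}{dt} + \frac{q}{C} = E$ or $\frac{dq}{dt} + \frac{1}{RC}q = \frac{E}{R}$.
The integrating factor is $e^{t/RC}$ and the solution with zero initial charge is
$q = EC(1 - e^{-t/RC})$ and as $t \to \infty$, $q \to EC$.
2. $q = \frac{E_0 C}{1 + \omega^2R^2C^2}\left[\cos\omega t - e^{-t/RC} + \omega RC\sin\omega t\right]$
$i = \frac{dq}{dt} = \frac{E_0 C}{1 + \omega^2R^2C^2}\left[-\omega\sin\omega t + \frac{1}{RC}e^{-t/RC} + \omega^2RC\cos\omega t\right]$.
3. $v_{min} = \sqrt{2gR}$. If R = 6378 km and g = 9.81 m s−2 then $v_{min}$ = 11.2 km s−1.
4. $p = \frac{p_0 r_1^2}{r_1^2 - r_2^2}\left[1 - \frac{r_2^2}{r^2}\right]$
5. $q = 0.1 - \frac{1}{10\sqrt{3}}e^{-50t}\left(\sin 50\sqrt{3}t + \sqrt{3}\cos 50\sqrt{3}t\right)$, $i = \frac{20}{\sqrt{3}}e^{-50t}\sin 50\sqrt{3}t$.
6. $x = x_0(1 + t)e^{-t}$
7. $x = x_0 e^{-0.75t}\left(\cos\frac{\sqrt{7}}{4}t + \frac{3}{\sqrt{7}}\sin\frac{\sqrt{7}}{4}t\right)$
8. $x = \frac{1}{104}\left[e^{-4t}(3\cos 3t + 106\sin 3t) - 3\cos 3t + 2\sin 3t\right]$
9. This leads to $\frac{dV}{dt} = -kV$ and $V = V_0 e^{-kt}$ where $k = \frac{1}{3}\ln 2$. The time taken is about 4 hr
19 min. This is much less than the other model, as should be expected.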
Contents 20
Laplace Transforms
20.1 Causal Functions 2
20.2 The Transform and its Inverse 11
20.3 Further Laplace Transforms 24
20.4 Solving Differential Equations 34
20.5 The Convolution Theorem 55
20.6 Transfer Functions 63
Learning outcomes
In this Workbook you will learn what a causal function is, what the Laplace transform is,
and how to obtain the Laplace transform of many commonly occurring causal functions.
You will learn how the inverse Laplace transform can be obtained by using a look-up
table and by using the so-called shift theorems. You will understand how to apply the
Laplace transform to solve single and systems of ordinary differential equations.
Finally you will gain some appreciation of transfer functions and some of their applications
in solving linear systems.

Causal Functions 20.1
Introduction
The Laplace transformation is a technique employed primarily to solve constant coefficient ordinary
differential equations. It is also used in modelling engineering systems. In this section we look at
those functions to which the Laplace transformation is normally applied; so-called causal or one-
sided functions. These are functions f (t) of a single variable t such that f (t) = 0 if t < 0. In
particular we consider the simplest causal function: the unit step function (often called the Heaviside
function) u(t):
$$u(t) = \begin{cases} 1 & \text{if } t \ge 0 \\ 0 & \text{if } t < 0 \end{cases}$$
We then use this function to show how signals (functions of time t) may be ‘switched on’ and
‘switched off’.

Prerequisites
• understand what a function is
• be able to integrate simple functions
Learning Outcomes
On completion you should be able to . . .
• explain what a causal function is
• be able to apply the step function to ‘switch on’ and ‘switch off’ signals

1. Transforms and causal functions

Without perhaps realising it, we are used to employing transformations in mathematics. For example,
we often transform problems in algebra to an equivalent problem in geometry in which our natural
intuition and experience can be brought to bear. Thus, for example, if we ask:
‘What are those values of x for which x(x − 1)(x + 2) > 0?’ then perhaps the simplest way to solve
this problem is to sketch the curve y = x(x − 1)(x + 2) and then, by inspection, find for what values
of x it is positive. We obtain the following figure.
[Graph of y = x(x − 1)(x + 2), crossing the x-axis at x = −2, 0 and 1]
Figure 1
We have transformed a problem in algebra into an equivalent geometrical problem.
Clearly, by inspection of the curve, this inequality is satisfied if
−2 < x < 0 or if x > 1
and we have transformed back again to algebraic form.
The Laplace transform is a more complicated transformation than the simple geometric transforma-
tion considered above. What is done is to transform a function f (t) of a single variable t into another
function F (s) of a single variable s through the relation:
$$F(s) = \int_0^\infty e^{-st} f(t)\,dt.$$
The procedure is to produce, for each f (t) of interest, the corresponding expression F (s). As a
simple example, if $f(t) = e^{-2t}$ then
$$F(s) = \int_0^\infty e^{-st}e^{-2t}\,dt = \int_0^\infty e^{-(s+2)t}\,dt = \left[\frac{e^{-(s+2)t}}{-(s+2)}\right]_0^\infty = 0 - \frac{e^0}{-(s+2)} = \frac{1}{s+2}$$
(We remind the reader that e−kt → 0 as t → ∞ if k > 0.)
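The defining integral can also be approximated numerically, which gives a quick sanity check on a transform obtained by hand. The sketch below is an illustration under stated assumptions: the helper `laplace_numeric` is invented here, it truncates the upper limit at a finite `t_max` (reasonable only when the integrand has decayed by then), and it uses the composite Simpson rule.

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=40000):
    """Composite Simpson approximation to the integral of e^{-st} f(t) over [0, t_max]."""
    h = t_max / n                      # n must be even for Simpson's rule
    total = f(0.0) + math.exp(-s * t_max) * f(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

# f(t) = e^{-2t}: compare with 1/(s + 2) at the sample value s = 1
s = 1.0
approx = laplace_numeric(lambda t: math.exp(-2 * t), s)
print(approx, 1 / (s + 2))
```

The same helper applied to f(t) = t at s = 1 can be compared with $1/s^2$.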
Task
Find F(s) if f(t) = t using $F(s) = \int_0^\infty e^{-st}\,t\,dt$
Your solution
Answer
You should obtain $F(s) = 1/s^2$. You do this by integrating by parts:
$$F(s) = \int_0^\infty e^{-st}\,t\,dt = \left[t\,\frac{e^{-st}}{(-s)}\right]_0^\infty - \int_0^\infty \frac{e^{-st}}{(-s)}\,dt = 0 + \int_0^\infty \frac{e^{-st}}{s}\,dt = \left[-\frac{e^{-st}}{s^2}\right]_0^\infty = \frac{1}{s^2}$$
The integral $\int_0^\infty e^{-st}f(t)\,dt$ is called the Laplace transform of f(t) and is denoted by $\mathcal{L}\{f(t)\}$.
Key Point 1
The Laplace Transform
$$\mathcal{L}\{f(t)\} = \int_0^\infty e^{-st}f(t)\,dt = F(s)$$
Causal functions
As we have seen above, the Laplace transform involves an integral with limits t = 0 and t = ∞.
Because of this, the nature of the function being transformed, f (t), when t is negative is of no
importance. In order to emphasize this we shall only consider so-called causal functions all of
which take the value 0 when t < 0.
The simplest causal function is the Heaviside or step function denoted by u(t) and defined by:
$$u(t) = \begin{cases} 1 & \text{if } t \ge 0 \\ 0 & \text{if } t < 0 \end{cases}$$
with graph as in Figure 2.
[Graph of u(t) against t, stepping from 0 to 1 at t = 0]
Figure 2
Similarly we can consider other ‘step-functions’. For example, from the above definition we deduce
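$$u(t-3) = \begin{cases} 1 & \text{if } t - 3 \ge 0 \\ 0 & \text{if } t - 3 < 0 \end{cases}$$
or, rearranging the inequalities:
$$u(t-3) = \begin{cases} 1 & \text{if } t \ge 3 \\ 0 & \text{if } t < 3 \end{cases}$$
with graph as in Figure 3:
[Graph of u(t − 3) against t, stepping from 0 to 1 at t = 3]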
Figure 3
The step function has a useful property: multiplying an ordinary function f (t) by the step function
u(t) changes it into a causal function; e.g. if f (t) = sin t then sin t.u(t) is causal. This is illustrated
in the change from Figure 4 to Figure 5:
f (t) = sin t
Figure 4
h(t) = f (t)u(t) = sin t.u(t)
Figure 5
Key Point 2
Causal Functions
If u(t) is the unit step function and f (t) is any function then
f (t)u(t) is a causal function
The step function can be used to ‘switch on’ functions at other values of t (which we will normally
interpret as time). For example u(t − 1) has the value 1 if t ≥ 1 and 0 otherwise so that sin t.u(t − 1)
is described by the (solid) curve in Figure 6:
[Graph of sin t.u(t − 1) against t: zero for t < 1, following sin t for t ≥ 1]
Figure 6
The step function can also be used to ‘switch off’ signals. For example, the step function u(t − 1) −
u(t − 3) in Figure 7 has the effect on f (t) such that f (t) [u(t − 1) − u(t − 3)] (described by the solid
curve in Figure 8) switches on at t = 1 (because then u(t − 1) − u(t − 3) takes the value 1), remains
‘on’ for 1 ≤ t < 3, and then switches ‘off’ when t ≥ 3 (because then u(t−1)−u(t−3) = 1−1 = 0).
[Graph of u(t − 1) − u(t − 3) against t: equal to 1 between t = 1 and t = 3, zero elsewhere]
Figure 7
[Graph of f (t)[u(t − 1) − u(t − 3)] against t: non-zero only between t = 1 and t = 3]
Figure 8
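The switching behaviour is easy to reproduce in code. The following Python sketch uses the convention of this Section, u(t) = 1 for t ≥ 0; the helper names `window` and `switched` are invented here for illustration.

```python
import math

def u(t):
    """Unit step: 1 if t >= 0, otherwise 0."""
    return 1.0 if t >= 0 else 0.0

def window(t, a, b):
    """u(t - a) - u(t - b): equal to 1 for a <= t < b and 0 elsewhere (a < b)."""
    return u(t - a) - u(t - b)

def switched(f, t, a, b):
    """f(t) switched on at t = a and off again at t = b."""
    return f(t) * window(t, a, b)

# sin t switched on at t = 1 and off at t = 3
for t in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(t, switched(math.sin, t, 1, 3))
```

With this convention the value at t = 3 is already zero, because u(t − 3) equals 1 there.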
If we have an expression f (t − a)u(t − a) then this is the function f (t) translated along the t-axis
through a time a. For example sin(t − 2).u(t − 2) is simply the causal sine curve sin t.u(t) shifted
to the right by two units as described in the following Figure 9.
[Graph of sin(t − 2).u(t − 2) against t: zero for t < 2, then the sine curve starting at t = 2]
Figure 9
Task
Sketch the curve $f(t) = e^t(u(t - 1) - u(t - 2))$.
Your solution
Answer
You should obtain
[Graph of f(t) against t: zero except between t = 1 and t = 2, where it follows $e^t$]
This is obtained since, if t < 1 then t − 1 < 0 and t − 2 < 0 and so
u(t − 1) = 0, u(t − 2) = 0 leading to f(t) = 0
Also if 1 < t < 2 then t − 1 > 0 and t − 2 < 0 so
u(t − 1) = 1 and u(t − 2) = 0 implying $f(t) = e^t$ for this range of t-values.
Finally if t > 2 then t − 1 > 0 and t − 2 > 0 and so
u(t − 1) = 1, u(t − 2) = 1 giving $f(t) = e^t(1 - 1) = 0$.
2. Properties of causal functions

Even though a function f (t) may be causal we shall still often use the step function u(t) to emphasize
its causality and write f (t) u(t). The following properties are easily verified.
(a) The sum of causal functions is causal:
f (t)u(t) + g(t)u(t) = [f (t) + g(t)] u(t)
(b) The product of causal functions is causal:
{f (t)u(t)} {g(t)u(t)} = f (t)g(t).u(t)
(c) The derivative of a causal function is causal:
$$\frac{d}{dt}\{f(t)u(t)\} = \frac{df}{dt}.u(t)$$
(d) The definite integral of a causal function is a constant.
Calculating the definite integral of a causal function needs care.

Consider $\int_a^b f(t)u(t)\,dt$ where a < b. There are 3 cases to consider (i) b < 0 (ii) a < 0, b > 0
and (iii) a > 0 which are described in Figure 10:
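[Three sketches of the interval from a to b on the t-axis: (i) entirely to the left of the origin, (ii) containing the origin, (iii) entirely to the right of the origin]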
Figure 10
(i) If b < 0 then t < 0 and so u(t) = 0 ∴ $\int_a^b f(t)u(t)\,dt = 0$
(ii) If a < 0, b > 0 then
$$\int_a^b f(t)u(t)\,dt = \int_a^0 f(t)u(t)\,dt + \int_0^b f(t)u(t)\,dt = 0 + \int_0^b f(t)u(t)\,dt = \int_0^b f(t)\,dt$$
since, in the first integral t < 0 and so u(t) = 0 whereas, in the second integral t > 0 and so
u(t) = 1.
(iii) If a > 0 then $\int_a^b f(t)u(t)\,dt = \int_a^b f(t)\,dt$ since t > 0 and so u(t) = 1.
Task
If $f(t) = (e^{-t} + t)u(t)$ then find $\frac{df}{dt}$ and $\int_{-3}^{4} f(t)\,dt$
Find the derivative first:

Your solution
Answer
$$\frac{df}{dt} = (-e^{-t} + 1)u(t)$$
Now obtain another integral representing $\int_{-3}^{4} f(t)\,dt$:
Your solution
Answer
You should obtain $\int_0^4 (e^{-t} + t)\,dt$ since
$$\int_{-3}^4 f(t)\,dt = \int_{-3}^4 (e^{-t} + t)u(t)\,dt = \int_0^4 (e^{-t} + t)\,dt$$
This follows because in the range t = −3 to t = 0 the step function u(t) = 0 and so that part of
the integral is zero. In the other part of the integral u(t) = 1.
Now complete the integration:
Your solution
Answer
You should obtain 8.9817 (to 4 d.p.) since
$$\int_0^4 (e^{-t} + t)\,dt = \left[-e^{-t} + \frac{t^2}{2}\right]_0^4 = (-e^{-4} + 8) - (-1) = -e^{-4} + 9 \approx 8.9817$$
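The three cases for the definite integral of a causal function translate directly into code. This is a minimal sketch, assuming a < b and a reasonably smooth f; the function name `integrate_causal` is invented here, and Simpson's rule is used for the part of the range where u(t) = 1.

```python
import math

def integrate_causal(f, a, b, n=2000):
    """Integral of f(t)u(t) over [a, b] with a < b."""
    if b <= 0:
        return 0.0                     # case (i): u(t) = 0 over the whole range
    lo = max(a, 0.0)                   # cases (ii) and (iii): start at 0 or at a
    h = (b - lo) / n                   # n must be even for Simpson's rule
    total = f(lo) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# The Task above: f(t) = (e^{-t} + t) u(t) integrated from -3 to 4
value = integrate_causal(lambda t: math.exp(-t) + t, -3, 4)
print(value, 9 - math.exp(-4))
```

The two printed numbers can be compared with the value 8.9817 (to 4 d.p.) quoted in the Answer.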
Exercises
1. Find the derivative with respect to t of $(t^3 + \sin t)u(t)$.
2. Find the area under the curve $(t^3 + \sin t)u(t)$ between t = −3 and t = 1.
3. Find the area under the curve $\frac{1}{(t+3)}[u(t - 1) - u(t - 3)]$ between t = −2 and t = 2.5.
Answers
1. $(3t^2 + \cos t)u(t)$
2. 0.7097
3. 0.3185
The Transform and
its Inverse 20.2
Introduction
In this Section we formally introduce the Laplace transform. The transform is only applied to causal
functions which were introduced in Section 20.1. We find the Laplace transform of many commonly
occurring ‘signals’ and produce a table of standard Laplace transforms.
We also consider the inverse Laplace transform. To begin with, the inverse Laplace transform is
obtained ‘by inspection’ using a table of transforms. This approach is developed by employing
techniques such as partial fractions and completing the square introduced in 3.6.
Prerequisites
• understand what a causal function is
• be able to find and use partial fractions
• be able to perform integration by parts
• be able to use the technique of completing the square
Learning Outcomes
• find the Laplace transform of many commonly occurring causal functions
• obtain the inverse Laplace transform using techniques involving
(i) a table of transforms
(ii) partial fractions
(iii) completing the square
(iv) the first shift theorem
1. The Laplace transform
If f(t) is a causal function then the Laplace transform of f(t) is written $\mathcal{L}\{f(t)\}$ and defined by:
$$\mathcal{L}\{f(t)\} = \int_0^\infty e^{-st}f(t)\,dt.$$
Clearly, once the integral is performed and the limits substituted the resulting expression will involve
the s parameter alone since the dependence upon t is removed in the integration process. This
resulting expression in s is denoted by F (s); its precise form is dependent upon the form taken by
f (t). We now refine Key Point 1 (page 4).
Key Point 3
The Laplace Transform of a Causal Function
$$\mathcal{L}\{f(t)u(t)\} \equiv \int_0^\infty e^{-st}f(t)u(t)\,dt \equiv F(s)$$
To begin, we determine the Laplace transform of some simple causal functions. For example, if we
consider the ramp function f (t) = t.u(t) with graph
[Graph of f(t) = t u(t) against t: zero for t < 0, then a straight line at 45° to the t-axis]
Figure 11
we find:
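$$\begin{aligned}
\mathcal{L}\{t\,u(t)\} &= \int_0^\infty e^{-st}\,t\,u(t)\,dt \\
&= \int_0^\infty e^{-st}\,t\,dt \qquad \text{since in the range of the integral } u(t) = 1 \\
&= \left[\frac{te^{-st}}{(-s)}\right]_0^\infty - \int_0^\infty \frac{e^{-st}}{(-s)}\,dt \qquad \text{using integration by parts} \\
&= \left[\frac{te^{-st}}{(-s)}\right]_0^\infty - \left[\frac{e^{-st}}{(-s)^2}\right]_0^\infty
\end{aligned}$$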
Now we have the difficulty of substituting in the limits of integration. The only problem arises
with the upper limit (t = ∞). We shall always assume that the parameter s is so chosen that no
contribution ever arises from the upper limit (t = ∞). In this particular case we need only demand
that s is real and positive. Using this ‘rule of thumb’:
$$\mathcal{L}\{t\,u(t)\} = [0 - 0] - \left[0 - \frac{1}{(-s)^2}\right] = \frac{1}{s^2}$$
Thus, if $f(t) = t\,u(t)$ then $F(s) = 1/s^2$.
A similar, but more tedious, calculation yields the result that if $f(t) = t^n u(t)$ in which n is a positive
integer then:
$$\mathcal{L}\{t^n u(t)\} = \frac{n!}{s^{n+1}}$$
[We remember n! ≡ n(n − 1)(n − 2) . . . (3)(2)(1).]
Task
Find the Laplace transform of the step function u(t).
Begin by obtaining the Laplace integral:
Your solution
Answer
You should obtain $\int_0^\infty e^{-st}\,dt$ since in the range of integration, t > 0 and so u(t) = 1 leading to
$$\mathcal{L}\{u(t)\} = \int_0^\infty e^{-st}u(t)\,dt = \int_0^\infty e^{-st}\,dt$$
Your solution
Answer
You should have obtained:
$$\mathcal{L}\{u(t)\} = \int_0^\infty e^{-st}\,dt = \left[\frac{e^{-st}}{(-s)}\right]_0^\infty = 0 - \frac{1}{(-s)} = \frac{1}{s}$$
where, again, we have assumed the contribution from the upper limit is zero.
As a second example, we consider the decaying exponential $f(t) = e^{-at}u(t)$ where a is a positive
constant. This function has graph:
[Graph of $f(t) = e^{-at}u(t)$ against t: zero for t < 0, then decaying from 1 towards 0]
Figure 12
In this case,
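$$\mathcal{L}\{e^{-at}u(t)\} = \int_0^\infty e^{-st}e^{-at}\,dt = \int_0^\infty e^{-(s+a)t}\,dt = \left[\frac{e^{-(s+a)t}}{-(s+a)}\right]_0^\infty = \frac{1}{s+a} \qquad \text{(zero contribution from the upper limit)}$$
Therefore, if $f(t) = e^{-at}u(t)$ then $F(s) = \frac{1}{s+a}$.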
Following this approach we can develop a table of Laplace transforms which records, for each causal
function f (t) listed, its corresponding transform function F (s). Table 1 gives a limited table of
transforms.
The linearity property of the Laplace transformation

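If f(t) and g(t) are causal functions and $c_1$, $c_2$ are constants then
$$\begin{aligned}
\mathcal{L}\{c_1f(t) + c_2g(t)\} &= \int_0^\infty e^{-st}[c_1f(t) + c_2g(t)]\,dt \\
&= c_1\int_0^\infty e^{-st}f(t)\,dt + c_2\int_0^\infty e^{-st}g(t)\,dt \\
&= c_1\mathcal{L}\{f(t)\} + c_2\mathcal{L}\{g(t)\}
\end{aligned}$$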
Key Point 4
Linearity Property of the Laplace Transform
L{c1 f (t) + c2 g(t)} = c1 L{f (t)} + c2 L{g(t)}
Table 1. Table of Laplace Transforms

Rule   Causal function            Laplace transform
1      f(t)                       F(s)
2      u(t)                       1/s
3      t^n u(t)                   n!/s^(n+1)
4      e^(−at) u(t)               1/(s + a)
5      sin at . u(t)              a/(s^2 + a^2)
6      cos at . u(t)              s/(s^2 + a^2)
7      e^(−at) sin bt . u(t)      b/((s + a)^2 + b^2)
8      e^(−at) cos bt . u(t)      (s + a)/((s + a)^2 + b^2)
Note: For convenience, this table is repeated at the end of the Workbook.
That is, the Laplace transform of a linear sum of causal functions is a linear sum of Laplace transforms.
For example,
$$\mathcal{L}\{2\cos t\,.\,u(t) - 3t^2u(t)\} = 2\mathcal{L}\{\cos t\,.\,u(t)\} - 3\mathcal{L}\{t^2u(t)\} = 2\left(\frac{s}{s^2+1}\right) - 3\left(\frac{2}{s^3}\right)$$
Task
Obtain the Laplace transform of the hyperbolic function sinh at.
Begin by expressing sinh at in terms of exponential functions:
Your solution
Answer
$\sinh at = \frac{1}{2}(e^{at} - e^{-at})$
Now use the linearity property (Key Point 4) to obtain the Laplace transform of the causal function
sinh at.u(t):
Your solution
Answer
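You should obtain $a/(s^2 - a^2)$ since
$$\begin{aligned}
\mathcal{L}\{\sinh at.u(t)\} &= \mathcal{L}\left\{\frac{e^{at} - e^{-at}}{2}.u(t)\right\} = \frac{1}{2}\mathcal{L}\{e^{at}.u(t)\} - \frac{1}{2}\mathcal{L}\{e^{-at}.u(t)\} \\
&= \frac{1}{2}\left(\frac{1}{s-a}\right) - \frac{1}{2}\left(\frac{1}{s+a}\right) \qquad \text{(Table 1, Rule 4)} \\
&= \frac{1}{2}\left(\frac{2a}{(s-a)(s+a)}\right) = \frac{a}{s^2 - a^2}
\end{aligned}$$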
Task
Obtain the Laplace transform of the hyperbolic function cosh at.
Your solution
Answer
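You should obtain $\frac{s}{s^2 - a^2}$ since
$$\begin{aligned}
\mathcal{L}\{\cosh at.u(t)\} &= \mathcal{L}\left\{\frac{e^{at} + e^{-at}}{2}.u(t)\right\} = \frac{1}{2}\mathcal{L}\{e^{at}.u(t)\} + \frac{1}{2}\mathcal{L}\{e^{-at}.u(t)\} \\
&= \frac{1}{2}\left(\frac{1}{s-a}\right) + \frac{1}{2}\left(\frac{1}{s+a}\right) \qquad \text{(Table 1, Rule 4)} \\
&= \frac{1}{2}\left(\frac{2s}{(s-a)(s+a)}\right) = \frac{s}{s^2 - a^2}
\end{aligned}$$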
Task
Find the Laplace transform of the delayed step-function u(t − a), a > 0.
Write the delayed step-function here in terms of an integral:
Your solution
Answer
You should obtain $\mathcal{L}\{u(t-a)\} = \int_a^\infty e^{-st}\,dt$ (note the lower limit is a) since:
$$\mathcal{L}\{u(t-a)\} = \int_0^\infty e^{-st}u(t-a)\,dt = \int_0^a e^{-st}u(t-a)\,dt + \int_a^\infty e^{-st}u(t-a)\,dt$$
In the first integral 0 < t < a and so (t − a) < 0, therefore u(t − a) = 0.
In the second integral a < t < ∞ and so (t − a) > 0, therefore u(t − a) = 1. Hence
$$\mathcal{L}\{u(t-a)\} = 0 + \int_a^\infty e^{-st}\,dt.$$
Your solution
Answer
$$\mathcal{L}\{u(t-a)\} = \int_a^\infty e^{-st}\,dt = \left[\frac{e^{-st}}{(-s)}\right]_a^\infty = \frac{e^{-sa}}{s}$$
Exercise
Determine the Laplace transform of the following functions.
(a) $e^{-3t}u(t)$ (b) $u(t-3)$ (c) $e^{-t}\sin 3t.u(t)$ (d) $(5\cos 3t - 6t^3).u(t)$
Answer (a) $\frac{1}{s+3}$ (b) $\frac{e^{-3s}}{s}$ (c) $\frac{3}{(s+1)^2+9}$ (d) $\frac{5s}{s^2+9} - \frac{36}{s^4}$
2. The inverse Laplace transform
The Laplace transform takes a causal function f (t) and transforms it into a function of s, F (s):
L{f (t)} ≡ F (s)
The inverse Laplace transform operator is denoted by $\mathcal{L}^{-1}$ and involves recovering the original causal
function f (t). That is,
Key Point 5
Inverse Laplace Transform
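$\mathcal{L}^{-1}\{F(s)\} = f(t)$ where $\mathcal{L}\{f(t)\} = F(s)$
For example, using standard transforms from Table 1:
$$\mathcal{L}^{-1}\left\{\frac{s}{s^2+4}\right\} = \cos 2t\,.\,u(t) \quad \text{since} \quad \mathcal{L}\{\cos 2t\,.\,u(t)\} = \frac{s}{s^2+4}. \qquad \text{(Table 1, Rule 6)}$$
Also
$$\mathcal{L}^{-1}\left\{\frac{3}{s^2}\right\} = 3t\,u(t) \quad \text{since} \quad \mathcal{L}\{3t\,u(t)\} = \frac{3}{s^2}. \qquad \text{(Table 1, Rule 3)}$$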
Because the Laplace transform is a linear operator it follows that the inverse Laplace transform is
also linear, so if c1 , c2 are constants:
Key Point 6
Linearity Property of Inverse Laplace Transforms
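$\mathcal{L}^{-1}\{c_1F(s) + c_2G(s)\} = c_1\mathcal{L}^{-1}\{F(s)\} + c_2\mathcal{L}^{-1}\{G(s)\}$
For example, to find the inverse Laplace transform of $\frac{2}{s^4} - \frac{6}{s^2+4}$ we have
$$\mathcal{L}^{-1}\left\{\frac{2}{s^4} - \frac{6}{s^2+4}\right\} = \frac{2}{6}\mathcal{L}^{-1}\left\{\frac{6}{s^4}\right\} - 3\mathcal{L}^{-1}\left\{\frac{2}{s^2+4}\right\} = \frac{1}{3}t^3u(t) - 3\sin 2t\,.\,u(t) \qquad \text{(from Table 1)}$$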
Note that the fractions have had to be manipulated slightly in order that the expressions match
precisely with the expressions in Table 1.
Although the inverse Laplace transform can be examined at a deeper mathematical level we shall be
content with this simple-minded approach to finding inverse Laplace transforms by using the table
of Laplace transforms. However, even this approach is not always straightforward and considerable
algebraic manipulation is often required before an inverse Laplace transform can be found. Next we
consider two standard rearrangements which often occur.
Inverting through the use of partial fractions

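The function
$$F(s) = \frac{1}{(s-1)(s+2)}$$
does not appear in our table of transforms and so we cannot, by inspection, write down the inverse
Laplace transform. However, by using partial fractions we see that
$$F(s) = \frac{1}{(s-1)(s+2)} = \frac{\frac{1}{3}}{s-1} - \frac{\frac{1}{3}}{s+2}$$
and so, using the linearity property:
$$\mathcal{L}^{-1}\left\{\frac{1}{(s-1)(s+2)}\right\} = \mathcal{L}^{-1}\left\{\frac{\frac{1}{3}}{s-1}\right\} - \mathcal{L}^{-1}\left\{\frac{\frac{1}{3}}{s+2}\right\} = \frac{1}{3}e^t - \frac{1}{3}e^{-2t} \qquad \text{(Table 1, Rule 4)}$$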
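A partial-fraction decomposition is easy to get wrong by a sign or a factor, so a quick exact check can be worthwhile. The sketch below compares the two forms of F(s) at a handful of rational sample points using exact arithmetic; the function names and sample values are chosen here for illustration, and agreement at sample points is a spot check rather than a proof.

```python
from fractions import Fraction

def F(s):
    """Original form: 1 / ((s - 1)(s + 2))."""
    return Fraction(1) / ((s - 1) * (s + 2))

def F_partial(s):
    """Partial-fraction form: (1/3)/(s - 1) - (1/3)/(s + 2)."""
    return Fraction(1, 3) / (s - 1) - Fraction(1, 3) / (s + 2)

# Sample points avoiding the poles s = 1 and s = -2
samples = [Fraction(2), Fraction(3), Fraction(5, 2), Fraction(-1, 2), Fraction(10)]
print(all(F(s) == F_partial(s) for s in samples))
```

Using `Fraction` rather than floating point means the comparison is exact, so no tolerance is needed.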
Task
Find the inverse Laplace transform of $\frac{3}{(s-1)(s^2+1)}$.
Begin by using partial fractions to write the given expression in a more suitable form:
Your solution
Answer
$$\frac{3}{(s-1)(s^2+1)} = \frac{\frac{3}{2}}{s-1} - \frac{\frac{3}{2}s + \frac{3}{2}}{s^2+1}$$
Now continue to obtain the inverse:

Your solution
Answer

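$$\begin{aligned}
\mathcal{L}^{-1}\left\{\frac{3}{(s-1)(s^2+1)}\right\} &= \frac{3}{2}\mathcal{L}^{-1}\left\{\frac{1}{s-1}\right\} - \frac{3}{2}\mathcal{L}^{-1}\left\{\frac{s}{s^2+1}\right\} - \frac{3}{2}\mathcal{L}^{-1}\left\{\frac{1}{s^2+1}\right\} \\
&= \frac{3}{2}\left(e^t - \cos t - \sin t\right)u(t) \qquad \text{(Table 1, Rules 4, 6, 5)}
\end{aligned}$$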
3. The first shift theorem

The first and second shift theorems enable an even wider range of Laplace transforms to be easily
obtained than the transforms we have already found. They also enable a significantly wider range of
inverse transforms to be found. Here we introduce the first shift theorem. If f (t) is a causal function
with Laplace transform F (s), i.e. L{f (t)} = F (s), then as we shall see, the Laplace transform of
$e^{-at}f(t)$, where a is a given constant, can easily be found in terms of F(s).
Using the definition of the Laplace transform:
Z ∞
$$\mathcal{L}\{e^{-at}f(t)\} = \int_0^\infty e^{-st}e^{-at}f(t)\,dt = \int_0^\infty e^{-(s+a)t}f(t)\,dt$$
But if
$$F(s) = \mathcal{L}\{f(t)\} = \int_0^\infty e^{-st}f(t)\,dt$$
then simply replacing ‘s’ by ‘s + a’ on both sides gives:

$$F(s+a) = \int_0^\infty e^{-(s+a)t}f(t)\,dt$$
That is, the parameter s is shifted to the value s + a.

We have then the statement of the first shift theorem:
Key Point 7
First Shift Theorem
If $\mathcal{L}\{f(t)\} = F(s)$ then $\mathcal{L}\{e^{-at}f(t)\} = F(s+a)$.
For example, we already know (from Table 1) that

$$\mathcal{L}\{t^3u(t)\} = \frac{6}{s^4}$$
and so, by the first shift theorem:
$$\mathcal{L}\{e^{-2t}t^3u(t)\} = \frac{6}{(s+2)^4}$$
Task
Use the first shift theorem to determine $\mathcal{L}\{e^{2t}\cos 3t.u(t)\}$.
Your solution
Answer
You should obtain $\frac{s-2}{(s-2)^2+9}$ since $\mathcal{L}\{\cos 3t.u(t)\} = \frac{s}{s^2+9}$ (Table 1, Rule 6)
and so by the first shift theorem (with a = −2)
$$\mathcal{L}\{e^{2t}\cos 3t.u(t)\} = \frac{s-2}{(s-2)^2+9}$$
obtained by simply replacing ‘s’ by ‘s − 2’.
We can also employ the first shift theorem to determine some inverse Laplace transforms.
Task
Find the inverse Laplace transform of $F(s) = \frac{3}{s^2 - 2s - 8}$.
Begin by completing the square in the denominator:
Your solution
Answer
$$\frac{3}{s^2 - 2s - 8} = \frac{3}{(s-1)^2 - 9}$$
Recalling that $\mathcal{L}\{\sinh 3t\,u(t)\} = \frac{3}{s^2-9}$ (from the Task on page 15) complete the inversion using
the first shift theorem:
Your solution
Answer
You should obtain

$$\mathcal{L}^{-1}\left\{\frac{3}{(s-1)^2 - 9}\right\} = e^t\sinh 3t\,u(t)$$
Here, in the notation of the shift theorem:
$$f(t) = \sinh 3t\,u(t), \qquad F(s) = \frac{3}{s^2-9} \qquad \text{and} \qquad a = -1$$
Inverting using completion of the square

The function:
$$F(s) = \frac{4s}{s^2+2s+5}$$
does not appear in the table of transforms and, again, needs amending before we can find its inverse
transform. In this case, because $s^2 + 2s + 5$ does not have nice factors, we complete the square in
the denominator:
$$s^2 + 2s + 5 \equiv (s+1)^2 + 4$$
and so
$$F(s) = \frac{4s}{s^2+2s+5} = \frac{4s}{(s+1)^2+4}$$
Now the numerator needs amending slightly to enable us to use the appropriate rule in the table of
transforms (Table 1, Rule 8):

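$$\begin{aligned}
F(s) = \frac{4s}{(s+1)^2+4} &= 4\left[\frac{s+1-1}{(s+1)^2+4}\right] \\
&= 4\left[\frac{s+1}{(s+1)^2+4} - \frac{1}{(s+1)^2+4}\right] \\
&= \frac{4(s+1)}{(s+1)^2+4} - 2\left[\frac{2}{(s+1)^2+4}\right]
\end{aligned}$$
Hence
$$\begin{aligned}
\mathcal{L}^{-1}\{F(s)\} &= 4\mathcal{L}^{-1}\left\{\frac{s+1}{(s+1)^2+4}\right\} - 2\mathcal{L}^{-1}\left\{\frac{2}{(s+1)^2+4}\right\} \\
&= 4e^{-t}\cos 2t\,.\,u(t) - 2e^{-t}\sin 2t\,.\,u(t) \\
&= e^{-t}[4\cos 2t - 2\sin 2t]u(t)
\end{aligned}$$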
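One way to gain confidence in an inverse transform found by completing the square is to transform the answer forwards again numerically and compare with F(s). The sketch below does this at two sample values of s. It is only a spot check under stated assumptions: the helper `laplace_numeric` is invented here, it truncates the integral at a finite `t_max`, and it relies on the integrand having decayed by then.

```python
import math

def laplace_numeric(f, s, t_max=40.0, n=40000):
    """Composite Simpson approximation to the integral of e^{-st} f(t) over [0, t_max]."""
    h = t_max / n                      # n must be even for Simpson's rule
    total = f(0.0) + math.exp(-s * t_max) * f(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

def f(t):
    """Candidate inverse: e^{-t} [4 cos 2t - 2 sin 2t] for t >= 0."""
    return math.exp(-t) * (4 * math.cos(2 * t) - 2 * math.sin(2 * t))

def F(s):
    """Given transform: 4s / (s^2 + 2s + 5)."""
    return 4 * s / (s * s + 2 * s + 5)

for s in (1.0, 2.5):
    print(s, laplace_numeric(f, s), F(s))
```

Close agreement at the sample points is consistent with the inversion being correct; a disagreement would point to an algebra slip in the rearrangement.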
Task
Find the inverse Laplace transform of $\frac{3}{s^2 - 4s + 6}$.
Begin by completing the square in the denominator of this expression:
Your solution
Answer
$$\frac{3}{s^2 - 4s + 6} = \frac{3}{(s-2)^2 + 2}$$
Now obtain the inverse:

Your solution
Answer
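You should obtain:
$$\mathcal{L}^{-1}\left\{\frac{3}{(s-2)^2+2}\right\} = \mathcal{L}^{-1}\left\{\frac{3}{\sqrt{2}}\left[\frac{\sqrt{2}}{(s-2)^2+2}\right]\right\} = \frac{3}{\sqrt{2}}e^{2t}\sin\sqrt{2}t.u(t) \qquad \text{(Table 1, Rule 7)}$$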
Exercise
Determine the inverse Laplace transforms of the following functions.
(a) $\frac{10}{s^4}$ (b) $\frac{s-1}{s^2+8s+17}$ (c) $\frac{3s-7}{s^2+9}$ (d) $\frac{3s+3}{(s-1)(s+2)}$ (e) $\frac{s+3}{s^2+4s}$
(f) $\frac{2}{(s+1)(s^2+1)}$
Answer
(a) $\frac{10}{6}t^3$ (b) $e^{-4t}\cos t - 5e^{-4t}\sin t$ (c) $3\cos 3t - \frac{7}{3}\sin 3t$ (d) $2e^t + e^{-2t}$
(e) $\frac{3}{4}u(t) + \frac{1}{4}e^{-4t}u(t)$ (f) $(e^{-t} - \cos t + \sin t)u(t)$
HELM (2006): 23
Further Laplace Transforms 20.3
Introduction
In this Section we introduce the second shift theorem which simplifies the determination of Laplace
and inverse Laplace transforms in some complicated cases.
Then we obtain the Laplace transform of derivatives of causal functions. This will allow us, in the
next Section, to apply the Laplace transform in the solution of ordinary differential equations.
Finally, we introduce the delta function and obtain its Laplace transform. The delta function is often
needed to model the effect on a system of a forcing function which acts for a very short time.
Prerequisites
Before starting this Section you should . . .
• be able to find Laplace transforms and inverse Laplace transforms of simple causal functions
• be familiar with integration by parts
• understand what an initial-value problem is
• have experience of the first shift theorem
Learning Outcomes
On completion you should be able to . . .
• use the second shift theorem to obtain Laplace transforms and inverse Laplace transforms
• find the Laplace transform of the derivative of a causal function
1. The second shift theorem

The second shift theorem is similar to the first except that, in this case, it is the time-variable that
is shifted not the s-variable. Consider a causal function f (t)u(t) which is shifted to the right by
amount a, that is, the function f (t − a)u(t − a) where a > 0. Figure 13 illustrates the two causal
functions.
[Figure 13: graphs of f(t)u(t) and of the shifted function f(t − a)u(t − a), which starts at t = a]
The Laplace transform of the shifted function is easily obtained:
\begin{align*}
\mathcal{L}\{f(t-a)u(t-a)\} &= \int_0^\infty e^{-st} f(t-a)u(t-a)\,dt \\
&= \int_a^\infty e^{-st} f(t-a)\,dt
\end{align*}
(Note the change in the lower limit from 0 to a resulting from the step function switching on at
t = a). We can re-organise this integral by making the substitution x = t − a. Then dt = dx
and when t = a, x = 0 and when t = ∞ then x = ∞.
Therefore
\begin{align*}
\int_a^\infty e^{-st} f(t-a)\,dt &= \int_0^\infty e^{-s(x+a)} f(x)\,dx \\
&= e^{-sa}\int_0^\infty e^{-sx} f(x)\,dx
\end{align*}
The final integral is simply the Laplace transform of f (x), which we know is F (s) and so, finally, we
have the statement of the second shift theorem:
Key Point 8
Second Shift Theorem
If $\mathcal{L}\{f(t)\} = F(s)$ then $\mathcal{L}\{f(t-a)u(t-a)\} = e^{-sa}F(s)$
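As a hedged numerical illustration of Key Point 8 (not part of the original text), the sketch below builds the shifted causal function $f(t-a)u(t-a)$ for the illustrative choice $f(t) = t$, $a = 3$, estimates its Laplace transform with the trapezoidal rule, and compares it with $e^{-sa}F(s)$ where $F(s) = 1/s^2$. The helper names `u`, `shifted` and `laplace_numeric`, and the values of `t_max` and `n`, are assumptions made for this example.

```python
import math

def u(t):
    """Unit step function: 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0

def shifted(f, a):
    """Return the causal function t -> f(t - a) u(t - a)."""
    return lambda t: f(t - a) * u(t - a)

def laplace_numeric(g, s, t_max=60.0, n=120000):
    """Trapezoidal estimate of the integral of exp(-s t) g(t) over [0, t_max]."""
    h = t_max / n
    total = 0.5 * (g(0.0) + math.exp(-s * t_max) * g(t_max))
    for i in range(1, n):
        t = i * h
        total += math.exp(-s * t) * g(t)
    return total * h

a, s = 3.0, 1.5
g = shifted(lambda t: t, a)        # (t - 3) u(t - 3)
lhs = laplace_numeric(g, s)        # transform of the shifted function
rhs = math.exp(-s * a) / s**2      # e^{-sa} F(s) with F(s) = 1/s^2
print(lhs, rhs)
```

Changing `a` or `s` (keeping `s` positive) should leave the two numbers in close agreement.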
This theorem is useful for finding the Laplace transform of time-shifted causal functions,
but it is also of considerable use in finding inverse Laplace transforms since, using the inverse
formulation of the theorem of Key Point 8, we get:
Key Point 9
Inverse Second Shift Theorem
If $\mathcal{L}^{-1}\{F(s)\} = f(t)$ then $\mathcal{L}^{-1}\{e^{-sa}F(s)\} = f(t-a)u(t-a)$
Task
Find the inverse Laplace transform of $\dfrac{e^{-3s}}{s^2}$.
Your solution
Answer
You should obtain (t − 3)u(t − 3) for the following reasons. We know that the inverse Laplace
transform of 1/s2 is t.u(t) (Table 1, Rule 3) and so, using the second shift theorem (with a = 3),
we have

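\[ \mathcal{L}^{-1}\left\{e^{-3s}\,\frac{1}{s^2}\right\} = (t-3)u(t-3) \]
This function is graphed in the following figure:
[Figure: graph of (t − 3)u(t − 3), which is zero up to t = 3 and then rises as a straight line at 45°]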
Task
Find the inverse Laplace transform of $\dfrac{s}{s^2-2s+2}$.
Your solution
Answer
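You should obtain $e^{t}(\cos t + \sin t)\,u(t)$.
To obtain this, complete the square in the denominator: $s^2-2s+2 = (s-1)^2+1$ and so
\[ \frac{s}{s^2-2s+2} = \frac{s}{(s-1)^2+1} = \frac{(s-1)+1}{(s-1)^2+1} = \frac{s-1}{(s-1)^2+1} + \frac{1}{(s-1)^2+1} \]
Now, using the first shift theorem
\[ \mathcal{L}^{-1}\left\{\frac{s-1}{(s-1)^2+1}\right\} = e^{t}\cos t\,u(t) \quad \text{since} \quad \mathcal{L}^{-1}\left\{\frac{s}{s^2+1}\right\} = \cos t\,u(t) \qquad \text{(Table 1, Rule 6)} \]
and
\[ \mathcal{L}^{-1}\left\{\frac{1}{(s-1)^2+1}\right\} = e^{t}\sin t\,u(t) \quad \text{since} \quad \mathcal{L}^{-1}\left\{\frac{1}{s^2+1}\right\} = \sin t\,u(t) \qquad \text{(Table 1, Rule 5)} \]
Thus
\[ \mathcal{L}^{-1}\left\{\frac{s}{s^2-2s+2}\right\} = e^{t}(\cos t + \sin t)\,u(t) \]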
2. The Laplace transform of a derivative

Here we consider not a causal function $f(t)$ directly but its derivatives $\dfrac{df}{dt}$, $\dfrac{d^2f}{dt^2}$, . . . (which are also
causal). The Laplace transform of derivatives will be invaluable when we apply the Laplace transform
to the solution of constant coefficient ordinary differential equations.
If $\mathcal{L}\{f(t)\}$ is $F(s)$ then we shall seek an expression for $\mathcal{L}\left\{\dfrac{df}{dt}\right\}$ in terms of the function $F(s)$.
Now, by the definition of the Laplace transform
\[ \mathcal{L}\left\{\frac{df}{dt}\right\} = \int_0^\infty e^{-st}\,\frac{df}{dt}\,dt \]
This integral can be simplified using integration by parts:
\begin{align*}
\int_0^\infty e^{-st}\,\frac{df}{dt}\,dt &= \Big[e^{-st}f(t)\Big]_0^\infty - \int_0^\infty (-s)e^{-st}f(t)\,dt \\
&= -f(0) + s\int_0^\infty e^{-st}f(t)\,dt
\end{align*}
(As usual, we assume that contributions arising from the upper limit, t = ∞, are zero.) The integral
on the right-hand side is precisely the Laplace transform of f (t) which we naturally replace by F (s).
Thus

\[ \mathcal{L}\left\{\frac{df}{dt}\right\} = -f(0) + sF(s) \]
As an example, we know that if f (t) = sin t u(t) then
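\[ \mathcal{L}\{f(t)\} = \frac{1}{s^2+1} = F(s) \qquad \text{(Table 1, Rule 5)} \]
and so, according to the result just obtained,
\begin{align*}
\mathcal{L}\left\{\frac{df}{dt}\right\} = \mathcal{L}\{\cos t\,u(t)\} &= -f(0) + sF(s) \\
&= 0 + s\left(\frac{1}{s^2+1}\right) \\
&= \frac{s}{s^2+1}
\end{align*}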
a result we know to be true.
We can find the Laplace transform of the second derivative in a similar way to find:
\[ \mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = -f'(0) - sf(0) + s^2F(s) \]
(The reader might wish to derive this result.) Here $f'(0)$ is the derivative of $f(t)$ evaluated at $t = 0$.
Key Point 10
Laplace Transforms of Derivatives
If $\mathcal{L}\{f(t)\} = F(s)$ then
\[ \mathcal{L}\left\{\frac{df}{dt}\right\} = -f(0) + sF(s) \]
\[ \mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = -f'(0) - sf(0) + s^2F(s) \]
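One way to obtain the second-derivative result is to apply the first-derivative rule twice. The following is a sketch under the same assumption used above (contributions from the upper limit vanish); the symbol $g(t) = df/dt$ is introduced here only for illustration.

```latex
\begin{align*}
\mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = \mathcal{L}\left\{\frac{dg}{dt}\right\}
  &= -g(0) + s\,\mathcal{L}\{g(t)\} \\
  &= -f'(0) + s\left[-f(0) + sF(s)\right] \\
  &= -f'(0) - s f(0) + s^2 F(s)
\end{align*}
```

The same argument, repeated, gives the transform of higher derivatives: each further derivative contributes one more initial value and one more power of $s$.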
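Task
If $\mathcal{L}\{f(t)\} = F(s)$ and $\dfrac{d^2f}{dt^2} - \dfrac{df}{dt} = 3t$ with initial conditions
$f(0) = 1$, $f'(0) = 0$, find the explicit expression for $F(s)$.
Begin by finding $\mathcal{L}\left\{\dfrac{d^2f}{dt^2}\right\}$, $\mathcal{L}\left\{\dfrac{df}{dt}\right\}$ and $\mathcal{L}\{3t\}$: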
Your solution
Answer
\[ \mathcal{L}\{3t\} = 3/s^2 \]
\[ \mathcal{L}\left\{\frac{df}{dt}\right\} = -f(0) + sF(s) = -1 + sF(s) \]
\[ \mathcal{L}\left\{\frac{d^2f}{dt^2}\right\} = -f'(0) - sf(0) + s^2F(s) = -s + s^2F(s) \]
Now complete the calculation to find F (s):
Your solution
Answer
You should find $F(s) = \dfrac{s^3-s^2+3}{s^3(s-1)}$ since, using the transforms we have found:
\[ -s + s^2F(s) - \big(-1 + sF(s)\big) = \frac{3}{s^2} \]
so
\[ F(s)\left[s^2 - s\right] = \frac{3}{s^2} + s - 1 = \frac{s^3-s^2+3}{s^2} \]
leading to
\[ F(s) = \frac{s^3-s^2+3}{s^3(s-1)} \]
Exercises
1. Find the Laplace transforms of
(a) $t^3e^{-2t}u(t)$ (b) $e^{t}\sinh 3t\,u(t)$ (c) $\sin(t-3)\,u(t-3)$
2. If $Y(s) = \mathcal{L}\{y(t)\}$ find expressions for $Y(s)$ if
(a) $\dfrac{d^2y}{dt^2} - 3\dfrac{dy}{dt} + 4y = \sin t$, $y(0) = 1$, $y'(0) = 0$
(b) $7\dfrac{dy}{dt} - 6y = 3u(t)$, $y(0) = 0$
3. Find the inverse Laplace transforms of
(a) $\dfrac{6}{(s+3)^4}$ (b) $\dfrac{15}{s^2-2s+10}$ (c) $\dfrac{3s^2+11s+14}{s^3+2s^2-11s-52}$ (d) $\dfrac{e^{-3s}}{s^4}$ (e) $\dfrac{e^{-2s-2}(s+1)}{s^2+2s+5}$
Answers
1. (a) $\dfrac{6}{(s+2)^4}$ (b) $\dfrac{3}{(s-1)^2-9}$ (c) $\dfrac{e^{-3s}}{s^2+1}$
2. (a) $\dfrac{s^3-3s^2+s-2}{(s^2+1)(s^2-3s+4)}$ (b) $\dfrac{3}{s(7s-6)}$
3. (a) $e^{-3t}t^3u(t)$ (b) $5e^{t}\sin 3t\,u(t)$ (c) $(2e^{4t} + e^{-3t}\cos 2t)\,u(t)$ (d) $\tfrac{1}{6}(t-3)^3u(t-3)$
(e) $e^{-t}\cos 2(t-2)\,u(t-2)$
3. The delta function (or impulse function)

There is often a need to consider the effect on a system (modelled by a differential equation) of
a forcing function which acts for a very short time interval. For example, how does the current in
a circuit behave if the voltage is switched on and then very shortly afterwards switched off? How
does a cantilevered beam vibrate if it is hit with a hammer (providing a force which acts over a very
short time interval)? Both of these engineering ‘systems’ can be modelled by a differential equation.
There are many ways the ‘kick’ or ‘impulse’ to the system can be modelled. The function we have
in mind could have the graphical representation (when a is small) shown in Figure 14.
[Figure 14: a rectangular pulse f(t) that is non-zero only between t = d and t = d + a]
This can be represented formally using step functions; it switches on at t = d and switches off at
t = d + a and has amplitude b:
\[ f(t) = b\left[u(t-d) - u(t-\{d+a\})\right] \]
The effect on the system is related to the area under the curve rather than just the amplitude b. Our
aim is to reduce the time interval over which the forcing function acts (i.e. reduce a) whilst at the
same time keeping the total effect (i.e. the area under the curve) a constant. To do this we shall
take b = 1/a so that the area is always equal to 1. Reducing the value of a then gives the sequence
of inputs shown in Figure 15.
[Figure 15: rectangular pulses of height 1/a between t = d and t = d + a, drawn for decreasing a]
As the value of a decreases the height of the rectangle increases (to ensure the value of the area
under the curve is fixed at value 1) until, in the limit as a → 0, the ‘function’ becomes a ‘spike’ at
t = d. The resulting function is called a delta function (or impulse function) and denoted by
δ(t − d). This notation is used because, in a very obvious sense, the delta function described here is
‘located’ at t = d. Thus the delta function δ(t − 1) is ‘located’ at t = 1 whilst the delta function
δ(t) is ‘located’ at t = 0.
If we were defining an ordinary function we would write
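\[ \delta(t-d) = \lim_{a\to 0}\frac{1}{a}\left[u(t-d) - u(t-\{d+a\})\right] \]
However, this limit does not exist. The important property of the delta function relates to its integral:
\begin{align*}
\int_{-\infty}^{\infty}\delta(t-d)\,dt &= \lim_{a\to 0}\int_{-\infty}^{\infty}\frac{1}{a}\left[u(t-d) - u(t-\{d+a\})\right]dt = \lim_{a\to 0}\int_{d}^{d+a}\frac{1}{a}\,dt \\
&= \lim_{a\to 0}\left[\frac{d+a}{a} - \frac{d}{a}\right] = 1
\end{align*}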
which is what we expect since the area under each of the limiting curves is equal to 1.
A more technical discussion obtains the more general result:
Key Point 11
Sifting Property of the Delta Function
\[ \int_{-\infty}^{\infty} f(t)\,\delta(t-d)\,dt = f(d) \]
This is called the sifting property of the delta function as it sifts out the value f (d) from the
function f (t). Although the integral here ranges from t = −∞ to t = +∞ in fact the same result
is obtained for any range if the range of the integral includes the point t = d. That is, if α ≤ d ≤ β
then
\[ \int_{\alpha}^{\beta} f(t)\,\delta(t-d)\,dt = f(d) \]
Thus, as long as the delta function is ‘located’ within the range of the integral the sifting property
holds. For example,
\[ \int_{1}^{2}\sin t\;\delta(t-1.1)\,dt = \sin 1.1 = 0.8912 \qquad \int_{0}^{\infty} e^{-t}\,\delta(t-1)\,dt = e^{-1} = 0.3679 \]
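The sifting property can be illustrated numerically by replacing the delta function with the narrow rectangular pulse of width $a$ and height $1/a$ used to define it. The sketch below is illustrative only; the function names `pulse` and `integrate_against_pulse` and the midpoint-rule resolution `n` are assumptions, not part of the Workbook.

```python
import math

def pulse(t, d, a):
    """Rectangular pulse of width a and height 1/a switching on at t = d."""
    return 1.0 / a if d <= t < d + a else 0.0

def integrate_against_pulse(f, d, a, n=1000):
    """Midpoint-rule estimate of the integral of f(t) * pulse(t, d, a).

    The pulse vanishes outside [d, d + a], so only that interval is sampled.
    """
    h = a / n
    total = 0.0
    for i in range(n):
        t = d + (i + 0.5) * h
        total += f(t) * pulse(t, d, a)
    return total * h

# As a shrinks, the integral approaches f(d) = sin(1.1).
for a in (0.5, 0.1, 0.01, 0.001):
    print(a, integrate_against_pulse(math.sin, 1.1, a))
print("sin(1.1) =", math.sin(1.1))
```

For a pulse of finite width the integral is the average of $f$ over $[d, d+a]$, which is why the printed values drift towards $f(d)$ as $a$ is reduced.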
Task
Write expressions for delta functions located at t = −1.7 and at t = 2.3
Your solution
Answer
δ(t + 1.7) and δ(t − 2.3)
Task
Evaluate the integral $\displaystyle\int_{-1}^{3}\big(\sin t\;\delta(t+2) - \cos t\;\delta(t)\big)\,dt$
Your solution
Answer
You should obtain the value −1 since the first delta function, δ(t + 2), is located outside the range
of integration and thus
\[ \int_{-1}^{3}\big(\sin t\;\delta(t+2) - \cos t\;\delta(t)\big)\,dt = -\int_{-1}^{3}\cos t\;\delta(t)\,dt = -\cos 0 = -1 \]
The Laplace transform of the delta function

Here we consider L{δ(t − d)}. From the definition of the Laplace transform:
\[ \mathcal{L}\{\delta(t-d)\} = \int_0^\infty e^{-st}\,\delta(t-d)\,dt = e^{-sd} \]
by the sifting property of the delta function. Thus
Key Point 12
Laplace Transform of the Delta Function
$\mathcal{L}\{\delta(t-d)\} = e^{-sd}$ and, putting $d = 0$, $\mathcal{L}\{\delta(t)\} = e^{0} = 1$
Exercise
Find the Laplace transform of $3\delta(t-3)$.
Answer
$3e^{-3s}$
Solving Differential Equations 20.4
Introduction
In this Section we employ the Laplace transform to solve constant coefficient ordinary differential
equations. In particular we shall consider initial value problems. We shall find that the initial
conditions are automatically included as part of the solution process. The idea is simple; the Laplace
transform of each term in the differential equation is taken. If the unknown function is y(t) then, on
taking the transform, an algebraic equation involving Y (s) = L{y(t)} is obtained. This equation is
solved for Y (s) which is then inverted to produce the required solution y(t) = L−1 {Y (s)}.
Prerequisites
Before starting this Section you should . . .
• understand how to find Laplace transforms of simple functions and of their derivatives
• be able to find inverse Laplace transforms using a variety of techniques
• know what an initial-value problem is
Learning Outcomes
• solve initial-value problems using the Laplace transform method
1. Solving ODEs using Laplace transforms
We begin with a straightforward initial value problem involving a first order constant coefficient
differential equation. Let us find the solution of
\[ \frac{dy}{dt} + 2y = 12e^{3t} \qquad y(0) = 3 \]
using the Laplace transform approach.
Although it is not stated explicitly we shall assume that $y(t)$ is a causal function (we have no interest
in the value of $y(t)$ if $t < 0$). Similarly, the function on the right-hand side of the differential equation
($12e^{3t}$), the ‘forcing function’, will be assumed to be causal. (Strictly, we should write $12e^{3t}u(t)$ but
the step function $u(t)$ will often be omitted.) Let us write $\mathcal{L}\{y(t)\} = Y(s)$. Then, taking the Laplace
transform of every term in the differential equation gives:
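\[ \mathcal{L}\left\{\frac{dy}{dt}\right\} + \mathcal{L}\{2y\} = \mathcal{L}\{12e^{3t}\} \]
Now
\[ \mathcal{L}\left\{\frac{dy}{dt}\right\} = -y(0) + sY(s) = -3 + sY(s) \]
\[ \mathcal{L}\{2y\} = 2Y(s) \qquad \text{and} \qquad \mathcal{L}\{12e^{3t}\} = \frac{12}{s-3} \]
Substituting these expressions into the transformed version of the differential equation gives:
\[ \left[-3 + sY(s)\right] + 2Y(s) = \frac{12}{s-3} \]
Solving for $Y(s)$ we have
\[ (s+2)Y(s) = \frac{12}{s-3} + 3 = \frac{3+3s}{s-3} \]
Therefore
\[ Y(s) = \frac{3(s+1)}{(s+2)(s-3)} \]
Now, using partial fractions, this last expression can be written in a more convenient form:
\[ Y(s) = \frac{3/5}{(s+2)} + \frac{12/5}{(s-3)} \]
and then, inverting:
\[ y(t) = \mathcal{L}^{-1}\{Y(s)\} = \frac{3}{5}\mathcal{L}^{-1}\left\{\frac{1}{s+2}\right\} + \frac{12}{5}\mathcal{L}^{-1}\left\{\frac{1}{s-3}\right\} \]
thus
\[ y(t) = \frac{3}{5}e^{-2t}u(t) + \frac{12}{5}e^{3t}u(t) \]
This is the solution to the given initial value problem.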
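It is good practice to substitute a solution back into the original problem. The sketch below (an illustrative check, with function names chosen here rather than taken from the Workbook) evaluates the residual $dy/dt + 2y - 12e^{3t}$ for $t \ge 0$ using the derivative worked out by hand, and confirms the initial condition.

```python
import math

def y(t):
    """Solution found above, for t >= 0."""
    return 0.6 * math.exp(-2 * t) + 2.4 * math.exp(3 * t)

def dy(t):
    """Derivative of y(t), obtained by differentiating term by term."""
    return -1.2 * math.exp(-2 * t) + 7.2 * math.exp(3 * t)

def residual(t):
    """Left-hand side minus right-hand side of dy/dt + 2y = 12 e^{3t}."""
    return dy(t) + 2 * y(t) - 12 * math.exp(3 * t)

print("y(0) =", y(0.0))
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, residual(t))
```

The residuals should be zero up to floating-point rounding, and `y(0)` should print a value equal to 3 to within rounding.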
Task
The equation governing the build up of charge, $q(t)$, on the capacitor of an RC
circuit is
\[ R\frac{dq}{dt} + \frac{1}{C}q = v_0 \]
where v0 is the constant d.c. voltage. Initially, the circuit is relaxed and the circuit
is then ‘closed’ at t = 0 and so q(0) = 0 is the initial condition for the charge.
Use the Laplace transform method to solve the differential equation for q(t).
Assume the forcing term v0 is causal.
Begin by finding an expression for Q(s) = L{q(t)}:

Your solution
Answer
$Q(s) = \dfrac{v_0C}{s(RCs+1)}$ since, taking the Laplace transform of each term in the differential equation:
\[ R\,\mathcal{L}\left\{\frac{dq}{dt}\right\} + \frac{1}{C}\mathcal{L}\{q\} = \mathcal{L}\{v_0\} \]
i.e.
\[ R\left[-q(0) + sQ(s)\right] + \frac{1}{C}Q(s) = \frac{v_0}{s} \]
where, we emphasize, the Laplace transform of the constant term $v_0$ is $\dfrac{v_0}{s}$.
Inserting $q(0) = 0$ we have, after some rearrangement,
\[ Q(s) = \frac{v_0C}{s(RCs+1)} \]
Now expand the expression using partial fractions:
Your solution
Answer
You should obtain $Q(s) = v_0C\left[\dfrac{1}{s} - \dfrac{RC}{RCs+1}\right]$
Now obtain q(t) by taking inverse Laplace transforms:
Your solution
Answer
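$q(t) = v_0C\left(1 - e^{-t/RC}\right)u(t)$ since
\[ \mathcal{L}^{-1}\left\{\frac{1}{s}\right\} = 1 \qquad \text{and} \qquad \mathcal{L}^{-1}\left\{\frac{RC}{RCs+1}\right\} = \mathcal{L}^{-1}\left\{\frac{1}{s+(1/RC)}\right\} = e^{-t/RC} \]
The solution to this problem is illustrated in the following diagram.
[Figure: graph of q(t) against t, with the level v_0 C marked on the vertical axis]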
The Laplace transform method is also applied to higher-order differential equations in a similar way.
Example 1
Solve the second-order initial-value problem:
\[ \frac{d^2y}{dt^2} + 2\frac{dy}{dt} + 2y = e^{-t} \qquad y(0) = 0, \quad y'(0) = 0 \]
using the Laplace transform method.
Solution
As usual we shall assume the forcing function is causal (i.e. is really $e^{-t}u(t)$). Taking the Laplace
transform of each term:
\[ \mathcal{L}\left\{\frac{d^2y}{dt^2}\right\} + 2\mathcal{L}\left\{\frac{dy}{dt}\right\} + 2\mathcal{L}\{y\} = \mathcal{L}\{e^{-t}\} \]
that is,
\[ \left[-y'(0) - sy(0) + s^2Y(s)\right] + 2\left[-y(0) + sY(s)\right] + 2Y(s) = \frac{1}{s+1} \]
Inserting the initial conditions and rearranging:
\[ Y(s)\left[s^2+2s+2\right] = \frac{1}{s+1} \qquad \text{i.e.} \qquad Y(s) = \frac{1}{(s+1)(s^2+2s+2)} \]
Then, using partial fractions:
\[ \frac{1}{(s+1)(s^2+2s+2)} \equiv \frac{1}{s+1} - \frac{s+1}{s^2+2s+2} \equiv \frac{1}{s+1} - \frac{s+1}{(s+1)^2+1} \]
where we have completed the square in the second term of the right-hand side. We can now take
the inverse Laplace transform:
\begin{align*}
y(t) = \mathcal{L}^{-1}\{Y(s)\} &= \mathcal{L}^{-1}\left\{\frac{1}{s+1}\right\} - \mathcal{L}^{-1}\left\{\frac{s+1}{(s+1)^2+1}\right\} \\
&= \left(e^{-t} - e^{-t}\cos t\right)u(t)
\end{align*}
which is the solution to the initial value problem.
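An independent check on a Laplace transform solution is to integrate the initial value problem numerically and compare. The sketch below uses the classical fourth-order Runge-Kutta method, which is a substituted technique and not part of the Workbook's method; the function names and the step count `n` are illustrative assumptions.

```python
import math

def rhs(t, y, v):
    """First-order form of y'' + 2y' + 2y = e^{-t}: y' = v, v' = e^{-t} - 2v - 2y."""
    return v, math.exp(-t) - 2 * v - 2 * y

def rk4_solve(t_end, n=2000):
    """Integrate from t = 0 with y(0) = 0, y'(0) = 0 and return y(t_end)."""
    h = t_end / n
    t, y, v = 0.0, 0.0, 0.0
    for _ in range(n):
        k1y, k1v = rhs(t, y, v)
        k2y, k2v = rhs(t + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
        k3y, k3v = rhs(t + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
        k4y, k4v = rhs(t + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return y

def exact(t):
    """Solution obtained by the Laplace transform method, for t >= 0."""
    return math.exp(-t) * (1 - math.cos(t))

for t_end in (1.0, 3.0, 6.0):
    print(t_end, rk4_solve(t_end), exact(t_end))
```

Close agreement between the two columns gives some confidence in both the partial fractions and the inversion.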
Exercises
Use Laplace transforms to solve:
1. $\dfrac{dx}{dt} + x = 9e^{2t}$, $x(0) = 3$
2. $\dfrac{d^2x}{dt^2} + x = 2t$, $x(0) = 0$, $x'(0) = 5$
Answers 1. $x(t) = 3e^{2t}$ 2. $x(t) = 3\sin t + 2t$
Example 2
A damped spring, constrained to move in one direction, such as might be found
in a railway buffer, is subjected to an impulse of duration 5 seconds. The spring
constant divided by the mass causing the impulse is 10 m−2 s−2 and the frictional
force divided by this mass is 2 m−2 s−2 .
(a) Write down the equation governing the motion in terms of the displace-
ment x m and time t seconds including the impulse u(t).
(b) Write down the initial conditions on the displacement (x) and velocity.
(c) Solve the equation for displacement as a function of time.
(d) Draw a graph of the oscillations for t = 0 to 10 s.
Solution
(a) Since the system involves a restoring force and friction, after dividing through by the
mass, the equation of motion may be written:
\[ \frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 10x = u(t) - u(t-5) \]
where the right-hand side represents the impulse being switched on at t = 0 s and
switched off at t = 5 s.
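(b) Since the system starts from rest, $x(0) = x'(0) = 0$.
(c) Taking the Laplace transform of each term of the differential equation gives
\[ \mathcal{L}\left[\frac{d^2x}{dt^2}\right] + 2\mathcal{L}\left[\frac{dx}{dt}\right] + 10\mathcal{L}[x] = \mathcal{L}[u(t)] - \mathcal{L}[u(t-5)] \]
i.e.
\[ s^2\bar{x}(s) - s\,x(0) - x'(0) + 2\big(s\,\bar{x}(s) - x(0)\big) + 10\bar{x}(s) = \frac{1}{s} - \frac{1}{s}e^{-5s} \]
but as $x(0) = x'(0) = 0$, this simplifies to
\[ s^2\bar{x}(s) + 2s\,\bar{x}(s) + 10\bar{x}(s) = \frac{1}{s} - \frac{1}{s}e^{-5s} \]
i.e.
\begin{align*}
\bar{x}(s) &= \frac{1}{s^2+2s+10}\,\frac{1}{s}\left(1 - e^{-5s}\right) \\
&= \left[\frac{1}{10}\,\frac{1}{s} - \frac{1}{10}\,\frac{s+2}{s^2+2s+10}\right]\left(1 - e^{-5s}\right) \\
&= \left[\frac{1}{10}\,\frac{1}{s} - \frac{1}{10}\,\frac{s+1}{(s+1)^2+3^2} - \frac{1}{30}\,\frac{3}{(s+1)^2+3^2}\right]\left(1 - e^{-5s}\right) \\
&= \frac{1}{10}\,\frac{1}{s} - \frac{1}{10}\,\frac{s+1}{(s+1)^2+3^2} - \frac{1}{30}\,\frac{3}{(s+1)^2+3^2} \\
&\quad - \frac{1}{10}\,\frac{1}{s}e^{-5s} + \frac{1}{10}\,\frac{s+1}{(s+1)^2+3^2}e^{-5s} + \frac{1}{30}\,\frac{3}{(s+1)^2+3^2}e^{-5s}
\end{align*}
so, on taking inverse Laplace transforms,
\begin{align*}
x(t) &= \frac{1}{10} - \frac{1}{10}e^{-t}\cos 3t - \frac{1}{30}e^{-t}\sin 3t \\
&\quad - \frac{1}{10}u(t-5) + \frac{1}{10}e^{-(t-5)}\cos 3(t-5)\,u(t-5) + \frac{1}{30}e^{-(t-5)}\sin 3(t-5)\,u(t-5)
\end{align*}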
(d)
[Figure 16: graph of x(t) against t for t = 0 to 10 s; the vertical axis runs from −0.025 to 0.125]
According to the graph the damped spring has a damped oscillation about a displacement of 0.1
m after the start of the impulse and a damped oscillation about a displacement of zero after the
impulse has finished.
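The solution in part (c) is the response to a unit step switched on at $t = 0$ minus the same response delayed by 5 s. A minimal sketch of how it might be tabulated follows; the function names `step_response` and `x` are chosen here for illustration and are not taken from the Workbook.

```python
import math

def step_response(t):
    """Response of x'' + 2x' + 10x = u(t) from rest; zero for t < 0."""
    if t < 0:
        return 0.0
    return 0.1 - math.exp(-t) * (math.cos(3 * t) / 10 + math.sin(3 * t) / 30)

def x(t):
    """Displacement for the forcing u(t) - u(t - 5), by the second shift theorem."""
    return step_response(t) - step_response(t - 5)

# Tabulate the displacement at one-second intervals from t = 0 to t = 10.
for k in range(11):
    print(k, round(x(float(k)), 4))
```

Just before the forcing is removed the displacement is close to 0.1, and by $t = 10$ it has nearly decayed to zero, in line with the interpretation of the graph given above.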
2. Solving systems of differential equations

The Laplace transform method is also well suited to solving systems of differential equations. A
simple example will illustrate the technique.
Let x(t), y(t) be two independent functions which satisfy the coupled differential equations
\[ \frac{dx}{dt} + y = e^{-t} \]
\[ \frac{dy}{dt} - x = 3e^{-t} \]
\[ x(0) = 0, \qquad y(0) = 1 \]
Now, using a traditional approach, we could try to eliminate one of the unknown functions from this
system: for example, from the first:
\[ \frac{dy}{dt} = -e^{-t} - \frac{d^2x}{dt^2} \qquad \text{(taking the derivative and rearranging)} \]
This can then be substituted in the second equation, $\dfrac{dy}{dt} - x = 3e^{-t}$, to give:
\[ -\frac{d^2x}{dt^2} - x = 4e^{-t} \]
which can then be solved in the normal way (either using the complementary function/particular
integral approach or else the Laplace transform approach.) However, this approach is not workable
if we have large numbers of first order differential equations to deal with. Let us instead use the
Laplace transform directly.
If we use the notation that
L{x(t)} = X(s) and L{y(t)} = Y (s)
then, by taking the Laplace transform of every term in the given differential equations, we obtain:
\[ -x(0) + sX(s) + Y(s) = \frac{1}{s+1} \]
\[ -y(0) + sY(s) - X(s) = \frac{3}{s+1} \]
which, using the initial conditions and rearranging gives
\[ sX(s) + Y(s) = \frac{1}{s+1} \]
\[ -X(s) + sY(s) = \frac{s+4}{s+1} \]
Key Point 13
Taking the Laplace transform converts a system of differential equations
into a system of algebraic simultaneous equations.
We can solve these algebraic equations (in X(s) and Y (s)) using a variety of techniques (inverse
matrix; Cramer’s determinant method etc.) Here we will use Cramer’s method.
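\begin{align*}
X(s) = \frac{\begin{vmatrix} \dfrac{1}{s+1} & 1 \\[2mm] \dfrac{s+4}{s+1} & s \end{vmatrix}}{\begin{vmatrix} s & 1 \\ -1 & s \end{vmatrix}}
&= \frac{\dfrac{s}{s+1} - \dfrac{s+4}{s+1}}{s^2+1} \\
&= \frac{-4}{(s+1)(s^2+1)} = \frac{2(s-1)}{s^2+1} - \frac{2}{s+1}
\end{align*}
and
\begin{align*}
Y(s) = \frac{\begin{vmatrix} s & \dfrac{1}{s+1} \\[2mm] -1 & \dfrac{s+4}{s+1} \end{vmatrix}}{\begin{vmatrix} s & 1 \\ -1 & s \end{vmatrix}}
&= \frac{\dfrac{s(s+4)}{s+1} + \dfrac{1}{s+1}}{s^2+1} \\
&= \frac{s^2+4s+1}{(s+1)(s^2+1)} = -\frac{1}{s+1} + \frac{2(s+1)}{s^2+1}
\end{align*}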
The last lines in each case having been obtained using partial fractions. We can now invert X(s), Y (s)
to find x(t), y(t):
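\begin{align*}
x(t) = \mathcal{L}^{-1}\{X(s)\} &= 2\mathcal{L}^{-1}\left\{\frac{s}{s^2+1}\right\} - 2\mathcal{L}^{-1}\left\{\frac{1}{s^2+1}\right\} - 2\mathcal{L}^{-1}\left\{\frac{1}{s+1}\right\} \\
&= (2\cos t - 2\sin t - 2e^{-t})\,u(t)
\end{align*}
\begin{align*}
y(t) = \mathcal{L}^{-1}\{Y(s)\} &= -\mathcal{L}^{-1}\left\{\frac{1}{s+1}\right\} + 2\mathcal{L}^{-1}\left\{\frac{s}{s^2+1}\right\} + 2\mathcal{L}^{-1}\left\{\frac{1}{s^2+1}\right\} \\
&= (-e^{-t} + 2\cos t + 2\sin t)\,u(t)
\end{align*}
(Note that once the solution for $x(t)$ is found the solution for $y(t)$ may be easier to obtain by
substituting in the differential equation: $y = e^{-t} - \dfrac{dx}{dt}$ rather than using Laplace transforms.)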
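The Cramer's rule step and the partial fractions can be checked with exact rational arithmetic by evaluating both forms at a few sample values of $s$. This is an illustrative sketch only; the function names `det2`, `cramer_xy` and `partial_fraction_xy` are assumptions introduced here.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def cramer_xy(s):
    """Solve  s X + Y = 1/(s+1),  -X + s Y = (s+4)/(s+1)  by Cramer's rule."""
    s = Fraction(s)
    r1 = 1 / (s + 1)
    r2 = (s + 4) / (s + 1)
    D = det2(s, 1, -1, s)
    X = det2(r1, 1, r2, s) / D
    Y = det2(s, r1, -1, r2) / D
    return X, Y

def partial_fraction_xy(s):
    """The partial-fraction forms of X(s) and Y(s) quoted in the text."""
    s = Fraction(s)
    X = 2 * (s - 1) / (s**2 + 1) - 2 / (s + 1)
    Y = -1 / (s + 1) + 2 * (s + 1) / (s**2 + 1)
    return X, Y

# Sample values of s (any value other than s = -1 would do).
for s in (2, Fraction(7, 3), 10):
    print(s, cramer_xy(s), partial_fraction_xy(s))
```

Because `Fraction` arithmetic is exact, the two pairs printed on each line should be identical rather than merely close.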
Task
Use the Laplace transform to solve the coupled differential equations:
\[ \frac{dy}{dt} - x = 0, \qquad \frac{dx}{dt} + y = 1, \qquad x(0) = -1, \quad y(0) = 1 \]
Begin by obtaining a system of algebraic equations for X(s) and Y (s):
Your solution
Answer
Writing L{x(t)} = X(s) and L{y(t)} = Y (s) you should obtain the set of transformed equations
\[ -1 + sY(s) - X(s) = 0 \]
\[ 1 + sX(s) + Y(s) = \frac{1}{s} \]
which, when re-arranged, are
\[ -X(s) + sY(s) = 1 \]
\[ sX(s) + Y(s) = \frac{1-s}{s} \]
Now solve these equations for X(s) and Y (s):
Your solution
Answer
\[ X(s) = -\frac{s}{1+s^2} \qquad Y(s) = \frac{1}{s} - \frac{1}{1+s^2} \]
Now find the required solution by obtaining the inverse Laplace transforms:
Your solution
Answer
You should obtain x(t) = − cos t.u(t) and y(t) = (1 − sin t).u(t). This follows since
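\[ \mathcal{L}^{-1}\left\{-\frac{s}{1+s^2}\right\} = -\cos t\,u(t) \qquad \mathcal{L}^{-1}\left\{\frac{1}{s}\right\} = u(t) \qquad \mathcal{L}^{-1}\left\{-\frac{1}{1+s^2}\right\} = -\sin t\,u(t) \]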
Exercises
1. Solve the given system of differential equations for the initial conditions specified.
(a) $\dfrac{dx}{dt} = y$, $\dfrac{dy}{dt} = x$, $x(0) = 1$, $y(0) = 0$
(b) $\dfrac{dx}{dt} = 4x - 2y$, $\dfrac{dy}{dt} = 5x + 2y$, $x(0) = 2$, $y(0) = -2$
2. The Laplace transform can also be used to solve a pair of coupled second order differential
equations.
Solve, for the given initial conditions,
\[ \frac{d^2x}{dt^2} = y + \sin t \qquad x(0) = 1, \quad x'(0) = 0 \]
\[ \frac{d^2y}{dt^2} = -\frac{dx}{dt} + \cos t \qquad y(0) = -1, \quad y'(0) = -1 \]
(Note that the initial conditions on each of $x(t)$ and $y(t)$ are needed in the second order
situation.)
Answer
1. (a) $x = \cosh t$, $y = \sinh t$ (b) $x = e^{3t}(2\cos 3t + 2\sin 3t)$, $y = e^{3t}(-2\cos 3t + 4\sin 3t)$
2. $x = \cos t$, $y = -\cos t - \sin t$
3. Applications of systems of differential equations
Coupled electrical circuits and mechanical vibrating systems involving several masses in springs offer
examples of engineering systems modelled by systems of differential equations.
Electrical circuits
Consider the RL (resistance/inductance) circuit with a voltage v(t) applied as shown in Figure 17.
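[Figure 17: two-loop RL circuit with resistances R1 and R2, inductances L1 and L2, loop currents i1 and i2, and applied voltage v(t)]
If $i_1$ and $i_2$ denote the currents in each loop we obtain, using Kirchhoff’s voltage law:
(i) in the right loop: $L_1\dfrac{di_1}{dt} + R_2(i_1 - i_2) + R_1i_1 = v(t)$
(ii) in the left loop: $L_2\dfrac{di_2}{dt} + R_2(i_2 - i_1) = 0$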
Task
Suppose, in the above circuit, that

L1 = 0.8 henry, L2 = 1 henry, R1 = 1.4 Ω R2 = 1 Ω.
Assume zero initial conditions: i1 (0) = i2 (0) = 0.
Suppose that the applied voltage is constant: v(t) = 100 volts t ≥ 0.
Solve the problem by Laplace transforms.
Begin by obtaining V (s), the Laplace transform of v(t):
Your solution
Answer
We have, from the definition of the Laplace transform:
\[ V(s) = \int_0^\infty 100e^{-st}\,dt = 100\left[\frac{e^{-st}}{-s}\right]_0^\infty = \frac{100}{s} \]
This is simply the Laplace transform of the step function of height 100.
Now insert the parameter values into the differential equations and obtain the Laplace transform of
each equation. Denote by I1 (s), I2 (s) the Laplace transforms of the unknown currents. (These are
equivalent to X(s) and Y (s) of the theory.):
Your solution
Answer
\[ 0.8\frac{di_1}{dt} + i_1 - i_2 + 1.4i_1 = v(t) \]
\[ \frac{di_2}{dt} + i_2 - i_1 = 0 \]
Rearranging and dividing the first equation by 0.8:
\[ \frac{di_1}{dt} + 3i_1 - 1.25i_2 = 1.25v(t) \]
\[ \frac{di_2}{dt} - i_1 + i_2 = 0 \]
Taking Laplace transforms and inserting the initial conditions $i_1(0) = 0$, $i_2(0) = 0$:
\[ (s+3)I_1(s) - 1.25I_2(s) = \frac{125}{s} \]
\[ -I_1(s) + (s+1)I_2(s) = 0 \]
Now solve these equations for I1 (s) and I2 (s). Put each expression into partial fractions and finally
take the inverse Laplace transform to obtain i1 (t) and i2 (t):
Your solution
Answer
We find
\[ I_1(s) = \frac{125(s+1)}{s(s+1/2)(s+7/2)} = \frac{500}{7s} - \frac{125}{3(s+1/2)} - \frac{625}{21(s+7/2)} \]
in partial fractions.
Hence
\[ i_1(t) = \frac{500}{7} - \frac{125}{3}e^{-t/2} - \frac{625}{21}e^{-7t/2} \]
Similarly
\[ I_2(s) = \frac{125}{s(s+1/2)(s+7/2)} = \frac{500}{7s} - \frac{250}{3(s+1/2)} + \frac{250}{21(s+7/2)} \]
which has inverse Laplace transform:
\[ i_2(t) = \frac{500}{7} - \frac{250}{3}e^{-t/2} + \frac{250}{21}e^{-7t/2} \]
Notice in both cases that $i_1(t)$ and $i_2(t)$ tend to the steady state value $\dfrac{500}{7}$ as $t$ increases.
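Both currents are combinations of $1$, $e^{-t/2}$ and $e^{-7t/2}$, so the solution can be checked exactly by working with the three coefficients. The sketch below does this with rational arithmetic; the representation of each current as a coefficient tuple and the helper names `derivative` and `combine` are assumptions made for this illustration.

```python
from fractions import Fraction as Fr

# Each current is stored as its coefficients of (1, e^{-t/2}, e^{-7t/2}).
RATES = (Fr(0), Fr(-1, 2), Fr(-7, 2))
i1 = (Fr(500, 7), Fr(-125, 3), Fr(-625, 21))
i2 = (Fr(500, 7), Fr(-250, 3), Fr(250, 21))

def derivative(c):
    """Differentiating multiplies each coefficient by its exponential rate."""
    return tuple(r * ck for r, ck in zip(RATES, c))

def combine(*terms):
    """Linear combination of coefficient tuples given as (scalar, tuple) pairs."""
    return tuple(sum(k * c[j] for k, c in terms) for j in range(3))

# di1/dt + 3 i1 - 1.25 i2 should equal 1.25 * 100 = 125 (a constant).
res1 = combine((1, derivative(i1)), (3, i1), (Fr(-5, 4), i2))
# di2/dt + i2 - i1 should vanish identically.
res2 = combine((1, derivative(i2)), (1, i2), (-1, i1))

print("first equation :", res1)
print("second equation:", res2)
print("i1(0), i2(0)   :", sum(i1), sum(i2))
```

The constant coefficient of each current is the steady state value $500/7$, and the coefficients sum to zero, which reproduces the zero initial conditions.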
Two masses on springs
Consider the vibrating system shown:
[Figure 18: two masses m joined in a line by three springs of stiffness k, with displacements y1 and y2]
As you can see, the system consists of two equal masses, both m, and 3 springs of the same stiffness
k. The governing differential equations can be obtained by applying Newton’s second law (‘force
equals mass times acceleration’): (recall that a single spring of stiffness k will experience a force −ky
if it is displaced a distance y from its equilibrium.)
In our system therefore
\[ m\frac{d^2y_1}{dt^2} = -ky_1 + k(y_2 - y_1) \]
\[ m\frac{d^2y_2}{dt^2} = -k(y_2 - y_1) - ky_2 \]
which is a pair of second order differential equations.
Task
For the above system, if m = 1, k = 2 and the initial conditions are
\[ y_1(0) = 1 \qquad y_1'(0) = \sqrt{6} \qquad y_2(0) = 1 \qquad y_2'(0) = -\sqrt{6} \]
use Laplace transforms to solve the system of differential equations to find y1 (t)
and y2 (t).
Begin by letting Y1 (s), Y2 (s) be the Laplace transforms of y1 (t), y2 (t) respectively and take the
transforms of the differential equations, inserting the initial conditions:
Your solution
Answer
\[ (s^2+4)Y_1 - 2Y_2 = s + \sqrt{6} \]
\[ -2Y_1 + (s^2+4)Y_2 = s - \sqrt{6} \]
Solve these equations (e.g. by Cramer’s rule or by Gauss elimination) then use partial fractions and
finally take inverse Laplace transforms:
Your solution
(Perform the calculation on separate paper and summarise the results here.)
Answer
\[ Y_1(s) = \frac{(s+\sqrt{6})(s^2+4) + 2(s-\sqrt{6})}{(s^2+4)^2-4} = \frac{s}{s^2+2} + \frac{\sqrt{6}}{s^2+6} \]
from which $y_1(t) = \cos\sqrt{2}\,t + \sin\sqrt{6}\,t$
A similar calculation gives $y_2(t) = \cos\sqrt{2}\,t - \sin\sqrt{6}\,t$
We see that the motion of each mass is composed of two harmonic oscillations; the system model
was undamped so, on this model, the vibration continues indefinitely.
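For completeness, the calculation for $Y_2(s)$ can be sketched as follows. It applies Cramer's rule to the transformed equations in the Answer above and uses the factorisation $(s^2+4)^2 - 4 = (s^2+2)(s^2+6)$; the intermediate grouping of terms is one possible route, not necessarily the one intended by the authors.

```latex
\begin{align*}
Y_2(s) &= \frac{(s^2+4)(s-\sqrt{6}) + 2(s+\sqrt{6})}{(s^2+4)^2-4}
        = \frac{s^3 - \sqrt{6}\,s^2 + 6s - 2\sqrt{6}}{(s^2+2)(s^2+6)} \\
       &= \frac{s(s^2+6) - \sqrt{6}\,(s^2+2)}{(s^2+2)(s^2+6)}
        = \frac{s}{s^2+2} - \frac{\sqrt{6}}{s^2+6}
\end{align*}
```

Inverting term by term gives $y_2(t) = \cos\sqrt{2}\,t - \sin\sqrt{6}\,t$, in agreement with the result quoted above.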
Charge on a capacitor
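In the circuit shown in Figure 19, the switch S is closed at $t = 0$ with a capacitor charge $q(0) = q_0$ =
constant and $\dfrac{dq}{dt}(0) = 0$.
[Figure 19: series circuit with resistance R between D and B, inductance L between B and A, capacitor C carrying charge q(t) between D and F, and switch S between F and A]
Show that $q(t) = q_0e^{-\alpha t}\left[\cos\omega t + \dfrac{\alpha}{\omega}\sin\omega t\right]$ where $\alpha = \dfrac{R}{2L}$ and $\omega^2 = \dfrac{1}{LC} - \alpha^2$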
Laplace transform properties required
The following properties are needed to solve this problem.
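\[ F(s+a) = \mathcal{L}\{e^{-at}f(t)\} \qquad \text{(P1)} \]
\[ \mathcal{L}\left\{\frac{df(t)}{dt}\right\} = s\,\mathcal{L}\{f(t)\} - f(0) \qquad \text{(P2)} \]
\[ \mathcal{L}\left\{\frac{d^2f(t)}{dt^2}\right\} = s^2\mathcal{L}\{f(t)\} - \frac{df}{dt}(0) - s\,f(0) \qquad \text{(P3)} \]
\[ \mathcal{L}\{\sin kt\} = \frac{k}{s^2+k^2} \quad \text{with } s > 0 \qquad \text{(P4)} \]
\[ \mathcal{L}\{\cos kt\} = \frac{s}{s^2+k^2} \quad \text{with } s > 0 \qquad \text{(P5)} \]
\[ \mathcal{L}^{-1}\{\mathcal{L}\{f(t)\}\} = f(t) \qquad \text{(P6)} \]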
STEP 1 Establish the differential equation for q(t) using, for example, Kirchhoff’s law.
Solution
When the switch S is closed, the inductance L, capacitance C and resistance R give rise to a.c.
voltages related by
\[ V_A - V_B = L\frac{di}{dt}, \qquad V_B - V_D = R\,i, \qquad V_D - V_F = q/C \qquad \text{respectively.} \]
So since $V_A - V_F = (V_A - V_B) + (V_B - V_D) + (V_D - V_F) = 0$ and $i = \dfrac{dq}{dt}$ we have
\[ L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = 0 \qquad (1) \]
STEP 2 Write the Laplace transform of the differential equation substituting for the initial
conditions:
Solution
Since the Laplace transform is linear, the transform of differential Equation (1) is
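\[ \mathcal{L}\left\{L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C}\right\} = L\,\mathcal{L}\left\{\frac{d^2q}{dt^2}\right\} + R\,\mathcal{L}\left\{\frac{dq}{dt}\right\} + \mathcal{L}\left\{\frac{q}{C}\right\} = 0. \qquad (2) \]
We deal with each derivative term in turn. Using property (P3),
\[ \mathcal{L}\left\{\frac{d^2q}{dt^2}\right\} = s^2\mathcal{L}\{q(t)\} - \frac{dq}{dt}(0) - s\,q(0). \]
So, using the initial conditions $q(0) = q_0$ and $\dfrac{dq}{dt}(0) = 0$
\[ \mathcal{L}\left\{\frac{d^2q}{dt^2}\right\} = s^2\mathcal{L}\{q(t)\} - s\,q_0. \qquad (3) \]
By means of property (P2)
\[ \mathcal{L}\left\{\frac{dq}{dt}\right\} = s\,\mathcal{L}\{q(t)\} - q_0 \qquad (4) \]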
STEP 3 Solve for the function L{q(t)} by substituting from (3) and (4) into Equation (2):
Solution
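\[ L\left[s^2\mathcal{L}\{q(t)\} - sq_0\right] + R\left[s\,\mathcal{L}\{q(t)\} - q_0\right] + \frac{1}{C}\mathcal{L}\{q(t)\} = 0 \]
\[ \Rightarrow \quad \mathcal{L}\{q(t)\}\left[Ls^2 + Rs + \frac{1}{C}\right] = Lsq_0 + Rq_0 \]
\[ \Rightarrow \quad \mathcal{L}\{q(t)\} = \frac{(Ls+R)}{\left(Ls^2 + Rs + \dfrac{1}{C}\right)}\,q_0 \qquad (5) \]
Using the definitions $\alpha = \dfrac{R}{2L}$ and $\omega^2 = \dfrac{1}{LC} - \alpha^2$ enables the denominator in Equation (5) to be
expressed as the sum of two squares,
\begin{align*}
Ls^2 + Rs + \frac{1}{C} &= L\left[s^2 + \frac{Rs}{L} + \frac{1}{LC}\right] = L\left[s^2 + 2\alpha s + \frac{1}{LC}\right] \\
&= L\left[s^2 + 2\alpha s + \alpha^2 + \omega^2\right] = L\left[\{s+\alpha\}^2 + \omega^2\right].
\end{align*}
Consequently, with the new expression for the denominator, Equation (5) becomes
\[ \mathcal{L}\{q(t)\} = q_0\left[\frac{s}{(s+\alpha)^2+\omega^2} + \frac{R}{L}\,\frac{1}{(s+\alpha)^2+\omega^2}\right]. \qquad (6) \]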
STEP 4 Use the inverse Laplace transform to obtain q(t):
Solution
The inverse Laplace transform is used to find q(t).
Taking the inverse Laplace transform of Equation (6) and using the linearity properties

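\[ \mathcal{L}^{-1}\{\mathcal{L}\{q(t)\}\} = q_0\,\mathcal{L}^{-1}\left\{\frac{s}{(s+\alpha)^2+\omega^2} + \frac{R}{L}\,\frac{1}{(s+\alpha)^2+\omega^2}\right\}. \]
Using property (P6) this can be written as
\[ q(t) = q_0\,\mathcal{L}^{-1}\left\{\frac{s+\alpha}{(s+\alpha)^2+\omega^2} + \frac{-\alpha}{(s+\alpha)^2+\omega^2} + \frac{R}{L\omega}\,\frac{\omega}{(s+\alpha)^2+\omega^2}\right\}. \]
Using the linearity of the Laplace transform again
\[ q(t) = q_0\left[\mathcal{L}^{-1}\left\{\frac{s+\alpha}{(s+\alpha)^2+\omega^2}\right\} + \mathcal{L}^{-1}\left\{\frac{-\alpha}{(s+\alpha)^2+\omega^2}\right\} + \mathcal{L}^{-1}\left\{\frac{R}{L\omega}\,\frac{\omega}{(s+\alpha)^2+\omega^2}\right\}\right]. \qquad (7) \]
Using properties (P1) and (P5)
\[ \mathcal{L}^{-1}\left\{\frac{s+\alpha}{(s+\alpha)^2+\omega^2}\right\} = e^{-\alpha t}\cos\omega t. \qquad (8) \]
Similarly,
\[ \mathcal{L}^{-1}\left\{\frac{-\alpha}{(s+\alpha)^2+\omega^2}\right\} = -\left(\frac{\alpha}{\omega}\right)\{e^{-\alpha t}\sin\omega t\} \qquad (9) \]
and
\[ \mathcal{L}^{-1}\left\{\frac{R}{L\omega}\,\frac{\omega}{(s+\alpha)^2+\omega^2}\right\} = \left(\frac{R}{L\omega}\right)e^{-\alpha t}\sin\omega t. \qquad (10) \]
Substituting (8), (9) and (10) in (7) gives
\[ q(t) = q_0\left\{e^{-\alpha t}\cos\omega t + \left[-\frac{\alpha}{\omega} + \frac{R}{L\omega}\right]e^{-\alpha t}\sin\omega t\right\}. \qquad (11) \]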
STEP 5 Finally, show that for t > 0 the solution is

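\[ q(t) = q_0e^{-\alpha t}\left[\cos\omega t + \left(\frac{\alpha}{\omega}\right)\sin\omega t\right] \quad \text{where } \alpha = \frac{R}{2L} \text{ and } \omega^2 = \frac{1}{LC} - \alpha^2. \]
Solution
Substituting $\alpha = \dfrac{R}{2L}$ in (11) gives
\begin{align*}
q(t) &= q_0\left\{e^{-\alpha t}\cos\omega t + \left[-\frac{\alpha}{\omega} + \frac{2\alpha}{\omega}\right]e^{-\alpha t}\sin\omega t\right\} \\
&= q_0e^{-\alpha t}\left[\cos\omega t + \frac{\alpha}{\omega}\sin\omega t\right]
\end{align*}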
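As a consistency check that is not part of the original steps, the result can be tested against the two initial conditions. Setting $t = 0$ gives $q(0) = q_0$ at once, and differentiating by the product rule gives the following (using $\alpha^2 + \omega^2 = 1/(LC)$ from the definition of $\omega$):

```latex
\begin{align*}
\frac{dq}{dt} &= q_0 e^{-\alpha t}\left[-\alpha\cos\omega t - \frac{\alpha^2}{\omega}\sin\omega t
                 - \omega\sin\omega t + \alpha\cos\omega t\right] \\
              &= -q_0\,\frac{\alpha^2+\omega^2}{\omega}\,e^{-\alpha t}\sin\omega t
               = -\frac{q_0}{LC\,\omega}\,e^{-\alpha t}\sin\omega t
\end{align*}
```

This vanishes at $t = 0$, so the condition $\dfrac{dq}{dt}(0) = 0$ is satisfied as required.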
Deflection of a uniformly loaded beam
Introduction
A uniformly loaded beam of length L is supported at both ends. The deflection y(x) is a function
of horizontal position x and obeys the differential equation
\[ \frac{d^4y}{dx^4}(x) = \frac{1}{EI}\,q(x) \qquad (1) \]
where E is Young’s modulus, I is the moment of inertia and q(x) is the load per unit length at
point x. We assume in this problem that q(x) = q (a constant). The boundary conditions are (i) no
deflection at x = 0 and x = L (ii) no curvature of the beam at x = 0 and x = L.
[Figure 20: the loaded beam of length L, showing the load q, the ground, the horizontal coordinate x and the deflection y(x)]
Problem in words
In addition to being subject to a uniformly distributed load, a beam is supported so that there is no
deflection and no curvature of the beam at its ends. Applying a Laplace Transform to the differential
equation (1), find the deflection of the beam as function of horizontal position along the beam.
Mathematical formulation of the problem

Find the equation of the curve $y(x)$ assumed by the bending beam that solves (1). Use the coordinate
system shown in Figure 20 where the origin is at the left extremity of the beam. In this coordinate
system, the mathematical formulations of the boundary conditions which require that there is no
deflection at $x = 0$ and $x = L$, and that there is no curvature of the beam at $x = 0$ and $x = L$, are
(a) $y(0) = 0$
(b) $y(L) = 0$
(c) $\left.\dfrac{d^2y}{dx^2}\right|_{x=0} = 0$
(d) $\left.\dfrac{d^2y}{dx^2}\right|_{x=L} = 0$
Note that $\dfrac{dy(x)}{dx}$ and $\dfrac{d^2y(x)}{dx^2}$ are respectively the slope and, for small deflections, the curvature of the curve at
point $(x, y)$.
The following Laplace transform properties are needed:

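\[ \mathcal{L}\left\{\frac{d^nf(t)}{dt^n}\right\} = s^nF(s) - \sum_{k=1}^{n} s^{k-1}\left.\frac{d^{n-k}f}{dt^{n-k}}\right|_{t=0} \qquad \text{(P1)} \]
\[ \mathcal{L}\{1\} = 1/s \qquad \text{(P2)} \]
\[ \mathcal{L}\{t^n\} = n!/s^{n+1} \qquad \text{(P3)} \]
\[ \mathcal{L}^{-1}\{\mathcal{L}\{f(t)\}\} = f(t) \qquad \text{(P4)} \]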

To solve a differential equation involving the unknown function f (t) using Laplace transforms
(a) Write the Laplace transform of the differential equation using property (P1)
(b) Solve for the function L {f (t)} using properties (P2) and (P3)
(c) Use the inverse Laplace transform to obtain f (t) using property (P4)
Using the linearity properties of the Laplace transform, (1) becomes

\[ \mathcal{L}\left\{\frac{d^4y}{dx^4}(x)\right\} - \mathcal{L}\left\{\frac{q}{EI}\right\} = 0. \]
Using (P1) and (P2)

\[ s^4\mathcal{L}\{y(x)\} - \sum_{k=1}^{4} s^{k-1}\left.\frac{d^{4-k}y}{dx^{4-k}}\right|_{x=0} - \frac{q}{EI}\,\frac{1}{s} = 0. \qquad (2) \]
The four terms of the sum are

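\[ \sum_{k=1}^{4} s^{k-1}\left.\frac{d^{4-k}y}{dx^{4-k}}\right|_{x=0} = \left.\frac{d^3y}{dx^3}\right|_{x=0} + s\left.\frac{d^2y}{dx^2}\right|_{x=0} + s^2\left.\frac{dy}{dx}\right|_{x=0} + s^3y(0). \]
The boundary conditions give $y(0) = 0$ and $\left.\dfrac{d^2y}{dx^2}\right|_{x=0} = 0$. So (2) becomes
\[ s^4\mathcal{L}\{y(x)\} - \left.\frac{d^3y}{dx^3}\right|_{x=0} - s^2\left.\frac{dy}{dx}\right|_{x=0} - \frac{q}{EI}\,\frac{1}{s} = 0. \qquad (3) \]
Here $\left.\dfrac{d^3y}{dx^3}\right|_{x=0}$ and $\left.\dfrac{dy}{dx}\right|_{x=0}$ are unknown constants, but they can be determined by using the remaining
two boundary conditions $y(L) = 0$ and $\left.\dfrac{d^2y}{dx^2}\right|_{x=L} = 0$.
Solving for $\mathcal{L}\{y(x)\}$, (3) leads to
\[ \mathcal{L}\{y(x)\} = \frac{1}{s^4}\left.\frac{d^3y}{dx^3}\right|_{x=0} + \frac{1}{s^2}\left.\frac{dy}{dx}\right|_{x=0} + \frac{q}{EI}\,\frac{1}{s^5}. \]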
Using the linearity of the Laplace transform, the inverse Laplace transform of this equation gives

3

−1 dy
−1 1 dy −1 1 q −1 1
L {L{y(x)}} = 3 ×L 4
+ ×L 2
+ L .
dx s dx s EI s5

x=0 x=0
Hence

d3 y

−1 1 dy −1 1 q −1 1
y(x) = 3 ×L 3! 4 /3! + ×L + L 4! 5 /4!

dx s dx s2 EI s

x=0 x=0
So using (P3)

d3 y dy q −1
y(x) = 3 × L−1 {L{x3 }}/6 + × L−1 {L{x1 }} + L {L{x4 }}/24.

dx dx EI

x=0 x=0
Simplifying by means of (P4)

d3 y dy q 4
y(x) = 3 × x3 /6 + ×x + x /24. (4)

dx dx EI

x=0 x=0

To use the boundary condition $\left.\dfrac{d^2y}{dx^2}\right|_{x=L} = 0$, take the second derivative of (4), to obtain
$$\frac{d^2y}{dx^2}(x) = \left.\frac{d^3y}{dx^3}\right|_{x=0} \times x + \frac{q}{2EI}x^2.$$

The boundary condition $\left.\dfrac{d^2y}{dx^2}\right|_{x=L} = 0$ implies
$$\left.\frac{d^3y}{dx^3}\right|_{x=0} = -\frac{q}{2EI}L. \qquad (5)$$
Using the last boundary condition y(L) = 0 with (5) in (4)

$$\left.\frac{dy}{dx}\right|_{x=0} = \frac{qL^3}{24EI} \qquad (6)$$
Finally substituting (5) and (6) in (4) gives

$$y(x) = \frac{q}{24EI}x^4 - \frac{qL}{12EI}x^3 + \frac{qL^3}{24EI}x.$$
Interpretation
The predicted deflection is zero at both ends as required.
Note This problem was solved by an entirely different means (integrating the ODE) in 19.4,
page 65.
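The final expression can be checked mechanically. The following is a minimal sketch, assuming illustrative values q/EI = 1 and L = 2 (these numbers are not from the text); it stores y(x) as a list of exact polynomial coefficients and confirms the four boundary conditions together with the fourth derivative being q/EI.

```python
from fractions import Fraction

def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (c_k multiplies x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Illustrative values only (assumptions, not from the text): q/EI = 1 and L = 2.
q_over_EI = Fraction(1)
L = Fraction(2)

# y(x) = (q/24EI) x^4 - (qL/12EI) x^3 + (qL^3/24EI) x
y = [Fraction(0),
     q_over_EI * L**3 / 24,
     Fraction(0),
     -q_over_EI * L / 12,
     q_over_EI / 24]

y2 = poly_derivative(poly_derivative(y))      # second derivative
y4 = poly_derivative(poly_derivative(y2))     # fourth derivative

boundary_values = [poly_eval(y, 0), poly_eval(y, L),
                   poly_eval(y2, 0), poly_eval(y2, L)]
print(boundary_values)   # y(0), y(L), y''(0), y''(L): all four are zero
print(y4)                # the constant polynomial q/EI
```

Exact rational arithmetic is used so that the zeros are exact rather than small rounding errors.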
The Convolution
Theorem 20.5
Introduction
In this Section we introduce the convolution of two functions f (t), g(t) which we denote by (f ∗g)(t).
The convolution is an important construct because of the convolution theorem which allows us to
find the inverse Laplace transform of a product of two transformed functions:
L−1 {F (s)G(s)} = (f ∗ g)(t)
Prerequisites
• be able to find Laplace transforms and inverse Laplace transforms of simple functions
• be able to integrate by parts
• understand how to use step functions in integration
Learning Outcomes
• calculate the convolution of simple functions
• apply the convolution theorem to obtain inverse Laplace transforms
1. Convolution
Let f (t) and g(t) be two functions of t. The convolution of f (t) and g(t) is also a function of t,
denoted by (f ∗ g)(t) and is defined by the relation
$$(f * g)(t) = \int_{-\infty}^{\infty} f(t - x)\,g(x)\,dx$$
However if f and g are both causal functions then (strictly) f (t), g(t) are written f (t)u(t) and
g(t)u(t) respectively, so that
$$(f * g)(t) = \int_{-\infty}^{\infty} f(t - x)u(t - x)\,g(x)u(x)\,dx = \int_{0}^{t} f(t - x)\,g(x)\,dx$$
because of the properties of the step functions: u(t − x) = 0 if x > t and u(x) = 0 if x < 0.
Key Point 14
Convolution
If f (t) and g(t) are causal functions then their convolution is defined by:
$$(f * g)(t) = \int_{0}^{t} f(t - x)\,g(x)\,dx$$
This is an odd looking definition but it turns out to have considerable use both in Laplace transform
theory and in the modelling of linear engineering systems. The reader should note that the variable
of integration is x. As far as the integration process is concerned the t-variable is (temporarily)
regarded as a constant.
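The causal definition can also be evaluated numerically. The sketch below is only an illustration: the function name `convolve_causal` and the choice of composite Simpson's rule are assumptions of this example, not part of the text. It treats t as fixed and integrates over x, exactly as described above.

```python
def convolve_causal(f, g, t, n=1000):
    """Approximate (f*g)(t) = integral from 0 to t of f(t-x) g(x) dx
    by composite Simpson's rule. n is the number of subintervals (even)."""
    if t <= 0:
        return 0.0          # causal functions: the convolution vanishes for t <= 0
    h = t / n
    total = f(t) * g(0.0) + f(0.0) * g(t)      # end points x = 0 and x = t
    for i in range(1, n):
        x = i * h
        weight = 4 if i % 2 else 2
        total += weight * f(t - x) * g(x)
    return total * h / 3

# Simplest possible illustration: f(t) = g(t) = 1 for t >= 0, so (f*g)(t) = t.
one = lambda t: 1.0
print(convolve_causal(one, one, 2.5))   # approximately 2.5
```

Inside the loop only x changes; t enters simply as a parameter of the integrand.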
Example 3
Find the convolution of f and g if $f(t) = t\,u(t)$ and $g(t) = t^2 u(t)$.
Solution
$f(t - x) = (t - x)u(t - x)$ and $g(x) = x^2 u(x)$

Therefore
$$(f * g)(t) = \int_0^t (t - x)x^2\,dx = \left[\tfrac{1}{3}x^3 t - \tfrac{1}{4}x^4\right]_0^t = \tfrac{1}{3}t^4 - \tfrac{1}{4}t^4 = \tfrac{1}{12}t^4$$
Example 4
Find the convolution of f (t) = t.u(t) and g(t) = sin t.u(t).
Solution
Here f (t − x) = (t − x)u(t − x) and g(x) = sin x.u(x) and so
$$(f * g)(t) = \int_0^t (t - x)\sin x\,dx$$
We need to integrate by parts. We find, remembering again that t is a constant in the integration process,
$$\begin{aligned}
\int_0^t (t - x)\sin x\,dx &= \Big[-(t - x)\cos x\Big]_0^t - \int_0^t (-1)(-\cos x)\,dx \\
&= [0 + t] - \int_0^t \cos x\,dx \\
&= t - \Big[\sin x\Big]_0^t = t - \sin t
\end{aligned}$$
so that
(f ∗ g)(t) = t − sin t or, equivalently, in this case (t ∗ sin t)(t) = t − sin t
Task
In Example 4 we found the convolution of f (t) = t.u(t) and g(t) = sin t.u(t). In
this Task you are asked to find the convolution (g ∗ f )(t) that is, to reverse the
order of f and g.
Begin by writing (g ∗ f )(t) as an appropriate integral:
Your solution
Answer
$g(t - x) = \sin(t - x)\,u(t - x)$ and $f(x) = x\,u(x)$, so
$$(g * f)(t) = \int_0^t \sin(t - x)\,x\,dx$$
Now evaluate the convolution integral:
Your solution
Answer
$$\begin{aligned}
(g * f)(t) &= \int_0^t \sin(t - x)\,x\,dx \\
&= \Big[x\cos(t - x)\Big]_0^t - \int_0^t \cos(t - x)\,dx \\
&= [t - 0] + \Big[\sin(t - x)\Big]_0^t = t - \sin t
\end{aligned}$$
This Task illustrates the general result in the following Key Point:
Key Point 15
Commutativity Property of Convolution
(f ∗ g)(t) = (g ∗ f )(t)
In words: the convolution of f (t) with g(t) is the same as the convolution of g(t) with f (t).
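Commutativity can be spot-checked numerically for the pair of functions used above. This is a sketch under stated assumptions: the helpers `simpson` and `conv` are hypothetical names introduced here, and agreement at a few sample values of t illustrates the property but does not prove it.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def conv(f, g, t):
    """(f*g)(t) for causal f and g, with t >= 0."""
    return simpson(lambda x: f(t - x) * g(x), 0.0, t)

f = lambda t: t            # f(t) = t u(t), restricted to t >= 0
g = math.sin               # g(t) = sin t u(t), restricted to t >= 0

for t in (0.5, 1.0, 3.0):
    # The three printed values should agree to many decimal places.
    print(t, conv(f, g, t), conv(g, f, t), t - math.sin(t))
```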
Task
Obtain the Laplace transforms of f (t) = t.u(t) and g(t) = sin t.u(t) and (f ∗g)(t).
Begin by finding L{f (t)}, L{g(t)}:
Your solution
Answer
$\mathcal{L}\{f(t)\} = \dfrac{1}{s^2}$ and $\mathcal{L}\{g(t)\} = \dfrac{1}{s^2 + 1}$ (from Table 1)
Now find L{(f ∗ g)(t)}:
Your solution
Answer
From Example 4, $(f * g)(t) = t - \sin t$ and so $\mathcal{L}\{(f * g)(t)\} = \mathcal{L}\{t - \sin t\} = \dfrac{1}{s^2} - \dfrac{1}{s^2 + 1}$
Now compare L{f (t)} × L{g(t)} with L{f ∗ g(t)}. What do you observe?
Your solution
Answer
$$\mathcal{L}\{(f * g)(t)\} = \frac{1}{s^2} - \frac{1}{s^2 + 1} = \frac{1}{s^2}\cdot\frac{1}{s^2 + 1} = \mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\} = F(s)G(s)$$
We see that the Laplace transform of the convolution of f (t) and g(t) is the product of their
separate Laplace transforms. This, in fact, is a general result which is expressed in the statement
of the convolution theorem which we discuss in the next subsection.
2. The convolution theorem

Let f (t) and g(t) be causal functions with Laplace transforms F (s) and G(s) respectively, i.e.
L{f (t)} = F (s) and L{g(t)} = G(s). Then it can be shown that
Key Point 16
The Convolution Theorem
L−1 {F (s)G(s)} = (f ∗ g)(t) or equivalently L{(f ∗ g)(t)} = F (s)G(s)
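The theorem can be illustrated numerically for f(t) = t u(t) and g(t) = sin t u(t), whose convolution t − sin t was found earlier. The sketch below approximates the Laplace integral by truncating it at a finite upper limit T; the function name `laplace_numeric`, the value s = 2 and the limit T = 40 are all assumptions made for this illustration.

```python
import math

def laplace_numeric(f, s, T=40.0, n=20000):
    """Approximate the Laplace transform integral of f over [0, T] by Simpson's rule.

    Truncating at T is an assumption: it is adequate only when exp(-s*t)*f(t)
    is negligible beyond T.
    """
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

s = 2.0
F = laplace_numeric(lambda t: t, s)                     # close to 1/s^2 = 0.25
G = laplace_numeric(math.sin, s)                        # close to 1/(s^2 + 1) = 0.2
FG = laplace_numeric(lambda t: t - math.sin(t), s)      # transform of (f*g)(t)
print(F * G, FG)                                        # both close to 0.05
```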
Example 5
Find the inverse transform of $\dfrac{6}{s(s^2 + 9)}$.
(a) Using partial fractions (b) Using the convolution theorem.
Solution
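(a) $\dfrac{6}{s(s^2 + 9)} = \dfrac{(2/3)}{s} - \dfrac{(2/3)s}{s^2 + 9}$ and so
$$\mathcal{L}^{-1}\left\{\frac{6}{s(s^2 + 9)}\right\} = \tfrac{2}{3}\mathcal{L}^{-1}\left\{\frac{1}{s}\right\} - \tfrac{2}{3}\mathcal{L}^{-1}\left\{\frac{s}{s^2 + 9}\right\} = \tfrac{2}{3}u(t) - \tfrac{2}{3}\cos 3t\,u(t)$$
(b) Let us choose $F(s) = \dfrac{2}{s}$ and $G(s) = \dfrac{3}{s^2 + 9}$; then
$f(t) = \mathcal{L}^{-1}\{F(s)\} = 2u(t)$ and $g(t) = \mathcal{L}^{-1}\{G(s)\} = \sin 3t\,u(t)$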
So
$$\begin{aligned}
\mathcal{L}^{-1}\{F(s)G(s)\} &= (f * g)(t) \qquad \text{(by the convolution theorem)} \\
&= \int_0^t 2u(t - x)\sin 3x\,u(x)\,dx
\end{aligned}$$
Now the variable t can take any value from −∞ to +∞. If t < 0 then the variable of integration,
x, is negative and so u(x) = 0. We conclude that
(f ∗ g)(t) = 0 if t < 0
that is, (f ∗ g)(t) is a causal function. Let us now consider the other possibility for t, that is the
range t ≥ 0. Now, in the range of integration 0 ≤ x ≤ t and so
u(t − x) = 1 u(x) = 1
since both t − x and x are non-negative. Therefore
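$$\begin{aligned}
\mathcal{L}^{-1}\{F(s)G(s)\} &= \int_0^t 2\sin 3x\,dx \\
&= \left[-\frac{2}{3}\cos 3x\right]_0^t = -\frac{2}{3}(\cos 3t - 1) \qquad t \ge 0
\end{aligned}$$
Hence
$$\mathcal{L}^{-1}\left\{\frac{6}{s(s^2 + 9)}\right\} = -\frac{2}{3}(\cos 3t - 1)u(t)$$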
which agrees with the value obtained above using the partial fraction approach.
Task
Use the convolution theorem to find the inverse transform of
$$H(s) = \frac{s}{(s - 1)(s^2 + 1)}.$$
Begin by choosing two functions of s, that is, F (s) and G(s):
Your solution
Answer
Although there are many possibilities it would seem sensible to choose
$$F(s) = \frac{1}{s - 1} \quad \text{and} \quad G(s) = \frac{s}{s^2 + 1}$$
since, by inspection, we can write down their inverse Laplace transforms:
$f(t) = \mathcal{L}^{-1}\{F(s)\} = e^t u(t)$ and $g(t) = \mathcal{L}^{-1}\{G(s)\} = \cos t\,u(t)$
Now construct the convolution integral:
Your solution
h(t) =
Answer
$$\begin{aligned}
h(t) &= \mathcal{L}^{-1}\{H(s)\} \\
&= \mathcal{L}^{-1}\{F(s)G(s)\} \\
&= \int_0^t f(t - x)g(x)\,dx = \int_0^t e^{t - x}u(t - x)\cos x\,u(x)\,dx
\end{aligned}$$
Now complete the evaluation of the integral, treating the cases t < 0 and t ≥ 0 separately:
Your solution
Answer
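You should find $h(t) = \tfrac{1}{2}(\sin t - \cos t + e^t)u(t)$ since $h(t) = 0$ if $t < 0$ and
$$\begin{aligned}
h(t) &= \int_0^t e^{t - x}\cos x\,dx \qquad \text{if } t \ge 0 \\
&= \Big[e^{t - x}\sin x\Big]_0^t - \int_0^t (-1)e^{t - x}\sin x\,dx \qquad \text{(integrating by parts)} \\
&= \sin t + \Big[-e^{t - x}\cos x\Big]_0^t - \int_0^t (-e^{t - x})(-\cos x)\,dx \\
&= \sin t - \cos t + e^t - h(t)
\end{aligned}$$
or $2h(t) = \sin t - \cos t + e^t \qquad t \ge 0$
Finally $h(t) = \tfrac{1}{2}(\sin t - \cos t + e^t)u(t)$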
Exercises
1. Find the convolution of
(a) $2t\,u(t)$ and $t^3 u(t)$ (b) $e^t u(t)$ and $t\,u(t)$ (c) $e^{-2t}u(t)$ and $e^{-t}u(t)$.
In each case reverse the order to check that $(f * g)(t) = (g * f)(t)$.
2. Use the convolution theorem to determine the inverse Laplace transforms of
(a) $\dfrac{1}{s^2(s + 1)}$ (b) $\dfrac{1}{(s - 1)(s - 2)}$ (c) $\dfrac{1}{(s^2 + 1)^2}$
Answers
1. (a) $\tfrac{1}{10}t^5$ (b) $-t - 1 + e^t$ (c) $e^{-t} - e^{-2t}$
2. (a) $(t - 1 + e^{-t})u(t)$ (b) $(-e^t + e^{2t})u(t)$ (c) $\tfrac{1}{2}(\sin t - t\cos t)u(t)$

Transfer Functions 20.6
Introduction
In this Section we introduce the concept of a transfer function and then use this to obtain a Laplace
transform model of a linear engineering system. (A linear engineering system is one modelled by a
constant coefficient ordinary differential equation.)
We shall also see how to obtain the impulse response of a linear system and hence to construct the
general response by use of the convolution theorem.
Prerequisites
• be able to use the convolution theorem
• be familiar with taking Laplace transforms and inverse Laplace transforms
• be familiar with the delta (impulse) function and its Laplace transform
Learning Outcomes
• find a transfer function of a linear system
• show how some linear systems may be combined together by combining appropriate transfer functions
• obtain the impulse response and the general response to a linear engineering system
1. Transfer functions and linear systems
Linear engineering systems are those that can be modelled by linear differential equations. We shall only consider those systems that can be modelled by constant coefficient ordinary differential equations.
Consider a system modelled by the second order differential equation
$$a\frac{d^2y}{dt^2} + b\frac{dy}{dt} + cy = f(t)$$
in which a, b, c are given constants and f (t) is a given function. In this context f (t) is often called
the input signal or forcing function and the solution y(t) is often called the output signal.
We shall assume that the initial conditions are zero (in this case $y(0) = 0$, $y'(0) = 0$).
Now, taking the Laplace transform of the differential equation, gives:
$$(as^2 + bs + c)Y(s) = F(s)$$
in which we have used $y(0) = y'(0) = 0$ and where we have designated $\mathcal{L}\{y(t)\} = Y(s)$ and $\mathcal{L}\{f(t)\} = F(s)$.
We define the transfer function of a system to be the ratio of the Laplace transform of the output
signal to the Laplace transform of the input signal with the initial conditions as zero. The transfer
function (a function of s), is denoted by H(s). In this case
$$H(s) \equiv \frac{Y(s)}{F(s)} = \frac{1}{as^2 + bs + c}$$
Now, in the special case in which the input signal is the delta function, f (t) = δ(t), we have F (s) = 1
and so,
H(s) = Y (s)
We call the solution to the differential equation in this special case the unit impulse response
function and denote it by h(t)u(t) (we include the step function u(t) to emphasize its causality).
So
h(t)u(t) = L−1 {H(s)} when f (t) = δ(t)
Now, keeping this in mind and returning to the general case in which the input signal f (t) is not
necessarily the impulse function δ(t), we have:
Y (s) = H(s)F (s)
and so the solution for the output signal is, as usual, obtained by taking the inverse Laplace transform:
y(t) = L−1 {Y (s)} = L−1 {H(s)F (s)}

= (h ∗ f )(t)
using the convolution theorem.
Key Point 17
Linear System Solution
The solution to a linear system, modelled by a constant coefficient ordinary differential equation,
is given by the convolution of the unit impulse response function h(t)u(t) with the input function
f (t).
This approach provides yet another method of solving a linear system as Example 6 illustrates.
Example 6
Find the impulse response function h(t) to a linear engineering system
modelled by the differential equation:
$$\frac{d^2y}{dt^2} + 4y = e^{-t} \qquad y(0) = 0 \qquad y'(0) = 0$$
and hence solve the system.
Solution
Here
$$H(s) = \frac{1}{s^2 + 4} = \frac{1}{as^2 + bs + c} \qquad \text{with } a = 1,\ b = 0,\ c = 4$$
This is obtained by replacing the forcing function $e^{-t}$ by the impulse function $\delta(t)$ and then taking the Laplace transform. Using this:
$$h(t) = \mathcal{L}^{-1}\{H(s)\} = \mathcal{L}^{-1}\left\{\frac{1}{s^2 + 4}\right\} = \tfrac{1}{2}\sin 2t\,u(t)$$
Then the output y(t) corresponding to the input $e^{-t}$ is given by the convolution of $e^{-t}$ and h(t). That is,
$$\begin{aligned}
y(t) = (h * e^{-t})(t) &= \int_0^t \tfrac{1}{2}\sin 2(t - x)\,e^{-x}\,dx \\
&= \tfrac{1}{10}\left(\sin 2t - 2\cos 2t + 2e^{-t}\right)
\end{aligned}$$
(Note: the last integral can be determined by integrating by parts (twice), or by use of a computer
algebra system such as Matlab.)
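As an alternative to integrating by parts, the convolution integral of Example 6 can be evaluated numerically and compared with the closed form. This is a sketch only; the function names and the Simpson's rule helper are assumptions of this illustration.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def y_convolution(t):
    """y(t) as the convolution of h(t) = (1/2) sin 2t with the input exp(-t)."""
    return simpson(lambda x: 0.5 * math.sin(2 * (t - x)) * math.exp(-x), 0.0, t)

def y_closed_form(t):
    """The closed form quoted in Example 6."""
    return (math.sin(2 * t) - 2 * math.cos(2 * t) + 2 * math.exp(-t)) / 10

for t in (0.5, 1.0, 2.0, 5.0):
    print(t, y_convolution(t), y_closed_form(t))   # the two columns should agree
```

The closed form also satisfies y(0) = 0, as the zero initial condition requires.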
Task
Use the transfer function approach to solve
$$\frac{dx}{dt} - 4x = \sin t \qquad x(0) = 0.$$
Begin by finding the transfer function H(s):
Your solution
Answer
You should find H(s) = 1/(s − 4) since the transfer function is the Laplace transform of the output
X(s) when the input is a delta function δ(t).
Now obtain an expression for the solution x(t) in terms of the convolution:
Your solution
Answer
You should obtain $x(t) = (\sin t * h)(t)$ where
$$h(t) = \mathcal{L}^{-1}\{H(s)\} = \mathcal{L}^{-1}\left\{\frac{1}{s - 4}\right\} = e^{4t}u(t) \quad \text{and} \quad x(t) = \int_0^t (\sin x)\,e^{4(t - x)}u(t - x)\,dx$$
Now complete the evaluation of this integral:
Your solution
Answer
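If $t > 0$ then $u(t - x) = 1$ and so
$$\begin{aligned}
x(t) = \int_0^t \sin x\,e^{4(t - x)}\,dx &= e^{4t}\left\{\left[-\frac{\sin x}{4}e^{-4x}\right]_0^t - \int_0^t -\frac{\cos x}{4}e^{-4x}\,dx\right\} \\
&= e^{4t}\left\{-\frac{\sin t}{4}e^{-4t} + \frac{1}{4}\left(\left[-\frac{\cos x}{4}e^{-4x}\right]_0^t - \int_0^t \frac{\sin x}{4}e^{-4x}\,dx\right)\right\}
\end{aligned}$$
Therefore $x(t) = -\tfrac{1}{4}\sin t - \tfrac{1}{16}\cos t + \tfrac{1}{16}e^{4t} - \tfrac{1}{16}x(t)$
Hence $x(t) = \dfrac{1}{17}\left(-4\sin t - \cos t + e^{4t}\right)$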
2. Modelling linear systems by transfer functions
We have seen previously that an engineering system can be modelled by one or more differential
equations. However, with the introduction of the transfer function we have an alternative model
which we examine in this Section.
It will be helpful to develop a pictorial approach to system modelling. To begin, we can imagine a
differential equation:
$$a\frac{d^2y}{dt^2} + b\frac{dy}{dt} + cy = f(t)$$
as being a model of the engineering system which transforms the input signal f (t) into an output
signal y(t) (the solution of the differential equation). The system is characterised by the values
of the coefficients a, b, c. A different engineering system will be characterised by a different set of
coefficients. These coefficients are independent of the input signal. Changing the input signal does
not change the system. It is the system that changes the input signal into the output signal. This is
easy to describe pictorially (Figure 21).
[Figure 21: Block diagram describing the system in the t-domain. The input signal f(t) enters a box labelled "system a, b, c" and leaves as the output signal y(t).]

In a block diagram the system is represented by a rectangular box and the input and output signals are represented by lines with an arrow to indicate the ‘flow’.
After the Laplace transform of the differential equation is taken, the differential equation is transformed into
$$Y(s) \equiv H(s)F(s) \qquad H(s) \equiv \frac{1}{as^2 + bs + c}$$
in which H(s) is the transfer function. The latter characterises (in Laplace transform terms) the
engineering system from which it was derived. The relation, connecting the Laplace transform of the
output Y (s) to the Laplace transform of the input F (s), can also be described schematically (Figure
22).
[Figure 22: Block diagram describing the system in the s-domain. The input signal F(s) enters a box labelled "system H(s)" and leaves as the output signal Y(s).]

We can begin to model an engineering system directly in terms of transfer functions. In order to do
this effectively we need to know how transfer functions are to be combined together. Before we do
this we first extend our block diagrams to allow for ’interactions’.
There are three basic components occurring in block diagrams which we now describe.
The first component, we have already met: the block relating the input to the output (Figure
23):
[Figure 23: Input to Output. The signal F(s) enters a block H(s) and leaves as Y(s).]
The second component is called a summing point (Figure 24):
[Figure 24: Summing point. Incoming signals R(s) (sign +) and X(s) (sign −); outgoing signal R(s) − X(s).]
Here we have shown two incoming signals R(s), X(s) (but at a general summing point there may
be many incoming signals) and one outgoing signal (there should never be more than one outgoing
signal). The sign attached to the incoming signal defines whether the signal is adding to (+) or
subtracting from (−) the summing point. The outgoing signal is then calculated in an obvious way,
taking these signs into account.
The third component is a take-off point (Figure 25):
[Figure 25: Take-off point. The signal Y(s) continues along its line while a branch carrying Y(s) is tapped off.]
Here the value of the signal Y (s) is found in such a way as not to affect the signal that is being
transmitted. (This situation can never be precisely realised in practice, but using sensitive measuring
devices it can be well approximated. As a simple example consider the problem of measuring the
temperature of a certain volume of liquid. The act of putting a thermometer in the liquid will usually
slightly affect the temperature we are trying to observe.)
An example of a block diagram is the so-called negative feedback loop, shown in Figure 26 (we are
using G(s) to denote the transfer function):
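[Figure 26: Negative feedback loop. The input F(s) enters a summing point with sign +; the output Y(s) is tapped and fed back to the summing point with sign −; the outgoing signal Y1(s) is the input to the block G(s), whose output is Y(s).]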

Here, the output signal is tapped and subtracted from the input signal. Hence
Y (s) = G(s)Y1 (s)
because Y1 (s) is the input signal to the system characterised by transfer function G(s). However, at
the summing point Y1 (s) = F (s) − Y (s) and so
Y (s) = G(s)(F (s) − Y (s))
from which we easily obtain:

$$Y(s) = \frac{G(s)}{1 + G(s)}F(s)$$
so that, in terms of input and output signals, the feedback loop is characterised by a transfer function
$$\frac{G(s)}{1 + G(s)}.$$
In some feedback loops the tapped signal Y (s) may be modified in some way before feedback. Using
the overall transfer function we can now picture the feedback loop in a simpler way (Figure 27):
[Figure 27: Feedback loop transfer function. The signal F(s) enters a single block $\dfrac{G(s)}{1 + G(s)}$ and leaves as Y(s).]
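When G(s) is a ratio of polynomials, say G = N/D, the closed-loop transfer function G/(1 + G) simplifies to N/(D + N). The sketch below shows this with polynomials stored as coefficient lists in ascending powers of s. The helper names and the forward path G(s) = 3/(1 + 2s) are hypothetical choices made only for illustration.

```python
from fractions import Fraction

def poly_add(p, q):
    """Add two polynomials stored as coefficient lists [c0, c1, ...] in powers of s."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def closed_loop(num, den):
    """Unity negative feedback: G = num/den gives G/(1+G) = num/(den + num)."""
    return num, poly_add(den, num)

# Hypothetical forward path G(s) = 3/(1 + 2s), chosen only for illustration.
num, den = [Fraction(3)], [Fraction(1), Fraction(2)]
cl_num, cl_den = closed_loop(num, den)
print(cl_num, cl_den)    # numerator 3, denominator 4 + 2s
```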

Another type of block diagram occurs when the output from one system becomes the input to another
system. For example consider the system of coupled differential equations:
$$\frac{dx}{dt} + x = f(t)$$
$$3\frac{dy}{dt} - y = x(t)$$
$$x(0) = 0 \qquad y(0) = 0$$
in which f (t) is a given input signal.
In terms of Laplace transforms we have, as usual
$$sX(s) + X(s) = F(s) \qquad 3sY(s) - Y(s) = X(s)$$
so the transfer function for the first equation ($G_1(s)$ say) satisfies
$$G_1(s) \equiv \frac{X(s)}{F(s)} = \frac{1}{s + 1}$$
whilst the transfer function for the second equation $G_2(s)$ satisfies
$$G_2(s) \equiv \frac{Y(s)}{X(s)} = \frac{1}{3s - 1}$$
In pictorial terms this is shown in Figure 28:
[Figure 28: Two blocks in series. F(s) enters the block G1(s), whose output X(s) enters the block G2(s), whose output is Y(s).]
So we have two transfer functions ‘in series’. To find how they combine we simply find an expression
connecting the final output Y (s) to the initial input F (s). Clearly
X(s) = G1 (s)F (s) and so Y (s) = G2 (s)X(s) = [G2 (s)G1 (s)] F (s)
So transfer functions in series are simply multiplied together. In this case the overall transfer function
H(s) is:
$$H(s) = G_1(s)G_2(s) = \frac{1}{(s + 1)(3s - 1)}$$
Note that this result could be found directly from the differential equations used to model this system.
If we differentiate the second differential equation of the original pair we get:
$$3\frac{d^2y}{dt^2} - \frac{dy}{dt} = \frac{dx}{dt}$$
Rearranging the first equation gives $\dfrac{dx}{dt} = f(t) - x$
Substituting gives: $3\dfrac{d^2y}{dt^2} - \dfrac{dy}{dt} = f(t) - x = f(t) - \left(3\dfrac{dy}{dt} - y\right)$
or
$$3\frac{d^2y}{dt^2} + 2\frac{dy}{dt} - y = f(t)$$
which, on taking Laplace transforms, gives the s-relation $(3s^2 + 2s - 1)Y(s) = F(s)$ implying a transfer function:
$$H(s) = \frac{1}{3s^2 + 2s - 1} = \frac{1}{(s + 1)(3s - 1)}$$
as obtained above.
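The agreement between the two routes comes down to the product of the two denominators. A minimal sketch, assuming polynomials are stored as coefficient lists in ascending powers of s (the helper `poly_mul` is a hypothetical name introduced here):

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, ...] in powers of s."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

den_G1 = [1, 1]     # s + 1
den_G2 = [-1, 3]    # 3s - 1
den_H = poly_mul(den_G1, den_G2)
print(den_H)        # [-1, 2, 3], that is 3s^2 + 2s - 1
```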
System response
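An engineering system is modelled by the block diagram in Figure 29:
[Figure 29: Negative feedback loop with input V1(s), forward transfer function $G(s) = \dfrac{K}{1 + as}$ and output V0(s), which is fed back to the summing point with sign −.]
Determine the system response $v_0(t)$ when the input function is a unit step function, with K = 2.5 and a = 0.5.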
Solution
If the system has an overall transfer function H(s) then V0 (s) = H(s)V1 (s). But this particular
system is the negative feedback loop described earlier and so
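$$H(s) = \frac{G(s)}{1 + G(s)} = \frac{\dfrac{K}{1 + as}}{1 + \dfrac{K}{1 + as}} = \frac{K}{K + 1 + as}$$
In this particular case
$$H(s) = \frac{2.5}{3.5 + 0.5s} = \frac{5}{7 + s}$$
Thus the impulse response h(t) is
$$h(t) = \mathcal{L}^{-1}\{H(s)\} = \mathcal{L}^{-1}\left\{\frac{5}{7 + s}\right\} = 5e^{-7t}u(t)$$
and so the response to a step input u(t) is given by the convolution of h(t) with u(t)
$$\begin{aligned}
v_0(t) &= \int_0^t u(t - x)\,5e^{-7x}u(x)\,dx \\
&= \int_0^t 5e^{-7x}\,dx \qquad t > 0 \\
&= \left[-\frac{5}{7}e^{-7x}\right]_0^t = -\frac{5}{7}\left[e^{-7t} - 1\right]
\end{aligned}$$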
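The step response can be cross-checked in the time domain. The transfer function H(s) = 5/(7 + s) corresponds, for zero initial conditions, to the differential equation dv0/dt + 7 v0 = 5 v1, and for a unit step input v1 = 1 when t > 0. The sketch below integrates that equation with the classical fourth-order Runge-Kutta scheme; the function names and step count are assumptions of this illustration.

```python
import math

def step_response_rk4(t_end, steps=1000):
    """Integrate dv/dt = -7 v + 5, v(0) = 0, with the classical RK4 scheme."""
    rhs = lambda v: -7.0 * v + 5.0
    h = t_end / steps
    v = 0.0
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs(v + 0.5 * h * k1)
        k3 = rhs(v + 0.5 * h * k2)
        k4 = rhs(v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return v

def step_response_exact(t):
    """The convolution result (5/7)(1 - exp(-7t)) for t > 0."""
    return 5.0 / 7.0 * (1.0 - math.exp(-7.0 * t))

print(step_response_rk4(1.0), step_response_exact(1.0))   # the two should agree closely
```

For large t the response settles at 5/7, which equals H(0).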
Table of Laplace Transforms
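Rule   Causal function            Laplace transform
1      $f(t)$                     $F(s)$
2      $u(t)$                     $\dfrac{1}{s}$
3      $t^n u(t)$                 $\dfrac{n!}{s^{n+1}}$
4      $e^{-at}u(t)$              $\dfrac{1}{s + a}$
5      $\sin at\,.\,u(t)$         $\dfrac{a}{s^2 + a^2}$
6      $\cos at\,.\,u(t)$         $\dfrac{s}{s^2 + a^2}$
7      $e^{-at}\sin bt\,.\,u(t)$  $\dfrac{b}{(s + a)^2 + b^2}$
8      $e^{-at}\cos bt\,.\,u(t)$  $\dfrac{s + a}{(s + a)^2 + b^2}$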
Contents 21
z-Transforms
21.1 The z-Transform 2
21.2 Basics of z-Transform Theory 12
21.3 z-Transforms and Difference Equations 36
21.4 Engineering Applications of z-Transforms 64
21.5 Sampled Functions 85
Learning outcomes
In this Workbook you will learn about the properties and applications of the z-transform,
a major mathematical tool for the analysis and design of discrete systems including digital
control systems.

The z-Transform 21.1
Introduction
The z-transform is the major mathematical tool for analysis in such topics as digital control and
digital signal processing. In this introductory Section we lay the foundations of the subject by briefly
discussing sequences, shifting of sequences and difference equations. Readers familiar with these
topics can proceed directly to Section 21.2 where z-transforms are first introduced.

Prerequisites
• have competence with algebra
Learning Outcomes
On completion you should be able to . . .
• explain what is meant by a sequence and by a difference equation
• distinguish between first and second order difference equations
• shift sequences to the left or right
1. Preliminaries: Sequences and Difference Equations
Sequences
A sequence is a set of numbers formed according to some definite rule. For example the sequence
{1, 4, 9, 16, 25, . . .} (1)
is formed by the squares of the positive integers.
If we write
$y_1 = 1,\ y_2 = 4,\ y_3 = 9, \ldots$
then the general or $n$th term of the sequence (1) is $y_n = n^2$. The notations $y(n)$ and $y[n]$ are also used sometimes to denote the general term. The notation $\{y_n\}$ is used as an abbreviation for a whole sequence.
An alternative way of considering a sequence is to view it as being obtained by sampling a continuous
function. In the above example the sequence of squares can be regarded as being obtained from the
function
$y(t) = t^2$
by sampling the function at t = 1, 2, 3, . . . as shown in Figure 1.
[Figure 1: Graph of $y = t^2$ with the sampled values at t = 1, 2, 3 marked.]
The notation y(n), as opposed to yn , for the general term of a sequence emphasizes this sampling
aspect.
Task
Find the general term of the sequence {2, 4, 8, 16, 32, . . .}.
Your solution
Answer
The terms of the sequence are the integer powers of 2: $y_1 = 2 = 2^1$, $y_2 = 4 = 2^2$, $y_3 = 8 = 2^3$, . . . so $y_n = 2^n$.
Here the terms of the sequence $\{2^n\}$ are the sample values of the continuous function $y(t) = 2^t$ at t = 1, 2, 3, . . .
An alternative way of defining a sequence is as follows:
(i) give the first term $y_1$ of the sequence
(ii) give the rule for obtaining the $(n + 1)$th term from the $n$th.
A simple example is
$$y_{n+1} = y_n + d \qquad y_1 = a$$
where a and d are constants.
It is straightforward to obtain an expression for $y_n$ in terms of n as follows:
$$\begin{aligned}
y_2 &= y_1 + d = a + d \\
y_3 &= y_2 + d = a + d + d = a + 2d \\
y_4 &= y_3 + d = a + 3d \\
&\ \,\vdots \\
y_n &= a + (n - 1)d
\end{aligned} \qquad (2)$$
This sequence characterised by a constant difference between successive terms
$$y_{n+1} - y_n = d \qquad n = 1, 2, 3, \ldots$$
is called an arithmetic sequence.
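The step-by-step derivation above can be mirrored in a few lines of code. This is a sketch with illustrative values a = 5 and d = 3 (not taken from the text): it builds the terms from the recurrence and compares them with the closed form a + (n − 1)d.

```python
def arithmetic_terms(a, d, count):
    """Generate y_1 .. y_count from y_{n+1} = y_n + d with y_1 = a."""
    terms = [a]
    while len(terms) < count:
        terms.append(terms[-1] + d)
    return terms

a, d = 5, 3     # illustrative values, not from the text
terms = arithmetic_terms(a, d, 6)
closed = [a + (n - 1) * d for n in range(1, 7)]
print(terms)    # [5, 8, 11, 14, 17, 20]
print(closed)   # [5, 8, 11, 14, 17, 20]
```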
Task
Calculate the $n$th term of the arithmetic sequence defined by
$y_{n+1} - y_n = 2 \qquad y_1 = 9.$
Write out the first 4 terms of this sequence explicitly.
Suggest why an arithmetic sequence is also known as a linear sequence.
Your solution
Answer
We have, using (2),
$y_n = 9 + (n - 1)2$ or
$y_n = 2n + 7$
so $y_1 = 9$ (as given), $y_2 = 11$, $y_3 = 13$, $y_4 = 15$, . . .
A graph of $y_n$ against n would be just a set of points but all lie on the straight line $y = 2x + 7$, hence the term ‘linear sequence’.
[Figure: the points $y_n$ = 9, 11, 13 plotted against n = 1, 2, 3, all lying on the line $y(x) = 2x + 7$.]
Nomenclature
The equation
$$y_{n+1} - y_n = d \qquad (3)$$
is called a difference equation or recurrence equation or more specifically a first order, constant coefficient, linear, difference equation.
The sequence whose $n$th term is
$$y_n = a + (n - 1)d \qquad (4)$$
is the solution of (3) for the initial condition $y_1 = a$.
The coefficients in (3) are the numbers preceding the terms $y_{n+1}$ and $y_n$ so are 1 and −1 respectively.
The classification first order for the difference equation (3) follows because the difference between the highest and lowest subscripts is $n + 1 - n = 1$.
Now consider again the sequence
$$\{y_n\} = \{2^n\}$$
Clearly
$$y_{n+1} - y_n = 2^{n+1} - 2^n = 2^n$$
so the difference here is dependent on n i.e. is not constant. Hence the sequence $\{2^n\} = \{2, 4, 8, \ldots\}$ is not an arithmetic sequence.
Task
For the sequence $\{y_n\} = \{2^n\}$ calculate $y_{n+1} - 2y_n$. Hence write down a difference equation and initial condition for which $\{2^n\}$ is the solution.
Your solution
Answer
$y_{n+1} - 2y_n = 2^{n+1} - 2 \times 2^n = 2^{n+1} - 2^{n+1} = 0$
Hence $y_n = 2^n$ is the solution of the homogeneous difference equation
$$y_{n+1} - 2y_n = 0 \qquad (5)$$
with initial condition $y_1 = 2$.
The term ‘homogeneous’ refers to the fact that the right-hand side of the difference equation (5) is zero.
More generally it follows that
$$y_{n+1} - Ay_n = 0 \qquad y_1 = A$$
has solution sequence $\{y_n\}$ with general term
$$y_n = A^n$$
A second order difference equation

Second order difference equations are characterised, as you would expect, by a difference of 2 between
the highest and lowest subscripts. A famous example of a constant coefficient second order difference
equation is
$$y_{n+2} = y_{n+1} + y_n \quad \text{or} \quad y_{n+2} - y_{n+1} - y_n = 0 \qquad (6)$$
The solution $\{y_n\}$ of (6) is a sequence where any term is the sum of the two preceding ones.
Task
What additional information is needed if (6) is to be solved?
Your solution
Answer
Two initial conditions, the values of $y_1$ and $y_2$, must be specified so we can calculate
$y_3 = y_2 + y_1$
$y_4 = y_3 + y_2$
and so on.
Task
Find the first 6 terms of the solution sequence of (6) for each of the following sets
of initial conditions
(a) y1 = 1 y2 = 3
(b) y1 = 1 y2 = 1
Your solution
Answer
(a) {1, 3, 4, 7, 11, 18 . . .}
(b) {1, 1, 2, 3, 5, 8, . . .} (7)
The sequence (7) is a very famous one; it is known as the Fibonacci Sequence. It follows that the solution sequence of the difference equation (6)
$$y_{n+2} = y_{n+1} + y_n$$
with initial conditions $y_1 = y_2 = 1$ is the Fibonacci sequence. What is not so obvious is the general term $y_n$ of this sequence.
One way of obtaining $y_n$ in this case, and for many other linear constant coefficient difference equations, is via a technique involving z-transforms which we shall introduce shortly.
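Although the general term is not obvious, individual terms are easy to generate directly from the recurrence once the two initial conditions are known. A minimal sketch (the function name `solve_second_order` is a hypothetical choice for this example):

```python
def solve_second_order(y1, y2, count):
    """First `count` terms of y_{n+2} = y_{n+1} + y_n from the given y_1 and y_2."""
    terms = [y1, y2]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

print(solve_second_order(1, 1, 10))   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print(solve_second_order(1, 3, 6))    # [1, 3, 4, 7, 11, 18]
```

Changing only the initial conditions changes the whole solution sequence, even though the difference equation is the same.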
Shifting of sequences
Right Shift
Recall the sequence $\{y_n\} = \{n^2\}$ or, writing out the first few terms explicitly,
$\{y_n\} = \{1, 4, 9, 16, 25, \ldots\}$
The sequence $\{v_n\} = \{0, 1, 4, 9, 16, 25, \ldots\}$ contains the same numbers as $\{y_n\}$ but they are all shifted one place to the right. The general term of this shifted sequence is
$v_n = (n - 1)^2 \qquad n = 1, 2, 3, \ldots$
Similarly the sequence
$\{w_n\} = \{0, 0, 1, 4, 9, 16, 25, \ldots\}$
has general term
$$w_n = \begin{cases} (n - 2)^2 & n = 2, 3, \ldots \\ 0 & n = 1 \end{cases}$$
Task
For the sequence $\{y_n\} = \{2^n\} = \{2, 4, 8, 16, \ldots\}$ write out explicitly the first 6 terms and the general terms of the sequences $\{v_n\}$ and $\{w_n\}$ obtained respectively by shifting the terms of $\{y_n\}$
(a) one place to the right (b) three places to the right.
Your solution
Answer
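(a) $\{v_n\} = \{0, 2, 4, 8, 16, 32, \ldots\}$
$$v_n = \begin{cases} 2^{n-1} & n = 2, 3, 4, \ldots \\ 0 & n = 1 \end{cases}$$
(b) $\{w_n\} = \{0, 0, 0, 2, 4, 8, \ldots\}$
$$w_n = \begin{cases} 2^{n-3} & n = 4, 5, 6, \ldots \\ 0 & n = 1, 2, 3 \end{cases}$$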
The operation of shifting the terms of a sequence is an important one in digital signal processing and
digital control. We shall have more to say about this later. For the moment we just note that in a
digital system a right shift can be produced by a delay unit, denoted symbolically as follows:
[Figure 2: A delay unit, drawn as a block labelled $z^{-1}$, with input $\{y_n\}$ and output $\{y_{n-1}\}$.]
A shift of 2 units to the right could be produced by 2 such delay units in series:
[Figure 3: Two blocks labelled $z^{-1}$ in series, taking $\{y_n\}$ to $\{y_{n-1}\}$ and then to $\{y_{n-2}\}$.]
(The significance of writing $z^{-1}$ will emerge later when we have studied z-transforms.)
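For a sequence stored as a list of its terms from n = 1 onwards, a right shift amounts to inserting zeros at the front. The sketch below is only an illustration of that idea; `right_shift` is a hypothetical helper, and two applications of a one-place shift play the role of two delay units in series.

```python
def right_shift(terms, places):
    """Shift a one-sided sequence (listed from n = 1) to the right, padding with zeros."""
    return [0] * places + list(terms)

squares = [n**2 for n in range(1, 6)]     # 1, 4, 9, 16, 25
print(right_shift(squares, 1))            # [0, 1, 4, 9, 16, 25]
print(right_shift(squares, 2))            # [0, 0, 1, 4, 9, 16, 25]

# Two single delays in series give the same result as one delay of two places.
print(right_shift(right_shift(squares, 1), 1) == right_shift(squares, 2))   # True
```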
Left Shift
Suppose we again consider the sequence of squares
$\{y_n\} = \{1, 4, 9, 16, 25, \ldots\}$
with $y_n = n^2$.
Shifting all the numbers one place to the left (or advancing the sequence) means that the sequence $\{v_n\}$ generated has terms
$v_0 = y_1 = 1 \qquad v_1 = y_2 = 4 \qquad v_2 = y_3 = 9 \ldots$
and so has general term
$v_n = (n + 1)^2 = y_{n+1} \qquad n = 0, 1, 2, \ldots$
Notice here the appearance of the zero subscript for the first time.
Shifting the terms of $\{v_n\}$ one place to the left or equivalently the terms of $\{y_n\}$ two places to the left generates a sequence $\{w_n\}$ where
$w_{-1} = v_0 = y_1 = 1 \qquad w_0 = v_1 = y_2 = 4$
and so on.
The general term is
$w_n = (n + 2)^2 = y_{n+2} \qquad n = -1, 0, 1, 2, \ldots$
Task
If $\{y_n\} = \{1, 1, 2, 3, 5, \ldots\}$, $n = 1, 2, 3, \ldots$ is the Fibonacci sequence, write out the terms of the sequences $\{y_{n+1}\}$, $\{y_{n+2}\}$.
Your solution
Answer
$\{y_{n+1}\} = \{\underset{\uparrow}{1}, 1, 2, 3, 5, \ldots\}$ where $y_0 = 1$ (arrowed), $y_1 = 1$, $y_2 = 2$, . . .
$\{y_{n+2}\} = \{1, \underset{\uparrow}{1}, 2, 3, 5, \ldots\}$ where $y_{-1} = 1$, $y_0 = 1$ (arrowed), $y_1 = 2$, $y_2 = 3$, . . .
It should be clear from this discussion of left shifted sequences that the simpler idea of a sequence
‘beginning’ at n = 1 and containing only terms y1 , y2 , . . . has to be modified.
We should instead think of a sequence as two-sided i.e. $\{y_n\}$ defined for all integer values of n (positive, negative and zero). In writing out the ‘middle’ terms of a two-sided sequence it is convenient to show by an arrow the term $y_0$.
For example the sequence $\{y_n\} = \{n^2\}$, $n = 0, \pm 1, \pm 2, \ldots$ could be written
$\{\ldots, 9, 4, 1, \underset{\uparrow}{0}, 1, 4, 9, \ldots\}$
A sequence which is zero for negative integers n is sometimes called a causal sequence.
For example the sequence, denoted by $\{u_n\}$,
$$u_n = \begin{cases} 0 & n = -1, -2, -3, \ldots \\ 1 & n = 0, 1, 2, 3, \ldots \end{cases}$$
is causal. Figure 4 makes it clear why $\{u_n\}$ is called the unit step sequence.
[Figure 4: Graph of $u_n$ against n, equal to 0 for n = −3, −2, −1 and equal to 1 for n = 0, 1, 2, . . .]
The ‘curly bracket’ notation for the unit step sequence with the n = 0 term arrowed is
$\{u_n\} = \{\ldots, 0, 0, 0, \underset{\uparrow}{1}, 1, 1, \ldots\}$
Task
Draw graphs of the sequences $\{u_{n-1}\}$, $\{u_{n-2}\}$, $\{u_{n+1}\}$ where $\{u_n\}$ is the unit step sequence.
Your solution
Answer
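[Graphs of the three sequences against n: $\{u_{n-1}\}$ is 0 for n ≤ 0 and 1 for n = 1, 2, 3, . . .; $\{u_{n-2}\}$ is 0 for n ≤ 1 and 1 for n = 2, 3, . . .; $\{u_{n+1}\}$ is 0 for n ≤ −2 and 1 for n = −1, 0, 1, . . .]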
Basics of z-Transform
Theory 21.2
Introduction
In this Section, which is absolutely fundamental, we define what is meant by the z-transform of a
sequence. We then obtain the z-transform of some important sequences and discuss useful properties
of the transform.
Most of the results obtained are tabulated at the end of the Section.
The z-transform is the major mathematical tool for analysis in such areas as digital control and digital
signal processing.
Prerequisites
• understand sigma (Σ) notation for summations
• be familiar with geometric series and the binomial theorem
• have studied basic complex number theory including complex exponentials
Learning Outcomes
On completion you should be able to . . .
• define the z-transform of a sequence
• obtain the z-transform of simple sequences from the definition or from basic properties of the z-transform
1. The z-transform
If you have studied the Laplace transform either in a Mathematics course for Engineers and Scientists
or have applied it in, for example, an analog control course you may recall that
1. the Laplace transform definition involves an integral

2. applying the Laplace transform to certain ordinary differential equations turns them into simpler
(algebraic) equations
3. use of the Laplace transform gives rise to the basic concept of the transfer function of a
continuous (or analog) system.
The z-transform plays a similar role for discrete systems, i.e. ones where sequences are involved, to
that played by the Laplace transform for systems where the basic variable t is continuous. Specifically:
1. the z-transform definition involves a summation

2. the z-transform converts certain difference equations to algebraic equations
3. use of the z-transform gives rise to the concept of the transfer function of discrete (or digital)
systems.
Key Point 1
Definition:
For a sequence $\{y_n\}$ the z-transform denoted by $Y(z)$ is given by the infinite series
$$Y(z) = y_0 + y_1 z^{-1} + y_2 z^{-2} + \ldots = \sum_{n=0}^{\infty} y_n z^{-n} \qquad (1)$$
Notes:
1. The z-transform only involves the terms $y_n$, $n = 0, 1, 2, \ldots$ of the sequence. Terms $y_{-1}, y_{-2}, \ldots$ whether zero or non-zero, are not involved.
2. The infinite series in (1) must converge for Y (z) to be defined as a precise function of z.
We shall discuss this point further with specific examples shortly.
3. The precise significance of the quantity (strictly the ‘variable’) z need not concern us except
to note that it is complex and, unlike n, is continuous.
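For a sequence with only finitely many non-zero terms the series in (1) is a finite sum, so it can be evaluated directly at any non-zero z. The sketch below is an illustration under stated assumptions: the function name `z_transform_partial` and the three-term sequence are hypothetical, chosen only to show the definition at work.

```python
def z_transform_partial(terms, z):
    """Partial sum of y_n z^(-n) over n = 0 .. N-1 for terms = [y_0, y_1, ..., y_{N-1}]."""
    return sum(y * z ** (-n) for n, y in enumerate(terms))

# Illustrative finite sequence y_0 = 1, y_1 = 2, y_2 = 3 (all later terms zero).
z = 2.0
print(z_transform_partial([1, 2, 3], z))   # 1 + 2/2 + 3/4 = 2.75
```

The same function accepts a complex value of z, in keeping with Note 3.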
Key Point 2
We use the notation Z{yn } = Y (z) to mean that the z-transform of the sequence {yn } is Y (z).
Less strictly one might write Zyn = Y (z). Some texts use the notation yn ↔ Y (z) to denote that
(the sequence) yn and (the function) Y (z) form a z-transform pair.
We shall also call {yn } the inverse z-transform of Y (z) and write symbolically
{yn } = Z−1 Y (z).
2. Commonly used z-transforms

Unit impulse sequence (delta sequence)
This is a simple but important sequence denoted by δn and defined as

$$\delta_n = \begin{cases} 1 & n = 0 \\ 0 & n = \pm 1, \pm 2, \dots \end{cases}$$
The significance of the term ‘unit impulse’ is obvious from this definition.
By the definition (1) of the z-transform
$$Z\{\delta_n\} = 1 + 0z^{-1} + 0z^{-2} + \dots = 1$$
If the single non-zero value is other than at n = 0 the calculation of the z-transform is equally simple.
For example,
$$\delta_{n-3} = \begin{cases} 1 & n = 3 \\ 0 & \text{otherwise} \end{cases}$$
From (1) we obtain
$$Z\{\delta_{n-3}\} = 0 + 0z^{-1} + 0z^{-2} + z^{-3} + 0z^{-4} + \dots = z^{-3}$$
Task
Write down the definition of δn−m where m is any positive integer and obtain its
z-transform.
Your solution
Answer

$$\delta_{n-m} = \begin{cases} 1 & n = m \\ 0 & \text{otherwise} \end{cases} \qquad Z\{\delta_{n-m}\} = z^{-m}$$
Key Point 3
Z{δn−m } = z −m m = 0, 1, 2, . . .
Unit step sequence

As we saw earlier in this Workbook the unit step sequence is

$$u_n = \begin{cases} 1 & n = 0, 1, 2, \dots \\ 0 & n = -1, -2, -3, \dots \end{cases}$$
Then, by the definition (1)
$$Z\{u_n\} = 1 + 1z^{-1} + 1z^{-2} + \dots$$
The infinite series here is a geometric series (with a constant ratio $z^{-1}$ between successive terms).
Hence the sum of the first N terms is
$$S_N = 1 + z^{-1} + \dots + z^{-(N-1)} = \frac{1 - z^{-N}}{1 - z^{-1}}$$
As $N \to \infty$, $S_N \to \dfrac{1}{1 - z^{-1}}$ provided $|z^{-1}| < 1$.
Hence, in what is called the closed form of this z-transform we have the result given in the following
Key Point:
Key Point 4
$$Z\{u_n\} = \frac{1}{1 - z^{-1}} = \frac{z}{z - 1} \equiv U(z) \text{ say}, \qquad |z^{-1}| < 1$$
The restriction that this result is only valid if $|z^{-1}| < 1$ or, equivalently, $|z| > 1$ means that the complex quantity z must lie outside the circle of unit radius centred on the origin in an Argand diagram. This restriction is not too significant in elementary applications of the z-transform.
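The role of the condition |z| > 1 can be seen by computing the partial sums $S_N$ directly. This is a hedged sketch only; the function names and the two test values of z are assumptions chosen for illustration.

```python
def step_partial_sum(z, n_terms):
    """S_N = 1 + z^-1 + ... + z^-(N-1) for the unit step sequence."""
    return sum(z ** (-n) for n in range(n_terms))


def step_closed_form(z):
    """U(z) = z / (z - 1), valid only for |z| > 1."""
    return z / (z - 1)


outside = 1.2 + 0.9j  # |z| = 1.5 > 1: the partial sums settle down
inside = 0.5 + 0.0j   # |z| = 0.5 < 1: the partial sums grow without bound

for n_terms in (10, 50, 200):
    gap = abs(step_partial_sum(outside, n_terms) - step_closed_form(outside))
    size = abs(step_partial_sum(inside, n_terms))
    print(n_terms, gap, size)
```

For the point outside the unit circle the gap to z/(z − 1) shrinks as N grows; for the point inside it the partial sums have no limit, so U(z) is not defined there by the series.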
The geometric sequence $\{a^n\}$
Task
For any arbitrary constant a obtain the z-transform of the causal sequence

$$f_n = \begin{cases} 0 & n = -1, -2, -3, \dots \\ a^n & n = 0, 1, 2, 3, \dots \end{cases}$$
Your solution
Answer
We have, by the definition in Key Point 1,
$$F(z) = Z\{f_n\} = 1 + az^{-1} + a^2 z^{-2} + \dots$$
which is a geometric series with common ratio $az^{-1}$. Hence, provided $|az^{-1}| < 1$, the closed form of the z-transform is
$$F(z) = \frac{1}{1 - az^{-1}} = \frac{z}{z - a}.$$
The z-transform of this sequence $\{a^n\}$, which is itself a geometric sequence, is summarized in Key Point 5.
Key Point 5
$$Z\{a^n\} = \frac{1}{1 - az^{-1}} = \frac{z}{z - a} \qquad |z| > |a|.$$
Notice that if a = 1 we recover the result for the z-transform of the unit step sequence.
Task
Use Key Point 5 to write down the z-transform of the following causal sequences
(a) $2^n$
(b) $(-1)^n$, the unit alternating sequence
(c) $e^{-n}$
(d) $e^{-\alpha n}$ where α is a constant.
Your solution
Answer
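(a) Using $a = 2$: $\quad Z\{2^n\} = \dfrac{1}{1 - 2z^{-1}} = \dfrac{z}{z - 2} \qquad |z| > 2$
(b) Using $a = -1$: $\quad Z\{(-1)^n\} = \dfrac{1}{1 + z^{-1}} = \dfrac{z}{z + 1} \qquad |z| > 1$
(c) Using $a = e^{-1}$: $\quad Z\{e^{-n}\} = \dfrac{z}{z - e^{-1}} \qquad |z| > e^{-1}$
(d) Using $a = e^{-\alpha}$: $\quad Z\{e^{-\alpha n}\} = \dfrac{z}{z - e^{-\alpha}} \qquad |z| > e^{-\alpha}$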
The basic z-transforms obtained have all been straightforwardly found from the definition in Key Point
1. To obtain further useful results we need a knowledge of some of the properties of z-transforms.
3. Linearity property and applications
Linearity property
This simple property states that if {vn } and {wn } have z-transforms V (z) and W (z) respectively
then
Z{avn + bwn } = aV (z) + bW (z)
for any constants a and b.
(In particular if a = b = 1 this property tells us that adding sequences corresponds to adding their
z-transforms).
The proof of the linearity property is straightforward using obvious properties of the summation
operation. By the z-transform definition:
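$$\begin{aligned}
Z\{av_n + bw_n\} &= \sum_{n=0}^{\infty} (av_n + bw_n) z^{-n} \\
&= \sum_{n=0}^{\infty} (av_n z^{-n} + bw_n z^{-n}) \\
&= a \sum_{n=0}^{\infty} v_n z^{-n} + b \sum_{n=0}^{\infty} w_n z^{-n} \\
&= aV(z) + bW(z)
\end{aligned}$$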
We can now use the linearity property and the exponential sequence {e−αn } to obtain the z-transforms
of hyperbolic and of trigonometric sequences relatively easily. For example,
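$$\sinh n = \frac{e^n - e^{-n}}{2}$$
Hence, by the linearity property,
$$\begin{aligned}
Z\{\sinh n\} &= \frac{1}{2} Z\{e^n\} - \frac{1}{2} Z\{e^{-n}\} \\
&= \frac{1}{2}\left(\frac{z}{z - e} - \frac{z}{z - e^{-1}}\right) \\
&= \frac{z}{2}\left(\frac{z - e^{-1} - (z - e)}{z^2 - (e + e^{-1})z + 1}\right) \\
&= \frac{z}{2}\left(\frac{e - e^{-1}}{z^2 - (2\cosh 1)z + 1}\right) \\
&= \frac{z \sinh 1}{z^2 - 2z\cosh 1 + 1}
\end{aligned}$$
Using αn instead of n in this calculation, where α is a constant, we obtain
$$Z\{\sinh \alpha n\} = \frac{z \sinh\alpha}{z^2 - 2z\cosh\alpha + 1}$$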
Task
Using $\cosh \alpha n \equiv \dfrac{e^{\alpha n} + e^{-\alpha n}}{2}$ obtain the z-transform of the sequence $\{\cosh \alpha n\} = \{1, \cosh\alpha, \cosh 2\alpha, \dots\}$
Your solution
Answer
We have, by linearity,
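$$\begin{aligned}
Z\{\cosh \alpha n\} &= \frac{1}{2} Z\{e^{\alpha n}\} + \frac{1}{2} Z\{e^{-\alpha n}\} \\
&= \frac{z}{2}\left(\frac{1}{z - e^{\alpha}} + \frac{1}{z - e^{-\alpha}}\right) \\
&= \frac{z}{2}\left(\frac{2z - (e^{\alpha} + e^{-\alpha})}{z^2 - 2z\cosh\alpha + 1}\right) \\
&= \frac{z^2 - z\cosh\alpha}{z^2 - 2z\cosh\alpha + 1}
\end{aligned}$$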
Trigonometric sequences
If we use the result
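$$Z\{a^n\} = \frac{z}{z - a} \qquad |z| > |a|$$
with, respectively, $a = e^{i\omega}$ and $a = e^{-i\omega}$, where ω is a constant and i denotes $\sqrt{-1}$, we obtain
$$Z\{e^{i\omega n}\} = \frac{z}{z - e^{i\omega}} \qquad Z\{e^{-i\omega n}\} = \frac{z}{z - e^{-i\omega}}$$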
Hence, recalling from complex number theory that
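$$\cos x = \frac{e^{ix} + e^{-ix}}{2}$$
we can state, using the linearity property, that
$$\begin{aligned}
Z\{\cos \omega n\} &= \frac{1}{2} Z\{e^{i\omega n}\} + \frac{1}{2} Z\{e^{-i\omega n}\} \\
&= \frac{z}{2}\left(\frac{1}{z - e^{i\omega}} + \frac{1}{z - e^{-i\omega}}\right) \\
&= \frac{z}{2}\left(\frac{2z - (e^{i\omega} + e^{-i\omega})}{z^2 - (e^{i\omega} + e^{-i\omega})z + 1}\right) \\
&= \frac{z^2 - z\cos\omega}{z^2 - 2z\cos\omega + 1}
\end{aligned}$$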
(Note the similarity of the algebra here to that arising in the corresponding hyperbolic case. Note
also the similarity of the results for Z{cosh αn} and Z{cos ωn}.)
Task
By a similar procedure to that used above for Z{cos ωn} obtain Z{sin ωn}.
Your solution
Answer
We have
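$$Z\{\sin \omega n\} = \frac{1}{2i} Z\{e^{i\omega n}\} - \frac{1}{2i} Z\{e^{-i\omega n}\}$$
(Don't miss the i factor here!)
$$\begin{aligned}
\therefore\quad Z\{\sin \omega n\} &= \frac{z}{2i}\left(\frac{1}{z - e^{i\omega}} - \frac{1}{z - e^{-i\omega}}\right) \\
&= \frac{z}{2i}\left(\frac{-e^{-i\omega} + e^{i\omega}}{z^2 - 2z\cos\omega + 1}\right) \\
&= \frac{z \sin\omega}{z^2 - 2z\cos\omega + 1}
\end{aligned}$$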
Key Point 6
$$Z\{\cos \omega n\} = \frac{z^2 - z\cos\omega}{z^2 - 2z\cos\omega + 1}$$
$$Z\{\sin \omega n\} = \frac{z\sin\omega}{z^2 - 2z\cos\omega + 1}$$
Notice the same denominator in the two results in Key Point 6.
Key Point 7
$$Z\{\cosh \alpha n\} = \frac{z^2 - z\cosh\alpha}{z^2 - 2z\cosh\alpha + 1}$$
$$Z\{\sinh \alpha n\} = \frac{z\sinh\alpha}{z^2 - 2z\cosh\alpha + 1}$$
Again notice the denominators in Key Point 7. Compare these results with those for the two trigonometric sequences in Key Point 6.
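The closed forms for the trigonometric and hyperbolic sequences can be checked against truncated versions of the defining series. The sketch below is illustrative only: the helper names, the test values of ω and α, and the point z are assumptions, and z is chosen so that both |z| > 1 and |z| > e^α hold.

```python
import math


def partial_sum(term, z, n_terms=400):
    """Truncated z-transform: sum of term(n) * z^-n for n = 0 .. n_terms - 1."""
    return sum(term(n) * z ** (-n) for n in range(n_terms))


def cos_closed(w, z):
    return (z * z - z * math.cos(w)) / (z * z - 2 * z * math.cos(w) + 1)


def sin_closed(w, z):
    return z * math.sin(w) / (z * z - 2 * z * math.cos(w) + 1)


def cosh_closed(a, z):
    return (z * z - z * math.cosh(a)) / (z * z - 2 * z * math.cosh(a) + 1)


def sinh_closed(a, z):
    return z * math.sinh(a) / (z * z - 2 * z * math.cosh(a) + 1)


w, a = 0.7, 0.3   # arbitrary test values of omega and alpha
z = 1.5 + 0.5j    # |z| is about 1.58, larger than both 1 and e^0.3

errors = {
    "cos": abs(partial_sum(lambda n: math.cos(w * n), z) - cos_closed(w, z)),
    "sin": abs(partial_sum(lambda n: math.sin(w * n), z) - sin_closed(w, z)),
    "cosh": abs(partial_sum(lambda n: math.cosh(a * n), z) - cosh_closed(a, z)),
    "sinh": abs(partial_sum(lambda n: math.sinh(a * n), z) - sinh_closed(a, z)),
}
print(errors)
```

All four differences should be at the level of floating-point rounding.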
Task
Use Key Points 6 and 7 to write down the z-transforms of
(a) $\left\{\sin \dfrac{n}{2}\right\}$ (b) $\{\cos 3n\}$ (c) $\{\sinh 2n\}$ (d) $\{\cosh n\}$
Your solution
Answer
(a) $Z\left\{\sin \dfrac{n}{2}\right\} = \dfrac{z \sin\frac{1}{2}}{z^2 - 2z\cos\frac{1}{2} + 1}$
(b) $Z\{\cos 3n\} = \dfrac{z^2 - z\cos 3}{z^2 - 2z\cos 3 + 1}$
(c) $Z\{\sinh 2n\} = \dfrac{z \sinh 2}{z^2 - 2z\cosh 2 + 1}$
(d) $Z\{\cosh n\} = \dfrac{z^2 - z\cosh 1}{z^2 - 2z\cosh 1 + 1}$
Task
Use the results for Z{cos ωn} and Z{sin ωn} in Key Point 6 to obtain the z-
transforms of
(a) $\{\cos(n\pi)\}$ (b) $\left\{\sin \dfrac{n\pi}{2}\right\}$ (c) $\left\{\cos \dfrac{n\pi}{2}\right\}$
Write out the first few terms of each sequence.
Your solution
Answer
(a) With $\omega = \pi$
$$Z\{\cos n\pi\} = \frac{z^2 - z\cos\pi}{z^2 - 2z\cos\pi + 1} = \frac{z^2 + z}{z^2 + 2z + 1} = \frac{z}{z + 1}$$
$\{\cos n\pi\} = \{1, -1, 1, -1, \dots\} = \{(-1)^n\}$
We have re-derived the z-transform of the unit alternating sequence. (See part (b) of the Task following Key Point 5.)
(b) With $\omega = \dfrac{\pi}{2}$
$$Z\left\{\sin \frac{n\pi}{2}\right\} = \frac{z \sin\frac{\pi}{2}}{z^2 - 2z\cos\frac{\pi}{2} + 1} = \frac{z}{z^2 + 1}$$
where $\left\{\sin \dfrac{n\pi}{2}\right\} = \{0, 1, 0, -1, 0, \dots\}$
(c) With $\omega = \dfrac{\pi}{2}$
$$Z\left\{\cos \frac{n\pi}{2}\right\} = \frac{z^2 - z\cos\frac{\pi}{2}}{z^2 + 1} = \frac{z^2}{z^2 + 1}$$
where $\left\{\cos \dfrac{n\pi}{2}\right\} = \{1, 0, -1, 0, 1, \dots\}$
(These three results can also be readily obtained from the definition of the z-transform. Try!)
4. Further z -transform properties
We showed earlier that the results
Z{vn + wn } = V (z) + W (z) and similarly Z{vn − wn } = V (z) − W (z)
follow from the linearity property.
You should be clear that there is no comparable result for the product of two sequences.
Z{vn wn } is not equal to V (z)W (z)
For two specific products of sequences however we can derive useful results.
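Multiplication of a sequence by $a^n$
Suppose $f_n$ is an arbitrary sequence with z-transform $F(z)$.
Consider the sequence $\{v_n\}$ where
$$v_n = a^n f_n \quad \text{i.e.} \quad \{v_0, v_1, v_2, \dots\} = \{f_0, af_1, a^2 f_2, \dots\}$$
By the z-transform definition
$$\begin{aligned}
Z\{v_n\} &= v_0 + v_1 z^{-1} + v_2 z^{-2} + \dots \\
&= f_0 + a f_1 z^{-1} + a^2 f_2 z^{-2} + \dots \\
&= \sum_{n=0}^{\infty} a^n f_n z^{-n} \\
&= \sum_{n=0}^{\infty} f_n \left(\frac{z}{a}\right)^{-n}
\end{aligned}$$
But $F(z) = \displaystyle\sum_{n=0}^{\infty} f_n z^{-n}$
Thus we have shown that $Z\{a^n f_n\} = F\left(\dfrac{z}{a}\right)$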
Key Point 8
$$Z\{a^n f_n\} = F\left(\frac{z}{a}\right)$$
That is, multiplying a sequence $\{f_n\}$ by the sequence $\{a^n\}$ does not change the form of the z-transform $F(z)$. We merely replace z by $\dfrac{z}{a}$ in that transform.
For example, using Key Point 6 we have

$$Z\{\cos n\} = \frac{z^2 - z\cos 1}{z^2 - 2z\cos 1 + 1}$$
So, replacing z by $\dfrac{z}{1/2} = 2z$,
$$Z\left\{\left(\frac{1}{2}\right)^n \cos n\right\} = \frac{(2z)^2 - (2z)\cos 1}{(2z)^2 - 4z\cos 1 + 1}$$
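The rule of replacing z by z/a can be checked numerically for this example. In the hedged sketch below, the function names and the test point z are assumptions; the series is simply truncated after enough terms for the tail to be negligible.

```python
import math


def cos_transform(z):
    """Z{cos n}: Key Point 6 with omega = 1."""
    return (z * z - z * math.cos(1)) / (z * z - 2 * z * math.cos(1) + 1)


def scaled_partial_sum(a, z, n_terms=300):
    """Truncated defining series for Z{a^n cos n}."""
    return sum(a ** n * math.cos(n) * z ** (-n) for n in range(n_terms))


a = 0.5
z = 1.0 + 0.5j                   # any z with |z| > |a| will do
from_series = scaled_partial_sum(a, z)
from_rule = cos_transform(z / a)  # F(z/a), here F(2z)
print(abs(from_series - from_rule))
```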
Task
Using Key Point 8, write down the z-transform of the sequence {vn } where
vn = e−2n sin 3n
Your solution
Answer
We have $Z\{\sin 3n\} = \dfrac{z \sin 3}{z^2 - 2z\cos 3 + 1}$
so with $a = e^{-2}$ we replace z by $z e^{2}$ to obtain
$$Z\{v_n\} = Z\{e^{-2n} \sin 3n\} = \frac{ze^{2} \sin 3}{(ze^{2})^2 - 2ze^{2}\cos 3 + 1} = \frac{ze^{-2} \sin 3}{z^2 - 2ze^{-2}\cos 3 + e^{-4}}$$
Task
Using the property just discussed write down the z-transform of the sequence {wn }
where
wn = e−αn cos ωn
Your solution
Answer
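We have $Z\{\cos \omega n\} = \dfrac{z^2 - z\cos\omega}{z^2 - 2z\cos\omega + 1}$
So replacing z by $ze^{\alpha}$ we obtain
$$Z\{w_n\} = Z\{e^{-\alpha n} \cos \omega n\} = \frac{(ze^{\alpha})^2 - ze^{\alpha}\cos\omega}{(ze^{\alpha})^2 - 2ze^{\alpha}\cos\omega + 1} = \frac{z^2 - ze^{-\alpha}\cos\omega}{z^2 - 2ze^{-\alpha}\cos\omega + e^{-2\alpha}}$$
Key Point 9
$$Z\{e^{-\alpha n} \cos \omega n\} = \frac{z^2 - ze^{-\alpha}\cos\omega}{z^2 - 2ze^{-\alpha}\cos\omega + e^{-2\alpha}}$$
$$Z\{e^{-\alpha n} \sin \omega n\} = \frac{ze^{-\alpha}\sin\omega}{z^2 - 2ze^{-\alpha}\cos\omega + e^{-2\alpha}}$$
Note the same denominator in each case.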
Multiplication of a sequence by n
An important sequence whose z-transform we have not yet obtained is the unit ramp sequence {rn }:

$$r_n = \begin{cases} 0 & n = -1, -2, -3, \dots \\ n & n = 0, 1, 2, \dots \end{cases}$$
Figure 5: the unit ramp sequence $r_n$ plotted against $n$
Figure 5 clearly suggests the nomenclature ‘ramp’.
We shall attempt to obtain the z-transform of $\{r_n\}$ from the definition:
$$Z\{r_n\} = 0 + 1z^{-1} + 2z^{-2} + 3z^{-3} + \dots$$
This is not a geometric series but we can write
$$z^{-1} + 2z^{-2} + 3z^{-3} + \dots = z^{-1}(1 + 2z^{-1} + 3z^{-2} + \dots) = z^{-1}(1 - z^{-1})^{-2} \qquad |z^{-1}| < 1$$
where we have used the binomial theorem (see 16.3).
Hence
$$Z\{r_n\} = Z\{n\} = \frac{1}{z\left(1 - \frac{1}{z}\right)^2} = \frac{z}{(z - 1)^2} \qquad |z| > 1$$
Key Point 10
The z-transform of the unit ramp sequence is
$$Z\{r_n\} = \frac{z}{(z - 1)^2} = R(z) \text{ (say)}$$
Recall now that the unit step sequence has z-transform $Z\{u_n\} = \dfrac{z}{z - 1} = U(z)$ (say), which is the subject of the next Task.
Task
Obtain the derivative of $U(z) = \dfrac{z}{z - 1}$ with respect to z.
Your solution
Answer
We have, using the quotient rule of differentiation:

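$$\frac{dU}{dz} = \frac{d}{dz}\left(\frac{z}{z - 1}\right) = \frac{(z - 1)\cdot 1 - (z)(1)}{(z - 1)^2} = \frac{-1}{(z - 1)^2}$$
We also know that
$$R(z) = \frac{z}{(z - 1)^2} = (-z)\left(-\frac{1}{(z - 1)^2}\right) = -z\frac{dU}{dz} \qquad (3)$$
Also, if we compare the sequences
$u_n = \{0, 0, 1, 1, 1, 1, \dots\}$
$r_n = \{0, 0, 0, 1, 2, 3, \dots\}$
(in each list the third entry is the n = 0 term)
we see that $r_n = n\,u_n$, (4)
so from (3) and (4) we conclude that $Z\{n\,u_n\} = -z\dfrac{dU}{dz}$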
Now let us consider the problem more generally.
Let fn be an arbitrary sequence with z-transform F (z):
$$F(z) = f_0 + f_1 z^{-1} + f_2 z^{-2} + f_3 z^{-3} + \dots = \sum_{n=0}^{\infty} f_n z^{-n}$$
We differentiate both sides with respect to the variable z, doing this term-by-term on the right-hand
side. Thus
$$\begin{aligned}
\frac{dF}{dz} &= -f_1 z^{-2} - 2f_2 z^{-3} - 3f_3 z^{-4} - \dots = \sum_{n=1}^{\infty} (-n) f_n z^{-n-1} \\
&= -z^{-1}(f_1 z^{-1} + 2f_2 z^{-2} + 3f_3 z^{-3} + \dots) = -z^{-1} \sum_{n=1}^{\infty} n f_n z^{-n}
\end{aligned}$$
But the bracketed term is the z-transform of the sequence
$$\{n f_n\} = \{0, f_1, 2f_2, 3f_3, \dots\}$$
Thus if $F(z) = Z\{f_n\}$ we have shown that
$$\frac{dF}{dz} = -z^{-1} Z\{n f_n\} \quad \text{or} \quad Z\{n f_n\} = -z\frac{dF}{dz}$$
We have already (equations (3) and (4) above) demonstrated this result for the case fn = un .
Key Point 11
If $Z\{f_n\} = F(z)$ then $Z\{n f_n\} = -z\dfrac{dF}{dz}$
Task
By differentiating the z-transform R(z) of the unit ramp sequence obtain the z-transform of the causal sequence $\{n^2\}$.
Your solution
Answer
We have
$$Z\{n\} = \frac{z}{(z - 1)^2}$$
so
$$Z\{n^2\} = Z\{n \cdot n\} = -z\frac{d}{dz}\left(\frac{z}{(z - 1)^2}\right)$$
By the quotient rule
$$\frac{d}{dz}\left(\frac{z}{(z - 1)^2}\right) = \frac{(z - 1)^2 - (z)(2)(z - 1)}{(z - 1)^4} = \frac{z - 1 - 2z}{(z - 1)^3} = \frac{-1 - z}{(z - 1)^3}$$
Multiplying by −z we obtain
$$Z\{n^2\} = \frac{z + z^2}{(z - 1)^3} = \frac{z(1 + z)}{(z - 1)^3}$$
Clearly this process can be continued to obtain the transforms of $\{n^3\}, \{n^4\}, \dots$ etc.
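The rule Z{n f_n} = −z dF/dz can be tried out numerically on the ramp transform. This is a rough sketch under stated assumptions: the derivative is approximated by a central difference with an arbitrarily chosen step h, and the function names and test point are illustrative.

```python
def ramp_transform(z):
    """R(z) = z / (z - 1)^2, the transform of the unit ramp {n}."""
    return z / (z - 1) ** 2


def minus_z_derivative(F, z, h=1e-5):
    """Approximate -z dF/dz by a central difference (h is an assumed step)."""
    return -z * (F(z + h) - F(z - h)) / (2 * h)


def n_squared_closed(z):
    """z (z + 1) / (z - 1)^3, the result derived for {n^2}."""
    return z * (z + 1) / (z - 1) ** 3


def n_squared_partial_sum(z, n_terms=400):
    """Truncated defining series for Z{n^2}."""
    return sum(n * n * z ** (-n) for n in range(n_terms))


z = 2.0 + 1.0j  # arbitrary point with |z| > 1
print(abs(minus_z_derivative(ramp_transform, z) - n_squared_closed(z)))
print(abs(n_squared_partial_sum(z) - n_squared_closed(z)))
```

The first difference is limited by the finite-difference approximation, the second only by rounding.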
5. Shifting properties of the z-transform

In this subsection we consider perhaps the most important properties of the z-transform. These
properties relate the z-transform Y (z) of a sequence {yn } to the z-transforms of
(i) right shifted or delayed sequences {yn−1 }{yn−2 } etc.
(ii) left shifted or advanced sequences {yn+1 }, {yn+2 } etc.
The results obtained, formally called shift theorems, are vital in enabling us to solve certain types of
difference equation and are also invaluable in the analysis of digital systems of various types.
Right shift theorems

Let $\{v_n\} = \{y_{n-1}\}$, i.e. the terms of the sequence $\{v_n\}$ are the same as those of $\{y_n\}$ but shifted one place to the right. The z-transforms are, by definition,
$$Y(z) = y_0 + y_1 z^{-1} + y_2 z^{-2} + y_3 z^{-3} + \dots$$
$$\begin{aligned}
V(z) &= v_0 + v_1 z^{-1} + v_2 z^{-2} + v_3 z^{-3} + \dots \\
&= y_{-1} + y_0 z^{-1} + y_1 z^{-2} + y_2 z^{-3} + \dots \\
&= y_{-1} + z^{-1}(y_0 + y_1 z^{-1} + y_2 z^{-2} + \dots)
\end{aligned}$$
i.e.
$$V(z) = Z\{y_{n-1}\} = y_{-1} + z^{-1} Y(z)$$
Task
Obtain the z-transform of the sequence {wn } = {yn−2 } using the method illus-
trated above.
Your solution
Answer
The z-transform of $\{w_n\}$ is $W(z) = w_0 + w_1 z^{-1} + w_2 z^{-2} + w_3 z^{-3} + \dots$ or, since $w_n = y_{n-2}$,
$$\begin{aligned}
W(z) &= y_{-2} + y_{-1} z^{-1} + y_0 z^{-2} + y_1 z^{-3} + \dots \\
&= y_{-2} + y_{-1} z^{-1} + z^{-2}(y_0 + y_1 z^{-1} + \dots)
\end{aligned}$$
i.e. $W(z) = Z\{y_{n-2}\} = y_{-2} + y_{-1} z^{-1} + z^{-2} Y(z)$
Clearly, we could proceed in a similar way to obtain a general result for $Z\{y_{n-m}\}$ where m is any positive integer. The result is
$$Z\{y_{n-m}\} = y_{-m} + y_{-m+1} z^{-1} + \dots + y_{-1} z^{-m+1} + z^{-m} Y(z)$$
For the particular case of causal sequences (where $y_{-1} = y_{-2} = \dots = 0$) these results are particularly simple:
$$\left.\begin{aligned}
Z\{y_{n-1}\} &= z^{-1} Y(z) \\
Z\{y_{n-2}\} &= z^{-2} Y(z) \\
Z\{y_{n-m}\} &= z^{-m} Y(z)
\end{aligned}\right\} \quad \text{(causal sequences only)}$$
You may recall from earlier in this Workbook that in a digital system we represented the right shift
operation symbolically in the following way:
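Figure 6: block diagram in which $\{y_n\}$ passes through a $z^{-1}$ block to give $\{y_{n-1}\}$, which passes through a second $z^{-1}$ block to give $\{y_{n-2}\}$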
The significance of the z −1 factor inside the rectangles should now be clearer. If we replace the
‘input’ and ‘output’ sequences by their z-transforms:
$$Z\{y_n\} = Y(z) \qquad Z\{y_{n-1}\} = z^{-1} Y(z)$$
it is evident that in the z-transform 'domain' the shift becomes a multiplication by the factor $z^{-1}$.
N.B. This discussion applies strictly only to causal sequences.
Notational point:
A causal sequence is sometimes written as $y_n u_n$ where $u_n$ is the unit step sequence
$$u_n = \begin{cases} 0 & n = -1, -2, \dots \\ 1 & n = 0, 1, 2, \dots \end{cases}$$
The right shift theorem is then written, for a causal sequence,
$$Z\{y_{n-m} u_{n-m}\} = z^{-m} Y(z)$$
Examples
Recall that the z-transform of the causal sequence $\{a^n\}$ is $\dfrac{z}{z - a}$. It follows, from the right shift theorems, that
(i) $Z\{a^{n-1}\} = Z\{0, 1, a, a^2, \dots\} = \dfrac{z z^{-1}}{z - a} = \dfrac{1}{z - a}$
(ii) $Z\{a^{n-2}\} = Z\{0, 0, 1, a, a^2, \dots\} = \dfrac{z^{-1}}{z - a} = \dfrac{1}{z(z - a)}$
(In each list the first entry is the n = 0 term.)
Task
Write the following sequence fn as a difference of two unit step sequences. Hence
obtain its z-transform.
[Figure: plot of $f_n$ against $n$ for $n = 0, 1, \dots, 7$]
Your solution
Answer
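Since $\{u_n\} = \begin{cases} 1 & n = 0, 1, 2, \dots \\ 0 & n = -1, -2, \dots \end{cases}$
and $\{u_{n-5}\} = \begin{cases} 1 & n = 5, 6, 7, \dots \\ 0 & \text{otherwise} \end{cases}$
it follows that
$$f_n = u_n - u_{n-5}$$
Hence $F(z) = \dfrac{z}{z - 1} - \dfrac{z^{-5} z}{z - 1} = \dfrac{z - z^{-4}}{z - 1}$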
Left shift theorems

Recall that the sequences $\{y_{n+1}\}, \{y_{n+2}\}, \dots$ denote the sequences obtained by shifting the sequence $\{y_n\}$ by 1, 2, . . . units to the left respectively. Thus, since $Y(z) = Z\{y_n\} = y_0 + y_1 z^{-1} + y_2 z^{-2} + \dots$ then
$$\begin{aligned}
Z\{y_{n+1}\} &= y_1 + y_2 z^{-1} + y_3 z^{-2} + \dots \\
&= y_1 + z(y_2 z^{-2} + y_3 z^{-3} + \dots)
\end{aligned}$$
The term in brackets is the z-transform of the unshifted sequence $\{y_n\}$ apart from its first two terms: thus
$$Z\{y_{n+1}\} = y_1 + z(Y(z) - y_0 - y_1 z^{-1})$$
$$\therefore\quad Z\{y_{n+1}\} = zY(z) - zy_0$$
Task
Obtain the z-transform of the sequence $\{y_{n+2}\}$ using the method illustrated above.
Your solution
Answer
$$\begin{aligned}
Z\{y_{n+2}\} &= y_2 + y_3 z^{-1} + y_4 z^{-2} + \dots \\
&= y_2 + z^2(y_3 z^{-3} + y_4 z^{-4} + \dots) \\
&= y_2 + z^2(Y(z) - y_0 - y_1 z^{-1} - y_2 z^{-2})
\end{aligned}$$
$$\therefore\quad Z\{y_{n+2}\} = z^2 Y(z) - z^2 y_0 - zy_1$$
These left shift theorems have simple forms in special cases:
if $y_0 = 0$: $\quad Z\{y_{n+1}\} = zY(z)$
if $y_0 = y_1 = 0$: $\quad Z\{y_{n+2}\} = z^2 Y(z)$
if $y_0 = y_1 = \dots = y_{m-1} = 0$: $\quad Z\{y_{n+m}\} = z^m Y(z)$
Key Point 12
The right shift theorems or delay theorems are:
$$\begin{aligned}
Z\{y_{n-1}\} &= y_{-1} + z^{-1} Y(z) \\
Z\{y_{n-2}\} &= y_{-2} + y_{-1} z^{-1} + z^{-2} Y(z) \\
&\ \,\vdots \\
Z\{y_{n-m}\} &= y_{-m} + y_{-m+1} z^{-1} + \dots + y_{-1} z^{-m+1} + z^{-m} Y(z)
\end{aligned}$$
The left shift theorems or advance theorems are:
$$\begin{aligned}
Z\{y_{n+1}\} &= zY(z) - zy_0 \\
Z\{y_{n+2}\} &= z^2 Y(z) - z^2 y_0 - zy_1 \\
&\ \,\vdots \\
Z\{y_{n+m}\} &= z^m Y(z) - z^m y_0 - z^{m-1} y_1 - \dots - zy_{m-1}
\end{aligned}$$
Note carefully the occurrence of positive powers of z in the left shift theorems and of negative powers of z in the right shift theorems.
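The shift theorems for m = 2 can be checked numerically on a sequence that is non-zero for negative n, so that the extra terms matter. The sketch below is illustrative only: the sequence `y`, the helper `transform` and the test point z are assumptions, and the series is truncated.

```python
def y(n):
    """Illustrative two-sided sequence (an assumption): y_n = (n + 3) * 0.5^n."""
    return (n + 3) * 0.5 ** n


def transform(seq, z, n_terms=300):
    """Truncated z-transform using only the terms n = 0, 1, 2, ..."""
    return sum(seq(n) * z ** (-n) for n in range(n_terms))


z = 1.5 - 0.5j
Y = transform(y, z)

# Right shift by 2: Z{y_(n-2)} = y_(-2) + y_(-1) z^-1 + z^-2 Y(z)
right_direct = transform(lambda n: y(n - 2), z)
right_theorem = y(-2) + y(-1) / z + Y / z ** 2

# Left shift by 2: Z{y_(n+2)} = z^2 Y(z) - z^2 y_0 - z y_1
left_direct = transform(lambda n: y(n + 2), z)
left_theorem = z ** 2 * Y - z ** 2 * y(0) - z * y(1)

print(abs(right_direct - right_theorem), abs(left_direct - left_theorem))
```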
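Table 1: z-transforms
| $f_n$ | $F(z)$ | Name |
| $\delta_n$ | $1$ | unit impulse |
| $\delta_{n-m}$ | $z^{-m}$ | |
| $u_n$ | $\dfrac{z}{z - 1}$ | unit step sequence |
| $a^n$ | $\dfrac{z}{z - a}$ | geometric sequence |
| $e^{\alpha n}$ | $\dfrac{z}{z - e^{\alpha}}$ | |
| $\sinh \alpha n$ | $\dfrac{z \sinh\alpha}{z^2 - 2z\cosh\alpha + 1}$ | |
| $\cosh \alpha n$ | $\dfrac{z^2 - z\cosh\alpha}{z^2 - 2z\cosh\alpha + 1}$ | |
| $\sin \omega n$ | $\dfrac{z \sin\omega}{z^2 - 2z\cos\omega + 1}$ | |
| $\cos \omega n$ | $\dfrac{z^2 - z\cos\omega}{z^2 - 2z\cos\omega + 1}$ | |
| $e^{-\alpha n} \sin \omega n$ | $\dfrac{ze^{-\alpha} \sin\omega}{z^2 - 2ze^{-\alpha}\cos\omega + e^{-2\alpha}}$ | |
| $e^{-\alpha n} \cos \omega n$ | $\dfrac{z^2 - ze^{-\alpha}\cos\omega}{z^2 - 2ze^{-\alpha}\cos\omega + e^{-2\alpha}}$ | |
| $n$ | $\dfrac{z}{(z - 1)^2}$ | ramp sequence |
| $n^2$ | $\dfrac{z(z + 1)}{(z - 1)^3}$ | |
| $n^3$ | $\dfrac{z(z^2 + 4z + 1)}{(z - 1)^4}$ | |
| $a^n f_n$ | $F\left(\dfrac{z}{a}\right)$ | |
| $n f_n$ | $-z\dfrac{dF}{dz}$ | |
This table has been copied to the back of this Workbook (page 96) for convenience.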
z-Transforms and
Difference Equations 21.3
Introduction
In this Section we apply z-transforms to the solution of certain types of difference equation. We shall see that this is done by turning the difference equation into an ordinary algebraic equation. We investigate both first and second order difference equations.
A key aspect in this process is the inversion of the z-transform. As well as demonstrating the use of partial fractions for this purpose we show an alternative, often easier, method using what are known as residues.

Prerequisites
• have studied carefully Section 21.2
• be familiar with simple partial fractions
Learning Outcomes
• invert z-transforms using partial fractions or residues where appropriate
• solve constant coefficient linear difference equations using z-transforms
1. Solution of difference equations using z-transforms

Using z-transforms, in particular the shift theorems discussed at the end of the previous Section,
provides a useful method of solving certain types of difference equation. In particular linear constant
coefficient difference equations are amenable to the z-transform technique although certain other
types can also be tackled. In fact all the difference equations that we looked at in Section 21.1 were
linear:
$$y_{n+1} = y_n + d \qquad \text{(1st order)}$$
$$y_{n+1} = A\,y_n \qquad \text{(1st order)}$$
$$y_{n+2} = y_{n+1} + y_n \qquad \text{(2nd order)}$$
Other examples of linear difference equations are
$$y_{n+2} + 4y_{n+1} - 3y_n = n^2 \qquad \text{(2nd order)}$$
$$y_{n+1} + y_n = n\,3^n \qquad \text{(1st order)}$$
The key point is that for a difference equation to be classified as linear the terms of the sequence $\{y_n\}$ arise only to power 1 or, more precisely, the highest subscript term is obtainable as a linear combination of the lower ones. All the examples cited above are consequently linear. Note carefully that the term $n^2$ in our fourth example does not imply non-linearity since linearity is determined by the $y_n$ terms.
Examples of non-linear difference equations are
$$y_{n+1} = \sqrt{y_n + 1}$$
$$y_{n+1}^2 + 2y_n = 3$$
$$y_{n+1}\, y_n = n$$
$$\cos(y_{n+1}) = y_n$$
We shall not consider the problem of solving non-linear difference equations.

The five linear equations listed above also have constant coefficients; for example:
$$y_{n+2} + 4y_{n+1} - 3y_n = n^2$$
has the constant coefficients 1, 4, −3.
The (linear) difference equation
$$n\,y_{n+2} - y_{n+1} + y_n = 0$$
has one variable coefficient, namely n, and so is not classified as a constant coefficient difference equation.
Solution of first order linear constant coefficient difference equations

Consider the first order difference equation
$$y_{n+1} - 3y_n = 4 \qquad n = 0, 1, 2, \dots$$
The equation could be solved in a step-by-step or recursive manner, provided that $y_0$ is known, because
$$y_1 = 4 + 3y_0 \qquad y_2 = 4 + 3y_1 \qquad y_3 = 4 + 3y_2 \qquad \text{and so on.}$$
This process will certainly produce the terms of the solution sequence $\{y_n\}$ but the general term $y_n$ may not be obvious.
So consider
$$y_{n+1} - 3y_n = 4 \qquad n = 0, 1, 2, \dots \qquad (1)$$
with initial condition $y_0 = 1$.
We multiply both sides of (1) by $z^{-n}$ and sum each side over n = 0, 1, 2, . . . We obtain
$$\sum_{n=0}^{\infty} (y_{n+1} - 3y_n) z^{-n} = \sum_{n=0}^{\infty} 4z^{-n}$$
or
$$\sum_{n=0}^{\infty} y_{n+1} z^{-n} - 3\sum_{n=0}^{\infty} y_n z^{-n} = 4\sum_{n=0}^{\infty} z^{-n} \qquad (2)$$
The three terms in (2) are clearly recognisable as z-transforms.
The right-hand side is the z-transform of the constant sequence {4, 4, . . .} which is $\dfrac{4z}{z - 1}$.
If $Y(z) = \displaystyle\sum_{n=0}^{\infty} y_n z^{-n}$ denotes the z-transform of the sequence $\{y_n\}$ that we are seeking then
$$\sum_{n=0}^{\infty} y_{n+1} z^{-n} = zY(z) - zy_0 \quad \text{(by the left shift theorem).}$$
Consequently (2) can be written
$$zY(z) - zy_0 - 3Y(z) = \frac{4z}{z - 1} \qquad (3)$$
Equation (3) is the z-transform of the original difference equation (1). The intervening steps have
been included here for explanation purposes but we shall omit them in future. The important point
is that (3) is no longer a difference equation. It is an algebraic equation where the unknown, Y (z),
is the z-transform of the solution sequence {yn }.
We now insert the initial condition y0 = 1 and solve (3) for Y (z):
$$(z - 3)Y(z) - z = \frac{4z}{z - 1}$$
$$(z - 3)Y(z) = \frac{4z}{z - 1} + z = \frac{z^2 + 3z}{z - 1}$$
so
$$Y(z) = \frac{z^2 + 3z}{(z - 1)(z - 3)} \qquad (4)$$
The final step consists of obtaining the sequence $\{y_n\}$ of which (4) is the z-transform. As it stands (4) is not recognizable as any of the standard transforms that we have obtained. Consequently, one method of 'inverting' (4) is to use a partial fraction expansion. (We assume that you are familiar with simple partial fractions. See 3.6.)
Thus
$$\begin{aligned}
Y(z) &= z\,\frac{(z + 3)}{(z - 1)(z - 3)} \\
&= z\left(\frac{-2}{z - 1} + \frac{3}{z - 3}\right) \quad \text{(in partial fractions)}
\end{aligned}$$
so
$$Y(z) = \frac{-2z}{z - 1} + \frac{3z}{z - 3}$$
Now, taking inverse z-transforms, the general term $y_n$ is, using the linearity property,
$$y_n = -2Z^{-1}\left\{\frac{z}{z - 1}\right\} + 3Z^{-1}\left\{\frac{z}{z - 3}\right\}$$
The symbolic notation $Z^{-1}$ is common and is short for 'the inverse z-transform of'.
Task
Using standard z-transforms write down yn explicitly, where
$$y_n = -2Z^{-1}\left\{\frac{z}{z - 1}\right\} + 3Z^{-1}\left\{\frac{z}{z - 3}\right\}$$
Your solution
Answer
$$y_n = -2 + 3 \times 3^n = -2 + 3^{n+1} \qquad n = 0, 1, 2, \dots \qquad (5)$$
Checking the solution:
From this solution (5)
$$y_n = -2 + 3^{n+1}$$
we easily obtain
$y_0 = -2 + 3 = 1$ (as given)
$y_1 = -2 + 3^2 = 7$
$y_2 = -2 + 3^3 = 25$
$y_3 = -2 + 3^4 = 79$ etc.
These agree with those obtained by recursive solution of the given problem (1):
$$y_{n+1} - 3y_n = 4 \qquad y_0 = 1$$
which yields
$y_1 = 4 + 3y_0 = 7$
$y_2 = 4 + 3y_1 = 25$
$y_3 = 4 + 3y_2 = 79$ etc.
More conclusively we can put the solution (5) back into the left-hand side of the difference equation (1).
If $y_n = -2 + 3^{n+1}$
then $3y_n = -6 + 3^{n+2}$
and $y_{n+1} = -2 + 3^{n+2}$
So, on the left-hand side of (1),
$$y_{n+1} - 3y_n = -2 + 3^{n+2} - (-6 + 3^{n+2})$$
which does indeed equal 4, the given right-hand side, and so the solution has been verified.
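The same check can be automated. The following is a small sketch, with function names chosen here for illustration, that compares the recursive solution of (1) with the closed form (5) using exact integer arithmetic.

```python
def solve_by_recursion(y0, n_max):
    """Generate y_0 .. y_n_max from y_(n+1) = 4 + 3 y_n."""
    ys = [y0]
    for _ in range(n_max):
        ys.append(4 + 3 * ys[-1])
    return ys


def closed_form(n):
    """Solution (5): y_n = -2 + 3^(n+1)."""
    return -2 + 3 ** (n + 1)


recursive = solve_by_recursion(1, 10)
formula = [closed_form(n) for n in range(11)]
print(recursive[:4])         # [1, 7, 25, 79]
print(recursive == formula)  # True
```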
Key Point 13
To solve a linear constant coefficient difference equation, three steps are involved:
1. Replace each term in the difference equation by its z-transform and insert the initial condition(s).
2. Solve the resulting algebraic equation. (This gives the z-transform Y(z) of the solution sequence.)
3. Find the inverse z-transform of Y(z).
The third step is usually the most difficult. We will consider the problem of finding inverse z-transforms more fully later.
Task
Solve the difference equation
yn+1 − yn = d n = 0, 1, 2, . . . y0 = a (6)
where a and d are constants.
(The solution will give the n th term of an arithmetic sequence with a constant
difference d and initial term a.)
Start by replacing each term of (6) by its z-transform:
Your solution
Answer
If $Y(z) = Z\{y_n\}$ we obtain the algebraic equation
$$zY(z) - zy_0 - Y(z) = \frac{d \times z}{z - 1}$$
Note that the right-hand side transform is that of a constant sequence {d, d, . . .}. Note also the use of the left shift theorem.
Now insert the initial condition $y_0 = a$ and then solve for Y(z):
Your solution
Answer
$$(z - 1)Y(z) = \frac{d \times z}{z - 1} + z \times a$$
$$Y(z) = \frac{d \times z}{(z - 1)^2} + \frac{a \times z}{z - 1}$$
Finally take the inverse z-transform of the right-hand side. [Hint: Recall the z-transform of the ramp
sequence {n}.]
Your solution
Answer
We have
$$y_n = d \times Z^{-1}\left\{\frac{z}{(z - 1)^2}\right\} + a \times Z^{-1}\left\{\frac{z}{z - 1}\right\}$$
$$\therefore\quad y_n = dn + a \qquad n = 0, 1, 2, \dots \qquad (7)$$
using the known z-transforms of the ramp and unit step sequences. Equation (7) may well be a familiar result to you: an arithmetic sequence whose 'zeroth' term is $y_0 = a$ has general term $y_n = a + nd$.
i.e. $\{y_n\} = \{a, a + d, \dots, a + nd, \dots\}$
This solution is of course readily obtained by direct recursive solution of (6) without need for z-transforms. In this case the general term (a + nd) is readily seen from the form of the recursive solution. (Make sure you really do see it.)
N.B. If the term a is labelled as the first term (rather than the zeroth) then
$$y_1 = a, \quad y_2 = a + d, \quad y_3 = a + 2d,$$
so in this case the n th term is
$$y_n = a + (n - 1)d$$
rather than (7).
Use of the right shift theorem in solving difference equations

The problem just solved was given by (6), i.e.
$$y_{n+1} - y_n = d \quad \text{with } y_0 = a \qquad n = 0, 1, 2, \dots$$
We obtained the solution
$$y_n = a + nd \qquad n = 0, 1, 2, \dots$$
Now consider the problem
$$y_n - y_{n-1} = d \qquad n = 0, 1, 2, \dots \qquad (8)$$
with $y_{-1} = a$.
The only difference between the two problems is that the 'initial condition' in (8) is given at n = −1 rather than at n = 0. Writing out the first few terms should make this clear.
$$\begin{array}{ll}
\text{(6)} & \text{(8)} \\
y_1 - y_0 = d & y_0 - y_{-1} = d \\
y_2 - y_1 = d & y_1 - y_0 = d \\
\quad\vdots & \quad\vdots \\
y_{n+1} - y_n = d & y_n - y_{n-1} = d \\
y_0 = a & y_{-1} = a
\end{array}$$
The solution to (8) must therefore be the same as for (6) but with every term in the solution (7) of (6) shifted 1 unit to the left.
Thus the solution to (8) is expected to be
$$y_n = a + (n + 1)d \qquad n = -1, 0, 1, 2, \dots$$
(replacing n by (n + 1) in the solution (7)).
Task
Use the right shift theorem of z-transforms to solve (8) with the initial condition $y_{-1} = a$.
(a) Begin by taking the z-transform of (8), inserting the initial condition and solving for Y(z):
Your solution
Answer
We have, for the z-transform of (8)
$$Y(z) - (z^{-1} Y(z) + y_{-1}) = \frac{dz}{z - 1} \qquad \text{[Note that here } dz \text{ means } d \times z\text{]}$$
$$Y(z)(1 - z^{-1}) - a = \frac{dz}{z - 1}$$
$$Y(z)\left(\frac{z - 1}{z}\right) = \frac{dz}{z - 1} + a$$
$$Y(z) = \frac{dz^2}{(z - 1)^2} + \frac{az}{z - 1} \qquad (9)$$
The second term of Y(z) has the inverse z-transform $\{a\,u_n\} = \{a, a, a, \dots\}$.
The first term is less straightforward. However, we have already reasoned that the other term in $y_n$ here should be (n + 1)d.
(b) Show that the z-transform of (n + 1)d is $\dfrac{dz^2}{(z - 1)^2}$. Use the standard transforms of the ramp and step:
Your solution
Answer
We have
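$$Z\{(n + 1)d\} = dZ\{n\} + dZ\{1\}$$
by the linearity property
$$\begin{aligned}
\therefore\quad Z\{(n + 1)d\} &= \frac{dz}{(z - 1)^2} + \frac{dz}{z - 1} \\
&= dz\left(\frac{1 + z - 1}{(z - 1)^2}\right) \\
&= \frac{dz^2}{(z - 1)^2}
\end{aligned}$$
as expected.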
(c) Finally, state yn :
Your solution
Answer
Returning to (9) the inverse z-transform is
yn = (n + 1)d + a un i.e. yn = a + (n + 1)d n = −1, 0, 1, 2, . . .
as we expected.
Task
Earlier in this Section we solved
$$y_{n+1} - 3y_n = 4 \qquad n = 0, 1, 2, \dots \quad \text{with } y_0 = 1.$$
Now solve
$$y_n - 3y_{n-1} = 4 \qquad n = 0, 1, 2, \dots \quad \text{with } y_{-1} = 1. \qquad (10)$$
Begin by obtaining the z-transform of $y_n$:
Your solution
Answer
We have, taking the z-transform of (10),
$$Y(z) - 3(z^{-1} Y(z) + 1) = \frac{4z}{z - 1}$$
(using the right shift property and inserting the initial condition.)
$$\therefore\quad Y(z) - 3z^{-1} Y(z) = 3 + \frac{4z}{z - 1}$$
$$\frac{(z - 3)}{z} Y(z) = 3 + \frac{4z}{z - 1} \quad \text{so} \quad Y(z) = \frac{3z}{z - 3} + \frac{4z^2}{(z - 1)(z - 3)}$$
Write the second term as $4z\left(\dfrac{z}{(z - 1)(z - 3)}\right)$ and obtain the partial fraction expansion of the bracketed term. Then complete the z-transform inversion.
Your solution
Answer
$$\frac{z}{(z - 1)(z - 3)} = \frac{-\frac{1}{2}}{z - 1} + \frac{\frac{3}{2}}{z - 3}$$
We now have
$$Y(z) = \frac{3z}{z - 3} - \frac{2z}{z - 1} + \frac{6z}{z - 3}$$
so
$$y_n = 3 \times 3^n - 2 + 6 \times 3^n = -2 + 9 \times 3^n = -2 + 3^{n+2} \qquad (11)$$
Compare this solution (11) to that of the previous problem (5):
Your solution
Answer
Solution (11) is just the solution sequence (5) moved 1 unit to the left. We anticipated this since the difference equation (10) and associated initial condition is the same as the difference equation (1) but shifted one unit to the left.
2. Second order difference equations

You will learn in this section about solving second order linear constant coefficient difference equations.
In this case two initial conditions are required, typically either y0 and y1 or y−1 and y−2 . In the first
case we use the left shift property of the z-transform, in the second case we use the right shift
property. The same three basic steps are involved as in the first order case.
Task
By solving
yn+2 = yn+1 + yn (12)
y0 = y1 = 1
obtain the general term yn of the Fibonacci sequence.
Begin by taking the z-transform of (12), using the left shift property. Then insert the initial conditions
and solve the resulting algebraic equation for Y (z), the z-transform of {yn }:
Your solution
Answer
$$z^2 Y(z) - z^2 y_0 - zy_1 = zY(z) - zy_0 + Y(z) \quad \text{(taking z-transforms)}$$
$$z^2 Y(z) - z^2 - z = zY(z) - z + Y(z) \quad \text{(inserting initial conditions)}$$
$$(z^2 - z - 1)Y(z) = z^2$$
so
$$Y(z) = \frac{z^2}{z^2 - z - 1} \quad \text{(solving for } Y(z)\text{).}$$
Now solve the quadratic equation $z^2 - z - 1 = 0$ and hence factorize the denominator of Y(z):
Your solution
Answer
$$z^2 - z - 1 = 0$$
$$\therefore\quad z = \frac{1 \pm \sqrt{1 + 4}}{2} = \frac{1 \pm \sqrt{5}}{2}$$
so if $a = \dfrac{1 + \sqrt{5}}{2}$, $b = \dfrac{1 - \sqrt{5}}{2}$
$$Y(z) = \frac{z^2}{(z - a)(z - b)}$$
This form for Y (z) often arises in solving second order difference equations. Write it in partial
fractions and find yn , leaving a and b as general at this stage:
Your solution
Answer
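$$Y(z) = z\left(\frac{z}{(z - a)(z - b)}\right) = \frac{Az}{z - a} + \frac{Bz}{z - b} \quad \text{in partial fractions}$$
where $A = \dfrac{a}{a - b}$ and $B = \dfrac{b}{b - a}$
Hence, taking inverse z-transforms
$$y_n = Aa^n + Bb^n = \frac{1}{(a - b)}(a^{n+1} - b^{n+1}) \qquad (13)$$
Now complete the Fibonacci problem:
Your solution
Answer
With $a = \dfrac{1 + \sqrt{5}}{2}$, $b = \dfrac{1 - \sqrt{5}}{2}$ so $a - b = \sqrt{5}$
we obtain, using (13)
$$y_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+1}\right] \qquad n = 2, 3, 4, \dots$$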
for the n th term of the Fibonacci sequence.
With an appropriate computational aid you could (i) check that this formula does indeed give the
familiar sequence
{1, 1, 2, 3, 5, 8, 13, . . .}
and (ii) obtain, for example, y50 and y100 .
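One possible computational aid is a short Python script. The sketch below is illustrative only (the function names are assumptions): it generates the sequence by recursion in exact integer arithmetic, compares it with formula (13) evaluated in floating point, and prints $y_{50}$ and $y_{100}$ from the recursion.

```python
from math import sqrt


def fib_by_recursion(n_max):
    """Return [y_0, ..., y_n_max] for y_(n+2) = y_(n+1) + y_n, y_0 = y_1 = 1."""
    ys = [1, 1]
    while len(ys) <= n_max:
        ys.append(ys[-1] + ys[-2])
    return ys


def fib_closed_form(n):
    """Formula (13) with a = (1 + sqrt 5)/2 and b = (1 - sqrt 5)/2."""
    a = (1 + sqrt(5)) / 2
    b = (1 - sqrt(5)) / 2
    return (a ** (n + 1) - b ** (n + 1)) / (a - b)


ys = fib_by_recursion(100)
print(ys[:7])  # [1, 1, 2, 3, 5, 8, 13]
print(all(round(fib_closed_form(n)) == ys[n] for n in range(40)))
print(ys[50], ys[100])
```

The comparison is restricted to small n because floating-point evaluation of (13) loses accuracy once the terms exceed about 15 significant digits; the integer recursion is exact for any n.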
Key Point 14
The inverse z-transform of
$$Y(z) = \frac{z^2}{(z - a)(z - b)} \quad a \neq b \qquad \text{is} \qquad y_n = \frac{1}{(a - b)}(a^{n+1} - b^{n+1})$$
Task
Use the right shift property of z-transforms to solve the second order difference
equation
yn − 7yn−1 + 10 yn−2 = 0 with y−1 = 16 and y−2 = 5.
[Hint: the steps involved are the same as in the previous Task]
Your solution
Answer
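$$Y(z) - 7(z^{-1} Y(z) + 16) + 10(z^{-2} Y(z) + 16z^{-1} + 5) = 0$$
$$Y(z)(1 - 7z^{-1} + 10z^{-2}) - 112 + 160z^{-1} + 50 = 0$$
$$Y(z)\left(\frac{z^2 - 7z + 10}{z^2}\right) = 62 - 160z^{-1}$$
$$\begin{aligned}
Y(z) &= \frac{62z^2}{z^2 - 7z + 10} - \frac{160z}{z^2 - 7z + 10} \\
&= z\,\frac{(62z - 160)}{(z - 2)(z - 5)} \\
&= \frac{12z}{z - 2} + \frac{50z}{z - 5} \quad \text{in partial fractions}
\end{aligned}$$
so $y_n = 12 \times 2^n + 50 \times 5^n \qquad n = 0, 1, 2, \dots$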
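This answer can be verified by running the recursion forward from the given initial values. The sketch below is a hedged illustration; the function names are assumptions, and `Fraction` is used so that the closed form can also be evaluated exactly at n = −1 and n = −2.

```python
from fractions import Fraction


def recurse(n_max):
    """y_n = 7 y_(n-1) - 10 y_(n-2), starting from y_(-2) = 5, y_(-1) = 16."""
    prev2, prev1 = 5, 16
    ys = []
    for _ in range(n_max + 1):
        current = 7 * prev1 - 10 * prev2
        ys.append(current)
        prev2, prev1 = prev1, current
    return ys


def closed_form(n):
    """y_n = 12 * 2^n + 50 * 5^n, evaluated exactly (n may be negative)."""
    return 12 * Fraction(2) ** n + 50 * Fraction(5) ** n


ys = recurse(8)
print(ys[:3])
print(all(ys[n] == closed_form(n) for n in range(9)))
print(closed_form(-1), closed_form(-2))  # 16 5
```

The last line shows that the closed form also reproduces the two initial conditions.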
We now give an Example where a quadratic equation with repeated solutions arises.
Example 1
(a) Obtain the z-transform of $\{f_n\} = \{n a^n\}$.
(b) Solve
$$y_n - 6y_{n-1} + 9y_{n-2} = 0 \qquad n = 0, 1, 2, \dots$$
$$y_{-1} = 1 \qquad y_{-2} = 0$$
[Hint: use the result from (a) at the inversion stage.]
Solution
(a) $Z\{n\} = \dfrac{z}{(z - 1)^2} \quad \therefore \quad Z\{n a^n\} = \dfrac{z/a}{(z/a - 1)^2} = \dfrac{az}{(z - a)^2}$ where we have used the property $Z\{f_n a^n\} = F\left(\dfrac{z}{a}\right)$
(b) Taking the z-transform of the difference equation and inserting the initial conditions:
$$Y(z) - 6(z^{-1} Y(z) + 1) + 9(z^{-2} Y(z) + z^{-1}) = 0$$
$$Y(z)(1 - 6z^{-1} + 9z^{-2}) = 6 - 9z^{-1}$$
$$Y(z)(z^2 - 6z + 9) = 6z^2 - 9z$$
$$Y(z) = \frac{6z^2 - 9z}{(z - 3)^2} = z\,\frac{6z - 9}{(z - 3)^2} = z\left(\frac{6}{z - 3} + \frac{9}{(z - 3)^2}\right) \quad \text{in partial fractions}$$
from which, using the result (a) on the second term,
$$y_n = 6 \times 3^n + 3n \times 3^n = (6 + 3n)3^n$$
We shall re-do this inversion by an alternative method shortly.
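Both parts of this Example can be checked numerically. In the sketch below (function names and test values are assumptions made for illustration), part (a) is tested by truncating the defining series and part (b) by running the recursion from the given initial values.

```python
def n_a_n_partial_sum(a, z, n_terms=400):
    """Truncated defining series for Z{n a^n}."""
    return sum(n * a ** n * z ** (-n) for n in range(n_terms))


def n_a_n_closed(a, z):
    """Result (a): a z / (z - a)^2, valid for |z| > |a|."""
    return a * z / (z - a) ** 2


def recurse(n_max):
    """y_n = 6 y_(n-1) - 9 y_(n-2), starting from y_(-2) = 0, y_(-1) = 1."""
    prev2, prev1 = 0, 1
    ys = []
    for _ in range(n_max + 1):
        current = 6 * prev1 - 9 * prev2
        ys.append(current)
        prev2, prev1 = prev1, current
    return ys


def closed_form(n):
    """Result (b): y_n = (6 + 3n) 3^n."""
    return (6 + 3 * n) * 3 ** n


print(abs(n_a_n_partial_sum(3.0, 5.0) - n_a_n_closed(3.0, 5.0)))
ys = recurse(8)
print(ys[:3], ys == [closed_form(n) for n in range(9)])
```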
Task
Solve the difference equation
$$y_{n+2} + y_n = 0 \quad \text{with } y_0, y_1 \text{ arbitrary.} \qquad (14)$$
Start by obtaining Y(z) using the left shift theorem:
Your solution
Answer
$$z^2 Y(z) - z^2 y_0 - zy_1 + Y(z) = 0$$
$$(z^2 + 1)Y(z) = z^2 y_0 + zy_1$$
$$Y(z) = \frac{z^2}{z^2 + 1}\,y_0 + \frac{z}{z^2 + 1}\,y_1$$
To find the inverse z-transforms recall the results for Z{cos ωn} and Z{sin ωn} from Key Point 6
(page 21) and some of the particular cases discussed in Section 21.2. Hence find yn here:
Your solution
Answer
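Taking $Z\{\cos \omega n\}$ and $Z\{\sin \omega n\}$ with $\omega = \dfrac{\pi}{2}$:
\[ Z\left\{\cos\frac{n\pi}{2}\right\} = \frac{z^2}{z^2 + 1} \]
\[ Z\left\{\sin\frac{n\pi}{2}\right\} = \frac{z}{z^2 + 1} \]
Hence
\[ y_n = y_0\, Z^{-1}\left\{\frac{z^2}{z^2+1}\right\} + y_1\, Z^{-1}\left\{\frac{z}{z^2+1}\right\} = y_0 \cos\frac{n\pi}{2} + y_1 \sin\frac{n\pi}{2} \qquad (15) \]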
Those of you who are familiar with differential equations may know that
\[ \frac{d^2 y}{dt^2} + y = 0, \qquad y(0) = y_0, \quad y'(0) = y_0' \qquad (16) \]
has solutions $y_1 = \cos t$ and $y_2 = \sin t$ and a general solution
\[ y = c_1 \cos t + c_2 \sin t \qquad (17) \]
where $c_1 = y_0$ and $c_2 = y_0'$.
This differential equation is a model for simple harmonic oscillations. The difference equation (14)
and its solution (15) are the discrete counterparts of (16) and (17).
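The discrete oscillation can be seen numerically by iterating $y_{n+2} = -y_n$ and comparing with the cosine and sine form of the solution. The following is a minimal sketch; the starting values 3 and -2 are arbitrary illustrative choices.

```python
import math

def iterate(y0, y1, n_terms):
    """Generate y_0, y_1, ... from y_{n+2} = -y_n."""
    ys = [y0, y1]
    while len(ys) < n_terms:
        ys.append(-ys[-2])
    return ys[:n_terms]

def closed_form(y0, y1, n):
    """y_n = y0 cos(n pi/2) + y1 sin(n pi/2)."""
    return y0 * math.cos(n * math.pi / 2) + y1 * math.sin(n * math.pi / 2)

y0, y1 = 3.0, -2.0                       # arbitrary starting values
ys = iterate(y0, y1, 12)
max_error = max(abs(ys[n] - closed_form(y0, y1, n)) for n in range(12))
print(ys[:8])       # the pattern y0, y1, -y0, -y1 repeats with period 4
print(max_error)    # floating point rounding only
```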
3. Inversion of z-transforms using residues

This method has its basis in a branch of mathematics called complex integration. You may recall
that the ‘z’ quantity of z-transforms is a complex quantity, more specifically a complex variable.
However, it is not necessary to delve deeply into the theory of complex variables in order to obtain
simple inverse z-transforms using what are called residues. In many cases inversion using residues is
easier than using partial fractions. Hence reading on is strongly advised.
Pole of a function of a complex variable

If $G(z)$ is a function of the complex variable $z$ and if
\[ G(z) = \frac{G_1(z)}{(z - z_0)^k} \]
where $G_1(z_0)$ is non-zero and finite then $G(z)$ is said to have a pole of order $k$ at $z = z_0$.
For example if
\[ G(z) = \frac{6(z-2)}{z(z-3)(z-4)^2} \]
then $G(z)$ has the following 3 poles.
(i) pole of order 1 at z = 0
(ii) pole of order 1 at z = 3
(iii) pole of order 2 at z = 4.
(Poles of order 1 are sometimes known as simple poles.)
Note that when z = 2, G(z) = 0. Hence z = 2 is said to be a zero of G(z). (It is the only zero in
this case).
Task
Write down the poles and zeros of
\[ G(z) = \frac{3(z+4)}{z^2(2z+1)(3z-9)} \qquad (18) \]
State the order of each pole.
Your solution
Answer
G(z) has a zero when z = −4.
G(z) has first order poles at z = −1/2, z = 3.
G(z) has a second order pole at z = 0.
Residue at a pole
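The residue of a complex function $G(z)$ at a first order pole $z_0$ is
\[ \mathrm{Res}\,(G(z), z_0) = \left[G(z)(z - z_0)\right]_{z_0} \qquad (19) \]
The residue at a second order pole $z_0$ is
\[ \mathrm{Res}\,(G(z), z_0) = \left[\frac{d}{dz}\left(G(z)(z - z_0)^2\right)\right]_{z_0} \qquad (20) \]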
You need not worry about how these results are obtained or their full mathematical significance.
(Any textbook on Complex Variable Theory could be consulted by interested readers.)
Example
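Consider again the function (18) in the previous guided exercise.
\[ G(z) = \frac{3(z+4)}{z^2(2z+1)(3z-9)} = \frac{(z+4)}{2z^2\left(z + \frac{1}{2}\right)(z-3)} \]
The second form is the more convenient for the residue formulae to be used.
Using (19) at the two first order poles:
\[ \mathrm{Res}\left(G(z), -\tfrac{1}{2}\right) = \left[G(z)\left(z - \left(-\tfrac{1}{2}\right)\right)\right]_{-\frac{1}{2}} = \left[\frac{(z+4)}{2z^2(z-3)}\right]_{-\frac{1}{2}} = \frac{\frac{7}{2}}{2 \times \frac{1}{4} \times \left(-\frac{7}{2}\right)} = -2 \]
\[ \mathrm{Res}\,[G(z), 3] = \left[\frac{(z+4)}{2z^2\left(z + \frac{1}{2}\right)}\right]_{3} = \frac{1}{9} \]
Using (20) at the second order pole
\[ \mathrm{Res}\,(G(z), 0) = \left[\frac{d}{dz}\left(G(z)(z-0)^2\right)\right]_{0} \]
The differentiation has to be carried out before the substitution of $z = 0$, of course.
\[ \therefore\ \mathrm{Res}\,(G(z), 0) = \left[\frac{d}{dz}\left(\frac{(z+4)}{2\left(z + \frac{1}{2}\right)(z-3)}\right)\right]_{0} = \frac{1}{2}\left[\frac{d}{dz}\left(\frac{z+4}{z^2 - \frac{5}{2}z - \frac{3}{2}}\right)\right]_{0} \]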
Task
Carry out the differentiation shown on the last line of the previous page, then
substitute z = 0 and hence obtain the required residue.
Your solution
Answer
Differentiating by the quotient rule then substituting $z = 0$ gives
\[ \mathrm{Res}\,(G(z), 0) = \frac{17}{9} \]
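The three residues of the function (18) can be evaluated with exact rational arithmetic, which avoids slips in the hand calculation. The sketch below applies formula (19) at the two simple poles and the quotient rule for formula (20) at the double pole; the function names are illustrative assumptions. Since $G(z)$ decays like $1/(2z^3)$ for large $|z|$, its residues should add up to zero, which gives a useful consistency check.

```python
from fractions import Fraction as Fr

def residue_at_minus_half():
    # (z + 1/2) G(z) = (z + 4) / (2 z^2 (z - 3)), evaluated at z = -1/2
    z = Fr(-1, 2)
    return (z + 4) / (2 * z**2 * (z - 3))

def residue_at_three():
    # (z - 3) G(z) = (z + 4) / (2 z^2 (z + 1/2)), evaluated at z = 3
    z = Fr(3)
    return (z + 4) / (2 * z**2 * (z + Fr(1, 2)))

def residue_at_zero():
    # z^2 G(z) = N(z)/D(z) with N = z + 4 and D = 2(z + 1/2)(z - 3) = 2z^2 - 5z - 3.
    # Quotient rule: (N' D - N D') / D^2, then substitute z = 0.
    z = Fr(0)
    n, dn = z + 4, Fr(1)
    d, dd = 2 * z**2 - 5 * z - 3, 4 * z - 5
    return (dn * d - n * dd) / d**2

residues = [residue_at_minus_half(), residue_at_three(), residue_at_zero()]
print(residues)        # residues at z = -1/2, z = 3 and z = 0
print(sum(residues))   # consistency check: the total should vanish
```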
Key Point 15
Residue at a Pole of Order k
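If $G(z)$ has a $k$th order pole at $z = z_0$, i.e.
\[ G(z) = \frac{G_1(z)}{(z - z_0)^k}, \qquad G_1(z_0) \neq 0 \text{ and finite,} \]
then
\[ \mathrm{Res}\,(G(z), z_0) = \frac{1}{(k-1)!}\left[\frac{d^{k-1}}{dz^{k-1}}\left(G(z)\,(z - z_0)^k\right)\right]_{z_0} \qquad (21) \]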
This formula reduces to (19) and (20) when k = 1 and 2 respectively.
Inverse z-transform formula
Recall that, by definition, the z-transform of a sequence $\{f_n\}$ is
\[ F(z) = f_0 + f_1 z^{-1} + f_2 z^{-2} + \ldots + f_n z^{-n} + \ldots \]
If we multiply both sides by $z^{n-1}$, where $n$ is a positive integer, we obtain
\[ F(z)z^{n-1} = f_0 z^{n-1} + f_1 z^{n-2} + f_2 z^{n-3} + \ldots + f_n z^{-1} + f_{n+1} z^{-2} + \ldots \]
Using again a result from complex integration it can be shown from this expression that the general term $f_n$ is given by
\[ f_n = \text{sum of residues of } F(z)\,z^{n-1} \text{ at its poles} \qquad (22) \]
The poles of $F(z)z^{n-1}$ will be those of $F(z)$ with possibly additional poles at the origin.
To illustrate the residue method of inversion we shall re-do some of the earlier examples that were
done using partial fractions.
Example:
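\[ Y(z) = \frac{z^2}{(z-a)(z-b)}, \qquad a \neq b \]
so
\[ Y(z)z^{n-1} = \frac{z^{n+1}}{(z-a)(z-b)} = G(z), \text{ say.} \]
$G(z)$ has first order poles at $z = a$, $z = b$ so, using (19),
\[ \mathrm{Res}\,(G(z), a) = \left[\frac{z^{n+1}}{z-b}\right]_{a} = \frac{a^{n+1}}{a-b} \]
\[ \mathrm{Res}\,(G(z), b) = \left[\frac{z^{n+1}}{z-a}\right]_{b} = \frac{b^{n+1}}{b-a} = \frac{-b^{n+1}}{a-b} \]
We need simply add these residues to obtain the required inverse z-transform
\[ \therefore\ f_n = \frac{1}{(a-b)}\left(a^{n+1} - b^{n+1}\right) \]
as before.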
Task
Obtain, using (22), the inverse z-transform of
\[ Y(z) = \frac{6z^2 - 9z}{(z-3)^2} \]
Firstly, obtain the pole(s) of $G(z) = Y(z)z^{n-1}$ and deduce the order:
Your solution
Answer
\[ G(z) = Y(z)z^{n-1} = \frac{6z^{n+1} - 9z^n}{(z-3)^2} \]
whose only pole is one of second order at $z = 3$.
Now calculate the residue of G(z) at z = 3 using (20) and hence write down the required inverse
z-transform yn :
Your solution
Answer

\[ \mathrm{Res}\,(G(z), 3) = \left[\frac{d}{dz}\left(6z^{n+1} - 9z^n\right)\right]_{3} \]
\[ = \left[6(n+1)z^n - 9nz^{n-1}\right]_{3} \]
\[ = 6(n+1)3^n - 9n3^{n-1} \]
\[ = 6 \times 3^n + 3n3^n \]
This is the same as was found by partial fractions, but there is considerably less labour by the residue
method.
In the above examples all the poles of the various functions G(z) were real. This is the easiest
situation but the residue method will cope with complex poles.
Example
We showed earlier that
\[ \frac{z^2}{z^2 + 1} \quad\text{and}\quad \cos\frac{n\pi}{2} \]
formed a z-transform pair.
We will now obtain $y_n$ if $Y(z) = \dfrac{z^2}{z^2+1}$ using residues.
Using residues with, from (22),
\[ G(z) = \frac{z^{n+1}}{z^2 + 1} = \frac{z^{n+1}}{(z-i)(z+i)} \quad\text{where } i^2 = -1, \]
we see that $G(z)$ has first order poles at the complex conjugate points $\pm i$.
Using (19)
\[ \mathrm{Res}\,(G(z), i) = \left[\frac{z^{n+1}}{z+i}\right]_{i} = \frac{i^{n+1}}{2i}, \qquad \mathrm{Res}\,(G(z), -i) = \frac{(-i)^{n+1}}{(-2i)} \]
(Note the complex conjugate residues at the complex conjugate poles.)
Hence
\[ Z^{-1}\left\{\frac{z^2}{z^2+1}\right\} = \frac{1}{2i}\left(i^{n+1} - (-i)^{n+1}\right) \]
But $i = e^{i\pi/2}$ and $-i = e^{-i\pi/2}$, so the inverse z-transform is
\[ \frac{1}{2i}\left(e^{i(n+1)\pi/2} - e^{-i(n+1)\pi/2}\right) = \sin\frac{(n+1)\pi}{2} = \cos\frac{n\pi}{2} \quad\text{as expected.} \]
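The sum of the two complex residues can be checked against $\cos(n\pi/2)$ with ordinary complex arithmetic. This is only a numerical illustration of the result just derived; the function name is an assumption.

```python
import math

def inverse_by_residues(n):
    """Sum of the residues: (1/2i) * (i^(n+1) - (-i)^(n+1))."""
    i = 1j
    return (i ** (n + 1) - (-i) ** (n + 1)) / (2 * i)

values = [inverse_by_residues(n) for n in range(8)]
expected = [math.cos(n * math.pi / 2) for n in range(8)]
max_error = max(abs(v - e) for v, e in zip(values, expected))
print([round(v.real) for v in values])   # real parts of the residue sums
print(max_error)                         # imaginary parts cancel up to rounding
```

Although each residue is complex, the two are complex conjugates, so their sum is real.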
Task
Show, using residues, that
\[ Z^{-1}\left\{\frac{z}{z^2+1}\right\} = \sin\frac{n\pi}{2} \]
Your solution
Answer
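Using (22):
\[ G(z) = z^{n-1}\,\frac{z}{z^2+1} = \frac{z^n}{z^2+1} = \frac{z^n}{(z+i)(z-i)} \]
\[ \mathrm{Res}\,(G(z), i) = \frac{i^n}{2i} \]
\[ \mathrm{Res}\,(G(z), -i) = \frac{(-i)^n}{-2i} \]
\[ Z^{-1}\left\{\frac{z}{z^2+1}\right\} = \frac{1}{2i}\left(i^n - (-i)^n\right) = \frac{1}{2i}\left(e^{in\pi/2} - e^{-in\pi/2}\right) = \sin\frac{n\pi}{2} \]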
4. An application of difference equations – currents in a
ladder network
The application we will consider is that of finding the electric currents in each loop of the ladder resistance network shown, which consists of $(N+1)$ loops. The currents form a sequence $\{i_0, i_1, \ldots, i_N\}$.
Figure 7 (ladder resistance network with applied voltage $V$ and loop currents $i_0, i_1, \ldots, i_n, i_{n+1}, \ldots, i_N$)
All the resistors have the same resistance $R$ so loops 1 to $N$ are identical. The zero’th loop contains an applied voltage $V$. In this zero’th loop, Kirchhoff’s voltage law gives
\[ V = Ri_0 + R(i_0 - i_1) \]
from which
\[ i_1 = 2i_0 - \frac{V}{R} \qquad (23) \]
Similarly, applying the Kirchhoff law to the $(n+1)$th loop, where there is no voltage source and 3 resistors,
\[ 0 = Ri_{n+1} + R(i_{n+1} - i_{n+2}) + R(i_{n+1} - i_n) \]
from which
\[ i_{n+2} - 3i_{n+1} + i_n = 0, \qquad n = 0, 1, 2, \ldots, (N-2) \qquad (24) \]
(24) is the basic difference equation that has to be solved.
Task
Using the left shift theorems obtain the z-transform of equation (24). Denote by
I(z) the z-transform of {in }. Simplify the algebraic equation you obtain.
Your solution
Answer
We obtain
\[ z^2 I(z) - z^2 i_0 - z i_1 - 3\left(zI(z) - z i_0\right) + I(z) = 0 \]
Simplifying
\[ (z^2 - 3z + 1)I(z) = z^2 i_0 + z i_1 - 3z i_0 \qquad (25) \]
If we now eliminate $i_1$ using (23), the right-hand side of (25) becomes
\[ z^2 i_0 + z\left(2i_0 - \frac{V}{R}\right) - 3z i_0 = z^2 i_0 - z i_0 - \frac{V}{R}z = i_0\left(z^2 - z - \frac{V}{i_0 R}z\right) \]
Hence from (25)
\[ I(z) = \frac{i_0\left(z^2 - \left(1 + \dfrac{V}{i_0 R}\right)z\right)}{z^2 - 3z + 1} \qquad (26) \]
Our final task is to find the inverse z-transform of (26).
Task
Look at the table of z-transforms on page 35 (or at the back of the Workbook)
and suggest what sequences are likely to arise by inverting I(z) as given in (26).
Your solution
Answer
The most likely candidates are hyperbolic sequences because both {cosh αn} and {sinh αn} have
z-transforms with denominator
\[ z^2 - 2z\cosh\alpha + 1 \]
which is of the same form as the denominator of (26), remembering that cosh α ≥ 1. (Why are the
trigonometric sequences {cos ωn} and {sin ωn} not plausible here?)
To proceed, we introduce a quantity $\alpha$ such that $\alpha$ is the positive solution of $2\cosh\alpha = 3$ from which (using $\cosh^2\alpha - \sinh^2\alpha \equiv 1$) we get
\[ \sinh\alpha = \sqrt{\frac{9}{4} - 1} = \frac{\sqrt{5}}{2} \]
Hence (26) can be written
\[ I(z) = i_0\,\frac{z^2 - \left(1 + \dfrac{V}{i_0 R}\right)z}{z^2 - 2z\cosh\alpha + 1} \qquad (27) \]
To further progress, bearing in mind the z-transforms of $\{\cosh\alpha n\}$ and $\{\sinh\alpha n\}$, we must subtract and add $z\cosh\alpha$ to the numerator of (27), where $\cosh\alpha = \frac{3}{2}$.
\[ I(z) = i_0\left[\frac{z^2 - z\cosh\alpha + \dfrac{3z}{2} - \left(1 + \dfrac{V}{i_0 R}\right)z}{z^2 - 2z\cosh\alpha + 1}\right] \]
\[ = i_0\left[\frac{(z^2 - z\cosh\alpha)}{z^2 - 2z\cosh\alpha + 1} + \frac{\left(\dfrac{3}{2} - 1\right)z - \dfrac{Vz}{i_0 R}}{z^2 - 2z\cosh\alpha + 1}\right] \]
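The first term in the square bracket is the z-transform of $\{\cosh\alpha n\}$.
The second term is
\[ \frac{\left(\dfrac{1}{2} - \dfrac{V}{i_0 R}\right)z}{z^2 - 2z\cosh\alpha + 1} = \frac{\left(\dfrac{1}{2} - \dfrac{V}{i_0 R}\right)\dfrac{2}{\sqrt{5}}\,\dfrac{\sqrt{5}}{2}\,z}{z^2 - 2z\cosh\alpha + 1} \]
which has inverse z-transform
\[ \left(\frac{1}{2} - \frac{V}{i_0 R}\right)\frac{2}{\sqrt{5}}\sinh\alpha n \]
Hence we have for the loop currents
\[ i_n = i_0\cosh(\alpha n) + \left(\frac{i_0}{2} - \frac{V}{R}\right)\frac{2}{\sqrt{5}}\sinh(\alpha n), \qquad n = 0, 1, \ldots, N \qquad (28) \]
where $\cosh\alpha = \dfrac{3}{2}$ determines the value of $\alpha$.
Finally, by Kirchhoff’s law applied to the rightmost loop
\[ 3i_N = i_{N-1} \]
from which, with (28), we could determine the value of $i_0$.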
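The last step can be sketched numerically. Because the loop-current formula is linear in $i_0$, the end condition $3i_N = i_{N-1}$ becomes a single linear equation for $i_0$. The code below is a minimal illustration under that reading; the function names and the sample values of $N$ and $V/R$ are assumptions, not data from the Workbook.

```python
import math

ALPHA = math.acosh(1.5)          # positive solution of 2 cosh(alpha) = 3

def loop_current(n, i0, v_over_r):
    """i_n = i0 cosh(alpha n) + (i0/2 - V/R) (2/sqrt 5) sinh(alpha n)."""
    c = math.cosh(ALPHA * n)
    s = (2 / math.sqrt(5)) * math.sinh(ALPHA * n)
    return i0 * c + (i0 / 2 - v_over_r) * s

def solve_i0(n_last, v_over_r):
    """Choose i0 so that 3 i_N = i_(N-1) in the rightmost loop.

    Write i_n = i0 * p(n) - v_over_r * q(n) and solve the linear equation.
    """
    def p(n):
        return loop_current(n, 1.0, 0.0)
    def q(n):
        return -loop_current(n, 0.0, 1.0)
    numerator = 3 * q(n_last) - q(n_last - 1)
    denominator = 3 * p(n_last) - p(n_last - 1)
    return v_over_r * numerator / denominator

N, v_over_r = 4, 1.0                         # illustrative values only
i0 = solve_i0(N, v_over_r)
currents = [loop_current(n, i0, v_over_r) for n in range(N + 1)]
# Residuals of the difference equation i_{n+2} - 3 i_{n+1} + i_n = 0
residuals = [currents[n + 2] - 3 * currents[n + 1] + currents[n] for n in range(N - 1)]
print(currents)
print(max(abs(r) for r in residuals))
```

For the smallest case $N = 1$ the two conditions $i_1 = 2i_0 - V/R$ and $3i_1 = i_0$ give $i_0 = 3V/(5R)$ by hand, which is a convenient check on `solve_i0`.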
Exercises
1. Deduce the inverse z-transform of each of the following functions:
(a) $\dfrac{2z^2 - 3z}{z^2 - 3z - 4}$
(b) $\dfrac{2z^2 + z}{(z-1)^2}$
(c) $\dfrac{2z^2 - z}{2z^2 - 2z + 2}$
(d) $\dfrac{3z^2 + 5}{z^4}$
2. Use z-transforms to solve each of the following difference equations:
(a) $y_{n+1} - 3y_n = 4^n$, $\quad y_0 = 0$
(b) $y_n - 3y_{n-1} = 6$, $\quad y_{-1} = 4$
(c) $y_n - 2y_{n-1} = n$, $\quad y_{-1} = 0$
(d) $y_{n+1} - 5y_n = 5^{n+1}$, $\quad y_0 = 0$
(e) $y_{n+1} + 3y_n = 4\delta_{n-2}$, $\quad y_0 = 2$
(f) $y_n - 7y_{n-1} + 10y_{n-2} = 0$, $\quad y_{-1} = 16,\ y_{-2} = 5$
(g) $y_n - 6y_{n-1} + 9y_{n-2} = 0$, $\quad y_{-1} = 1,\ y_{-2} = 0$
Answers
1 (a) $(-1)^n + 4^n$ (b) $2 + 3n$ (c) $\cos(n\pi/3)$ (d) $3\delta_{n-2} + 5\delta_{n-4}$
2 (a) $y_n = 4^n - 3^n$ (b) $y_n = 21 \times 3^n - 3$ (c) $y_n = 2 \times 2^n - 2 - n$ (d) $y_n = n5^n$
(e) $y_n = 2 \times (-3)^n + 4 \times (-3)^{n-3}u_{n-3}$ (f) $y_n = 12 \times 2^n + 50 \times 5^n$ (g) $y_n = (6 + 3n)3^n$
Engineering Applications of z-Transforms 21.4
Introduction
In this Section we shall apply the basic theory of z-transforms to help us to obtain the response or
output sequence for a discrete system. This will involve the concept of the transfer function and we
shall also show how to obtain the transfer functions of series and feedback systems. We will also
discuss an alternative technique for output calculations using convolution. Finally we shall discuss
the initial and final value theorems of z-transforms which are important in digital control.

Prerequisites
• be familiar with basic z-transforms, particularly the shift properties

Learning Outcomes
On completion you should be able to . . .
• obtain transfer functions for discrete systems including series and feedback combinations
• state the link between the convolution summation of two sequences and the product of their z-transforms
1. Applications of z-transforms
Transfer (or system) function
Consider a first order linear constant coefficient difference equation
\[ y_n + a\,y_{n-1} = b\,x_n, \qquad n = 0, 1, 2, \ldots \qquad (1) \]
where $\{x_n\}$ is a given sequence.
Assume an initial condition $y_{-1}$ is given.
Task
Take the z-transform of (1), insert the initial condition and obtain Y (z) in terms
of X(z).
Your solution
Answer
Using the right shift theorem
\[ Y(z) + a\left(z^{-1}Y(z) + y_{-1}\right) = b\,X(z) \]
where $X(z)$ is the z-transform of the given or input sequence $\{x_n\}$ and $Y(z)$ is the z-transform of the response or output sequence $\{y_n\}$.
Solving for $Y(z)$
\[ Y(z)(1 + az^{-1}) = bX(z) - ay_{-1} \]
so
\[ Y(z) = \frac{bX(z)}{1 + az^{-1}} - \frac{ay_{-1}}{1 + az^{-1}} \qquad (2) \]
The form of (2) shows us clearly that $Y(z)$ is made up of two components, $Y_1(z)$ and $Y_2(z)$ say, where
(i) $Y_1(z) = \dfrac{bX(z)}{1 + az^{-1}}$, which depends on the input $X(z)$
(ii) $Y_2(z) = \dfrac{-ay_{-1}}{1 + az^{-1}}$, which depends on the initial condition $y_{-1}$.
Clearly, from (2), if $y_{-1} = 0$ (zero initial condition) then
\[ Y(z) = Y_1(z) \]
and hence the term zero-state response is sometimes used for $Y_1(z)$.
Similarly, if $\{x_n\}$, and hence $X(z)$, is zero (zero input) then
\[ Y(z) = Y_2(z) \]
and hence the term zero-input response can be used for $Y_2(z)$.
In engineering the difference equation (1) is regarded as modelling a system or more specifically a
linear discrete time-invariant system. The terms linear and time-invariant arise because the difference
equation (1) is linear and has constant coefficients i.e. the coefficients do not involve the index n.
The term ‘discrete’ is used because sequences of numbers, not continuous quantities, are involved.
As noted above, the given sequence {xn } is considered to be the input sequence and {yn }, the
solution to (1), is regarded as the output sequence.
Figure 8 (block diagram: input (stimulus) $\{x_n\}$ enters the system, which produces the output (response) $\{y_n\}$)
A more precise block diagram representation of a system can be easily drawn since only two operations
are involved:
1. Multiplying the terms of a sequence by a constant.
2. Shifting to the right, or delaying, the terms of the sequence.
A system which consists of a single multiplier is denoted as shown by a triangular symbol:
Figure 9 (a multiplier, drawn as a triangle labelled $A$, with input $\{x_n\}$ and output $\{y_n\}$, where $y_n = Ax_n$)
As we have seen earlier in this workbook a system which consists of only a single delay unit is
represented symbolically as follows
Figure 10 (a delay unit labelled $z^{-1}$ with input $\{x_n\}$ and output $\{y_n\}$, where $y_n = x_{n-1}$)
The system represented by the difference equation (1) consists of two multipliers and one delay unit.
Because (1) can be written
\[ y_n = bx_n - ay_{n-1} \]
a symbolic representation of (1) is as shown in Figure 11.
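Figure 11 (block diagram: $\{x_n\}$ is multiplied by $b$ and passed to an adder; the adder output $\{y_n\}$ is delayed by $z^{-1}$, multiplied by $-a$ and fed back to the adder)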
The circle symbol denotes an adder or summation unit whose output is the sum of the two (or more)
sequences that are input to it.
We will now concentrate upon the zero-state response of the system, i.e. we will assume that the initial condition $y_{-1}$ is zero.
Thus, using (2),
\[ Y(z) = \frac{bX(z)}{1 + az^{-1}} \]
so
\[ \frac{Y(z)}{X(z)} = \frac{b}{1 + az^{-1}} \qquad (3) \]
The quantity $\dfrac{Y(z)}{X(z)}$, the ratio of the output z-transform to the input z-transform, is called the transfer function of the discrete system. It is often denoted by $H(z)$.
Key Point 16
The transfer function $H(z)$ of a discrete system is defined by
\[ H(z) = \frac{Y(z)}{X(z)} = \frac{\text{z-transform of output sequence}}{\text{z-transform of input sequence}} \]
when the initial conditions are zero.
Task
(a) Write down the transfer function H(z) of the system represented by (1)
(i) using negative powers of z
(ii) using positive powers of z.
(b) Write down the inverse z-transform of H(z).
Your solution
Answer
(a) From (3)
(i) $H(z) = \dfrac{b}{1 + az^{-1}}$
(ii) $H(z) = \dfrac{bz}{z + a}$
(b) Referring to the Table of z-transforms at the end of the Workbook:
\[ \{h_n\} = b(-a)^n, \qquad n = 0, 1, 2, \ldots \]
We can represent any discrete system as follows
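Figure 12 (block diagram: input $\{x_n\}$, with transform $X(z)$, enters a block $H(z)$ whose output is $\{y_n\}$, with transform $Y(z)$)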
From the definition of the transfer function it follows that
\[ Y(z) = X(z)H(z) \quad\text{(at zero initial conditions).} \]
The corresponding relation between $\{y_n\}$, $\{x_n\}$ and the inverse z-transform $\{h_n\}$ of the transfer function will be discussed later; it is called a convolution summation.
The significance of $\{h_n\}$ is readily obtained.
Suppose
\[ \{x_n\} = \begin{cases} 1 & n = 0 \\ 0 & n = 1, 2, 3, \ldots \end{cases} \]
i.e. $\{x_n\}$ is the unit impulse sequence that is normally denoted by $\delta_n$. Hence, in this case,
\[ X(z) = Z\{\delta_n\} = 1 \quad\text{so}\quad Y(z) = H(z) \quad\text{and}\quad \{y_n\} = \{h_n\} \]
In words: $\{h_n\}$ is the response or output of a system where the input is the unit impulse sequence $\{\delta_n\}$. Hence $\{h_n\}$ is called the unit impulse response of the system.
Key Point 17
For a linear, time invariant discrete system, the unit impulse response and the system transfer
function are a z-transform pair:
\[ H(z) = Z\{h_n\}, \qquad \{h_n\} = Z^{-1}\{H(z)\} \]
It follows from the previous Task that for the first order system (1)
$H(z) = \dfrac{b}{1 + az^{-1}} = \dfrac{bz}{z + a}$ is the transfer function and
$\{h_n\} = \{b(-a)^n\}$ is the unit impulse response sequence.
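The impulse response of the first order system can be reproduced by feeding a unit impulse into the difference equation $y_n = bx_n - ay_{n-1}$ with zero initial condition. The sketch below does this; the values of $a$ and $b$ are illustrative assumptions.

```python
def first_order_response(a, b, x, y_prev=0.0):
    """Iterate y_n = b*x_n - a*y_{n-1}, starting from y_{-1} = y_prev."""
    out = []
    for xn in x:
        y_prev = b * xn - a * y_prev
        out.append(y_prev)
    return out

a, b = 0.5, 2.0                              # illustrative coefficients
impulse = [1.0] + [0.0] * 7                  # unit impulse sequence
h_simulated = first_order_response(a, b, impulse)
h_formula = [b * (-a) ** n for n in range(8)]
print(h_simulated)
print(h_formula)
```

The same function gives the zero-input response if the input is all zeros and `y_prev` is set to a non-zero $y_{-1}$.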
Task
Write down the transfer function of
(a) a single multiplier unit (b) a single delay unit.
Your solution
Answer
(a) $\{y_n\} = \{A\,x_n\}$ if the multiplying factor is $A$
∴ using the linearity property of z-transforms
\[ Y(z) = AX(z) \]
so $H(z) = \dfrac{Y(z)}{X(z)} = A$ is the required transfer function.
(b) $\{y_n\} = \{x_{n-1}\}$
so $Y(z) = z^{-1}X(z)$ (remembering that initial conditions are zero)
∴ $H(z) = z^{-1}$ is the transfer function of the single delay unit.
Task
Obtain the transfer function of the system.
\[ y_n + a_1 y_{n-1} = b_0 x_n + b_1 x_{n-1}, \qquad n = 0, 1, 2, \ldots \]
where $\{x_n\}$ is a known sequence with $x_n = 0$ for $n = -1, -2, \ldots$.
[Remember that the transfer function is only defined at zero initial condition i.e. assume $y_{-1} = 0$ also.]
Your solution
Answer
Taking z-transforms
\[ Y(z) + a_1 z^{-1}Y(z) = b_0 X(z) + b_1 z^{-1}X(z) \]
\[ Y(z)(1 + a_1 z^{-1}) = (b_0 + b_1 z^{-1})X(z) \]
so the transfer function is
\[ H(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1}}{1 + a_1 z^{-1}} = \frac{b_0 z + b_1}{z + a_1} \]
Second order systems

Consider the system whose difference equation is
\[ y_n + a_1 y_{n-1} + a_2 y_{n-2} = bx_n, \qquad n = 0, 1, 2, \ldots \qquad (4) \]
where the input sequence $x_n = 0$, $n = -1, -2, \ldots$
In exactly the same way as for first order systems it is easy to show that the system response has a
z-transform with two components.
Task
Take the z-transform of (4), assuming given initial values $y_{-1}$, $y_{-2}$. Show that
$Y(z)$ has two components. Obtain the transfer function of the system (4).
Your solution
Answer
From (4)
\[ Y(z) + a_1\left(z^{-1}Y(z) + y_{-1}\right) + a_2\left(z^{-2}Y(z) + z^{-1}y_{-1} + y_{-2}\right) = bX(z) \]
\[ Y(z)\left(1 + a_1 z^{-1} + a_2 z^{-2}\right) + a_1 y_{-1} + a_2 z^{-1}y_{-1} + a_2 y_{-2} = bX(z) \]
\[ \therefore\ Y(z) = \frac{bX(z)}{1 + a_1 z^{-1} + a_2 z^{-2}} - \frac{\left(a_1 y_{-1} + a_2 z^{-1}y_{-1} + a_2 y_{-2}\right)}{1 + a_1 z^{-1} + a_2 z^{-2}} = Y_1(z) + Y_2(z) \text{ say.} \]
At zero initial conditions, $Y(z) = Y_1(z)$ so the transfer function is
\[ H(z) = \frac{b}{1 + a_1 z^{-1} + a_2 z^{-2}} = \frac{bz^2}{z^2 + a_1 z + a_2}. \]
Example
Obtain (i) the unit impulse response (ii) the unit step response of the system specified by the second
order difference equation
\[ y_n - \frac{3}{4}y_{n-1} + \frac{1}{8}y_{n-2} = x_n \qquad (5) \]
Note that both these responses refer to the case of zero initial conditions. Hence it is convenient to
first obtain the transfer function H(z) of the system and then use the relation Y (z) = X(z)H(z) in
each case.
We write down the transfer function of (5), using positive powers of z. Taking the z-transform of
(5) at zero initial conditions we obtain
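\[ Y(z) - \frac{3}{4}z^{-1}Y(z) + \frac{1}{8}z^{-2}Y(z) = X(z) \]
\[ Y(z)\left(1 - \frac{3}{4}z^{-1} + \frac{1}{8}z^{-2}\right) = X(z) \]
\[ \therefore\ H(z) = \frac{Y(z)}{X(z)} = \frac{z^2}{z^2 - \frac{3}{4}z + \frac{1}{8}} = \frac{z^2}{\left(z - \frac{1}{2}\right)\left(z - \frac{1}{4}\right)} \]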
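We now complete the problem for inputs (i) $x_n = \delta_n$ (ii) $x_n = u_n$, the unit step sequence, using partial fractions.
\[ H(z) = \frac{z^2}{\left(z - \frac{1}{2}\right)\left(z - \frac{1}{4}\right)} = \frac{2z}{z - \frac{1}{2}} - \frac{z}{z - \frac{1}{4}} \]
(i) With $x_n = \delta_n$, so $X(z) = 1$, the response is, as we saw earlier,
\[ Y(z) = H(z) \quad\text{so}\quad y_n = h_n \]
where
\[ h_n = Z^{-1}H(z) = 2 \times \left(\frac{1}{2}\right)^n - \left(\frac{1}{4}\right)^n, \qquad n = 0, 1, 2, \ldots \]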
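(ii) The z-transform of the unit step is $\dfrac{z}{z-1}$ so the unit step response has z-transform
\[ Y(z) = \frac{z^2}{\left(z - \frac{1}{2}\right)\left(z - \frac{1}{4}\right)}\,\frac{z}{(z-1)} = -\frac{2z}{z - \frac{1}{2}} + \frac{\frac{1}{3}z}{z - \frac{1}{4}} + \frac{\frac{8}{3}z}{z - 1} \]
Hence, taking inverse z-transforms, the unit step response of the system is
\[ y_n = (-2) \times \left(\frac{1}{2}\right)^n + \frac{1}{3} \times \left(\frac{1}{4}\right)^n + \frac{8}{3}, \qquad n = 0, 1, 2, \ldots \]
Notice carefully the form of this unit step response: the first two terms decrease as $n$ increases and are called transients. Thus
\[ y_n \to \frac{8}{3} \quad\text{as}\quad n \to \infty \]
and the term $\dfrac{8}{3}$ is referred to as the steady state part of the unit step response.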
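The transient and steady state behaviour can be observed by simulating the difference equation (5) with a unit step input and comparing with the closed-form response. This is a minimal sketch; the function name and the number of terms are illustrative assumptions.

```python
def second_order_response(x):
    """Iterate y_n = x_n + (3/4) y_{n-1} - (1/8) y_{n-2}, zero initial conditions."""
    y1 = y2 = 0.0                 # hold y_{n-1} and y_{n-2}
    out = []
    for xn in x:
        y = xn + 0.75 * y1 - 0.125 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

N = 30
step = second_order_response([1.0] * N)                 # unit step input
formula = [-2 * 0.5**n + (1 / 3) * 0.25**n + 8 / 3 for n in range(N)]
max_error = max(abs(s - f) for s, f in zip(step, formula))
print(step[:4])        # early samples, dominated by the transients
print(step[-1])        # late sample, close to the steady state value 8/3
print(max_error)
```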
Combinations of systems
The concept of transfer function enables us to readily analyse combinations of discrete systems.
Series combination
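Suppose we have two systems $S_1$ and $S_2$ with transfer functions $H_1(z)$, $H_2(z)$ in series with each other, i.e. the output from $S_1$ is the input to $S_2$.
Figure 13 (block diagram: input $\{x_n\}$, $X(z)$ enters $S_1$ with transfer function $H_1(z)$; its output $\{y_1(n)\} = \{x_2(n)\}$, $Y_1(z) = X_2(z)$ enters $S_2$ with transfer function $H_2(z)$, giving the output $\{y_n\}$, $Y(z)$)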
Clearly, at zero initial conditions,

\[ Y_1(z) = H_1(z)X(z) \]
\[ Y(z) = H_2(z)X_2(z) = H_2(z)Y_1(z) \]
\[ \therefore\ Y(z) = H_2(z)H_1(z)X(z) \]
so the ratio of the final output transform to the input transform is
\[ \frac{Y(z)}{X(z)} = H_2(z)\,H_1(z) \qquad (6) \]
i.e. the series system shown above is equivalent to a single system with transfer function $H_2(z)\,H_1(z)$.
Figure 14 (block diagram: input $\{x_n\}$, $X(z)$ enters a single block $H_1(z)H_2(z)$ with output $\{y_n\}$, $Y(z)$)
Task
Obtain (a) the transfer function (b) the governing difference equation of the system
obtained by connecting two first order systems S1 and S2 in series. The governing
equations are:
\[ S_1:\quad y_n - ay_{n-1} = bx_n \]
\[ S_2:\quad y_n - cy_{n-1} = dx_n \]
(a) Begin by finding the transfer function of S1 and S2 and then use (6):
Your solution
Answer
$S_1$: $\ Y(z) - az^{-1}Y(z) = bX(z)$ so $H_1(z) = \dfrac{b}{1 - az^{-1}}$
$S_2$: $\ H_2(z) = \dfrac{d}{1 - cz^{-1}}$
so the series arrangement has transfer function
\[ H(z) = \frac{bd}{(1 - az^{-1})(1 - cz^{-1})} = \frac{bd}{1 - (a+c)z^{-1} + acz^{-2}} \]
If $X(z)$ and $Y(z)$ are the input and output transforms for the series arrangement, then
\[ Y(z) = H(z)\,X(z) = \frac{bd\,X(z)}{1 - (a+c)z^{-1} + acz^{-2}} \]
(b) By transferring the denominator from the right-hand side to the left-hand side and taking inverse
z-transforms obtain the required difference equation of the series arrangement:
Your solution
Answer
We have
\[ Y(z)\left(1 - (a+c)z^{-1} + acz^{-2}\right) = bd\,X(z) \]
\[ Y(z) - (a+c)z^{-1}Y(z) + acz^{-2}Y(z) = bd\,X(z) \]
from which, using the right shift theorem,
\[ y_n - (a+c)y_{n-1} + acy_{n-2} = bd\,x_n \]
which is the required difference equation.
You can see that the two first order systems in series have an equivalent second order system.
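The equivalence can be illustrated by passing the same input through the two first order systems in cascade and through the single second order system, both at zero initial conditions. The coefficient values and the input sequence below are arbitrary illustrative choices.

```python
def first_order(pole, gain, x):
    """Iterate y_n = pole*y_{n-1} + gain*x_n with y_{-1} = 0."""
    y_prev, out = 0.0, []
    for xn in x:
        y_prev = pole * y_prev + gain * xn
        out.append(y_prev)
    return out

def equivalent_second_order(a, b, c, d, x):
    """Iterate y_n = (a+c)*y_{n-1} - a*c*y_{n-2} + b*d*x_n, zero initial conditions."""
    y1 = y2 = 0.0
    out = []
    for xn in x:
        y = (a + c) * y1 - a * c * y2 + b * d * xn
        out.append(y)
        y1, y2 = y, y1
    return out

a, b, c, d = 0.5, 2.0, -0.25, 3.0            # illustrative values only
x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, 0.0, 0.0]
cascade = first_order(c, d, first_order(a, b, x))   # output of S1 feeds S2
direct = equivalent_second_order(a, b, c, d, x)
max_difference = max(abs(p - q) for p, q in zip(cascade, direct))
print(cascade[:3])
print(max_difference)
```

Swapping the order of the two first order systems gives the same output, in line with $H_2(z)H_1(z) = H_1(z)H_2(z)$.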
Feedback combination
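Figure 15 (block diagram: the input $\{x_n\}$, $X(z)$ enters an adder whose output $\{w_n\}$, $W(z)$ drives $H_1(z)$ to give $Y(z)$; the output is fed back through $H_2(z)$ and a multiplier $-1$ to the adder)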
For the above negative feedback arrangement of two discrete systems with transfer functions
H1 (z), H2 (z) we have, at zero initial conditions,
\[ Y(z) = W(z)H_1(z) \quad\text{where}\quad W(z) = X(z) - H_2(z)Y(z) \]
Task
Eliminate W (z) and hence obtain the transfer function of the feedback system.
Your solution
Answer
\[ Y(z) = \left(X(z) - H_2(z)Y(z)\right)H_1(z) = X(z)H_1(z) - H_2(z)H_1(z)Y(z) \]
so
\[ Y(z)\left(1 + H_2(z)H_1(z)\right) = X(z)H_1(z) \]
\[ \therefore\ \frac{Y(z)}{X(z)} = \frac{H_1(z)}{1 + H_2(z)H_1(z)} \]
This is the required transfer function of the negative feedback system.
2. Convolution and z-transforms
Consider a discrete system with transfer function H(z)
Figure 16 (block diagram: input $\{x_n\}$, $X(z)$ enters a block $H(z)$ with output $\{y_n\}$, $Y(z)$)
We know, from the definition of the transfer function that at zero initial conditions
Y (z) = X(z)H(z) (7)
We now investigate the corresponding relation between the input sequence {xn } and the output
sequence {yn }. We have seen earlier that the system itself can be characterised by its unit impulse
response {hn } which is the inverse z-transform of H(z).
We are thus seeking the inverse z-transform of the product X(z)H(z). We emphasize immediately
that this is not given by the product {xn }{hn }, a point we also made much earlier in the workbook.
We go back to basic definitions of the z-transform:
\[ Y(z) = y_0 + y_1 z^{-1} + y_2 z^{-2} + y_3 z^{-3} + \ldots \]
\[ X(z) = x_0 + x_1 z^{-1} + x_2 z^{-2} + x_3 z^{-3} + \ldots \]
\[ H(z) = h_0 + h_1 z^{-1} + h_2 z^{-2} + h_3 z^{-3} + \ldots \]
Hence, multiplying $X(z)$ by $H(z)$ we obtain, collecting the terms according to the powers of $z^{-1}$:
\[ x_0 h_0 + (x_0 h_1 + x_1 h_0)z^{-1} + (x_0 h_2 + x_1 h_1 + x_2 h_0)z^{-2} + \ldots \]
Task
Write out the terms in $z^{-3}$ in the product $X(z)H(z)$ and, looking at the emerging
pattern, deduce the coefficient of $z^{-n}$.
Your solution
Answer
\[ (x_0 h_3 + x_1 h_2 + x_2 h_1 + x_3 h_0)z^{-3} \]
which suggests that the coefficient of $z^{-n}$ is
\[ x_0 h_n + x_1 h_{n-1} + x_2 h_{n-2} + \ldots + x_{n-1}h_1 + x_n h_0 \]
Hence, comparing corresponding terms in $Y(z)$ and $X(z)H(z)$
\[ \left.\begin{aligned} z^{0} &: \quad y_0 = x_0 h_0 \\ z^{-1} &: \quad y_1 = x_0 h_1 + x_1 h_0 \\ z^{-2} &: \quad y_2 = x_0 h_2 + x_1 h_1 + x_2 h_0 \\ z^{-3} &: \quad y_3 = x_0 h_3 + x_1 h_2 + x_2 h_1 + x_3 h_0 \end{aligned}\right\} \qquad (8) \]
\[ \vdots \]
\[ z^{-n}: \quad y_n = x_0 h_n + x_1 h_{n-1} + x_2 h_{n-2} + \ldots + x_{n-1}h_1 + x_n h_0 \qquad (9) \]
\[ = \sum_{k=0}^{n} x_k h_{n-k} \qquad (10a) \]
\[ = \sum_{k=0}^{n} h_k x_{n-k} \qquad (10b) \]
(Can you see why (10b) also follows from (9)?)
The sequence $\{y_n\}$ whose $n$th term is given by (9) and (10) is said to be the convolution (or more precisely the convolution summation) of the sequences $\{x_n\}$ and $\{h_n\}$.
The convolution of two sequences is usually denoted by an asterisk symbol ($*$).
We have shown therefore that
\[ Z^{-1}\{X(z)H(z)\} = \{x_n\} * \{h_n\} = \{h_n\} * \{x_n\} \]
where the general term of $\{x_n\} * \{h_n\}$ is in (10a) and that of $\{h_n\} * \{x_n\}$ is in (10b).
In words: the output sequence {yn } from a linear time invariant system is given by the convolution
of the input sequence with the unit impulse response sequence of the system.
This result only holds if initial conditions are zero.
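The convolution summation is straightforward to code, and the statement can be tested on the first order system $y_n = bx_n - ay_{n-1}$, whose unit impulse response is $h_n = b(-a)^n$. The sketch below uses integer data so that the comparison is exact; the function names, coefficients and input are illustrative assumptions.

```python
def convolve(x, h):
    """y_n = sum_{k=0}^{n} x_k h_{n-k}, for as many terms as both lists supply."""
    n_terms = min(len(x), len(h))
    return [sum(x[k] * h[n - k] for k in range(n + 1)) for n in range(n_terms)]

def simulate(a, b, x):
    """Iterate y_n = b*x_n - a*y_{n-1} with y_{-1} = 0."""
    y_prev, out = 0, []
    for xn in x:
        y_prev = b * xn - a * y_prev
        out.append(y_prev)
    return out

a, b = 2, 3                                   # illustrative integer coefficients
x = [1, 4, -2, 0, 5, 1]                       # arbitrary causal input
h = [b * (-a) ** n for n in range(len(x))]    # unit impulse response
print(convolve(x, h))
print(simulate(a, b, x))
print(convolve(x, h) == convolve(h, x))       # the summation is commutative
```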
Key Point 18
Figure 17 (block diagram: input $\{x_n\}$, $X(z)$ enters a block $H(z)$ with output $\{y_n\}$, $Y(z)$)
We have, at zero initial conditions
\[ Y(z) = X(z)H(z) \quad\text{(definition of transfer function)} \]
\[ \{y_n\} = \{x_n\} * \{h_n\} \quad\text{(convolution summation)} \]
where $y_n$ is given in general by (9) and (10) with the first four terms written out explicitly in (8).
Although we have developed the convolution summation in the context of linear systems the proof given actually applies to any sequences, i.e. for arbitrary causal sequences, say $\{v_n\}$, $\{w_n\}$ with z-transforms $V(z)$ and $W(z)$ respectively:
\[ Z^{-1}\{V(z)W(z)\} = \{v_n\} * \{w_n\} \quad\text{or, equivalently,}\quad Z(\{v_n\} * \{w_n\}) = V(z)W(z). \]
Indeed it is simple to prove this second result from the definition of the z-transform for any causal sequences $\{v_n\} = \{v_0, v_1, v_2, \ldots\}$ and $\{w_n\} = \{w_0, w_1, w_2, \ldots\}$.
Thus since the general term of $\{v_n\} * \{w_n\}$ is $\displaystyle\sum_{k=0}^{n} v_k w_{n-k}$ we have
\[ Z(\{v_n\} * \{w_n\}) = \sum_{n=0}^{\infty}\left\{\sum_{k=0}^{n} v_k w_{n-k}\right\}z^{-n} \]
or, since $w_{n-k} = 0$ if $k > n$,
\[ Z(\{v_n\} * \{w_n\}) = \sum_{n=0}^{\infty}\sum_{k=0}^{\infty} v_k w_{n-k}\,z^{-n} \]
Putting $m = n - k$ or $n = m + k$ we obtain
\[ Z(\{v_n\} * \{w_n\}) = \sum_{m=0}^{\infty}\sum_{k=0}^{\infty} v_k w_m\,z^{-(m+k)} \quad\text{(Why is the lower limit } m = 0 \text{ correct?)} \]
Finally,
\[ Z(\{v_n\} * \{w_n\}) = \sum_{m=0}^{\infty} w_m z^{-m}\sum_{k=0}^{\infty} v_k z^{-k} = W(z)V(z) \]
which completes the proof.
which completes the proof.
Example 2
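Calculate the convolution $\{y_n\}$ of the sequences
\[ \{v_n\} = \{a^n\}, \qquad \{w_n\} = \{b^n\}, \qquad a \neq b \]
(i) directly (ii) using z-transforms.
Solution
(i) We have from (10)
\[ y_n = \sum_{k=0}^{n} v_k w_{n-k} = \sum_{k=0}^{n} a^k b^{n-k} = b^n\sum_{k=0}^{n}\left(\frac{a}{b}\right)^k = b^n\left(1 + \frac{a}{b} + \left(\frac{a}{b}\right)^2 + \ldots + \left(\frac{a}{b}\right)^n\right) \]
The bracketed sum involves $n + 1$ terms of a geometric series of common ratio $\dfrac{a}{b}$.
\[ \therefore\ y_n = b^n\,\frac{1 - \left(\dfrac{a}{b}\right)^{n+1}}{1 - \dfrac{a}{b}} = \frac{(b^{n+1} - a^{n+1})}{(b-a)} \]
(ii) The z-transforms are
\[ V(z) = \frac{z}{z-a}, \qquad W(z) = \frac{z}{z-b} \]
so
\[ y_n = Z^{-1}\left\{\frac{z^2}{(z-a)(z-b)}\right\} = \frac{b^{n+1} - a^{n+1}}{(b-a)} \quad\text{using partial fractions or residues} \]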
Task
Obtain by two methods the convolution of the causal sequence
\[ \{2^n\} = \{1, 2, 2^2, 2^3, \ldots\} \]
with itself.
Your solution
Answer
(a) By direct use of (10) if $\{y_n\} = \{2^n\} * \{2^n\}$
\[ y_n = \sum_{k=0}^{n} 2^k 2^{n-k} = 2^n\sum_{k=0}^{n} 1 = (n+1)2^n \]
(b) Using z-transforms:
\[ Z\{2^n\} = \frac{z}{z-2} \]
so
\[ \{y_n\} = Z^{-1}\left\{\frac{z^2}{(z-2)^2}\right\} \]
We will find this using the residue method. $Y(z)z^{n-1}$ has a second order pole at $z = 2$.
\[ \therefore\ y_n = \mathrm{Res}\left(\frac{z^{n+1}}{(z-2)^2},\ 2\right) = \left[\frac{d}{dz}z^{n+1}\right]_{2} = (n+1)2^n \]
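Both closed forms obtained for convolutions of geometric sequences can be confirmed by evaluating the summation directly. The sketch below uses integers so that the results are exact; the particular values $a = 3$, $b = 5$ are arbitrary illustrative choices.

```python
def convolve(v, w):
    """y_n = sum_{k=0}^{n} v_k w_{n-k} for the terms supplied by both lists."""
    n_terms = min(len(v), len(w))
    return [sum(v[k] * w[n - k] for k in range(n + 1)) for n in range(n_terms)]

N = 10
a, b = 3, 5                                      # any integers with a != b

# {a^n} * {b^n} compared with (b^(n+1) - a^(n+1)) / (b - a)
distinct = convolve([a**n for n in range(N)], [b**n for n in range(N)])
distinct_formula = [(b**(n + 1) - a**(n + 1)) // (b - a) for n in range(N)]

# {2^n} * {2^n} compared with (n + 1) 2^n
repeated = convolve([2**n for n in range(N)], [2**n for n in range(N)])
repeated_formula = [(n + 1) * 2**n for n in range(N)]

print(distinct == distinct_formula)
print(repeated == repeated_formula)
```

Integer division is safe here because $b - a$ always divides $b^{n+1} - a^{n+1}$ exactly.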
3. Initial and final value theorems of z-transforms

These results are important in, for example, Digital Control Theory where we are sometimes particularly interested in the initial and ultimate behaviour of systems.
Initial value theorem

If $f_n$ is a sequence with z-transform $F(z)$ then the ‘initial value’ $f_0$ is given by
\[ f_0 = \lim_{z \to \infty} F(z) \quad\text{(provided, of course, that this limit exists).} \]
This result follows, at least informally, from the definition of the z-transform:
\[ F(z) = f_0 + f_1 z^{-1} + f_2 z^{-2} + \ldots \]
from which, taking limits as $z \to \infty$, the required result is obtained.
Task
Obtain the z-transform of
\[ f(n) = 1 - a^n, \qquad 0 < a < 1 \]
Verify the initial value theorem for the z-transform pair you obtain.
Your solution
Answer
Using standard z-transforms we obtain
\[ Z\{f_n\} = F(z) = \frac{z}{z-1} - \frac{z}{z-a} = \frac{1}{1 - z^{-1}} - \frac{1}{1 - az^{-1}} \]
hence, as $z \to \infty$: $\ F(z) \to 1 - 1 = 0$
Similarly, as $n \to 0$
\[ f_n \to 1 - 1 = 0 \]
so the initial value theorem is verified for this case.
Final value theorem
Suppose again that $\{f_n\}$ is a sequence with z-transform $F(z)$. We further assume that all the poles of $F(z)$ lie inside the unit circle in the $z$-plane (i.e. have magnitude less than 1) apart possibly from a first order pole at $z = 1$.
The ‘final value’ of $f_n$, i.e. $\displaystyle\lim_{n \to \infty} f_n$, is then given by
\[ \lim_{n \to \infty} f_n = \lim_{z \to 1}\,(1 - z^{-1})F(z) \]
Proof: Recalling the left shift property
\[ Z\{f_{n+1}\} = zF(z) - zf_0 \]
we have
\[ Z\{f_{n+1} - f_n\} = \lim_{k \to \infty}\sum_{n=0}^{k}(f_{n+1} - f_n)z^{-n} = zF(z) - zf_0 - F(z) \]
or, alternatively, dividing through by $z$ on both sides:
\[ (1 - z^{-1})F(z) - f_0 = \lim_{k \to \infty}\sum_{n=0}^{k}(f_{n+1} - f_n)z^{-(n+1)} \]
Hence
\[ (1 - z^{-1})F(z) = f_0 + (f_1 - f_0)z^{-1} + (f_2 - f_1)z^{-2} + \ldots \]
or as $z \to 1$
\[ \lim_{z \to 1}\,(1 - z^{-1})F(z) = f_0 + (f_1 - f_0) + (f_2 - f_1) + \ldots = \lim_{k \to \infty} f_k \]
Example
Again consider the sequence $f_n = 1 - a^n$, $0 < a < 1$, and its z-transform
\[ F(z) = \frac{z}{z-1} - \frac{z}{z-a} = \frac{1}{1 - z^{-1}} - \frac{1}{1 - az^{-1}} \]
Clearly as $n \to \infty$ then $f_n \to 1$.
Considering the right-hand side
\[ (1 - z^{-1})F(z) = 1 - \frac{(1 - z^{-1})}{1 - az^{-1}} \to 1 - 0 = 1 \quad\text{as } z \to 1. \]
Note carefully that
\[ F(z) = \frac{z}{z-1} - \frac{z}{z-a} \]
has a pole at $a$ $(0 < a < 1)$ and a simple pole at $z = 1$.
The final value theorem does not hold for z-transform poles outside the unit circle,
e.g. $f_n = 2^n$, $\ F(z) = \dfrac{z}{z-2}$.
Clearly $f_n \to \infty$ as $n \to \infty$
whereas
\[ (1 - z^{-1})F(z) = \left(\frac{z-1}{z}\right)\frac{z}{(z-2)} \to 0 \quad\text{as } z \to 1 \]
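The two theorems, and the failure case, can be illustrated by evaluating the transforms at a large value of $z$ and at a value of $z$ close to 1. The sketch below is a rough numerical illustration, not a proof; the value $a = 0.5$ and the sample points are arbitrary assumptions.

```python
def transform(z, a):
    """z-transform of f_n = 1 - a^n: z/(z-1) - z/(z-a)."""
    return z / (z - 1) - z / (z - a)

a = 0.5
# Initial value theorem: F(z) for large z approaches f_0 = 0
initial_estimate = transform(1e6, a)

# Final value theorem: (1 - 1/z) F(z) for z near 1 approaches lim f_n = 1
z = 1 + 1e-6
final_estimate = (1 - 1 / z) * transform(z, a)

# Pole outside the unit circle: f_n = 2^n diverges, yet the same
# expression tends to 0, so the theorem must not be applied here.
misleading_estimate = (1 - 1 / z) * (z / (z - 2))

print(initial_estimate, final_estimate, misleading_estimate)
```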
Exercises
1. A low pass digital filter is characterised by
\[ y_n = 0.1x_n + 0.9y_{n-1} \]
Two such filters are connected in series. Deduce the transfer function and governing difference equation for the overall system. Obtain the response of the series system to (i) a unit step and (ii) a unit alternating input. Discuss your results.
2. The two systems
\[ y_n = x_n - 0.7x_{n-1} + 0.4y_{n-1} \]
\[ y_n = 0.9x_{n-1} - 0.7y_{n-1} \]
are connected in series. Find the difference equation governing the overall system.
3. A system $S_1$ is governed by the difference equation
\[ y_n = 6x_{n-1} + 5y_{n-1} \]
It is desired to stabilise $S_1$ by using a feedback configuration. The system $S_2$ in the feedback loop is characterised by
\[ y_n = \alpha x_{n-1} + \beta y_{n-1} \]
Show that the feedback system $S_3$ has an overall transfer function
\[ H_3(z) = \frac{H_1(z)}{1 + H_1(z)H_2(z)} \]
and determine values for the parameters $\alpha$ and $\beta$ if $H_3(z)$ is to have a second order pole at $z = 0.5$. Show briefly why the feedback system $S_3$ stabilises the original system.
4. Use z-transforms to find the sum of squares of all integers from 1 to $n$:
\[ y_n = \sum_{k=1}^{n} k^2 \]
[Hint: $y_n - y_{n-1} = n^2$]
5. Evaluate each of the following convolution summations (i) directly (ii) using z-transforms:
(a) $a^n * b^n$, $\ a \neq b$ (b) $a^n * a^n$ (c) $\delta_{n-3} * \delta_{n-5}$
(d) $x_n * x_n$ where $x_n = \begin{cases} 1 & n = 0, 1, 2, 3 \\ 0 & n = 4, 5, 6, 7, \ldots \end{cases}$
Answers
1. Step response: $y_n = 1 - (0.99)(0.9)^n - 0.09n(0.9)^n$
Alternating response: $y_n = \dfrac{1}{361}(-1)^n + \dfrac{2.61}{361}(0.9)^n + \dfrac{1.71}{361}n(0.9)^n$
2. $y_n + 0.3y_{n-1} - 0.28y_{n-2} = 0.9x_{n-1} - 0.63x_{n-2}$
3. $\alpha = 3.375$, $\ \beta = -4$
4. $\displaystyle\sum_{k=1}^{n} k^2 = \frac{(2n+1)(n+1)n}{6}$
5. (a) $\dfrac{1}{(a-b)}(a^{n+1} - b^{n+1})$ (b) $(n+1)a^n$ (c) $\delta_{n-8}$ (d) $\{1, 2, 3, 4, 3, 2, 1\}$

Sampled Functions 21.5
Introduction
A sequence can be obtained by sampling a continuous function or signal and in this Section we
show first of all how to extend our knowledge of z-transforms so as to be able to deal with sampled
signals. We then show how the z-transform of a sampled signal is related to the Laplace transform
of the unsampled version of the signal.

Prerequisites
• possess an outline knowledge of Laplace transforms and of z-transforms

Learning Outcomes
• take the z-transform of a sequence obtained by sampling
• state the relation between the z-transform of a sequence obtained by sampling and the Laplace transform of the underlying continuous signal
1. Sampling theory
If a continuous-time signal $f(t)$ is sampled at times $t = 0, T, 2T, \ldots, nT, \ldots$ then a sequence of values
\[ \{f(0), f(T), f(2T), \ldots, f(nT), \ldots\} \]
is obtained. The quantity $T$ is called the sample interval or sample period.
Figure 18 (plot of $f(t)$ against $t$ with sample values marked at $t = T, 2T, \ldots, nT$)
In the previous Sections of this Workbook we have used the simpler notation {fn } to denote a
sequence. If the sequence has actually arisen by sampling then fn is just a convenient notation for
the sample value f (nT ).
Most of our previous results for z-transforms of sequences hold with only minor changes for sampled
signals.
So consider a continuous signal $f(t)$; its z-transform is the z-transform of the sequence of sample values, i.e.
\[ Z\{f(t)\} = Z\{f(nT)\} = \sum_{n=0}^{\infty} f(nT)\,z^{-n} \]
We shall briefly obtain z-transforms of common sampled signals utilizing results obtained earlier. You
may assume that all signals are sampled at 0, T, 2T, . . . nT, . . .
Unit step function

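$$u(t) = \begin{cases} 1 & t \geq 0 \\ 0 & t < 0 \end{cases}$$
Since the sampled values here are a sequence of 1's,
$$Z\{u(t)\} = Z\{u_n\} = \frac{1}{1 - z^{-1}} = \frac{z}{z - 1}, \qquad |z| > 1$$
where $\{u_n\} = \{1, 1, 1, \ldots\}$ is the unit step sequence, the first term shown being the n = 0 term.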
Ramp function

$$r(t) = \begin{cases} t & t \geq 0 \\ 0 & t < 0 \end{cases}$$
The sample values here are
$$\{r(nT)\} = \{0, T, 2T, \ldots\}$$
The ramp sequence $\{r_n\} = \{0, 1, 2, \ldots\}$ has z-transform $\dfrac{z}{(z-1)^2}$.
Hence $Z\{r(nT)\} = \dfrac{Tz}{(z-1)^2}$ since $\{r(nT)\} = T\{r_n\}$.
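The closed forms for the sampled unit step and ramp can be checked numerically by summing the defining series directly. The following is a minimal Python sketch, not part of the Workbook: the function name `sampled_z_transform`, the sample period T = 0.5 and the test point z = 2 are illustrative assumptions, and the infinite sum is truncated after 200 terms.

```python
def sampled_z_transform(f, T, z, terms=200):
    """Partial sum of sum_{n>=0} f(nT) z^(-n) for a signal sampled every T."""
    return sum(f(n * T) * z ** (-n) for n in range(terms))

T = 0.5   # sample period (illustrative value)
z = 2.0   # any test point with |z| > 1, so that the series converges

# Ramp r(t) = t: series versus the closed form T z / (z - 1)^2
ramp_series = sampled_z_transform(lambda t: t, T, z)
ramp_closed = T * z / (z - 1) ** 2

# Unit step u(t) = 1: series versus the closed form z / (z - 1)
step_series = sampled_z_transform(lambda t: 1.0, T, z)
step_closed = z / (z - 1)

print(ramp_series, ramp_closed)
print(step_series, step_closed)
```

Choosing a test point closer to |z| = 1 would need more terms before the partial sum settles.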
Task
Obtain the z-transform of the exponential signal
$$f(t) = \begin{cases} e^{-\alpha t} & t \geq 0 \\ 0 & t < 0. \end{cases}$$
[Hint: use the z-transform of the geometric sequence $\{a^n\}$.]
Your solution
Answer
The sample values of the exponential are
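$$\{1, e^{-\alpha T}, e^{-\alpha 2T}, \ldots, e^{-\alpha nT}, \ldots\}$$
i.e. $f(nT) = e^{-\alpha nT} = (e^{-\alpha T})^n$.
But $Z\{a^n\} = \dfrac{z}{z - a}$
$$\therefore \quad Z\{(e^{-\alpha T})^n\} = \frac{z}{z - e^{-\alpha T}} = \frac{1}{1 - e^{-\alpha T} z^{-1}}$$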
Sampled sinusoids
Earlier in this Workbook we obtained the z-transform of the sequence $\{\cos \omega n\}$ i.e.
$$Z\{\cos \omega n\} = \frac{z^2 - z\cos\omega}{z^2 - 2z\cos\omega + 1}$$
Hence, since sampling the continuous sinusoid
$$f(t) = \cos \omega t$$
yields the sequence $\{\cos n\omega T\}$ we have, simply replacing $\omega$ by $\omega T$ in the z-transform:
$$Z\{\cos \omega t\} = Z\{\cos n\omega T\} = \frac{z^2 - z\cos\omega T}{z^2 - 2z\cos\omega T + 1}$$
Task
Obtain the z-transform of the sampled version of the sine wave f (t) = sin ωt.
Your solution
Answer
$$Z\{\sin \omega n\} = \frac{z \sin\omega}{z^2 - 2z\cos\omega + 1}$$
$$\therefore \quad Z\{\sin \omega t\} = Z\{\sin n\omega T\} = \frac{z \sin\omega T}{z^2 - 2z\cos\omega T + 1}$$
Shift theorems
These are similar to those discussed earlier in this Workbook but for sampled signals the shifts are
by integer multiples of the sample period T . For example a simple right shift, or delay, of a sampled
signal by one sample period is shown in the following figure:
Figure 19: the sampled signal f(nT) and its delayed version f(nT − T), each plotted against t with the sample instants T, 2T, 3T, . . . marked.
The right shift properties of z-transforms can be written down immediately. (Look back at the shift
properties in Section 21.2 subsection 5, if necessary:)
If y(t) has z-transform Y (z) which, as we have seen, really means that its sample values {y(nT )}
give Y (z), then for y(t) shifted to the right by one sample interval the z-transform becomes
$$Z\{y(t - T)\} = y(-T) + z^{-1} Y(z)$$
The proof is very similar to that used for sequences earlier which gave the result:
$$Z\{y_{n-1}\} = y_{-1} + z^{-1} Y(z)$$
Task
Using the result
$$Z\{y_{n-2}\} = y_{-2} + y_{-1} z^{-1} + z^{-2} Y(z)$$
write down the result for $Z\{y(t - 2T)\}$.
Your solution
Answer
$$Z\{y(t - 2T)\} = y(-2T) + y(-T) z^{-1} + z^{-2} Y(z)$$
These results can of course be generalised to obtain $Z\{y(t - mT)\}$ where m is any positive integer.
In particular, for causal or one-sided signals y(t) (i.e. signals which are zero for t < 0):
$$Z\{y(t - mT)\} = z^{-m} Y(z)$$
Note carefully here that the power of z is still $z^{-m}$, not $z^{-mT}$.
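The causal shift result can be illustrated numerically. The sketch below is a hedged example rather than the Workbook's own method: it assumes a causal exponential y(t) = e^{-αt} with illustrative values of T, α, m and z, truncates the series after N samples, and compares the transform of the delayed samples with $z^{-m} Y(z)$.

```python
import math

def z_transform(samples, z):
    """Finite sum of samples[n] * z^(-n), with n starting at 0."""
    return sum(y * z ** (-n) for n, y in enumerate(samples))

T, alpha = 0.1, 2.0   # sample period and decay rate (illustrative values)
m = 3                 # delay, measured in whole sample periods
z = 1.5               # test point lying outside the pole at exp(-alpha T)
N = 400               # number of samples kept in the truncated sums

y = [math.exp(-alpha * n * T) for n in range(N)]   # samples y(nT), causal signal
y_delayed = [0.0] * m + y[:N - m]                  # samples y(nT - mT); zero while n < m

lhs = z_transform(y_delayed, z)        # Z{y(t - mT)}
rhs = z ** (-m) * z_transform(y, z)    # z^(-m) Y(z)
print(lhs, rhs)
```

The exponent applied to z is the integer m, the number of sample periods, which is the point made in the note above.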
Examples:
For the unit step function we saw that:
$$Z\{u(t)\} = \frac{z}{z - 1} = \frac{1}{1 - z^{-1}}$$
Hence from the shift properties above we have immediately, since u(t) is certainly causal,
$$Z\{u(t - T)\} = \frac{z\,z^{-1}}{z - 1} = \frac{z^{-1}}{1 - z^{-1}}$$
$$Z\{u(t - 3T)\} = \frac{z\,z^{-3}}{z - 1} = \frac{z^{-3}}{1 - z^{-1}}$$
and so on.
Figure 20: the delayed unit steps u(t − T) and u(t − 3T), each plotted against t with the sample instants marked.
2. z-transforms and Laplace transforms

In this Workbook we have developed the theory and some applications of the z-transform from first
principles. We mentioned much earlier that the z-transform plays essentially the same role for discrete
systems that the Laplace transform does for continuous systems. We now explore the precise link
between these two transforms. A basic knowledge of Laplace transforms will be assumed.
At first sight it is not obvious that there is a connection. The z-transform is a summation defined,
for a sampled signal $f_n \equiv f(nT)$, as
$$F(z) = \sum_{n=0}^{\infty} f(nT)\, z^{-n}$$
while the Laplace transform written symbolically as $L\{f(t)\}$ is an integral, defined for a continuous
time function f(t), t ≥ 0 as
$$F(s) = \int_0^{\infty} f(t)\, e^{-st}\, dt.$$
Thus, for example, if
$$f(t) = e^{-\alpha t} \qquad \text{(continuous time exponential)}$$
then
$$L\{f(t)\} = F(s) = \frac{1}{s + \alpha}$$
which has a (simple) pole at $s = -\alpha = s_1$ say.
As we have seen, sampling f(t) gives the sequence $\{f(nT)\} = \{e^{-\alpha nT}\}$ with z-transform
$$F(z) = \frac{1}{1 - e^{-\alpha T} z^{-1}} = \frac{z}{z - e^{-\alpha T}}.$$
The z-transform has a pole when $z = z_1$ where
$$z_1 = e^{-\alpha T} = e^{s_1 T}$$
[Note the abuse of notation in writing both F(s) and F(z) here since in fact these are different
functions.]
Task
The continuous time function $f(t) = t e^{-\alpha t}$ has Laplace transform
$$F(s) = \frac{1}{(s + \alpha)^2}$$
Firstly write down the pole of this function and its order:
Your solution
Answer
$F(s) = \dfrac{1}{(s + \alpha)^2}$ has its pole at $s = s_1 = -\alpha$. The pole is second order.
Now obtain the z-transform F (z) of the sampled version of f (t), locate the pole(s) of F (z) and
state the order:
Your solution
Answer
Consider $f(nT) = nT e^{-\alpha nT} = (nT)(e^{-\alpha T})^n$
The ramp sequence $\{nT\}$ has z-transform $\dfrac{Tz}{(z-1)^2}$
∴ f(nT) has z-transform
$$F(z) = \frac{T z e^{\alpha T}}{(z e^{\alpha T} - 1)^2} = \frac{T z e^{-\alpha T}}{(z - e^{-\alpha T})^2} \qquad \text{(see Key Point 8)}$$
This has a (second order) pole when $z = z_1 = e^{-\alpha T} = e^{s_1 T}$.
We have seen in both the above examples a close link between the pole $s_1$ of the Laplace transform
of f(t) and the pole $z_1$ of the z-transform of the sampled version of f(t) i.e.
$$z_1 = e^{s_1 T} \qquad (1)$$
where T is the sample interval.
Multiple poles lead to similar results i.e. if F(s) has poles $s_1, s_2, \ldots$ then F(z) has poles $z_1, z_2, \ldots$
where $z_i = e^{s_i T}$.
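The pole relation $z_i = e^{s_i T}$ is easy to experiment with. The sketch below is only illustrative: the helper name `z_pole_from_s_pole` and the values of α and T are assumptions. It maps the pole $s_1 = -\alpha$ of $1/(s + \alpha)$ into the z-plane and evaluates the z-transform of the sampled exponential very close to the mapped point, where its magnitude becomes large.

```python
import cmath

def z_pole_from_s_pole(s_pole, T):
    """Map a Laplace-domain pole s_i to the z-domain pole z_i = exp(s_i * T)."""
    return cmath.exp(s_pole * T)

T = 0.1        # sample interval (illustrative value)
alpha = 2.0    # f(t) = exp(-alpha t) has its pole at s1 = -alpha

z1 = z_pole_from_s_pole(-alpha, T)

def F(z):
    """z-transform of the sampled exponential: z / (z - exp(-alpha T))."""
    return z / (z - cmath.exp(-alpha * T))

# Very close to z1 the magnitude of F(z) is large, as expected near a pole.
print(z1, abs(F(z1 + 1e-6)))
```

Since $|z_1| = e^{\mathrm{Re}(s_1) T}$, a pole with negative real part in the s-plane lands inside the unit circle in the z-plane.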
The relation (1) between the poles is, in fact, an example of a more general relation between the
values of s and z as we shall now investigate.
Key Point 19
The unit impulse function δ(t) can be defined informally as follows:
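Figure 21: the rectangular pulse $P_\varepsilon(t)$, of height $1/\varepsilon$ between t = 0 and t = ε.
The rectangular pulse $P_\varepsilon(t)$ of width ε and height $\dfrac{1}{\varepsilon}$ shown in Figure 21 encloses unit area and has
Laplace transform
$$P_\varepsilon(s) = \int_0^{\varepsilon} \frac{1}{\varepsilon}\, e^{-st}\, dt = \frac{1}{\varepsilon s}\left(1 - e^{-\varepsilon s}\right) \qquad (2)$$
As ε becomes smaller $P_\varepsilon(t)$ becomes taller and narrower but still encloses unit area. The unit impulse
function δ(t) (sometimes called the Dirac delta function) can be defined as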
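$$\delta(t) = \lim_{\varepsilon \to 0} P_\varepsilon(t)$$
The Laplace transform, say Δ(s), of δ(t) can be obtained correspondingly by letting ε → 0 in (2),
i.e.
$$\Delta(s) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon s}\left(1 - e^{-\varepsilon s}\right)$$
$$= \lim_{\varepsilon \to 0} \frac{1 - \left(1 - \varepsilon s + \dfrac{(\varepsilon s)^2}{2!} - \ldots\right)}{\varepsilon s} \qquad \text{(using the Maclaurin series expansion of } e^{-\varepsilon s}\text{)}$$
$$= \lim_{\varepsilon \to 0} \frac{\varepsilon s - \dfrac{(\varepsilon s)^2}{2!} + \dfrac{(\varepsilon s)^3}{3!} - \ldots}{\varepsilon s}$$
$$= 1$$
i.e. $L\{\delta(t)\} = 1 \qquad (3)$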
Task
A shifted unit impulse δ(t − nT) is defined as $\lim_{\varepsilon \to 0} P_\varepsilon(t - nT)$ as illustrated below.
Figure: the rectangular pulse $P_\varepsilon(t - nT)$, of height $1/\varepsilon$ between t = nT and t = nT + ε.
Obtain the Laplace transform of this rectangular pulse and, by letting ε → 0,
obtain the Laplace transform of δ(t − nT).
Your solution
Answer
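$$L\{P_\varepsilon(t - nT)\} = \int_{nT}^{nT+\varepsilon} \frac{1}{\varepsilon}\, e^{-st}\, dt = \frac{1}{\varepsilon s}\Big[-e^{-st}\Big]_{nT}^{nT+\varepsilon}$$
$$= \frac{1}{\varepsilon s}\left(e^{-snT} - e^{-s(nT+\varepsilon)}\right)$$
$$= \frac{1}{\varepsilon s}\, e^{-snT}\left(1 - e^{-s\varepsilon}\right) \to e^{-snT} \quad \text{as } \varepsilon \to 0$$
Hence $L\{\delta(t - nT)\} = e^{-snT} \qquad (4)$
which reduces to the result (3)
$L\{\delta(t)\} = 1$ when n = 0
These results (3) and (4) can be compared with the results
$$Z\{\delta_n\} = 1$$
$$Z\{\delta_{n-m}\} = z^{-m}$$
for discrete impulses of height 1.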
Now consider a continuous function f (t). Suppose, as usual, that this function is sampled at t = nT
for n = 0, 1, 2, . . .
Figure 22: f(t) plotted against t, with the sample instants T, 2T, 3T, 4T marked.
This sampled equivalent of f(t), say $f_*(t)$, can be defined as a sequence of equidistant impulses, the
'strength' of each impulse being the sample value f(nT), i.e.
$$f_*(t) = \sum_{n=0}^{\infty} f(nT)\,\delta(t - nT)$$
This function is a continuous-time signal i.e. is defined for all t. Using (4) it has a Laplace transform
$$F_*(s) = \sum_{n=0}^{\infty} f(nT)\, e^{-snT} \qquad (5)$$
If, in this sum (5), we replace $e^{sT}$ by z we obtain the z-transform of the sequence $\{f(nT)\}$ of samples:
$$\sum_{n=0}^{\infty} f(nT)\, z^{-n}$$
Key Point 20
The Laplace transform
$$F(s) = \sum_{n=0}^{\infty} f(nT)\, e^{-snT}$$
of a sampled function is equivalent to the z-transform F(z) of the sequence $\{f(nT)\}$ of sample
values with $z = e^{sT}$.
Table 2: z-transforms of some sampled signals

This table can be compared with the table of the z-transforms of sequences on the following page.
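f(t), t ≥ 0 | f(nT), n = 0, 1, 2, . . . | F(z) | Radius of convergence R
------------------------------------------------------------------------
$1$ | $1$ | $\dfrac{z}{z-1}$ | $1$
$t$ | $nT$ | $\dfrac{Tz}{(z-1)^2}$ | $1$
$t^2$ | $(nT)^2$ | $\dfrac{T^2 z(z+1)}{(z-1)^3}$ | $1$
$e^{-\alpha t}$ | $e^{-\alpha nT}$ | $\dfrac{z}{z - e^{-\alpha T}}$ | $|e^{-\alpha T}|$
$\sin \omega t$ | $\sin n\omega T$ | $\dfrac{z \sin\omega T}{z^2 - 2z\cos\omega T + 1}$ | $1$
$\cos \omega t$ | $\cos n\omega T$ | $\dfrac{z(z - \cos\omega T)}{z^2 - 2z\cos\omega T + 1}$ | $1$
$t e^{-\alpha t}$ | $nT e^{-\alpha nT}$ | $\dfrac{T z e^{-\alpha T}}{(z - e^{-\alpha T})^2}$ | $|e^{-\alpha T}|$
$e^{-\alpha t} \sin \omega t$ | $e^{-\alpha nT} \sin \omega nT$ | $\dfrac{e^{-\alpha T} z^{-1} \sin\omega T}{1 - 2e^{-\alpha T} z^{-1} \cos\omega T + e^{-2\alpha T} z^{-2}}$ | $|e^{-\alpha T}|$
$e^{-\alpha t} \cos \omega t$ | $e^{-\alpha nT} \cos \omega nT$ | $\dfrac{1 - e^{-\alpha T} z^{-1} \cos\omega T}{1 - 2e^{-\alpha T} z^{-1} \cos\omega T + e^{-2\alpha T} z^{-2}}$ | $|e^{-\alpha T}|$
Note: R is such that the closed forms of F(z) (those listed in the above table) are valid for |z| > R.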
Table of z-transforms
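$f_n$ | F(z) | Name
---------------------------
$\delta_n$ | $1$ | unit impulse
$\delta_{n-m}$ | $z^{-m}$ |
$u_n$ | $\dfrac{z}{z-1}$ | unit step sequence
$a^n$ | $\dfrac{z}{z-a}$ | geometric sequence
$e^{\alpha n}$ | $\dfrac{z}{z - e^{\alpha}}$ |
$\sinh \alpha n$ | $\dfrac{z \sinh\alpha}{z^2 - 2z\cosh\alpha + 1}$ |
$\cosh \alpha n$ | $\dfrac{z^2 - z\cosh\alpha}{z^2 - 2z\cosh\alpha + 1}$ |
$\sin \omega n$ | $\dfrac{z \sin\omega}{z^2 - 2z\cos\omega + 1}$ |
$\cos \omega n$ | $\dfrac{z^2 - z\cos\omega}{z^2 - 2z\cos\omega + 1}$ |
$e^{-\alpha n} \sin \omega n$ | $\dfrac{z e^{-\alpha} \sin\omega}{z^2 - 2z e^{-\alpha}\cos\omega + e^{-2\alpha}}$ |
$e^{-\alpha n} \cos \omega n$ | $\dfrac{z^2 - z e^{-\alpha}\cos\omega}{z^2 - 2z e^{-\alpha}\cos\omega + e^{-2\alpha}}$ |
$n$ | $\dfrac{z}{(z-1)^2}$ | ramp sequence
$n^2$ | $\dfrac{z(z+1)}{(z-1)^3}$ |
$n^3$ | $\dfrac{z(z^2 + 4z + 1)}{(z-1)^4}$ |
$a^n f_n$ | $F\!\left(\dfrac{z}{a}\right)$ |
$n f_n$ | $-z\dfrac{dF}{dz}$ |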
Contents 22
Eigenvalues and Eigenvectors
22.1 Basic Concepts
22.2 Applications of Eigenvalues and Eigenvectors
22.3 Repeated Eigenvalues and Symmetric Matrices
22.4 Numerical Determination of Eigenvalues and Eigenvectors
Learning outcomes
In this Workbook you will learn about the matrix eigenvalue problem AX = kX where A
is a square matrix and k is a scalar (number). You will learn how to determine the
eigenvalues (k) and corresponding eigenvectors (X) for a given matrix A. You will learn
of some of the applications of eigenvalues and eigenvectors. Finally you will learn how
eigenvalues and eigenvectors may be determined numerically.

Basic Concepts 22.1
Introduction
From an applications viewpoint, eigenvalue problems are probably the most important problems that
arise in connection with matrix analysis. In this Section we discuss the basic concepts. We shall see
that eigenvalues and eigenvectors are associated with square matrices of order n × n. If n is small
(2 or 3), determining eigenvalues is a fairly straightforward process (requiring the solution of a low
order polynomial equation). Obtaining eigenvectors is a little strange initially and it will help if you
read this preliminary Section first.
Prerequisites
• have a knowledge of determinants and matrices
• have a knowledge of linear first order differential equations

Learning Outcomes
• obtain eigenvalues and eigenvectors of 2 × 2 and 3 × 3 matrices
• state basic properties of eigenvalues and eigenvectors
1. Basic concepts
Determinants
A square matrix possesses an associated determinant. Unlike a matrix, which is an array of numbers,
a determinant has a single value.

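A two by two matrix $C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$ has an associated determinant
$$\det(C) = \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} = c_{11} c_{22} - c_{21} c_{12}$$
(Note square or round brackets denote a matrix, straight vertical lines denote a determinant.)
A three by three matrix has an associated determinant
$$\det(C) = \begin{vmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{vmatrix}$$
Among other ways this determinant can be evaluated by an "expansion about the top row":
$$\det(C) = c_{11}\begin{vmatrix} c_{22} & c_{23} \\ c_{32} & c_{33} \end{vmatrix} - c_{12}\begin{vmatrix} c_{21} & c_{23} \\ c_{31} & c_{33} \end{vmatrix} + c_{13}\begin{vmatrix} c_{21} & c_{22} \\ c_{31} & c_{32} \end{vmatrix}$$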
Note the minus sign in the second term.
Task
Evaluate the determinants

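$$\det(A) = \begin{vmatrix} 4 & 6 \\ 3 & 1 \end{vmatrix} \qquad \det(B) = \begin{vmatrix} 4 & 8 \\ 1 & 2 \end{vmatrix} \qquad \det(C) = \begin{vmatrix} 6 & 5 & 4 \\ 2 & -1 & 7 \\ -3 & 2 & 0 \end{vmatrix}$$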
Your solution
Answer
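$\det A = 4 \times 1 - 6 \times 3 = -14 \qquad \det B = 4 \times 2 - 8 \times 1 = 0$
$$\det C = 6\begin{vmatrix} -1 & 7 \\ 2 & 0 \end{vmatrix} - 5\begin{vmatrix} 2 & 7 \\ -3 & 0 \end{vmatrix} + 4\begin{vmatrix} 2 & -1 \\ -3 & 2 \end{vmatrix} = 6 \times (-14) - 5(21) + 4(4 - 3) = -185$$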

A matrix such as $B = \begin{pmatrix} 4 & 8 \\ 1 & 2 \end{pmatrix}$ in the previous task which has zero determinant is called a singular
matrix. The other two matrices A and C are non-singular. The key factor to be aware of is as
follows:
Key Point 1
Any non-singular n × n matrix C, for which $\det(C) \neq 0$, possesses an inverse $C^{-1}$ i.e.
$CC^{-1} = C^{-1}C = I$ where I denotes the n × n identity matrix
A singular matrix does not possess an inverse.
Systems of linear equations

We first recall some basic results in linear (matrix) algebra. Consider a system of n equations in n
unknowns x1 , x2 , . . . , xn :
$$\begin{aligned} c_{11}x_1 + c_{12}x_2 + \ldots + c_{1n}x_n &= k_1 \\ c_{21}x_1 + c_{22}x_2 + \ldots + c_{2n}x_n &= k_2 \\ &\;\;\vdots \\ c_{n1}x_1 + c_{n2}x_2 + \ldots + c_{nn}x_n &= k_n \end{aligned}$$
We can write such a system in matrix form:
$$\begin{pmatrix} c_{11} & c_{12} & \ldots & c_{1n} \\ c_{21} & c_{22} & \ldots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \ldots & c_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{pmatrix}, \quad \text{or equivalently } CX = K.$$
We see that C is an n × n matrix (called the coefficient matrix), $X = \{x_1, x_2, \ldots, x_n\}^T$ is the n × 1
column vector of unknowns and $K = \{k_1, k_2, \ldots, k_n\}^T$ is an n × 1 column vector of given constants.
The zero matrix will be denoted by O.
If $K \neq O$ the system is called inhomogeneous; if K = O the system is called homogeneous.
Basic results in linear algebra

Consider the system of equations CX = K.
We are concerned with the nature of the solutions (if any) of this system. We shall see that this
system only exhibits three solution types:
• The system is consistent and has a unique solution for X
• The system is consistent and has an infinite number of solutions for X
• The system is inconsistent and has no solution for X
There are two basic cases to consider:

$\det(C) \neq 0$ or $\det(C) = 0$
Case 1: $\det(C) \neq 0$
In this case $C^{-1}$ exists and the unique solution to CX = K is
$$X = C^{-1}K$$
Case 2: $\det(C) = 0$
In this case $C^{-1}$ does not exist.
(a) If $K \neq O$ the system CX = K has either no solutions or an infinite number of solutions, depending on whether the equations are consistent.

(b) If K = O the system CX = O has an infinite number of solutions.
We note that a homogeneous system

CX = O
has a unique solution X = O if $\det(C) \neq 0$ (this is called the trivial solution) or an infinite number
of solutions if $\det(C) = 0$.
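For a 2 × 2 system the three solution types can be told apart with a few lines of arithmetic. The following Python sketch is an illustration under stated assumptions, not the Workbook's procedure: the function name `solve_2x2` is hypothetical, the coefficients are assumed to be integers, the unique solution is written out with Cramer's rule (equivalent to $X = C^{-1}K$ for a 2 × 2 matrix), and consistency in the singular case is tested through the 2 × 2 minors of the augmented matrix.

```python
from fractions import Fraction

def solve_2x2(C, K):
    """Classify CX = K for a 2x2 integer matrix C.

    Returns ("unique", [x1, x2]), ("none", None) or ("infinite", None).
    """
    (a, b), (c, d) = C
    k1, k2 = K
    det = a * d - b * c
    if det != 0:
        # Case 1: C is non-singular, so there is exactly one solution.
        x1 = Fraction(k1 * d - b * k2, det)
        x2 = Fraction(a * k2 - k1 * c, det)
        return "unique", [x1, x2]
    # Case 2: det(C) = 0, so the outcome depends on whether the equations are consistent.
    if a == b == c == d == 0:
        consistent = (k1 == 0 and k2 == 0)
    else:
        consistent = (a * k2 - c * k1 == 0) and (b * k2 - d * k1 == 0)
    return ("infinite", None) if consistent else ("none", None)

print(solve_2x2([[1, 1], [2, 1]], [1, 2]))   # non-singular coefficient matrix
print(solve_2x2([[1, 2], [3, 6]], [1, 0]))   # singular and inconsistent
print(solve_2x2([[1, 1], [2, 2]], [0, 0]))   # singular and homogeneous
```

Exact fractions are used so that the test det = 0 is not blurred by floating-point rounding.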
Example 1
(Case 1) Solve the inhomogeneous system of equations
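$$x_1 + x_2 = 1 \qquad 2x_1 + x_2 = 2$$
which can be expressed as CX = K where
$$C = \begin{pmatrix} 1 & 1 \\ 2 & 1 \end{pmatrix} \qquad X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \qquad K = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$
Solution
Here $\det(C) = -1 \neq 0$.
The system of equations has the unique solution: $X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.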
Example 2
(Case 2a) Examine the following inhomogeneous system for solutions
x1 + 2x2 = 1
3x1 + 6x2 = 0
Solution

Here $\det(C) = \begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix} = 0$. In this case there are no solutions.
To see this, note that the first equation of the system states $x_1 + 2x_2 = 1$ whereas the second equation
(after dividing through by 3) states $x_1 + 2x_2 = 0$, a contradiction.
Example 3
(Case 2b) Solve the homogeneous system
x1 + x2 = 0
2x1 + 2x2 = 0
Solution

Here $\det(C) = \begin{vmatrix} 1 & 1 \\ 2 & 2 \end{vmatrix} = 0$. The solutions are any pairs of numbers $\{x_1, x_2\}$ such that $x_1 = -x_2$,
i.e. $X = \begin{pmatrix} \alpha \\ -\alpha \end{pmatrix}$ where α is arbitrary.
There are an infinite number of solutions.
A simple eigenvalue problem

We shall be interested in simultaneous equations of the form:
AX = λX,
where A is an n × n matrix, X is an n × 1 column vector and λ is a scalar (a constant) and, in the
first instance, we examine some simple examples to gain experience of solving problems of this type.
Example 4
Consider the following system with n = 2:
2x + 3y = λx
3x + 2y = λy
so that

$$A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix} \quad \text{and} \quad X = \begin{pmatrix} x \\ y \end{pmatrix}.$$
It appears that there are three unknowns x, y, λ. The obvious questions to ask
are: can we find x, y? what is λ?
Solution
To solve this problem we firstly re-arrange the equations (take all unknowns onto one side);
(2 − λ)x + 3y = 0 (1)
3x + (2 − λ)y = 0 (2)
Therefore, from equation (2):
$$x = -\frac{(2 - \lambda)}{3}\, y. \qquad (3)$$
Then when we substitute this into (1)
$$-\frac{(2 - \lambda)^2}{3}\, y + 3y = 0 \quad \text{which simplifies to} \quad \left[-(2 - \lambda)^2 + 9\right] y = 0.$$
We conclude that either y = 0 or $9 = (2 - \lambda)^2$. There are thus two cases to consider:
Case 1
If y = 0 then x = 0 (from (3)) and we get the trivial solution. (We could have guessed this
solution at the outset.)
Case 2
$9 = (2 - \lambda)^2$
which gives, on taking square roots:
$\pm 3 = 2 - \lambda$ giving $\lambda = 2 \pm 3$ so λ = 5 or λ = −1.
Now, from equation (3), if λ = 5 then x = +y and if λ = −1 then x = −y.
We have now completed the analysis. We have found values for λ but we also see that we cannot
obtain unique values for x and y: all we can find is the ratio between these quantities. This behaviour
is typical, as we shall now see, of an eigenvalue problem.
2. General eigenvalue problems
Consider a given square matrix A. If X is a column vector and λ is a scalar (a number) then the
relation
AX = λX (4)
is called an eigenvalue problem. Our purpose is to carry out an analysis of this equation in a
manner similar to the example above. However, we will attempt a more general approach which will
apply to all problems of this kind.
Firstly, we can spot an obvious solution (for X) to these equations. The solution X = 0 is a
possibility (for then both sides are zero). We will not be interested in these trivial solutions of the
eigenvalue problem. Our main interest will be in the occurrence of non-trivial solutions for X.
These may exist for special values of λ, called the eigenvalues of the matrix A. We proceed as in
the previous example:
take all unknowns to one side:
(A − λI)X = 0 (5)
where I is a unit matrix with the same dimensions as A. (Note that AX − λX = 0 does not
simplify to (A − λ)X = 0 as you cannot subtract a scalar λ from a matrix A). This equation (5)
is a homogeneous system of equations. In the notation of the earlier discussion C ≡ A − λI and
K ≡ 0. For such a system we know that non-trivial solutions will only exist if the determinant of the
coefficient matrix is zero:
det(A − λI) = 0 (6)
Equation (6) is called the characteristic equation of the eigenvalue problem. We see that the
characteristic equation only involves one unknown λ. The characteristic equation is generally a
polynomial in λ, with degree being the same as the order of A (so if A is 2 × 2 the characteristic
equation is a quadratic, if A is a 3 × 3 it is a cubic equation, and so on). For each value of λ that is
obtained the corresponding value of X is obtained by solving the original equations (4). These X’s
are called eigenvectors.
N.B. We shall see that eigenvectors are only unique up to a multiplicative factor: i.e. if X satisfies
AX = λX then so does kX when k is any constant.
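For a 2 × 2 matrix the characteristic equation det(A − λI) = 0 expands to the quadratic $\lambda^2 - (a + d)\lambda + (ad - bc) = 0$, so the whole procedure fits in a few lines. The sketch below is an illustration only: the function names are hypothetical, real eigenvalues are assumed, and the matrix used is the one from Example 4.

```python
import math

def eigenvalues_2x2(A):
    """Real roots of det(A - lam I) = lam^2 - trace*lam + det = 0."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return (trace - root) / 2, (trace + root) / 2

def eigenvector_2x2(A, lam):
    """One non-trivial solution (x, y) of (A - lam I) X = 0."""
    (a, b), (c, d) = A
    if b != 0 or a - lam != 0:
        return (b, lam - a)      # satisfies (a - lam) x + b y = 0
    if c != 0 or d - lam != 0:
        return (lam - d, c)      # satisfies c x + (d - lam) y = 0
    return (1.0, 0.0)            # A - lam I is the zero matrix, so any vector works

A = [[2, 3], [3, 2]]
for lam in eigenvalues_2x2(A):
    print(lam, eigenvector_2x2(A, lam))
```

Only the ratio x : y is determined for each eigenvalue, so any non-zero multiple of the vector returned is equally valid.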
Example 5
Find the eigenvalues and eigenvectors of the matrix $A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}$
Solution
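The eigenvalues and eigenvectors are found by solving the eigenvalue problem
$$AX = \lambda X, \qquad X = \begin{pmatrix} x \\ y \end{pmatrix} \qquad \text{i.e.} \quad (A - \lambda I)X = 0.$$
Non-trivial solutions will exist if $\det(A - \lambda I) = 0$
that is, $\det\left\{\begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix} - \lambda\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right\} = 0$, ∴ $\begin{vmatrix} 1 - \lambda & 0 \\ 1 & 2 - \lambda \end{vmatrix} = 0$,
expanding this determinant: (1 − λ)(2 − λ) = 0. Hence the solutions for λ are: λ = 1 and λ = 2.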
So we have found two values of λ for this 2 × 2 matrix A. Since these are unequal they are said to
be distinct eigenvalues.
To each value of λ there corresponds an eigenvector. We now proceed to find the eigenvectors.
Case 1
λ = 1 (smaller eigenvalue). Then our original eigenvalue problem becomes: AX = X. In full this
is
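$$x = x \qquad x + 2y = y$$
Simplifying
$$x = x \quad (a) \qquad x + y = 0 \quad (b)$$
All we can deduce here is that x = −y ∴ $X = \begin{pmatrix} x \\ -x \end{pmatrix}$ for any $x \neq 0$
(We specify $x \neq 0$ as, otherwise, we would have the trivial solution.)
So the eigenvectors corresponding to eigenvalue λ = 1 are all proportional to $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$, e.g. $\begin{pmatrix} 2 \\ -2 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$ etc.
Sometimes we write the eigenvector in normalised form that is, with modulus or magnitude 1.
Here, the normalised form of X is
$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$$ which is unique up to a change of sign.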
Case 2 Now we consider the larger eigenvalue λ = 2. Our original eigenvalue problem AX = λX
becomes AX = 2X which gives the following equations:

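$$\begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 2\begin{pmatrix} x \\ y \end{pmatrix}$$
i.e.
$$x = 2x \qquad x + 2y = 2y$$
These equations imply that x = 0 whilst the variable y may take any value whatsoever (except zero
as this gives the trivial solution).
Thus the eigenvector corresponding to eigenvalue λ = 2 has the form $\begin{pmatrix} 0 \\ y \end{pmatrix}$, e.g. $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 2 \end{pmatrix}$ etc.
The normalised eigenvector here is $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
In conclusion: the matrix $A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}$ has two eigenvalues and two associated normalised eigenvectors:
$$\lambda_1 = 1, \qquad \lambda_2 = 2$$
$$X_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} \qquad X_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$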
Example 6
Find the eigenvalues and eigenvectors of the 3 × 3 matrix
 
$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}$$
Solution
The eigenvalues and eigenvectors are found by solving the eigenvalue problem
 
$$AX = \lambda X, \qquad X = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$
Proceeding as in Example 5:
(A − λI)X = 0 and non-trivial solutions for X will exist if det (A − λI) = 0
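that is,
$$\det\left\{\begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} - \lambda\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\right\} = 0$$
i.e.
$$\begin{vmatrix} 2 - \lambda & -1 & 0 \\ -1 & 2 - \lambda & -1 \\ 0 & -1 & 2 - \lambda \end{vmatrix} = 0.$$
Expanding this determinant we find:
$$(2 - \lambda)\begin{vmatrix} 2 - \lambda & -1 \\ -1 & 2 - \lambda \end{vmatrix} + \begin{vmatrix} -1 & -1 \\ 0 & 2 - \lambda \end{vmatrix} = 0$$
that is,
$$(2 - \lambda)\{(2 - \lambda)^2 - 1\} - (2 - \lambda) = 0$$
Taking out the common factor (2 − λ):
$$(2 - \lambda)\{4 - 4\lambda + \lambda^2 - 1 - 1\} = 0$$
which gives: $(2 - \lambda)[\lambda^2 - 4\lambda + 2] = 0$.
This is easily solved to give: λ = 2 or $\lambda = \dfrac{4 \pm \sqrt{16 - 8}}{2} = 2 \pm \sqrt{2}$.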
So (typically) we have found three possible values of λ for this 3 × 3 matrix A.
To each value of λ there corresponds an eigenvector.
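Case 1: $\lambda = 2 - \sqrt{2}$ (lowest eigenvalue)
Then $AX = (2 - \sqrt{2})X$ implies
$$\begin{aligned} 2x - y &= (2 - \sqrt{2})x \\ -x + 2y - z &= (2 - \sqrt{2})y \\ -y + 2z &= (2 - \sqrt{2})z \end{aligned}$$
Simplifying
$$\begin{aligned} \sqrt{2}x - y &= 0 \quad (a) \\ -x + \sqrt{2}y - z &= 0 \quad (b) \\ -y + \sqrt{2}z &= 0 \quad (c) \end{aligned}$$
We conclude the following:
(c) ⇒ $y = \sqrt{2}z$, (a) ⇒ $y = \sqrt{2}x$
∴ these two relations give x = z then (b) ⇒ −x + 2x − x = 0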
The last equation gives us no information; it simply states that 0 = 0.
 
∴ $X = \begin{pmatrix} x \\ \sqrt{2}x \\ x \end{pmatrix}$ for any $x \neq 0$ (otherwise we would have the trivial solution). So the
eigenvectors corresponding to eigenvalue $\lambda = 2 - \sqrt{2}$ are all proportional to $\begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}$.
In normalised form we have an eigenvector $\dfrac{1}{2}\begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}$.
Case 2: λ = 2
Here AX = 2X implies $\begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = 2\begin{pmatrix} x \\ y \\ z \end{pmatrix}$
i.e.
2x − y = 2x
−x + 2y − z = 2y
−y + 2z = 2z
After simplifying the equations become:
−y = 0 (a)
−x − z = 0 (b)
−y = 0 (c)
(a), (c) imply y = 0: (b) implies x = −z
∴ eigenvector has the form $\begin{pmatrix} x \\ 0 \\ -x \end{pmatrix}$ for any $x \neq 0$.
That is, eigenvectors corresponding to λ = 2 are all proportional to $\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$.
In normalised form we have an eigenvector $\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$.
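Case 3: $\lambda = 2 + \sqrt{2}$ (largest eigenvalue)
Proceeding along similar lines to cases 1, 2 above we find that the eigenvectors corresponding to
$\lambda = 2 + \sqrt{2}$ are each proportional to $\begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}$ with normalised eigenvector $\dfrac{1}{2}\begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}$.
In conclusion the matrix A has three distinct eigenvalues:
$$\lambda_1 = 2 - \sqrt{2}, \qquad \lambda_2 = 2, \qquad \lambda_3 = 2 + \sqrt{2}$$
and three corresponding normalised eigenvectors:
$$X_1 = \frac{1}{2}\begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}, \qquad X_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad X_3 = \frac{1}{2}\begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}$$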
Exercise
Find the eigenvalues and eigenvectors of each of the following matrices A:
   
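(a) $\begin{pmatrix} 4 & -2 \\ 1 & 1 \end{pmatrix}$   (b) $\begin{pmatrix} 1 & 2 \\ -8 & 11 \end{pmatrix}$   (c) $\begin{pmatrix} 2 & 0 & -2 \\ 0 & 4 & 0 \\ -2 & 0 & 5 \end{pmatrix}$   (d) $\begin{pmatrix} 10 & -2 & 4 \\ -20 & 4 & -10 \\ -30 & 6 & -13 \end{pmatrix}$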
Answer (eigenvectors are written in normalised form)
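(a) 3 and 2; $\begin{pmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{pmatrix}$ and $\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$
(b) 3 and 9; $\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\dfrac{1}{\sqrt{17}}\begin{pmatrix} 1 \\ 4 \end{pmatrix}$
(c) 1, 4 and 6; $\dfrac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}$; $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$; $\dfrac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix}$
(d) 0, −1 and 2; $\dfrac{1}{\sqrt{26}}\begin{pmatrix} 1 \\ 5 \\ 0 \end{pmatrix}$; $\dfrac{1}{\sqrt{5}}\begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}$; $\dfrac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ 0 \\ -2 \end{pmatrix}$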
3. Properties of eigenvalues and eigenvectors
There are a number of general properties of eigenvalues and eigenvectors which you should be familiar
with. You will be able to use them as a check on some of your calculations.
Property 1: Sum of eigenvalues
For any square matrix A:
sum of eigenvalues = sum of diagonal terms of A (called the trace of A)

Formally, for an n × n matrix A: $\displaystyle\sum_{i=1}^{n} \lambda_i = \text{trace}(A)$
(Repeated eigenvalues must be counted according to their multiplicity.)
Thus if $\lambda_1 = 4$, $\lambda_2 = 4$, $\lambda_3 = 1$ then $\displaystyle\sum_{i=1}^{3} \lambda_i = 9$.
Property 2: Product of eigenvalues

For any square matrix A:
product of eigenvalues = determinant of A

Formally: $\lambda_1 \lambda_2 \lambda_3 \cdots \lambda_n = \displaystyle\prod_{i=1}^{n} \lambda_i = \det(A)$
The symbol $\prod$ simply denotes multiplication, as $\sum$ denotes summation.
Example 7
Verify Properties 1 and 2 for the 3 × 3 matrix:
 
$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}$$
whose eigenvalues were found earlier.
Solution
The three eigenvalues of this matrix are:
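$$\lambda_1 = 2 - \sqrt{2}, \qquad \lambda_2 = 2, \qquad \lambda_3 = 2 + \sqrt{2}$$
Therefore
$$\lambda_1 + \lambda_2 + \lambda_3 = (2 - \sqrt{2}) + 2 + (2 + \sqrt{2}) = 6 = \text{trace}(A)$$
whilst $\lambda_1 \lambda_2 \lambda_3 = (2 - \sqrt{2})(2)(2 + \sqrt{2}) = 4 = \det(A)$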
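Properties 1 and 2 make a convenient numerical check on hand calculations. The sketch below repeats the verification for the same 3 × 3 matrix in Python; the helper `det3`, which uses the expansion about the top row described in Section 22.1, is an illustrative name rather than anything defined in the Workbook.

```python
import math

def det3(M):
    """Determinant of a 3x3 matrix by expansion about the top row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]

# Eigenvalues found in Example 6
eigenvalues = [2 - math.sqrt(2), 2.0, 2 + math.sqrt(2)]

trace = sum(A[k][k] for k in range(3))
print(sum(eigenvalues), trace)            # Property 1: sum of eigenvalues vs trace
print(math.prod(eigenvalues), det3(A))    # Property 2: product of eigenvalues vs determinant
```

Because of the square roots the comparison is only accurate to floating-point rounding, so the printed values should be compared with a small tolerance rather than tested for exact equality.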
Property 3: Linear independence of eigenvectors

Eigenvectors of a matrix A corresponding to distinct eigenvalues are linearly independent i.e. one
eigenvector cannot be written as a linear sum of the other eigenvectors. The proof of this result is
omitted but we illustrate this property with two examples.
We saw earlier that the matrix

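$$A = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}$$
has distinct eigenvalues $\lambda_1 = 1$, $\lambda_2 = 2$ with associated eigenvectors
$$X^{(1)} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} \qquad X^{(2)} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
respectively.
Clearly $X^{(1)}$ is not a constant multiple of $X^{(2)}$ and these eigenvectors are linearly independent.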
We also saw that the 3 × 3 matrix
 
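$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}$$
had the following distinct eigenvalues $\lambda_1 = 2 - \sqrt{2}$, $\lambda_2 = 2$, $\lambda_3 = 2 + \sqrt{2}$ with corresponding
eigenvectors of the form shown:
$$X^{(1)} = \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}, \qquad X^{(2)} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad X^{(3)} = \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}$$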
Clearly none of these eigenvectors is a constant multiple of any other. Nor is any one obtainable as
a linear combination of the other two. The three eigenvectors are linearly independent.
Property 4: Eigenvalues of diagonal matrices
A 2 × 2 diagonal matrix D has the form

$$D = \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$$
The characteristic equation $|D - \lambda I| = 0$ is
$$\begin{vmatrix} a - \lambda & 0 \\ 0 & d - \lambda \end{vmatrix} = 0$$
i.e. $(a - \lambda)(d - \lambda) = 0$
So the eigenvalues are simply the diagonal elements a and d.
Similarly a 3 × 3 diagonal matrix has the form
$$D = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}$$
having characteristic equation
$$|D - \lambda I| = (a - \lambda)(b - \lambda)(c - \lambda) = 0$$
so again the diagonal elements are the eigenvalues.
We can see that a diagonal matrix is a particularly simple matrix to work with. In addition to the
eigenvalues being obtainable immediately by inspection it is exceptionally easy to multiply diagonal
matrices.
Task
Obtain the products D1 D2 and D2 D1 of the diagonal matrices
   
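$$D_1 = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \qquad D_2 = \begin{pmatrix} e & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & g \end{pmatrix}$$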
Your solution
Answer
$$D_1 D_2 = D_2 D_1 = \begin{pmatrix} ae & 0 & 0 \\ 0 & bf & 0 \\ 0 & 0 & cg \end{pmatrix}$$
which of course is also a diagonal matrix.
Exercise
If λ1 , λ2 , . . . λn are the eigenvalues of a matrix A, prove the following:
(a) $A^T$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$.
(b) If A is upper triangular, then its eigenvalues are exactly the main diagonal entries.
(c) The inverse matrix $A^{-1}$ has eigenvalues $\dfrac{1}{\lambda_1}, \dfrac{1}{\lambda_2}, \ldots, \dfrac{1}{\lambda_n}$.
(d) The matrix A − kI has eigenvalues $\lambda_1 - k, \lambda_2 - k, \ldots, \lambda_n - k$.
(e) (Harder) The matrix $A^2$ has eigenvalues $\lambda_1^2, \lambda_2^2, \ldots, \lambda_n^2$.
(f) (Harder) The matrix $A^k$ (k a non-negative integer) has eigenvalues $\lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k$.
Verify the above results for any 2 × 2 matrix and any 3 × 3 matrix found in the previous Exercises
on page 13.
N.B. Some of these results are useful in the numerical calculation of eigenvalues which we shall
consider later.
Answer
(a) Using the property that for any square matrix A, $\det(A) = \det(A^T)$ we see that if
$\det(A - \lambda I) = 0$ then $\det(A - \lambda I)^T = 0$
This immediately tells us that $\det(A^T - \lambda I) = 0$ which shows that λ is also an eigenvalue
of $A^T$.
(b) Here simply write down a typical upper triangular matrix U which has terms on the leading
diagonal $u_{11}, u_{22}, \ldots, u_{nn}$ and above it. Then construct (U − λI). Finally imagine how
you would then obtain det(U − λI) = 0. You should see that the determinant is obtained
by multiplying together those terms on the leading diagonal. Here the characteristic
equation is:
$$(u_{11} - \lambda)(u_{22} - \lambda) \ldots (u_{nn} - \lambda) = 0$$
This polynomial has the obvious roots $\lambda_1 = u_{11}, \lambda_2 = u_{22}, \ldots, \lambda_n = u_{nn}$.

(c) Here we begin with the usual eigenvalue problem AX = λX. If A has an inverse $A^{-1}$
we can multiply both sides by $A^{-1}$ on the left to give
$A^{-1}(AX) = A^{-1}\lambda X$ which gives $X = \lambda A^{-1}X$
or, dividing through by the scalar λ we get
$A^{-1}X = \dfrac{1}{\lambda}X$ which shows that if λ and X are respectively eigenvalue and eigenvector
of A then $\lambda^{-1}$ and X are respectively eigenvalue and eigenvector of $A^{-1}$.
As an example consider $A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$. This matrix has eigenvalues $\lambda_1 = -1$, $\lambda_2 = 5$
with corresponding eigenvectors $X_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ and $X_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. The reader should
verify (by direct multiplication) that $A^{-1} = -\dfrac{1}{5}\begin{pmatrix} 2 & -3 \\ -3 & 2 \end{pmatrix}$ has eigenvalues −1 and $\dfrac{1}{5}$
with respective eigenvectors $X_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ and $X_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
(d), (e) and (f) are proved in a similar way to the proof outlined in (c).
Applications of
Eigenvalues and
Eigenvectors 22.2
Introduction
Many applications of matrices in both engineering and science utilize eigenvalues and, sometimes,
eigenvectors. Control theory, vibration analysis, electric circuits, advanced dynamics and quantum
mechanics are just a few of the application areas.
Many of the applications involve the use of eigenvalues and eigenvectors in the process of trans-
forming a given matrix into a diagonal matrix and we discuss this process in this Section. We then
go on to show how this process is invaluable in solving coupled differential equations of both first
order and second order.
Prerequisites
matrices

Learning Outcomes
• diagonalize a matrix with distinct eigenvalues using the modal matrix
• solve systems of linear differential equations by the 'decoupling' method
1. Diagonalization of a matrix with distinct eigenvalues

Diagonalization means transforming a non-diagonal matrix into an equivalent matrix which is diagonal
and hence is simpler to deal with.
A matrix A with distinct eigenvalues has, as we mentioned in Property 3 in 22.1, eigenvectors
which are linearly independent. If we form a matrix P whose columns are these eigenvectors, it can
be shown that
$$\det(P) \neq 0$$
so that $P^{-1}$ exists.
The product $P^{-1}AP$ is then a diagonal matrix D whose diagonal elements are the eigenvalues
of A. Thus if $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the distinct eigenvalues of A with associated eigenvectors
$X^{(1)}, X^{(2)}, \ldots, X^{(n)}$ respectively, then
$$P = \left[\, X^{(1)} \;\vdots\; X^{(2)} \;\vdots\; \cdots \;\vdots\; X^{(n)} \,\right]$$
will produce a product
$$P^{-1}AP = D = \begin{pmatrix} \lambda_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \ldots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \ldots & \ldots & \lambda_n \end{pmatrix}$$
We see that the order of the eigenvalues in D matches the order in which P is formed from the
eigenvectors.
N.B.
(a) The matrix P is called the modal matrix of A

(b) Since D is a diagonal matrix with eigenvalues λ1 , λ2 , . . . , λn which are the same as those
of A, then the matrices D and A are said to be similar.
(c) The transformation of A into D using
$P^{-1}AP = D$
is said to be a similarity transformation.
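Example 8
Let $A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$. Obtain the modal matrix P and calculate the product $P^{-1}AP$.
(The eigenvalues and eigenvectors of this particular matrix A were obtained earlier
in this Workbook at page 7.)
Solution
The matrix A has two distinct eigenvalues $\lambda_1 = -1$, $\lambda_2 = 5$ with corresponding eigenvectors
$X_1 = \begin{pmatrix} x \\ -x \end{pmatrix}$ and $X_2 = \begin{pmatrix} x \\ x \end{pmatrix}$. We can therefore form the modal matrix from the simplest
eigenvectors of these forms:
$$P = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$
(Other eigenvectors would be acceptable e.g. we could use $P = \begin{pmatrix} 2 & 3 \\ -2 & 3 \end{pmatrix}$ but there is no reason
to over complicate the calculation.)
It is easy to obtain the inverse of this 2 × 2 matrix P and the reader should confirm that:
$$P^{-1} = \frac{1}{\det(P)}\,\text{adj}(P) = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}^T = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$
We can now construct the product $P^{-1}AP$:
$$P^{-1}AP = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$
$$= \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} -1 & 5 \\ 1 & 5 \end{pmatrix}$$
$$= \frac{1}{2}\begin{pmatrix} -2 & 0 \\ 0 & 10 \end{pmatrix}$$
$$= \begin{pmatrix} -1 & 0 \\ 0 & 5 \end{pmatrix}$$
which is a diagonal matrix with entries the eigenvalues, as expected. Show (by repeating the method
outlined above) that had we defined $P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ (i.e. interchanged the order in which the
eigenvectors were taken) we would find $P^{-1}AP = \begin{pmatrix} 5 & 0 \\ 0 & -1 \end{pmatrix}$ (i.e. the resulting diagonal elements
would also be interchanged.)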
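The product $P^{-1}AP$ in Example 8, and the effect of interchanging the columns of P, can be confirmed with exact rational arithmetic. This is a minimal sketch in Python using only the standard library; the helper names `matmul` and `inverse_2x2` are illustrative and the inverse is formed from the usual adjoint formula for a 2 × 2 matrix.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse_2x2(P):
    """Inverse of a 2x2 integer matrix via adj(P) / det(P)."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("P is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 3], [3, 2]]

P = [[1, 1], [-1, 1]]            # columns: eigenvectors for -1 and 5
D = matmul(matmul(inverse_2x2(P), A), P)

P_swapped = [[1, 1], [1, -1]]    # same eigenvectors, columns interchanged
D_swapped = matmul(matmul(inverse_2x2(P_swapped), A), P_swapped)

print(D)
print(D_swapped)
```

The diagonal entries appear in the same order as the eigenvectors used for the columns of P.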
Task
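The matrix $A = \begin{pmatrix} -1 & 4 \\ 0 & 3 \end{pmatrix}$ has eigenvalues −1 and 3 with respective
eigenvectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
If $P_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $P_2 = \begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix}$, $P_3 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ write down the
products $P_1^{-1}AP_1$, $P_2^{-1}AP_2$, $P_3^{-1}AP_3$
(You may not need to do detailed calculations.)
Your solution
Answer
$$P_1^{-1}AP_1 = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix} = D_1 \qquad P_2^{-1}AP_2 = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix} = D_2 \qquad P_3^{-1}AP_3 = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} = D_3$$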
Note that D1 = D2 , demonstrating that any eigenvectors of A can be used to form P . Note also
that since the columns of P1 have been interchanged in forming P3 then so have the eigenvalues in
D3 as compared with D1 .
Matrix powers
If P −1 AP = D then we can obtain A (i.e. make A the subject of this matrix equation) as follows:
Multiplying on the left by P and on the right by P −1 we obtain
P P −1 AP P −1 = P DP −1
Now using the fact that P P −1 = P −1 P = I we obtain
IAI = P DP −1 and so
A = P DP −1
We can use this result to obtain the powers of a square matrix, a process which is sometimes useful
in control theory. Note that
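\[ A^2 = A\,A, \qquad A^3 = A\,A\,A, \quad\text{etc.} \]
Clearly, obtaining high powers of $A$ directly would in general involve many multiplications. The
process is quite straightforward, however, for a diagonal matrix $D$, as this next Task shows.
Task
Obtain $D^2$ and $D^3$ if $D = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}$. Write down $D^{10}$.
Your solution
Answer
\[ D^2 = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}\begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} = \begin{pmatrix} 3^2 & 0 \\ 0 & (-2)^2 \end{pmatrix} = \begin{pmatrix} 9 & 0 \\ 0 & 4 \end{pmatrix} \]
\[ D^3 = \begin{pmatrix} 3^2 & 0 \\ 0 & (-2)^2 \end{pmatrix}\begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} = \begin{pmatrix} 3^3 & 0 \\ 0 & (-2)^3 \end{pmatrix} = \begin{pmatrix} 27 & 0 \\ 0 & -8 \end{pmatrix} \]
Continuing in this way:
\[ D^{10} = \begin{pmatrix} 3^{10} & 0 \\ 0 & (-2)^{10} \end{pmatrix} = \begin{pmatrix} 59049 & 0 \\ 0 & 1024 \end{pmatrix} \]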
We now use the relation $A = PDP^{-1}$ to obtain a formula for powers of $A$ in terms of the easily calculated powers of the diagonal matrix $D$:
\[ A^2 = A\,A = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1} = PDIDP^{-1} = PD^2P^{-1} \]
Similarly:
\[ A^3 = A^2\,A = (PD^2P^{-1})(PDP^{-1}) = PD^2(P^{-1}P)DP^{-1} = PD^3P^{-1} \]
The general result is given in the following Key Point:
Key Point 2
For a matrix $A$ with distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ and associated eigenvectors $X^{(1)}, X^{(2)}, \dots, X^{(n)}$ then if
\[ P = [X^{(1)} : X^{(2)} : \dots : X^{(n)}] \]
$D = P^{-1}AP$ is a diagonal matrix such that
\[ D = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix} \quad\text{and}\quad A^k = PD^kP^{-1} \]
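Example 9
If $A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$ find $A^{23}$. (Use the results of Example 8.)
Solution
We know from Example 8 that if $P = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$ then $P^{-1}AP = \begin{pmatrix} -1 & 0 \\ 0 & 5 \end{pmatrix} = D$ where $P^{-1} = \dfrac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$.
∴ $A = PDP^{-1}$ and $A^{23} = PD^{23}P^{-1}$ using the general result in Key Point 2, i.e.
\[ A^{23} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 0 & 5^{23} \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \]
(here $(-1)^{23} = -1$), which is easily evaluated.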
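As an illustration of how little work the diagonal form requires, the sketch below evaluates $A^{23}$ from $PD^{23}P^{-1}$ and compares it with 22 successive multiplications by $A$. It is a hedged example only: the function names are assumptions introduced here, and the factor $\frac{1}{2}$ in $P^{-1}$ is held back until the end so that the arithmetic stays in exact integers.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matpow_direct(M, k):
    """Compute M**k (k >= 1) by k - 1 successive multiplications."""
    result = M
    for _ in range(k - 1):
        result = matmul(result, M)
    return result

A = [[2, 3], [3, 2]]
P = [[1, 1], [-1, 1]]
twice_P_inv = [[1, -1], [1, 1]]          # this is 2 * P^{-1}, kept as integers
D23 = [[(-1) ** 23, 0], [0, 5 ** 23]]    # a diagonal matrix is raised to a power entry by entry

twice_A23 = matmul(matmul(P, D23), twice_P_inv)
A23 = [[entry // 2 for entry in row] for row in twice_A23]  # every entry is even, so // is exact

print(A23 == matpow_direct(A, 23))
```

The entries of the result are $(5^{23} \mp 1)/2$, which makes clear why the direct route through 22 matrix products is unnecessary.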
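Exercise
Find a diagonalizing matrix $P$ if
(a) $A = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}$
(b) $A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 2 & 0 \\ 2 & -2 & 3 \end{pmatrix}$
Verify, in each case, that $P^{-1}AP$ is diagonal, with the eigenvalues of $A$ as its diagonal elements.
Answer
(a) $P = \begin{pmatrix} -1 & -2 \\ 1 & 1 \end{pmatrix}$, $\quad P^{-1}AP = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$
(b) $P = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 2 & 1 \end{pmatrix}$, $\quad P^{-1}AP = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$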
2. Systems of first order differential equations
Systems of first order ordinary differential equations arise in many areas of mathematics and engi-
neering, for example in control theory and in the analysis of electrical circuits. In each case the basic
unknowns are each a function of the time variable t. A number of techniques have been developed
to solve such systems of equations; for example the Laplace transform. Here we shall use eigenvalues
and eigenvectors to obtain the solution. Our first step will be to recast the system of ordinary differ-
ential equations in the matrix form Ẋ = AX where A is an n × n coefficient matrix of constants,
X is the n × 1 column vector of unknown functions and Ẋ is the n × 1 column vector containing the
derivatives of the unknowns. The main step will be to use the modal matrix of A to diagonalise
the system of differential equations. This process will transform Ẋ = AX into the form Ẏ = DY
where D is a diagonal matrix. We shall find that this new diagonal system of differential equations
can be easily solved. This special solution will allow us to obtain the solution of the original system.
Task
Obtain the solutions of the pair of first order differential equations
\[ \left.\begin{aligned} \dot x &= -2x \\ \dot y &= -5y \end{aligned}\right\} \qquad (1) \]
given the initial conditions
$x(0) = 3$, i.e. $x = 3$ at $t = 0$
$y(0) = 2$, i.e. $y = 2$ at $t = 0$
(The notation is that $\dot x \equiv \dfrac{dx}{dt}$, $\dot y \equiv \dfrac{dy}{dt}$.)
[Hint: Recall, from your study of differential equations, that the general solution of the differential equation $\dfrac{dy}{dt} = Ky$ is $y = y_0 e^{Kt}$.]
Your solution
Answer
Using the hint: $x = x_0 e^{-2t}$, $y = y_0 e^{-5t}$ where $x_0 = x(0)$ and $y_0 = y(0)$.
From the given initial conditions $x_0 = 3$, $y_0 = 2$, so finally $x = 3e^{-2t}$, $y = 2e^{-5t}$.
In the above Task although we had two differential equations to solve they were really quite separate.
We needed no knowledge of matrix theory to solve them. However, we should note that the two
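differential equations can be written in matrix form.
Thus if $X = \begin{pmatrix} x \\ y \end{pmatrix}$, $\dot X = \begin{pmatrix} \dot x \\ \dot y \end{pmatrix}$, $A = \begin{pmatrix} -2 & 0 \\ 0 & -5 \end{pmatrix}$
the two equations (1) can be written as
\[ \begin{pmatrix} \dot x \\ \dot y \end{pmatrix} = \begin{pmatrix} -2 & 0 \\ 0 & -5 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \]
i.e. $\dot X = AX$.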
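Task
Write in matrix form the pair of coupled differential equations
\[ \left.\begin{aligned} \dot x &= 4x + 2y \\ \dot y &= -x + y \end{aligned}\right\} \qquad (2) \]
Your solution
Answer
\[ \begin{pmatrix} \dot x \\ \dot y \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}, \quad\text{i.e.}\quad \dot X = AX \]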
The essential difference between the two pairs of differential equations just considered is that the
pair (1) were really separate equations whereas pair (2) were coupled:
• The first equation of (1) involving only the unknown x, the second involving only y. In matrix
terms this corresponded to a diagonal matrix A in the system Ẋ = AX.
• The pair (2) were coupled in that both equations involved both x and y. This corresponded
to the non-diagonal matrix A in the system Ẋ = AX which you found in the last Task.
Clearly the second system here is more difficult to deal with than the first and this is where we can
use our knowledge of diagonalization.
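Consider a system of differential equations written in matrix form: $\dot X = AX$ where
\[ X = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} \quad\text{and}\quad \dot X = \begin{pmatrix} \dot x(t) \\ \dot y(t) \end{pmatrix} \]
We now introduce a new column vector of unknowns $Y = \begin{pmatrix} r(t) \\ s(t) \end{pmatrix}$ through the relation
\[ X = PY \]
where $P$ is the modal matrix of $A$. Then, since $P$ is a matrix of constants:
\[ \dot X = P\dot Y \quad\text{so}\quad \dot X = AX \quad\text{becomes}\quad P\dot Y = A(PY) \]
Then, multiplying by $P^{-1}$ on the left, $\dot Y = (P^{-1}AP)Y$.
But, because of the properties of the modal matrix, we know that $P^{-1}AP$ is a diagonal matrix.
Thus if $\lambda_1, \lambda_2$ are distinct eigenvalues of $A$ then:
\[ P^{-1}AP = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \]
Hence $\dot Y = (P^{-1}AP)Y$ becomes
\[ \begin{pmatrix} \dot r \\ \dot s \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}\begin{pmatrix} r \\ s \end{pmatrix}. \]
That is, when written out we have
\[ \dot r = \lambda_1 r, \qquad \dot s = \lambda_2 s. \]
These equations are decoupled. The first equation only involves the unknown function $r(t)$ and
has solution $r(t) = Ce^{\lambda_1 t}$. The second equation only involves the unknown function $s(t)$ and has
solution $s(t) = Ke^{\lambda_2 t}$. [$C$, $K$ are arbitrary constants.]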
Once r, s are known the original unknowns x, y can be found from the relation X = P Y .
Note that the theory outlined above is more widely applicable as specified in the next Key Point:
Key Point 3
For any system of differential equations of the form
Ẋ = AX
where A is an n×n matrix with distinct eigenvalues λ1 , λ2 , . . . , λn , and t is the independent variable
the solution is
X = PY
where P is the modal matrix of A and
\[ Y = [C_1 e^{\lambda_1 t},\ C_2 e^{\lambda_2 t},\ \dots,\ C_n e^{\lambda_n t}]^T \]
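Example 10
Find the solution of the coupled differential equations
\[ \dot x = 4x + 2y, \qquad \dot y = -x + y \]
with initial conditions $x(0) = 1$, $y(0) = 0$.
Here $\dot x \equiv \dfrac{dx}{dt}$ and $\dot y \equiv \dfrac{dy}{dt}$.
Solution
Here $A = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}$. It is easily checked that $A$ has distinct eigenvalues $\lambda_1 = 3$, $\lambda_2 = 2$ and corresponding eigenvectors $X_1 = \begin{pmatrix} -2 \\ 1 \end{pmatrix}$, $X_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
Therefore, taking $P = \begin{pmatrix} -2 & 1 \\ 1 & -1 \end{pmatrix}$ then $P^{-1}AP = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}$
and using Key Point 3, $r(t) = Ce^{3t}$, $s(t) = Ke^{2t}$.
So
\[ \begin{pmatrix} x \\ y \end{pmatrix} \equiv X = PY = \begin{pmatrix} -2 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} r \\ s \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} Ce^{3t} \\ Ke^{2t} \end{pmatrix} = \begin{pmatrix} -2Ce^{3t} + Ke^{2t} \\ Ce^{3t} - Ke^{2t} \end{pmatrix}. \]
Therefore $x = -2Ce^{3t} + Ke^{2t}$ and $y = Ce^{3t} - Ke^{2t}$.
We can now impose the initial conditions $x(0) = 1$ and $y(0) = 0$ to give
\[ 1 = -2C + K, \qquad 0 = C - K. \]
Thus $C = K = -1$ and the solution to the original system of differential equations is
\[ x(t) = 2e^{3t} - e^{2t}, \qquad y(t) = -e^{3t} + e^{2t} \]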
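A solution obtained by decoupling is easily verified by substitution. The sketch below does this for the coupled system $\dot x = 4x + 2y$, $\dot y = -x + y$ with $x(0) = 1$, $y(0) = 0$; it is an illustrative check only, with the derivatives written out by hand and the function names chosen here for convenience.

```python
import math

def x(t):
    return 2 * math.exp(3 * t) - math.exp(2 * t)

def y(t):
    return -math.exp(3 * t) + math.exp(2 * t)

def x_dot(t):
    # derivative of x(t), differentiated term by term
    return 6 * math.exp(3 * t) - 2 * math.exp(2 * t)

def y_dot(t):
    # derivative of y(t), differentiated term by term
    return -3 * math.exp(3 * t) + 2 * math.exp(2 * t)

def residuals(t):
    """Left-hand side minus right-hand side of each equation of the system."""
    return (x_dot(t) - (4 * x(t) + 2 * y(t)),
            y_dot(t) - (-x(t) + y(t)))

print(x(0), y(0))        # initial conditions
for t in (0.0, 0.5, 1.0):
    print(t, residuals(t))   # both residuals should be zero up to rounding
```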
The approach we have demonstrated in Example 10 can be extended to
(a) Systems of first order differential equations with n unknowns (Key Point 3)
(b) Systems of second order differential equations (described in the next subsection).
The only restriction, as we have said, is that the matrix A in the system Ẋ = AX has distinct
eigenvalues.
3. Systems of second order differential equations
The decoupling method discussed above can be readily extended to this situation which could arise,
for example, in a mechanical system consisting of coupled springs.
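A typical example of such a system with two unknowns has the form
\[ \ddot x = ax + by, \qquad \ddot y = cx + dy \]
or, in matrix form,
\[ \ddot X = AX \quad\text{where}\quad X = \begin{pmatrix} x \\ y \end{pmatrix}, \quad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad \ddot x = \frac{d^2x}{dt^2}, \quad \ddot y = \frac{d^2y}{dt^2} \]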
Task
Make the substitution $X = PY$ where $Y = \begin{pmatrix} r(t) \\ s(t) \end{pmatrix}$ and $P$ is the modal matrix of $A$, $A$ being assumed here to have distinct eigenvalues $\lambda_1$ and $\lambda_2$. Solve the resulting pair of decoupled equations for the case, which arises in practice, where $\lambda_1$ and $\lambda_2$ are both negative.
Your solution
Answer
Exactly as with a first order system, putting $X = PY$ into the second order system $\ddot X = AX$ gives
\[ \ddot Y = P^{-1}AP\,Y \quad\text{that is}\quad \ddot Y = DY \quad\text{where}\quad D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \quad\text{and}\quad \ddot Y = \begin{pmatrix} \ddot r \\ \ddot s \end{pmatrix} \]
so
\[ \begin{pmatrix} \ddot r \\ \ddot s \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}\begin{pmatrix} r \\ s \end{pmatrix} \]
That is, $\ddot r = \lambda_1 r = -\omega_1^2 r$ and $\ddot s = \lambda_2 s = -\omega_2^2 s$ (where $\lambda_1$ and $\lambda_2$ are both negative).
The two decoupled equations are of the form of the differential equation governing simple harmonic
motion. Hence the general solution is
\[ r = K\cos\omega_1 t + L\sin\omega_1 t \quad\text{and}\quad s = M\cos\omega_2 t + N\sin\omega_2 t \]
The solutions for $x$ and $y$ are then obtained by use of $X = PY$.
Note that in this second order case four initial conditions, two each for both $x$ and $y$, are required
because four constants $K, L, M, N$ arise in the solution.
Exercises
1. Solve by decoupling each of the following first order systems:
(a) $\dfrac{dX}{dt} = AX$ where $A = \begin{pmatrix} 3 & 4 \\ 4 & -3 \end{pmatrix}$, $X(0) = \begin{pmatrix} 1 \\ 3 \end{pmatrix}$
(b) $\dot x_1 = x_2$, $\quad \dot x_2 = x_1 + 3x_3$, $\quad \dot x_3 = x_2$ with $x_1(0) = 2$, $x_2(0) = 0$, $x_3(0) = 2$
(c) $\dfrac{dX}{dt} = \begin{pmatrix} 2 & 2 & 1 \\ 1 & 3 & 1 \\ 1 & 2 & 2 \end{pmatrix} X$, with $X(0) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$
(d) $\dot x_1 = x_1$, $\quad \dot x_2 = -2x_2 + x_3$, $\quad \dot x_3 = 4x_2 + x_3$ with $x_1(0) = x_2(0) = x_3(0) = 1$
2. Matrix methods can be used to solve systems of second order differential equations such as
might arise with coupled electrical or mechanical systems. For example the motion of two
masses $m_1$ and $m_2$ vibrating on coupled springs, neglecting damping and spring masses, is
governed by
\[ m_1\ddot y_1 = -k_1 y_1 + k_2(y_2 - y_1), \qquad m_2\ddot y_2 = -k_2(y_2 - y_1) \]
where dots denote derivatives with respect to time.
Write this system as a matrix equation $\ddot Y = AY$ and use the decoupling method to find $Y$ if
(a) $m_1 = m_2 = 1$, $k_1 = 3$, $k_2 = 2$
and the initial conditions are $y_1(0) = 1$, $y_2(0) = 2$, $\dot y_1(0) = -2\sqrt6$, $\dot y_2(0) = \sqrt6$
(b) $m_1 = m_2 = 1$, $k_1 = 6$, $k_2 = 4$
and the initial conditions are $y_1(0) = y_2(0) = 0$, $\dot y_1(0) = \sqrt2$, $\dot y_2(0) = 2\sqrt2$
Verify your solutions by substitution in each case.
Answers
1. (a) $X = \begin{pmatrix} 2e^{5t} - e^{-5t} \\ e^{5t} + 2e^{-5t} \end{pmatrix}$ $\qquad$ (b) $X = \begin{pmatrix} 2\cosh 2t \\ 4\sinh 2t \\ 2\cosh 2t \end{pmatrix}$
(c) $X = \dfrac{1}{4}\begin{pmatrix} e^{5t} + 3e^{t} \\ e^{5t} - e^{t} \\ e^{5t} - e^{t} \end{pmatrix}$ $\qquad$ (d) $X = \dfrac{1}{5}\begin{pmatrix} 5e^{t} \\ 2e^{2t} + 3e^{-3t} \\ 8e^{2t} - 3e^{-3t} \end{pmatrix}$
2. (a) $Y = \begin{pmatrix} \cos t - 2\sin\sqrt6\,t \\ 2\cos t + \sin\sqrt6\,t \end{pmatrix}$ $\qquad$ (b) $Y = \begin{pmatrix} \sin\sqrt2\,t \\ 2\sin\sqrt2\,t \end{pmatrix}$
Repeated Eigenvalues
and
Symmetric Matrices 22.3
Introduction
In this Section we further develop the theory of eigenvalues and eigenvectors in two distinct directions.
Firstly we look at matrices where one or more of the eigenvalues is repeated. We shall see that this
sometimes (but not always) causes problems in the diagonalization process that was discussed in the
previous Section. We shall then consider the special properties possessed by symmetric matrices
which make them particularly easy to work with.
Prerequisites
• . . . matrices
Learning Outcomes
On completion you should be able to . . .
• state the conditions under which a matrix with repeated eigenvalues may be diagonalized
• state the main properties of real symmetric matrices
1. Matrices with repeated eigenvalues

So far we have considered the diagonalization of matrices with distinct (i.e. non-repeated) eigenvalues.
We have accomplished this by the use of a non-singular modal matrix $P$ (i.e. one where
$\det P \neq 0$ and hence the inverse $P^{-1}$ exists). We now want to discuss briefly the case of a matrix
$A$ with at least one pair of repeated eigenvalues. We shall see that for some such matrices
diagonalization is possible but for others it is not.
The crucial question is whether we can form a non-singular modal matrix P with the eigenvectors of
A as its columns.
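Example
Consider the matrix
\[ A = \begin{pmatrix} 1 & 0 \\ -4 & 1 \end{pmatrix} \]
which has characteristic equation
\[ \det(A - \lambda I) = (1 - \lambda)(1 - \lambda) = 0. \]
So the only eigenvalue is 1 which is repeated or, more formally, has multiplicity 2.
To obtain eigenvectors of $A$ corresponding to $\lambda = 1$ we proceed as usual and solve
\[ AX = 1X \]
or
\[ \begin{pmatrix} 1 & 0 \\ -4 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \]
implying
\[ x = x \quad\text{and}\quad -4x + y = y \]
from which $x = 0$ and $y$ is arbitrary.
Thus possible eigenvectors are
\[ \begin{pmatrix} 0 \\ -1 \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \end{pmatrix},\ \begin{pmatrix} 0 \\ 2 \end{pmatrix},\ \begin{pmatrix} 0 \\ 3 \end{pmatrix},\ \dots \]
However, if we attempt to form a modal matrix $P$ from any two of these eigenvectors,
e.g. $\begin{pmatrix} 0 \\ -1 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, then the resulting matrix $P = \begin{pmatrix} 0 & 0 \\ -1 & 1 \end{pmatrix}$ has zero determinant.
Thus $P^{-1}$ does not exist and the similarity transformation $P^{-1}AP$ that we have used previously
to diagonalize a matrix is not possible here.
The essential point, at a slightly deeper level, is that the columns of $P$ in this case are not linearly
independent since
\[ \begin{pmatrix} 0 \\ -1 \end{pmatrix} = (-1)\begin{pmatrix} 0 \\ 1 \end{pmatrix} \]
i.e. one is a multiple of the other.
This situation is to be contrasted with that of a matrix with non-repeated eigenvalues.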
HELM (2006): 31
Section 22.3: Repeated Eigenvalues and Symmetric Matrices
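Earlier, for example, we showed that the matrix
\[ A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix} \]
has the non-repeated eigenvalues $\lambda_1 = -1$, $\lambda_2 = 5$ with associated eigenvectors
\[ X_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \qquad X_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
These two eigenvectors are linearly independent since
\[ \begin{pmatrix} 1 \\ -1 \end{pmatrix} \neq k\begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad\text{for any value of } k \neq 0. \]
Here the modal matrix
\[ P = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \]
has linearly independent columns, so that $\det P \neq 0$ and $P^{-1}$ exists.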
The general result, illustrated by this example, is given in the following Key Point.
Key Point 4
Eigenvectors corresponding to distinct eigenvalues are always linearly independent.
It follows from this that we can always diagonalize an n × n matrix with n distinct eigenvalues
since it will possess n linearly independent eigenvectors. We can then use these as the columns of
P , secure in the knowledge that these columns will be linearly independent and hence P −1 will exist.
It follows, in considering the case of repeated eigenvalues, that the key problem is whether or not
there are still n linearly independent eigenvectors for an n × n matrix.
We shall now consider two 3 × 3 cases as illustrations.
Task
Let $A = \begin{pmatrix} -2 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}$
(a) Obtain the eigenvalues and eigenvectors of $A$.
(b) Can three linearly independent eigenvectors for $A$ be obtained?
(c) Can $A$ be diagonalized?
Your solution
Answer
(a) The characteristic equation of $A$ is
\[ \det(A - \lambda I) = \begin{vmatrix} -2-\lambda & 0 & 1 \\ 1 & 1-\lambda & 0 \\ 0 & 0 & -2-\lambda \end{vmatrix} = 0 \]
i.e. $(-2-\lambda)(1-\lambda)(-2-\lambda) = 0$ which gives $\lambda = 1$, $\lambda = -2$, $\lambda = -2$.
For $\lambda = 1$ the associated eigenvectors satisfy
\[ \begin{pmatrix} -2 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]
from which $x = 0$, $z = 0$ and $y$ is arbitrary. Thus an eigenvector is $X = \begin{pmatrix} 0 \\ \alpha \\ 0 \end{pmatrix}$ where $\alpha$ is arbitrary, $\alpha \neq 0$.
For the repeated eigenvalue $\lambda = -2$ we must solve $AY = (-2)Y$ for the eigenvector $Y$:
\[ \begin{pmatrix} -2 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -2x \\ -2y \\ -2z \end{pmatrix} \]
from which $z = 0$, $x + 3y = 0$ so the eigenvectors are of the form
\[ Y = \begin{pmatrix} -3\beta \\ \beta \\ 0 \end{pmatrix} = \beta\begin{pmatrix} -3 \\ 1 \\ 0 \end{pmatrix} \quad\text{where } \beta \neq 0 \text{ is arbitrary.} \]
(b) $X$ and $Y$ are certainly linearly independent (as we would expect since they correspond to distinct
eigenvalues). However, there is only one independent eigenvector of the form $Y$ corresponding to
the repeated eigenvalue $-2$.
(c) The conclusion is that since $A$ is 3 × 3 and we can only obtain two linearly independent
eigenvectors then $A$ cannot be diagonalized.
Task
The matrix $A = \begin{pmatrix} 5 & -4 & 4 \\ 12 & -11 & 12 \\ 4 & -4 & 5 \end{pmatrix}$ has eigenvalues $-3, 1, 1$. The eigenvector corresponding to the eigenvalue $-3$ is $X = \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}$ or any multiple.
Investigate carefully the eigenvectors associated with the repeated eigenvalue $\lambda = 1$
and deduce whether $A$ can be diagonalized.
Your solution
Answer
We must solve $AY = (1)Y$ for the required eigenvector, i.e.
\[ \begin{pmatrix} 5 & -4 & 4 \\ 12 & -11 & 12 \\ 4 & -4 & 5 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]
Each equation here gives on simplification $x - y + z = 0$. So we have just one equation in three
unknowns and we can choose any two values arbitrarily. The choices $x = 1$, $y = 0$ (and hence
$z = -1$) and $x = 0$, $y = 1$ (and hence $z = 1$), for example, give rise to linearly independent
eigenvectors
\[ Y_1 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \qquad Y_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \]
We can thus form a non-singular modal matrix $P$ from $Y_1$ and $Y_2$ together with $X$ (given):
\[ P = \begin{pmatrix} 1 & 1 & 0 \\ 3 & 0 & 1 \\ 1 & -1 & 1 \end{pmatrix} \]
We can then indeed diagonalize $A$ through the transformation
\[ P^{-1}AP = D = \begin{pmatrix} -3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
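The claim $P^{-1}AP = D$ can be checked without inverting $P$: it is equivalent to $AP = PD$ together with $\det P \neq 0$. The following minimal sketch makes that check for the matrix of this Task; the helper names `matmul` and `det3` are assumptions made for the illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[5, -4, 4], [12, -11, 12], [4, -4, 5]]
P = [[1, 1, 0], [3, 0, 1], [1, -1, 1]]    # columns: X, Y1, Y2
D = [[-3, 0, 0], [0, 1, 0], [0, 0, 1]]

print(det3(P))                        # non-zero, so P is non-singular
print(matmul(A, P) == matmul(P, D))   # AP = PD, hence P^{-1}AP = D
```

Working with $AP = PD$ keeps the arithmetic in integers and avoids computing a 3 × 3 inverse by hand.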
Key Point 5
An n × n matrix with repeated eigenvalues can be diagonalized provided we can obtain n linearly
independent eigenvectors for it. This will be the case if, for each repeated eigenvalue λi of multiplicity
mi > 1, we can obtain mi linearly independent eigenvectors.
2. Symmetric matrices
Symmetric matrices have a number of useful properties which we will investigate in this Section.
Task
Consider the following four matrices
\[ A_1 = \begin{pmatrix} 3 & 1 \\ 4 & 5 \end{pmatrix} \qquad A_2 = \begin{pmatrix} 3 & 1 \\ 1 & 5 \end{pmatrix} \]
\[ A_3 = \begin{pmatrix} 5 & 8 & 7 \\ -1 & 6 & 8 \\ 3 & 4 & 0 \end{pmatrix} \qquad A_4 = \begin{pmatrix} 5 & 8 & 7 \\ 8 & 6 & 4 \\ 7 & 4 & 0 \end{pmatrix} \]
What property do the matrices $A_2$ and $A_4$ possess that $A_1$ and $A_3$ do not?
Your solution
Answer
Matrices $A_2$ and $A_4$ are symmetric across the principal diagonal. In other words transposing these
matrices, i.e. interchanging their rows and columns, does not change them:
\[ A_2^T = A_2 \qquad A_4^T = A_4. \]
This property does not hold for matrices $A_1$ and $A_3$ which are non-symmetric.
Calculating the eigenvalues of an n×n matrix with real elements involves, in principle at least, solving
an n th order polynomial equation, a quadratic equation if n = 2, a cubic equation if n = 3, and
so on. As is well known, such equations sometimes have only real solutions, but complex solutions
(occurring as complex conjugate pairs) can also arise. This situation can therefore arise with the
eigenvalues of matrices.
Task
Consider the non-symmetric matrix
\[ A = \begin{pmatrix} 2 & -1 \\ 5 & -2 \end{pmatrix} \]
Obtain the eigenvalues of $A$ and show that they form a complex conjugate pair.
Your solution
Answer
The characteristic equation of $A$ is
\[ \det(A - \lambda I) = \begin{vmatrix} 2-\lambda & -1 \\ 5 & -2-\lambda \end{vmatrix} = 0 \]
i.e.
\[ -(2-\lambda)(2+\lambda) + 5 = 0 \quad\text{leading to}\quad \lambda^2 + 1 = 0 \]
giving eigenvalues $\pm i$ which are of course complex conjugates.
In particular any 2 × 2 matrix of the form
\[ A = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \]
has complex conjugate eigenvalues $a \pm ib$.
A 3 × 3 example of a matrix with some complex eigenvalues is
\[ B = \begin{pmatrix} 1 & -1 & -1 \\ 1 & -1 & 0 \\ 1 & 0 & -1 \end{pmatrix} \]
A straightforward calculation shows that the eigenvalues of $B$ are
$\lambda = -1$ (real), $\lambda = \pm i$ (complex conjugates).
With symmetric matrices on the other hand, complex eigenvalues are not possible.
Key Point 6
The eigenvalues of a symmetric matrix with real elements are always real.
The general proof of this result in Key Point 6 is beyond our scope but a simple proof for symmetric
2 × 2 matrices is straightforward.
Let $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ be any 2 × 2 symmetric matrix, $a, b, c$ being real numbers.
The characteristic equation for $A$ is
\[ (a - \lambda)(c - \lambda) - b^2 = 0 \quad\text{or, expanding:}\quad \lambda^2 - (a + c)\lambda + ac - b^2 = 0 \]
from which
\[ \lambda = \frac{(a + c) \pm \sqrt{(a + c)^2 - 4ac + 4b^2}}{2} \]
The quantity under the square root sign can be treated as follows:
\[ (a + c)^2 - 4ac + 4b^2 = a^2 + c^2 + 2ac - 4ac + 4b^2 = (a - c)^2 + 4b^2 \]
which, being a sum of squares, can never be negative and hence $\lambda$ cannot be complex.
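The formula for the eigenvalues of a symmetric 2 × 2 matrix translates directly into a few lines of code. The sketch below is illustrative only; the function name `symmetric_2x2_eigenvalues` is an assumption, and the two test matrices are ones that appear elsewhere in this Workbook.

```python
import math

def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]] with real entries.

    The quantity under the square root is (a - c)^2 + 4 b^2, a sum of
    squares, so math.sqrt is never handed a negative number.
    """
    discriminant = (a - c) ** 2 + 4 * b ** 2
    root = math.sqrt(discriminant)
    return ((a + c - root) / 2, (a + c + root) / 2)

print(symmetric_2x2_eigenvalues(2, 3, 2))    # matrix [[2, 3], [3, 2]]
print(symmetric_2x2_eigenvalues(4, -2, 1))   # matrix [[4, -2], [-2, 1]]
```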
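Task
Obtain the eigenvalues and the eigenvectors of the symmetric 2 × 2 matrix
\[ A = \begin{pmatrix} 4 & -2 \\ -2 & 1 \end{pmatrix} \]
Your solution
Answer
The characteristic equation for $A$ is
\[ (4 - \lambda)(1 - \lambda) - 4 = 0 \quad\text{or}\quad \lambda^2 - 5\lambda = 0 \]
giving $\lambda = 0$ and $\lambda = 5$, both of which are of course real and also unequal (i.e. distinct). For the
larger eigenvalue $\lambda = 5$ the eigenvector $X = \begin{pmatrix} x \\ y \end{pmatrix}$ satisfies
\[ \begin{pmatrix} 4 & -2 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5x \\ 5y \end{pmatrix} \quad\text{i.e.}\quad -x - 2y = 0, \quad -2x - 4y = 0 \]
Both equations tell us that $x = -2y$ so an eigenvector for $\lambda = 5$ is $X = \begin{pmatrix} 2 \\ -1 \end{pmatrix}$ or any multiple of
this. For $\lambda = 0$ the associated eigenvectors satisfy
\[ 4x - 2y = 0, \qquad -2x + y = 0 \]
i.e. $y = 2x$ (from both equations) so an eigenvector is $Y = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ or any multiple.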
We now look more closely at the eigenvectors X and Y in the last task. In particular we consider
the product X T Y .
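Task
Evaluate $X^T Y$ from the previous task, i.e. where
\[ X = \begin{pmatrix} 2 \\ -1 \end{pmatrix} \qquad Y = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \]
Your solution
Answer
\[ X^T Y = [2, -1]\begin{pmatrix} 1 \\ 2 \end{pmatrix} = 2 \times 1 - 1 \times 2 = 2 - 2 = 0 \]
$X^T Y = 0$ means that $X$ and $Y$ are orthogonal.
Key Point 7
Two n × 1 column vectors $X$ and $Y$ are orthogonal if $X^T Y = 0$.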
Task
We obtained earlier in Section 22.1 Example 6 the eigenvalues of the matrix
\[ A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix} \]
which, as we now emphasize, is symmetric. We found that the eigenvalues were
$2$, $2 + \sqrt2$, $2 - \sqrt2$ which are real and distinct. The corresponding eigenvectors
were, respectively
\[ X = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \qquad Y = \begin{pmatrix} 1 \\ -\sqrt2 \\ 1 \end{pmatrix} \qquad Z = \begin{pmatrix} 1 \\ \sqrt2 \\ 1 \end{pmatrix} \]
(or, as usual, any multiple of these).
Show that these three eigenvectors $X, Y, Z$ are mutually orthogonal.
Your solution
Answer
\[ X^T Y = [1, 0, -1]\begin{pmatrix} 1 \\ -\sqrt2 \\ 1 \end{pmatrix} = 1 - 1 = 0 \]
\[ Y^T Z = [1, -\sqrt2, 1]\begin{pmatrix} 1 \\ \sqrt2 \\ 1 \end{pmatrix} = 1 - 2 + 1 = 0 \]
\[ Z^T X = [1, \sqrt2, 1]\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = 1 - 1 = 0 \]
verifying the mutual orthogonality of these three eigenvectors.
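The three products $X^TY$, $Y^TZ$, $Z^TX$ can also be evaluated numerically. The sketch below is a minimal illustration; the name `dot` is an assumption, and because $\sqrt2$ is stored only approximately in floating point the products are compared with zero using a small tolerance rather than exactly.

```python
import math

def dot(u, v):
    """The product u^T v of two column vectors stored as lists."""
    return sum(a * b for a, b in zip(u, v))

r2 = math.sqrt(2)
X = [1, 0, -1]
Y = [1, -r2, 1]
Z = [1, r2, 1]

for name, value in (("X.Y", dot(X, Y)), ("Y.Z", dot(Y, Z)), ("Z.X", dot(Z, X))):
    # each product should vanish up to floating-point rounding
    print(name, math.isclose(value, 0.0, abs_tol=1e-12))
```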
General theory
The following proof that eigenvectors corresponding to distinct eigenvalues of a symmetric matrix
are orthogonal is straightforward and you are encouraged to follow it through.
Let $A$ be a symmetric n × n matrix and let $\lambda_1, \lambda_2$ be two distinct eigenvalues of $A$, i.e. $\lambda_1 \neq \lambda_2$,
with associated eigenvectors $X, Y$ respectively. We have seen that $\lambda_1$ and $\lambda_2$ must be real since $A$
is symmetric. Then
\[ AX = \lambda_1 X \qquad AY = \lambda_2 Y \qquad (1) \]
Transposing the first of these results gives
\[ X^T A^T = \lambda_1 X^T \qquad (2) \]
(Remember that for any two matrices the transpose of a product is the product of the transposes in
reverse order.)
We now multiply both sides of (2) on the right by $Y$ (as well as putting $A^T = A$, since $A$ is
symmetric) to give:
\[ X^T AY = \lambda_1 X^T Y \qquad (3) \]
But, using the second eigenvalue equation of (1), equation (3) becomes
\[ X^T \lambda_2 Y = \lambda_1 X^T Y \]
or, since $\lambda_2$ is just a number,
\[ \lambda_2 X^T Y = \lambda_1 X^T Y \]
Taking all terms to the same side and factorising gives
\[ (\lambda_2 - \lambda_1) X^T Y = 0 \]
from which, since by assumption $\lambda_1 \neq \lambda_2$, we obtain the result
\[ X^T Y = 0 \]
and the orthogonality has been proved.
Key Point 8
The eigenvectors associated with distinct eigenvalues of a
symmetric matrix are mutually orthogonal.
The reader familiar with the algebra of vectors will recall that for two vectors whose Cartesian forms
are
\[ \mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k} \qquad \mathbf{b} = b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k} \]
the scalar (or dot) product is
\[ \mathbf{a}\cdot\mathbf{b} = a_x b_x + a_y b_y + a_z b_z. \]
Furthermore, if $\mathbf{a}$ and $\mathbf{b}$ are mutually perpendicular then $\mathbf{a}\cdot\mathbf{b} = 0$. (The word ‘orthogonal’ is sometimes
used instead of perpendicular in this case.) Our result, that two column vectors are orthogonal if
$X^T Y = 0$, may thus be considered as a generalisation of the 3-dimensional result $\mathbf{a}\cdot\mathbf{b} = 0$.
Diagonalization of symmetric matrices

Recall from our earlier work that
1. We can always diagonalize a matrix with distinct eigenvalues (whether these are real or com-
plex).
2. We can sometimes diagonalize a matrix with repeated eigenvalues. (The condition for this to
be possible is that any eigenvalue of multiplicity m must have associated with it m linearly
independent eigenvectors.)
The situation with symmetric matrices is simpler. Basically we can diagonalize any symmetric matrix.
To take the discussions further we first need the concept of an orthogonal matrix.
A square matrix A is said to be orthogonal if its inverse (if it exists) is equal to its transpose:
A−1 = AT or, equivalently, AAT = AT A = I.
Example
An important example of an orthogonal matrix is
\[ A = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \]
which arises when we use matrices to describe rotations in a plane.
\[ AA^T = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}\begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} = \begin{pmatrix} \cos^2\phi + \sin^2\phi & 0 \\ 0 & \sin^2\phi + \cos^2\phi \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I \]
It is clear that $A^T A = I$ also, so $A$ is indeed orthogonal.
It can be shown, but we omit the details, that any 2 × 2 matrix which is orthogonal can be written
in one of the two forms
\[ \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \]
If we look closely at either of these matrices we can see that
1. The two columns are mutually orthogonal, e.g. for the first matrix we have
\[ (\cos\phi \ \ {-\sin\phi})\begin{pmatrix} \sin\phi \\ \cos\phi \end{pmatrix} = \cos\phi\sin\phi - \sin\phi\cos\phi = 0 \]
2. Each column has magnitude 1 (because $\sqrt{\cos^2\phi + \sin^2\phi} = 1$)
Although we shall not prove it, these results are necessary and sufficient for any order square matrix
to be orthogonal.
Key Point 9
A square matrix A is said to be orthogonal if its inverse (if it exists) is equal to its transpose:
A−1 = AT or, equivalently, AAT = AT A = I.
A square matrix is orthogonal if and only if its columns are mutually orthogonal and each column
has unit magnitude.
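The defining property $A^TA = I$ gives an immediate numerical test for orthogonality. The following sketch is an illustration under stated assumptions: the helper names and the tolerance `tol` are choices made here, and a tolerance is needed because sines, cosines and square roots are held only approximately in floating point.

```python
import math

def transpose(M):
    """Interchange the rows and columns of M."""
    return [list(column) for column in zip(*M)]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_orthogonal(M, tol=1e-12):
    """True if M^T M equals the identity matrix to within tol."""
    n = len(M)
    product = matmul(transpose(M), M)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

phi = 0.7   # any angle will do
rotation = [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]
half_root3 = math.sqrt(3) / 2
thirty_degrees = [[half_root3, -0.5], [0.5, half_root3]]
shear = [[1, 1], [0, 1]]   # columns are not orthogonal

print(is_orthogonal(rotation), is_orthogonal(thirty_degrees), is_orthogonal(shear))
```

Entry $(i, j)$ of $M^TM$ is the product of column $i$ with column $j$, so this single test covers both conditions of the Key Point: zero off-diagonal entries mean mutually orthogonal columns and unit diagonal entries mean unit magnitude.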
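Task
For each of the following matrices verify that the two properties above are satisfied.
Then check in both cases that $AA^T = A^TA = I$, i.e. that $A^T = A^{-1}$.
(a) $A = \begin{pmatrix} \dfrac{\sqrt3}{2} & -\dfrac{1}{2} \\[2mm] \dfrac{1}{2} & \dfrac{\sqrt3}{2} \end{pmatrix}$ $\qquad$ (b) $A = \begin{pmatrix} \dfrac{1}{\sqrt2} & 0 & -\dfrac{1}{\sqrt2} \\ 0 & 1 & 0 \\ \dfrac{1}{\sqrt2} & 0 & \dfrac{1}{\sqrt2} \end{pmatrix}$
Your solution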
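Answer
(a) Since
\[ \left(\frac{\sqrt3}{2} \ \ \frac{1}{2}\right)\begin{pmatrix} -\dfrac{1}{2} \\[2mm] \dfrac{\sqrt3}{2} \end{pmatrix} = -\frac{\sqrt3}{4} + \frac{\sqrt3}{4} = 0 \]
the columns are orthogonal.
Since
\[ \left|\begin{pmatrix} \sqrt3/2 \\ 1/2 \end{pmatrix}\right| = \sqrt{\frac{3}{4} + \frac{1}{4}} = 1 \quad\text{and}\quad \left|\begin{pmatrix} -1/2 \\ \sqrt3/2 \end{pmatrix}\right| = \sqrt{\frac{1}{4} + \frac{3}{4}} = 1 \]
each column has unit magnitude.
Straightforward multiplication shows $AA^T = A^TA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$.
(b) Proceed as in (a).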
The following is the key result of this Section.
Key Point 10
Any symmetric matrix $A$ can be diagonalized using an orthogonal modal matrix $P$ via the transformation
\[ P^T AP = D = \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & & & \lambda_n \end{pmatrix} \]
It follows that any n × n symmetric matrix must possess n mutually orthogonal eigenvectors even
if some of the eigenvalues are repeated.
It should be clear to the reader that Key Point 10 is a very powerful result for any applications that
involve diagonalization of a symmetric matrix. Further, if we do need to find the inverse of P , then
this is a trivial process since P −1 = P T (Key Point 9).
Task
The symmetric matrix
\[ A = \begin{pmatrix} 1 & 0 & \sqrt2 \\ 0 & 2 & 0 \\ \sqrt2 & 0 & 0 \end{pmatrix} \]
has eigenvalues $2, 2, -1$ (i.e. eigenvalue 2 is repeated with multiplicity 2).
Associated with the non-repeated eigenvalue $-1$ is an eigenvector:
\[ X = \begin{pmatrix} 1 \\ 0 \\ -\sqrt2 \end{pmatrix} \quad\text{(or any multiple)} \]
(a) Normalize the eigenvector $X$:
Your solution
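Answer
(a) Normalizing $X$, which has magnitude $\sqrt{1^2 + (-\sqrt2)^2} = \sqrt3$, gives
\[ \frac{1}{\sqrt3}\begin{pmatrix} 1 \\ 0 \\ -\sqrt2 \end{pmatrix} = \begin{pmatrix} 1/\sqrt3 \\ 0 \\ -\sqrt{2/3} \end{pmatrix} \]
(b) Investigate the eigenvectors associated with the repeated eigenvalue 2:
Your solution
Answer
(b) The eigenvectors associated with $\lambda = 2$ satisfy $AY = 2Y$ which gives
\[ \begin{pmatrix} -1 & 0 & \sqrt2 \\ 0 & 0 & 0 \\ \sqrt2 & 0 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \]
The first and third equations give
\[ -x + \sqrt2\,z = 0, \qquad \sqrt2\,x - 2z = 0, \qquad\text{i.e. } x = \sqrt2\,z \]
The equations give us no information about $y$ so its value is arbitrary.
Thus $Y$ has the form $Y = \begin{pmatrix} \sqrt2\,\beta \\ \alpha \\ \beta \end{pmatrix}$ where both $\alpha$ and $\beta$ are arbitrary.
A certain amount of care is now required in the choice of $\alpha$ and $\beta$ if we are to find an orthogonal
modal matrix to diagonalize $A$.
For any choice
\[ X^T Y = (1 \ \ 0 \ \ {-\sqrt2})\begin{pmatrix} \sqrt2\,\beta \\ \alpha \\ \beta \end{pmatrix} = \sqrt2\,\beta - \sqrt2\,\beta = 0. \]
So $X$ and $Y$ are orthogonal. (The normalization of $X$ does not affect this.)
However, we also need two orthogonal eigenvectors of the form $\begin{pmatrix} \sqrt2\,\beta \\ \alpha \\ \beta \end{pmatrix}$. Two such are
\[ Y^{(1)} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \ (\text{choosing } \beta = 0,\ \alpha = 1) \quad\text{and}\quad Y^{(2)} = \begin{pmatrix} \sqrt2 \\ 0 \\ 1 \end{pmatrix} \ (\text{choosing } \alpha = 0,\ \beta = 1) \]
After normalization, these become
\[ Y^{(1)} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad Y^{(2)} = \begin{pmatrix} \sqrt{2/3} \\ 0 \\ 1/\sqrt3 \end{pmatrix} \]
Hence the matrix
\[ P = \left[ X \,\vdots\, Y^{(1)} \,\vdots\, Y^{(2)} \right] = \begin{pmatrix} 1/\sqrt3 & 0 & \sqrt{2/3} \\ 0 & 1 & 0 \\ -\sqrt{2/3} & 0 & 1/\sqrt3 \end{pmatrix} \]
is orthogonal and diagonalizes $A$:
\[ P^T AP = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \]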
Hermitian matrices
In some applications, of which quantum mechanics is one, matrices with complex elements arise.
If $A$ is such a matrix then the matrix $\bar{A}^T$ is the conjugate transpose of $A$, i.e. the complex
conjugate of each element of $A$ is taken as well as $A$ being transposed. Thus if
\[ A = \begin{pmatrix} 2+i & 2 \\ 3i & 5-2i \end{pmatrix} \quad\text{then}\quad \bar{A}^T = \begin{pmatrix} 2-i & -3i \\ 2 & 5+2i \end{pmatrix} \]
An Hermitian matrix is one satisfying
\[ \bar{A}^T = A \]
The matrix $A$ above is clearly non-Hermitian. Indeed the most obvious feature of an Hermitian
matrix is that its diagonal elements must be real. (Can you see why?) Thus
\[ A = \begin{pmatrix} 6 & 4+i \\ 4-i & -2 \end{pmatrix} \]
is Hermitian.
A 3 × 3 example of an Hermitian matrix is
\[ A = \begin{pmatrix} 1 & i & 5-2i \\ -i & 3 & 0 \\ 5+2i & 0 & 2 \end{pmatrix} \]
An Hermitian matrix is in fact a generalization of a symmetric matrix. The key property of an
Hermitian matrix is the same as that of a real symmetric matrix, i.e. the eigenvalues are always
real.
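The conjugate transpose is straightforward to form with Python's built-in complex numbers, in which the imaginary unit is written `1j`. The sketch below is illustrative only; the function names are assumptions introduced here, and the three matrices are those discussed in the text above.

```python
def conjugate_transpose(M):
    """Transpose M and take the complex conjugate of every element."""
    return [[M[j][i].conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def is_hermitian(M):
    """True if M equals its own conjugate transpose."""
    return conjugate_transpose(M) == M

A = [[2 + 1j, 2], [3j, 5 - 2j]]                              # non-Hermitian
H2 = [[6, 4 + 1j], [4 - 1j, -2]]                             # Hermitian 2 x 2
H3 = [[1, 1j, 5 - 2j], [-1j, 3, 0], [5 + 2j, 0, 2]]          # Hermitian 3 x 3

print(conjugate_transpose(A))
print(is_hermitian(A), is_hermitian(H2), is_hermitian(H3))
```

A diagonal element must equal its own conjugate for the test to pass, which is one way to see why the diagonal of an Hermitian matrix is real.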
Numerical
Determination
of Eigenvalues
and Eigenvectors 22.4
Introduction
In Section 22.1 it was shown how to obtain eigenvalues and eigenvectors for low order matrices, 2 × 2
and 3 × 3. This involved firstly solving the characteristic equation det(A − λI) = 0 for a given n × n
matrix A. This is an n th order polynomial equation and, even for n as low as 3, solving it is not
always straightforward. For large n even obtaining the characteristic equation may be difficult, let
alone solving it.
Consequently in this Section we give a brief introduction to alternative methods, essentially numerical
in nature, of obtaining eigenvalues and perhaps eigenvectors.
We would emphasize that in some applications such as Control Theory we might only require one
eigenvalue of a matrix A, usually the one largest in magnitude which is called the dominant eigen-
value. It is this eigenvalue which sometimes tells us how a control system will behave.
Prerequisites
• . . . matrices
Learning Outcomes
On completion you should be able to . . .
• use the power method to obtain the dominant eigenvalue (and associated eigenvector) of a matrix
• state the main advantages and disadvantages of the power method
1. Numerical determination of eigenvalues and eigenvectors
Preliminaries
Before discussing numerical methods of calculating eigenvalues and eigenvectors we remind you of
three results for a matrix $A$ with an eigenvalue $\lambda$ and associated eigenvector $X$.
• $A^{-1}$ (if it exists) has an eigenvalue $\dfrac{1}{\lambda}$ with associated eigenvector $X$.
• The matrix $(A - kI)$ has an eigenvalue $(\lambda - k)$ and associated eigenvector $X$.
• The matrix $(A - kI)^{-1}$, i.e. the inverse (if it exists) of the matrix $(A - kI)$, has eigenvalue $\dfrac{1}{\lambda - k}$ and corresponding eigenvector $X$.
Here $k$ is any real number.
 
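Task
The matrix $A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 0 & 0 & 5 \end{pmatrix}$ has eigenvalues $\lambda = 5, 3, 1$ with associated eigenvectors $\begin{pmatrix} 1/2 \\ 1/2 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$ respectively.
The inverse $A^{-1}$ exists and is
\[ A^{-1} = \frac{1}{3}\begin{pmatrix} 2 & -1 & -\frac{1}{5} \\ -1 & 2 & -\frac{1}{5} \\ 0 & 0 & \frac{3}{5} \end{pmatrix} \]
Without further calculation write down the eigenvalues and eigenvectors of the
following matrices:
(a) $A^{-1}$ $\qquad$ (b) $\begin{pmatrix} 3 & 1 & 1 \\ 1 & 3 & 1 \\ 0 & 0 & 6 \end{pmatrix}$ $\qquad$ (c) $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 0 & 3 \end{pmatrix}^{-1}$
Your solution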
HELM (2006): 47
Section 22.4: Numerical Determination of Eigenvalues and Eigenvectors
Answer
(a) The eigenvalues of $A^{-1}$ are $\dfrac{1}{5}$, $\dfrac{1}{3}$, $1$. (Notice that the dominant eigenvalue of $A$ yields the
smallest magnitude eigenvalue of $A^{-1}$.)
(b) The matrix here is $A + I$. Thus its eigenvalues are the same as those of $A$ increased by 1, i.e.
6, 4, 2.
(c) The matrix here is $(A - 2I)^{-1}$. Thus its eigenvalues are the reciprocals of the eigenvalues of
$(A - 2I)$. The latter has eigenvalues $3, 1, -1$ so $(A - 2I)^{-1}$ has eigenvalues $\dfrac{1}{3}$, $1$, $-1$.
In each of the above cases the eigenvectors are the same as those of the original matrix A.
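These results can be confirmed by checking $MX = \mu X$ directly for each matrix $M$ and each predicted eigenvalue $\mu$. The sketch below is a hedged illustration using exact fractions; the helper names `matvec`, `shifted` and `is_eigenpair` are assumptions, and the inverse of $A - 2I$ is not formed because $(A - 2I)X = (\lambda - 2)X$ already implies the reciprocal result whenever $\lambda \neq 2$.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def shifted(M, k):
    """Return the matrix M - k I."""
    return [[M[i][j] - (k if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]

def is_eigenpair(M, lam, v):
    """True if M v = lam v exactly."""
    return matvec(M, v) == [lam * x for x in v]

A = [[2, 1, 1], [1, 2, 1], [0, 0, 5]]
A_inv = [[F(2, 3), F(-1, 3), F(-1, 15)],
         [F(-1, 3), F(2, 3), F(-1, 15)],
         [0, 0, F(1, 5)]]
pairs = [(5, [F(1, 2), F(1, 2), 1]), (3, [1, 1, 0]), (1, [1, -1, 0])]

for lam, v in pairs:
    print(lam,
          is_eigenpair(A, lam, v),                     # A X = lam X
          is_eigenpair(A_inv, F(1, lam), v),           # A^{-1} X = (1/lam) X
          is_eigenpair(shifted(A, -1), lam + 1, v),    # (A + I) X = (lam + 1) X
          is_eigenpair(shifted(A, 2), lam - 2, v))     # (A - 2I) X = (lam - 2) X
```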
The power method

This is a direct iteration method for obtaining the dominant eigenvalue (i.e. the largest in mag-
nitude), say λ1 , for a given matrix A and also the corresponding eigenvector.
We will not discuss the theory behind the method but will demonstrate it in action and, equally
importantly, point out circumstances when it fails.
Task
Let $A = \begin{pmatrix} 4 & 2 \\ 5 & 7 \end{pmatrix}$. By solving $\det(A - \lambda I) = 0$ obtain the eigenvalues of $A$ and
also obtain the eigenvector associated with the dominant eigenvalue.
Your solution
Answer
4−λ 2
det(A − λI) = =0
5 7−λ
which gives
λ2 − 11λ + 18 = 0 ⇒ (λ − 9)(λ − 2) = 0
so
λ1 = 9 ( the dominant eigenvalue) and λ2 = 2.

x
The eigenvector X = for λ1 = 9 is obtained as usual by solving AX = 9X, so
y

4 2 x 9x 2
= from which 5x = 2y so X = or any multiple.
5 7 y 9y 5
If we normalize here such that the largest component of X is 1

0.4
X=
1

0.4
We shall now demonstrate how the power method can be used to obtain λ1 = 9 and X =
1
4 2
where A = .
5 7
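• We choose an arbitrary 2 × 1 column vector
$$X^{(0)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
• We premultiply this by $A$ to give a new column vector $X^{(1)}$:
$$X^{(1)} = \begin{bmatrix} 4 & 2 \\ 5 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \end{bmatrix}$$
• We ‘normalize’ $X^{(1)}$ to obtain a column vector $Y^{(1)}$ with largest component 1: thus
$$Y^{(1)} = \frac{1}{12}\begin{bmatrix} 6 \\ 12 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 1 \end{bmatrix}$$
• We continue the process
$$X^{(2)} = AY^{(1)} = \begin{bmatrix} 4 & 2 \\ 5 & 7 \end{bmatrix}\begin{bmatrix} 1/2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 9.5 \end{bmatrix}$$
$$Y^{(2)} = \frac{1}{9.5}\begin{bmatrix} 4 \\ 9.5 \end{bmatrix} = \begin{bmatrix} 0.421053 \\ 1 \end{bmatrix}$$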
Task
Continue this process for a further step and obtain X (3) and Y (3) , quoting values
to 6 d.p.
Your solution
Answer
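$$X^{(3)} = AY^{(2)} = \begin{bmatrix} 4 & 2 \\ 5 & 7 \end{bmatrix}\begin{bmatrix} 0.421053 \\ 1 \end{bmatrix} = \begin{bmatrix} 3.684211 \\ 9.105265 \end{bmatrix}$$
$$Y^{(3)} = \frac{1}{9.105265}\,X^{(3)} = \begin{bmatrix} 0.404624 \\ 1 \end{bmatrix}$$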
The first 8 steps of the above iterative process are summarized in the following table (the first
three rows of which have been obtained above):
Table 1
Step r   $X_1^{(r)}$   $X_2^{(r)}$   $\alpha_r$   $Y_1^{(r)}$   $Y_2^{(r)}$
1        6             12            12           0.5           1
2        4             9.5           9.5          0.421053      1
3        3.684211      9.105265      9.105265     0.404624      1
4        3.618497      9.023121      9.023121     0.401025      1
5        3.604100      9.005125      9.005125     0.400228      1
6        3.600911      9.001138      9.001138     0.400051      1
7        3.600202      9.000253      9.000253     0.400011      1
8        3.600045      9.000056      9.000056     0.400002      1
In Table 1, αr refers to the largest magnitude component of X (r) which is used to normalize X (r)
to give Y (r) . We can see that αr is converging to 9 which we know is the dominant eigenvalue λ1
of A. Also Y (r) is converging towards the associated eigenvector [0.4, 1]T .
Depending on the accuracy required, we could decide when to stop the iterative process by looking
at the difference |αr − αr−1 | at each step.
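The iteration summarized in Table 1 can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name `power_method` and its arguments are assumptions, and the normalizing factor is taken to be the component of largest magnitude with its sign kept, as in the worked steps above.

```python
def power_method(A, x0, steps):
    """Return a list of (alpha_r, Y_r) for r = 1, ..., steps."""
    y = list(x0)                      # X(0) is used as given, without normalizing
    history = []
    for _ in range(steps):
        # X(r) = A Y(r-1)
        x = [sum(a * yj for a, yj in zip(row, y)) for row in A]
        # alpha_r: the component of X(r) that is largest in magnitude (sign kept)
        alpha = max(x, key=abs)
        # Y(r): X(r) scaled so that its largest component is 1
        y = [xi / alpha for xi in x]
        history.append((alpha, y))
    return history


A = [[4.0, 2.0], [5.0, 7.0]]
for r, (alpha, y) in enumerate(power_method(A, [1.0, 1.0], 8), start=1):
    print(r, round(alpha, 6), [round(c, 6) for c in y])
```

Because each step here uses unrounded values, the last digit of some entries may differ slightly from Table 1, where intermediate values were rounded to 6 d.p.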
Task
Using the power method obtain the dominant eigenvalue and associated eigenvector of
$$A = \begin{bmatrix} 3 & -1 & 0 \\ -2 & 4 & -3 \\ 0 & -1 & 1 \end{bmatrix} \quad\text{using a starting column vector}\quad X^{(0)} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
Calculate X (1) , Y (1) and α1 :

Your solution
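Answer
$$X^{(1)} = AX^{(0)} = \begin{bmatrix} 3 & -1 & 0 \\ -2 & 4 & -3 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}$$
so $Y^{(1)} = \dfrac{1}{2}\begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -0.5 \\ 0 \end{bmatrix}$ using $\alpha_1 = 2$, the largest magnitude component of $X^{(1)}$.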
Carry out the next two steps of this iteration to obtain $X^{(2)}$, $Y^{(2)}$, $\alpha_2$ and $X^{(3)}$, $Y^{(3)}$, $\alpha_3$:
Your solution
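Answer
$$X^{(2)} = \begin{bmatrix} 3 & -1 & 0 \\ -2 & 4 & -3 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -0.5 \\ 0 \end{bmatrix} = \begin{bmatrix} 3.5 \\ -4 \\ 0.5 \end{bmatrix} \qquad Y^{(2)} = -\frac{1}{4}X^{(2)} = \begin{bmatrix} -0.875 \\ 1 \\ -0.125 \end{bmatrix} \qquad \alpha_2 = -4$$
$$X^{(3)} = \begin{bmatrix} 3 & -1 & 0 \\ -2 & 4 & -3 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} -0.875 \\ 1 \\ -0.125 \end{bmatrix} = \begin{bmatrix} -3.625 \\ 6.125 \\ -1.125 \end{bmatrix} \qquad Y^{(3)} = \frac{1}{6.125}X^{(3)} = \begin{bmatrix} -0.5918 \\ 1 \\ -0.1837 \end{bmatrix} \qquad \alpha_3 = 6.125$$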
After just 3 iterations there is little sign of convergence of the normalizing factor αr . However the
next two values obtained are
α4 = 5.7347 α5 = 5.4774
and, after 14 iterations, |α14 − α13 | < 0.0001 and the power method converges, albeit slowly, to
α14 = 5.4773
which (correct to 4 d.p.) is the dominant eigenvalue of $A$. The corresponding eigenvector is
$$\begin{bmatrix} -0.4037 \\ 1 \\ -0.2233 \end{bmatrix}$$
It is clear that the power method requires, for its practical execution, a computer.
Problems with the power method
1. If the initial column vector X (0) is an eigenvector of A other than that corresponding to the
dominant eigenvalue, say λ1 , then the method will fail since the iteration will converge to the
wrong eigenvalue, say λ2 , after only one iteration (because AX (0) = λ2 X (0) in this case).
2. It is possible to show that the speed of convergence of the power method depends on the ratio
$$\frac{\text{magnitude of dominant eigenvalue } \lambda_1}{\text{magnitude of next largest eigenvalue}}$$
If this ratio is close to 1 the method is slow to converge.
In particular, if the dominant eigenvalue $\lambda_1$ of a real matrix is complex the method will fail completely to converge because the complex conjugate $\bar{\lambda}_1$ will also be an eigenvalue and $|\lambda_1| = |\bar{\lambda}_1|$.
3. The power method only gives one eigenvalue, the dominant one λ1 (although this is often the
most important in applications).
Advantages of the power method

1. It is simple and easy to implement.
2. It gives the eigenvector corresponding to λ1 as well as λ1 itself. (Other numerical methods
require separate calculation to obtain the eigenvector.)
Finding eigenvalues other than the dominant

We discuss this topic only briefly.
1. Obtaining the smallest magnitude eigenvalue

If $A$ has dominant eigenvalue $\lambda_1$ then its inverse $A^{-1}$ has an eigenvalue $\frac{1}{\lambda_1}$ (as we discussed at the beginning of this Section). Clearly $\frac{1}{\lambda_1}$ will be the smallest magnitude eigenvalue of $A^{-1}$. Conversely if we obtain the largest magnitude eigenvalue, say $\lambda'_1$, of $A^{-1}$ by the power method then the smallest eigenvalue of $A$ is the reciprocal, $\frac{1}{\lambda'_1}$.
This technique is called the inverse power method.
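Example
If $A = \begin{bmatrix} 3 & -1 & 0 \\ -2 & 4 & -3 \\ 0 & -1 & 1 \end{bmatrix}$ then the inverse is $A^{-1} = \begin{bmatrix} 1 & 1 & 3 \\ 2 & 3 & 9 \\ 2 & 3 & 10 \end{bmatrix}$.
Using $X^{(0)} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ in the power method applied to $A^{-1}$ gives $\lambda'_1 = 13.4090$. Hence the smallest magnitude eigenvalue of $A$ is $\dfrac{1}{13.4090} = 0.0746$. The corresponding eigenvector is $\begin{bmatrix} 0.3163 \\ 0.9254 \\ 1 \end{bmatrix}$.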
In practice, finding the inverse of a large order matrix A can be expensive in computational effort.
Hence the inverse power method is implemented without actually obtaining A−1 as follows.
As we have seen, the power method applied to A utilizes the scheme:
$$X^{(r)} = AY^{(r-1)} \qquad r = 1, 2, \dots$$
where $Y^{(r-1)} = \dfrac{1}{\alpha_{r-1}}X^{(r-1)}$, $\alpha_{r-1}$ being the largest magnitude component of $X^{(r-1)}$.
For the inverse power method we have
$$X^{(r)} = A^{-1}Y^{(r-1)}$$
which can be re-written as
$$AX^{(r)} = Y^{(r-1)}$$
Thus X (r) can actually be obtained by solving this system of linear equations without needing to
obtain A−1 . This is usually done by a technique called LU decomposition i.e. writing A (once and
for all) in the form
A = LU L being a lower triangular matrix and U upper triangular.
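The inverse power method, carried out by solving $AX^{(r)} = Y^{(r-1)}$ at each step, can be sketched as follows. This is a hedged illustration in plain Python: the names `solve` and `inverse_power_method` are assumptions, and for brevity each system is solved afresh by Gaussian elimination with partial pivoting rather than by a stored LU factorization, which is what would be preferred for a large matrix.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]       # augmented matrix [A | b]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # row with the largest pivot
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):                          # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def inverse_power_method(A, x0, steps):
    """Power method applied to the inverse of A, without forming the inverse."""
    y = list(x0)
    alpha = None
    for _ in range(steps):
        x = solve(A, y)                # X(r) solves A X(r) = Y(r-1)
        alpha = max(x, key=abs)        # largest magnitude component of X(r)
        y = [xi / alpha for xi in x]   # Y(r)
    return alpha, y


A = [[3.0, -1.0, 0.0], [-2.0, 4.0, -3.0], [0.0, -1.0, 1.0]]
alpha, y = inverse_power_method(A, [1.0, 1.0, 1.0], 20)
print(round(alpha, 4), round(1 / alpha, 4), [round(c, 4) for c in y])
```

For the matrix of the Example above, the printed values should be comparable with $\lambda'_1 = 13.4090$, the smallest magnitude eigenvalue 0.0746 and the eigenvector quoted there.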
2. Obtaining the eigenvalue closest to a given number p

Suppose $\lambda_k$ is the (unknown) eigenvalue of $A$ closest to $p$. We know that if $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of $A$ then $\lambda_1 - p, \lambda_2 - p, \dots, \lambda_n - p$ are the eigenvalues of the matrix $A - pI$. Then $\lambda_k - p$ will be the smallest magnitude eigenvalue of $A - pI$ but $\dfrac{1}{\lambda_k - p}$ will be the largest magnitude eigenvalue of $(A - pI)^{-1}$. Hence if we apply the power method to $(A - pI)^{-1}$ we obtain $\mu = \dfrac{1}{\lambda_k - p}$, from which $\lambda_k = p + \dfrac{1}{\mu}$.
The method is called the shifted inverse power method.
3. Obtaining all the eigenvalues of a large order matrix

In this case neither solving the characteristic equation det(A − λI) = 0 nor the power method (and
its variants) is efficient.
The commonest method utilized is called the QR technique. This technique is based on similarity
transformations i.e. transformations of the form
$$B = M^{-1}AM$$
where $B$ has the same eigenvalues as $A$. (We have seen earlier in this Workbook that one type of
similarity transformation is $D = P^{-1}AP$ where $P$ is formed from the eigenvectors of $A$. However,
we are now, of course, dealing with the situation where we are trying to find the eigenvalues and
eigenvectors of A.)
In the QR method A is reduced to upper (or lower) triangular form. We have already seen that a
triangular matrix has its eigenvalues on the diagonal.
For details of the QR method, or more efficient techniques, one of which is based on what is called
a Householder transformation, the reader should consult a text on numerical methods.
Contents 23
Fourier Series
23.1 Periodic Functions
23.2 Representing Periodic Functions by Fourier Series
23.3 Even and Odd Functions
23.4 Convergence
23.5 Half-range Series
23.6 The Complex Form
23.7 An Application of Fourier Series
Learning outcomes
In this Workbook you will learn how to express a periodic signal f(t) in a series of sines and
cosines. You will learn how to simplify the calculations if the signal happens to be an even
or an odd function. You will learn some brief facts relating to the convergence of the
Fourier series. You will learn how to approximate a non-periodic signal by a Fourier series.
You will learn how to re-express a standard Fourier series in complex form which paves the
way for a later examination of Fourier transforms. Finally you will learn about some simple
applications of Fourier series.

Periodic Functions 23.1

Introduction
You should already know how to take a function of a single variable f (x) and represent it by a
power series in x about any point x0 of interest. Such a series is known as a Taylor series or Taylor
expansion or, if $x_0 = 0$, as a Maclaurin series. This topic was first met in Workbook 16. Such an expansion
is only possible if the function is sufficiently smooth (that is, if it can be differentiated as often as
required). Geometrically this means that there are no jumps or spikes in the curve y = f (x) near
the point of expansion. However, in many practical situations the functions we have to deal with are
not as well behaved as this and so no power series expansion in x is possible. Nevertheless, if the
function is periodic, so that it repeats over and over again at regular intervals, then, irrespective of
the function’s behaviour (that is, no matter how many jumps or spikes it has), the function may be
expressed as a series of sines and cosines. Such a series is called a Fourier series.
Fourier series have many applications in mathematics, in physics and in engineering. For example
they are sometimes essential in solving problems (in heat conduction, wave propagation etc) that
involve partial differential equations. Also, using Fourier series the analysis of many engineering
systems (such as electric circuits or mechanical vibrating systems) can be extended from the case
where the input to the system is a sinusoidal function to the more general case where the input is
periodic but non-sinusoidal.

Prerequisites
• be familiar with trigonometric functions
Learning Outcomes
On completion you should be able to . . .
• recognise periodic functions
• determine the frequency, the amplitude and the period of a sinusoid
• represent common periodic functions by trigonometric Fourier series
1. Introduction
You have met in earlier Mathematics courses the concept of representing a function by an infinite
series of simpler functions such as polynomials. For example, the Maclaurin series representing $e^x$ has the form
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$$
or, in the more concise sigma notation,
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
(remembering that 0! is defined as 1).
The basic idea is that for those values of x for which the series converges we may approximate the
function by using only the first few terms of the infinite series.
Fourier series are also usually infinite series but involve sine and cosine functions (or their complex
exponential equivalents) rather than polynomials. They are widely used for approximating periodic
functions. Such approximations are of considerable use in science and engineering. For example,
elementary a.c. theory provides techniques for analyzing electrical circuits when the currents and
voltages present are assumed to be sinusoidal. Fourier series enable us to extend such techniques
to the situation where the functions (or signals) involved are periodic but not actually sinusoidal.
You may also see in Workbook 25 that Fourier series sometimes have to be used when solving partial
differential equations.
2. Periodic functions
A function f (t) is periodic if the function values repeat at regular intervals of the independent variable
t. The regular interval is referred to as the period. See Figure 1.
[Figure 1: graph of a periodic function f(t) against t, with one period marked]
If P denotes the period we have
f (t + P ) = f (t)
for any value of t.
The most obvious examples of periodic functions are the trigonometric functions sin t and cos t, both
of which have period 2π (using radian measure as we shall do throughout this Workbook) (Figure
2). This follows since
sin(t + 2π) = sin t and cos(t + 2π) = cos t
[Figure 2: graphs of y = sin t and y = cos t, each with maximum value 1 and with the period 2π marked on the t axis]
The amplitude of these sinusoidal functions is the maximum displacement from y = 0 and is clearly
1. (Note that we use the term sinusoidal to include cosine as well as sine functions.)
More generally we can consider a sinusoid
y = A sin nt
which has maximum value, or amplitude, A and where n is usually a positive integer.
For example
$$y = \sin 2t$$
is a sinusoid of amplitude 1 and period $\frac{2\pi}{2} = \pi$ (Figure 3). The fact that the period is π follows because
$$\sin 2(t + \pi) = \sin(2t + 2\pi) = \sin 2t$$
for any value of t.
[Figure 3: graph of y = sin 2t with maximum value 1 and with the period π marked on the t axis]
We see that y = sin 2t has half the period of sin t, π as opposed to 2π (Figure 4). This can alternatively be phrased by stating that sin 2t oscillates twice as rapidly as (or has twice the frequency of) sin t.
[Figure 4: graphs of y = sin t and y = sin 2t on the same axes, with π and 2π marked on the t axis]
In general $y = A\sin nt$ has amplitude $A$, period $\frac{2\pi}{n}$ and completes $n$ oscillations when $t$ changes by 2π. Formally, we define the frequency of a sinusoid as the reciprocal of the period:
$$\text{frequency} = \frac{1}{\text{period}}$$
and the angular frequency, often denoted by the Greek letter ω (omega), as
$$\text{angular frequency} = 2\pi \times \text{frequency} = \frac{2\pi}{\text{period}}$$
Thus $y = A\sin nt$ has frequency $\frac{n}{2\pi}$ and angular frequency $n$.
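These definitions translate directly into a short calculation. The sketch below is illustrative only; the function name `sinusoid_properties` and the dictionary keys are assumptions, and it is applied to the sinusoid y = sin 2t discussed above.

```python
import math


def sinusoid_properties(amplitude, n):
    """Properties of y = amplitude*sin(n t) or amplitude*cos(n t), for n > 0."""
    period = 2 * math.pi / n
    frequency = 1 / period                        # reciprocal of the period
    angular_frequency = 2 * math.pi * frequency   # equals n
    return {
        "amplitude": abs(amplitude),
        "period": period,
        "frequency": frequency,
        "angular_frequency": angular_frequency,
    }


# y = sin 2t: period pi, frequency 1/pi, angular frequency 2
print(sinusoid_properties(1, 2))
```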
Task
State the amplitude, period, frequency and angular frequency of
(a) $y = 5\cos 4t$ (b) $y = 6\sin\frac{2t}{3}$.
Your solution
(a)
Answer
amplitude 5, period $\frac{2\pi}{4} = \frac{\pi}{2}$, frequency $\frac{2}{\pi}$, angular frequency 4
Your solution
(b)
Answer
amplitude 6, period $3\pi$, frequency $\frac{1}{3\pi}$, angular frequency $\frac{2}{3}$
Harmonics
In representing a non-sinusoidal function of period 2π by a Fourier series we shall see shortly that
only certain sinusoids will be required:
(a) A1 cos t (and B1 sin t)

These also have period 2π and together are referred to as the first harmonic (or
fundamental harmonic).
(b) A2 cos 2t (and B2 sin 2t)
These have half the period, and double the frequency, of the first harmonic and are
referred to as the second harmonic.
(c) A3 cos 3t (and B3 sin 3t)
These have period $\frac{2\pi}{3}$ and constitute the third harmonic.
In general the Fourier series of a function of period 2π will require harmonics of the type
An cos nt ( and Bn sin nt) where n = 1, 2, 3, . . .
Non-sinusoidal periodic functions

The following are examples of non-sinusoidal periodic functions (they are often called “waves”).
Square wave
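[Figure 5: square wave f(t) taking the values 1 and −1, with −π, π and 2π marked on the t axis]
Analytically we can describe this function as follows:
$$f(t) = \begin{cases} -1 & -\pi < t < 0 \\ +1 & 0 < t < \pi \end{cases} \qquad \text{(which gives the definition over one period)}$$
$$f(t + 2\pi) = f(t) \qquad \text{(which tells us that the function has period } 2\pi\text{)}$$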
Saw-tooth wave
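[Figure 6: saw-tooth wave f(t) with maximum value 4, with −2, 2 and 4 marked on the t axis]
In this case we can describe the function as follows:
$$f(t) = 2t \quad 0 < t < 2 \qquad f(t + 2) = f(t)$$
Here the period is 2, the frequency is $\frac{1}{2}$ and the angular frequency is $\frac{2\pi}{2} = \pi$.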
Triangular wave
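[Figure 7: triangular wave f(t) with maximum value π, with −π, π and 2π marked on the t axis]
Here we can conveniently define the function using −π < t < π as the “basic period”:
$$f(t) = \begin{cases} -t & -\pi < t < 0 \\ t & 0 < t < \pi \end{cases}$$
or, more concisely,
$$f(t) = |t| \qquad -\pi < t < \pi$$
together with the usual statement on periodicity
$$f(t + 2\pi) = f(t).$$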
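An analytic description of this kind (a formula on one basic period plus a periodicity statement) can be turned into a function defined for all t by reducing t to the basic interval. The following is a minimal sketch; the helper name `periodic` is an assumption, and the value returned exactly at a jump is merely a convention of this sketch, since the definitions above leave the function undefined there.

```python
import math


def periodic(base, start, period):
    """Extend `base`, defined on [start, start + period), periodically to all real t."""
    def f(t):
        # (t - start) % period lies in [0, period), so the argument lies in the basic interval
        return base(start + (t - start) % period)
    return f


# The three waves described above
square = periodic(lambda t: -1.0 if t < 0 else 1.0, -math.pi, 2 * math.pi)
saw_tooth = periodic(lambda t: 2 * t, 0.0, 2.0)        # f(t) = 2t on 0 < t < 2
triangular = periodic(abs, -math.pi, 2 * math.pi)      # f(t) = |t| on -pi < t < pi

print(saw_tooth(0.5), saw_tooth(2.5), saw_tooth(-1.5))
print(triangular(1.0), triangular(1.0 + 2 * math.pi))
```

Each printed pair or triple should show equal values, reflecting f(t + P) = f(t).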
Task
Write down an analytic definition for the following periodic function:
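[Figure: graph of the periodic function f(t), with the values 2 and −1 marked on the vertical axis and −5, −3, −2, 2, 3, 5 marked on the t axis]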
Your solution
Answer
$$f(t) = \begin{cases} 2 - t & 0 < t < 3 \\ -1 & 3 < t < 5 \end{cases} \qquad f(t + 5) = f(t)$$
Task
Sketch the graphs of the following periodic functions showing all relevant values:
(a) $f(t) = \begin{cases} t^2/2 & 0 < t < 4 \\ 8 & 4 < t < 6 \\ 0 & 6 < t < 8 \end{cases} \qquad f(t + 8) = f(t)$
(b) $f(t) = 2t - t^2 \quad 0 < t < 2 \qquad f(t + 2) = f(t)$
Your solution
Answer
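[Figure 9: sketches of one period of each function: (a) with the value 8 marked on the vertical axis and 4, 6, 8 on the t axis; (b) with 1 and 2 marked on the t axis]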
Representing Periodic
Functions by Fourier
Series 23.2
Introduction
In this Section we show how a periodic function can be expressed as a series of sines and cosines.
We begin by obtaining some standard integrals involving sinusoids. We then assume that if f (t) is
a periodic function, of period 2π, then the Fourier series expansion takes the form:
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt)$$
Our main purpose here is to show how the constants in this expansion, $a_n$ (for $n = 0, 1, 2, 3, \dots$) and $b_n$ (for $n = 1, 2, 3, \dots$), may be determined for any given function $f(t)$.
Prerequisites
• know what a periodic function is
• be able to integrate functions involving sinusoids
• have knowledge of integration by parts
Learning Outcomes
• calculate Fourier coefficients of a function of period 2π
• calculate Fourier coefficients of a function of general period
1. Introduction
We recall first a simple trigonometric identity:
$$\cos 2t = -1 + 2\cos^2 t \quad\text{or equivalently}\quad \cos^2 t = \frac{1}{2} + \frac{1}{2}\cos 2t \qquad (1)$$
Equation (1) can be interpreted as a simple finite Fourier series representation of the periodic function $f(t) = \cos^2 t$ which has period π. We note that the Fourier series representation contains a constant term and a period π term.
A more complicated trigonometric identity is
$$\sin^4 t = \frac{3}{8} - \frac{1}{2}\cos 2t + \frac{1}{8}\cos 4t \qquad (2)$$
which again can be considered as a finite Fourier series representation. (Do not worry if you are unfamiliar with the result (2).) Note that the function $f(t) = \sin^4 t$ (which has period π) is being written in terms of a constant function, a function of period π or frequency $\frac{1}{\pi}$ (the “first harmonic”) and a function of period $\frac{\pi}{2}$ or frequency $\frac{2}{\pi}$ (the “second harmonic”).
The reason for the constant term in both (1) and (2) is that each of the functions $\cos^2 t$ and $\sin^4 t$
is non-negative and hence each must have a positive average value. Any sinusoid of the form cos nt
or sin nt has, by symmetry, zero average value. Therefore, so would a Fourier series containing only
such terms. A constant term can therefore be expected to arise in the Fourier series of a function
which has a non-zero average value.
2. Functions of period 2π
We now discuss how to represent periodic non-sinusoidal functions f (t) of period 2π in terms of
sinusoids, i.e. how to obtain Fourier series representations. As already discussed we expect such Fourier series to contain harmonics of frequency $\frac{n}{2\pi}$ ($n = 1, 2, 3, \dots$) and, if the periodic function has a non-zero average value, a constant term.
Thus we seek a Fourier series representation of the general form
$$f(t) = \frac{a_0}{2} + a_1\cos t + a_2\cos 2t + \dots + b_1\sin t + b_2\sin 2t + \dots$$
The reason for labelling the constant term as $\frac{a_0}{2}$ will be discussed later. The amplitudes $a_1, a_2, \dots, b_1, b_2, \dots$ of the sinusoids are called Fourier coefficients.
Obtaining the Fourier coefficients for a given periodic function f (t) is our main task and is referred
to as Fourier Analysis. Before embarking on such an analysis it is instructive to establish, at least
qualitatively, the plausibility of approximating a function by a few terms of its Fourier series.
Task
Consider the square wave of period 2π one period of which is shown in Figure 10.
[Figure 10: one period of the square wave, with −π, −π/2, π/2 and π marked on the t axis]
(a) Write down the analytic description of this function,

(b) State whether you expect the Fourier series of this function to contain
a constant term,
(c) List any other possible features of the Fourier series that you might
expect from the graph of the square-wave function.
Your solution
Answer
(a) We have
$$f(t) = \begin{cases} 4 & -\frac{\pi}{2} < t < \frac{\pi}{2} \\ 0 & -\pi < t < -\frac{\pi}{2},\ \frac{\pi}{2} < t < \pi \end{cases} \qquad f(t + 2\pi) = f(t)$$
(b) The Fourier series will contain a constant term since the square wave here is non-negative and
cannot therefore have a zero average value. This constant term is often referred to as the d.c.
(direct current) term by engineers.
(c) Since the square wave is an even function (i.e. the graph has symmetry about the y axis) then
its Fourier series will contain cosine terms but not sine terms because only the cosines are even
functions. (Well done if you spotted this at this early stage!)
It is possible to show, and we will do so later, that the Fourier series representation of this square wave is
$$2 + \frac{8}{\pi}\left(\cos t - \frac{1}{3}\cos 3t + \frac{1}{5}\cos 5t - \frac{1}{7}\cos 7t + \dots\right)$$
i.e. the Fourier coefficients are
$$\frac{a_0}{2} = 2, \quad a_1 = \frac{8}{\pi}, \quad a_2 = 0, \quad a_3 = -\frac{8}{3\pi}, \quad a_4 = 0, \quad a_5 = \frac{8}{5\pi}, \dots$$
Note, as well as the presence of the constant term and of the cosine (but not sine) terms, that only odd harmonics are present i.e. sinusoids of period $2\pi, \frac{2\pi}{3}, \frac{2\pi}{5}, \frac{2\pi}{7}, \dots$ or of frequency 1, 3, 5, 7, . . . times the fundamental frequency $\frac{1}{2\pi}$.
We now show in Figure 8 graphs of
(i) the square wave
(ii) the first two terms of the Fourier series representing the square wave
(iii) the first three terms of the Fourier series representing the square wave
(iv) the first four terms of the Fourier series representing the square wave
(v) the first five terms of the Fourier series representing the square wave
Note: We show the graphs for 0 < t < π only since the square wave and its Fourier series are even.
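[Figure 8: five panels, each with 4 marked on the vertical axis and π/2, π marked on the t axis:
(i) the square wave
(ii) $2 + \frac{8}{\pi}\cos t$
(iii) $2 + \frac{8}{\pi}\left(\cos t - \frac{1}{3}\cos 3t\right)$
(iv) $2 + \frac{8}{\pi}\left(\cos t - \frac{1}{3}\cos 3t + \frac{1}{5}\cos 5t\right)$
(v) $2 + \frac{8}{\pi}\left(\cos t - \frac{1}{3}\cos 3t + \frac{1}{5}\cos 5t - \frac{1}{7}\cos 7t\right)$]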
We can clearly see from Figure 8 that as the number of terms is increased the graph of the Fourier series gradually approaches that of the original square wave: the ripples increase in number but decrease in amplitude. (The behaviour near the discontinuity, at $t = \frac{\pi}{2}$, is slightly more complicated and it is possible to show that however many terms are taken in the Fourier series, some “overshoot” will always occur. This effect, which we do not discuss further, is known as the Gibbs Phenomenon.)
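The behaviour just described can be explored numerically by summing the series quoted above for this square wave. The sketch below is illustrative; the function name `square_wave_partial_sum` is an assumption, and "terms" counts the constant term plus the non-zero cosine terms, as in Figure 8.

```python
import math


def square_wave_partial_sum(t, n_terms):
    """Constant term 2 plus the first (n_terms - 1) non-zero cosine terms."""
    total = 2.0
    for k in range(n_terms - 1):
        n = 2 * k + 1                    # odd harmonics 1, 3, 5, ...
        total += (8 / math.pi) * (-1) ** k * math.cos(n * t) / n
    return total


# Values at t = 0 (where the wave equals 4) and at t = pi (where it equals 0)
for n_terms in (2, 3, 4, 5, 50):
    print(n_terms,
          round(square_wave_partial_sum(0.0, n_terms), 4),
          round(square_wave_partial_sum(math.pi, n_terms), 4))

# Largest value of the 50-term sum on 0 <= t <= pi/2: it exceeds 4 (the overshoot)
peak = max(square_wave_partial_sum(i * (math.pi / 2) / 2000, 50) for i in range(2001))
print(round(peak, 3))
```

Increasing `n_terms` moves the overshoot closer to the discontinuity without removing it.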
Orthogonality properties of sinusoids

As stated earlier, a periodic function f (t) with period 2π has a Fourier series representation
$$f(t) = \frac{a_0}{2} + a_1\cos t + a_2\cos 2t + \dots + b_1\sin t + b_2\sin 2t + \dots = \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nt + b_n\sin nt) \qquad (3)$$
To determine the Fourier coefficients $a_n$, $b_n$ and the constant term $\frac{a_0}{2}$ use has to be made of certain integrals involving sinusoids, the integrals being over a range α, α + 2π, where α is any number. (We will normally choose α = −π.)
Task
Find $\displaystyle\int_{-\pi}^{\pi} \sin nt\,dt$ and $\displaystyle\int_{-\pi}^{\pi} \cos nt\,dt$ where $n$ is an integer.
Your solution
Answer
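In fact both integrals are zero for
$$\int_{-\pi}^{\pi} \sin nt\,dt = \left[-\frac{1}{n}\cos nt\right]_{-\pi}^{\pi} = \frac{1}{n}(-\cos n\pi + \cos n\pi) = 0 \qquad n \neq 0 \qquad (4)$$
$$\int_{-\pi}^{\pi} \cos nt\,dt = \left[\frac{1}{n}\sin nt\right]_{-\pi}^{\pi} = 0 \qquad n \neq 0 \qquad (5)$$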
As special cases, if n = 0 the first integral is zero and the second integral has value 2π.
N.B. Any integration range α, α + 2π, would give these same (zero) answers.
These integrals enable us to calculate the constant term in the Fourier series (3) as in the following
task.
Task
Integrate both sides of (3) from −π to π and use the results from the previous
Task. Hence obtain an expression for a0 .
Your solution
Answer
We get for the left-hand side
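$$\int_{-\pi}^{\pi} f(t)\,dt$$
(whose value clearly depends on the function $f(t)$).
Integrating the right-hand side term by term we get
$$\frac{1}{2}\int_{-\pi}^{\pi} a_0\,dt + \sum_{n=1}^{\infty}\left\{a_n\int_{-\pi}^{\pi}\cos nt\,dt + b_n\int_{-\pi}^{\pi}\sin nt\,dt\right\} = \frac{1}{2}\Big[a_0 t\Big]_{-\pi}^{\pi} + \sum_{n=1}^{\infty}\{0 + 0\}$$
(using the integrals (4) and (5) shown above). Thus we get
$$\int_{-\pi}^{\pi} f(t)\,dt = \frac{1}{2}(2a_0\pi) \quad\text{or}\quad a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,dt \qquad (6)$$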
Key Point 1
The constant term in a trigonometric Fourier series for a function of period 2π is
$$\frac{a_0}{2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt = \text{average value of } f(t) \text{ over 1 period.}$$
This result ties in with our earlier discussion on the significance of the constant term. Clearly a signal
whose average value is zero will have no constant term in its Fourier series. The following square
wave (Figure 9) is an example.
[Figure 9: square wave f(t) taking the values 1 and −1, with π and 2π marked on the t axis]
We now obtain further integrals, known as orthogonality properties, which enable us to find the
remaining Fourier coefficients i.e. the amplitudes an and bn (n = 1, 2, 3, . . .) of the sinusoids.
Task
Using the standard trigonometric identity that
$$\sin nt\cos mt \equiv \frac{1}{2}\{\sin(n + m)t + \sin(n - m)t\}$$
evaluate $\displaystyle\int_{-\pi}^{\pi} \sin nt\cos mt\,dt$ where $n$ and $m$ are any integers.
Your solution
Answer
We get
$$\int_{-\pi}^{\pi} \sin nt\cos mt\,dt = \frac{1}{2}\int_{-\pi}^{\pi}\sin(n + m)t\,dt + \frac{1}{2}\int_{-\pi}^{\pi}\sin(n - m)t\,dt = \frac{1}{2}\{0 + 0\} = 0$$
using the results (4) and (5) since $n + m$ and $n - m$ are also integers.
This result holds for any interval of 2π.
Key Point 2
Orthogonality Relation
For any integers m, n, including the case m = n,
$$\int_{-\pi}^{\pi} \sin nt\cos mt\,dt = 0$$
We shall use this result shortly but need a few more integrals first.
Consider next
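$$\int_{-\pi}^{\pi} \cos nt\cos mt\,dt \qquad\text{where } m \text{ and } n \text{ are integers.}$$
Using another trigonometric identity we have, for the case $n \neq m$,
$$\int_{-\pi}^{\pi} \cos nt\cos mt\,dt = \frac{1}{2}\int_{-\pi}^{\pi}\{\cos(n + m)t + \cos(n - m)t\}\,dt = \frac{1}{2}\{0 + 0\} = 0 \qquad\text{using the integrals (4) and (5).}$$
For the case $n = m$ we must get a non-zero answer since $\cos^2 nt$ is non-negative. In this case:
$$\int_{-\pi}^{\pi} \cos^2 nt\,dt = \frac{1}{2}\int_{-\pi}^{\pi}(1 + \cos 2nt)\,dt = \frac{1}{2}\left[t + \frac{1}{2n}\sin 2nt\right]_{-\pi}^{\pi} = \pi \qquad (\text{provided } n \neq 0)$$
For the case $n = m = 0$ we have $\displaystyle\int_{-\pi}^{\pi} \cos nt\cos mt\,dt = 2\pi$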
Task
Proceeding in a similar way to the above, evaluate
$$\int_{-\pi}^{\pi} \sin nt\sin mt\,dt$$
for integers $m$ and $n$.
Again consider separately the three cases: (a) $n \neq m$, (b) $n = m \neq 0$ and (c) $n = m = 0$.
Your solution
Answer
(a) Using the identity $\sin nt\sin mt \equiv \frac{1}{2}\{\cos(n - m)t - \cos(n + m)t\}$ and integrating the right-hand side terms, we get, using (4) and (5)
$$\int_{-\pi}^{\pi} \sin nt\sin mt\,dt = 0 \qquad n, m \text{ integers},\ n \neq m$$
(b) Using the identity $\cos 2\theta = 1 - 2\sin^2\theta$ with $\theta = nt$ gives for $n = m \neq 0$
$$\int_{-\pi}^{\pi} \sin^2 nt\,dt = \frac{1}{2}\int_{-\pi}^{\pi}(1 - \cos 2nt)\,dt = \pi$$
(c) When $n = m = 0$, $\displaystyle\int_{-\pi}^{\pi} \sin nt\sin mt\,dt = 0$.
We summarise these results in the following Key Point:
Key Point 3
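For integers $n$, $m$
$$\int_{-\pi}^{\pi} \sin nt\cos mt\,dt = 0$$
$$\int_{-\pi}^{\pi} \cos nt\cos mt\,dt = \begin{cases} 0 & n \neq m \\ \pi & n = m \neq 0 \\ 2\pi & n = m = 0 \end{cases}$$
$$\int_{-\pi}^{\pi} \sin nt\sin mt\,dt = \begin{cases} 0 & n \neq m \text{ or } n = m = 0 \\ \pi & n = m \neq 0 \end{cases}$$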
All these results hold for any integration range of width 2π.
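The orthogonality relations can be checked numerically for particular non-negative integers. The following is a rough sketch using a composite midpoint rule; the helper names `integrate`, `inner`, `cos_n` and `sin_n` are assumptions introduced only for this illustration.

```python
import math


def integrate(g, a, b, panels=2000):
    """Composite midpoint rule approximation to the integral of g from a to b."""
    h = (b - a) / panels
    return h * sum(g(a + (i + 0.5) * h) for i in range(panels))


def inner(u, v):
    """Integral of u(t) v(t) over one period, -pi to pi."""
    return integrate(lambda t: u(t) * v(t), -math.pi, math.pi)


def cos_n(n):
    return lambda t: math.cos(n * t)


def sin_n(n):
    return lambda t: math.sin(n * t)


print(inner(sin_n(2), cos_n(3)))   # sine times cosine
print(inner(cos_n(2), cos_n(5)))   # cosines with n != m
print(inner(cos_n(3), cos_n(3)))   # cosines with n = m != 0
print(inner(cos_n(0), cos_n(0)))   # cosines with n = m = 0
print(inner(sin_n(4), sin_n(4)))   # sines with n = m != 0
```

The printed values should agree with 0, 0, π, 2π and π to within rounding error.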
3. Calculation of Fourier coefficients
Consider the Fourier series for a function f (t) of period 2π:
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nt + b_n\sin nt) \qquad (7)$$
To obtain the coefficients an (n = 1, 2, 3, . . .), we multiply both sides by cos mt where m is some
positive integer and integrate both sides from −π to π.
For the left-hand side we obtain
$$\int_{-\pi}^{\pi} f(t)\cos mt\,dt$$
For the right-hand side we obtain
$$\frac{a_0}{2}\int_{-\pi}^{\pi}\cos mt\,dt + \sum_{n=1}^{\infty}\left\{a_n\int_{-\pi}^{\pi}\cos nt\cos mt\,dt + b_n\int_{-\pi}^{\pi}\sin nt\cos mt\,dt\right\}$$
The first integral is zero using (5).
Using the orthogonality relations all the integrals in the summation give zero except for the case $n = m$ when, from Key Point 3
$$\int_{-\pi}^{\pi}\cos^2 mt\,dt = \pi$$
Hence
$$\int_{-\pi}^{\pi} f(t)\cos mt\,dt = a_m\pi$$
from which the coefficient $a_m$ can be obtained.
Rewriting $m$ as $n$ we get
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt \qquad\text{for } n = 1, 2, 3, \dots \qquad (8)$$
Using (6), we see the formula also works for $n = 0$ (but we must remember that the constant term is $\frac{a_0}{2}$.)
From (8)
$a_n = 2 \times$ average value of $f(t)\cos nt$ over one period.
Task
By multiplying (7) by sin mt obtain an expression for the Fourier Sine coefficients
bn , n = 1, 2, 3, . . .
Your solution
Answer
A similar calculation to that performed to find the an gives
$$\int_{-\pi}^{\pi} f(t)\sin mt\,dt = \frac{a_0}{2}\int_{-\pi}^{\pi}\sin mt\,dt + \sum_{n=1}^{\infty}\left\{a_n\int_{-\pi}^{\pi}\cos nt\sin mt\,dt + b_n\int_{-\pi}^{\pi}\sin nt\sin mt\,dt\right\}$$
All terms on the right-hand side integrate to zero except for the case $n = m$ where
$$b_m\int_{-\pi}^{\pi}\sin^2 mt\,dt = b_m\pi$$
Relabelling $m$ as $n$ gives
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt \qquad n = 1, 2, 3, \dots \qquad (9)$$
(There is no Fourier coefficient b0 .)
Clearly bn = 2 × average value of f (t) sin nt over one period.
Key Point 4
A function f (t) with period 2π has a Fourier series
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nt + b_n\sin nt)$$
The Fourier coefficients are
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt \qquad n = 0, 1, 2, \dots$$
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt \qquad n = 1, 2, \dots$$
In the integrals any convenient integration range extending over an interval of 2π may be used.
4. Examples of Fourier series

We shall obtain the Fourier series of the “half-rectified” square wave shown in Figure 10.
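[Figure 10: half-rectified square wave f(t) taking the values 1 and 0, with π and 2π marked on the t axis and one period indicated]
We have
$$f(t) = \begin{cases} 1 & 0 < t < \pi \\ 0 & \pi < t < 2\pi \end{cases} \qquad f(t + 2\pi) = f(t)$$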
The calculation of the Fourier coefficients is merely straightforward integration using the results
already obtained:
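$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt$$
in general. Hence, for our square wave
$$a_n = \frac{1}{\pi}\int_{0}^{\pi} (1)\cos nt\,dt = \frac{1}{\pi}\left[\frac{\sin nt}{n}\right]_{0}^{\pi} = 0 \qquad\text{provided } n \neq 0$$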
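But $a_0 = \dfrac{1}{\pi}\displaystyle\int_{0}^{\pi} (1)\,dt = 1$ so the constant term is $\dfrac{a_0}{2} = \dfrac{1}{2}$.
(The square wave takes on values 1 and 0 over equal length intervals of $t$ so $\frac{1}{2}$ is clearly the mean value.)
Similarly
$$b_n = \frac{1}{\pi}\int_{0}^{\pi} (1)\sin nt\,dt = \frac{1}{\pi}\left[-\frac{\cos nt}{n}\right]_{0}^{\pi}$$
Some care is needed now!
$$b_n = \frac{1}{n\pi}(1 - \cos n\pi)$$
But $\cos n\pi = +1$ for $n = 2, 4, 6, \dots$,
∴ $b_n = 0$ for $n = 2, 4, 6, \dots$
However, $\cos n\pi = -1$ for $n = 1, 3, 5, \dots$
∴ $b_n = \dfrac{1}{n\pi}(1 - (-1)) = \dfrac{2}{n\pi}$ for $n = 1, 3, 5, \dots$
i.e. $b_1 = \dfrac{2}{\pi}$, $b_3 = \dfrac{2}{3\pi}$, $b_5 = \dfrac{2}{5\pi}$, . . .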
Hence the required Fourier series is
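$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nt + b_n\sin nt) \qquad\text{in general}$$
$$f(t) = \frac{1}{2} + \frac{2}{\pi}\left(\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t + \dots\right) \qquad\text{in this case}$$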
Note that the Fourier series for this particular form of the square wave contains a constant term and
odd harmonic sine terms. We already know why the constant term arises (because of the non-zero
mean value of the function) and will explain later why the presence of only odd harmonic sine terms
could have been predicted without integration.
The Fourier series we have found can be written in summation notation in various ways:
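$$\frac{1}{2} + \frac{2}{\pi}\sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty}\frac{1}{n}\sin nt$$
or, since $n$ is odd, we may write $n = 2k - 1$, $k = 1, 2, \dots$ and write the Fourier series as
$$\frac{1}{2} + \frac{2}{\pi}\sum_{k=1}^{\infty}\frac{1}{(2k - 1)}\sin(2k - 1)t$$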
Task
Obtain the Fourier series of the square wave one period of which is shown:
[Figure: one period of the square wave, with −π, −π/2, π/2 and π marked on the t axis]
Your solution
Answer
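We have, since the function is non-zero only for $-\frac{\pi}{2} < t < \frac{\pi}{2}$,
$$a_0 = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} 4\,dt = 4$$
∴ $\dfrac{a_0}{2} = 2$ is the constant term as we would expect. Also
$$a_n = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} 4\cos nt\,dt = \frac{4}{\pi}\left[\frac{\sin nt}{n}\right]_{-\pi/2}^{\pi/2} = \frac{4}{n\pi}\left\{\sin\frac{n\pi}{2} - \sin\left(-\frac{n\pi}{2}\right)\right\} = \frac{8}{n\pi}\sin\frac{n\pi}{2} \qquad n = 1, 2, 3, \dots$$
It follows from a knowledge of the sine function that
$$a_n = \begin{cases} 0 & n = 2, 4, 6, \dots \\ \dfrac{8}{n\pi} & n = 1, 5, 9, \dots \\ -\dfrac{8}{n\pi} & n = 3, 7, 11, \dots \end{cases}$$
Also
$$b_n = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} 4\sin nt\,dt = \frac{4}{\pi}\left[-\frac{\cos nt}{n}\right]_{-\pi/2}^{\pi/2} = -\frac{4}{n\pi}\left\{\cos\frac{n\pi}{2} - \cos\left(-\frac{n\pi}{2}\right)\right\} = 0$$
Hence, the required Fourier series is
$$f(t) = 2 + \frac{8}{\pi}\left(\cos t - \frac{1}{3}\cos 3t + \frac{1}{5}\cos 5t - \frac{1}{7}\cos 7t + \dots\right)$$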
which, like the previous square wave, contains a constant term and odd harmonics, but in this case
odd harmonic cosine terms rather than sine.
You may recall that this particular square wave was used earlier and we have already sketched the
form of the Fourier series for 2, 3, 4 and 5 terms in Figure 8.
Clearly, in finding the Fourier series of square waves, the integration is particularly simple because
f (t) takes on piecewise constant values. For other functions, such as saw-tooth waves this will
not be the case. Before we tackle such functions however we shall generalise our formulae for the
Fourier coefficients an , bn to the case of a periodic function of arbitrary period, rather than confining
ourselves to period 2π.
5. Fourier series for functions of general period
This is a straightforward extension of the period 2π case that we have already discussed.
Using $x$ (instead of $t$) temporarily as the variable, we have seen that a 2π-periodic function $f(x)$ has a Fourier series
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx)$$
with
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \quad n = 0, 1, 2, \dots \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx \quad n = 1, 2, \dots$$
Suppose we now change the variable to $t$ where $x = \dfrac{2\pi}{T}t$.
Thus $x = \pi$ corresponds to $t = T/2$ and $x = -\pi$ corresponds to $t = -T/2$.
Hence regarded as a function of $t$, we have a function with period $T$.
Making the substitution $x = \dfrac{2\pi}{T}t$, and hence $dx = \dfrac{2\pi}{T}dt$, in the expressions for $a_n$ and $b_n$ we obtain
$$a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos\frac{2n\pi t}{T}\,dt \qquad n = 0, 1, 2, \dots$$
$$b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin\frac{2n\pi t}{T}\,dt \qquad n = 1, 2, \dots$$
These integrals give the Fourier coefficients for a function of period $T$ whose Fourier series is
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos\frac{2n\pi t}{T} + b_n\sin\frac{2n\pi t}{T}\right)$$
Various other notations are commonly used in this case e.g. it is sometimes convenient to write
the period $T = 2\ell$. (This is particularly useful when Fourier series arise in the solution of partial
differential equations.) Another alternative is to use the angular frequency ω and put T = 2π/ω.
Task
Write down the form of the Fourier series and expressions for the coefficients if
(a) $T = 2\ell$ (b) $T = 2\pi/\omega$.
Your solution
Answer
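(a) $f(t) = \dfrac{a_0}{2} + \displaystyle\sum_{n=1}^{\infty}\left(a_n\cos\frac{n\pi t}{\ell} + b_n\sin\frac{n\pi t}{\ell}\right)$ with $a_n = \dfrac{1}{\ell}\displaystyle\int_{-\ell}^{\ell} f(t)\cos\frac{n\pi t}{\ell}\,dt$
and similarly for $b_n$.
(b) $f(t) = \dfrac{a_0}{2} + \displaystyle\sum_{n=1}^{\infty}\{a_n\cos(n\omega t) + b_n\sin(n\omega t)\}$ with $a_n = \dfrac{\omega}{\pi}\displaystyle\int_{-\pi/\omega}^{\pi/\omega} f(t)\cos(n\omega t)\,dt$
and similarly for $b_n$.
You should note that, as usual, any convenient integration range of length $T$ (or $2\ell$ or $\frac{2\pi}{\omega}$) can be used in evaluating $a_n$ and $b_n$.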
Example 1
Find the Fourier series of the function shown in Figure 11 which is a saw-tooth
wave with alternate portions removed.
Figure 11: graph of $f(t)$ (vertical axis marked at 2; $t$ axis marked at $-2$ and $2$)
Solution
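Here the period $T = 2\ell = 4$ so $\ell = 2$. The Fourier series will have the form
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left\{ a_n\cos\frac{n\pi t}{2} + b_n\sin\frac{n\pi t}{2}\right\} \]
The coefficients $a_n$ are given by
\[ a_n = \frac{1}{2}\int_{-2}^{2} f(t)\cos\frac{n\pi t}{2}\,dt \]
where
\[ f(t) = \begin{cases} 0 & -2 < t < 0 \\ t & 0 < t < 2 \end{cases} \qquad f(t+4) = f(t) \]
Hence $a_n = \frac{1}{2}\int_{0}^{2} t\cos\frac{n\pi t}{2}\,dt$.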
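The integration is readily performed using integration by parts:
\[ \int_0^2 t\cos\frac{n\pi t}{2}\,dt = \left[ t\,\frac{2}{n\pi}\sin\frac{n\pi t}{2}\right]_0^2 - \frac{2}{n\pi}\int_0^2 \sin\frac{n\pi t}{2}\,dt \]
\[ = \frac{4}{n^2\pi^2}\left[\cos\frac{n\pi t}{2}\right]_0^2 \qquad (n \neq 0) \]
\[ = \frac{4}{n^2\pi^2}(\cos n\pi - 1). \]
Hence, since $a_n = \frac{1}{2}\int_0^2 t\cos\frac{n\pi t}{2}\,dt$,
\[ a_n = \begin{cases} 0 & n = 2, 4, 6, \ldots \\ -\dfrac{4}{n^2\pi^2} & n = 1, 3, 5, \ldots \end{cases} \]
The constant term is $\frac{a_0}{2}$ where $a_0 = \frac{1}{2}\int_0^2 t\,dt = 1$.
Similarly
\[ b_n = \frac{1}{2}\int_0^2 t\sin\frac{n\pi t}{2}\,dt \]
where
\[ \int_0^2 t\sin\frac{n\pi t}{2}\,dt = \left[-t\,\frac{2}{n\pi}\cos\frac{n\pi t}{2}\right]_0^2 + \frac{2}{n\pi}\int_0^2\cos\frac{n\pi t}{2}\,dt. \]
The second integral gives zero. Hence
\[ b_n = -\frac{2}{n\pi}\cos n\pi = \begin{cases} -\dfrac{2}{n\pi} & n = 2, 4, 6, \ldots \\ +\dfrac{2}{n\pi} & n = 1, 3, 5, \ldots \end{cases} \]
Hence, using all these results for the Fourier coefficients, the required Fourier series is
\[ f(t) = \frac{1}{2} - \frac{4}{\pi^2}\left\{ \cos\frac{\pi t}{2} + \frac{1}{9}\cos\frac{3\pi t}{2} + \frac{1}{25}\cos\frac{5\pi t}{2} + \ldots\right\} + \frac{2}{\pi}\left\{\sin\frac{\pi t}{2} - \frac{1}{2}\sin\frac{2\pi t}{2} + \frac{1}{3}\sin\frac{3\pi t}{2} - \ldots\right\} \]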
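Notice that the cosine coefficients $a_n$ depend on $\frac{1}{n^2}$ (rather than on $\frac{1}{n}$ as was the case for
the square wave), so the cosine components in the Fourier series have quite rapidly decreasing
amplitudes. The sine coefficients $b_n$ still decrease only as $\frac{1}{n}$, reflecting the jump in $f(t)$ at the end
of each period. We would therefore expect to be able to approximate the sloping portions of the
original saw-tooth function using only a quite small number of terms in the series, with slower
convergence near the jumps.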
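The coefficients found in Example 1 can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper names `integrate`, `sawtooth` and `coefficients` are introduced here for illustration, and a simple midpoint rule stands in for exact integration.

```python
import math

def integrate(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def sawtooth(t):
    """One period of the Example 1 function on -2 < t < 2."""
    return t if t > 0 else 0.0

def coefficients(f, T, n):
    """Return (a_n, b_n) for a function f of period T, integrating over (-T/2, T/2)."""
    w = 2 * math.pi * n / T
    a_n = (2 / T) * integrate(lambda t: f(t) * math.cos(w * t), -T / 2, T / 2)
    b_n = (2 / T) * integrate(lambda t: f(t) * math.sin(w * t), -T / 2, T / 2)
    return a_n, b_n

for n in range(1, 5):
    a_n, b_n = coefficients(sawtooth, 4.0, n)
    # Closed forms from the worked solution
    a_exact = 0.0 if n % 2 == 0 else -4 / (n * math.pi) ** 2
    b_exact = -2 * math.cos(n * math.pi) / (n * math.pi)
    print(n, round(a_n, 5), round(a_exact, 5), round(b_n, 5), round(b_exact, 5))
```

The same `coefficients` helper works for any period $T$, since it uses the general-period formulae directly.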
Task
Obtain the Fourier series of the function
\[ f(t) = t^2, \quad -1 < t < 1, \qquad f(t+2) = f(t) \]
[Graph of $f(t)$ against $t$, with the $t$ axis marked at $-2$, $-1$, $1$ and $2$]
First write out the form of the Fourier series in this case:
Your solution
Answer
Since $T = 2\ell = 2$ and since the function has a non-zero average value, the form of the Fourier
series is
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty}\{ a_n\cos(n\pi t) + b_n\sin(n\pi t)\} \]
Now write out integral expressions for an and bn . Will there be a constant term in the Fourier series?
Your solution
Answer
Because the function is non-negative there will be a constant term. Since $T = 2\ell = 2$ then $\ell = 1$
and we have
\[ a_n = \int_{-1}^{1} t^2\cos(n\pi t)\,dt, \quad n = 0, 1, 2, \ldots \]
\[ b_n = \int_{-1}^{1} t^2\sin(n\pi t)\,dt, \quad n = 1, 2, \ldots \]
The constant term will be $\frac{a_0}{2}$ where $a_0 = \int_{-1}^{1} t^2\,dt$.
Now evaluate the integrals. Try to spot the value of the integral for $b_n$ so as to avoid integration.
Note that the integrand is an even function for $a_n$ and an odd function for $b_n$.
Your solution
Answer
The integral for $b_n$ is zero for all $n$ because the integrand is an odd function of $t$. Since the integrand
is even in the integrals for $a_n$ we can write
\[ a_n = 2\int_0^1 t^2\cos n\pi t\,dt, \quad n = 0, 1, 2, \ldots \]
The constant term will be $\frac{a_0}{2}$ where $a_0 = 2\int_0^1 t^2\,dt = \frac{2}{3}$.
For $n = 1, 2, 3, \ldots$ we must integrate by parts (twice)
\[ a_n = 2\left\{\left[\frac{t^2}{n\pi}\sin(n\pi t)\right]_0^1 - \frac{2}{n\pi}\int_0^1 t\sin(n\pi t)\,dt\right\} \]
\[ = -\frac{4}{n\pi}\left\{\left[-\frac{t}{n\pi}\cos(n\pi t)\right]_0^1 + \frac{1}{n\pi}\int_0^1\cos(n\pi t)\,dt\right\}. \]
The integral in the second term gives zero so $a_n = \frac{4}{n^2\pi^2}\cos n\pi$.
Now writing out the final form of the Fourier series we have
\[ f(t) = \frac{1}{3} + \frac{4}{\pi^2}\sum_{n=1}^{\infty}\frac{\cos n\pi}{n^2}\cos(n\pi t) = \frac{1}{3} + \frac{4}{\pi^2}\left\{-\cos(\pi t) + \frac{1}{4}\cos(2\pi t) - \frac{1}{9}\cos(3\pi t) + \ldots\right\} \]
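As a quick sanity check, partial sums of this series can be compared with $t^2$ at a few points of $-1 < t < 1$. This is an illustrative sketch only; the function name `partial_sum` and the choice of sample points are assumptions made here.

```python
import math

def partial_sum(t, terms):
    """Partial sum of 1/3 + (4/pi^2) * sum_{n>=1} cos(n pi) cos(n pi t) / n^2."""
    total = 1 / 3
    for n in range(1, terms + 1):
        total += (4 / math.pi ** 2) * math.cos(n * math.pi) * math.cos(n * math.pi * t) / n ** 2
    return total

# Compare the truncated series with t^2 at a few sample points
for t in (0.0, 0.5, 1.0):
    print(t, t ** 2, round(partial_sum(t, 200), 4))
```

Because the coefficients decay like $1/n^2$, a few hundred terms already agree with $t^2$ to two or three decimal places, with the slowest agreement at $t = \pm 1$.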
Exercises
For each of the following periodic signals
• sketch the given function over a few periods
• find the trigonometric Fourier coefficients
• write out the first few terms of the Fourier series.


1. $f(t) = \begin{cases} 1 & 0 < t < \pi/2 \\ 0 & \pi/2 < t < 2\pi \end{cases}$, $f(t + 2\pi) = f(t)$ (square wave)
2. $f(t) = t^2$, $-1 < t < 1$, $f(t + 2) = f(t)$
3. $f(t) = \begin{cases} -1 & -T/2 < t < 0 \\ 1 & 0 < t < T/2 \end{cases}$, $f(t + T) = f(t)$ (square wave)
4. $f(t) = \begin{cases} 0 & -\pi < t < 0 \\ t^2 & 0 < t < \pi \end{cases}$, $f(t + 2\pi) = f(t)$
5. $f(t) = \begin{cases} 0 & -T/2 < t < 0 \\ A\sin\dfrac{2\pi t}{T} & 0 < t < T/2 \end{cases}$, $f(t + T) = f(t)$ (half-wave rectifier)
Answers
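1. $\frac{1}{4} + \frac{1}{\pi}\left\{\cos t - \frac{\cos 3t}{3} + \frac{\cos 5t}{5} - \ldots\right\} + \frac{1}{\pi}\left\{\sin t + \frac{2\sin 2t}{2} + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \frac{2\sin 6t}{6} + \ldots\right\}$
2. $\frac{1}{3} - \frac{4}{\pi^2}\left\{\cos \pi t - \frac{\cos 2\pi t}{4} + \frac{\cos 3\pi t}{9} - \frac{\cos 4\pi t}{16} + \ldots\right\}$
3. $\frac{4}{\pi}\left\{\sin\omega t + \frac{1}{3}\sin 3\omega t + \frac{1}{5}\sin 5\omega t + \ldots\right\}$ where $\omega = 2\pi/T$.
4. $\frac{\pi^2}{6} - 2\left\{\cos t - \frac{\cos 2t}{2^2} + \frac{\cos 3t}{3^2} - \ldots\right\} + \left(\pi - \frac{4}{\pi}\right)\sin t - \frac{\pi}{2}\sin 2t + \left(\frac{\pi}{3} - \frac{4}{3^3\pi}\right)\sin 3t - \frac{\pi}{4}\sin 4t + \ldots$
5. $\frac{A}{\pi} + \frac{A}{2}\sin\omega t - \frac{2A}{\pi}\left\{\frac{\cos 2\omega t}{(1)(3)} + \frac{\cos 4\omega t}{(3)(5)} + \ldots\right\}$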
Even and Odd
Functions 23.3

Introduction
In this Section we examine how to obtain Fourier series of periodic functions which are either even
or odd. We show that the Fourier series for such functions is considerably easier to obtain as, if the
signal is even only cosines are involved whereas if the signal is odd then only sines are involved. We
also show that if a signal reverses after half a period then the Fourier series will only contain odd
harmonics.
Prerequisites
Before starting this Section you should . . .
• know how to obtain a Fourier series
• be able to integrate functions involving sinusoids
Learning Outcomes
On completion you should be able to . . .
• determine if a function is even or odd or neither
• easily calculate Fourier coefficients of even or odd functions
1. Even and odd functions

We have shown in the previous Section how to calculate, by integration, the coefficients $a_n$ ($n = 0, 1, 2, 3, \ldots$)
and $b_n$ ($n = 1, 2, 3, \ldots$) in a Fourier series. Clearly this is a somewhat tedious process
and it is advantageous if we can obtain as much information as possible without recourse to
integration. In the previous Section we showed that the square wave (one period of which is shown in
Figure 12) has a Fourier series containing a constant term and cosine terms only (i.e. all the Fourier
coefficients $b_n$ are zero) while the function shown in Figure 13 has a more complicated Fourier series
containing both cosine and sine terms as well as a constant.
Figure 12: Square wave ($t$ axis marked at $-\pi$, $-\pi/2$, $\pi/2$ and $\pi$)
Figure 13: Saw-tooth wave ($f(t)$ axis marked at 2; $t$ axis marked at $-2$ and $2$)
Task
Contrast the symmetry or otherwise of the functions in Figures 12 and 13.
Your solution
Answer
The square wave in Figure 12 has a graph which is symmetrical about the y-axis and is called an
even function. The saw-tooth wave shown in Figure 13 has no particular symmetry.
In general a function is called even if its graph is unchanged under reflection in the y-axis. This is
equivalent to
f (−t) = f (t) for all t
Obvious examples of even functions are $t^2$, $t^4$, $|t|$, $\cos t$, $\cos^2 t$, $\sin^2 t$, $\cos nt$.
A function is said to be odd if its graph is symmetrical about the origin (i.e. it has rotational
symmetry about the origin). This is equivalent to the condition
f (−t) = −f (t)
Figure 14 shows an example of an odd function.
Figure 14: graph of an odd function $f(t)$
Examples of odd functions are $t$, $t^3$, $\sin t$, $\sin nt$. A periodic function which is odd is the saw-tooth
wave in Figure 15.
Figure 15: odd saw-tooth wave ($f(t)$ axis marked at $-1$; $t$ axis marked at $-1$ and $1$)
Some functions are neither even nor odd. The periodic saw-tooth wave of Figure 13 is an example;
another is the exponential function $e^t$.
Task
State the period of each of the following periodic functions and say whether it is
even or odd or neither.
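[Graphs of the two periodic functions (a) and (b), each plotted as $f(t)$ against $t$; the $t$ axes are marked at multiples of $\pi$, $\pi/2$ and $\pi/4$]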
Your solution
Answer
(a) is neither even nor odd (with period 2π)
(b) is odd (with period π).
A Fourier series contains a sum of terms while the integral formulae for the Fourier coefficients an
and bn contain products of the type f (t) cos nt and f (t) sin nt. We need therefore results for sums
and products of functions.
Suppose, for example, g(t) is an odd function and h(t) is an even function.
Let $F_1(t) = g(t)\,h(t)$ (product of odd and even functions)
so $F_1(-t) = g(-t)\,h(-t)$ (replacing $t$ by $-t$)
$= (-g(t))\,h(t)$ (since $g$ is odd and $h$ is even)
$= -g(t)\,h(t)$
$= -F_1(t)$
So $F_1(t)$ is odd.
Now suppose $F_2(t) = g(t) + h(t)$ (sum of odd and even functions)
$\therefore\ F_2(-t) = g(-t) + h(-t)$
$= -g(t) + h(t)$
We see that
$F_2(-t) \neq F_2(t)$
and $F_2(-t) \neq -F_2(t)$
So $F_2(t)$ is neither even nor odd.
Task
Investigate the odd/even nature of sums and products of
(a) two odd functions g1 (t), g2 (t)
(b) two even functions h1 (t), h2 (t)
Your solution
Answer
G1 (t) = g1 (t)g2 (t)

G1 (−t) = (−g1 (t))(−g2 (t))
= g1 (t)g2 (t)
= G1 (t)
so the product of two odd functions is even.
G2 (t) = g1 (t) + g2 (t)

G2 (−t) = g1 (−t) + g2 (−t)
= −g1 (t) − g2 (t)
= −G2 (t)
so the sum of two odd functions is odd.
H1 (t) = h1 (t)h2 (t)

H2 (t) = h1 (t) + h2 (t)
A similar approach shows that
H1 (−t) = H1 (t)
H2 (−t) = H2 (t)
i.e. both the sum and product of two even functions are even.
These results are summarized in the following Key Point.
Key Point 5
Products of functions
(even) × (even) = (even)

(even) × (odd) = (odd)
(odd) × (odd) = (even)
Sums of functions
(even) + (even) = (even)

(even) + (odd) = (neither)
(odd) + (odd) = (odd)
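The rules in Key Point 5 can be spot-checked numerically. The sketch below is illustrative only: the helper `parity` is a hypothetical name, and it tests $f(-t) = \pm f(t)$ at a handful of sample points, which is evidence rather than proof.

```python
import math

def parity(f, samples=(0.3, 0.7, 1.1, 2.5), tol=1e-12):
    """Classify f as 'even', 'odd' or 'neither' from a few sample points."""
    is_even = all(abs(f(-t) - f(t)) < tol for t in samples)
    is_odd = all(abs(f(-t) + f(t)) < tol for t in samples)
    if is_even and not is_odd:
        return "even"
    if is_odd and not is_even:
        return "odd"
    return "neither"  # also returned if f vanishes at every sample point

# g(t) = sin t is odd, h(t) = cos t is even
print(parity(lambda t: math.sin(t) * math.cos(t)))   # (odd) x (even)
print(parity(lambda t: math.sin(t) + math.cos(t)))   # (odd) + (even)
print(parity(lambda t: t * math.sin(t)))             # (odd) x (odd)
print(parity(lambda t: t ** 2 + math.cos(t)))        # (even) + (even)
print(parity(lambda t: t + math.sin(t)))             # (odd) + (odd)
```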
Useful properties of even and of odd functions in connection with integrals can be readily deduced if
we recall that a definite integral has the significance of giving us the value of an area:
Figure 16: shaded area under the curve $y = f(t)$ between $t = a$ and $t = b$
$\int_a^b f(t)\,dt$ gives us the net value of the shaded area, that above the $t$-axis being positive, that below
being negative.
Task
For the case of a symmetrical interval (−a, a) deduce what you can about
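\[ \int_{-a}^{a} g(t)\,dt \quad\text{and}\quad \int_{-a}^{a} h(t)\,dt \]
where $g(t)$ is an odd function and $h(t)$ is an even function.
[Graphs of an odd function $g(t)$ and an even function $h(t)$ over $-a < t < a$]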
Your solution
Answer
We have
\[ \int_{-a}^{a} g(t)\,dt = 0 \quad\text{for an odd function} \]
\[ \int_{-a}^{a} h(t)\,dt = 2\int_{0}^{a} h(t)\,dt \quad\text{for an even function} \]
(Note that neither result holds for a function which is neither even nor odd.)
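Besides the area argument, both results can be obtained by splitting the integral at $t = 0$ and substituting $t = -u$ in the left-hand part. A short sketch of that step follows; the substitution variable $u$ is introduced here purely for illustration.

```latex
\[ \int_{-a}^{0} f(t)\,dt = \int_{a}^{0} f(-u)\,(-du) = \int_{0}^{a} f(-u)\,du \]
\[ \therefore\ \int_{-a}^{a} f(t)\,dt = \int_{0}^{a} \{ f(-t) + f(t) \}\,dt
   = \begin{cases} 0 & \text{if } f(-t) = -f(t) \ \text{(odd)} \\[4pt]
                   2\displaystyle\int_{0}^{a} f(t)\,dt & \text{if } f(-t) = f(t) \ \text{(even)} \end{cases} \]
```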
2. Fourier series implications
Since a sum of even functions is itself an even function it is not unreasonable to suggest that a Fourier
series containing only cosine terms (and perhaps a constant term which can also be considered as
an even function) can only represent an even periodic function. Similarly a series of sine terms (and
no constant) can only represent an odd function. These results can readily be shown more formally
using the expressions for the Fourier coefficients an and bn .
Task
Recall that for a 2π-periodic function
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt \]
If f (t) is even, deduce whether the integrand is even or odd (or neither) and hence
evaluate bn . Repeat for the Fourier coefficients an .
Your solution
Answer
We have, if f (t) is even,
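\[ f(t)\sin nt = (\text{even}) \times (\text{odd}) = \text{odd} \]
hence
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} (\text{odd function})\,dt = 0 \]
Thus an even function has no sine terms in its Fourier series.
Also $f(t)\cos nt = (\text{even}) \times (\text{even}) = \text{even}$
\[ \therefore\ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} (\text{even function})\,dt = \frac{2}{\pi}\int_{0}^{\pi} f(t)\cos nt\,dt. \]
It should be obvious that, for an odd function $f(t)$,
\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} (\text{odd function})\,dt = 0 \]
\[ b_n = \frac{2}{\pi}\int_{0}^{\pi} f(t)\sin nt\,dt \]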
Analogous results hold for functions of any period, not necessarily 2π.
For a periodic function which is neither even nor odd we can expect at least some of both the an
and bn to be non-zero. For example consider the square wave function:
Figure 17: Square wave ($f(t)$ axis marked at 1; $t$ axis marked at $-\pi$, $\pi$ and $2\pi$)
This function is neither even nor odd and we have already seen in Section 23.2 that its Fourier series
contains a constant $\frac{1}{2}$ and sine terms.
This result could be expected because we can write
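\[ f(t) = \frac{1}{2} + g(t) \]
where $g(t)$ is as shown:
Figure 18: graph of $g(t)$ ($g(t)$ axis marked at $\frac{1}{2}$ and $-\frac{1}{2}$; $t$ axis marked at $-\pi$, $\pi$ and $2\pi$)
Clearly $g(t)$ is odd and will contain only sine terms. The Fourier series are in fact
\[ f(t) = \frac{1}{2} + \frac{2}{\pi}\left\{\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t + \ldots\right\} \]
and
\[ g(t) = \frac{2}{\pi}\left\{\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t + \ldots\right\} \]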
Task
For each of the following functions deduce whether the corresponding Fourier series
contains
(a) sine terms only or cosine terms only or both
(b) a constant term
[Graphs of seven periodic functions $y(t)$, numbered 1 to 7, each drawn against a $t$ axis marked at multiples of $\pi$; graph 4 is also marked at $-a$ and $a$]
Your solution
Answer
1. cosine terms only (plus constant).
2. cosine terms only (no constant).
3. sine terms only (no constant).
4. cosine terms only (plus constant).
5. sine terms only (no constant).
6. sine and cosine terms (plus constant).
7. cosine terms only (plus constant).
Task
Confirm the result obtained for the triangular wave, function 7 in the last Task,
by finding the Fourier series fully. The function involved is
\[ f(t) = |t|, \quad -\pi < t < \pi, \qquad f(t + 2\pi) = f(t) \]
Your solution
Answer
Since f (t) is even we can say immediately
bn = 0 n = 1, 2, 3, . . .
Also

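\[ a_n = \frac{2}{\pi}\int_{0}^{\pi} t\cos nt\,dt = \begin{cases} 0 & n \text{ even} \\ -\dfrac{4}{n^2\pi} & n \text{ odd} \end{cases} \qquad \text{(after integration by parts)} \]
Also $a_0 = \frac{2}{\pi}\int_{0}^{\pi} t\,dt = \pi$ so the Fourier series is
\[ f(t) = \frac{\pi}{2} - \frac{4}{\pi}\left\{\cos t + \frac{1}{9}\cos 3t + \frac{1}{25}\cos 5t + \ldots\right\} \]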
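These coefficients for the triangular wave can be confirmed by numerical integration. The following sketch is an illustration only; `integrate`, `a_n` and `b_n` are names chosen here, and a midpoint rule approximates the integrals.

```python
import math

def integrate(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def a_n(n):
    """Half-range formula: a_n = (2/pi) * integral of t cos(nt) over 0 < t < pi."""
    return (2 / math.pi) * integrate(lambda t: t * math.cos(n * t), 0.0, math.pi)

def b_n(n):
    """Full-range formula for b_n; the integrand |t| sin(nt) is odd."""
    return (1 / math.pi) * integrate(lambda t: abs(t) * math.sin(n * t), -math.pi, math.pi)

for n in range(0, 5):
    print(n, round(a_n(n), 6), round(b_n(n), 6) if n > 0 else None)
```

The even-$n$ cosine coefficients and all the sine coefficients come out at rounding-error level, in line with the symmetry argument.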

Convergence 23.4
Introduction
In this Section we examine, briefly, the convergence characteristics of a Fourier series. We have seen
that a Fourier series can be found for functions which are not necessarily continuous (there may be
jumps in the curve) − the only requirement that we have made is that the function be periodic. We
have seen that the more terms we take in the Fourier series the better is the approximation to the
given signal. But an obvious question to ask is what happens at the points of discontinuity? What
does the Fourier series converge to at these points? It must converge to something (finite) since a
Fourier series is a sum of very smooth continuous functions. In this Section we give the answer to
this question.

Prerequisites
Before starting this Section you should . . .
• be familiar with the limit process as applied to functions
Learning Outcomes
On completion you should be able to . . .
• determine what a Fourier series converges to at each point, including at a point of discontinuity
• use the convergence property of Fourier series to obtain series for the number π
Workbook 23: Fourier series
1. Convergence of a Fourier series
We have now shown how to obtain a Fourier series for periodic functions. We have suggested that
we would expect to be able to approximate such functions by using a few terms of the Fourier series.
The detailed question of the convergence or otherwise of Fourier series has not been discussed.
The reason for this is that the great majority of functions likely to be encountered in practice have
Fourier series that do indeed converge and can therefore be safely used as approximations.
The precise conditions that have to be fulfilled for a Fourier series to converge are known as Dirichlet
conditions after the German mathematician who investigated the matter. The three conditions are
listed in the following Key Point.
Key Point 6
The Dirichlet conditions for the convergence of a Fourier series of a periodic function f (t) are:
1. f (t) must have only a finite number of finite discontinuities, within one period
2. f (t) must have a finite number of maxima and minima over one period
3. the integral $\int_{-T/2}^{T/2} |f(t)|\,dt$ must be finite.
It follows, for example, that if $f(t)$ is defined over $(-\pi, \pi)$ as one of the following functions $t^3$ or
$1/(t - 4)$ or $3t + 2$ and $f(t + 2\pi) = f(t)$, then $f(t)$ can indeed be represented as a Fourier series as
each function satisfies the Dirichlet conditions.
On the other hand, if, over $(-\pi, \pi)$, $f(t)$ is $\frac{1}{t}$ or $\frac{1}{t-2}$ or $\tan t$ then $f(t)$ cannot be expanded in a
Fourier series because each of these functions has an infinite discontinuity within $(-\pi, \pi)$.
If the Dirichlet conditions are satisfied at a point $t = t_0$ where $f(t)$ is continuous then, as we would
expect, the Fourier series at $t_0$ given by
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty}\left( a_n\cos\frac{2n\pi t_0}{T} + b_n\sin\frac{2n\pi t_0}{T}\right) \]
converges to the function value $f(t_0)$.
At a point, say $t = t_1$, at which $f(t)$ has a discontinuity then the series
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty}\left( a_n\cos\frac{2n\pi t_1}{T} + b_n\sin\frac{2n\pi t_1}{T}\right) \]
converges to $\frac{1}{2}\{ f(t_{1-}) + f(t_{1+}) \}$,
where $f(t_{1-})$ is the limit of $f(t)$ as $t$ approaches $t_1$ from the left and $f(t_{1+})$ is the limit as $t$ approaches
$t_1$ from the right (Figure 19).
Figure 19: graph of $f(t)$ against $t$, with the point $t = t_1$ marked on the $t$ axis
Key Point 7
If Dirichlet conditions are satisfied then at a point of continuity $t = t_0$
\[ f(t_0) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left( a_n\cos\frac{2n\pi t_0}{T} + b_n\sin\frac{2n\pi t_0}{T}\right) \]
whereas at a point of discontinuity $t = t_1$ the Fourier series converges to the average of the two
limiting values
\[ \frac{1}{2}\{ f(t_{1-}) + f(t_{1+}) \} = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left( a_n\cos\frac{2n\pi t_1}{T} + b_n\sin\frac{2n\pi t_1}{T}\right) \]
Example 2
Suppose we consider the square wave
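Figure 20: square wave ($f(t)$ axis marked at 1; $t$ axis marked at $-\pi$, $\pi$ and $2\pi$)
Here $f(t)$ has finite discontinuities at $-\pi$, $0$ and $\pi$. The Fourier series of $f(t)$ is (see Section 23.3,
subsection 2)
\[ \frac{1}{2} + \frac{2}{\pi}\left\{\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t + \ldots\right\}. \]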
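At $t = \frac{\pi}{2}$, for example, where $f(t)$ is continuous, the Fourier series converges to $f\left(\frac{\pi}{2}\right) = 1$. On the
other hand at $t = \pi$ the Fourier series clearly has the value $\frac{1}{2}$ (since all the sine terms are zero here).
This value agrees with the average of the two limiting values of $f(t)$ at $t = \pi$: $\frac{1}{2}(1 + 0) = \frac{1}{2}$. If
we actually put $t = \frac{\pi}{2}$ in the Fourier series we obtain
\[ \frac{1}{2} + \frac{2}{\pi}\left\{\sin\frac{\pi}{2} + \frac{1}{3}\sin\frac{3\pi}{2} + \frac{1}{5}\sin\frac{5\pi}{2} + \ldots\right\} = \frac{1}{2} + \frac{2}{\pi}\left\{1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \ldots\right\} \]
Since the series converges, as we have seen, to $f\left(\frac{\pi}{2}\right) = 1$, we obtain the interesting result
\[ \frac{1}{2} + \frac{2}{\pi}\left\{1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \ldots\right\} = 1 \quad\text{or}\quad 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \ldots = \frac{\pi}{4} \]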
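The behaviour at a point of continuity and at a jump can be observed numerically. This is an illustrative sketch, with function names chosen here: it sums the square-wave series at $t = \pi/2$ and $t = \pi$, and sums the alternating series for $\pi/4$ directly.

```python
import math

def square_wave_partial_sum(t, terms):
    """1/2 + (2/pi) * (sin t + sin 3t / 3 + ...), using `terms` sine terms."""
    total = 0.5
    for k in range(terms):
        n = 2 * k + 1
        total += (2 / math.pi) * math.sin(n * t) / n
    return total

def leibniz(terms):
    """Partial sum of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))

print(square_wave_partial_sum(math.pi / 2, 1000))  # approaches f(pi/2) = 1
print(square_wave_partial_sum(math.pi, 1000))      # stays at the mid-value 1/2
print(leibniz(100000), math.pi / 4)
```

Convergence of the alternating series is slow (the error after $N$ terms is of order $1/N$), so it illustrates the result rather than giving a practical way to compute $\pi$.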
Task
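The function
\[ f(t) = \begin{cases} 0 & -\pi < t < 0 \\ t^2 & 0 < t < \pi \end{cases} \qquad f(t + 2\pi) = f(t) \]
[Graph of $f(t)$ against $t$, with the $t$ axis marked at $-\pi$, $\pi$ and $2\pi$]
has Fourier series (see Exercise 4 at the end of Section 23.2)
\[ \frac{\pi^2}{6} - 2\left\{\cos t - \frac{\cos 2t}{4} + \frac{\cos 3t}{9} - \ldots\right\} + \left(\pi - \frac{4}{\pi}\right)\sin t - \frac{\pi}{2}\sin 2t + \left(\frac{\pi}{3} - \frac{4}{27\pi}\right)\sin 3t - \frac{\pi}{4}\sin 4t + \ldots \]
By using a suitable value of $t$ show that
\[ 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \ldots = \frac{\pi^2}{6} \]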
First decide on the appropriate value of t to use:
Your solution
Answer
Looking at the Fourier series, the numerical series we seek is present in the cosine terms so we need
to remove the sine terms. This we can do by selecting t = 0 or t = π. The choice t = 0 will make
the cosine terms become:
\[ 1 - \frac{1}{4} + \frac{1}{9} - \ldots \]
which is not what we seek. Hence we put t = π.
Now put t = π in the series and decide what the Fourier series will converge to at this value. Hence
complete the question:
Your solution
Answer
At t = π the Fourier series is
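\[ \frac{\pi^2}{6} - 2\left\{\cos\pi - \frac{\cos 2\pi}{4} + \frac{\cos 3\pi}{9} - \ldots\right\} \]
\[ = \frac{\pi^2}{6} - 2\left\{-1 - \frac{1}{4} - \frac{1}{9} - \frac{1}{16} - \ldots\right\} = \frac{\pi^2}{6} + 2\left\{1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \ldots\right\} \]
At $t = \pi$ the Fourier series will converge to
\[ \frac{1}{2}\left(\pi^2 + 0\right) = \frac{\pi^2}{2} \quad\text{(the average of the left and right hand limits)} \]
So
\[ \frac{\pi^2}{6} + 2\left\{1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \ldots\right\} = \frac{\pi^2}{2} \quad\therefore\quad 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \ldots = \frac{1}{2}\left(\frac{\pi^2}{2} - \frac{\pi^2}{6}\right) = \frac{\pi^2}{6} \]
Note that in the last Task if we substitute $t = 0$ in the Fourier series (which converges to $f(0) = 0$)
we obtain another infinite series but with alternating signs:
\[ \frac{\pi^2}{6} - 2\left\{1 - \frac{1}{4} + \frac{1}{9} - \ldots\right\} = 0 \quad\text{or}\quad 1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \ldots = \frac{\pi^2}{12} \]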
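Both numerical series are easy to test by direct summation. A minimal sketch follows; the variable names and the cut-off `N` are arbitrary choices made for illustration.

```python
import math

N = 100000  # number of terms kept in each partial sum

# 1 + 1/4 + 1/9 + 1/16 + ...
same_sign = sum(1 / n ** 2 for n in range(1, N + 1))

# 1 - 1/4 + 1/9 - 1/16 + ...
alternating = sum((-1) ** (n + 1) / n ** 2 for n in range(1, N + 1))

print(same_sign, math.pi ** 2 / 6)
print(alternating, math.pi ** 2 / 12)
```

The alternating series settles much faster than the same-sign series, whose truncation error after $N$ terms is roughly $1/N$.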
Exercises
1. Obtain the Fourier series of
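\[ f(t) = |t|, \quad -\pi \le t \le \pi, \qquad f(t + 2\pi) = f(t) \]
By putting $t = 0$ show that
\[ \sum_{n=1}^{\infty}\frac{1}{(2n-1)^2} = \frac{\pi^2}{8} \]
2. (a) Obtain the Fourier series of the $2\pi$ periodic function
\[ f(t) = \frac{t^2}{4}, \quad -\pi \le t \le \pi \]
and use it to obtain the following identities:
(i) $1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}$ (ii) $1 - \frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \cdots = \frac{\pi^2}{12}$
(b) Show that $1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots = \frac{\pi^2}{8}$
3. Obtain the Fourier series of the $2\pi$ periodic function
\[ f(t) = t, \quad -\pi \le t \le \pi \]
Use the series to show that
\[ 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{\pi}{4} \]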
Answers
1. $\frac{\pi}{2} + \sum_{n=1}^{\infty}\frac{(-4)}{(2n-1)^2\pi}\cos[(2n-1)t]$
2. (a) $\frac{\pi^2}{12} + \sum_{n=1}^{\infty}\frac{\cos(n\pi)}{n^2}\cos nt$ (i) Put $t = \pi$ (ii) Put $t = 0$
(b) Add the two series from (a).
3. $-2\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sin nt$

Half-Range Series 23.5
Introduction
In this Section we address the following problem:
Can we find a Fourier series expansion of a function defined over a finite interval?
Of course we recognise that such a function could not be periodic (as periodicity demands an infinite
interval). The answer to this question is yes but we must first convert the given non-periodic function
into a periodic function. There are many ways of doing this. We shall concentrate on the most useful
extension to produce a so-called half-range Fourier series.
Prerequisites
Before starting this Section you should . . .
• be familiar with odd and even functions and their properties
Learning Outcomes
On completion you should be able to . . .
• choose to expand a non-periodic function either as a series of sines or as a series of cosines
1. Half-range Fourier series

So far we have shown how to represent given periodic functions by Fourier series. We now consider a
slight variation on this theme which will be useful in Workbook 25 on solving Partial Differential Equations.
Suppose that instead of specifying a periodic function we begin with a function f (t) defined only
over a limited range of values of t, say 0 < t < π. Suppose further that we wish to represent this
function, over 0 < t < π, by a Fourier series. (This situation may seem a little artificial at this point,
but this is precisely the situation that will arise in solving differential equations.)
To be specific, suppose we define $f(t) = t^2$, $0 < t < \pi$.
Figure 21: graph of $f(t) = t^2$ for $0 < t < \pi$, rising to $\pi^2$ at $t = \pi$
We shall consider the interval 0 < t < π to be half a period of a 2π periodic function. We must
therefore define f (t) for −π < t < 0 to complete the specification.
Task
Complete the definition of the above function $f(t) = t^2$, $0 < t < \pi$
by defining it over −π < t < 0 such that the resulting functions will have a Fourier
series containing
(a) only cosine terms, (b) only sine terms, (c) both cosine and sine terms.
Your solution
Answer
(a) We must complete the definition so as to have an even periodic function:
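$f(t) = t^2$, $-\pi < t < 0$
[Graph of the even extension $f_1(t)$, with the $t$ axis marked at $-\pi$, $\pi$ and $2\pi$]
(b) We must complete the definition so as to have an odd periodic function:
$f(t) = -t^2$, $-\pi < t < 0$
[Graph of the odd extension $f_2(t)$, with the $t$ axis marked at $-\pi$, $\pi$ and $2\pi$]
(c) We may define $f(t)$ in any way we please (other than (a) and (b) above). For example we might
define $f(t) = 0$ over $-\pi < t < 0$:
[Graph of $f_3(t)$, with the $t$ axis marked at $-\pi$, $\pi$ and $2\pi$]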
The point is that all three periodic functions $f_1(t)$, $f_2(t)$, $f_3(t)$ will give rise to a different Fourier
series but all will represent the function $f(t) = t^2$ over $0 < t < \pi$. Fourier series obtained by
extending functions in this sort of way are often referred to as half-range series.
Normally, in applications, we require either a Fourier Cosine series (so we would complete a definition
as in (a) above to obtain an even periodic function) or a Fourier Sine series (for which, as in (b)
above, we need an odd periodic function).
The above considerations apply equally well for a function defined over any interval.
Example 3
Obtain the half-range Fourier Sine series to represent $f(t) = t^2$, $0 < t < 3$.
Solution
We first extend $f(t)$ as an odd periodic function $F(t)$ of period 6: $F(t) = -t^2$, $-3 < t < 0$
Figure 22: graph of the odd periodic extension $F(t)$ ($t$ axis marked at 3)
We now evaluate the Fourier series of F (t) by standard techniques but take advantage of the
symmetry and put an = 0, n = 0, 1, 2, . . ..
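Using the results for the Fourier Sine coefficients for period $T$ from Section 23.2, subsection 5,
\[ b_n = \frac{2}{T}\int_{-T/2}^{T/2} F(t)\sin\frac{2n\pi t}{T}\,dt, \]
we put $T = 6$ and, since the integrand is even (a product of 2 odd functions), we can write
\[ b_n = \frac{2}{3}\int_{0}^{3} F(t)\sin\frac{2n\pi t}{6}\,dt = \frac{2}{3}\int_{0}^{3} t^2\sin\frac{n\pi t}{3}\,dt. \]
(Note that we always integrate over the originally defined range, in this case $0 < t < 3$.)
We now have to integrate by parts (twice!)
\[ b_n = \frac{2}{3}\left\{\left[-\frac{3t^2}{n\pi}\cos\frac{n\pi t}{3}\right]_0^3 + 2\,\frac{3}{n\pi}\int_0^3 t\cos\frac{n\pi t}{3}\,dt\right\} \]
\[ = \frac{2}{3}\left\{-\frac{27}{n\pi}\cos n\pi + \frac{6}{n\pi}\left[\frac{3}{n\pi}\,t\sin\frac{n\pi t}{3}\right]_0^3 - \frac{6}{n\pi}\,\frac{3}{n\pi}\int_0^3\sin\frac{n\pi t}{3}\,dt\right\} \]
\[ = \frac{2}{3}\left\{-\frac{27}{n\pi}\cos n\pi - \frac{18}{n^2\pi^2}\left[-\frac{3}{n\pi}\cos\frac{n\pi t}{3}\right]_0^3\right\} = \frac{2}{3}\left\{-\frac{27}{n\pi}\cos n\pi + \frac{54}{n^3\pi^3}(\cos n\pi - 1)\right\} \]
\[ = \begin{cases} -\dfrac{18}{n\pi} & n = 2, 4, 6, \ldots \\ \dfrac{18}{n\pi} - \dfrac{72}{n^3\pi^3} & n = 1, 3, 5, \ldots \end{cases} \]
So the required Fourier Sine series is
\[ F(t) = 18\left(\frac{1}{\pi} - \frac{4}{\pi^3}\right)\sin\frac{\pi t}{3} - \frac{18}{2\pi}\sin\frac{2\pi t}{3} + 18\left(\frac{1}{3\pi} - \frac{4}{27\pi^3}\right)\sin(\pi t) - \ldots \]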
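The closed form for $b_n$ in Example 3 can be compared with a direct numerical evaluation of the half-range integral. The sketch below is illustrative only; `integrate`, `b_numeric` and `b_closed` are names introduced here, and the midpoint rule is an assumed stand-in for the exact integration by parts.

```python
import math

def integrate(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def b_numeric(n):
    """b_n = (2/3) * integral of t^2 sin(n pi t / 3) over the original range 0 < t < 3."""
    return (2 / 3) * integrate(lambda t: t ** 2 * math.sin(n * math.pi * t / 3), 0.0, 3.0)

def b_closed(n):
    """Closed form obtained by integrating by parts twice."""
    if n % 2 == 0:
        return -18 / (n * math.pi)
    return 18 / (n * math.pi) - 72 / (n * math.pi) ** 3

for n in range(1, 6):
    print(n, round(b_numeric(n), 6), round(b_closed(n), 6))
```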
Task
Obtain a half-range Fourier Cosine series to represent the function
$f(t) = 4 - t$, $0 < t < 4$.
[Graph of $f(t)$, with 4 marked on the $f(t)$ axis and on the $t$ axis]
First complete the definition to obtain an even periodic function F (t) of period 8. Sketch F (t):
Your solution
Answer
[Graph of $F(t)$, with 4 marked on the vertical axis and $-4$ and $4$ on the $t$ axis]
Now formulate the integral from which the Fourier coefficients an can be calculated:
Your solution
Answer
We have with $T = 8$
\[ a_n = \frac{2}{8}\int_{-4}^{4} F(t)\cos\frac{2n\pi t}{8}\,dt \]
Utilising the fact that the integrand here is even we get
\[ a_n = \frac{1}{2}\int_{0}^{4} (4 - t)\cos\frac{n\pi t}{4}\,dt \]
Now integrate by parts to obtain an and also obtain a0 :

Your solution
Answer
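Using integration by parts we obtain for $n = 1, 2, 3, \ldots$
\[ a_n = \frac{1}{2}\left\{\left[(4 - t)\,\frac{4}{n\pi}\sin\frac{n\pi t}{4}\right]_0^4 + \frac{4}{n\pi}\int_0^4\sin\frac{n\pi t}{4}\,dt\right\} \]
\[ = \frac{1}{2}\,\frac{4}{n\pi}\left[-\frac{4}{n\pi}\cos\frac{n\pi t}{4}\right]_0^4 = \frac{8}{n^2\pi^2}\left[-\cos(n\pi) + 1\right] \]
i.e.
\[ a_n = \begin{cases} 0 & n = 2, 4, 6, \ldots \\ \dfrac{16}{n^2\pi^2} & n = 1, 3, 5, \ldots \end{cases} \]
Also $a_0 = \frac{1}{2}\int_0^4 (4 - t)\,dt = 4$. So the constant term is $\frac{a_0}{2} = 2$.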
Now write down the required Fourier series:

Your solution
Answer
We get
\[ 2 + \frac{16}{\pi^2}\left\{\cos\frac{\pi t}{4} + \frac{1}{9}\cos\frac{3\pi t}{4} + \frac{1}{25}\cos\frac{5\pi t}{4} + \ldots\right\} \]
Note that the form of the Fourier series (a constant of 2 together with odd harmonic cosine terms)
could be predicted if, in the sketch of F (t), we imagine raising the t-axis by 2 units i.e. writing
F (t) = 2 + G(t)
Figure 23: graph of $G(t)$ ($G(t)$ axis marked at 2 and $-2$; $t$ axis marked at $-4$ and $4$)
Clearly G(t) possesses half-period symmetry
G(t + 4) = −G(t)
and hence its Fourier series must contain only odd harmonics.
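The half-range cosine series for $4 - t$ and the half-period symmetry of $G(t)$ can both be checked with a truncated sum. This is a sketch under stated assumptions: the function name `cosine_series` and the number of harmonics are choices made here for illustration.

```python
import math

def cosine_series(t, terms=500):
    """2 + (16/pi^2) * sum over odd n of cos(n pi t / 4) / n^2, using `terms` odd harmonics."""
    total = 2.0
    for k in range(terms):
        n = 2 * k + 1
        total += (16 / math.pi ** 2) * math.cos(n * math.pi * t / 4) / n ** 2
    return total

# On 0 < t < 4 the series should reproduce 4 - t
for t in (0.0, 1.0, 3.0):
    print(t, 4 - t, round(cosine_series(t), 4))

# G(t) = F(t) - 2 should satisfy G(t + 4) = -G(t)
t = 1.3
print(cosine_series(t + 4) - 2, -(cosine_series(t) - 2))
```

The symmetry holds term by term, because $\cos\left(\frac{n\pi (t+4)}{4}\right) = -\cos\frac{n\pi t}{4}$ whenever $n$ is odd.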
Exercises
Obtain the half-range Fourier series specified for each of the following functions:
1. $f(t) = 1$, $0 \le t \le \pi$ (sine series)
2. $f(t) = t$, $0 \le t \le 1$ (sine series)
3. (a) $f(t) = e^{2t}$, $0 \le t \le 1$ (cosine series)
(b) $f(t) = e^{2t}$, $0 \le t \le \pi$ (sine series)
4. (a) $f(t) = \sin t$, $0 \le t \le \pi$ (cosine series)
(b) $f(t) = \sin t$, $0 \le t \le \pi$ (sine series)
Answers

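1. $\frac{4}{\pi}\left\{\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t + \cdots\right\}$
2. $\frac{2}{\pi}\left\{\sin \pi t - \frac{1}{2}\sin 2\pi t + \frac{1}{3}\sin 3\pi t - \cdots\right\}$
3. (a) $\frac{e^2 - 1}{2} + \sum_{n=1}^{\infty}\frac{4}{4 + n^2\pi^2}\{e^2\cos(n\pi) - 1\}\cos n\pi t$
(b) $\sum_{n=1}^{\infty}\frac{2n\pi}{4 + n^2\pi^2}\{1 - e^2\cos(n\pi)\}\sin n\pi t$
4. (a) $\frac{2}{\pi} + \sum_{n=2}^{\infty}\frac{1}{\pi}\left\{\frac{1}{1-n}\left(1 - \cos(1-n)\pi\right) + \frac{1}{1+n}\left(1 - \cos(1+n)\pi\right)\right\}\cos nt$
(b) $\sin t$ itself (!)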

The Complex Form 23.6

Introduction
In this Section we show how a Fourier series can be expressed more concisely if we introduce the
complex number $i$ where $i^2 = -1$. By utilising the Euler relation:
\[ e^{i\theta} \equiv \cos\theta + i\sin\theta \]
we can replace the trigonometric functions by complex exponential functions. By also combining the
Fourier coefficients $a_n$ and $b_n$ into a complex coefficient $c_n$ through
\[ c_n = \frac{1}{2}(a_n - i b_n) \]
we find that, for a given periodic signal, both sets of constants can be found in one operation.
We also obtain Parseval’s theorem which has important applications in electrical engineering.
The complex formulation of a Fourier series is an important precursor of the Fourier transform which
attempts to Fourier analyse non-periodic functions.
Prerequisites
Before starting this Section you should . . .
• be competent working with the complex numbers
• be familiar with the relation between the exponential function and the trigonometric functions
Learning Outcomes
On completion you should be able to . . .
• express a periodic function in terms of its Fourier series in complex form
• understand Parseval’s theorem
1. Complex exponential form of a Fourier series
So far we have discussed the trigonometric form of a Fourier series i.e. we have represented functions
of period T in the terms of sinusoids, and possibly a constant term, using
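\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left( a_n\cos\frac{2n\pi t}{T} + b_n\sin\frac{2n\pi t}{T}\right). \]
If we use the angular frequency
\[ \omega_0 = \frac{2\pi}{T} \]
we obtain the more concise form
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}( a_n\cos n\omega_0 t + b_n\sin n\omega_0 t). \]
We have seen that the Fourier coefficients are calculated using the following integrals:
\[ a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos n\omega_0 t\,dt, \quad n = 0, 1, 2, \ldots \qquad (1) \]
\[ b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin n\omega_0 t\,dt, \quad n = 1, 2, \ldots \qquad (2) \]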
An alternative, more concise form, of a Fourier series is available using complex quantities. This
form is quite widely used by engineers, for example in Circuit Theory and Control Theory, and leads
naturally into the Fourier Transform which is the subject of Workbook 24.
2. Revision of the exponential form of a complex number

Recall that a complex number in Cartesian form which is written as
\[ z = a + ib, \]
where $a$ and $b$ are real numbers and $i^2 = -1$, can be written in polar form as
\[ z = r(\cos\theta + i\sin\theta) \]
where $r = |z| = \sqrt{a^2 + b^2}$ and $\theta$, the argument or phase of $z$, is such that
\[ a = r\cos\theta \qquad b = r\sin\theta. \]
A more concise version of the polar form of $z$ can be obtained by defining a complex exponential
quantity $e^{i\theta}$ by Euler’s relation
\[ e^{i\theta} \equiv \cos\theta + i\sin\theta \]
The polar angle $\theta$ is normally expressed in radians. Replacing $i$ by $-i$ we obtain the alternative
form
\[ e^{-i\theta} \equiv \cos\theta - i\sin\theta \]
Task
Write down in $\cos\theta \pm i\sin\theta$ form and also in Cartesian form (a) $e^{i\pi/6}$ (b) $e^{-i\pi/6}$.
Use Euler’s relation:

Your solution
Answer
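We have, by definition,
(a) $e^{i\pi/6} = \cos\frac{\pi}{6} + i\sin\frac{\pi}{6} = \frac{\sqrt{3}}{2} + \frac{1}{2}i$ (b) $e^{-i\pi/6} = \cos\frac{\pi}{6} - i\sin\frac{\pi}{6} = \frac{\sqrt{3}}{2} - \frac{1}{2}i$
Task
Write down (a) $\cos\frac{\pi}{6}$ (b) $\sin\frac{\pi}{6}$ in terms of $e^{i\pi/6}$ and $e^{-i\pi/6}$.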
Your solution
Answer
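We have, adding the two results from the previous task,
\[ e^{i\pi/6} + e^{-i\pi/6} = 2\cos\frac{\pi}{6} \quad\text{or}\quad \cos\frac{\pi}{6} = \frac{1}{2}\left(e^{i\pi/6} + e^{-i\pi/6}\right) \]
Similarly, subtracting the two results,
\[ e^{i\pi/6} - e^{-i\pi/6} = 2i\sin\frac{\pi}{6} \quad\text{or}\quad \sin\frac{\pi}{6} = \frac{1}{2i}\left(e^{i\pi/6} - e^{-i\pi/6}\right) \]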
(Don’t forget the factor i in this latter case.)
Clearly, similar calculations could be carried out for any angle θ. The general results are summarised
in the following Key Point.
Key Point 8
Euler’s Relations
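\[ e^{i\theta} \equiv \cos\theta + i\sin\theta, \qquad e^{-i\theta} \equiv \cos\theta - i\sin\theta \]
\[ \cos\theta \equiv \frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right) \qquad \sin\theta \equiv \frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right) \]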
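Euler's relations are easy to confirm with complex arithmetic. The sketch below uses Python's standard `cmath` module at $\theta = \pi/6$, the angle from the preceding Tasks; the variable names are chosen here for illustration.

```python
import cmath
import math

theta = math.pi / 6
z_plus = cmath.exp(1j * theta)    # e^{i theta}
z_minus = cmath.exp(-1j * theta)  # e^{-i theta}

# Recover cos and sin from the two exponentials
cos_theta = (z_plus + z_minus) / 2
sin_theta = (z_plus - z_minus) / (2j)

print(z_plus, z_minus)
print(cos_theta.real, math.cos(theta))
print(sin_theta.real, math.sin(theta))
```

Both recovered values are complex numbers whose imaginary parts vanish to rounding error, as the identities require.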
Using these results we can redraft an expression of the form

an cos nθ + bn sin nθ
in terms of complex exponentials.
(This expression, with θ = ω0 t, is of course the nth harmonic of a trigonometric Fourier series.)
Task
Using the results from the Key Point 8 (with nθ instead of θ) rewrite
an cos nθ + bn sin nθ
in complex exponential form.
First substitute for cos nθ and sin nθ with exponential expressions using Key Point 8:
Your solution
Answer
We have
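\[ a_n\cos n\theta = \frac{a_n}{2}\left(e^{in\theta} + e^{-in\theta}\right) \qquad b_n\sin n\theta = \frac{b_n}{2i}\left(e^{in\theta} - e^{-in\theta}\right) \]
so
\[ a_n\cos n\theta + b_n\sin n\theta = \frac{a_n}{2}\left(e^{in\theta} + e^{-in\theta}\right) + \frac{b_n}{2i}\left(e^{in\theta} - e^{-in\theta}\right) \]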
Now collect the terms in $e^{in\theta}$ and in $e^{-in\theta}$ and use the fact that $\frac{1}{i} = -i$:
Your solution
Answer
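We get
\[ \frac{1}{2}\left(a_n + \frac{b_n}{i}\right)e^{in\theta} + \frac{1}{2}\left(a_n - \frac{b_n}{i}\right)e^{-in\theta} \]
or, since $\frac{1}{i} = \frac{i}{i^2} = -i$,
\[ \frac{1}{2}(a_n - ib_n)e^{in\theta} + \frac{1}{2}(a_n + ib_n)e^{-in\theta}. \]
Now write this expression in more concise form by defining
\[ c_n = \frac{1}{2}(a_n - ib_n) \quad\text{which has complex conjugate}\quad c_n^* = \frac{1}{2}(a_n + ib_n). \]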
Write the concise complex exponential expression for an cos nθ + bn sin nθ:
Your solution
Answer
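\[ a_n\cos n\theta + b_n\sin n\theta = c_n e^{in\theta} + c_n^* e^{-in\theta} \]
Clearly, we can now rewrite the trigonometric Fourier series
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos n\omega_0 t + b_n\sin n\omega_0 t) \quad\text{as}\quad \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(c_n e^{in\omega_0 t} + c_n^* e^{-in\omega_0 t}\right) \qquad (3) \]
A neater, and particularly concise, form of this expression can be obtained as follows:
Firstly write $\frac{a_0}{2} = c_0$ (which is consistent with the general definition of $c_n$ since $b_0 = 0$).
The second term in the summation
\[ \sum_{n=1}^{\infty} c_n^* e^{-in\omega_0 t} = c_1^* e^{-i\omega_0 t} + c_2^* e^{-2i\omega_0 t} + \ldots \]
can be written, if we define $c_{-n} = c_n^* = \frac{1}{2}(a_n + ib_n)$, as
\[ c_{-1}e^{-i\omega_0 t} + c_{-2}e^{-2i\omega_0 t} + c_{-3}e^{-3i\omega_0 t} + \ldots = \sum_{n=-1}^{-\infty} c_n e^{in\omega_0 t} \]
Hence (3) can be written
\[ c_0 + \sum_{n=1}^{\infty} c_n e^{in\omega_0 t} + \sum_{n=-1}^{-\infty} c_n e^{in\omega_0 t} \]
or in the very concise form
\[ \sum_{n=-\infty}^{\infty} c_n e^{in\omega_0 t}. \]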
The complex Fourier coefficients cn can be readily obtained as follows using (1) and (2) for an , bn .
Firstly
\[ c_0 = \frac{a_0}{2} = \frac{1}{T}\int_{-T/2}^{T/2} f(t)\,dt \qquad (4) \]
For $n = 1, 2, 3, \ldots$ we have
\[ c_n = \frac{1}{2}(a_n - ib_n) = \frac{1}{T}\int_{-T/2}^{T/2} f(t)(\cos n\omega_0 t - i\sin n\omega_0 t)\,dt \quad\text{i.e.}\quad c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-in\omega_0 t}\,dt \qquad (5) \]
Also for $n = 1, 2, 3, \ldots$ we have
\[ c_{-n} = c_n^* = \frac{1}{2}(a_n + ib_n) = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{in\omega_0 t}\,dt \]
This last expression is equivalent to stating that for $n = -1, -2, -3, \ldots$
\[ c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-in\omega_0 t}\,dt \qquad (6) \]
The three equations (4), (5), (6) can thus all be contained in the one expression
\[ c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-in\omega_0 t}\,dt \quad\text{for } n = 0, \pm 1, \pm 2, \pm 3, \ldots \]
The results of this discussion are summarised in the following Key Point.
Key Point 9
Fourier Series in Complex Form
A function $f(t)$ of period $T$ has a complex Fourier series
$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{in\omega_0 t} \quad\text{where}\quad c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-in\omega_0 t}\,dt$$
For the special case $T = 2\pi$, so that $\omega_0 = 1$, these formulae become particularly simple:
$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{int} \qquad c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-int}\,dt.$$
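The coefficient formula in Key Point 9 can be checked numerically. The following is a minimal Python sketch and is not part of the original Workbook: the function name `complex_fourier_coefficient`, the choice of the midpoint rule and the test signal are all illustrative assumptions.

```python
import cmath
import math


def complex_fourier_coefficient(f, n, period, samples=20000):
    """Approximate c_n = (1/T) * integral of f(t) e^{-i n w0 t} over [-T/2, T/2]
    using the midpoint rule (an assumed, simple quadrature choice)."""
    w0 = 2 * math.pi / period
    dt = period / samples
    total = 0j
    for k in range(samples):
        t = -period / 2 + (k + 0.5) * dt
        total += f(t) * cmath.exp(-1j * n * w0 * t)
    return total * dt / period


# Hypothetical test signal of period 2*pi: a0/2 = 3, a1 = 2, b2 = 4.
def signal(t):
    return 3 + 2 * math.cos(t) + 4 * math.sin(2 * t)


c0 = complex_fourier_coefficient(signal, 0, 2 * math.pi)
c1 = complex_fourier_coefficient(signal, 1, 2 * math.pi)
c2 = complex_fourier_coefficient(signal, 2, 2 * math.pi)
c_minus2 = complex_fourier_coefficient(signal, -2, 2 * math.pi)
print(c0, c1, c2, c_minus2)
```

For this signal the relations $c_0 = \frac{a_0}{2}$ and $c_n = \frac{1}{2}(a_n - ib_n)$ predict $c_0 = 3$, $c_1 = 1$, $c_2 = -2i$ and $c_{-2} = c_2^* = 2i$, which the computed values should reproduce to within rounding error.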
3. Properties of the complex Fourier coefficients
Using properties of the trigonometric Fourier coefficients an , bn we can readily deduce the following
results for the cn coefficients:
1. $c_0 = \frac{a_0}{2}$ is always real.
2. Suppose the periodic function f (t) is even so that all bn are zero. Then, since in the complex
form the bn arise as the imaginary part of cn , it follows that for f (t) even the coefficients cn
(n = ±1, ±2, . . .) are wholly real.
Task
If f (t) is odd, what can you deduce about the Fourier coefficients cn ?
Your solution
Answer
Since, for an odd periodic function the Fourier coefficients an (which constitute the real part of cn )
are zero, then in this case the complex coefficients cn are wholly imaginary.
3. Since
$$c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-in\omega_0 t}\,dt$$
then if $f(t)$ is even, $c_n$ will be real, and we have two possible methods for evaluating $c_n$:

(a) Evaluate the integral above as it stands, i.e. over the full range $\left(-\frac{T}{2}, \frac{T}{2}\right)$. Note carefully that the second factor in the integrand, $e^{-in\omega_0 t}$, is neither an even nor an odd function so the integrand itself is
(even function) × (neither even nor odd function) = neither even nor odd function.
Thus we cannot write $c_n = \frac{2}{T}\int_0^{T/2} f(t)e^{-in\omega_0 t}\,dt$.
(b) Put $e^{-in\omega_0 t} = \cos n\omega_0 t - i\sin n\omega_0 t$ so
$$f(t)e^{-in\omega_0 t} = f(t)\cos n\omega_0 t - if(t)\sin n\omega_0 t = (\text{even})(\text{even}) - i(\text{even})(\text{odd}) = (\text{even}) - i(\text{odd}).$$
Hence $c_n = \frac{2}{T}\int_0^{T/2} f(t)\cos n\omega_0 t\,dt = \frac{a_n}{2}$.
4. If $f\left(t + \frac{T}{2}\right) = -f(t)$ then of course only odd harmonic coefficients $c_n$ ($n = \pm 1, \pm 3, \pm 5, \dots$) will arise in the complex Fourier series just as with trigonometric series.
Example 4
Find the complex Fourier series of the saw-tooth wave shown in Figure 24:
[Figure 24: graph of the saw-tooth wave f(t) against t, with −T, T and 2T marked on the t axis]
Solution
We have
$$f(t) = \frac{At}{T} \quad 0 < t < T, \qquad f(t + T) = f(t)$$
The period is $T$ in this case so $\omega_0 = \frac{2\pi}{T}$.
Looking at the graph of f (t) we can say immediately
(a) the Fourier series will contain a constant term c0

(b) if we imagine shifting the horizontal axis up to $\frac{A}{2}$ the signal can be written $f(t) = \frac{A}{2} + g(t)$, where $g(t)$ is an odd function with complex Fourier coefficients that are purely imaginary.
Hence we expect the required complex Fourier series of $f(t)$ to contain a constant term $\frac{A}{2}$ and complex exponential terms with purely imaginary coefficients. We have, from the general theory, and using $0 < t < T$ as the basic period for integrating,
$$c_n = \frac{1}{T}\int_0^T \frac{At}{T}e^{-in\omega_0 t}\,dt = \frac{A}{T^2}\int_0^T te^{-in\omega_0 t}\,dt$$
We can evaluate the integral using parts:
$$\int_0^T te^{-in\omega_0 t}\,dt = \left[\frac{te^{-in\omega_0 t}}{(-in\omega_0)}\right]_0^T + \frac{1}{in\omega_0}\int_0^T e^{-in\omega_0 t}\,dt = \frac{Te^{-in\omega_0 T}}{(-in\omega_0)} - \frac{1}{(in\omega_0)^2}\left[e^{-in\omega_0 t}\right]_0^T$$
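But $\omega_0 = \frac{2\pi}{T}$ so
$$e^{-in\omega_0 T} = e^{-in2\pi} = \cos 2n\pi - i\sin 2n\pi = 1 - 0i = 1$$
Hence the integral becomes
$$\frac{T}{-in\omega_0} - \frac{1}{(in\omega_0)^2}\left(e^{-in\omega_0 T} - 1\right)$$
Hence
$$c_n = \frac{A}{T^2}\cdot\frac{T}{(-in\omega_0)} = \frac{iA}{2\pi n} \qquad n = \pm 1, \pm 2, \dots$$
Note that
$$c_{-n} = \frac{iA}{2\pi(-n)} = \frac{-iA}{2\pi n} = c_n^* \quad\text{as it must.}$$
Also $c_0 = \frac{1}{T}\int_0^T \frac{At}{T}\,dt = \frac{A}{2}$ as expected.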
Hence the required complex Fourier series is
$$f(t) = \frac{A}{2} + \frac{iA}{2\pi}\sum_{\substack{n=-\infty \\ n \neq 0}}^{\infty}\frac{e^{in\omega_0 t}}{n}$$
which could be written, showing only the constant and the first two harmonics, as
$$f(t) = \frac{A}{2\pi}\left\{\dots - i\frac{e^{-i2\omega_0 t}}{2} - ie^{-i\omega_0 t} + \pi + ie^{i\omega_0 t} + i\frac{e^{i2\omega_0 t}}{2} + \dots\right\}.$$
The corresponding trigonometric Fourier series for the function can be readily obtained from this complex series by combining the terms in $\pm n$, $n = 1, 2, 3, \dots$
For example this first harmonic is
$$\frac{A}{2\pi}\left\{-ie^{-i\omega_0 t} + ie^{i\omega_0 t}\right\} = \frac{A}{2\pi}\left\{-i(\cos\omega_0 t - i\sin\omega_0 t) + i(\cos\omega_0 t + i\sin\omega_0 t)\right\} = \frac{A}{2\pi}(-2\sin\omega_0 t) = -\frac{A}{\pi}\sin\omega_0 t$$
Performing similar calculations on the other harmonics we obtain the trigonometric form of the Fourier series
$$f(t) = \frac{A}{2} - \frac{A}{\pi}\sum_{n=1}^{\infty}\frac{\sin n\omega_0 t}{n}.$$
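The closed-form coefficients of Example 4 can be compared with a direct numerical evaluation of the defining integral. This is a sketch only: the values of `A` and `T`, the function names and the midpoint-rule quadrature are assumptions made for illustration.

```python
import cmath
import math

A, T = 2.0, 3.0            # hypothetical amplitude and period
w0 = 2 * math.pi / T


def sawtooth_coefficient(n, samples=20000):
    """Midpoint-rule estimate of c_n = (1/T) * integral_0^T (A t / T) e^{-i n w0 t} dt."""
    dt = T / samples
    total = 0j
    for k in range(samples):
        t = (k + 0.5) * dt
        total += (A * t / T) * cmath.exp(-1j * n * w0 * t)
    return total * dt / T


def closed_form(n):
    """c_0 = A/2 and c_n = iA / (2 pi n) for n != 0, as derived in Example 4."""
    return A / 2 if n == 0 else 1j * A / (2 * math.pi * n)


for n in (-3, -1, 0, 1, 2, 5):
    print(n, sawtooth_coefficient(n), closed_form(n))
```

The two columns should agree closely, with the nonzero-$n$ coefficients purely imaginary as anticipated from the odd function $g(t)$.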
Task
Find the complex Fourier series of the periodic function:
$$f(t) = e^t \quad -\pi < t < \pi, \qquad f(t + 2\pi) = f(t)$$
[Figure: graph of f(t) against t, with −π, π and 3π marked on the t axis]
Firstly write down an integral expression for the Fourier coefficients cn :

Your solution
Answer
We have, since $T = 2\pi$, so $\omega_0 = 1$
$$c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^t e^{-int}\,dt$$
Now combine the real exponential and the complex exponential as one term and carry out the
integration:
Your solution
Answer
We have
$$c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{(1-in)t}\,dt = \frac{1}{2\pi}\left[\frac{e^{(1-in)t}}{(1-in)}\right]_{-\pi}^{\pi} = \frac{1}{2\pi}\frac{1}{(1-in)}\left(e^{(1-in)\pi} - e^{-(1-in)\pi}\right)$$
Now simplify this as far as possible and write out the Fourier series:
Your solution
Answer
$$e^{(1-in)\pi} = e^{\pi}e^{-in\pi} = e^{\pi}(\cos n\pi - i\sin n\pi) = e^{\pi}\cos n\pi$$
$$e^{-(1-in)\pi} = e^{-\pi}e^{in\pi} = e^{-\pi}\cos n\pi$$
Hence
$$c_n = \frac{1}{2\pi}\frac{1}{(1-in)}\left(e^{\pi} - e^{-\pi}\right)\cos n\pi = \frac{\sinh\pi}{\pi}\frac{(1+in)}{(1+n^2)}\cos n\pi$$
Note that the coefficients $c_n$ ($n = \pm 1, \pm 2, \dots$) have both real and imaginary parts in this case as the function being expanded is neither even nor odd.
Also
$$c_{-n} = \frac{\sinh\pi}{\pi}\frac{(1-in)}{(1+(-n)^2)}\cos(-n\pi) = \frac{\sinh\pi}{\pi}\frac{(1-in)}{(1+n^2)}\cos n\pi = c_n^* \quad\text{as required.}$$
This includes the constant term $c_0 = \frac{\sinh\pi}{\pi}$. Hence the required Fourier series is
$$f(t) = \frac{\sinh\pi}{\pi}\sum_{n=-\infty}^{\infty}(-1)^n\frac{(1+in)}{(1+n^2)}e^{int} \quad\text{since } \cos n\pi = (-1)^n.$$
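As a plausibility check on this series, a symmetric partial sum can be evaluated at an interior point and compared with $e^t$. The sketch below is illustrative only: the function names, the sample point $t = 0.5$ and the truncation at 5000 terms are assumptions. Because the periodic extension of $e^t$ jumps at $t = \pm\pi$, convergence is slow, so only rough agreement is expected.

```python
import cmath
import math


def c(n):
    """Closed-form coefficient from the Task: (sinh(pi)/pi) (-1)^n (1 + i n) / (1 + n^2)."""
    return (math.sinh(math.pi) / math.pi) * (-1) ** n * (1 + 1j * n) / (1 + n * n)


def partial_sum(t, terms):
    """Sum of c_n e^{i n t} for n = -terms, ..., terms."""
    return sum(c(n) * cmath.exp(1j * n * t) for n in range(-terms, terms + 1))


approx = partial_sum(0.5, 5000)
print(approx, math.exp(0.5))
```

The imaginary part of the partial sum should vanish to rounding error, since the terms for $n$ and $-n$ are complex conjugates of one another.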
4. Parseval’s theorem
This is essentially a mathematical theorem but has, as we shall see, an important engineering in-
terpretation particularly in electrical engineering. Parseval’s theorem states that if f (t) is a periodic
function with period T and if cn (n = 0, ±1, ±2, . . .) denote the complex Fourier coefficients of f (t),
then
$$\frac{1}{T}\int_{-T/2}^{T/2} f^2(t)\,dt = \sum_{n=-\infty}^{\infty}|c_n|^2.$$
In words the theorem states that the mean square value of the signal f (t) over one period equals the
sum of the squared magnitudes of all the complex Fourier coefficients.
Proof of Parseval’s theorem.

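Assume $f(t)$ has a complex Fourier series of the usual form:
$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{in\omega_0 t} \qquad \omega_0 = \frac{2\pi}{T}$$
where
$$c_n = \frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{-in\omega_0 t}\,dt$$
Then
$$f^2(t) = f(t)f(t) = f(t)\sum c_n e^{in\omega_0 t} = \sum c_n f(t)e^{in\omega_0 t}$$
Hence
$$\frac{1}{T}\int_{-T/2}^{T/2} f^2(t)\,dt = \frac{1}{T}\int_{-T/2}^{T/2}\sum c_n f(t)e^{in\omega_0 t}\,dt = \frac{1}{T}\sum c_n\int_{-T/2}^{T/2} f(t)e^{in\omega_0 t}\,dt = \sum c_n c_n^* = \sum_{n=-\infty}^{\infty}|c_n|^2$$
where the third equality uses $\frac{1}{T}\int_{-T/2}^{T/2} f(t)e^{in\omega_0 t}\,dt = c_{-n} = c_n^*$. This completes the proof.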

Parseval’s theorem can also be written in terms of the Fourier coefficients $a_n$, $b_n$ of the trigonometric Fourier series. Recall that
$$c_0 = \frac{a_0}{2}, \qquad c_n = \frac{a_n - ib_n}{2} \quad n = 1, 2, 3, \dots, \qquad c_{-n} = \frac{a_n + ib_n}{2} \quad n = 1, 2, 3, \dots$$
so
$$|c_n|^2 = |c_{-n}|^2 = \frac{a_n^2 + b_n^2}{4} \qquad n = 1, 2, 3, \dots$$
so
$$\sum_{n=-\infty}^{\infty}|c_n|^2 = \frac{a_0^2}{4} + 2\sum_{n=1}^{\infty}\frac{a_n^2 + b_n^2}{4}$$
and hence Parseval’s theorem becomes
$$\frac{1}{T}\int_{-T/2}^{T/2} f^2(t)\,dt = \frac{a_0^2}{4} + \frac{1}{2}\sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right) \qquad (7)$$
The engineering interpretation of this theorem is as follows. Suppose $f(t)$ denotes an electrical signal (current or voltage), then from elementary circuit theory $f^2(t)$ is the instantaneous power (in a 1 ohm resistor) so that
$$\frac{1}{T}\int_{-T/2}^{T/2} f^2(t)\,dt$$
is the average power dissipated in the resistor over one period (the integral without the factor $\frac{1}{T}$ is the energy dissipated during one period).
Now a sinusoid wave of the form
$$A\cos\omega t \quad (\text{or } A\sin\omega t)$$
has a mean square value $\frac{A^2}{2}$ so a purely sinusoidal signal would dissipate a power $\frac{A^2}{2}$ in a 1 ohm resistor. Hence Parseval’s theorem in the form (7) states that the average power dissipated over 1 period equals the sum of the powers of the constant (or d.c.) component and of all the sinusoidal (or alternating) components.
Task
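The triangular signal shown below has trigonometric Fourier series
$$f(t) = \frac{\pi}{2} - \frac{4}{\pi}\sum_{\substack{n=1 \\ (\text{odd } n)}}^{\infty}\frac{\cos nt}{n^2}.$$
[This was deduced in the Task in Section 23.3, page 39.]
[Figure: triangular signal f(t) of height π, with −π and π marked on the t axis]
Use Parseval’s theorem to show that $\sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty}\frac{1}{n^4} = \frac{\pi^4}{96}$.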
First, identify a0 , an and bn for this situation and write down the definition of f (t) for this case:
Your solution
Answer
We have $\frac{a_0}{2} = \frac{\pi}{2}$
$$a_n = \begin{cases} -\dfrac{4}{n^2\pi} & n = 1, 3, 5, \dots \\ 0 & n = 2, 4, 6, \dots \end{cases}$$
$$b_n = 0 \qquad n = 1, 2, 3, 4, \dots$$
Also
$$f(t) = |t| \quad -\pi < t < \pi, \qquad f(t + 2\pi) = f(t)$$
Now evaluate the integral on the left hand side of Parseval’s theorem and hence complete the problem:
Your solution
Answer
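We have $f^2(t) = t^2$ so
$$\frac{1}{T}\int_{-T/2}^{T/2} f^2(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} t^2\,dt = \frac{1}{2\pi}\left[\frac{t^3}{3}\right]_{-\pi}^{\pi} = \frac{\pi^2}{3}$$
The right-hand side of Parseval’s theorem in the form (7) is
$$\frac{a_0^2}{4} + \frac{1}{2}\sum_{n=1}^{\infty} a_n^2 = \frac{\pi^2}{4} + \frac{1}{2}\sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty}\frac{16}{n^4\pi^2}$$
Hence
$$\frac{\pi^2}{3} = \frac{\pi^2}{4} + \frac{8}{\pi^2}\sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty}\frac{1}{n^4} \quad\therefore\quad \frac{8}{\pi^2}\sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty}\frac{1}{n^4} = \frac{\pi^2}{12} \quad\therefore\quad \sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty}\frac{1}{n^4} = \frac{\pi^4}{96}.$$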
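The result of this Task is easy to confirm numerically. The short Python sketch below (the variable names and the truncation of the sum at n = 1999 are illustrative assumptions) sums the series over odd n and checks both sides of Parseval's theorem in the form (7).

```python
import math

# Partial sum of 1/n^4 over odd n up to 1999; the neglected tail is of order 1e-11.
odd_sum = sum(1.0 / n**4 for n in range(1, 2001, 2))

mean_square = math.pi**2 / 3                              # left-hand side: (1/2pi) * integral of t^2
right_side = math.pi**2 / 4 + (8 / math.pi**2) * odd_sum  # a0^2/4 + (1/2) * sum of a_n^2

print(odd_sum, math.pi**4 / 96)
print(mean_square, right_side)
```

Both printed pairs should agree to many decimal places.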
Exercises
Obtain the complex Fourier series for each of the following functions of period 2π.
1. f (t) = t −π ≤t≤π
2. f (t) = t 0 ≤ t ≤ 2π
3. f (t) = et −π ≤t≤π
Answers
1. $i\sum\dfrac{(-1)^n}{n}e^{int}$ (sum from $-\infty$ to $\infty$ excluding $n = 0$).
2. $\pi + i\sum\dfrac{1}{n}e^{int}$ (sum from $-\infty$ to $\infty$ excluding $n = 0$).
3. $\dfrac{\sinh\pi}{\pi}\sum(-1)^n\dfrac{(1+in)}{(1+n^2)}e^{int}$ (sum from $-\infty$ to $\infty$).
An Application of
Fourier Series 23.7
Introduction
In this Section we look at a typical application of Fourier series. The problem we study is that of a
differential equation with a periodic (but non-sinusoidal) forcing function. The differential equation
chosen models a lightly damped vibrating system.
Prerequisites
• be competent to use complex numbers
• be familiar with the relation between the exponential function and the trigonometric functions

Learning Outcomes
• solve a linear differential equation with a periodic forcing function using Fourier series

1. Modelling vibration by differential equation
Vibration problems are often modelled by ordinary differential equations with constant coefficients.
For example the motion of a spring with stiffness $k$ and damping constant $c$ is modelled by
$$m\frac{d^2y}{dt^2} + c\frac{dy}{dt} + ky = 0 \qquad (1)$$
where $y(t)$ is the displacement of a mass $m$ connected to the spring. It is well-known that if $c^2 < 4mk$, usually referred to as the lightly damped case, then
$$y(t) = e^{-\alpha t}(A\cos\omega t + B\sin\omega t) \qquad (2)$$
i.e. the motion is sinusoidal but damped by the negative exponential term. In (2) we have used the notation
$$\alpha = \frac{c}{2m} \qquad \omega = \frac{1}{2m}\sqrt{4km - c^2}$$
to simplify the equation.
The values of $A$ and $B$ depend upon initial conditions.
The system represented by (1), whose solution is (2), is referred to as an unforced damped harmonic oscillator.
A lightly damped oscillator driven by a time-dependent forcing function $F(t)$ is modelled by the differential equation
$$m\frac{d^2y}{dt^2} + c\frac{dy}{dt} + ky = F(t) \qquad (3)$$
The solution or system response in (3) has two parts:
(a) A transient solution of the form (2),

(b) A forced or steady state solution whose form, of course, depends on F (t).
If F (t) is sinusoidal such that

F (t) = A sin(Ωt + φ) where Ω and φ are constants,
then the steady state solution is fairly readily obtained by standard techniques for solving differential
equations. If F (t) is periodic but non-sinusoidal then Fourier series may be used to obtain the steady
state solution. The method is based on the principle of superposition which is actually applicable
to any linear (homogeneous) differential equation. (Another engineering application is the series
LCR circuit with an applied periodic voltage.)
The principle of superposition is easily demonstrated:
Let $y_1(t)$ and $y_2(t)$ be the steady state solutions of (3) when $F(t) = F_1(t)$ and $F(t) = F_2(t)$ respectively. Then
$$m\frac{d^2y_1}{dt^2} + c\frac{dy_1}{dt} + ky_1 = F_1(t)$$
$$m\frac{d^2y_2}{dt^2} + c\frac{dy_2}{dt} + ky_2 = F_2(t)$$
Simply adding these equations we obtain
$$m\frac{d^2}{dt^2}(y_1 + y_2) + c\frac{d}{dt}(y_1 + y_2) + k(y_1 + y_2) = F_1(t) + F_2(t)$$
from which it follows that if $F(t) = F_1(t) + F_2(t)$ then the system response is the sum $y_1(t) + y_2(t)$. This, in its simplest form, is the principle of superposition. More generally if the forcing function is
$$F(t) = \sum_{n=1}^{N} F_n(t)$$
then the response is $y(t) = \sum_{n=1}^{N} y_n(t)$ where $y_n(t)$ is the response to the forcing function $F_n(t)$.
Returning to the specific case where F (t) is periodic, the solution procedure for the steady state
response is as follows:
Step 1: Obtain the Fourier series of F (t).
Step 2: Solve the differential equation (3) for the response $y_n(t)$ corresponding to the $n$th harmonic in the Fourier series. (The response $y_0$ to the constant term, if any, in the Fourier series may have to be obtained separately.)
Step 3: Superpose the solutions obtained to give the overall steady state motion:
$$y(t) = y_0(t) + \sum_{n=1}^{N} y_n(t)$$
The procedure can be lengthy but the solution is of great engineering interest because if the frequency of one harmonic in the Fourier series is close to the natural frequency $\sqrt{\frac{k}{m}}$ of the undamped system then the response to that harmonic will dominate the solution.
2. Applying Fourier series to solve a differential equation

The following Task which is quite long will provide useful practice in applying Fourier series to a
practical problem. Essentially you should follow Steps 1 to 3 above carefully.
Task
The problem is to find the steady state response $y(t)$ of a spring/mass/damper system modelled by
$$m\frac{d^2y}{dt^2} + c\frac{dy}{dt} + ky = F(t) \qquad (4)$$
where $F(t)$ is the periodic square wave function shown in the diagram.
[Figure: square wave F(t) taking the values F0 and −F0, with −t0 and t0 marked on the t axis]
Step 1: Obtain the Fourier series of F (t) noting that it is an odd function:
Your solution
Answer
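The calculation is similar to those you have performed earlier in this Workbook.
Since $F(t)$ is an odd function and has period $2t_0$ so that $\omega = \frac{2\pi}{2t_0} = \frac{\pi}{t_0}$, it has Fourier coefficients:
$$b_n = \frac{2}{t_0}\int_0^{t_0} F_0\sin\left(\frac{n\pi t}{t_0}\right)dt \qquad n = 1, 2, 3, \dots$$
$$= \frac{2F_0}{t_0}\left[-\frac{t_0}{n\pi}\cos\left(\frac{n\pi t}{t_0}\right)\right]_0^{t_0} = \frac{2F_0}{n\pi}(1 - \cos n\pi) = \begin{cases}\dfrac{4F_0}{n\pi} & n \text{ odd} \\ 0 & n \text{ even}\end{cases}$$
so $F(t) = \dfrac{4F_0}{\pi}\sum_{n=1}^{\infty}\dfrac{\sin n\omega t}{n}$ (where the sum is over odd $n$ only).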
Step 2(a):
Since each term in the Fourier series is a sine term you must now solve (4) to find the steady state
response yn to the n th harmonic input: Fn (t) = bn sin nωt n = 1, 3, 5, . . .
From the basic theory of linear differential equations this response has the form
yn = An cos nωt + Bn sin nωt (5)
where An and Bn are coefficients to be determined by substituting (5) into (4) with F (t) = Fn (t).
Do this to obtain simultaneous equations for An and Bn :
Your solution
Answer
We have, differentiating (5),
$$y_n' = n\omega(-A_n\sin n\omega t + B_n\cos n\omega t)$$
$$y_n'' = (n\omega)^2(-A_n\cos n\omega t - B_n\sin n\omega t)$$
from which, substituting into (4) and collecting terms in $\cos n\omega t$ and $\sin n\omega t$,
$$\left(-m(n\omega)^2A_n + cn\omega B_n + kA_n\right)\cos n\omega t + \left(-m(n\omega)^2B_n - cn\omega A_n + kB_n\right)\sin n\omega t = b_n\sin n\omega t$$
Then, by comparing coefficients of $\cos n\omega t$ and $\sin n\omega t$, we obtain the simultaneous equations:
$$(k - m(n\omega)^2)A_n + c(n\omega)B_n = 0 \qquad (6)$$
$$-c(n\omega)A_n + (k - m(n\omega)^2)B_n = b_n \qquad (7)$$
Step 2(b):
Now solve (6) and (7) to obtain An and Bn :
Your solution
Answer
$$A_n = -\frac{c\omega_n b_n}{(k - m\omega_n^2)^2 + \omega_n^2c^2} \qquad (8)$$
$$B_n = \frac{(k - m\omega_n^2)b_n}{(k - m\omega_n^2)^2 + \omega_n^2c^2} \qquad (9)$$
where we have written $\omega_n$ for $n\omega$ as the frequency of the $n$th harmonic.
It follows that the steady state response $y_n$ to the $n$th harmonic of the Fourier series of the forcing function is given by (5). The amplitudes $A_n$ and $B_n$ are given by (8) and (9) respectively in terms of the system parameters $k$, $c$, $m$, the frequency $\omega_n$ of the harmonic and its amplitude $b_n$. In practice it is more convenient to represent $y_n$ in the so-called amplitude/phase form:
$$y_n = C_n\sin(\omega_n t + \phi_n) \qquad (10)$$
where, from (5) and (10),
$$A_n\cos\omega_n t + B_n\sin\omega_n t = C_n(\cos\phi_n\sin\omega_n t + \sin\phi_n\cos\omega_n t).$$
Hence
$$C_n\sin\phi_n = A_n \qquad C_n\cos\phi_n = B_n$$
so
$$\tan\phi_n = \frac{A_n}{B_n} = \frac{c\omega_n}{m\omega_n^2 - k} \qquad (11)$$
$$C_n = \sqrt{A_n^2 + B_n^2} = \frac{b_n}{\sqrt{(m\omega_n^2 - k)^2 + \omega_n^2c^2}} \qquad (12)$$
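Equations (8), (9) and (12) can be sanity-checked by substituting the resulting $y_n$ back into the differential equation. The following Python sketch does this for one harmonic; the parameter values and function names are hypothetical and chosen only for illustration.

```python
import math


def steady_state_coefficients(m, c, k, omega_n, b_n):
    """A_n and B_n from equations (8) and (9)."""
    denom = (k - m * omega_n**2) ** 2 + (omega_n * c) ** 2
    A_n = -c * omega_n * b_n / denom
    B_n = (k - m * omega_n**2) * b_n / denom
    return A_n, B_n


def residual(t, m, c, k, omega_n, b_n):
    """m y'' + c y' + k y - b_n sin(omega_n t) for y = A_n cos(omega_n t) + B_n sin(omega_n t)."""
    A_n, B_n = steady_state_coefficients(m, c, k, omega_n, b_n)
    cos_t, sin_t = math.cos(omega_n * t), math.sin(omega_n * t)
    y = A_n * cos_t + B_n * sin_t
    dy = omega_n * (-A_n * sin_t + B_n * cos_t)
    d2y = -omega_n**2 * y
    return m * d2y + c * dy + k * y - b_n * sin_t


# Hypothetical system parameters and harmonic (not taken from the Workbook).
m, c, k, omega_n, b_n = 2.0, 0.5, 18.0, 1.5, 0.8
A_n, B_n = steady_state_coefficients(m, c, k, omega_n, b_n)
C_n = math.hypot(A_n, B_n)   # amplitude, to be compared with equation (12)
print(A_n, B_n, C_n, [residual(t, m, c, k, omega_n, b_n) for t in (0.0, 0.7, 2.3)])
```

The residuals should be zero to rounding error, and `C_n` should match $b_n/\sqrt{(m\omega_n^2 - k)^2 + \omega_n^2c^2}$.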
Step 3:
Finally, use the superposition principle, to state the complete steady state response of the system to
the periodic square wave forcing function:
Your solution
Answer
$$y(t) = \sum_{n=1}^{\infty} y_n(t) = \sum_{\substack{n=1 \\ (n \text{ odd})}}^{\infty} C_n\sin(\omega_n t + \phi_n)$$
where $\phi_n$ and $C_n$ are given by (11) and (12) respectively.
In practice, since $b_n = \frac{4F_0}{n\pi}$ it follows that the amplitude $C_n$ also decreases for large $n$ (at least as fast as $\frac{1}{n}$). However, if one of the harmonic frequencies, say $\omega_{n_0}$, is close to the natural frequency $\sqrt{\frac{k}{m}}$ of the undamped oscillator then that particular frequency harmonic will dominate in the steady state response. The particular value $\omega_{n_0}$ will, of course, depend on the values of the system parameters $k$ and $m$.
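The dominance of a near-resonant harmonic can be illustrated by tabulating the amplitudes $C_n$ from (12) for a square wave input. In this sketch the parameter values are hypothetical, chosen so that the third harmonic coincides with $\sqrt{k/m}$; the function name is also an assumption.

```python
import math


def harmonic_amplitude(n, m, c, k, F0, t0):
    """C_n from equation (12) with b_n = 4 F0 / (n pi) and omega_n = n pi / t0 (odd n only)."""
    omega_n = n * math.pi / t0
    b_n = 4 * F0 / (n * math.pi)
    return b_n / math.sqrt((m * omega_n**2 - k) ** 2 + (omega_n * c) ** 2)


# Hypothetical lightly damped system: sqrt(k/m) = 3 and omega = pi/t0 = 1,
# so the n = 3 harmonic sits at the undamped natural frequency.
m, c, k, F0, t0 = 1.0, 0.1, 9.0, 1.0, math.pi

amplitudes = {n: harmonic_amplitude(n, m, c, k, F0, t0) for n in (1, 3, 5, 7, 9)}
dominant = max(amplitudes, key=amplitudes.get)
for n, C in amplitudes.items():
    print(n, round(C, 4))
print("dominant harmonic:", dominant)
```

With these values the third harmonic has by far the largest amplitude even though $b_3$ is only one third of $b_1$.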
Contents 24
Fourier Transforms
24.1 The Fourier Transform
24.2 Properties of the Fourier Transform
24.3 Some Special Fourier Transform Pairs
Learning outcomes
In this Workbook you will learn about the Fourier transform which has many applications
in science and engineering. You will learn how to find Fourier transforms of some
standard functions and some of the properties of the Fourier transform. You will learn
about the inverse Fourier transform and how to find inverse transforms directly and by
using a table of transforms. Finally, you will learn about some special Fourier transform
pairs.

The Fourier transform 24.1
Introduction
Fourier transforms have for a long time been a basic tool of applied mathematics, particularly for
solving differential equations (especially partial differential equations) and also in conjunction with
integral equations.
There are really three Fourier transforms, the Fourier Sine and Fourier Cosine transforms and a
complex form which is usually referred to as the Fourier transform.
The last of these transforms in particular has extensive applications in Science and Engineering, for
example in physical optics, chemistry (e.g. in connection with Nuclear Magnetic Resonance and
Crystallography), Electronic Communications Theory and more general Linear Systems Theory.

Prerequisites
• be familiar with basic Fourier series, particularly in the complex form

Learning Outcomes
On completion you should be able to . . .
• calculate simple Fourier transforms from the definition
• state how the Fourier transform of a function (signal) depends on whether that function is even or odd or neither

1. The Fourier transform

Unlike Fourier series, which are mainly useful for periodic functions, the Fourier transform permits
alternative representations of mostly non-periodic functions.
We shall firstly derive the Fourier transform from the complex exponential form of the Fourier series
and then study its various properties.
2. Informal derivation of the Fourier transform

Recall that if $f(t)$ is a period $T$ function, which we will temporarily re-write as $f_T(t)$ for emphasis, then we can expand it in a complex Fourier series,
$$f_T(t) = \sum_{n=-\infty}^{\infty} c_n e^{in\omega_0 t} \qquad (1)$$
where $\omega_0 = \frac{2\pi}{T}$. In words, harmonics of frequency $n\omega_0 = n\frac{2\pi}{T}$, $n = 0, \pm 1, \pm 2, \dots$ are present in the series and these frequencies are separated by
$$n\omega_0 - (n-1)\omega_0 = \omega_0 = \frac{2\pi}{T}.$$
Hence, as $T$ increases the frequency separation becomes smaller and can be conveniently written as $\Delta\omega$. This suggests that as $T \to \infty$, corresponding to a non-periodic function, then $\Delta\omega \to 0$ and the frequency representation contains all frequency harmonics.
To see this in a little more detail, we recall (Workbook 23: Fourier series) that the complex Fourier coefficients $c_n$ are given by
$$c_n = \frac{1}{T}\int_{-T/2}^{T/2} f_T(t)e^{-in\omega_0 t}\,dt. \qquad (2)$$
Putting $\frac{1}{T}$ as $\frac{\omega_0}{2\pi}$ and then substituting (2) in (1) we get
$$f_T(t) = \sum_{n=-\infty}^{\infty}\left\{\frac{\omega_0}{2\pi}\int_{-T/2}^{T/2} f_T(t)e^{-in\omega_0 t}\,dt\right\}e^{in\omega_0 t}.$$
In view of the discussion above, as $T \to \infty$ we can put $\omega_0$ as $\Delta\omega$ and replace the sum over the discrete frequencies $n\omega_0$ by an integral over all frequencies. We replace $n\omega_0$ by a general frequency variable $\omega$. We then obtain the double integral representation
$$f(t) = \int_{-\infty}^{\infty}\left\{\frac{1}{2\pi}\int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt\right\}e^{i\omega t}\,d\omega. \qquad (3)$$
The inner integral (over all $t$) will give a function dependent only on $\omega$ which we write as $F(\omega)$. Then (3) can be written
$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)e^{i\omega t}\,d\omega \qquad (4)$$
where
$$F(\omega) = \int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt. \qquad (5)$$
The representation (4) of f (t) which involves all frequencies ω can be considered as the equivalent
for a non-periodic function of the complex Fourier series representation (1) of a periodic function.
The expression (5) for F (ω) is analogous to the relation (2) for the Fourier coefficients cn .
The function F (ω) is called the Fourier transform of the function f (t). Symbolically we can write
F (ω) = F{f (t)}.
Equation (4) enables us, in principle, to write $f(t)$ in terms of $F(\omega)$. $f(t)$ is often called the inverse Fourier transform of $F(\omega)$ and we denote this by writing
$$f(t) = \mathcal{F}^{-1}\{F(\omega)\}.$$
Looking at the basic relation (3) it is clear that the position of the factor $\frac{1}{2\pi}$ is somewhat arbitrary in (4) and (5). If instead of (5) we define
$$F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt$$
then (4) must be written
$$f(t) = \int_{-\infty}^{\infty} F(\omega)e^{i\omega t}\,d\omega.$$
A third, more symmetric, alternative is to write
$$F(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt$$
and, consequently:
$$f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(\omega)e^{i\omega t}\,d\omega.$$
We shall use (4) and (5) throughout this Section but you should be aware of these other possibilities which might be used in other texts.
Engineers often refer to F (ω) (whichever precise definition is used!) as the frequency domain
representation of a function or signal and f (t) as the time domain representation. In what follows
we shall use this language where appropriate. However, (5) is really a mathematical transformation for
obtaining one function from another and (4) is then the inverse transformation for recovering the initial
function. In some applications of Fourier transforms (which we shall not study) the time/frequency
interpretations are not relevant. However, in engineering applications, such as communications theory,
the frequency representation is often used very literally.
As can be seen above, notationally we will use capital letters to denote Fourier transforms: thus a
function f (t) has a Fourier transform denoted by F (ω), g(t) has a Fourier transform written G(ω)
and so on. The notation F (iω), G(iω) is used in some texts because ω occurs in (5) only in the
term e−iωt .
3. Existence of the Fourier transform

We will discuss this question in a little detail at a later stage when we will also consider briefly the relation between the Fourier transform and the Laplace Transform (Workbook 20). For now we will use (5) to obtain the Fourier transforms of some important functions.
Example 1
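Find the Fourier transform of the one-sided exponential function
$$f(t) = \begin{cases} 0 & t < 0 \\ e^{-\alpha t} & t > 0 \end{cases}$$
where $\alpha$ is a positive constant, shown below:
[Figure 1: graph of the one-sided exponential function f(t)]
Solution
Using (5) then by straightforward integration
$$F(\omega) = \int_0^{\infty} e^{-\alpha t}e^{-i\omega t}\,dt \quad (\text{since } f(t) = 0 \text{ for } t < 0)$$
$$= \int_0^{\infty} e^{-(\alpha + i\omega)t}\,dt = \left[\frac{e^{-(\alpha + i\omega)t}}{-(\alpha + i\omega)}\right]_0^{\infty} = \frac{1}{\alpha + i\omega}$$
since $e^{-\alpha t} \to 0$ as $t \to \infty$ for $\alpha > 0$.
This important Fourier transform is written in the following Key Point: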
Key Point 1
$$\mathcal{F}\{e^{-\alpha t}u(t)\} = \frac{1}{\alpha + i\omega}, \quad \alpha > 0.$$
Note that this real function has a complex Fourier transform.
Note that if $u(t)$ is used to denote the Heaviside unit step function:
$$u(t) = \begin{cases} 0 & t < 0 \\ 1 & t > 0 \end{cases}$$
then we can write the function in Example 1 as: $f(t) = e^{-\alpha t}u(t)$. We shall frequently use this concise notation for one-sided functions.
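Key Point 1 can be checked by evaluating the defining integral (5) numerically. The sketch below is an illustration only: it truncates the infinite range at an assumed upper limit, uses the midpoint rule, and the values of `alpha` and `omega` are arbitrary.

```python
import cmath
import math


def fourier_transform_one_sided(f, omega, upper, samples=100000):
    """Midpoint-rule approximation of integral_0^upper f(t) e^{-i omega t} dt,
    for a function f that is zero for t < 0 and negligible beyond `upper`."""
    dt = upper / samples
    total = 0j
    for k in range(samples):
        t = (k + 0.5) * dt
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * dt


alpha, omega = 2.0, 3.0    # hypothetical values
numeric = fourier_transform_one_sided(lambda t: math.exp(-alpha * t), omega, upper=20.0)
exact = 1 / (alpha + 1j * omega)
print(numeric, exact)
```

The numerical value should agree closely with $\frac{1}{\alpha + i\omega}$, and both have a nonzero imaginary part even though $f(t)$ is real.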
Task
Write down the Fourier transforms of
(a) $e^{-t}u(t)$ (b) $e^{-3t}u(t)$ (c) $e^{-t/2}u(t)$
Use Key Point 1:

Your solution
(a)
(b)
(c)
Answer
(a) $\alpha = 1$ so $\mathcal{F}\{e^{-t}u(t)\} = \dfrac{1}{1 + i\omega}$
(b) $\alpha = 3$ so $\mathcal{F}\{e^{-3t}u(t)\} = \dfrac{1}{3 + i\omega}$
(c) $\alpha = \frac{1}{2}$ so $\mathcal{F}\{e^{-t/2}u(t)\} = \dfrac{1}{\frac{1}{2} + i\omega}$
Task
Obtain, using the integral definition (5), the Fourier transform of the rectangular pulse
$$p(t) = \begin{cases} 1 & -a < t < a \\ 0 & \text{otherwise} \end{cases}$$
Note that the pulse width is $2a$ as indicated in the diagram below.
[Figure: rectangular pulse p(t) of height 1 between t = −a and t = a]
First use (5) to write down the integral from which the transform will be calculated:
Your solution
Answer
$$P(\omega) \equiv \mathcal{F}\{p(t)\} = \int_{-a}^{a}(1)e^{-i\omega t}\,dt \quad\text{using the definition of } p(t)$$
Now evaluate this integral and write down the final Fourier transform in trigonometric, rather than complex exponential form:
Your solution
Answer
$$P(\omega) = \int_{-a}^{a}(1)e^{-i\omega t}\,dt = \left[\frac{e^{-i\omega t}}{(-i\omega)}\right]_{-a}^{a} = \frac{e^{-i\omega a} - e^{+i\omega a}}{(-i\omega)} = \frac{(\cos\omega a - i\sin\omega a) - (\cos\omega a + i\sin\omega a)}{(-i\omega)} = \frac{2i\sin\omega a}{i\omega}$$
i.e.
$$P(\omega) = \mathcal{F}\{p(t)\} = \frac{2\sin\omega a}{\omega} \qquad (6)$$
Note that in this case the Fourier transform is wholly real.
Engineers often call the function $\frac{\sin x}{x}$ the sinc function. Consequently if we write the transform (6) of the rectangular pulse as
$$P(\omega) = 2a\frac{\sin\omega a}{\omega a},$$
we can say
$$P(\omega) = 2a\,\mathrm{sinc}(\omega a).$$
Using the result (6) in (4) we have the Fourier integral representation of the rectangular pulse.
$$p(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} 2\frac{\sin\omega a}{\omega}e^{i\omega t}\,d\omega.$$
As we have already mentioned, this corresponds to a Fourier series representation for a periodic function.
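Key Point 2
The Fourier transform of a Rectangular Pulse
If $p_a(t) = \begin{cases} 1 & -a < t < a \\ 0 & \text{otherwise} \end{cases}$ then:
$$\mathcal{F}\{p_a(t)\} = 2a\frac{\sin\omega a}{\omega a} = 2a\,\mathrm{sinc}(\omega a)$$
Clearly, if the rectangular pulse has width 2, corresponding to $a = 1$ we have:
$$P_1(\omega) \equiv \mathcal{F}\{p_1(t)\} = 2\frac{\sin\omega}{\omega}.$$
As $\omega \to 0$, then $2\frac{\sin\omega}{\omega} \to 2$. Also, the function $2\frac{\sin\omega}{\omega}$ is an even function being the product of two odd functions $2\sin\omega$ and $\frac{1}{\omega}$. The graph of $P_1(\omega)$ is as follows:
[Figure 2: graph of P_1(ω) against ω, with −π and π marked on the ω axis]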
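When evaluating the rectangular-pulse transform on a computer, the point $\omega = 0$ needs separate treatment because the formula $\frac{2\sin\omega a}{\omega}$ is of the form 0/0 there. A minimal Python sketch follows; the function name is an assumption. Note also that some libraries define sinc differently: NumPy's `numpy.sinc(x)` is the normalised form $\frac{\sin\pi x}{\pi x}$, not the $\frac{\sin x}{x}$ used in this Section.

```python
import math


def rect_pulse_transform(omega, a):
    """P(omega) = 2 sin(omega a) / omega, with the limiting value 2a at omega = 0."""
    if omega == 0:
        return 2 * a
    return 2 * math.sin(omega * a) / omega


print(rect_pulse_transform(0, 1.0))        # limiting value for the width-2 pulse
print(rect_pulse_transform(math.pi, 1.0))  # a zero of P_1(omega), up to rounding
print(rect_pulse_transform(0.7, 2.0), rect_pulse_transform(-0.7, 2.0))  # even in omega
```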
Task
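Obtain the Fourier transform of the two sided exponential function
$$f(t) = \begin{cases} e^{\alpha t} & t < 0 \\ e^{-\alpha t} & t > 0 \end{cases}$$
where $\alpha$ is a positive constant.
[Figure: graph of the two sided exponential function f(t)]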
Your solution
Answer
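We must separate the range of integration into $(-\infty, 0]$ and $[0, \infty)$ since the function $f(t)$ is defined separately in these two regions: then
$$F(\omega) = \int_{-\infty}^{0} e^{\alpha t}e^{-i\omega t}\,dt + \int_0^{\infty} e^{-\alpha t}e^{-i\omega t}\,dt = \int_{-\infty}^{0} e^{(\alpha - i\omega)t}\,dt + \int_0^{\infty} e^{-(\alpha + i\omega)t}\,dt$$
$$= \left[\frac{e^{(\alpha - i\omega)t}}{(\alpha - i\omega)}\right]_{-\infty}^{0} + \left[\frac{e^{-(\alpha + i\omega)t}}{-(\alpha + i\omega)}\right]_0^{\infty} = \frac{1}{\alpha - i\omega} + \frac{1}{\alpha + i\omega} = \frac{2\alpha}{\alpha^2 + \omega^2}.$$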
Note that, as in the case of the rectangular pulse, we have here a real even function of t giving a
Fourier transform which is wholly real. Also, in both cases, the Fourier transform is an even (as well
as real) function of ω.
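Note also that it follows from the above calculation that
$$\mathcal{F}\{e^{-\alpha t}u(t)\} = \frac{1}{\alpha + i\omega} \quad\text{(as we have already found)}$$
and
$$\mathcal{F}\{e^{\alpha t}u(-t)\} = \frac{1}{\alpha - i\omega} \quad\text{where}\quad e^{\alpha t}u(-t) = \begin{cases} e^{\alpha t} & t < 0 \\ 0 & t > 0 \end{cases}.$$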
4. Basic properties of the Fourier transform

Real and imaginary parts of a Fourier transform
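Using the definition (5) we have,
$$F(\omega) = \int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt.$$
If we write $e^{-i\omega t} = \cos\omega t - i\sin\omega t$, then
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\cos\omega t\,dt - i\int_{-\infty}^{\infty} f(t)\sin\omega t\,dt$$
where both integrals are real, assuming that $f(t)$ is real. Hence the real and imaginary parts of the Fourier transform are:
$$\mathrm{Re}\,(F(\omega)) = \int_{-\infty}^{\infty} f(t)\cos\omega t\,dt \qquad \mathrm{Im}\,(F(\omega)) = -\int_{-\infty}^{\infty} f(t)\sin\omega t\,dt.$$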
Task
Recalling that if $h(t)$ is even and $g(t)$ is odd then $\int_{-a}^{a} h(t)\,dt = 2\int_0^a h(t)\,dt$ and $\int_{-a}^{a} g(t)\,dt = 0$, deduce $\mathrm{Re}(F(\omega))$ and $\mathrm{Im}(F(\omega))$ if
(a) $f(t)$ is a real even function
(b) $f(t)$ is a real odd function.
Your solution
(a)
Answer
If $f(t)$ is real and even
$$R(\omega) \equiv \mathrm{Re}\,F(\omega) = 2\int_0^{\infty} f(t)\cos\omega t\,dt \quad\text{(because the integrand is even)}$$
$$I(\omega) \equiv \mathrm{Im}\,F(\omega) = -\int_{-\infty}^{\infty} f(t)\sin\omega t\,dt = 0 \quad\text{(because the integrand is odd).}$$
Thus, any real even function $f(t)$ has a wholly real Fourier transform. Also since
$$\cos((-\omega)t) = \cos(-\omega t) = \cos\omega t$$
the Fourier transform in this case will be a real even function.
Your solution
(b)
Answer
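Now
$$\mathrm{Re}\,F(\omega) = \int_{-\infty}^{\infty} f(t)\cos\omega t\,dt = \int_{-\infty}^{\infty}(\text{odd}) \times (\text{even})\,dt = \int_{-\infty}^{\infty}(\text{odd})\,dt = 0$$
and
$$\mathrm{Im}\,F(\omega) = -\int_{-\infty}^{\infty} f(t)\sin\omega t\,dt = -2\int_0^{\infty} f(t)\sin\omega t\,dt$$
(because the integrand is (odd)×(odd)=(even)).

Also since $\sin((-\omega)t) = -\sin\omega t$, the Fourier transform in this case is an odd function of $\omega$.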
These results are summarised in the following Key Point:
Key Point 3
f (t) F (ω) = F{f (t)}
real and even real and even
real and odd purely imaginary and odd
neither even nor odd complex , F (ω) = R(ω) + iI(ω)
Polar form of a Fourier transform
Task
The one-sided exponential function $f(t) = e^{-\alpha t}u(t)$ has Fourier transform $F(\omega) = \dfrac{1}{\alpha + i\omega}$. Find the real and imaginary parts of $F(\omega)$.
Your solution
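Answer
$$F(\omega) = \frac{1}{\alpha + i\omega} = \frac{\alpha - i\omega}{\alpha^2 + \omega^2}.$$
Hence $R(\omega) = \mathrm{Re}\,F(\omega) = \dfrac{\alpha}{\alpha^2 + \omega^2}$ and $I(\omega) = \mathrm{Im}\,F(\omega) = \dfrac{-\omega}{\alpha^2 + \omega^2}$.
We can rewrite $F(\omega)$, like any other complex quantity, in polar form by calculating the magnitude and the argument (or phase). For the Fourier transform in the last Task
$$|F(\omega)| = \sqrt{R^2(\omega) + I^2(\omega)} = \sqrt{\frac{\alpha^2 + \omega^2}{(\alpha^2 + \omega^2)^2}} = \frac{1}{\sqrt{\alpha^2 + \omega^2}}$$
and
$$\arg F(\omega) = \tan^{-1}\left(\frac{I(\omega)}{R(\omega)}\right) = \tan^{-1}\left(\frac{-\omega}{\alpha}\right).$$
[Figure 3: graphs of |F(ω)| (peak value 1/α) and arg F(ω) (lying between −π/2 and π/2) plotted against ω]
In general, a Fourier transform whose Cartesian form is $F(\omega) = R(\omega) + iI(\omega)$ has a polar form
$$F(\omega) = |F(\omega)|e^{i\phi(\omega)} \quad\text{where}\quad \phi(\omega) \equiv \arg F(\omega).$$
Graphs, such as those shown in Figure 3, of $|F(\omega)|$ and $\arg F(\omega)$ plotted against $\omega$, are often referred to as magnitude spectra and phase spectra, respectively.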
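The magnitude and phase of a transform value can be computed directly from the complex number. A small Python sketch using the standard-library function `cmath.polar` is given below; the values of `alpha` and `omega` and the function name are illustrative assumptions.

```python
import cmath
import math


def one_sided_exponential_transform(omega, alpha):
    """F(omega) = 1 / (alpha + i omega), the transform of e^{-alpha t} u(t)."""
    return 1 / (alpha + 1j * omega)


alpha, omega = 2.0, 3.0    # hypothetical values
F = one_sided_exponential_transform(omega, alpha)
magnitude, phase = cmath.polar(F)   # returns (|F|, arg F)

print(F.real, alpha / (alpha**2 + omega**2))        # R(omega)
print(F.imag, -omega / (alpha**2 + omega**2))       # I(omega)
print(magnitude, 1 / math.sqrt(alpha**2 + omega**2))
print(phase, math.atan(-omega / alpha))
```

Evaluating `magnitude` and `phase` over a range of `omega` values gives the data for magnitude and phase spectra.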
Exercises
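1. Obtain the Fourier transform of the rectangular pulses
(a) $f(t) = \begin{cases} 1 & |t| \le 1 \\ 0 & |t| > 1 \end{cases}$ (b) $f(t) = \begin{cases} \frac{1}{4} & |t| \le 3 \\ 0 & |t| > 3 \end{cases}$
2. Find the Fourier transform of
$$f(t) = \begin{cases} 1 - \dfrac{t}{2} & 0 \le t \le 2 \\ 1 + \dfrac{t}{2} & -2 \le t \le 0 \\ 0 & |t| > 2 \end{cases}$$
Answers
1.(a) $F(\omega) = \dfrac{2}{\omega}\sin\omega$
(b) $F(\omega) = \dfrac{\sin 3\omega}{2\omega}$
2. $\dfrac{1 - \cos 2\omega}{\omega^2}$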
Properties of the
Fourier Transform 24.2
Introduction
In this Section we shall learn about some useful properties of the Fourier transform which enable
us to calculate easily further transforms of functions and also in applications such as electronic
communication theory.

Prerequisites
Before starting this Section you should . . .
• be aware of the basic definitions of the Fourier transform and inverse Fourier transform

Learning Outcomes
On completion you should be able to . . .
• state and use the linearity property and the time and frequency shift properties of Fourier transforms
• state various other properties of the Fourier transform

1. Linearity properties of the Fourier transform
(i) If f (t), g(t) are functions with transforms F (ω), G(ω) respectively, then
• F{f (t) + g(t)} = F (ω) + G(ω)
i.e. if we add 2 functions then the Fourier transform of the resulting function is simply the sum of
the individual Fourier transforms.
(ii) If k is any constant,
• F{kf (t)} = kF (ω)
i.e. if we multiply a function by any constant then we must multiply the Fourier transform by the
same constant. These properties follow from the definition of the Fourier transform and from the
properties of integrals.
Examples
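1.
$$\mathcal{F}\{2e^{-t}u(t) + 3e^{-2t}u(t)\} = \mathcal{F}\{2e^{-t}u(t)\} + \mathcal{F}\{3e^{-2t}u(t)\} = 2\mathcal{F}\{e^{-t}u(t)\} + 3\mathcal{F}\{e^{-2t}u(t)\} = \frac{2}{1 + i\omega} + \frac{3}{2 + i\omega}$$
2. If $f(t) = \begin{cases} 4 & -3 \le t \le 3 \\ 0 & \text{otherwise} \end{cases}$ then $f(t) = 4p_3(t)$ so $F(\omega) = 4P_3(\omega) = \dfrac{8}{\omega}\sin 3\omega$ using the standard result for $\mathcal{F}\{p_a(t)\}$.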
Task
If $f(t) = \begin{cases} 6 & -2 \le t \le 2 \\ 0 & \text{otherwise} \end{cases}$ write down $F(\omega)$.
Your solution
Answer
We have $f(t) = 6p_2(t)$ so $F(\omega) = \dfrac{12}{\omega}\sin 2\omega$.
2. Shift properties of the Fourier transform
There are two basic shift properties of the Fourier transform:
(i) Time shift property: • $\mathcal{F}\{f(t - t_0)\} = e^{-i\omega t_0}F(\omega)$
(ii) Frequency shift property: • $\mathcal{F}\{e^{i\omega_0 t}f(t)\} = F(\omega - \omega_0)$.
Here t0 , ω0 are constants.
In words, shifting (or translating) a function in one domain corresponds to a multiplication by a
complex exponential function in the other domain.
We omit the proofs of these properties which follow from the definition of the Fourier transform.
Example 2
Use the time-shifting property to find the Fourier transform of the function
$$g(t) = \begin{cases} 1 & 3 \le t \le 5 \\ 0 & \text{otherwise} \end{cases}$$
[Figure 4: rectangular pulse g(t) between t = 3 and t = 5]
Solution
$g(t)$ is a pulse of width 2 and can be obtained by shifting the symmetrical rectangular pulse
$$p_1(t) = \begin{cases} 1 & -1 \le t \le 1 \\ 0 & \text{otherwise} \end{cases}$$
by 4 units to the right.
Hence by putting $t_0 = 4$ in the time shift theorem
$$G(\omega) = \mathcal{F}\{g(t)\} = e^{-4i\omega}\frac{2}{\omega}\sin\omega.$$
Task
Verify the result of Example 2 by direct integration.
Your solution
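Answer
$$G(\omega) = \int_3^5 1\,e^{-i\omega t}\,dt = \left[\frac{e^{-i\omega t}}{-i\omega}\right]_3^5 = \frac{e^{-5i\omega} - e^{-3i\omega}}{-i\omega} = e^{-4i\omega}\frac{e^{i\omega} - e^{-i\omega}}{i\omega} = e^{-4i\omega}\,2\frac{\sin\omega}{\omega},$$
as obtained using the time-shift property.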
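The agreement between direct integration and the time shift property in Example 2 can also be seen numerically. The sketch below (function names, the sample frequencies and the midpoint-rule quadrature are assumptions) integrates $e^{-i\omega t}$ over $3 \le t \le 5$ and compares the result with $e^{-4i\omega}\frac{2\sin\omega}{\omega}$.

```python
import cmath
import math


def shifted_pulse_transform_numeric(omega, samples=20000):
    """Midpoint-rule approximation of integral_3^5 e^{-i omega t} dt."""
    dt = 2.0 / samples
    return sum(cmath.exp(-1j * omega * (3 + (k + 0.5) * dt)) for k in range(samples)) * dt


def shifted_pulse_transform_formula(omega):
    """e^{-4 i omega} * 2 sin(omega) / omega from the time shift property (omega != 0)."""
    return cmath.exp(-4j * omega) * 2 * math.sin(omega) / omega


for omega in (0.5, 1.0, 2.5):
    print(omega, shifted_pulse_transform_numeric(omega), shifted_pulse_transform_formula(omega))
```

The shift leaves the magnitude $\left|\frac{2\sin\omega}{\omega}\right|$ unchanged and only alters the phase, which is why the transform of the shifted pulse is no longer wholly real.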
Task
Use the frequency shift property to obtain the Fourier transform of the
modulated wave
g(t) = f (t) cos ω0 t
where f (t) is an arbitrary signal whose Fourier transform is F (ω).
First rewrite g(t) in terms of complex exponentials:

Your solution
Answer
$$g(t) = f(t)\left(\frac{e^{i\omega_0 t} + e^{-i\omega_0 t}}{2}\right) = \frac{1}{2}f(t)e^{i\omega_0 t} + \frac{1}{2}f(t)e^{-i\omega_0 t}$$
Now use the linearity property and the frequency shift property on each term to obtain G(ω):
Your solution
Answer
We have, by linearity:
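$$\mathcal{F}\{g(t)\} = \frac{1}{2}\mathcal{F}\{f(t)e^{i\omega_0 t}\} + \frac{1}{2}\mathcal{F}\{f(t)e^{-i\omega_0 t}\}$$
and by the frequency shift property:
$$G(\omega) = \frac{1}{2}F(\omega - \omega_0) + \frac{1}{2}F(\omega + \omega_0).$$
[Figure: F(ω), with peak value 1, and G(ω), with two copies of peak value 1/2 centred at ω = −ω0 and ω = ω0]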
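The modulation result can be illustrated with a concrete signal. The sketch below assumes f(t) is the unit pulse p1(t), whose transform 2 sin ω/ω is the standard result used earlier, and an illustrative carrier frequency ω0 = 5; the function names are hypothetical.

```python
import cmath
import math

def F(omega):
    """Transform of the unit pulse p_1(t): 2 sin(omega)/omega (value 2 at omega = 0)."""
    return 2.0 if omega == 0 else 2.0 * math.sin(omega) / omega

def G_direct(omega, omega0, n=20000):
    """Midpoint-rule integral of p_1(t) cos(omega0 t) e^{-i omega t} over -1 <= t <= 1."""
    h = 2.0 / n
    total = 0j
    for k in range(n):
        t = -1.0 + (k + 0.5) * h
        total += math.cos(omega0 * t) * cmath.exp(-1j * omega * t)
    return total * h

def G_modulation(omega, omega0):
    """Modulation result: (1/2) F(omega - omega0) + (1/2) F(omega + omega0)."""
    return 0.5 * F(omega - omega0) + 0.5 * F(omega + omega0)

omega0 = 5.0  # illustrative carrier frequency (an assumption, not from the text)
for omega in (0.0, 3.0, 5.0, 7.5):
    print(omega, G_direct(omega, omega0).real, G_modulation(omega, omega0))
```

The two printed columns should agree closely, with the largest values near ω = ±ω0.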
3. Inversion of the Fourier transform

Formal inversion of the Fourier transform, i.e. finding f (t) for a given F (ω), is sometimes possible
using the inversion integral (4). However, in elementary cases, we can use a Table of standard Fourier
transforms together, if necessary, with the appropriate properties of the Fourier transform.
The following Examples and Tasks involve such inversion.
Example 3
Find the inverse Fourier transform of $F(\omega) = 20\,\dfrac{\sin 5\omega}{5\omega}$.
Solution
The appearance of the sine function implies that f (t) is a symmetric rectangular pulse.
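We know the standard form $\mathcal{F}\{p_a(t)\} = 2a\,\dfrac{\sin\omega a}{\omega a}$ or $\mathcal{F}^{-1}\left\{2a\,\dfrac{\sin\omega a}{\omega a}\right\} = p_a(t)$.
Putting $a = 5$ gives $\mathcal{F}^{-1}\left\{10\,\dfrac{\sin 5\omega}{5\omega}\right\} = p_5(t)$. Thus, by the linearity property
$$f(t) = \mathcal{F}^{-1}\left\{20\,\frac{\sin 5\omega}{5\omega}\right\} = 2p_5(t)$$
[Figure 4: graph of f(t), a rectangular pulse of height 2 on −5 ≤ t ≤ 5]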
Example 4
Find the inverse Fourier transform of $G(\omega) = 20\,\dfrac{\sin 5\omega}{5\omega}\exp(-3i\omega)$.
Solution
The occurrence of the complex exponential factor in the Fourier transform suggests the time-shift
property with the time shift t0 = +3 (i.e. a right shift).
From Example 3
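$$\mathcal{F}^{-1}\left\{20\,\frac{\sin 5\omega}{5\omega}\right\} = 2p_5(t) \quad\text{so}\quad g(t) = \mathcal{F}^{-1}\left\{20\,\frac{\sin 5\omega}{5\omega}\,e^{-3i\omega}\right\} = 2p_5(t - 3)$$
[Figure 5: graph of g(t), a rectangular pulse of height 2 on −2 ≤ t ≤ 8]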
Task
Find the inverse Fourier transform of
$$H(\omega) = 6\,\frac{\sin 2\omega}{\omega}\,e^{-4i\omega}.$$
Firstly ignore the exponential factor and find the inverse Fourier transform of the remaining terms:
Your solution
Answer
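We use the result: $\mathcal{F}^{-1}\left\{2a\,\dfrac{\sin\omega a}{\omega a}\right\} = p_a(t)$
Putting $a = 2$ gives $\mathcal{F}^{-1}\left\{2\,\dfrac{\sin 2\omega}{\omega}\right\} = p_2(t)$ $\therefore$ $\mathcal{F}^{-1}\left\{6\,\dfrac{\sin 2\omega}{\omega}\right\} = 3p_2(t)$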
Now take account of the exponential factor:

Your solution
Answer
Using the time-shift theorem for t0 = 4
$$h(t) = \mathcal{F}^{-1}\left\{6\,\frac{\sin 2\omega}{\omega}\,e^{-4i\omega}\right\} = 3p_2(t - 4)$$
[Figure: graph of h(t), a rectangular pulse of height 3 on 2 ≤ t ≤ 6]
Example 5
Find the inverse Fourier transform of
$$K(\omega) = \frac{2}{1 + 2(\omega - 1)i}$$
Solution
The presence of the term (ω − 1) instead of ω suggests the frequency shift property.
Hence, we consider first
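$$\hat{K}(\omega) = \frac{2}{1 + 2i\omega}.$$
The relevant standard form is
$$\mathcal{F}\{e^{-\alpha t}u(t)\} = \frac{1}{\alpha + i\omega} \quad\text{or}\quad \mathcal{F}^{-1}\left\{\frac{1}{\alpha + i\omega}\right\} = e^{-\alpha t}u(t).$$
Hence, writing $\hat{K}(\omega) = \dfrac{1}{\frac{1}{2} + i\omega}$, we have $\hat{k}(t) = e^{-\frac{1}{2}t}u(t)$.
Then, by the frequency shift property with $\omega_0 = 1$,
$$k(t) = \mathcal{F}^{-1}\left\{\frac{2}{1 + 2(\omega - 1)i}\right\} = e^{-\frac{1}{2}t}e^{it}u(t).$$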
Here k(t) is a complex time-domain signal.
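As a check on Example 5, the complex signal k(t) can be transformed numerically and compared with K(ω). This is a sketch only, assuming a midpoint rule and a truncated upper limit (both illustrative choices, as is the name `transform_of_k`).

```python
import cmath

def K(omega):
    """Transform given in Example 5."""
    return 2.0 / (1.0 + 2.0 * (omega - 1.0) * 1j)

def k(t):
    """Inverse transform found in Example 5: e^{-t/2} e^{it} u(t)."""
    return cmath.exp((-0.5 + 1j) * t) if t >= 0 else 0.0

def transform_of_k(omega, t_max=60.0, n=60000):
    """Midpoint-rule integral of k(t) e^{-i omega t} over 0 <= t <= t_max.

    The upper limit is truncated; e^{-t_max/2} is negligible for t_max = 60.
    """
    h = t_max / n
    total = 0j
    for j in range(n):
        t = (j + 0.5) * h
        total += k(t) * cmath.exp(-1j * omega * t)
    return total * h

# The differences printed here should be very small.
for omega in (-1.0, 1.0, 3.0):
    print(omega, abs(transform_of_k(omega) - K(omega)))
```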
Task
Find the inverse Fourier transforms of
(a) $L(\omega) = 2\,\dfrac{\sin\{3(\omega - 2\pi)\}}{(\omega - 2\pi)}$ (b) $M(\omega) = \dfrac{e^{i\omega}}{1 + i\omega}$
Your solution
Answer
(a) Using the frequency shift property with ω0 = 2π
$l(t) = \mathcal{F}^{-1}\{L(\omega)\} = p_3(t)e^{i2\pi t}$
(b) Using the time shift property with $t_0 = -1$
$m(t) = e^{-(t+1)}u(t + 1)$
[Figure: graph of m(t), a decaying exponential starting at t = −1]
4. Further properties of the Fourier transform

We state these properties without proof. As usual F (ω) denotes the Fourier transform of f (t).
(a) Time differentiation property:
$\mathcal{F}\{f'(t)\} = i\omega F(\omega)$
(Differentiating a function is said to amplify the higher frequency components because of the additional multiplying factor $\omega$.)
(b) Frequency differentiation property:
$\mathcal{F}\{tf(t)\} = i\,\dfrac{dF}{d\omega}$ or $\mathcal{F}\{(-it)f(t)\} = \dfrac{dF}{d\omega}$
Note the symmetry between properties (a) and (b).

(c) Duality property:
If F{f (t)} = F (ω) then F{F (t)} = 2πf (−ω).
Informally, the duality property states that we can, apart from the 2π factor, interchange the time
and frequency domains provided we put −ω rather than ω in the second term, this corresponding to
a reflection in the vertical axis. If f (t) is even this latter is irrelevant.

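For example, we know that if $f(t) = p_1(t) = \begin{cases} 1 & -1 < t < 1 \\ 0 & \text{otherwise} \end{cases}$, then $F(\omega) = 2\,\dfrac{\sin\omega}{\omega}$.
Then, by the duality property, since $p_1(\omega)$ is even, $\mathcal{F}\left\{2\,\dfrac{\sin t}{t}\right\} = 2\pi p_1(-\omega) = 2\pi p_1(\omega)$.
Graphically:
[Figure 6: top row, the pulse p1(t) of height 1 on −1 < t < 1 and its transform P1(ω); bottom row, the signal P1(t) and its transform 2πp1(ω), a pulse of height 2π on −1 < ω < 1]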
Task
Recalling the Fourier transform pair
$$f(t) = \begin{cases} e^{-2t} & t > 0 \\ e^{2t} & t < 0 \end{cases} \qquad F(\omega) = \frac{4}{4 + \omega^2},$$
obtain the Fourier transforms of
(a) $g(t) = \dfrac{1}{4 + t^2}$ (b) $h(t) = \dfrac{1}{4 + t^2}\cos 2t$.
(a) Use the linearity and duality properties:
Your solution
Answer
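We have $\mathcal{F}\{f(t)\} \equiv \mathcal{F}\{e^{-2|t|}\} = \dfrac{4}{4 + \omega^2}$. $\therefore$ $\mathcal{F}\left\{\dfrac{1}{4}e^{-2|t|}\right\} = \dfrac{1}{4 + \omega^2}$ (by linearity)
$\therefore$ $\mathcal{F}\left\{\dfrac{1}{4 + t^2}\right\} = 2\pi\,\dfrac{1}{4}\,e^{-2|-\omega|} = \dfrac{\pi}{2}e^{-2|\omega|} = G(\omega)$ (by duality).
[Figure: f(t) with peak value 1 and its transform F(ω) with peak value 1; g(t) with peak value 1/4 and its transform G(ω) with peak value π/2]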
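The duality result for g(t) = 1/(4 + t²) can be checked by direct numerical integration. The sketch below is illustrative only: it assumes a midpoint rule, uses the evenness of g(t) to reduce the transform to a cosine integral, and truncates the infinite range at an arbitrary `t_max`, so agreement is only approximate (and is poorer near ω = 0, where the truncated tail matters most).

```python
import math

def G(omega):
    """Result obtained by duality: (pi/2) e^{-2|omega|}."""
    return 0.5 * math.pi * math.exp(-2.0 * abs(omega))

def G_numeric(omega, t_max=400.0, n=200000):
    """Approximate the transform of g(t) = 1/(4 + t^2).

    g is even, so the transform equals 2 * integral_0^inf g(t) cos(omega t) dt.
    The infinite range is truncated at t_max (midpoint rule).
    """
    h = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(omega * t) / (4.0 + t * t)
    return 2.0 * total * h

for omega in (0.5, 1.0, 2.0):
    print(omega, G_numeric(omega), G(omega))
```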
(b) Use the modulation property based on the frequency shift property:
Your solution
Answer
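We have $h(t) = g(t)\cos 2t$. $\therefore$ $\mathcal{F}\{g(t)\cos\omega_0 t\} = \frac{1}{2}\left(G(\omega - \omega_0) + G(\omega + \omega_0)\right)$,
so with $\omega_0 = 2$
$$\mathcal{F}\{h(t)\} = \frac{\pi}{4}\left(e^{-2|\omega - 2|} + e^{-2|\omega + 2|}\right) = H(\omega)$$
[Figure: graph of H(ω), with peaks at ω = −2 and ω = 2]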
Exercises
1. Using the superposition and time delay theorems and the known result for the transform of the
rectangular pulse p(t), obtain the Fourier transforms of each of the signals shown.
[Figure: graphs of the four signals (a) $x_a(t)$, (b) $x_b(t)$, (c) $x_c(t)$ and (d) $x_d(t)$]
2. Obtain the Fourier transform of the signal
f (t) = e−t u(t) + e−2t u(t)
where u(t) denotes the unit step function.
3. Use the time-shift property to obtain the Fourier transform of


$$f(t) = \begin{cases} 1 & 1 \le t \le 3 \\ 0 & \text{otherwise} \end{cases}$$

Verify your result using the definition of the Fourier transform.
4. Find the inverse Fourier transforms of

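(a) $F(\omega) = 20\,\dfrac{\sin(5\omega)}{5\omega}\,e^{-3i\omega}$
(b) $F(\omega) = \dfrac{8}{\omega}\sin 3\omega\; e^{i\omega}$
(c) $F(\omega) = \dfrac{e^{i\omega}}{1 - i\omega}$
5. If $f(t)$ is a signal with transform $F(\omega)$ obtain the Fourier transform of $f(t)\cos(\omega_0 t)\cos(\omega_0 t)$.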
Answer
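1. $X_a(\omega) = \dfrac{4}{\omega}\sin\left(\dfrac{\omega}{2}\right)\cos\left(\dfrac{3\omega}{2}\right)$
$X_b(\omega) = \dfrac{-4i}{\omega}\sin\left(\dfrac{\omega}{2}\right)\sin\left(\dfrac{3\omega}{2}\right)$
$X_c(\omega) = \dfrac{2}{\omega}\left[\sin(2\omega) + \sin(\omega)\right]$
$X_d(\omega) = \dfrac{2}{\omega}\left[\sin\left(\dfrac{3\omega}{2}\right) + \sin\left(\dfrac{\omega}{2}\right)\right]e^{-3i\omega/2}$
2. $F(\omega) = \dfrac{3 + 2i\omega}{2 - \omega^2 + 3i\omega}$ (using the superposition property)
3. $F(\omega) = 2\,\dfrac{\sin\omega}{\omega}\,e^{-2i\omega}$
4. (a) $f(t) = \begin{cases} 2 & -2 < t < 8 \\ 0 & \text{otherwise} \end{cases}$
(b) $f(t) = \begin{cases} 4 & -4 < t < 2 \\ 0 & \text{otherwise} \end{cases}$
(c) $f(t) = \begin{cases} e^{t+1} & t < -1 \\ 0 & \text{otherwise} \end{cases}$
5. $\dfrac{1}{2}F(\omega) + \dfrac{1}{4}\left[F(\omega + 2\omega_0) + F(\omega - 2\omega_0)\right]$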
Some Special Fourier
Transform Pairs 24.3
Introduction
In this final Section on Fourier transforms we shall study briefly a number of topics such as Parseval’s
theorem and the relationship between Fourier transform and Laplace transforms. In particular we
shall obtain, intuitively rather than rigorously, various Fourier transforms of functions such as the unit
step function which actually violate the basic conditions which guarantee the existence of Fourier
transforms!

Prerequisites
Before starting this Section you should . . .
• be aware of the definitions and simple properties of the Fourier transform and inverse Fourier transform.

Learning Outcomes
On completion you should be able to . . .
• use the unit impulse function (the Dirac delta function) to obtain various Fourier transforms

Section 24.3: Some Special Fourier transform Pairs
1. Parseval’s theorem
Recall from 23.2 on Fourier series that for a periodic signal $f_T(t)$ with complex Fourier coefficients $c_n$ ($n = 0, \pm 1, \pm 2, \ldots$) Parseval’s theorem holds:
$$\frac{1}{T}\int_{-T/2}^{+T/2} f_T^2(t)\,dt = \sum_{n=-\infty}^{\infty} |c_n|^2,$$
where the left-hand side is the mean square value of the function (signal) over one period.
For a non-periodic real signal f (t) with Fourier transform F (ω) the corresponding result is
$$\int_{-\infty}^{\infty} f^2(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} |F(\omega)|^2\,d\omega.$$
This result is particularly significant in filter theory. For reasons that we do not have space to go
into, the left-hand side integral is often referred to as the total energy of the signal. The integrand
on the right-hand side
$$\frac{1}{2\pi}|F(\omega)|^2$$
is then referred to as the energy density (because it is the frequency domain quantity that has to
be integrated to obtain the total energy).
Task
Verify Parseval’s theorem using the one-sided exponential function
f (t) = e−t u(t).
Firstly evaluate the integral on the left-hand side:

Your solution
Answer
$$\int_{-\infty}^{\infty} f^2(t)\,dt = \int_0^{\infty} e^{-2t}\,dt = \left[\frac{e^{-2t}}{-2}\right]_0^{\infty} = \frac{1}{2}.$$
Now obtain the Fourier transform F (ω) and evaluate the right-hand side integral:
Your solution
Answer
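$$F(\omega) = \mathcal{F}\{e^{-t}u(t)\} = \frac{1}{1 + i\omega},$$
so
$$|F(\omega)|^2 = \frac{1}{(1 + i\omega)}\cdot\frac{1}{(1 - i\omega)} = \frac{1}{1 + \omega^2}.$$
Then
$$\frac{1}{2\pi}\int_{-\infty}^{\infty} |F(\omega)|^2\,d\omega = \frac{1}{\pi}\int_0^{\infty} |F(\omega)|^2\,d\omega = \frac{1}{\pi}\int_0^{\infty}\frac{1}{1 + \omega^2}\,d\omega = \frac{1}{\pi}\left[\tan^{-1}\omega\right]_0^{\infty} = \frac{1}{\pi}\times\frac{\pi}{2} = \frac{1}{2}.$$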
Since both integrals give the same value, Parseval’s theorem is verified for this case.
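Both sides of Parseval's theorem for this signal can also be estimated numerically. The following is a rough sketch, assuming a midpoint rule and finite integration limits (`t_max`, `w_max` are arbitrary choices); because |F(ω)|² decays only like 1/ω², the frequency-domain estimate converges slowly as the limit is increased.

```python
import math

def energy_time(t_max=40.0, n=40000):
    """Midpoint-rule value of the integral of f(t)^2 = e^{-2t} over 0 <= t <= t_max."""
    h = t_max / n
    return sum(math.exp(-2.0 * (k + 0.5) * h) for k in range(n)) * h

def energy_frequency(w_max=2000.0, n=200000):
    """Midpoint-rule value of (1/2pi) * integral of |F(w)|^2 over -w_max <= w <= w_max,
    with F(w) = 1/(1 + i w)."""
    h = 2.0 * w_max / n
    total = 0.0
    for k in range(n):
        w = -w_max + (k + 0.5) * h
        total += abs(1.0 / (1.0 + 1j * w)) ** 2
    return total * h / (2.0 * math.pi)

print(energy_time())       # close to 1/2
print(energy_frequency())  # close to 1/2, slightly low because the tails are cut off
```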
2. Existence of Fourier transforms

Formally, sufficient conditions for the Fourier transform of a function f (t) to exist are
(a) $\int_{-\infty}^{\infty} |f(t)|^2\,dt$ is finite
(b) f (t) has a finite number of maxima and minima in any finite interval
(c) f (t) has a finite number of discontinuities.
Like the equivalent conditions for the existence of Fourier series these conditions are known as
Dirichlet conditions.
If the above conditions hold then f (t) has a unique Fourier transform. However certain functions,
such as the unit step function, which violate one or more of the Dirichlet conditions still have Fourier
transforms in a more generalized sense as we shall see shortly.
3. Fourier transform and Laplace transforms

Suppose f (t) = 0 for t < 0. Then the Fourier transform of f (t) becomes
$$\mathcal{F}\{f(t)\} = \int_0^{\infty} f(t)e^{-i\omega t}\,dt. \qquad (1)$$
As you may recall from earlier units, the Laplace transform of $f(t)$ is
$$\mathcal{L}\{f(t)\} = \int_0^{\infty} f(t)e^{-st}\,dt. \qquad (2)$$
Comparison of (1) and (2) suggests that for such one-sided functions, the Fourier transform of f (t)
can be obtained by simply replacing s by iω in the Laplace transform.
An obvious example where this can be done is the function
$$f(t) = e^{-\alpha t}u(t).$$
In this case $\mathcal{L}\{f(t)\} = \dfrac{1}{\alpha + s} = F(s)$ and, as we have seen earlier,
$$\mathcal{F}\{f(t)\} = \frac{1}{\alpha + i\omega} = F(i\omega).$$
However, care must be taken with such substitutions. We must be sure that the conditions for the
existence of the Fourier transform are met. Thus, for the unit step function,
$$\mathcal{L}\{u(t)\} = \frac{1}{s},$$
whereas $\mathcal{F}\{u(t)\} \neq \dfrac{1}{i\omega}$. (We shall see that $\mathcal{F}\{u(t)\}$ does actually exist but is not equal to $\dfrac{1}{i\omega}$.)
We should also point out that some of the properties we have discussed for Fourier transforms are
similar to those of the Laplace transforms e.g. the time-shift properties:
Fourier: $\mathcal{F}\{f(t - t_0)\} = e^{-i\omega t_0}F(\omega)$ Laplace: $\mathcal{L}\{f(t - t_0)\} = e^{-st_0}F(s)$.
4. Some special Fourier transform pairs

As mentioned in the previous subsection it is possible to obtain Fourier transforms for some important
functions that violate the Dirichlet conditions. To discuss this situation we must introduce the unit
impulse function, also known as the Dirac delta function. We shall study this topic in an intuitive,
rather than rigorous, fashion.
Recall that a symmetrical rectangular pulse

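$$p_a(t) = \begin{cases} 1 & -a < t < a \\ 0 & \text{otherwise} \end{cases}$$
has a Fourier transform
$$P_a(\omega) = \frac{2}{\omega}\sin\omega a.$$
If we consider a pulse whose height is $\dfrac{1}{2a}$ rather than 1 (so that the pulse encloses unit area), then we have, by the linearity property of Fourier transforms,
$$\mathcal{F}\left\{\frac{1}{2a}p_a(t)\right\} = \frac{\sin\omega a}{\omega a}.$$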
As the value of a becomes smaller, the rectangular pulse becomes narrower and taller but still has
unit area.
[Figure 7: the pulses $\frac{1}{2a}p_a(t)$ for $a = 1$, $a = \frac{1}{2}$ and $a = \frac{1}{4}$, of heights $\frac{1}{2}$, 1 and 2 respectively, each enclosing unit area]
We define the unit impulse function δ(t) as
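$$\delta(t) = \lim_{a \to 0}\frac{1}{2a}p_a(t)$$
and show it graphically as follows:
[Figure 8: the unit impulse δ(t), located at t = 0]
Then,
$$\mathcal{F}\{\delta(t)\} = \mathcal{F}\left\{\lim_{a \to 0}\frac{1}{2a}p_a(t)\right\} = \lim_{a \to 0}\mathcal{F}\left\{\frac{1}{2a}p_a(t)\right\} = \lim_{a \to 0}\frac{\sin\omega a}{\omega a} = 1.$$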
Here we have assumed that interchanging the order of taking the Fourier transform with the limit
operation is valid.
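The limit sin(ωa)/(ωa) → 1 as a → 0 is easy to observe numerically. A minimal sketch follows; the function name and the sample frequency ω = 5 are illustrative assumptions.

```python
import math

def scaled_pulse_transform(omega, a):
    """Transform of (1/2a) p_a(t): sin(omega a)/(omega a), with value 1 when omega a = 0."""
    x = omega * a
    return 1.0 if x == 0 else math.sin(x) / x

omega = 5.0  # any fixed frequency; the value 5 is only illustrative
for a in (1.0, 0.1, 0.01, 0.001):
    # The printed values approach 1 as the pulse becomes narrower.
    print(a, scaled_pulse_transform(omega, a))
```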
Now consider a shifted unit impulse δ(t − t0 ):
[Figure 9: the shifted unit impulse δ(t − t0), located at t = t0]
We have, by the time shift property
$$\mathcal{F}\{\delta(t - t_0)\} = e^{-i\omega t_0}(1) = e^{-i\omega t_0}.$$
These results are summarized in the following Key Point:
Key Point 4
The Fourier transform of a Unit Impulse
$\mathcal{F}\{\delta(t - t_0)\} = e^{-i\omega t_0}$.
If $t_0 = 0$ then $\mathcal{F}\{\delta(t)\} = 1$.
Task
Apply the duality property to the result
F{δ(t)} = 1.
(From the way we have introduced the unit impulse function it must clearly be
treated as an even function.)
Your solution
Answer
We have F{δ(t)} = 1. Therefore by the duality property
F{1} = 2πδ(−ω) = 2πδ(ω).
We see that the signal
f (t) = 1, −∞ < t < ∞
which is infinitely wide, has Fourier transform F (ω) = 2πδ(ω) which is infinitesimally narrow. This
reciprocal effect is characteristic of Fourier transforms.
[Figure: the constant signal f(t) = 1 and its transform F(ω) = 2πδ(ω), an impulse at ω = 0]
This result is intuitively plausible since a constant signal would be expected to have a frequency
representation which had only a component at zero frequency (ω = 0).
Task
Use the result F{1} = 2πδ(ω) and the frequency shift property to obtain
F{eiω0 t }.
Your solution
Answer
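$\mathcal{F}\{e^{i\omega_0 t}\} = \mathcal{F}\{e^{i\omega_0 t}f(t)\}$ where $f(t) = 1$, $-\infty < t < \infty$.
But $\mathcal{F}\{f(t)\} = 2\pi\delta(\omega)$, therefore, by the frequency shift property $\mathcal{F}\{e^{i\omega_0 t}\} = 2\pi\delta(\omega - \omega_0)$.
[Figure: $\mathcal{F}\{e^{i\omega_0 t}\} = 2\pi\delta(\omega - \omega_0)$, an impulse at ω = ω0]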
Task
Obtain the Fourier transform of a pure cosine wave
f (t) = cos ω0 t −∞<t<∞
by writing f (t) in terms of complex exponentials and using the result of the previous
Task.
Your solution
Answer
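We have $f(t) = \cos\omega_0 t = \frac{1}{2}\left(e^{i\omega_0 t} + e^{-i\omega_0 t}\right)$
so
$$\mathcal{F}\{\cos\omega_0 t\} = \frac{1}{2}\mathcal{F}\{e^{i\omega_0 t}\} + \frac{1}{2}\mathcal{F}\{e^{-i\omega_0 t}\} = \pi\delta(\omega - \omega_0) + \pi\delta(\omega + \omega_0)$$
[Figure: F(ω), consisting of impulses at ω = −ω0 and ω = ω0]
Note that because $\int_{-\infty}^{\infty} |\cos\omega_0 t|\,dt$ diverges, one of the Dirichlet conditions is violated. Nevertheless, as we can see via the use of the unit impulse functions, the Fourier transform of $\cos\omega_0 t$ exists.
By similar reasoning we can readily show
$$\mathcal{F}\{\sin\omega_0 t\} = \frac{\pi}{i}\delta(\omega - \omega_0) - \frac{\pi}{i}\delta(\omega + \omega_0).$$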
Note that the usual results for Fourier transforms of even and odd functions still hold.
5. Fourier transform of the unit step function

We have already pointed out that although
$$\mathcal{L}\{u(t)\} = \frac{1}{s}$$
we cannot simply replace s by iω to obtain the Fourier transform of the unit step.
We proceed via the Fourier transform of the signum function sgn(t) which is defined as

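$$\operatorname{sgn} t = \begin{cases} 1 & t > 0 \\ -1 & t < 0 \end{cases}$$
[Figure 10: graph of sgn(t), equal to 1 for t > 0 and −1 for t < 0]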
We obtain F{sgn(t)} as follows.
Consider the odd two-sided exponential function fα (t) defined as
$$f_\alpha(t) = \begin{cases} e^{-\alpha t} & t > 0 \\ -e^{\alpha t} & t < 0 \end{cases},$$
where $\alpha > 0$:
[Figure 11: graph of $f_\alpha(t)$]
By slightly adapting our earlier calculation for the even two-sided exponential function we find
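$$\mathcal{F}\{f_\alpha(t)\} = -\frac{1}{(\alpha - i\omega)} + \frac{1}{(\alpha + i\omega)} = \frac{-(\alpha + i\omega) + (\alpha - i\omega)}{\alpha^2 + \omega^2} = -\frac{2i\omega}{\alpha^2 + \omega^2}.$$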
The parameter α controls how rapidly the exponential function varies:
[Figure 12: graphs of $f_\alpha(t)$ for three values $\alpha_1 > \alpha_2 > \alpha_3$]
As we let α → 0 the exponential function resembles more and more closely the signum function.
This suggests that
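$$\mathcal{F}\{\operatorname{sgn}(t)\} = \lim_{\alpha \to 0}\mathcal{F}\{f_\alpha(t)\} = \lim_{\alpha \to 0}\left(-\frac{2i\omega}{\alpha^2 + \omega^2}\right) = -\frac{2i}{\omega} = \frac{2}{i\omega}.$$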
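The limiting process can be observed numerically at any fixed non-zero frequency. This is a small sketch with hypothetical function names; the frequency ω = 2 is an arbitrary illustrative choice.

```python
def F_alpha(omega, alpha):
    """Transform of the odd two-sided exponential: -2 i omega / (alpha^2 + omega^2)."""
    return -2j * omega / (alpha ** 2 + omega ** 2)

def F_sgn(omega):
    """Limiting form 2/(i omega), valid for omega != 0."""
    return 2.0 / (1j * omega)

omega = 2.0  # illustrative non-zero frequency
for alpha in (1.0, 0.1, 0.01, 0.001):
    # The gap between the two expressions shrinks as alpha tends to 0.
    print(alpha, abs(F_alpha(omega, alpha) - F_sgn(omega)))
```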
Task
Write the unit step function in terms of the signum function and hence obtain
F{u(t)}.
First express u(t) in terms of sgn(t):

Your solution
Answer
From the graphs
[Figure: graphs of sgn(t), taking the values −1 and 1, and of u(t), taking the values 0 and 1]
the step function can be obtained by adding 1 to the signum function for all t and then dividing the
resulting function by 2 i.e.
$$u(t) = \frac{1}{2}(1 + \operatorname{sgn}(t)).$$
Now, using the linearity property and previously obtained Fourier transforms, find F{u(t)} :
Your solution
Answer
We have, using linearity,
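$$\mathcal{F}\{u(t)\} = \frac{1}{2}\mathcal{F}\{1\} + \frac{1}{2}\mathcal{F}\{\operatorname{sgn}(t)\} = \frac{1}{2}\,2\pi\delta(\omega) + \frac{1}{2}\,\frac{2}{i\omega} = \pi\delta(\omega) + \frac{1}{i\omega}$$
Thus, the Fourier transform of the unit step function contains the additional impulse term $\pi\delta(\omega)$ as well as the odd term $\dfrac{1}{i\omega}$.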
Exercises
1. Use Parseval’s theorem and the Fourier transform of a ‘two-sided’ exponential function to show that
$$\int_{-\infty}^{\infty}\frac{d\omega}{(a^2 + \omega^2)^2} = \frac{\pi}{2|a|^3}$$
2. Using $\mathcal{F}\{\operatorname{sgn}(t)\} = \dfrac{2}{i\omega}$ find the Fourier transforms of (a) $f_1(t) = \dfrac{1}{t}$ (b) $f_2(t) = |t|$
Hence obtain the transforms of (c) $f_3(t) = -\dfrac{1}{t^2}$ (d) $f_4(t) = \dfrac{2}{t^3}$
3. Show that
F{sin ω0 t} = iπ[δ(ω + ω0 ) − δ(ω − ω0 )]
Verify your result using inverse Fourier transform properties.

Answers
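2 (a) $\mathcal{F}\left\{\dfrac{1}{t}\right\} = -\pi i\,\operatorname{sgn}(\omega)$ (by the duality property)
(b) $\mathcal{F}\{|t|\} = -\dfrac{2}{\omega^2}$
(c) $\mathcal{F}\left\{-\dfrac{1}{t^2}\right\} = \pi\omega\,\operatorname{sgn}(\omega) = \begin{cases} \pi\omega, & \omega > 0 \\ -\pi\omega, & \omega < 0 \end{cases}$
(d) $\mathcal{F}\left\{\dfrac{1}{t^3}\right\} = \dfrac{i\pi\omega^2}{2}\,\operatorname{sgn}(\omega)$
(Using time differentiation property in (b), (c) and (d).)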
Contents 25
Partial Differential
Equations
25.1 Partial Differential Equations 2
25.2 Applications of PDEs 11
25.3 Solution Using Separation of Variables 19
25.4 Solutions Using Fourier Series 35
Learning outcomes
By studying this Workbook you will learn to recognise the two-dimensional Laplace's
equation and the one-dimensional diffusion and wave equations.
You will learn how to verify solutions of these equations and how to find solutions by
using the separation of variables method and by using Fourier series.
Partial Differential
Equations 25.1
Introduction
A partial differential equation (PDE) is a differential equation involving partial derivatives of one
dependent variable with respect to two or more independent variables. The independent variables
may be space variables only or one or more space variables and time. Mathematical modelling of
many situations involving natural phenomena leads to PDEs.
The subject of PDEs is a very large one. We shall discuss only a few special PDEs which model a
wide range of applied problems.

Prerequisites
Before starting this Section you should . . .
• be able to carry out partial differentiation
• be able to solve constant coefficient ordinary differential equations

Learning Outcomes
On completion you should be able to . . .
• verify solutions of given partial differential equations arising in engineering and science

Workbook 25: Partial Differential Equations
1. Introduction
You have already studied ordinary differential equations (ODEs) and have learnt how to obtain the
solution of certain types. Since a knowledge of the solution of certain ODEs (i.e. those with constant
coefficients) will be required in solving partial differential equations (PDEs), we will begin this unit
reminding you of some important results.
Key Point 1
The first order ODE
$$\frac{dy}{dx} = ky$$
has general solution
$$y = Ae^{kx}$$
Here k is a constant which can be positive or negative and A is an arbitrary constant.
In Key Point 1 the quantity A in the general solution is a constant. To obtain the value of A we
have to know the value of y at some value of x, perhaps x = 0. In other words, we need to know an
initial condition.
Task
Find y as a function of x if
$$\frac{dy}{dx} = -2y$$
and the initial condition is y(0) = 3.
Your solution
Answer
From Key Point 1 with $k = -2$ we have the general solution
$$y = Ae^{-2x}$$
Putting $x = 0$ and $y = 3$ into this we obtain $3 = Ae^0$ i.e. $A = 3$ so the solution to the given initial value problem is
$$y = 3e^{-2x}$$
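The same steps (fix A from the initial condition, then confirm that the ODE is satisfied) can be sketched in a few lines of code. The helper name `first_order_solution` is hypothetical, and the derivative is only estimated by a central difference.

```python
import math

def first_order_solution(k, x0, y0):
    """Return y(x) = A e^{kx} with A fixed by the condition y(x0) = y0."""
    A = y0 / math.exp(k * x0)
    return lambda x: A * math.exp(k * x)

y = first_order_solution(k=-2.0, x0=0.0, y0=3.0)  # the Task: dy/dx = -2y, y(0) = 3

# Central-difference check that dy/dx + 2y is (numerically) zero at a few points.
h = 1e-6
for x in (0.0, 0.5, 1.0):
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    print(x, y(x), dydx + 2.0 * y(x))
```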
Section 25.1: Partial Differential Equations
We shall also need to be familiar with solutions to second order, homogeneous, constant coefficient
ODEs, summarised in Key Point 2.
Key Point 2
A second order ODE of the form
$$a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0 \qquad (1)$$
where $a$, $b$, $c$ are constants, has an auxiliary equation
$$am^2 + bm + c = 0 \qquad (2)$$
obtained by inserting the trial solution $y = e^{mx}$ in (1).
The general solution of (1) then depends on the solutions (or roots) of the quadratic equation (2).
(a) If (2) has real, distinct roots $m = m_1$ and $m = m_2$ then
$$y = Ae^{m_1 x} + Be^{m_2 x}$$
(b) If (2) has a repeated root $m = m_1$ then
$$y = (A + Bx)e^{m_1 x}$$
(c) If (2) has complex roots (which will be a conjugate pair) $m = \alpha \pm j\beta$ then
$$y = e^{\alpha x}(A\cos\beta x + B\sin\beta x)$$
Note that in each of these cases (a) to (c) the general solution is a linear combination of two particular solutions:
For (a) they are $e^{m_1 x}$ and $e^{m_2 x}$.
For (b) they are $e^{m_1 x}$ and $xe^{m_1 x}$.
For (c) they are $e^{\alpha x}\cos\beta x$ and $e^{\alpha x}\sin\beta x$.
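The case analysis of Key Point 2 can be expressed as a short routine. This is a sketch under stated assumptions: the function name and the tuple it returns are illustrative, `a` is taken to be non-zero, and the test for a repeated root uses exact equality, which is only reliable for coefficients such as small integers.

```python
import math

def general_solution_form(a, b, c):
    """Classify the roots of a m^2 + b m + c = 0 (a != 0) as in Key Point 2.

    Returns ("distinct", m1, m2), ("repeated", m1) or ("complex", alpha, beta).
    """
    disc = b * b - 4 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        return ("distinct", (-b + root) / (2 * a), (-b - root) / (2 * a))
    if disc == 0:
        return ("repeated", -b / (2 * a))
    return ("complex", -b / (2 * a), math.sqrt(-disc) / abs(2 * a))

print(general_solution_form(1, 0, -4))  # case (a): y = A e^{m1 x} + B e^{m2 x}
print(general_solution_form(1, 2, 1))   # case (b): y = (A + B x) e^{m1 x}
print(general_solution_form(1, 0, 9))   # case (c): y = e^{alpha x}(A cos(beta x) + B sin(beta x))
```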
Task
Use Key Point 2 to find the general solution of $\dfrac{d^2y}{dx^2} - 4y = 0$.
First write down the auxiliary equation:
Your solution
Answer
$m^2 - 4 = 0$
Now find the roots of the auxiliary equation:

Your solution
Answer
m = ±2
Finally give the general solution to the ODE:

Your solution
Answer
$y = Ae^{2x} + Be^{-2x}$ (Since the roots of the auxiliary equation are real and distinct.)
Task
Find the general solution of $\dfrac{d^2y}{dx^2} + 9y = 0$
First write down the auxiliary equation:
Your solution
Answer
$m^2 + 9 = 0$
Now find the roots of this auxiliary equation:

Your solution
Answer
m = ±3i
Finally give the general solution to the ODE:

Your solution
Answer
y = A cos 3x + B sin 3x
(Since the roots of the auxiliary equation are complex conjugates with real part α = 0 and imaginary
part β = 3.)
The two Tasks above can be generalised as in Key Point 3.
Key Point 3
(1) The general solution to: $\dfrac{d^2y}{dx^2} - n^2y = 0$ is
$$y = Ae^{nx} + Be^{-nx}$$
or, equivalently using hyperbolic functions,
$$y = C\cosh nx + D\sinh nx$$
(2) The general solution to: $\dfrac{d^2y}{dx^2} + n^2y = 0$ is
$$y = A\cos nx + B\sin nx$$
Those of you who are familiar with elementary dynamics will recognise the second differential equation
in Key Point 3 as modelling simple harmonic motion.
2. Partial differential equations

In all the above examples we had a function y of a single variable x, y being the solution of an
ordinary differential equation.
In engineering and science ODEs arise as models for systems where there is one independent variable
(often x) and one dependent variable (often y). Obvious examples are lumped electrical circuits
where the current i is a function only of time t (and not of position in the circuit) and lumped
mechanical systems (such as the simple harmonic oscillator referred to above) where the displacement
of a moving particle depends only on t.
However, in problems where one variable, say $u$, depends on more than one independent variable,
say both $x$ and $t$, then any derivatives of $u$ will be partial derivatives such as $\dfrac{\partial u}{\partial x}$ or $\dfrac{\partial^2 u}{\partial t^2}$ and any
differential equation arising will be known as a partial differential equation. In particular,
one-dimensional (1-D) time-dependent problems where $u$ depends on a position coordinate $x$ and the time
$t$ and two-dimensional (2-D) time-independent problems where $u$ is a function of the two position
coordinates $x$ and $y$ both give rise to PDEs involving two independent variables. This is the case
we shall concentrate on. A two-dimensional time-dependent problem would involve 3 independent
variables $x$, $y$, $t$ as would a three-dimensional time-independent problem where $x$, $y$, $z$ would be the
independent variables.
Example 1
Show that $u = \sin x\cosh y$ satisfies the PDE $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$.
This PDE is known as Laplace’s equation in two dimensions and it arises in
many applications e.g. electrostatics, fluid flow, heat conduction.
Solution
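$u = \sin x\cosh y \Rightarrow \dfrac{\partial u}{\partial x} = \cos x\cosh y$ and $\dfrac{\partial u}{\partial y} = \sin x\sinh y$
Differentiating again gives $\dfrac{\partial^2 u}{\partial x^2} = -\sin x\cosh y$ and $\dfrac{\partial^2 u}{\partial y^2} = \sin x\cosh y$
Hence
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = -\sin x\cosh y + \sin x\cosh y = 0$$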
so the given function u(x, y) is indeed a solution of the PDE.
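A numerical spot-check of this verification is sketched below. It assumes a five-point central-difference estimate of the Laplacian with an arbitrary step `h`, so the printed values are small but not exactly zero; the sample points are illustrative.

```python
import math

def u(x, y):
    """The function of Example 1."""
    return math.sin(x) * math.cosh(y)

def laplacian(func, x, y, h=1e-3):
    """Five-point central-difference estimate of func_xx + func_yy at (x, y)."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / h ** 2

for point in ((0.7, 0.3), (1.5, -1.0), (2.0, 2.0)):
    print(point, laplacian(u, *point))
```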
Task
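Show that $u = e^{-2\pi^2 t}\sin\pi x$ is a solution of the PDE $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{1}{2}\dfrac{\partial u}{\partial t}$
First find $\dfrac{\partial u}{\partial t}$ and $\dfrac{\partial u}{\partial x}$: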
Your solution
Answer
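$\dfrac{\partial u}{\partial t} = -2\pi^2 e^{-2\pi^2 t}\sin\pi x \qquad \dfrac{\partial u}{\partial x} = \pi e^{-2\pi^2 t}\cos\pi x$
Now find $\dfrac{\partial^2 u}{\partial x^2}$ and complete the Task: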
Your solution
Answer
$\dfrac{\partial^2 u}{\partial x^2} = -\pi^2 e^{-2\pi^2 t}\sin\pi x$, and we see that $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{1}{2}\dfrac{\partial u}{\partial t}$ as required.
The PDE in the above Task has the general form
$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{k}\frac{\partial u}{\partial t}$$
where k is a positive constant. This equation is referred to as the one-dimensional heat conduction
equation (or sometimes as the diffusion equation). In a heat conduction context the dependent
variable u represents the temperature u(x, t).
The third important PDE involving two independent variables is known as the one-dimensional
wave equation. This has the general form
$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}$$
(Note that both partial derivatives in the wave equation are second-order in contrast to the heat
conduction equation where the time derivative is first order.)
Example 2

(a) Verify that $u(x,t) = u_0\sin\left(\dfrac{\pi x}{\ell}\right)\cos\left(\dfrac{\pi ct}{\ell}\right)$ (where $u_0$, $\ell$ and $c$ are constants) satisfies the one-dimensional wave equation.
(b) Verify the boundary conditions i.e. $u(0,t) = u(\ell,t) = 0$.
(c) Verify the initial conditions i.e. $\dfrac{\partial u}{\partial t}(x,0) = 0$ and $u(x,0) = u_0\sin\left(\dfrac{\pi x}{\ell}\right)$.
(d) Give a physical interpretation of this problem.
Answer
(a) By straightforward partial differentiation of the given function u(x, t):
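$$\frac{\partial u}{\partial x} = u_0\frac{\pi}{\ell}\cos\left(\frac{\pi x}{\ell}\right)\cos\left(\frac{\pi ct}{\ell}\right) \qquad \frac{\partial^2 u}{\partial x^2} = -u_0\left(\frac{\pi}{\ell}\right)^2\sin\left(\frac{\pi x}{\ell}\right)\cos\left(\frac{\pi ct}{\ell}\right)$$
$$\frac{\partial u}{\partial t} = -u_0\frac{\pi c}{\ell}\sin\left(\frac{\pi x}{\ell}\right)\sin\left(\frac{\pi ct}{\ell}\right) \qquad \frac{\partial^2 u}{\partial t^2} = -u_0\left(\frac{\pi c}{\ell}\right)^2\sin\left(\frac{\pi x}{\ell}\right)\cos\left(\frac{\pi ct}{\ell}\right)$$
We see that $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{1}{c^2}\dfrac{\partial^2 u}{\partial t^2}$ which completes the verification.
(b) Putting $x = 0$, and leaving $t$ arbitrary, in the given solution for $u(x,t)$ gives
$$u(0,t) = u_0\sin 0\,\cos\left(\frac{\pi ct}{\ell}\right) = 0 \quad\text{for all } t$$
Similarly putting $x = \ell$, $t$ arbitrary:
$$u(\ell,t) = u_0\sin\pi\,\cos\left(\frac{\pi ct}{\ell}\right) = 0 \quad\text{for all } t$$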
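(c) Evaluating $\dfrac{\partial u}{\partial t}$ firstly for general $x$ and $t$
$$\frac{\partial u}{\partial t} = -u_0\frac{\pi c}{\ell}\sin\left(\frac{\pi x}{\ell}\right)\sin\left(\frac{\pi ct}{\ell}\right)$$
Now putting $t = 0$ leaving $x$ arbitrary
$$\frac{\partial u}{\partial t}(x,0) = -u_0\frac{\pi c}{\ell}\sin\left(\frac{\pi x}{\ell}\right)\sin 0 = 0.$$
Also, putting $t = 0$ in the expression for $u(x,t)$ gives
$$u(x,0) = u_0\sin\left(\frac{\pi x}{\ell}\right)\cos 0 = u_0\sin\left(\frac{\pi x}{\ell}\right).$$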
(d) Mathematically we have now proved that the given function $u(x,t)$ satisfies the 1-D wave
equation specified in (a), the two boundary conditions specified in (b) and the two initial conditions
specified in (c).
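Parts (a) to (c) can be spot-checked numerically for particular parameter values. The sketch below assumes illustrative values u0 = 1, ℓ = 2, c = 3 (not taken from the Example) and central-difference estimates of the derivatives, so the residuals are small rather than exactly zero.

```python
import math

U0, ELL, C = 1.0, 2.0, 3.0  # illustrative values of u0, l and c

def u(x, t):
    """The function of Example 2."""
    return U0 * math.sin(math.pi * x / ELL) * math.cos(math.pi * C * t / ELL)

def second_difference(func, s, h=1e-4):
    """Central-difference estimate of the second derivative of func at s."""
    return (func(s + h) - 2.0 * func(s) + func(s - h)) / h ** 2

def wave_residual(x, t):
    """u_xx - (1/c^2) u_tt, which should vanish if u satisfies the wave equation."""
    u_xx = second_difference(lambda s: u(s, t), x)
    u_tt = second_difference(lambda s: u(x, s), t)
    return u_xx - u_tt / C ** 2

def u_t(x, t, h=1e-6):
    """Central-difference estimate of the first time derivative."""
    return (u(x, t + h) - u(x, t - h)) / (2.0 * h)

print(wave_residual(0.6, 0.4))     # (a) PDE residual
print(u(0.0, 0.4), u(ELL, 0.4))    # (b) boundary conditions
print(u_t(0.6, 0.0), u(0.6, 0.0))  # (c) initial conditions
```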
One possible physical interpretation of this problem is that $u(x,t)$ represents the displacement of a string stretched between two points at $x = 0$ and $x = \ell$. Clearly the position of any point P on the vibrating string will depend upon its distance $x$ from one end and on the time $t$.
The boundary conditions (b) represent the fact that the string is fixed at these end-points.
The initial condition $u(x,0) = u_0\sin\left(\dfrac{\pi x}{\ell}\right)$ represents the displacement of the string at $t = 0$.
The initial condition $\dfrac{\partial u}{\partial t}(x,0) = 0$ tells us that the string is at rest at $t = 0$.
[Figure 1: displacement u(x, t) of the string plotted against x at t = 0 and at later times, t increasing]
Note that it can be proved formally that if $T$ is the tension in the string and if $\rho$ is the mass per unit length of the string then $u$ does, under certain conditions, satisfy the 1-D wave equation with $c^2 = \dfrac{T}{\rho}$.
Key Point 4
The three PDEs of greatest general interest involving two independent variables are:
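(a) The two-dimensional Laplace equation:
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$
(b) The one-dimensional heat conduction equation:
$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{k}\frac{\partial u}{\partial t}$$
(c) The one-dimensional wave equation:
$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}$$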

Applications of PDEs 25.2

Introduction
In this Section we discuss briefly some of the most important PDEs that arise in various branches
of science and engineering. We shall see that some equations can be used to describe a variety of
different situations.

Prerequisites
Before starting this Section you should . . .
• have a knowledge of partial differentiation

Learning Outcomes
On completion you should be able to . . .
• recognise the heat conduction equation and the wave equation and have some knowledge of their applicability
Section 25.2: Applications of PDEs
Key Point 4 by no means exhausts the types of PDE which are important in applications. In this
Section we will discuss those three PDEs in Key Point 4 in more detail and briefly discuss other PDEs
over a wide range of applications. We will omit detailed derivations.
1. Wave equation
The simplest situation to give rise to the one-dimensional wave equation is the motion of a stretched
string - specifically the transverse vibrations of a string such as the string of a musical instrument.
Assume that a string is placed along the x-axis, is stretched and then fixed at ends x = 0 and x = L;
it is then deflected and at some instant, which we call t = 0, is released and allowed to vibrate. The
quantity of interest is the deflection u of the string at any point x, 0 ≤ x ≤ L, and at any time
t > 0. We write u = u(x, t). Figure 2 shows a possible displacement of the string at a fixed time t.
u u= u(x, t)
x
0 L
Figure 2
Subject to various assumptions · · ·
1. damping forces such as air resistance are negligible
2. the weight of the string is negligible
3. the tension in the string is tangential to the curve of the string at any point
4. the string performs small transverse oscillations i.e. every particle of the string moves strictly
vertically and such that its deflection and slope at every point on the string are small.
· · · it can be shown, by applying Newton’s law of motion to a small segment of the string, that u
satisfies the PDE
$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2} \qquad (1)$$
where $c^2 = \dfrac{T}{\rho}$, $\rho$ being the mass per unit length of the string and $T$ being the (constant) horizontal
component of the tension in the string. To determine $u(x,t)$ uniquely, we must also know
1. the initial deflection of the string at the time $t = 0$ at which it is released
2. the initial velocity of the string.
Thus we must be given initial conditions
$$u(x,0) = f(x) \qquad 0 \le x \le L \qquad \text{(initial position)}$$
$$\frac{\partial u}{\partial t}(x,0) = g(x) \qquad 0 \le x \le L \qquad \text{(initial velocity)}$$
where $f(x)$ and $g(x)$ are known.
These two initial conditions are in addition to the two boundary conditions
u(0, t) = u(L, t) = 0 for t ≥ 0
which indicate that the string is fixed at each end. In Example 2 discussed in Section 25.1 we had
$$f(x) = u_0\sin\left(\frac{\pi x}{\ell}\right)$$
$$g(x) = 0 \quad\text{(string initially at rest).}$$
The PDE (1) is the (undamped) wave equation. We will discuss solutions of it for various initial
conditions later. More complicated forms of the wave equation would arise if some of the assumptions
were modified. For example:
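(a) $\dfrac{\partial^2 u}{\partial t^2} = c^2\dfrac{\partial^2 u}{\partial x^2} - g$ if the weight of the string was allowed for,
(b) $\dfrac{\partial^2 u}{\partial t^2} = c^2\dfrac{\partial^2 u}{\partial x^2} - \alpha\dfrac{\partial u}{\partial t}$ if a damping force proportional to the velocity of the string
(with damping constant $\alpha$) was included.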
Equation (1) is referred to as the one-dimensional wave equation because only one space variable, x,
is present. The two-dimensional (undamped) wave equation is, in Cartesian coordinates,
$$\frac{\partial^2 u}{\partial t^2} = c^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) \qquad (2)$$
This arises for example when we model the transverse vibrations of a membrane. See Figures 3(a),
3(b). Here $u(x,y,t)$ is the deflection of a point $(x,y)$ on the membrane at time $t$. Again, a boundary
condition must be specified: commonly
$$u = 0 \qquad t \ge 0$$
on the boundary of the membrane, if this is fixed (clamped). Also initial conditions must be given
$$u(x,y,0) = f(x,y) \quad\text{(initial position)} \qquad \frac{\partial u}{\partial t}(x,y,0) = g(x,y) \quad\text{(initial velocity)}$$
For a circular membrane, such as a drumhead, polar coordinates defined by x = r cos θ, y = r sin θ
would be more convenient than Cartesian. In this case (2) becomes
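$$\frac{\partial^2 u}{\partial t^2} = c^2\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2}\right) \qquad 0 \le r \le R,\quad 0 \le \theta \le 2\pi$$
for a circular membrane of radius $R$.
[Figure 3: (a) a membrane in the xy-plane with dimensions labelled a and b; (b) a circular membrane of radius R]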
2. Heat conduction equation
Consider a long thin bar, or wire, of constant cross-section and of homogeneous material oriented
along the x-axis (see Figure 4).
[Figure 4: a bar lying along the x-axis from x = 0 to x = L]
Imagine that the bar is thermally insulated laterally and is sufficiently thin that heat flows (by
conduction) only in the x-direction. Then the temperature u at any point in the bar depends only
on the x-coordinate of the point and the time t. By applying the principle of conservation of energy
it can be shown that u(x, t) satisfies the PDE
$$\frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \qquad 0 \le x \le L,\quad t > 0 \qquad (3)$$
where k is a positive constant. In fact k, sometimes called the thermal diffusivity of the bar, is
given by
$$k = \frac{\kappa}{s\rho}$$
where κ = thermal conductivity of the material of the bar
s = specific heat capacity of the material of the bar
ρ = density of the material of the bar.
The PDE (3) is called the one-dimensional heat conduction equation (or, in other contexts where it
arises, the diffusion equation).
Task
What is the obvious difference between the wave equation (1) and the heat
conduction equation (3)?
Your solution
Answer
Both equations involve second derivatives in the space variable x but whereas the wave equation has
a second derivative in the time variable t the heat conduction equation has only a first derivative in
t. This means that the solutions of (3) are quite different in form from those of (1) and we shall
study them separately later.
The fact that (3) is first order in t means that only one initial condition at t = 0 is needed, together
with two boundary conditions, to obtain a unique solution. The usual initial condition specifies the
initial temperature distribution in the bar
u(x, 0) = f (x)
where f (x) is known. Various types of boundary conditions at x = 0 and x = L are possible. For
example:
(a) $u(0,t) = T_1$ and $u(L,t) = T_2$ (ends of the bar are at constant temperatures $T_1$ and $T_2$).
(b) $\dfrac{\partial u}{\partial x}(0,t) = \dfrac{\partial u}{\partial x}(L,t) = 0$ which are insulation conditions since they tell us that there is
no heat flow through the ends of the bar.
As you would expect, there are two-dimensional and three-dimensional forms of the heat conduction
equation. The two dimensional version of (3) is
$$\frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) \qquad (4)$$
where u(x, y, t) is the temperature in a flat plate. The plate is assumed to be thin and insulated
on its top and bottom surface so that no heat flow occurs other than in the Oxy plane. Boundary
conditions and an initial condition are needed to give unique solutions of (4). For example if the
plate is rectangular as in Figure 5:
[Figure 5: rectangular plate occupying 0 ≤ x ≤ a, 0 ≤ y ≤ b]
typical boundary conditions might be
$u(x, 0) = T_1$, 0 ≤ x ≤ a (bottom side at fixed temperature)
$\frac{\partial u}{\partial x}(a, y) = 0$, 0 ≤ y ≤ b (right-hand side insulated)
$u(x, b) = T_2$, 0 ≤ x ≤ a (top side at fixed temperature)
$u(0, y) = 0$, 0 ≤ y ≤ b (left-hand side at zero fixed temperature).
An initial condition would have the form u(x, y, 0) = f (x, y), where f is a given function.
3. Transmission line equations
In a long electrical cable or a telephone wire both the current and voltage depend upon position
along the wire as well as the time (see Figure 6).
[Figure 6: cable lying along the x-axis, carrying current i(x, t) with voltage v(x, t) at position x]
It is possible to show, using basic laws of electrical circuit theory, that the electrical current i(x, t)
satisfies the PDE
\[ \frac{\partial^2 i}{\partial x^2} = LC\,\frac{\partial^2 i}{\partial t^2} + (RC + GL)\,\frac{\partial i}{\partial t} + RG\,i \qquad (5) \]
where the constants R, L, C and G are, for unit length of cable, respectively the resistance, induc-
tance, capacitance and leakage conductance. The voltage v(x, t) also satisfies (5). Special cases of
(5) arise in particular situations. For a submarine cable G is negligible and frequencies are low so
inductive effects can also be neglected. In this case (5) becomes
\[ \frac{\partial^2 i}{\partial x^2} = RC\,\frac{\partial i}{\partial t} \qquad (6) \]
which is called the submarine equation or telegraph equation. For high frequency alternating
currents, again with negligible leakage, (5) can be approximated by
\[ \frac{\partial^2 i}{\partial x^2} = LC\,\frac{\partial^2 i}{\partial t^2} \qquad (7) \]
which is called the high frequency line equation.
Task
What PDEs, already discussed, have the same form as equations (6) or (7)?
Your solution
Answer
(6) has the same form as the one-dimensional heat conduction equation.
(7) has the same form as the one-dimensional wave equation.
4. Laplace’s equation
If you look back at the two-dimensional heat conduction equation (4):
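\[ \frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) \]
it is clear that if the heat flow is steady, i.e. time independent, then $\frac{\partial u}{\partial t} = 0$ so the temperature
u(x, y) is a solution of
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (8) \]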
(8) is the two-dimensional Laplace equation. Both this, and its three-dimensional counterpart
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0 \qquad (9) \]
arise in a wide variety of applications, quite apart from steady state heat conduction theory. Since
time does not arise in (8) or (9) it is evident that Laplace’s equation is always a model for equilibrium
situations. In any problem involving Laplace’s equation we are interested in solving it in a specific
region R for given boundary conditions. These conditions may involve
(a) u specified on the boundary curve C (two dimensions) or boundary surface S (three
dimensions) of the region R. Such boundary conditions are called Dirichlet conditions.
(b) The derivative of u normal to the boundary, written $\frac{\partial u}{\partial n}$, specified on C or S. These are
referred to as Neumann boundary conditions.
(c) A mixture of (a) and (b).
Some areas in which Laplace’s equation arises are
(a) electrostatics (u being the electrostatic potential in a charge free region)

(b) gravitation (u being the gravitational potential in free space)
(c) steady state flow of inviscid fluids
(d) steady state heat conduction (as already discussed)
5. Other important PDEs in science and engineering
1. Poisson’s equation
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y) \qquad \text{(two-dimensional form)} \]
where f (x, y) is a given function. This equation arises in electrostatics, elasticity theory and
elsewhere.
2. Helmholtz’s equation
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + k^2 u = 0 \qquad \text{(two-dimensional form)} \]
which arises in wave theory.
3. Schrödinger’s equation
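\[ -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2}\right) = E\psi \]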
which arises in quantum mechanics. (h is Planck’s constant)
4. Transverse vibrations equation
\[ a^2\,\frac{\partial^4 u}{\partial x^4} + \frac{\partial^2 u}{\partial t^2} = 0 \]
for a homogeneous rod, where u(x, t) is the displacement at time t of the cross section through
x.
All the PDEs we have discussed are second order (because the highest order derivatives that arise
are second order) apart from the last example which is fourth order.
Solution Using
Separation of Variables 25.3

Introduction
The main topic of this Section is the solution of PDEs using the method of separation of variables.
In this method a PDE involving n independent variables is converted into n ordinary differential
equations. (In this introductory account n will always be 2.)
You should be aware that other analytical methods and also numerical methods are available for
solving PDEs. However, the separation of variables technique does give some useful solutions to
important PDEs.

Prerequisites
Before starting this Section you should . . .
• be able to solve first and second order constant coefficient ordinary differential equations

Learning Outcomes
On completion you should be able to . . .
• apply the separation of variables method to obtain solutions of the heat conduction equation, the wave equation and the 2-D Laplace equation for specified boundary or initial conditions
1. Solution of important PDEs
We shall just consider two analytic solution techniques for PDEs:
(a) Direct integration
(b) Separation of variables
The method of direct integration is a straightforward extension of solving very simple ODEs by
integration, and will be considered first. The method of separation of variables is more important
and we will study it in detail shortly.
You should note that many practical problems involving PDEs have to be solved by numerical
methods but that is another story (introduced in Workbooks 32 and 33).
Task
Solve the ODE
\[ \frac{d^2 y}{dx^2} = x^2 + 2 \]
given that y = 1 when x = 0 and $\frac{dy}{dx} = 2$ when x = 0.
First find $\frac{dy}{dx}$ by integrating once, not forgetting the arbitrary constant of integration:
Your solution
Answer
\[ \frac{dy}{dx} = \frac{x^3}{3} + 2x + A \]
Now find y by integrating again, not forgetting to include another arbitrary constant:
Your solution
Answer
\[ y = \frac{x^4}{12} + x^2 + Ax + B \]
Now find A and B by inserting the two given initial conditions and so find the solution:
Your solution
Answer
y(0) = 1 gives B = 1 and y'(0) = 2 gives A = 2
so the required solution is
\[ y = \frac{x^4}{12} + x^2 + 2x + 1 \]
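The result of this Task can be checked numerically. The following is a minimal sketch, not part of the original Workbook: it uses central differences to confirm that the solution satisfies the ODE and both initial conditions. The function names and step sizes are illustrative assumptions.

```python
# Numerical check of y = x^4/12 + x^2 + 2x + 1 (sketch; names are illustrative).

def y(x):
    """Candidate solution of y'' = x^2 + 2 with y(0) = 1, y'(0) = 2."""
    return x**4 / 12 + x**2 + 2 * x + 1

def first_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def ode_residual(x):
    """y''(x) - (x^2 + 2); should be close to zero for a correct solution."""
    return second_derivative(y, x) - (x**2 + 2)

if __name__ == "__main__":
    print("y(0)  =", y(0.0))
    print("y'(0) ~", first_derivative(y, 0.0))
    print("largest residual:", max(abs(ode_residual(x)) for x in (-1.0, 0.5, 2.0)))
```

A residual that is small compared with the size of the terms in the equation indicates that the hand calculation is consistent; it does not replace the analytic derivation.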
Consider now a similar type of PDE i.e. one that can also be solved by direct integration.
Suppose we require the general solution of
\[ \frac{\partial^2 u}{\partial x^2} = 2x e^t \]
where u is a function of x and t.
Integrating with respect to x gives us
\[ \frac{\partial u}{\partial x} = x^2 e^t + f(t) \]
where the arbitrary function f (t) replaces the normal “arbitrary constant” of ordinary integration.
This function of t only is needed because we are integrating “partially” with respect to x i.e. we are
reversing a partial differentiation with respect to x at constant t.
Integrating again with respect to x gives the general solution:
\[ u = \frac{x^3}{3} e^t + x f(t) + g(t) \]
where g(t) is a second arbitrary function. We have now obtained the general solution of the given
PDE but to find the arbitrary functions f(t) and g(t) we must know two conditions.
Suppose, for the sake of example, that these conditions are
\[ u(0, t) = t, \qquad \frac{\partial u}{\partial x}(0, t) = e^t \]
Inserting the first of these conditions into the general solution gives g(t) = t.
Inserting the second condition into the expression for $\frac{\partial u}{\partial x}$ gives $f(t) = e^t$.
So the final solution is
\[ u = \frac{x^3}{3} e^t + x e^t + t. \]
Task
Solve the PDE
\[ \frac{\partial^2 u}{\partial x \partial y} = \sin x \cos y \]
subject to the conditions
\[ \frac{\partial u}{\partial x} = 2x \ \text{ at } y = \frac{\pi}{2}, \qquad u = 2 \sin y \ \text{ at } x = \pi. \]
First integrate the PDE with respect to y (it is equally valid to integrate first with respect to x).
Don’t forget the appropriate arbitrary function.
Your solution
Answer
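Recall that $\frac{\partial^2 u}{\partial x \partial y} = \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial x}\right)$.
Hence integration with respect to y gives $\frac{\partial u}{\partial x} = \sin x \sin y + f(x)$
Since one of the given conditions is on $\frac{\partial u}{\partial x}$, impose this condition to determine the arbitrary function
f(x):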
Your solution
Answer
At y = π/2 the condition gives $\sin x \sin\frac{\pi}{2} + f(x) = 2x$ i.e. $f(x) = 2x - \sin x$
So $\frac{\partial u}{\partial x} = \sin x \sin y + 2x - \sin x$
Now integrate again to determine u:
Your solution
Answer
Integrating now with respect to x gives $u = -\cos x \sin y + x^2 + \cos x + g(y)$
Next, obtain the arbitrary function g(y):

Your solution
Answer
The condition u(π, y) = 2 sin y gives $-\cos\pi \sin y + \pi^2 + \cos\pi + g(y) = 2 \sin y$
∴ $\sin y + \pi^2 - 1 + g(y) = 2 \sin y$
∴ $g(y) = \sin y + 1 - \pi^2$
Now write down the final answer for u(x, y):

Your solution
Answer
\[ u(x, y) = x^2 + \cos x\,(1 - \sin y) + \sin y + 1 - \pi^2 \]
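The final answer of this Task can be tested against the PDE and both conditions. The sketch below is an illustrative check only (it is not from the Workbook): it estimates the mixed derivative with a four-point central difference; the sample points and step sizes are arbitrary assumptions.

```python
import math

# Sketch: check u(x, y) = x^2 + cos x (1 - sin y) + sin y + 1 - pi^2 numerically.

def u(x, y):
    return x**2 + math.cos(x) * (1 - math.sin(y)) + math.sin(y) + 1 - math.pi**2

def u_x(x, y, h=1e-5):
    """Central-difference estimate of du/dx."""
    return (u(x + h, y) - u(x - h, y)) / (2 * h)

def u_xy(x, y, h=1e-3):
    """Four-point central-difference estimate of the mixed derivative."""
    return (u(x + h, y + h) - u(x + h, y - h)
            - u(x - h, y + h) + u(x - h, y - h)) / (4 * h**2)

if __name__ == "__main__":
    x0, y0 = 0.7, 1.2  # arbitrary sample point
    print("PDE residual:", u_xy(x0, y0) - math.sin(x0) * math.cos(y0))
    print("du/dx at y = pi/2 minus 2x:", u_x(x0, math.pi / 2) - 2 * x0)
    print("u at x = pi minus 2 sin y:", u(math.pi, y0) - 2 * math.sin(y0))
```

All three printed values should be close to zero if the answer is correct.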
2. Method of separation of variables - general approach

In Section 25.2 we showed that
(a) $u(x, y) = \sin x \cosh y$
is a solution of the two-dimensional Laplace equation
(b) $u(x, t) = e^{-2\pi^2 t} \sin \pi x$
is a solution of the one-dimensional heat conduction equation
(c) $u(x, t) = u_0 \sin\frac{\pi x}{\ell} \cos\frac{\pi c t}{\ell}$
is a solution of the one-dimensional wave equation.
All three solutions here have a specific form: in (a) u(x, y) is a product of a function of x alone,
sin x, and a function of y alone, cosh y. Similarly, in both (b) and (c), u(x, t) is a product of a
function of x alone and a function of t alone.
The method of separation of variables involves finding solutions of PDEs which are of this product
form. In the method we assume that a solution to a PDE has the form
u(x, t) = X(x)T(t) (or u(x, y) = X(x)Y(y))
where X(x) is a function of x only, T(t) is a function of t only and Y(y) is a function of y only.
You should note that not all solutions to PDEs are of this type; for example, it is easy to verify that
\[ u(x, y) = x^2 - y^2 \]
(which is not of the form u(x, y) = X(x)Y(y)) is a solution of the Laplace equation.
However, many interesting and useful solutions of PDEs are obtainable which are of the product
form. We shall firstly consider the types of solution obtainable for our three basic PDEs using trial
solutions of the product form.
Heat conduction equation

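\[ \frac{\partial^2 u}{\partial x^2} = \frac{1}{k}\frac{\partial u}{\partial t}, \qquad k > 0 \qquad (1) \]
Assuming that
\[ u = X(x)T(t) \quad (= XT \text{ for short}) \qquad (2) \]
then
\[ \frac{\partial u}{\partial x} = \frac{dX}{dx}\,T \quad (= X'T \text{ for short}) \]
\[ \frac{\partial^2 u}{\partial x^2} = \frac{d^2 X}{dx^2}\,T \quad (= X''T \text{ for short}) \]
\[ \frac{\partial u}{\partial t} = X\,\frac{dT}{dt} \quad (= XT' \text{ for short}) \]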
Substituting into the original PDE (1)
\[ X''T = \frac{1}{k}\,XT' \]
which can be re-arranged as
\[ \frac{X''}{X} = \frac{1}{k}\frac{T'}{T} \qquad (3) \]
Now the left-hand side of (3) involves functions of x only and the right-hand side contains
functions of t only. Thus altering the value of t cannot change the left-hand side of (3) i.e. it stays
constant. Hence so must the right-hand side be constant. We conclude that T(t) is a function such
that
\[ \frac{1}{k}\frac{T'}{T} = K \qquad (4) \]
where K is a constant whose sign is yet to be determined.
By a similar argument, altering the value of x cannot change the right-hand side of (3) and conse-
quently the left-hand side must be a constant, i.e.
\[ \frac{X''}{X} = K \qquad (5) \]
We see that the effect of assuming a product trial solution of the form (2) converts the PDE (1)
into the two ODEs (4) and (5).
Both these ODEs are types whose solution we revised at the beginning of this Workbook but we shall
not attempt to solve them yet. In particular the solution of (5) depends on whether the constant K
is positive or negative.
Wave equation
\[ \frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} \qquad (6) \]
Task
By following a similar procedure to the above, assume a product solution
u(x, t) = X(x)T (t)
for the wave equation and find the two ODEs satisfied by X(x) and T (t).
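First obtain $\frac{\partial^2 u}{\partial x^2}$ and $\frac{\partial^2 u}{\partial t^2}$:
Your solution
Answer
u = X(x)T(t) gives $\frac{\partial^2 u}{\partial x^2} = X''T$ and $\frac{\partial^2 u}{\partial t^2} = XT''$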
Now substitute these results into (6) and transpose so the variables are separated i.e. all functions
of x are on the left-hand side, all functions of t on the right-hand side:
Your solution
Answer
We get $X''T = \frac{1}{c^2}\,XT''$ and, transposing, $\frac{X''}{X} = \frac{1}{c^2}\frac{T''}{T}$
Finally, write down the required ordinary differential equations:
Your solution
Answer
Equating both sides to the same constant K gives
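\[ \frac{X''}{X} = K \quad\text{or}\quad \frac{d^2 X}{dx^2} - KX = 0 \qquad (7) \]
and
\[ \frac{1}{c^2}\frac{T''}{T} = K \quad\text{or}\quad \frac{d^2 T}{dt^2} - Kc^2 T = 0 \qquad (8) \]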
The solution of the ODEs (7) and (8) has been obtained earlier, and will depend on the sign of K.
Laplace’s equation
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (9) \]
Task
Separating the variables for Laplace’s equation follows similar lines to the previous
Task. Obtain the ODEs satisfied by X(x) and Y (y).
Your solution
Answer
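Assuming u(x, y) = X(x)Y(y) leads to $\frac{\partial^2 u}{\partial x^2} = X''Y$ and $\frac{\partial^2 u}{\partial y^2} = XY''$ so
\[ X''Y + XY'' = 0 \quad\text{or}\quad \frac{X''}{X} = -\frac{Y''}{Y} \]
Equating each side to a constant K
\[ \frac{X''}{X} = K \quad\text{or}\quad \frac{d^2 X}{dx^2} - KX = 0 \qquad (10a) \]
\[ \frac{Y''}{Y} = -K \quad\text{or}\quad \frac{d^2 Y}{dy^2} + KY = 0 \qquad (10b) \]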
(Note carefully the different signs in the two ODEs. Yet again the sign of the “separation constant”
K will determine the solutions.)
3. Method of separation of variables - specific solutions

We shall now study some specific problems which can be fully solved by the separation of variables
method.
Example 3 Solve the heat conduction equation

\[ \frac{\partial^2 u}{\partial x^2} = \frac{1}{2}\frac{\partial u}{\partial t} \]
over 0 < x < 3, t>0 for the boundary conditions
u(0, t) = u(3, t) = 0
and the initial condition
u(x, 0) = 5 sin 4πx.
Solution
Assuming u(x, t) = X(x)T (t) gives rise to the differential equations (4) and (5) with the parameter
k = 2:
\[ \frac{dT}{dt} = 2KT, \qquad \frac{d^2 X}{dx^2} = KX \]
The T equation has general solution

\[ T = A e^{2Kt} \]
which will increase exponentially with increasing t if K is positive and decrease with t if K is negative.
In any physical problem the latter is the meaningful situation. To emphasise that K is being taken
as negative we put
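\[ K = -\lambda^2 \]
so
\[ T = A e^{-2\lambda^2 t}. \]
The X equation then becomes
\[ \frac{d^2 X}{dx^2} = -\lambda^2 X \]
which has solution
\[ X(x) = B \cos \lambda x + C \sin \lambda x. \]
Hence
\[ u(x, t) = X(x)T(t) = (D \cos \lambda x + E \sin \lambda x)\,e^{-2\lambda^2 t} \qquad (11) \]
where D = AB and E = AC.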
(You should always try to keep the number of arbitrary constants down to an absolute minimum by
multiplying them together in this way.)
We now insert the initial and boundary conditions to obtain the constants D and E and also the
separation constant λ.
The boundary condition u(0, t) = 0 gives
\[ (D \cos 0 + E \sin 0)\,e^{-2\lambda^2 t} = 0 \quad \text{for all } t. \]
Since sin 0 = 0 and cos 0 = 1 this must imply that D = 0.
The other boundary condition u(3, t) = 0 then gives
\[ E \sin(3\lambda)\,e^{-2\lambda^2 t} = 0 \quad \text{for all } t. \]
We cannot deduce that the constant E has to be zero because then the solution (11) would be the
trivial solution u ≡ 0. The only sensible deduction is that
sin 3λ = 0 i.e. 3λ = nπ (where n is some integer).
Hence solutions of the form (11) satisfying the 2 boundary conditions have the form
\[ u(x, t) = E_n \sin\frac{n\pi x}{3}\; e^{-\frac{2n^2\pi^2 t}{9}} \]
where we have written En for E to allow for the possibility of a different value for the constant for
each different value of n.
We obtain the value of n by using the initial condition u(x, 0) = 5 sin 4πx and forcing this solution
to agree with it. That is,
\[ u(x, 0) = E_n \sin\frac{n\pi x}{3} = 5 \sin 4\pi x \]
so we must choose n = 12 with E12 = 5.
Hence, finally,

\[ u(x, t) = 5 \sin\frac{12\pi x}{3}\; e^{-\frac{2(12)^2\pi^2 t}{9}} = 5 \sin(4\pi x)\; e^{-32\pi^2 t}. \]
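As a hedged numerical cross-check of Example 3 (not part of the original text), the sketch below evaluates the final solution and estimates the residual of the PDE by central differences, together with the boundary and initial conditions. The sample point and step sizes are arbitrary assumptions.

```python
import math

# Sketch: check u = 5 sin(4 pi x) exp(-32 pi^2 t) against Example 3.

def u(x, t):
    return 5 * math.sin(4 * math.pi * x) * math.exp(-32 * math.pi**2 * t)

def pde_residual(x, t, h=1e-4, dt=1e-6):
    """u_xx - (1/2) u_t estimated by central differences; should be near zero."""
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
    u_t = (u(x, t + dt) - u(x, t - dt)) / (2 * dt)
    return u_xx - 0.5 * u_t

if __name__ == "__main__":
    print("PDE residual at (0.3, 0.001):", pde_residual(0.3, 0.001))
    print("u(0, t), u(3, t):", u(0.0, 0.01), u(3.0, 0.01))
    print("u(x, 0) - 5 sin(4 pi x):", u(0.3, 0.0) - 5 * math.sin(4 * math.pi * 0.3))
```

The individual terms in the residual are of the order of a few hundred at this sample point, so a residual far below 1 is consistent with the analytic solution.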
Task
Solve the 1-dimensional wave equation $\frac{\partial^2 u}{\partial x^2} = \frac{1}{16}\frac{\partial^2 u}{\partial t^2}$ for 0 < x < 2, t > 0.
The boundary conditions are
u(0, t) = u(2, t) = 0
The initial conditions are
(i) $u(x, 0) = 6 \sin \pi x - 3 \sin 4\pi x$ (ii) $\frac{\partial u}{\partial t}(x, 0) = 0$
Firstly, either using (7) and (8) or by working from first principles assuming the product solution
u(x, t) = X(x)T (t),
write down the ODEs satisfied by X(x) and T (t):
Your solution
Answer
\[ \frac{X''}{X} = K, \qquad \frac{T''}{16T} = K \]
Now decide on the appropriate sign for K and then write down the solution to these equations:
Your solution
Answer
Choosing K as negative (say $K = -\lambda^2$) will produce sinusoidal solutions for X and T which are
appropriate in the context of the wave equation where oscillatory solutions can be expected.
Then X 00 = −λ2 X gives
X = A cos λx + B sin λx
Similarly T 00 = −16λ2 T gives
T = C cos 4λt + D sin 4λt
Now obtain the general solution u(x, t) by multiplying X(x) by T (t) and insert the two boundary
conditions to obtain information about two of the constants:
Your solution
Answer
u(x, t) = (A cos λx + B sin λx)(C cos 4λt + D sin 4λt)
u(0, t) = 0 for all t gives
A(C cos 4λt + D sin 4λt) = 0
which implies that A = 0.
u(2, t) = 0 for all t gives
B sin 2λ(C cos 4λt + D sin 4λt) = 0
so, for a non-trivial solution,
\[ \sin 2\lambda = 0 \quad\text{i.e.}\quad \lambda = \frac{n\pi}{2} \quad\text{for some integer } n. \]
At this stage we write the solution as
\[ u(x, t) = \sin\frac{n\pi x}{2}\,(E \cos 2n\pi t + F \sin 2n\pi t) \]
where we have multiplied constants and put E = BC and F = BD.
Now insert the initial condition
\[ \frac{\partial u}{\partial t}(x, 0) = 0 \quad\text{for all } x,\ 0 < x < 2, \]
and deduce the value of F :
Your solution
Answer
Differentiating partially with respect to t
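\[ \frac{\partial u}{\partial t} = \sin\frac{n\pi x}{2}\,(-2n\pi E \sin 2n\pi t + 2n\pi F \cos 2n\pi t) \]
so at t = 0
\[ \frac{\partial u}{\partial t}(x, 0) = \sin\frac{n\pi x}{2}\; 2n\pi F = 0 \]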
from which we must have that F = 0.
Finally, using the other initial condition u(x, 0) = 6 sin(πx) − 3 sin(4πx), deduce the form of
u(x, t):
Your solution
Answer
At this stage the solution reads
\[ u(x, t) = E \sin\frac{n\pi x}{2} \cos(2n\pi t) \qquad (12) \]
We now have to insert the last condition i.e. the initial condition
u(x, 0) = 6 sin πx − 3 sin 4πx (13)
This seems strange because putting t = 0 in our solution (12) suggests
\[ u(x, 0) = E \sin\frac{n\pi x}{2} \]
At this point we seem to have an incompatibility because no single value of n will enable us to satisfy
(13). However, in the solution (12), any positive integer value of n is acceptable and we can, in
fact, superpose solutions of the form (12) and still have a valid solution to the PDE. Hence we first
write, instead of (12),
\[ u(x, t) = \sum_{n=1}^{\infty} E_n \sin\frac{n\pi x}{2} \cos(2n\pi t) \qquad (14) \]
from which
\[ u(x, 0) = \sum_{n=1}^{\infty} E_n \sin\frac{n\pi x}{2} \qquad (15) \]
(which looks very much like, and indeed is, a Fourier series.)
To make the solution (15) fit the initial condition (13) we do not require all the terms in the infinite
Fourier series. We need only the term with n = 2 with coefficient $E_2 = 6$ and the term for which
n = 8 with E8 = −3. All the other coefficients En have to be chosen as zero.
Using these results in (14) we obtain the solution
u(x, t) = 6 sin πx cos 4πt − 3 sin 4πx cos 16πt
The above solution perhaps seems rather involved but there is a definite sequence of logical steps
which can be readily applied to other similar problems.
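The coefficients read off by inspection above can also be recovered numerically, which is how one would proceed for a less convenient initial displacement. The sketch below is illustrative only: it assumes the standard half-range sine formula, E_n = (2/2) × the integral over (0, 2) of f(x) sin(nπx/2), and approximates the integral with a composite Simpson rule. The helper names are hypothetical.

```python
import math

# Sketch: recover the coefficients E_n of series (15) by numerical integration.

def f(x):
    """Initial displacement from condition (13)."""
    return 6 * math.sin(math.pi * x) - 3 * math.sin(4 * math.pi * x)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sine_coefficient(n, length=2.0):
    """E_n = (2/length) * integral of f(x) sin(n pi x / length) over (0, length)."""
    return (2 / length) * simpson(
        lambda x: f(x) * math.sin(n * math.pi * x / length), 0.0, length)

def u(x, t, n_max=10):
    """Partial sum of series (14) built from the computed coefficients."""
    return sum(sine_coefficient(n) * math.sin(n * math.pi * x / 2)
               * math.cos(2 * n * math.pi * t) for n in range(1, n_max + 1))

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, round(sine_coefficient(n), 6))
```

Only the n = 2 and n = 8 coefficients should differ appreciably from zero, in agreement with the solution obtained by inspection.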
Heat conduction through a furnace wall
Introduction
Conduction is a mode of heat transfer through molecular collision inside a material without any
motion of the material as a whole. If one end of a solid material is at a higher temperature, then
heat will be transferred towards the colder end because of the relative movement of the particles.
They will collide with each other with a net transfer of energy.
Energy flows through heat conductive materials by a thermal process generally known as 'gradient
heat transport'. Gradient heat transport depends on three quantities: the heat conductivity of the
material, the cross-sectional area of the material which is available for heat transfer and the spatial
gradient of temperature (driving force for the process). The larger the conductivity, the gradient,
and the cross section, the faster the heat flows.
The temperature profile within a body depends upon the rate of heat transfer to the atmosphere,
its capacity to store some of this heat, and its rate of thermal conduction to its boundaries (where
the heat is transferred to the surrounding environment). Mathematically this is stated by the heat
equation

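\[ \frac{\partial}{\partial x}\left(k\,\frac{\partial T}{\partial x}\right) = \rho c\,\frac{\partial T}{\partial t} \qquad (1) \]
The thermal diffusivity α is related to the thermal conductivity k, the specific heat c, and the density
of solid material ρ, by
\[ \alpha = \frac{k}{\rho c} \]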
Problem in Words
The wall (thickness L) of a furnace, with inside temperature 800°C, is composed of brick material
(thermal conductivity = 0.02 W m⁻¹ K⁻¹). Given that the wall thickness is 12 cm, the atmospheric
temperature is 0°C, and the density and heat capacity of the brick material are 1.9 g cm⁻³ and
6.0 J kg⁻¹ K⁻¹ respectively, estimate the temperature profile within the brick wall after 2 hours.
Solve the partial differential equation

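\[ \frac{\partial}{\partial x}\left(k\,\frac{\partial T}{\partial x}\right) = \rho c\,\frac{\partial T}{\partial t} \qquad (2) \]
subject to the initial condition
\[ T(x, 0) = 800 \sin\frac{\pi x}{2L} \qquad (3) \]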
and boundary conditions at the inner (x = L) and outer (x = 0) walls of
\[ T = 0 \quad\text{at}\quad x = 0 \qquad (4a) \]
and
\[ \frac{\partial T}{\partial x} = 0 \quad\text{at}\quad x = L \qquad (4b) \]
Find the temperature profile at t = 7200 seconds = 2 hours.
Using separation of variables
T (x, t) = X(x) × Y (t) (5)
so Equation (2) becomes
\[ \frac{Y'}{Y} = \alpha\,\frac{X''}{X} = K \qquad (6) \]
Using values of K which are zero or positive does not allow a solution which satisfies the initial and
boundary conditions. Thus, K is assumed to be negative i.e. K = −λ2 . Equation (6) separates into
the two ordinary differential equations
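\[ \frac{dY}{dt} = -\lambda^2 Y \]
\[ \frac{d^2 X}{dx^2} = -\frac{\lambda^2}{\alpha}\,X \]
with solutions
\[ Y = C e^{-\lambda^2 t} \]
\[ X = A^* \cos\left(\frac{\lambda}{\sqrt{\alpha}}\,x\right) + B^* \sin\left(\frac{\lambda}{\sqrt{\alpha}}\,x\right) \]
and
\[ T = X \times Y = e^{-\lambda^2 t}\left(A \cos\left(\frac{\lambda}{\sqrt{\alpha}}\,x\right) + B \sin\left(\frac{\lambda}{\sqrt{\alpha}}\,x\right)\right) \qquad (8) \]
where $A = A^* \times C$ and $B = B^* \times C$.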
Setting T = 0 where x = 0 (Equation (4a)) gives A = 0 i.e.
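\[ T = B e^{-\lambda^2 t} \sin\left(\frac{\lambda}{\sqrt{\alpha}}\,x\right) \qquad (9) \]
and hence
\[ \frac{dT}{dx} = B\,\frac{\lambda}{\sqrt{\alpha}}\,e^{-\lambda^2 t} \cos\left(\frac{\lambda}{\sqrt{\alpha}}\,x\right) \qquad (10) \]
Setting $\frac{dT}{dx} = 0$ where x = L (and for all t), Equation (4b) gives one of the conclusions,
B = 0
λ = 0
$\cos\left(\frac{\lambda}{\sqrt{\alpha}}\,L\right) = 0$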
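The first two possibilities (B = 0 and λ = 0) can be discounted as they leave T = 0 for all x and t
and it is not possible to satisfy the initial condition (3). Hence $\cos\left(\frac{\lambda}{\sqrt{\alpha}}\,L\right) = 0$ so $\frac{\lambda}{\sqrt{\alpha}}\,L = (n + \frac{1}{2})\pi$
and we deduce that
\[ \lambda = \frac{\sqrt{\alpha}}{L}\left(n + \frac{1}{2}\right)\pi \qquad (11) \]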
and so the temperature T satisfies
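\[ T = B\,e^{-\frac{\alpha}{L^2}\left(n + \frac{1}{2}\right)^2 \pi^2 t}\, \sin\left(\left(n + \frac{1}{2}\right)\frac{\pi x}{L}\right) \qquad (12) \]
However, this must also satisfy the initial condition (3) at t = 0, i.e.
\[ 800 \sin\frac{\pi x}{2L} = B \sin\left(\left(n + \frac{1}{2}\right)\frac{\pi x}{L}\right) \qquad (13) \]
Equating the arguments of the sine terms
\[ \frac{\pi x}{2L} = \left(n + \frac{1}{2}\right)\frac{\pi x}{L} \quad\text{so}\quad n = 0 \]
Equating the coefficients of the sine terms
800 = B
So, putting n = 0 in (11) to give $\lambda^2 = \frac{\pi^2 \alpha}{4L^2}$, the temperature profile is
\[ T = 800\,e^{-\frac{\pi^2 \alpha t}{4L^2}} \sin\frac{\pi x}{2L} \qquad (14) \]
where $\alpha = \frac{k}{\rho c} = \frac{0.02}{1900 \times 6} = 1.754 \times 10^{-6}$ m² s⁻¹.
After two hours, t = 7200 s and, with L = 0.12 m, $-\frac{\pi^2 \alpha t}{4L^2} = -2.164$ so
\[ T = 800 \times e^{-2.164} \sin\frac{\pi x}{2L} \approx 92 \sin\frac{\pi x}{2L} \qquad (15) \]
so the inner wall of the furnace has cooled from 800°C to about 92°C.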
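The arithmetic of this example can be reproduced in a few lines. The following is a sketch under stated assumptions rather than part of the original text: it converts the given data to SI units, evaluates α = k/(ρc), takes λ from Equation (11) with n = 0, and evaluates the decay factor $e^{-\lambda^2 t}$ that appears in Equation (9). All variable names are illustrative.

```python
import math

# Sketch: evaluate the furnace-wall numbers in SI units (names are illustrative).
k = 0.02        # thermal conductivity, W m^-1 K^-1
rho = 1900.0    # density: 1.9 g cm^-3 expressed in kg m^-3
c = 6.0         # heat capacity, J kg^-1 K^-1
L = 0.12        # wall thickness, m
t = 7200.0      # elapsed time, s (2 hours)
n = 0           # mode selected by the initial condition (3)

alpha = k / (rho * c)                              # thermal diffusivity, m^2 s^-1
lam = math.sqrt(alpha) * (n + 0.5) * math.pi / L   # Equation (11)
exponent = lam**2 * t                              # lambda^2 t in exp(-lambda^2 t)
amplitude = 800 * math.exp(-exponent)              # temperature at x = L, deg C

if __name__ == "__main__":
    print(f"alpha      = {alpha:.4e} m^2/s")
    print(f"lambda^2 t = {exponent:.4f}")
    print(f"T(L, t)    = {amplitude:.1f} deg C")
```

At x = L the sine factor equals 1, so the amplitude printed is the inner-wall temperature predicted by this single-mode model.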
Interpretation
The boundary conditions (4a) and (4b) represent approximations to the true boundary conditions,
approximations made to enable solution by separation of variables. More realistic conditions would
be
\[ -k\,\frac{\partial T(0, t)}{\partial x} = h_{\text{outside}}\,\{T_\infty - T(0, t)\} \]
\[ -k\,\frac{\partial T(L, t)}{\partial x} = h_{\text{inside}}\,\{T(L, t) - T_s\} \]
Solution Using
Fourier Series 25.4
Introduction
In this Section we continue to use the separation of variables method for solving PDEs but you will
find that, to be able to fit certain boundary conditions, Fourier series methods have to be used leading
to the final solution being in the (rather complicated) form of an infinite series. The techniques will be
illustrated using the two-dimensional Laplace equation but similar situations often arise in connection
with other important PDEs.

Prerequisites
Before starting this Section you should . . .
• be familiar with the separation of variables method
• be familiar with trigonometric Fourier series

Learning Outcomes
On completion you should be able to . . .
• solve the 2-D Laplace equation for given boundary conditions and utilize Fourier series in the solution when necessary
1. Solutions involving infinite Fourier series
We shall illustrate this situation using Laplace’s equation but infinite Fourier series can also be
necessary for the heat conduction and wave equations.
We recall from the previous Section that using a product solution
u(x, y) = X(x)Y(y)
in Laplace’s equation gives rise to the ODEs:
\[ \frac{X''}{X} = K, \qquad \frac{Y''}{Y} = -K \]
To determine the sign of K and hence the appropriate solutions for X(x) and Y (y) we must impose
appropriate boundary conditions. We will investigate solving Laplace’s equation in the square
0 ≤ x ≤ ℓ, 0 ≤ y ≤ ℓ
for the boundary conditions u(x, 0) = 0, u(0, y) = 0, u(ℓ, y) = 0, u(x, ℓ) = $U_0$, a constant.
See Figure 7.
[Figure 7: the square 0 ≤ x ≤ ℓ, 0 ≤ y ≤ ℓ with u = $U_0$ on the top side and u = 0 on the other three sides]
(a) We must first deduce the sign of the separation constant K:
if K is chosen to be positive, say $K = \lambda^2$, then the X equation is
\[ X'' = \lambda^2 X \]
with general solution
\[ X = A e^{\lambda x} + B e^{-\lambda x} \]
while the Y equation becomes
\[ Y'' = -\lambda^2 Y \]
with general solution
\[ Y = C \cos \lambda y + D \sin \lambda y \]
If the sign of K is negative, $K = -\lambda^2$, the solutions will change to trigonometric in x and exponential
in y.
These are the only two possibilities when we solve Laplace’s equation using separation of variables
and we must look at the boundary conditions of the problem to decide which is appropriate.
Here the boundary conditions are periodic in x (since u(0, y) = u(`, y)) and non-periodic in y which
suggests we need a solution that is periodic in x and non-periodic in y.
Thus we choose $K = -\lambda^2$ to give
\[ X(x) = A \cos \lambda x + B \sin \lambda x \]
\[ Y(y) = C e^{\lambda y} + D e^{-\lambda y} \]
(Note that had we chosen the incorrect sign for K at this stage we would later have found it impossible
to satisfy all the given boundary conditions. You might like to verify this statement.)
The appropriate general solution of Laplace’s equation for the given problem is
\[ u(x, y) = (A \cos \lambda x + B \sin \lambda x)(C e^{\lambda y} + D e^{-\lambda y}). \]
(b) Inserting the boundary conditions produces the following consequences:
u(0, y) = 0 gives A = 0
u(ℓ, y) = 0 gives sin λℓ = 0 i.e. $\lambda = \frac{n\pi}{\ell}$
where n is a positive integer 1, 2, 3, . . . . (While n = 0 also satisfies the equation it leads to the trivial
solution u = 0 only.)
u(x, 0) = 0 gives C + D = 0 i.e. D = −C
At this point the solution can be written
\[ u(x, y) = BC \sin\frac{n\pi x}{\ell}\left(e^{\frac{n\pi y}{\ell}} - e^{-\frac{n\pi y}{\ell}}\right) \]
This can be conveniently written as
\[ u(x, y) = E \sin\frac{n\pi x}{\ell} \sinh\frac{n\pi y}{\ell} \qquad (1) \]
where E = 2BC.
At this stage we have just one final boundary condition to insert to obtain information about the
constant E and the integer n. Our solution (1) gives
\[ u(x, \ell) = E \sin\frac{n\pi x}{\ell} \sinh(n\pi) \]
and clearly this is not compatible, as it stands, with the given boundary condition
u(x, ℓ) = $U_0$ = constant.
The way to proceed is again to superpose solutions of the form (1) for all positive integer values of
n to give
\[ u(x, y) = \sum_{n=1}^{\infty} E_n \sin\frac{n\pi x}{\ell} \sinh\frac{n\pi y}{\ell} \qquad (2) \]
from which the final boundary condition gives
\[ U_0 = \sum_{n=1}^{\infty} E_n \sin\frac{n\pi x}{\ell} \sinh(n\pi) = \sum_{n=1}^{\infty} b_n \sin\frac{n\pi x}{\ell}, \qquad 0 < x < \ell. \qquad (3) \]
What we have here is a Fourier (sine) series for the function
f(x) = $U_0$, 0 < x < ℓ.
Recalling the work on half-range Fourier series (Section 23.5) we must extend this definition to produce
an odd function with period 2ℓ. Hence we define
\[ f(x) = \begin{cases} U_0 & 0 < x < \ell \\ -U_0 & -\ell < x < 0 \end{cases} \qquad f(x + 2\ell) = f(x) \]
illustrated in Figure 8.
[Figure 8: odd square wave f(x) of period 2ℓ taking the values $U_0$ and $-U_0$]
(c) We can now apply standard Fourier series theory to evaluate the Fourier coefficients bn in (3).
We obtain
\[ b_n = E_n \sinh n\pi = \frac{4U_0}{2\ell} \int_0^{\ell} \sin\frac{n\pi x}{\ell}\,dx \]
(Recall that, in general, $b_n$ = 2 × the mean value of $f(x) \sin\frac{n\pi x}{\ell}$ over a period. Here, because
f(x) is odd, and hence $f(x) \sin\frac{n\pi x}{\ell}$ is even, we may take half the period for our averaging
process.)
Carrying out the integration
\[ E_n \sinh n\pi = \frac{2U_0}{n\pi}(1 - \cos n\pi) \quad\text{i.e.}\quad E_n = \begin{cases} \dfrac{4U_0}{n\pi \sinh n\pi} & n = 1, 3, 5, \ldots \\ 0 & n = 2, 4, 6, \ldots \end{cases} \]
(Since f (x) is a square wave with half-period symmetry we are not surprised that only odd harmonics
arise in the Fourier series.)
Finally substituting these results for En into (2) we obtain the solution to the given problem as the
infinite series:
\[ u(x, y) = \frac{4U_0}{\pi} \sum_{\substack{n=1 \\ (n\ \text{odd})}}^{\infty} \frac{\sin\frac{n\pi x}{\ell}\,\sinh\frac{n\pi y}{\ell}}{n \sinh n\pi} \]
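To see how this infinite series behaves in practice, it can be summed numerically. The sketch below is illustrative and not part of the Workbook. It evaluates a partial sum over odd n, rewriting the ratio sinh(nπy/ℓ)/sinh(nπ) in terms of decaying exponentials so that large n does not cause overflow. The number of terms and the default values $U_0 = 1$, ℓ = 1 are assumptions made for the example.

```python
import math

# Sketch: partial sum of the series solution for the square plate (odd n only).

def u(x, y, U0=1.0, ell=1.0, terms=200):
    """Sum the first `terms` odd harmonics; valid for 0 <= x, y <= ell."""
    total = 0.0
    for n in range(1, 2 * terms, 2):          # n = 1, 3, 5, ...
        a = n * math.pi
        # sinh(a*y/ell) / sinh(a), written with decaying exponentials only
        ratio = (math.exp(a * (y / ell - 1))
                 * (1 - math.exp(-2 * a * y / ell)) / (1 - math.exp(-2 * a)))
        total += math.sin(a * x / ell) * ratio / n
    return 4 * U0 / math.pi * total

if __name__ == "__main__":
    print("centre of the plate:", u(0.5, 0.5))
    print("middle of the top side:", u(0.5, 1.0))
    print("bottom side:", u(0.5, 0.0))
```

By symmetry the value at the centre of the square is $U_0/4$, which gives a convenient check. Convergence is rapid in the interior but slow on the side y = ℓ, where the series reduces to the Fourier series of the square wave.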
Task
Solve Laplace’s equation to determine the steady state temperature u(x, y) in the
semi-infinite plate 0 ≤ x ≤ 1, y ≥ 0. Assume that the left and right sides are
insulated and assume that the solution is bounded. The temperature along the
bottom side is a known function f (x).
First write this problem as a mathematical boundary value problem paying particular attention to the
mathematical representation of the boundary conditions:
Your solution
Answer
Since the sides x = 0 and x = 1 are insulated, the temperature gradient across these sides is zero
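i.e. $\frac{\partial u}{\partial x} = 0$ for x = 0, 0 < y < ∞ and $\frac{\partial u}{\partial x} = 0$ for x = 1, 0 < y < ∞.
The third boundary condition is u(x, 0) = f(x).
The fourth boundary condition is less obvious: since the solution should be bounded (i.e. not grow
and grow) we must demand that u(x, y) remains finite as y → ∞. (See figure below.)
[Figure: semi-infinite strip 0 ≤ x ≤ 1, y ≥ 0 with $\frac{\partial u}{\partial x} = 0$ on the sides x = 0 and x = 1 and u = f(x) on the bottom side y = 0]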
Now use the separation of variables method, putting u(x, y) = X(x)Y (y), to find the differential
equations satisfied by X(x), Y (y) and decide on the sign of the separation constant K:
Your solution
Answer
We have boundary conditions which, like the worked example above, are periodic in x. Hence the
differential equations are, again,
\[ X'' = -\lambda^2 X, \qquad Y'' = +\lambda^2 Y \]
putting the separation constant K as −λ2 .
Write down the solutions for X, for Y and hence the product solution u(x, y) = X(x)Y (y):
Your solution
Answer
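\[ X = A \cos \lambda x + B \sin \lambda x, \qquad Y = C e^{\lambda y} + D e^{-\lambda y} \]
so
\[ u = (A \cos \lambda x + B \sin \lambda x)(C e^{\lambda y} + D e^{-\lambda y}) \qquad (4) \]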
Impose the derivative boundary conditions on this solution:

Your solution
Answer
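\[ \frac{\partial u}{\partial x} = (-\lambda A \sin \lambda x + \lambda B \cos \lambda x)(C e^{\lambda y} + D e^{-\lambda y}) \]
Hence $\frac{\partial u}{\partial x}(0, y) = 0$ gives $\lambda B (C e^{\lambda y} + D e^{-\lambda y}) = 0$ for all y.
The possibility λ = 0 can be excluded since this would give a trivial constant solution in (4). Hence we
must choose B = 0.
The condition $\frac{\partial u}{\partial x}(1, y) = 0$ gives
\[ -\lambda A \sin \lambda\,(C e^{\lambda y} + D e^{-\lambda y}) = 0 \]
Choosing A = 0 would make u ≡ 0 so we must force sin λ to be zero i.e. choose λ = nπ where n
is a positive integer.
Thus, at this stage (4) becomes
\[ u = A \cos n\pi x\,(C e^{n\pi y} + D e^{-n\pi y}) = \cos n\pi x\,(E e^{n\pi y} + F e^{-n\pi y}) \qquad (5) \]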
Now impose the condition that this solution should be bounded:

Your solution
Answer
The region over which we are solving Laplace’s equation is semi-infinite i.e. the y coordinate
increases without limit. The solution for u(x, y) in (5) will increase without limit as y → ∞ due to
the term $e^{n\pi y}$ (n being a positive integer). This can be avoided, i.e. the solution will be bounded, if
the constant E is chosen as zero.
Finally, use Fourier series techniques to deal with the final boundary condition u(x, 0) = f (x):
Your solution
Answer
Superposing solutions of the form (5) (with E = 0) gives
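\[ u(x, y) = \sum_{n=0}^{\infty} F_n \cos(n\pi x)\,e^{-n\pi y} \qquad (6) \]
(the n = 0 term is a constant, which also satisfies the insulation conditions and remains bounded)
so the boundary condition gives
\[ f(x) = \sum_{n=0}^{\infty} F_n \cos n\pi x \]
We have here a half-range Fourier cosine series representation of a function f(x) defined over
0 < x < 1. Extending f(x) as an even periodic function with period 2 and using standard Fourier
series theory gives
\[ F_n = 2 \int_0^1 f(x) \cos n\pi x \, dx \qquad n = 1, 2, \ldots \]
with the constant term equal to the mean value of f(x):
\[ F_0 = \int_0^1 f(x)\, dx. \]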
Hence (6) is the solution of this given boundary value problem, the integrals giving us in principle
the Fourier coefficients Fn for a given function f (x).
Contents 26
Functions of a Complex Variable
26.1 Complex Functions
26.2 Cauchy-Riemann Equations and Conformal Mapping
26.3 Standard Complex Functions
26.4 Basic Complex Integration
26.5 Cauchy’s Theorem
26.6 Singularities and Residues
Learning outcomes
By studying the Workbook you will understand the concept of a complex function and
its derivative and learn what is meant by an analytic function and why analytic functions
are important.
You will learn about the Cauchy-Riemann equations and the concept of conformal
mapping and be able to solve complex problems involving standard complex functions
and evaluate simple complex integrals.
You will learn Cauchy's theorem and be able to use it to evaluate complex integrals.
You will learn how to develop simple Laurent series and classify singularities of a complex
function.
Finally, you will learn about the residue theorem and how to use it to solve problems.

Complex Functions 26.1
Introduction
In this introduction to functions of a complex variable we shall show how the operations of taking a
limit and of finding a derivative, which we are familiar with for functions of a real variable, extend in a
natural way to the complex plane. In fact the notation used for functions of a complex variable and
for complex operations is almost identical to that used for functions of a real variable. In effect, the
real variable x is simply replaced by the complex variable z . However, it is the interpretation of
functions of a complex variable and of complex operations that differs significantly from the real case.
In effect, a function of a complex variable is equivalent to two functions of a real variable and our
standard interpretation of a function of a real variable as being a curve on an xy plane no longer holds.
There are many situations in engineering which are described quite naturally by specifying two
harmonic functions of a real variable: a harmonic function is one satisfying the two-dimensional Laplace
equation:
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0.$$
Fluids and heat flow in two dimensions are particular examples. It turns out that knowledge of
functions of a complex variable can significantly ease the calculations involved in this area.
Prerequisites
• understand how to use the polar and exponential forms of a complex number
• be familiar with trigonometric relations, hyperbolic and logarithmic functions
• be able to form a partial derivative
• be able to take a limit

Learning Outcomes
• explain the meaning of the term analytic function
1. Complex functions
Let the complex variable $z$ be defined by $z = x + iy$ where $x$ and $y$ are real variables and $i$ is, as
usual, given by $i^2 = -1$. Now let a second complex variable $w$ be defined by $w = u + iv$ where $u$
and $v$ are real variables. If there is a relationship between $w$ and $z$ such that to each value of $z$ in a
given region of the $z$-plane there is assigned one, and only one, value of $w$ then $w$ is said to be a
function of $z$, defined on the given region. In this case we write
$$w = f(z).$$
As an example consider $w = z^2 - z$, which is defined for all values of $z$ (that is, the right-hand side
can be computed for every value of $z$). Then, remembering that $z = x + iy$,
$$w = u + iv = (x + iy)^2 - (x + iy) = x^2 + 2ixy - y^2 - x - iy.$$
Hence, equating real and imaginary parts: $u = x^2 - x - y^2$ and $v = 2xy - y$.
If $z = 2 + 3i$, for example, then $x = 2$, $y = 3$ so that $u = 4 - 2 - 9 = -7$ and $v = 12 - 3 = 9$,
giving $w = -7 + 9i$.
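The worked value above is easy to check numerically. The following is a minimal sketch using Python's built-in complex type; the helper name `f` is our own choice for illustration and is not part of the Workbook.

```python
# Check the worked example w = z^2 - z at z = 2 + 3i.

def f(z: complex) -> complex:
    """The example function w = z^2 - z."""
    return z * z - z

z = complex(2, 3)
x, y = z.real, z.imag

# Real and imaginary parts from the formulas obtained by equating parts.
u = x**2 - x - y**2
v = 2 * x * y - y

w = f(z)
print(w)       # (-7+9j)
print(u, v)    # -7.0 9.0
```

Computing $w$ directly and computing $u$ and $v$ separately give the same complex number, which is a useful sanity check on the algebra.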
Example 1
(a) For which values of $z$ is $w = \dfrac{1}{z}$ defined?
(b) For these values obtain $u$ and $v$ and evaluate $w$ when $z = 2 - i$.
Solution
(a) $w$ is defined for all $z \neq 0$.
(b) $u + iv = \dfrac{1}{x + iy} = \dfrac{1}{x + iy} \cdot \dfrac{x - iy}{x - iy} = \dfrac{x - iy}{x^2 + y^2}$. Hence $u = \dfrac{x}{x^2 + y^2}$ and $v = \dfrac{-y}{x^2 + y^2}$.
If $z = 2 - i$ then $x = 2$, $y = -1$ so that $x^2 + y^2 = 5$. Then $u = \dfrac{2}{5}$, $v = \dfrac{1}{5}$ and $w = \dfrac{2}{5} + \dfrac{1}{5}i$.
2. The limit of a function

The limit of $w = f(z)$ as $z \to z_0$ is a number $\ell$ such that $|f(z) - \ell|$ can be made as small as we
wish by making $|z - z_0|$ sufficiently small. In some cases the limit is simply $f(z_0)$, as is the case for
$w = z^2 - z$. For example, the limit of this function as $z \to i$ is $f(i) = i^2 - i = -1 - i$.
There is a fundamental difference from functions of a real variable: there, we could approach a point
on the curve $y = g(x)$ either from the left or from the right when considering limits of $g(x)$ at such
points. With the function $f(z)$ we are allowed to approach the point $z = z_0$ along any path in the
$z$-plane; we require merely that the distance $|z - z_0|$ decreases to zero.
Suppose that we want to find the limit of $f(z) = z^2 - z$ as $z \to 2 + i$ along each of the paths (a),
(b) and (c) indicated in Figure 1.
[Figure 1: the point $z_0 = 2 + i$ in the $z$-plane with three paths of approach: (a) parallel to the $x$-axis, (b) parallel to the $y$-axis, (c) along the line through the origin]
(a) Along this path $z = x + i$ (for any $x$) and $z^2 - z = x^2 + 2xi - 1 - x - i$.
That is: $z^2 - z = x^2 - 1 - x + (2x - 1)i$.
As $z \to 2 + i$, then $x \to 2$ so that the limit of $z^2 - z$ is $2^2 - 1 - 2 + (4 - 1)i = 1 + 3i$.
(b) Here $z = 2 + yi$ (for any $y$) so that $z^2 - z = 4 - y^2 - 2 + (4y - y)i$.
As $z \to 2 + i$, $y \to 1$ so that the limit of $z^2 - z$ is $4 - 1 - 2 + (4 - 1)i = 1 + 3i$.
(c) Here $z = k(2 + i)$ where $k$ is a real number. Then
$$z^2 - z = k^2(4 + 4i - 1) - k(2 + i) = 3k^2 - 2k + (4k^2 - k)i.$$
As $z \to 2 + i$, $k \to 1$ so that the limit of $z^2 - z$ is $3 - 2 + (4 - 1)i = 1 + 3i$.
In each case the limit is the same.
Task
Evaluate the limit of $f(z) = z^2 + z + 1$ as $z \to 1 + 2i$ along the paths
(a) parallel to the $x$-axis coming from the right,
(b) parallel to the $y$-axis, coming from above,
(c) the line joining the point $1 + 2i$ to the origin, coming from the origin.
Your solution
Answer
(a) Along this path $z = x + 2i$ and $z^2 + z + 1 = x^2 - 4 + x + 1 + (4x + 2)i$. As $z \to 1 + 2i$, $x \to 1$
and $z^2 + z + 1 \to -1 + 6i$.
(b) Along this path $z = 1 + yi$
and $z^2 + z + 1 = 1 - y^2 + 1 + 1 + (2y + y)i$. As $z \to 1 + 2i$, $y \to 2$ and $z^2 + z + 1 \to -1 + 6i$.
(c) If $z = k(1 + 2i)$ then $z^2 + z + 1 = k^2 + k + 1 - 4k^2 + (4k^2 + 2k)i$. As $z \to 1 + 2i$, $k \to 1$
and $z^2 + z + 1 \to -1 + 6i$.
Not all functions of a complex variable are as straightforward to analyse as the last two examples.
Consider the function $f(z) = \dfrac{\bar{z}}{z}$. Along the $x$-axis moving towards the origin from the right
$z = x$ and $\bar{z} = x$ so that $f(z) = 1$ which is the limit as $z \to 0$ along this path.
However, we can approach the origin along any path. If instead we approach the origin along the
positive $y$-axis $z = iy$ then
$\bar{z} = -iy$ and $f(z) = \dfrac{\bar{z}}{z} = -1$, which is the limit as $z \to 0$ along this path.
Since these two limits are distinct then $\displaystyle\lim_{z \to 0} \frac{\bar{z}}{z}$ does not exist.
We cannot assume that the limit of a function $f(z)$ as $z \to z_0$ is independent of the path chosen.
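The path dependence of this limit can be seen numerically. The sketch below, written with Python's built-in complex type, evaluates the ratio of the conjugate to $z$ at points closing in on the origin along the positive $x$-axis and along the positive $y$-axis; the step sizes are arbitrary choices made for illustration.

```python
# Evaluate f(z) = conj(z) / z at points approaching z = 0 along two paths.

def f(z: complex) -> complex:
    return z.conjugate() / z

# Distances from the origin shrinking towards zero.
steps = [10.0 ** (-n) for n in range(1, 8)]

along_x = [f(complex(t, 0.0)) for t in steps]   # z = x on the positive x-axis
along_y = [f(complex(0.0, t)) for t in steps]   # z = iy on the positive y-axis

print("x-axis:", along_x[-1].real)   # the values stay at 1 along this path
print("y-axis:", along_y[-1].real)   # the values stay at -1 along this path
```

Because the two sequences settle on different numbers, no single limit can exist at the origin.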
Definition of continuity
The function $f(z)$ is continuous at $z = z_0$ if the following two statements are true:
(a) $f(z_0)$ exists;

(b) $\displaystyle\lim_{z \to z_0} f(z)$ exists and is equal to $f(z_0)$.
As an example consider $f(z) = \dfrac{z^2 + 4}{z^2 + 9}$. As $z \to i$, then $f(z) \to f(i) = \dfrac{i^2 + 4}{i^2 + 9} = \dfrac{3}{8}$. Thus $f(z)$ is
continuous at $z = i$.
However, when $z^2 + 9 = 0$ then $z = \pm 3i$ and neither $f(3i)$ nor $f(-3i)$ exists. Thus $\dfrac{z^2 + 4}{z^2 + 9}$ is
discontinuous at $z = \pm 3i$. It is easily shown that these are the only points of discontinuity.
Task
State where $f(z) = \dfrac{z}{z^2 + 4}$ is discontinuous. Find $\displaystyle\lim_{z \to i} f(z)$.
Your solution
Answer
$z^2 + 4 = 0$ where $z = \pm 2i$; at these points $f(z)$ is discontinuous as $f(\pm 2i)$ does not exist.
$$\lim_{z \to i} f(z) = f(i) = \frac{i}{i^2 + 4} = \frac{1}{3}i.$$
It is easily shown that any polynomial in z is continuous everywhere whilst any rational function is
continuous everywhere except at the zeroes of the denominator.
Exercises
1. For which values of $z$ is $w = \dfrac{1}{z - i}$ defined? For these values obtain $u$ and $v$ and evaluate $w$
when $z = 1 - 2i$.
2. Find the limit of $f(z) = z^3 + z$ as $z \to i$ along the paths (a) parallel to the $x$-axis coming from
the right, (b) parallel to the $y$-axis coming from above.
3. Where is $f(z) = \dfrac{z}{z^2 + 9}$ discontinuous? Find $\displaystyle\lim_{z \to -i} f(z)$.
Answers
1. $w$ is defined for all $z \neq i$.
$$w = \frac{1}{x + yi - i} = \frac{1}{x + (y - 1)i} \times \frac{x - (y - 1)i}{x - (y - 1)i} = \frac{x - (y - 1)i}{x^2 + (y - 1)^2}.$$
$$\therefore \quad u = \frac{x}{x^2 + (y - 1)^2}, \qquad v = \frac{-(y - 1)}{x^2 + (y - 1)^2}.$$
When $z = 1 - 2i$, $x = 1$, $y = -2$ so that $u = \dfrac{1}{1 + 9} = \dfrac{1}{10}$, $v = \dfrac{3}{10}$, $w = \dfrac{1}{10} + \dfrac{3}{10}i$.
2. (a) $z = x + i$, $z^3 + z = x^3 + 3x^2 i - 2x$. As $z \to i$, $x \to 0$ and $z^3 + z \to 0$.
(b) $z = yi$, $z^3 + z = -y^3 i + yi$. As $z \to i$, $y \to 1$ and $z^3 + z \to -i + i = 0$.

3. $f(z)$ is discontinuous at $z = \pm 3i$. The limit is $f(-i) = \dfrac{-i}{-1 + 9} = -\dfrac{1}{8}i$.
3. Differentiating functions of a complex variable

The function $f(z)$ is said to be differentiable at $z = z_0$ if
$$\lim_{\Delta z \to 0} \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}$$
exists. Here $\Delta z = \Delta x + i\Delta y$.
Apart from a change of notation this is precisely the same as the definition of the derivative of a
function of a real variable. Not surprisingly then, the rules of differentiation used in functions of a
real variable can be used to differentiate functions of a complex variable. The value of the limit is
the derivative of $f(z)$ at $z = z_0$ and is often denoted by $\left.\dfrac{df}{dz}\right|_{z = z_0}$ or by $f'(z_0)$.
A point at which the derivative does not exist is called a singular point of the function.
A function $f(z)$ is said to be analytic at a point $z_0$ if it is differentiable throughout a neighbourhood
of $z_0$, however small. (A neighbourhood of $z_0$ is the region contained within some circle $|z - z_0| = r$.)
For example, the function $f(z) = \dfrac{1}{z^2 + 1}$ has singular points where $z^2 + 1 = 0$, i.e. at $z = \pm i$.
For all other points the usual rules for differentiation apply and hence
$$f'(z) = -\frac{2z}{(z^2 + 1)^2} \qquad \text{(quotient rule)}$$
So, for example, at $z = 3i$, $f'(z) = -\dfrac{6i}{(-9 + 1)^2} = -\dfrac{3}{32}i$.
Example 2
Find the singular point of the rational function $f(z) = \dfrac{z}{z + i}$. Find $f'(z)$ at other
points and evaluate $f'(2i)$.
Solution
$z + i = 0$ when $z = -i$ and this is the singular point: $f(-i)$ does not exist. Elsewhere, differentiating
using the quotient rule:
$$f'(z) = \frac{(z + i) \cdot 1 - z \cdot 1}{(z + i)^2} = \frac{i}{(z + i)^2}.$$
Thus at $z = 2i$, we have $f'(z) = \dfrac{i}{(3i)^2} = -\dfrac{1}{9}i$.
The simple function $f(z) = \bar{z} = x - iy$ is not analytic anywhere in the complex plane. To see this
consider looking at the derivative at an arbitrary point $z_0$. We easily see that
$$R = \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z} = \frac{(x_0 + \Delta x) - i(y_0 + \Delta y) - (x_0 - iy_0)}{\Delta x + i\Delta y} = \frac{\Delta x - i\Delta y}{\Delta x + i\Delta y}.$$
Hence $f(z)$ will fail to have a derivative at $z_0$ if we can show that this expression has no limit. To
do this we consider looking at the limit of the function along two distinct paths.
Along a path parallel to the $x$-axis:
$\Delta y = 0$ so that $R = \dfrac{\Delta x}{\Delta x} = 1$, and this is the limit as $\Delta z = \Delta x \to 0$.
Along a path parallel to the $y$-axis:
$\Delta x = 0$ so that $R = \dfrac{-i\Delta y}{i\Delta y} = -1$, and this is the limit as $\Delta z = i\Delta y \to 0$.
As these two values of $R$ are distinct, the limit of $\dfrac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}$ as $\Delta z \to 0$ does not exist
and so $f(z)$ fails to be differentiable at any point. Hence it is not analytic anywhere.
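The argument above can be illustrated numerically by forming the difference quotient $R$ for a small step taken first parallel to the $x$-axis and then parallel to the $y$-axis. The sketch below does this for the conjugate function and, for contrast, for $z^2$; the test point, the step length and the helper names are assumptions made purely for illustration.

```python
# Difference quotients R = (f(z0 + dz) - f(z0)) / dz for two directions of dz.

def difference_quotient(f, z0: complex, dz: complex) -> complex:
    return (f(z0 + dz) - f(z0)) / dz

def conjugate(z: complex) -> complex:
    return z.conjugate()

def square(z: complex) -> complex:
    return z * z

z0 = complex(1.0, 2.0)   # arbitrary test point (an assumption for illustration)
h = 1e-6                 # small step length

results = {}
for name, f in (("conj(z)", conjugate), ("z^2", square)):
    r_x = difference_quotient(f, z0, complex(h, 0.0))   # dz parallel to the x-axis
    r_y = difference_quotient(f, z0, complex(0.0, h))   # dz parallel to the y-axis
    results[name] = (r_x, r_y)
    print(name, r_x, r_y)
```

For the conjugate function the two quotients are close to $1$ and $-1$ respectively, whereas for $z^2$ both are close to $2z_0$, as expected of a differentiable function.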
Cauchy-Riemann Equations and Conformal Mapping 26.2
Introduction
In this Section we consider two important features of complex functions. The Cauchy-Riemann
equations provide a necessary and sufficient condition for a function $f(z)$ to be analytic in some
region of the complex plane; this allows us to find $f'(z)$ in that region by the rules of the previous
Section.
A mapping between the z-plane and the w-plane is said to be conformal if the angle between two
intersecting curves in the z-plane is equal to the angle between their mappings in the w-plane. Such
a mapping has widespread uses in solving problems in fluid flow and electromagnetics, for example,
where the given problem geometry is somewhat complicated.

Prerequisites
• understand the idea of a complex function and its derivative

Learning Outcomes
On completion you should be able to . . .
• use the Cauchy-Riemann equations to obtain the derivative of complex functions
• appreciate the idea of a conformal mapping

1. The Cauchy-Riemann equations

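Remembering that $z = x + iy$ and $w = u + iv$, we note that there is a very useful test to determine
whether a function $w = f(z)$ is analytic at a point. This is provided by the Cauchy-Riemann
equations. These state that $w = f(z)$ is differentiable at a point $z = z_0$ if, and only if,
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \quad \text{at that point}$$
(for the 'if' part it is assumed that the partial derivatives are continuous near the point, as they are
in all the examples which follow).
When these equations hold then it can be shown that the complex derivative may be determined by
using either $\dfrac{df}{dz} = \dfrac{\partial f}{\partial x}$ or $\dfrac{df}{dz} = -i\dfrac{\partial f}{\partial y}$.
(The use of ‘if, and only if,’ means that if the equations are valid, then the function is differentiable
and vice versa.)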
If we consider $f(z) = z^2 = x^2 - y^2 + 2ixy$ then $u = x^2 - y^2$ and $v = 2xy$ so that
$$\frac{\partial u}{\partial x} = 2x, \quad \frac{\partial u}{\partial y} = -2y, \quad \frac{\partial v}{\partial x} = 2y, \quad \frac{\partial v}{\partial y} = 2x.$$
It should be clear that, for this example, the Cauchy-Riemann equations are always satisfied; therefore,
the function is analytic everywhere. We find that
$$\frac{df}{dz} = \frac{\partial f}{\partial x} = 2x + 2iy = 2z \quad \text{or, equivalently,} \quad \frac{df}{dz} = -i\frac{\partial f}{\partial y} = -i(-2y + 2ix) = 2z$$
This is the result we would expect to get by simply differentiating $f(z)$ as if it was a real function.
For analytic functions this will always be the case i.e. for an analytic function $f'(z)$ can be
found using the rules for differentiating real functions.
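As a numerical illustration, the Cauchy-Riemann equations for $f(z) = z^2$ can be checked at a sample point by approximating the four partial derivatives with central differences. This is only a sketch: the sample point, the step length and the helper `partials` are assumptions introduced here, not part of the Workbook.

```python
# Approximate the partial derivatives of u and v by central differences and
# compare them with the Cauchy-Riemann equations for f(z) = z^2.

def partials(g, x: float, y: float, h: float = 1e-5):
    """Return central-difference estimates of (dg/dx, dg/dy) at (x, y)."""
    g_x = (g(x + h, y) - g(x - h, y)) / (2 * h)
    g_y = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return g_x, g_y

def u(x: float, y: float) -> float:
    return x * x - y * y      # real part of z^2

def v(x: float, y: float) -> float:
    return 2 * x * y          # imaginary part of z^2

x0, y0 = 1.5, -0.5            # arbitrary sample point
u_x, u_y = partials(u, x0, y0)
v_x, v_y = partials(v, x0, y0)

# df/dz = du/dx + i dv/dx, which should be close to 2z at the sample point.
derivative = complex(u_x, v_x)
print(u_x - v_y, u_y + v_x, derivative)
```

Both differences printed first should be very close to zero, and the estimated derivative should be close to $2z_0 = 3 - i$.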
Example 3
Show that the function $f(z) = z^3$ is analytic everywhere and hence obtain its
derivative.
Solution
$$w = f(z) = (x + iy)^3 = x^3 - 3xy^2 + (3x^2 y - y^3)i$$
Hence
$u = x^3 - 3xy^2$ and $v = 3x^2 y - y^3$.
Then
$$\frac{\partial u}{\partial x} = 3x^2 - 3y^2, \quad \frac{\partial u}{\partial y} = -6xy, \quad \frac{\partial v}{\partial x} = 6xy, \quad \frac{\partial v}{\partial y} = 3x^2 - 3y^2.$$
The Cauchy-Riemann equations are identically true and $f(z)$ is analytic everywhere.
Furthermore $\dfrac{df}{dz} = \dfrac{\partial f}{\partial x} = 3x^2 - 3y^2 + (6xy)i = 3(x + iy)^2 = 3z^2$ as we would expect.
We can easily find functions which are not analytic anywhere and others which are only analytic in
a restricted region of the complex plane. Consider again the function $f(z) = \bar{z} = x - iy$.
Here
$$u = x \ \text{ so that } \ \frac{\partial u}{\partial x} = 1, \ \text{ and } \ \frac{\partial u}{\partial y} = 0; \qquad v = -y \ \text{ so that } \ \frac{\partial v}{\partial x} = 0, \ \frac{\partial v}{\partial y} = -1.$$
The Cauchy-Riemann equations are never satisfied so that $\bar{z}$ is not differentiable anywhere and so is
not analytic anywhere.
By contrast if we consider the function $f(z) = \dfrac{1}{z}$ we find that
$$u = \frac{x}{x^2 + y^2}; \qquad v = -\frac{y}{x^2 + y^2}.$$
As can readily be shown, the Cauchy-Riemann equations are satisfied everywhere except for $x^2 + y^2 = 0$,
i.e. $x = y = 0$ (or, equivalently, $z = 0$). At all other points $f'(z) = -\dfrac{1}{z^2}$. This function is analytic
everywhere except at the single point $z = 0$.
Analyticity is a very powerful property of a function of a complex variable. Such functions tend to
behave like functions of a real variable.
Example 4
Show that if $f(z) = z\bar{z}$ then $f'(z)$ exists only at $z = 0$.
Solution
$f(z) = x^2 + y^2$ so that $u = x^2 + y^2$, $v = 0$. Then
$$\frac{\partial u}{\partial x} = 2x, \quad \frac{\partial u}{\partial y} = 2y, \quad \frac{\partial v}{\partial x} = 0, \quad \frac{\partial v}{\partial y} = 0.$$
Hence the Cauchy-Riemann equations are satisfied only where $x = 0$ and $y = 0$, i.e. where $z = 0$.
Therefore this function is not analytic anywhere.
Analytic functions and harmonic functions

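Using the Cauchy-Riemann equations in a region of the $z$-plane where $f(z)$ is analytic, gives
$$\frac{\partial^2 u}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial y}\right) = \frac{\partial}{\partial x}\left(-\frac{\partial v}{\partial x}\right) = -\frac{\partial^2 v}{\partial x^2}$$
and
$$\frac{\partial^2 u}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial x}\right) = \frac{\partial}{\partial y}\left(\frac{\partial v}{\partial y}\right) = \frac{\partial^2 v}{\partial y^2}.$$
If these differentiations are possible then $\dfrac{\partial^2 u}{\partial x \partial y} = \dfrac{\partial^2 u}{\partial y \partial x}$ so that
$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0 \qquad \text{(Laplace’s equation)}$$
In a similar way we find that
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad \text{(Can you show this?)}$$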
When $f(z)$ is analytic the functions $u$ and $v$ are called conjugate harmonic functions.
Suppose $u = u(x, y) = xy$ then it is easy to verify that $u$ satisfies Laplace’s equation (try this). We
now try to find the conjugate harmonic function $v = v(x, y)$.
First, using the Cauchy-Riemann equations:
$$\frac{\partial v}{\partial y} = \frac{\partial u}{\partial x} = y \quad \text{and} \quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} = -x.$$
Integrating the first equation gives $v = \dfrac{1}{2}y^2 +$ a function of $x$. Integrating the second equation
gives $v = -\dfrac{1}{2}x^2 +$ a function of $y$. Bearing in mind that an additive constant leaves no trace after
differentiation, we pool the information above to obtain
$$v = \frac{1}{2}(y^2 - x^2) + C \quad \text{where } C \text{ is a constant}$$
Note that $f(z) = u + iv = xy + \dfrac{1}{2}(y^2 - x^2)i + D$ where $D$ is a constant (replacing $Ci$).
We can write $f(z) = -\dfrac{1}{2}iz^2 + D$ (as you can verify). This function is analytic everywhere.
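The closing claim can be verified numerically as well as algebraically. The sketch below takes $D = 0$ (an assumption made for simplicity) and compares the real and imaginary parts of $-\frac{1}{2}iz^2$ with $u = xy$ and $v = \frac{1}{2}(y^2 - x^2)$ at a few arbitrarily chosen points.

```python
# Compare f(z) = -(1/2) i z^2 (taking D = 0) with u = xy and v = (y^2 - x^2)/2.

def f(z: complex) -> complex:
    return -0.5j * z * z

samples = [complex(1.0, 2.0), complex(-0.5, 0.25), complex(3.0, -1.0)]

max_error = 0.0
for z in samples:
    x, y = z.real, z.imag
    expected = complex(x * y, 0.5 * (y * y - x * x))   # u + iv from the text
    max_error = max(max_error, abs(f(z) - expected))

print(max_error)
```

The largest discrepancy should be at the level of floating-point rounding, confirming that $u + iv$ is a function of $z$ alone.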
2
Task
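Given the function $u = x^2 - x - y^2$
(a) Show that $u$ is harmonic, (b) Find the conjugate harmonic function, $v$.
Your solution
(a)
Answer
$$\frac{\partial u}{\partial x} = 2x - 1, \quad \frac{\partial^2 u}{\partial x^2} = 2, \quad \frac{\partial u}{\partial y} = -2y, \quad \frac{\partial^2 u}{\partial y^2} = -2.$$
Hence $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$ and $u$ is harmonic.
Your solution
(b)
Answer
Integrating $\dfrac{\partial v}{\partial y} = 2x - 1$ gives $v = 2xy - y +$ function of $x$.
Integrating $\dfrac{\partial v}{\partial x} = +2y$ gives $v = 2xy +$ function of $y$.
Ignoring the duplication, $v = 2xy - y + C$, where $C$ is a constant.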
Task
Find f (z) in terms of z, where f (z) = u + iv, where u and v are those found in
the previous Task.
Your solution
Answer
$f(z) = u + iv = x^2 - x - y^2 + 2xyi - iy + D$, $D$ constant.
Now $z^2 = x^2 - y^2 + 2ixy$ and $z = x + iy$ thus $f(z) = z^2 - z + D$.
Exercises
1. Find the singular point of the rational function $f(z) = \dfrac{z}{z - 2i}$. Find $f'(z)$ at other points and
evaluate $f'(-i)$.
2. Show that the function $f(z) = z^2 + z$ is analytic everywhere and hence obtain its derivative.
3. Show that the function $u = x^2 - y^2 - 2y$ is harmonic, find the conjugate harmonic function $v$
and hence find $f(z) = u + iv$ in terms of $z$.
Answers
1. $f(z)$ is singular at $z = 2i$. Elsewhere
$$f'(z) = \frac{(z - 2i) \cdot 1 - z \cdot 1}{(z - 2i)^2} = \frac{-2i}{(z - 2i)^2}, \qquad f'(-i) = \frac{-2i}{(-3i)^2} = \frac{-2i}{-9} = \frac{2}{9}i$$
2. $u = x^2 + x - y^2$ and $v = 2xy + y$
$$\frac{\partial u}{\partial x} = 2x + 1, \quad \frac{\partial u}{\partial y} = -2y, \quad \frac{\partial v}{\partial x} = 2y, \quad \frac{\partial v}{\partial y} = 2x + 1$$
Here the Cauchy-Riemann equations are identically true and $f(z)$ is analytic everywhere.
$$\frac{df}{dz} = \frac{\partial f}{\partial x} = 2x + 1 + 2yi = 2z + 1$$
3. $\dfrac{\partial^2 u}{\partial x^2} = 2$, $\dfrac{\partial^2 u}{\partial y^2} = -2$ therefore $u$ is harmonic.
$\dfrac{\partial v}{\partial y} = \dfrac{\partial u}{\partial x} = 2x$ therefore $v = 2xy +$ function of $x$
$\dfrac{\partial v}{\partial x} = -\dfrac{\partial u}{\partial y} = 2y + 2$ therefore $v = 2xy + 2x +$ function of $y$
$\therefore \ v = 2xy + 2x +$ constant
$f(z) = x^2 + 2ixy - y^2 + 2xi - 2y = z^2 + 2iz$
2. Conformal mapping
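Earlier in this Section we saw that the real and imaginary parts of an analytic function each satisfy
Laplace’s equation. We shall show now that the curves
$$u(x, y) = \text{constant} \quad \text{and} \quad v(x, y) = \text{constant}$$
intersect each other at right angles (i.e. are orthogonal). To see this we note that along the curve
$u(x, y) = $ constant we have $du = 0$. Hence
$$du = \frac{\partial u}{\partial x}dx + \frac{\partial u}{\partial y}dy = 0.$$
Thus, on these curves the gradient at a general point is given by
$$\frac{dy}{dx} = -\frac{\partial u / \partial x}{\partial u / \partial y}.$$
Similarly along the curve $v(x, y) = $ constant, we have
$$\frac{dy}{dx} = -\frac{\partial v / \partial x}{\partial v / \partial y}.$$
The product of these gradients is
$$\frac{\left(\dfrac{\partial u}{\partial x}\right)\left(\dfrac{\partial v}{\partial x}\right)}{\left(\dfrac{\partial u}{\partial y}\right)\left(\dfrac{\partial v}{\partial y}\right)} = -\frac{\left(\dfrac{\partial u}{\partial x}\right)\left(\dfrac{\partial u}{\partial y}\right)}{\left(\dfrac{\partial u}{\partial y}\right)\left(\dfrac{\partial u}{\partial x}\right)} = -1$$
where we have made use of the Cauchy-Riemann equations. We deduce that the curves are orthogonal.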
As an example of the practical application of this work consider two-dimensional electrostatics. If
u = constant gives the equipotential curves then the curves v = constant are the electric lines of
force. Figure 2 shows some curves from each set in the case of oppositely-charged particles near to
each other; the dashed curves are the lines of force and the solid curves are the equipotentials.
Figure 2
In ideal fluid flow the curves v = constant are the streamlines of the flow.
In these situations the function w = u + iv is the complex potential of the field.
Function as mapping
A function w = f (z) can be regarded as a mapping, which maps a point in the z-plane to a point
in the w-plane. Curves in the z-plane will be mapped into curves in the w-plane.
Consider aerodynamics where we are interested in the fluid flow in a complicated geometry (say flow
past an aerofoil). We first find the flow in a simple geometry that can be mapped to the aerofoil
shape (the complex plane with a circular hole works here). Most of the calculations necessary to find
physical characteristics such as lift and drag on the aerofoil can be performed in the simple geometry
- the resulting integrals being much easier to evaluate than in the complicated geometry.
Consider the mapping
$$w = z^2.$$
The point $z = 2 + i$ maps to $w = (2 + i)^2 = 3 + 4i$. The point $z = 2 + i$ lies on the intersection of
the two lines $x = 2$ and $y = 1$. To what curves do these map? To answer this question we note that
a point on the line $y = 1$ can be written as $z = x + i$. Then
$$w = (x + i)^2 = x^2 - 1 + 2xi$$
As usual, let $w = u + iv$, then
$$u = x^2 - 1 \quad \text{and} \quad v = 2x$$
Eliminating $x$ we obtain:
$4u = 4x^2 - 4 = v^2 - 4$ so $v^2 = 4 + 4u$ is the curve to which $y = 1$ maps.
Example 5
Onto what curve does the line $x = 2$ map?
Solution
A point on the line is $z = 2 + yi$. Then
$$w = (2 + yi)^2 = 4 - y^2 + 4yi$$
Hence $u = 4 - y^2$ and $v = 4y$ so that, eliminating $y$ we obtain
$$16u = 64 - v^2 \quad \text{or} \quad v^2 = 64 - 16u$$
In Figure 3(a) we sketch the lines $x = 2$ and $y = 1$ and in Figure 3(b) we sketch the curves into
which they map. Note these curves intersect at the point (3, 4).
[Figure 3: (a) the lines $x = 2$ and $y = 1$ in the $z$-plane, meeting at (2, 1); (b) their images $v^2 = 64 - 16u$ and $v^2 = 4 + 4u$ in the $w$-plane, meeting at (3, 4)]
The angle between the original lines in (a) is clearly $90^\circ$; what is the angle between the curves in (b)
at the point of intersection?
The curve $v^2 = 4 + 4u$ has a gradient $\dfrac{dv}{du}$. Differentiating the equation implicitly we obtain
$$2v\frac{dv}{du} = 4 \quad \text{or} \quad \frac{dv}{du} = \frac{2}{v}$$
At the point (3, 4), $\dfrac{dv}{du} = \dfrac{1}{2}$.
Task
Find $\dfrac{dv}{du}$ for the curve $v^2 = 64 - 16u$ and evaluate it at the point (3, 4).
Your solution
Answer
$2v\dfrac{dv}{du} = -16 \quad \therefore \quad \dfrac{dv}{du} = -\dfrac{8}{v}$. At $v = 4$ we obtain $\dfrac{dv}{du} = -2$.
Note that the product of the gradients at (3, 4) is $-1$ and therefore the angle between the curves at
their point of intersection is also $90^\circ$. Since the angle between the lines and the angle between the
curves is the same we say the angle is preserved.
In general, if two curves in the $z$-plane intersect at a point $z_0$, and their image curves under the
mapping $w = f(z)$ intersect at $w_0 = f(z_0)$ and the angle between the two original curves at $z_0$
equals the angle between the image curves at $w_0$ we say that the mapping is conformal at $z_0$.
An analytic function is conformal everywhere except where $f'(z) = 0$.
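The preservation of the right angle under $w = z^2$ at $z_0 = 2 + i$ can be illustrated numerically. The sketch below estimates the tangent direction of each image curve by a central difference along the original line and then measures the angle between the two tangents; the helper `image_direction` and the step length are assumptions introduced only for this illustration.

```python
import cmath
import math

def f(z: complex) -> complex:
    return z * z

def image_direction(z0: complex, direction: complex, h: float = 1e-6) -> complex:
    """Approximate tangent (as a complex number) of the image of a line through z0."""
    return (f(z0 + h * direction) - f(z0 - h * direction)) / (2 * h)

z0 = complex(2.0, 1.0)
t1 = image_direction(z0, complex(1.0, 0.0))   # image of the line y = 1
t2 = image_direction(z0, complex(0.0, 1.0))   # image of the line x = 2

slope1 = t1.imag / t1.real                    # dv/du on the image of y = 1
slope2 = t2.imag / t2.real                    # dv/du on the image of x = 2
angle = math.degrees(abs(cmath.phase(t2 / t1)))

print(slope1, slope2, angle)
```

The two slopes should be close to $\frac{1}{2}$ and $-2$, the gradients found above, and the angle between the image curves should be close to $90^\circ$.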
Task
At which points is $w = e^z$ not conformal?
Your solution
Answer
$f'(z) = e^z$. Since this is never zero the mapping is conformal everywhere.
Inversion
The mapping $w = f(z) = \dfrac{1}{z}$ is called an inversion. It maps the interior of the unit circle in the
$z$-plane to the exterior of the unit circle in the $w$-plane, and vice-versa. Note that
$$w = u + iv = \frac{x}{x^2 + y^2} - \frac{y}{x^2 + y^2}i \quad \text{and similarly} \quad z = x + iy = \frac{u}{u^2 + v^2} - \frac{v}{u^2 + v^2}i$$
so that
$$u = \frac{x}{x^2 + y^2} \quad \text{and} \quad v = -\frac{y}{x^2 + y^2}.$$
A line through the origin in the $z$-plane will be mapped into a line through the origin in the $w$-plane.
To see this, consider the line $y = mx$, for $m$ constant. Then
$$u = \frac{x}{x^2 + m^2 x^2} \quad \text{and} \quad v = -\frac{mx}{x^2 + m^2 x^2}$$
so that $v = -mu$, which is a line through the origin in the $w$-plane.
Task
Consider the line $ax + by + c = 0$ where $c \neq 0$. This represents a line in the
$z$-plane which does not pass through the origin. To what type of curve does it
map in the $w$-plane?
Your solution
Answer
The mapped curve is
$$\frac{au}{u^2 + v^2} - \frac{bv}{u^2 + v^2} + c = 0$$
Hence $au - bv + c(u^2 + v^2) = 0$. Dividing by $c$ we obtain the equation:
$$u^2 + v^2 + \frac{a}{c}u - \frac{b}{c}v = 0$$
which is the equation of a circle in the $w$-plane which passes through the origin.
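This result can be spot-checked numerically. The sketch below picks one particular line (the coefficients $a = 1$, $b = 2$, $c = 3$ are arbitrary choices for illustration), inverts several of its points and substitutes the images into the circle equation just obtained.

```python
# Invert points of the line a x + b y + c = 0 and test the circle equation
# u^2 + v^2 + (a/c) u - (b/c) v = 0 on their images.

a, b, c = 1.0, 2.0, 3.0   # arbitrary line with c != 0 (assumed for illustration)

def circle_residual(w: complex) -> float:
    u, v = w.real, w.imag
    return u * u + v * v + (a / c) * u - (b / c) * v

# Points on the line: choose x = t and solve for y.
points = [complex(t, -(a * t + c) / b) for t in (-5.0, -1.0, 0.0, 2.0, 10.0)]
residuals = [circle_residual(1 / z) for z in points]

print(max(abs(r) for r in residuals))
```

Every residual should vanish to within rounding error, and $w = 0$ also satisfies the equation, so the image circle does pass through the origin.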
Similarly, it can be shown that a circle in the $z$-plane passing through the origin maps to a line in
the $w$-plane which does not pass through the origin. Also a circle in the $z$-plane which does not pass
through the origin maps to a circle in the $w$-plane which does not pass through the origin. The inversion
mapping is an example of the bilinear transformation:
$$w = f(z) = \frac{az + b}{cz + d} \quad \text{where we demand that } ad - bc \neq 0$$
(If $ad - bc = 0$ the mapping reduces to $f(z) = $ constant.)
Task
Find the set of bilinear transformations $w = f(z) = \dfrac{az + b}{cz + d}$ which map $z = 2$ to
$w = 1$.
Your solution
Answer
$1 = \dfrac{2a + b}{2c + d}$. Hence $2a + b = 2c + d$.
Any values of $a$, $b$, $c$, $d$ satisfying this equation will do provided $ad - bc \neq 0$.
Task
Find the bilinear transformations for which $z = -1$ is mapped to $w = 3$.
Your solution
Answer
$3 = \dfrac{-a + b}{-c + d}$. Hence $-a + b = -3c + 3d$.
Example 6
Find the bilinear transformation which maps
(a) $z = 2$ to $w = 1$, and
(b) $z = -1$ to $w = 3$, and
(c) $z = 0$ to $w = -5$
Solution
We have the answers to (a) and (b) from the previous two Tasks:
$$2a + b = 2c + d$$
$$-a + b = -3c + 3d$$
If $z = 0$ is mapped to $w = -5$ then $-5 = \dfrac{b}{d}$ so that $b = -5d$. Substituting this last relation into
the first two obtained we obtain
$$2a - 2c - 6d = 0$$
$$-a + 3c - 8d = 0$$
Solving these two in terms of $d$ we find $2c = 11d$ and $2a = 17d$. Hence the transformation is:
$$w = \frac{17z - 10}{11z + 2}$$
(note that the $d$’s cancel in the numerator and denominator).
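The transformation found in Example 6 can be checked exactly with rational arithmetic. The sketch below uses Python's standard `fractions` module and takes $d = 2$, so that $a = 17$, $b = -10$, $c = 11$; the function name `w` and the dictionary of target pairs are our own illustrative choices.

```python
from fractions import Fraction

# Coefficients of w = (a z + b) / (c z + d) from Example 6, taking d = 2.
a, b, c, d = 17, -10, 11, 2

def w(z: int) -> Fraction:
    return Fraction(a * z + b, c * z + d)

# The three prescribed pairs (z, w).
targets = {2: 1, -1: 3, 0: -5}
images = {z: w(z) for z in targets}

# ad - bc must be non-zero for a genuine bilinear transformation.
determinant = a * d - b * c

print(images)
print(determinant)   # 144
```

All three points map to their prescribed images, and $ad - bc = 144 \neq 0$, so the mapping does not reduce to a constant.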
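Some other mappings are shown in Figure 4.
[Figure 4: examples of the mappings $z^2$, $z^3$, $z^{1/2}$ and $z^\alpha$ from the $z$-plane to the $w$-plane, with the angles $\pi/3$ and $\pi/\alpha$ marked]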
As an engineering application we consider the Joukowski transformation
$$w = z - \frac{\ell^2}{z} \quad \text{where } \ell \text{ is a constant.}$$
It is used to map circles which contain $z = 1$ as an interior point and which pass through $z = -1$
into shapes resembling aerofoils. Figure 5 shows an example:
[Figure 5: a circle in the $z$-plane passing through $z = -1$ and enclosing $z = 1$, and its aerofoil-shaped image in the $w$-plane]
This creates a cusp at which the associated fluid velocity can be infinite. This can be avoided by
adjusting the fluid flow in the $z$-plane. Eventually, this can be used to find the lift generated by such
an aerofoil in terms of physical characteristics such as aerofoil shape and air density and speed.
Exercise
Find a bilinear transformation $w = \dfrac{az + b}{cz + d}$ which maps
(a) $z = 0$ into $w = i$
(b) $z = -1$ into $w = 0$
(c) $z = -i$ into $w = 1$
Answer
(a) $z = 0$, $w = i$ gives $i = \dfrac{b}{d}$ so that $b = di$
(b) $z = -1$, $w = 0$ gives $0 = \dfrac{-a + b}{-c + d}$ so $-a + b = 0$ so $a = b$.
(c) $z = -i$, $w = 1$ gives $1 = \dfrac{-ai + b}{-ci + d}$ so that $-ci + d = -ai + b = d + di$ (using (a) and (b))
We conclude from (c) that $-c = d$. We also know that $a = b = di$.
Hence $w = \dfrac{diz + di}{-dz + d} = \dfrac{iz + i}{-z + 1}$
Standard Complex Functions 26.3

Introduction
In this Section we examine some of the standard functions of the calculus applied to functions of
a complex variable. Note the similarities to and differences from their equivalents in real variable
calculus.
Prerequisites
• understand the concept of a function of a complex variable and its derivative
• be familiar with the Cauchy-Riemann equations

Learning Outcomes
• apply the standard functions of a complex variable discussed in this Section

1. Standard functions of a complex variable
The functions which we have considered so far have mostly been built from powers of z. We consider
other functions here.
The exponential function

Using Euler’s relation we are led to define
$$e^z = e^{x + iy} = e^x \cdot e^{iy} = e^x(\cos y + i\sin y).$$
From this definition we can show readily that when $y = 0$ then $e^z$ reduces to $e^x$, as it should.
If, as usual, we express $w$ in real and imaginary parts then: $w = e^z = u + iv$ so that
$u = e^x\cos y$, $v = e^x\sin y$. Then
$$\frac{\partial u}{\partial x} = e^x\cos y = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -e^x\sin y = -\frac{\partial v}{\partial x}.$$
Thus by the Cauchy-Riemann equations, $e^z$ is analytic everywhere. It can be shown from the
definition that if $f(z) = e^z$ then $f'(z) = e^z$, as expected.
Task
By calculating $|e^z|^2$ show that $|e^z| = e^x$.
Your solution
Answer
$|e^z|^2 = |e^x\cos y + ie^x\sin y|^2 = (e^x\cos y)^2 + (e^x\sin y)^2 = (e^x)^2(\cos^2 y + \sin^2 y) = (e^x)^2$.
Therefore $|e^z| = e^x$.
Example 7
Find $\arg(e^z)$.
Solution
If $\theta = \arg(e^z) = \arg(e^x(\cos y + i\sin y))$ then $\tan\theta = \dfrac{e^x\sin y}{e^x\cos y} = \tan y$. Hence $\arg(e^z) = y$
(to within an integer multiple of $2\pi$).
Example 8
Find the solutions (for $z$) of the equation $e^z = i$
Solution
To find the solutions of the equation $e^z = i$ first write $i$ as $0 + 1i$ so that, equating real and imaginary
parts of $e^z = e^x(\cos y + i\sin y) = 0 + 1i$ gives $e^x\cos y = 0$ and $e^x\sin y = 1$.
Therefore $\cos y = 0$, which implies $y = \dfrac{\pi}{2} + k\pi$, where $k$ is an integer. Then, using this we see that
$\sin y = \pm 1$. But $e^x$ must be positive, so that $\sin y = +1$ and $e^x = 1$. This last equation has just
one solution, $x = 0$. In order that $\sin y = 1$ we deduce that $k$ must be even. Finally we have the
complete solution to $e^z = i$, namely:
$$z = \left(\frac{\pi}{2} + k\pi\right)i, \quad k \text{ an even integer.}$$
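The role of the parity of $k$ in Example 8 can be seen by evaluating the exponential at the candidate points. The sketch below uses Python's standard `cmath` module; the helper `candidate` and the particular values of $k$ sampled are assumptions made for illustration.

```python
import cmath
import math

def candidate(k: int) -> complex:
    """z = (pi/2 + k*pi) i, the candidate solutions with x = 0."""
    return complex(0.0, math.pi / 2 + k * math.pi)

# Even k should reproduce i; odd k should give -i instead.
even_values = [cmath.exp(candidate(k)) for k in (-4, -2, 0, 2, 4)]
odd_values = [cmath.exp(candidate(k)) for k in (-3, -1, 1, 3)]

print(even_values[2], odd_values[2])
```

To within rounding error, every even $k$ gives $i$ while every odd $k$ gives $-i$, which is why only even $k$ appears in the solution.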
Task
Obtain all the solutions to $e^z = -1$.
First find equations involving $e^x\cos y$ and $e^x\sin y$:
Your solution
Answer
As a first step to solving the equation $e^z = -1$ obtain expressions for $e^x\cos y$ and $e^x\sin y$ from
$e^z = e^x(\cos y + i\sin y) = -1 + 0i$. Hence $e^x\cos y = -1$, $e^x\sin y = 0$.
Now using the expression for $\sin y$ deduce possible values for $y$ and hence from the first equation in
$\cos y$ select the values of $y$ satisfying both equations and deduce the form of the solutions for $z$:
Your solution
Answer
The two equations we have to solve are: $e^x\cos y = -1$, $e^x\sin y = 0$. Since $e^x \neq 0$ we deduce
$\sin y = 0$ so that $y = k\pi$, where $k$ is an integer. Then $\cos y = \pm 1$ (depending as $k$ is even or odd).
But $e^x \neq -1$ so $e^x = 1$ leading to the only possible solution for $x$: $x = 0$. Then, from the first
relation: $\cos y = -1$ so $k$ must be an odd integer. Finally, $z = k\pi i$ where $k$ is an odd integer. Note
the interesting result that if $z = 0 + \pi i$ then $x = 0$, $y = \pi$ and $e^z = 1(\cos\pi + i\sin\pi) = -1$. Hence
$e^{i\pi} = -1$, a remarkable equation relating fundamental numbers of mathematics in one relation.
Trigonometric functions
We denote the complex counterparts of the real trigonometric functions $\cos x$ and $\sin x$ by $\cos z$ and
$\sin z$ and we define these functions by the relations:
$$\cos z \equiv \frac{1}{2}(e^{iz} + e^{-iz}), \qquad \sin z \equiv \frac{1}{2i}(e^{iz} - e^{-iz}).$$
These definitions are consistent with the definitions (Euler’s relations) used for $\cos x$ and $\sin x$.
Other trigonometric functions can be defined in a way which parallels real variable functions. For
example,
$$\tan z \equiv \frac{\sin z}{\cos z}.$$
Note that
$$\frac{d}{dz}(\sin z) = \frac{d}{dz}\left[\frac{1}{2i}(e^{iz} - e^{-iz})\right] = \frac{1}{2i}(ie^{iz} + ie^{-iz}) = \frac{1}{2}(e^{iz} + e^{-iz}) = \cos z.$$
Task
Show that $\dfrac{d}{dz}(\cos z) = -\sin z$.
Your solution
Answer
$$\frac{d}{dz}(\cos z) = \frac{d}{dz}\left[\frac{1}{2}(e^{iz} + e^{-iz})\right] = \frac{i}{2}(e^{iz} - e^{-iz}) = -\frac{1}{2i}(e^{iz} - e^{-iz}) = -\sin z.$$
Among other useful relationships are
$$\sin^2 z + \cos^2 z = -\frac{1}{4}(e^{iz} - e^{-iz})^2 + \frac{1}{4}(e^{iz} + e^{-iz})^2 = \frac{1}{4}(-e^{2iz} + 2 - e^{-2iz} + e^{2iz} + 2 + e^{-2iz}) = \frac{1}{4} \cdot 4 = 1.$$
Also, using standard trigonometric expansions:
$$\sin z = \sin(x + iy) = \sin x\cos iy + \cos x\sin iy = \sin x\left(\frac{e^{-y} + e^{y}}{2}\right) + \cos x\left(\frac{e^{-y} - e^{y}}{2i}\right)$$
$$= \sin x\cosh y - \frac{1}{i}\cos x\sinh y$$
$$= \sin x\cosh y + i\cos x\sinh y.$$
Task
Show that $\cos z = \cos x\cosh y - i\sin x\sinh y$.
Your solution
Answer
$$\cos z = \cos(x + iy) = \cos x\cos iy - \sin x\sin iy = \cos x\left(\frac{e^{-y} + e^{y}}{2}\right) - \sin x\left(\frac{e^{-y} - e^{y}}{2i}\right)$$
$$= \cos x\cosh y + \frac{1}{i}\sin x\sinh y$$
$$= \cos x\cosh y - i\sin x\sinh y$$
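The two expansions in terms of real trigonometric and hyperbolic functions can be compared with a library implementation of the complex sine and cosine. The sketch below uses Python's standard `cmath` and `math` modules; the function names `sin_expanded` and `cos_expanded` and the sample points are our own choices for illustration.

```python
import cmath
import math

def sin_expanded(x: float, y: float) -> complex:
    """sin(x + iy) = sin x cosh y + i cos x sinh y."""
    return complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))

def cos_expanded(x: float, y: float) -> complex:
    """cos(x + iy) = cos x cosh y - i sin x sinh y."""
    return complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))

samples = [(0.3, 1.2), (-2.0, 0.5), (1.0, -3.0), (4.0, 0.0)]

max_error = max(
    max(abs(cmath.sin(complex(x, y)) - sin_expanded(x, y)),
        abs(cmath.cos(complex(x, y)) - cos_expanded(x, y)))
    for x, y in samples
)
print(max_error)
```

The largest difference should be at the level of floating-point rounding. Note that with $y = 0$ both expansions reduce to the familiar real functions.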
Hyperbolic functions
In an obvious extension from their real variable counterparts we define functions $\cosh z$ and $\sinh z$
by the relations:
$$\cosh z = \frac{1}{2}(e^z + e^{-z}), \qquad \sinh z = \frac{1}{2}(e^z - e^{-z}).$$
Note that $\dfrac{d}{dz}(\sinh z) = \dfrac{1}{2}\dfrac{d}{dz}(e^z - e^{-z}) = \dfrac{1}{2}(e^z + e^{-z}) = \cosh z$.
Task
Determine $\dfrac{d}{dz}(\cosh z)$.
Your solution
Answer
$$\frac{d}{dz}(\cosh z) = \frac{1}{2}\frac{d}{dz}(e^z + e^{-z}) = \frac{1}{2}(e^z - e^{-z}) = \sinh z.$$
Other relationships parallel those for trigonometric functions. For example it can be shown that
$$\cosh z = \cosh x\cos y + i\sinh x\sin y \quad \text{and} \quad \sinh z = \sinh x\cos y + i\cosh x\sin y$$
These relationships can be deduced from the general relations between trigonometric and hyperbolic
functions (can you prove these?):
$$\cosh iz = \cos z \quad \text{and} \quad \sinh iz = i\sin z$$
Example 9
Show that $\cosh^2 z - \sinh^2 z = 1$.
Solution
$$\cosh^2 z = \frac{1}{4}(e^z + e^{-z})^2 = \frac{1}{4}(e^{2z} + 2 + e^{-2z})$$
$$\sinh^2 z = \frac{1}{4}(e^z - e^{-z})^2 = \frac{1}{4}(e^{2z} - 2 + e^{-2z})$$
$$\therefore \quad \cosh^2 z - \sinh^2 z = \frac{1}{4}(2 + 2) = 1.$$
Alternatively since $\cosh iz = \cos z$ then $\cosh z = \cos iz$ and since $\sinh iz = i\sin z$ it follows that
$\sinh z = -i\sin iz$ so that
$$\cosh^2 z - \sinh^2 z = \cos^2 iz + \sin^2 iz = 1$$
Logarithmic function
The real exponential function is one-to-one and so possesses an inverse function. The complex
exponential is not one-to-one (it is periodic with period $2\pi i$) but we can still define an inverse,
which we call $\ln z$, provided we accept that it takes many values. If
$w = u + iv$ is a complex number such that $e^w = z$ then the logarithm function is defined through the
statement: $w = \ln z$. To see what this means it will be convenient to express the complex number
$z$ in exponential form as discussed in 10.3: $z = re^{i\theta}$ and so
$$w = u + iv = \ln(re^{i\theta}) = \ln r + i\theta.$$
Therefore $u = \ln r = \ln|z|$ and $v = \theta$. However $e^{i(\theta + 2k\pi)} = e^{i\theta} \cdot e^{2k\pi i} = e^{i\theta} \cdot 1 = e^{i\theta}$ for integer $k$. This
means that we must be more general and say that $v = \theta + 2k\pi$, $k$ integer. If we take $k = 0$ and
confine $v$ to the interval $-\pi < v \le \pi$, the corresponding value of $w$ is called the principal value of
$\ln z$ and is written $\mathrm{Ln}(z)$.
In general, to each value of $z \neq 0$ there are an infinite number of values of $\ln z$, each with the same
real part. These values are partitioned into branches of range $2\pi$ by considering in turn $k = 0$,
$k = \pm 1$, $k = \pm 2$ etc. Each branch is defined on the whole $z$-plane with the exception of the point
$z = 0$. On each branch the function $\ln z$ is analytic with derivative $\dfrac{1}{z}$ except along the negative real
axis (and at the origin). Figure 6 represents the situation schematically.
Figure 6
The familiar properties of a logarithm apply to $\ln z$, except that in the case of $\mathrm{Ln}(z)$ we have to
adjust the argument by a multiple of $2\pi$ to comply with $-\pi < \mathrm{Im}(\mathrm{Ln}(z)) \le \pi$.
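For example
(a) $\ln(1 + i) = \ln\left(\sqrt{2}\,e^{i\pi/4}\right) = \ln\sqrt{2} + i\left(\dfrac{\pi}{4} + 2k\pi\right) = \dfrac{1}{2}\ln 2 + i\left(\dfrac{\pi}{4} + 2k\pi\right)$.
(b) $\mathrm{Ln}(1 + i) = \dfrac{1}{2}\ln 2 + i\dfrac{\pi}{4}$.
(c) If $\ln z = 1 - i\pi$ then $z = e^{1 - i\pi} = e^1 \cdot e^{-i\pi} = -e$.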
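These three results can be reproduced numerically. In Python's standard `cmath` module, `cmath.log` returns the principal value of the logarithm, so it corresponds to $\mathrm{Ln}$ in the notation used here; the other branches are obtained by adding integer multiples of $2\pi i$. The variable names below are our own and the sketch is illustrative only.

```python
import cmath
import math

z = complex(1.0, 1.0)

principal = cmath.log(z)                      # principal value Ln(1 + i)
expected = complex(0.5 * math.log(2.0), math.pi / 4)

# Other branches differ from the principal value by 2*k*pi*i.
branches = [principal + 2j * math.pi * k for k in (-2, -1, 0, 1, 2)]
recovered = [cmath.exp(w) for w in branches]  # each should return 1 + i

z_c = cmath.exp(complex(1.0, -math.pi))       # example (c): ln z = 1 - i*pi
print(principal, z_c)
```

Every branch value exponentiates back to $1 + i$, and the value computed for example (c) agrees with $-e$ to within rounding error.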
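Task
Find (a) $\ln(1 - i)$ (b) $\mathrm{Ln}(1 - i)$ (c) $z$ when $\ln z = 1 + i\pi$
Your solution
Answer
(a) $\ln(1 - i) = \ln\left(\sqrt{2}\,e^{-i\pi/4}\right) = \ln\sqrt{2} + i\left(-\dfrac{\pi}{4} + 2k\pi\right) = \dfrac{1}{2}\ln 2 + i\left(-\dfrac{\pi}{4} + 2k\pi\right)$.
(b) $\mathrm{Ln}(1 - i) = \dfrac{1}{2}\ln 2 - i\dfrac{\pi}{4}$.
(c) $z = e^{1 + i\pi} = e^1 \cdot e^{i\pi} = -e$.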
Exercises
1. Obtain all the solutions to ez = 1.
2. Show that 1 + tan2 z ≡ sec2 z
3. Show that cosh2 z + sinh2 z ≡ cosh 2z

√ √
4. Find ln( 3 + i), Ln( 3 + i).
5. Find z when ln z = 2 + πi
Answers
1. $e^x\cos y = 1$ and $e^x\sin y = 0$ ∴ sin y = 0 and y = kπ where k is an integer.
Then cos y = ±1 and since $e^x > 0$ we take cos y = 1 and $e^x = 1$ so that x = 0. Then
cos y = 1 and k is an even integer. ∴ z = 2kπi for k integer.
2. \[ \tan z = \frac{1}{i}\,\frac{e^{iz} - e^{-iz}}{e^{iz} + e^{-iz}} \]
\[ 1 + \tan^2 z = 1 - \frac{e^{2iz} + e^{-2iz} - 2}{e^{2iz} + e^{-2iz} + 2} = \frac{4}{e^{2iz} + e^{-2iz} + 2} = \frac{2^2}{(e^{iz} + e^{-iz})^2} = \frac{1}{\cos^2 z} = \sec^2 z. \]
3. \[ \cosh^2 z + \sinh^2 z = \frac{1}{4}\left(e^{2z} + 2 + e^{-2z}\right) + \frac{1}{4}\left(e^{2z} - 2 + e^{-2z}\right) = \frac{1}{2}\left(e^{2z} + e^{-2z}\right) = \cosh 2z. \]
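4. Since $|\sqrt{3} + i| = \sqrt{3 + 1} = 2$ and $\arg(\sqrt{3} + i) = \frac{\pi}{6}$: $\ln(\sqrt{3} + i) = \ln 2 + i\left(\frac{\pi}{6} + 2k\pi\right)$. $\mathrm{Ln}(\sqrt{3} + i) = \ln 2 + i\,\frac{\pi}{6}$.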
5. If ln z = 2 + πi then $z = e^{2+\pi i} = e^{2}\,e^{i\pi} = -e^{2}$.
Basic Complex
Integration 26.4

Introduction
Complex variable techniques have been used in a wide variety of areas of engineering. This has
been particularly true in areas such as electromagnetic field theory, fluid dynamics, aerodynamics
and elasticity. With the rapid developments in computer technology and the consequential use of
sophisticated algorithms for analysis and design in engineering there has been, in recent years, less
emphasis on the use of complex variable techniques and a shift towards numerical techniques applied
directly to the underlying full partial differential equations which model the situation. However it
is useful to have an analytical solution, possibly for an idealized model in order to develop a better
understanding of the solution and to develop confidence in numerical estimates for the solution of
more sophisticated models.
The design of aerofoil sections for aircraft is an area where the theory was developed using complex
variable techniques. Throughout engineering, transforms defined as complex integrals in one form or
another play a major role in analysis and design. The use of complex variable techniques allows us
to develop criteria for the stability of systems.
Prerequisites
Before starting this Section you should . . .
• be able to carry out integration of simple real-valued functions
• be familiar with the basic ideas of functions of a complex variable
• be familiar with line integrals

Learning Outcomes
• understand the concept of complex integrals

1. Complex integrals
If f (z) is a single-valued, continuous function in some region R in the complex plane then we define
the integral of f (z) along a path C in R (see Figure 7) as
\[ \int_C f(z)\,dz = \int_C (u + iv)(dx + i\,dy). \]
Figure 7
Here we have written f(z) and dz in real and imaginary parts:
\[ f(z) = u + iv \quad\text{and}\quad dz = dx + i\,dy. \]
Then we can separate the integral into real and imaginary parts as
\[ \int_C f(z)\,dz = \int_C (u\,dx - v\,dy) + i\int_C (v\,dx + u\,dy). \]
We often interpret real integrals in terms of area; now we define complex integrals in terms of line
integrals over paths in the complex plane. The line integrals are evaluated as described in 29.
Example 10
Obtain the complex integral:
\[ \int_C z\,dz \]
where C is the straight line path from z = 1 + i to z = 3 + i. See Figure 8.
Figure 8: the paths C, C1 and C2 joining the points 1 + i, 3 + i and 3 + 3i
Solution
Here, since y is constant (y = 1) along the given path then z = x + i, implying that u = x and
v = 1. Also, as y is constant, dy = 0.
Therefore,
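\[
\begin{aligned}
\int_C z\,dz &= \int_C (u\,dx - v\,dy) + i\int_C (v\,dx + u\,dy) \\
&= \int_1^3 x\,dx + i\int_1^3 1\,dx \\
&= \left[\frac{x^2}{2}\right]_1^3 + i\,\Big[x\Big]_1^3 \\
&= \left(\frac{9}{2} - \frac{1}{2}\right) + i(3 - 1) = 4 + 2i.
\end{aligned}
\]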
Task
Evaluate $\int_{C_1} z\,dz$ where C1 is the straight line path from z = 3 + i to z = 3 + 3i.
First obtain expressions for u, v, dx and dy by finding an appropriate expression for z along the path:
Your solution
Answer
Along the path z = 3 + iy, implying that u = 3 and v = y. Also dz = 0 + idy.
Now find limits on y:
Your solution
Answer
The limits on y are: y = 1 to y = 3.
Now evaluate the integral:
Your solution
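Answer
\[
\begin{aligned}
\int_{C_1} z\,dz &= \int_{C_1} (u\,dx - v\,dy) + i\int_{C_1} (v\,dx + u\,dy) \\
&= \int_1^3 -y\,dy + i\int_1^3 3\,dy \\
&= \left[\frac{-y^2}{2}\right]_1^3 + i\,\Big[3y\Big]_1^3 = \left(-\frac{9}{2} + \frac{1}{2}\right) + i(9 - 3) \\
&= -4 + 6i.
\end{aligned}
\]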
Task
Evaluate $\int_{C_2} z\,dz$ where C2 is the straight line path from z = 1 + i to z = 3 + 3i.
Your solution
Answer
We first need to find the equation of the line C2 in the Argand plane.
We note that both points lie on the line y = x so the complex equation of the straight line is
z = x + ix giving u = x and v = x. Also dz = dx + idx = (1 + i)dx.
\[ \therefore \int_{C_2} z\,dz = \int_{C_2} (x\,dx - x\,dx) + i\int_{C_2} (x\,dx + x\,dx) = i\int_{C_2} 2x\,dx \]
Next, we see that the limits on x are x = 1 to x = 3. We are now in a position to evaluate the
integral:
\[ \int_{C_2} z\,dz = i\int_1^3 2x\,dx = i\,\Big[x^2\Big]_1^3 = i(9 - 1) = 8i. \]
Note that this result is the sum of the integrals along C and C1 . You might have expected this.
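The three straight-line integrals of z obtained so far can be cross-checked numerically. The sketch below approximates a line integral along a straight segment by the midpoint rule; the function name `line_integral` and the number of sub-intervals are illustrative assumptions rather than anything prescribed in this Section.

```python
def line_integral(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f(z) dz
    along the straight line from the complex point a to the complex point b."""
    dz = (b - a) / n  # constant complex step along the segment
    return sum(f(a + (j + 0.5) * dz) for j in range(n)) * dz

identity = lambda z: z

I_C = line_integral(identity, 1 + 1j, 3 + 1j)    # path C
I_C1 = line_integral(identity, 3 + 1j, 3 + 3j)   # path C1
I_C2 = line_integral(identity, 1 + 1j, 3 + 3j)   # path C2

print(I_C, I_C1, I_C2)
print(I_C + I_C1 - I_C2)  # difference between the two routes from 1+i to 3+3i
```

The last printed value compares the route along C followed by C1 with the direct route C2; it should vanish to within rounding error.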
A more intricate example now follows.
Example 11
Evaluate $\int_{C_1} z^2\,dz$ where C1 is that part of the unit circle going anticlockwise from
the point z = 1 to the point z = i. See Figure 9.
Figure 9
Solution
First, note that $z^2 = (x + iy)^2 = x^2 - y^2 + 2xyi$ and dz = dx + i dy giving
\[ \int_{C_1} z^2\,dz = \int_{C_1} \left\{(x^2 - y^2)\,dx - 2xy\,dy\right\} + i\int_{C_1} \left\{2xy\,dx + (x^2 - y^2)\,dy\right\}. \]
This is obtained by simply expressing the integral in real and imaginary parts. These integrals cannot
be evaluated in this form since y and x are related. Instead we re-write them in terms of the single
variable θ.
Note that on the unit circle: x = cos θ, y = sin θ so that dx = − sin θ dθ and dy = cos θ dθ.
The expressions $(x^2 - y^2)$ and $2xy$ can be expressed in terms of 2θ since
\[ x^2 - y^2 = \cos^2\theta - \sin^2\theta \equiv \cos 2\theta, \qquad 2xy = 2\cos\theta\sin\theta \equiv \sin 2\theta. \]
Now as the point z moves from z = 1 to z = i along the path C1 the parameter θ changes from
θ = 0 to θ = π/2. Hence,
\[ \int_{C_1} f(z)\,dz = \int_0^{\pi/2} \left\{-\cos 2\theta\,\sin\theta - \sin 2\theta\,\cos\theta\right\} d\theta + i\int_0^{\pi/2} \left\{-\sin 2\theta\,\sin\theta + \cos 2\theta\,\cos\theta\right\} d\theta. \]
We can simplify these daunting-looking integrals by using the trigonometric identities:

sin(A + B) ≡ sin A cos B + cos A sin B and cos(A + B) ≡ cos A cos B − sin A sin B.
We obtain (choosing A = 2θ and B = θ in both expressions):
− cos 2θ sin θ − sin 2θ cos θ ≡ −(sin θ cos 2θ + cos θ sin 2θ) ≡ − sin 3θ.
Also − sin 2θ sin θ + cos 2θ cos θ ≡ cos 3θ.
Now we can complete the evaluation of our integral:
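\[
\begin{aligned}
\int_{C_1} f(z)\,dz &= \int_0^{\pi/2} (-\sin 3\theta)\,d\theta + i\int_0^{\pi/2} \cos 3\theta\,d\theta \\
&= \left[\frac{1}{3}\cos 3\theta\right]_0^{\pi/2} + i\left[\frac{1}{3}\sin 3\theta\right]_0^{\pi/2} = \left(0 - \frac{1}{3}\right) + i\left(-\frac{1}{3} - 0\right) = -\frac{1}{3} - \frac{1}{3}i \equiv -\frac{1}{3}(1 + i).
\end{aligned}
\]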
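As a check on the trigonometric manipulation, the same integral can be approximated directly from the parametrisation $z = e^{i\theta}$, $dz = ie^{i\theta}\,d\theta$. The following sketch uses the midpoint rule in θ; the function name `arc_integral` and the step count are illustrative assumptions.

```python
import cmath
import math

def arc_integral(f, theta0, theta1, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz along the
    unit-circle arc z = exp(i*theta), theta0 <= theta <= theta1."""
    h = (theta1 - theta0) / n
    total = 0j
    for j in range(n):
        theta = theta0 + (j + 0.5) * h
        z = cmath.exp(1j * theta)
        total += f(z) * 1j * z * h  # dz = i * exp(i*theta) * d(theta)
    return total

quarter = arc_integral(lambda z: z ** 2, 0.0, math.pi / 2)
print(quarter)  # should be close to -(1 + i)/3
```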
In the last Example we integrated $z^2$ over a given path. We had to perform some intricate mathematics
to get the value. It would be convenient if there was a simpler way to obtain the value of such
complex integrals. This is explored in the following Tasks.
Task
Evaluate $\left[\frac{1}{3}z^3\right]_1^i$
Your solution
Answer
We obtain $-\frac{1}{3}(1 + i)$ again, which is the same result as from the previous Example.
It would seem that, by carrying out an analogue of real integration (simply integrating the function
and substituting in the limits) we can obtain the answer much more easily. Is this coincidence?
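If you return to the second Task of this Section (the integral of z along C2) you will note:
\[
\begin{aligned}
\left[\frac{1}{2}z^2\right]_{1+i}^{3+3i} &= \frac{1}{2}\left\{(3 + 3i)^2 - (1 + i)^2\right\} \\
&= \frac{1}{2}\left\{9 + 18i - 9 - 1 - 2i + 1\right\} \\
&= \frac{1}{2}(16i) = 8i,
\end{aligned}
\]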
the result we obtained earlier.
We shall investigate these ‘coincidences’ in Section 26.5.
As a variation on this example, suppose that the path C1 is the entire circumference of the unit circle
travelled in an anti-clockwise direction. The limits are θ = 0 and θ = 2π. Hence
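\[
\begin{aligned}
\int_{C_1} f(z)\,dz &= \int_0^{2\pi} (-\sin 3\theta)\,d\theta + i\int_0^{2\pi} \cos 3\theta\,d\theta \\
&= \left[\frac{1}{3}\cos 3\theta\right]_0^{2\pi} + i\left[\frac{1}{3}\sin 3\theta\right]_0^{2\pi} \\
&= \left(\frac{1}{3} - \frac{1}{3}\right) + i(0 - 0) = 0.
\end{aligned}
\]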
Is there an underlying reason for this result? (We shall see in Section 26.5.)
Another technique for evaluating integrals taken around the unit circle is shown in the next example,
in which we need to evaluate $\oint_C \frac{1}{z}\,dz$ where C is the unit circle.
Note the use of $\oint$ since we have a closed path; we could have used this notation earlier.
Task
Evaluate $\oint_C \frac{1}{z}\,dz$ where C is the unit circle.
First show that a point z on the unit circle can be written $z = e^{i\theta}$ and hence find dz in terms of θ:
Your solution
Answer
On the unit circle a point (x, y) is such that x = cos θ, y = sin θ and hence z = cos θ + i sin θ
which, using Euler’s relation, can be seen to be $z = e^{i\theta}$.
Then $\frac{dz}{d\theta} = ie^{i\theta}$ so that $dz = ie^{i\theta}\,d\theta$.
Now evaluate the integral $\oint_C \frac{1}{z}\,dz$.
Your solution
Answer
\[ \oint_C \frac{1}{z}\,dz = \int_0^{2\pi} \frac{1}{e^{i\theta}}\,ie^{i\theta}\,d\theta = \int_0^{2\pi} i\,d\theta = 2\pi i. \]
We now quote one of the most important results in complex integration which incorporates the last
result.
Key Point 1
If n is an integer and C is the circle centre z = z0 and radius r, that is, it has equation |z − z0| = r
then
\[ \oint_C \frac{dz}{(z - z_0)^n} = \begin{cases} 0, & n \ne 1; \\ 2\pi i, & n = 1. \end{cases} \]
Note that the result is independent of the value of r.
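Key Point 1 can be illustrated numerically. The sketch below approximates the closed integral around a circle by an equally spaced sum in θ; the helper name `circle_integral`, the centre `z0 = 1 + 2i` and the two radii are hypothetical choices made only for illustration.

```python
import cmath
import math

def circle_integral(f, centre, r, n=2000):
    """Approximate the closed integral of f(z) dz taken anticlockwise
    around the circle |z - centre| = r, using n equally spaced points."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        w = r * cmath.exp(1j * j * h)        # w = z - centre on the circle
        total += f(centre + w) * 1j * w * h  # dz = i * w * d(theta)
    return total

z0 = 1 + 2j  # hypothetical centre
results = {}
for power in (-2, 0, 1, 2, 3):
    for r in (0.5, 2.0):
        value = circle_integral(lambda z: 1 / (z - z0) ** power, z0, r)
        results[(power, r)] = value
        print(power, r, value)
```

Only the entries with `power` equal to 1 should differ appreciably from zero, and they should agree for both radii.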
Two-dimensional fluid flow

Introduction
Functions of a complex variable find a very elegant application in the mathematical treatment of
two-dimensional fluid flow.
Problem in words
Find the forces and moments due to fluid flowing past a cylinder.
Figure 10 shows a cross section of a cylinder (not necessarily circular), whose boundary is C, placed
in a steady non-viscous flow of an ideal fluid; the flow takes place in planes parallel to the xy plane.
The cylinder is out of the plane of the paper. The flow of the fluid exerts forces and turning moments
upon the cylinder. Let X, Y be the components, in the x and y directions respectively, of the force
on the cylinder and let M be the anticlockwise moment (on the cylinder) about the origin.
Figure 10
Blasius’ theorem (which we shall not prove) states that
\[ X - iY = \frac{1}{2}\,i\rho \oint_C \left(\frac{dw}{dz}\right)^2 dz \quad\text{and}\quad M = \mathrm{Re}\left\{-\frac{1}{2}\,\rho \oint_C z\left(\frac{dw}{dz}\right)^2 dz\right\} \]
where Re denotes the real part, ρ is the (constant) density of the fluid and w = u + iv is the complex
potential (see Section 261) for the flow. Both ρ and w are presumed known.
We shall find X, Y and M if the cylinder has a circular cross section and the boundary is specified
by |z| = a. Let the flow be a uniform stream with speed U .
Now, using a standard result, the complex potential describing this situation is:
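\[ w = U\left(z + \frac{a^2}{z}\right) \quad\text{so that}\quad \frac{dw}{dz} = U\left(1 - \frac{a^2}{z^2}\right) \quad\text{and}\quad \left(\frac{dw}{dz}\right)^2 = U^2\left(1 - \frac{2a^2}{z^2} + \frac{a^4}{z^4}\right). \]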
Using Key Point 1 with z0 = 0 :
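\[ X - iY = \frac{1}{2}\,i\rho \oint_C \left(\frac{dw}{dz}\right)^2 dz = \frac{1}{2}\,i\rho U^2 \oint_C \left(1 - \frac{2a^2}{z^2} + \frac{a^4}{z^4}\right) dz = 0 \quad\text{so}\quad X = Y = 0. \]
Also, $z\left(\frac{dw}{dz}\right)^2 = U^2\left(z - \frac{2a^2}{z} + \frac{a^4}{z^3}\right)$. The only term to contribute to M is $\frac{-2a^2U^2}{z}$.
Again using Key Point 1, this leads to $-4\pi a^2U^2 i$ and this has zero real part. Hence M = 0, also.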
Interpretation
The implication is that no net force or moment acts on the cylinder. This is not so in practice. The
discrepancy arises from neglecting the viscosity of the fluid.
Exercises
1. Obtain the integral $\int_C z\,dz$ along the straight-line paths
(a) from z = 2 + 2i to z = 5 + 2i
(b) from z = 5 + 2i to z = 5 + 5i
(c) from z = 2 + 2i to z = 5 + 5i
2. Find $\int_C (z^2 + z)\,dz$ where C is the part of the unit circle going anti-clockwise from the point
z = 1 to the point z = i.
3. Find $\oint_C f(z)\,dz$ where C is the circle |z − z0| = r for the cases
(a) $f(z) = \frac{1}{z^2}$, z0 = 1
(b) $f(z) = \frac{1}{(z - 1)^2}$, z0 = 1
(c) $f(z) = \frac{1}{z - 1 - i}$, z0 = 1 + i
Answers
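1. (a) Here y is constant along the given path z = x + 2i so that u = x and v = 2. Also
dy = 0. Thus
\[
\begin{aligned}
\int_C z\,dz &= \int_C (u\,dx - v\,dy) + i\int_C (v\,dx + u\,dy) = \int_2^5 x\,dx + i\int_2^5 2\,dx \\
&= \left[\frac{x^2}{2}\right]_2^5 + i\,\Big[2x\Big]_2^5 = \left(\frac{25}{2} - \frac{4}{2}\right) + i(10 - 4) = \frac{21}{2} + 6i.
\end{aligned}
\]
(b) Here dx = 0, v = y, u = 5. Thus
\[
\begin{aligned}
\int_C z\,dz &= \int_2^5 (-y)\,dy + i\int_2^5 5\,dy \\
&= \left[-\frac{y^2}{2}\right]_2^5 + i\,\Big[5y\Big]_2^5 = \left(-\frac{25}{2} + \frac{4}{2}\right) + i(25 - 10) = -\frac{21}{2} + 15i.
\end{aligned}
\]
(c) z = x + ix, u = x, v = x, dz = (1 + i)dx, so
\[ \int_C z\,dz = \int_C (x\,dx - x\,dx) + i\int_C (x\,dx + x\,dx) = i\int_C 2x\,dx = 2i\left[\frac{x^2}{2}\right]_2^5 = 21i. \]
Note that the result in (c) is the sum of the results in (a) and (b).
2. \[ \int_C (z^2 + z)\,dz = \left[\frac{z^3}{3} + \frac{z^2}{2}\right]_1^i = \left(\frac{1}{3}i^3 + \frac{i^2}{2}\right) - \left(\frac{1}{3} + \frac{1}{2}\right) = -\frac{4}{3} - \frac{1}{3}i. \]
3. Using Key Point 1 we have (a) 0, (b) 0, (c) 2πi.
Note that in all cases the result is independent of r.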

Cauchy’s Theorem 26.5

Introduction
In this Section we introduce Cauchy’s theorem which allows us to simplify the calculation of certain
contour integrals. A second result, known as Cauchy’s integral formula, allows us to evaluate some
integrals of the form $\oint_C \frac{f(z)}{z - z_0}\,dz$ where z0 lies inside C.

Prerequisites
Before starting this Section you should . . .
• be familiar with the basic ideas of functions of a complex variable
• be familiar with line integrals

Learning Outcomes
• state and use Cauchy’s theorem
• state and use Cauchy’s integral formula

1. Cauchy’s theorem
Simply-connected regions
A region is said to be simply-connected if any closed curve in that region can be shrunk to a point
without any part of it leaving the region. The interior of a square and the interior of a circle are examples of
simply-connected regions. In Figure 11 (a) and (b) the shaded grey area is the region and a typical closed
curve is shown inside the region. In Figure 11 (c) the region contains a hole (the white area inside).
The shaded region between the two circles is not simply-connected; curve C1 can shrink to a point
but curve C2 cannot shrink to a point without leaving the region, due to the hole inside it.
Figure 11: regions (a), (b) and (c); the curves C1 and C2 are drawn in region (c)
Key Point 2
Cauchy’s Theorem
The theorem states that if f(z) is analytic everywhere within a simply-connected region then:
\[ \oint_C f(z)\,dz = 0 \]
for every simple closed path C lying in the region.

This is perhaps the most important theorem in the area of complex analysis.
As a straightforward example note that $\oint_C z^2\,dz = 0$, where C is the unit circle, since $z^2$ is analytic
everywhere (see Section 261). Indeed $\oint_C z^2\,dz = 0$ for any simple contour: it need not be circular.
Consider the contour shown in Figure 12 and assume f(z) is analytic everywhere on and inside the
contour C.
Figure 12
Then by analogy with real line integrals
\[ \int_{AEB} f(z)\,dz + \int_{BDA} f(z)\,dz = \oint_C f(z)\,dz = 0 \quad\text{by Cauchy’s theorem.} \]
Therefore
\[ \int_{AEB} f(z)\,dz = -\int_{BDA} f(z)\,dz = \int_{ADB} f(z)\,dz \]
(since reversing the direction of integration reverses the sign of the integral).
This implies that we may choose any path between A and B and the integral will have the same
value providing f(z) is analytic in the region concerned.
Integrals of analytic functions only depend on the positions of the points A and B, not on the path
connecting them. This explains the ‘coincidences’ referred to previously in Section 26.4.
Task
Using ‘simple’ integration evaluate $\int_i^{1+2i} \cos z\,dz$, and explain why this is valid.
Your solution
Answer
\[ \int_i^{1+2i} \cos z\,dz = \Big[\sin z\Big]_i^{1+2i} = \sin(1 + 2i) - \sin i. \]
This way of determining the integral is legitimate because cos z is analytic (everywhere).
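Path independence for an analytic integrand can be observed numerically. The sketch below integrates cos z from i to 1 + 2i along two different routes and compares both with sin(1 + 2i) − sin i; the helper `line_integral`, the intermediate corner point 1 + i and the step count are illustrative assumptions.

```python
import cmath

def line_integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz
    along the straight line from a to b."""
    dz = (b - a) / n
    return sum(f(a + (j + 0.5) * dz) for j in range(n)) * dz

a, b = 1j, 1 + 2j
corner = 1 + 1j  # hypothetical intermediate point for the second route

direct = line_integral(cmath.cos, a, b)
dog_leg = line_integral(cmath.cos, a, corner) + line_integral(cmath.cos, corner, b)
exact = cmath.sin(b) - cmath.sin(a)

print(direct, dog_leg, exact)
```

All three printed values should agree to within the accuracy of the midpoint rule.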
We now investigate what occurs when the closed path of integration does not necessarily lie within
a simply-connected region. Consider the situation described in Figure 13.
Figure 13
Let f (z) be analytic in the region bounded by the closed curves C1 and C2 . The region is cut by the
line segment joining A and B.
Consider now the closed curve AEABFBA travelling in the direction indicated by the arrows. No
line can cross the cut AB and be regarded as remaining in the region. Because of the cut the shaded
region is simply connected. Cauchy’s theorem therefore applies (see Key Point 2).
Therefore
\[ \oint_{AEABFBA} f(z)\,dz = 0 \quad\text{since f(z) is analytic within and on the curve AEABFBA.} \]
Note that
\[ \int_{AB} f(z)\,dz = -\int_{BA} f(z)\,dz, \quad\text{being a simple change of direction.} \]
Also, we can divide the closed curve into smaller sections:
\[
\begin{aligned}
\oint_{AEABFBA} f(z)\,dz &= \int_{AEA} f(z)\,dz + \int_{AB} f(z)\,dz + \int_{BFB} f(z)\,dz + \int_{BA} f(z)\,dz \\
&= \int_{AEA} f(z)\,dz + \int_{BFB} f(z)\,dz = 0,
\end{aligned}
\]
i.e.
\[ \oint_{C_1} f(z)\,dz - \oint_{C_2} f(z)\,dz = 0 \]
(since we assume that closed paths are travelled anticlockwise).
Therefore $\oint_{C_1} f(z)\,dz = \oint_{C_2} f(z)\,dz$.
This allows us to evaluate $\oint_{C_1} f(z)\,dz$ by replacing C1 by any curve C2 such that the region between
them contains no singularities (see Section 261) of f(z). Often we choose a circle for C2.
Example 12
Determine $\oint_C \frac{6}{z(z - 3)}\,dz$ where C is the curve |z − 3| = 5 shown in Figure 14.
Figure 14
Solution
We observe that $f(z) = \frac{6}{z(z - 3)}$ is analytic everywhere except at z = 0 and z = 3.
Let C1 be the circle of unit radius centred at z = 3 and C2 be the unit circle centred at the origin.
By analogy with the previous example we state that
\[ \oint_C \frac{6}{z(z - 3)}\,dz = \oint_{C_1} \frac{6}{z(z - 3)}\,dz + \oint_{C_2} \frac{6}{z(z - 3)}\,dz. \]
(To show this you would need two cuts: from C to C1 and from C to C2.)
The remaining parts of this problem are presented as two Tasks.
Task
Expand $\frac{6}{z(z - 3)}$ into partial fractions.
Your solution
Answer
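Let $\frac{6}{z(z - 3)} \equiv \frac{A}{z} + \frac{B}{z - 3} \equiv \frac{A(z - 3) + Bz}{z(z - 3)}$. Then A(z − 3) + Bz ≡ 6.
If z = 0, A(−3) = 6 ∴ A = −2. If z = 3, B × 3 = 6 ∴ B = 2.
\[ \therefore \frac{6}{z(z - 3)} \equiv -\frac{2}{z} + \frac{2}{z - 3}. \]
Thus:
\[ \oint_C \frac{6}{z(z - 3)}\,dz = \oint_{C_1} \frac{2}{z - 3}\,dz - \oint_{C_1} \frac{2}{z}\,dz + \oint_{C_2} \frac{2}{z - 3}\,dz - \oint_{C_2} \frac{2}{z}\,dz = I_1 - I_2 + I_3 - I_4. \]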
Task
Find the values of I1 , I2 , I3 , I4 , using Key Point 1 (page 35):
(a) Find the value of I1 :
Your solution
Answer
Using Key Point 1 we find that I1 = 2 × 2πi = 4πi.
(b) Find the value of I2 :
Your solution
Answer
The function $\frac{1}{z}$ is analytic inside and on C1 so that I2 = 0.
(c) Find the value of I3 :
Your solution
Answer
The function $\frac{1}{z - 3}$ is analytic inside and on C2 so I3 = 0.
(d) Find the value of I4 :
Your solution
Answer
I4 = 4πi again using Key Point 1.
(e) Finally, calculate I = I1 − I2 + I3 − I4 :
Your solution
Answer
\[ \oint_C \frac{6\,dz}{z(z - 3)} = 4\pi i - 0 + 0 - 4\pi i = 0. \]
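The contour replacement used in Example 12 can be checked numerically. The sketch below approximates the integral around the large circle C and around the two unit circles C1 and C2; the helper name `circle_integral` and the number of sample points are illustrative assumptions.

```python
import cmath
import math

def circle_integral(f, centre, r, n=2000):
    """Approximate the closed integral of f(z) dz taken anticlockwise
    around the circle |z - centre| = r, using n equally spaced points."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        w = r * cmath.exp(1j * j * h)        # w = z - centre
        total += f(centre + w) * 1j * w * h  # dz = i * w * d(theta)
    return total

f = lambda z: 6 / (z * (z - 3))

around_C = circle_integral(f, 3.0, 5.0)   # C:  |z - 3| = 5
around_C1 = circle_integral(f, 3.0, 1.0)  # C1: unit circle about z = 3
around_C2 = circle_integral(f, 0.0, 1.0)  # C2: unit circle about z = 0

print(around_C, around_C1, around_C2)
```

The value around C1 should be close to 4πi, the value around C2 close to −4πi, and their sum should match the value around C.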
Exercises
1. Evaluate $\int_{1+i}^{2+3i} \sin z\,dz$.
2. Determine $\oint_C \frac{4}{z(z - 2)}\,dz$ where C is the contour |z − 2| = 4.
Answers
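1. $\int_{1+i}^{2+3i} \sin z\,dz = \Big[-\cos z\Big]_{1+i}^{2+3i} = \cos(1 + i) - \cos(2 + 3i)$ since sin z is analytic everywhere.
2. [Figure: the contour C with a circle C1 about z = 2 and a circle C2 about z = 0]
$f(z) = \frac{4}{z(z - 2)}$ is analytic everywhere except at z = 0 and z = 2.
\[ \text{Call } I = \oint_C \frac{4}{z(z - 2)}\,dz = \oint_{C_1} \frac{4}{z(z - 2)}\,dz + \oint_{C_2} \frac{4}{z(z - 2)}\,dz. \]
Now $\frac{4}{z(z - 2)} \equiv -\frac{2}{z} + \frac{2}{z - 2}$ so that
\[
\begin{aligned}
I &= \oint_{C_1} \frac{2}{z - 2}\,dz - \oint_{C_1} \frac{2}{z}\,dz + \oint_{C_2} \frac{2}{z - 2}\,dz - \oint_{C_2} \frac{2}{z}\,dz \\
&= I_1 + I_2 + I_3 + I_4
\end{aligned}
\]
(each of I1, . . . , I4 includes the sign in front of its integral).
I2 and I3 are zero because of analyticity.
I1 = 2 × 2πi = 4πi, by Key Point 1 and I4 = −4πi likewise.
Hence I = 4πi + 0 + 0 − 4πi = 0.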
2. Cauchy’s integral formula
This is a generalization of the result in Key Point 2:
Key Point 3
Cauchy’s Integral Formula
If f(z) is analytic inside and on the boundary C of a simply-connected region then for any point z0
inside C,
\[ \oint_C \frac{f(z)}{z - z_0}\,dz = 2\pi i\, f(z_0). \]
Example 13
Evaluate $\oint_C \frac{z}{z^2 + 1}\,dz$ where C is the path shown in Figure 15:
C1 : |z − i| = 1/2
Figure 15
Solution
We note that $z^2 + 1 \equiv (z + i)(z - i)$.
\[ \text{Let } \frac{z}{z^2 + 1} = \frac{z}{(z + i)(z - i)} = \frac{z/(z + i)}{z - i}. \]
The numerator z/(z + i) is analytic inside and on the path C1 so putting z0 = i in the Cauchy
integral formula (Key Point 3)
\[ \oint_{C_1} \frac{z}{z^2 + 1}\,dz = 2\pi i\left[\frac{i}{i + i}\right] = 2\pi i \cdot \frac{1}{2} = \pi i. \]
Task
Evaluate $\oint_C \frac{z}{z^2 + 1}\,dz$ where C is the path (refer to the diagram)
(a) C2 : |z + i| = 1/2
(b) C3 : |z| = 2.
[Diagram: the circles C2 and C3 and the points i and −i]
(a) Use the Cauchy integral formula to find an expression for $\oint_{C_2} \frac{z}{z^2 + 1}\,dz$:
Your solution
Answer
$\frac{z}{z^2 + 1} = \frac{z/(z - i)}{z + i}$. The numerator is analytic inside and on the path C2 so putting z0 = −i in the
Cauchy integral formula gives
\[ \oint_{C_2} \frac{z}{z^2 + 1}\,dz = 2\pi i\left[\frac{-i}{-2i}\right] = \pi i. \]
(b) Now find $\oint_{C_3} \frac{z}{z^2 + 1}\,dz$:
Your solution
Answer
By analogy with the previous part,
\[ \oint_{C_3} \frac{z}{z^2 + 1}\,dz = \oint_{C_1} \frac{z}{z^2 + 1}\,dz + \oint_{C_2} \frac{z}{z^2 + 1}\,dz = \pi i + \pi i = 2\pi i. \]
The derivative of an analytic function
If f (z) is analytic in a simply-connected region then at any interior point of the region, z0 say, the
derivatives of f (z) of any order exist and are themselves analytic (which illustrates what a powerful
property analyticity is!). The derivatives at the point z0 are given by Cauchy’s integral formula for
derivatives:
\[ f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz \]
where C is any simple closed curve, in the region, which encloses z0.
Note the case n = 1:
\[ f'(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^2}\,dz. \]
Example 14
Evaluate the contour integral
\[ \oint_C \frac{z^3}{(z - 1)^2}\,dz \]
where C is a contour which encloses the point z = 1.
Solution
Since $f(z) = \frac{z^3}{(z - 1)^2}$ has a pole of order 2 at z = 1 then $\oint_C f(z)\,dz = \oint_{C'} \frac{z^3}{(z - 1)^2}\,dz$
where C′ is a circle centred at z = 1.
If $g(z) = z^3$ then $\oint_C f(z)\,dz = \oint_{C'} \frac{g(z)}{(z - 1)^2}\,dz$.
Since g(z) is analytic within and on the circle C′ we use Cauchy’s integral formula for derivatives
to show that
\[ \oint_C \frac{z^3}{(z - 1)^2}\,dz = 2\pi i \times \frac{1}{1!}\Big[g'(z)\Big]_{z=1} = 2\pi i\,\Big[3z^2\Big]_{z=1} = 6\pi i. \]
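The results of Examples 13 and 14 can be confirmed by direct numerical integration around suitable circles. In this sketch the helper `circle_integral` is an illustrative name, and for Example 14 the contour is assumed, purely for the check, to be the circle of radius 1 centred at z = 1.

```python
import cmath
import math

def circle_integral(f, centre, r, n=4000):
    """Approximate the closed integral of f(z) dz taken anticlockwise
    around the circle |z - centre| = r, using n equally spaced points."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        w = r * cmath.exp(1j * j * h)        # w = z - centre
        total += f(centre + w) * 1j * w * h  # dz = i * w * d(theta)
    return total

# Example 13: z/(z^2 + 1) around C1: |z - i| = 1/2
ex13 = circle_integral(lambda z: z / (z ** 2 + 1), 1j, 0.5)

# Example 14: z^3/(z - 1)^2 around an assumed circle |z - 1| = 1
ex14 = circle_integral(lambda z: z ** 3 / (z - 1) ** 2, 1.0, 1.0)

print(ex13)  # compare with pi*i
print(ex14)  # compare with 6*pi*i
```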
Exercise
Evaluate $\oint_C \frac{z}{z^2 + 9}\,dz$ where C is the path:
(a) C1 : |z − 3i| = 1 (b) C2 : |z + 3i| = 1 (c) C3 : |z| = 6.
Answers
(a) We will use the fact that $\frac{z}{z^2 + 9} = \frac{z}{(z + 3i)(z - 3i)} = \frac{z/(z + 3i)}{z - 3i}$.
The numerator $\frac{z}{z + 3i}$ is analytic inside and on the path C1 so putting z0 = 3i in
Cauchy’s integral formula
\[ \oint_{C_1} \frac{z}{z^2 + 9}\,dz = 2\pi i\left[\frac{3i}{3i + 3i}\right] = 2\pi i \times \frac{1}{2} = \pi i. \]
(b) Here $\frac{z}{z^2 + 9} = \frac{z/(z - 3i)}{z + 3i}$.
The numerator is analytic inside and on the path C2 so putting z0 = −3i in Cauchy’s
integral formula:
\[ \oint_{C_2} \frac{z}{z^2 + 9}\,dz = 2\pi i\left[\frac{-3i}{-3i - 3i}\right] = \pi i. \]
(c) The integral is the sum of the two previous integrals and has value 2πi.
Singularities
and Residues 26.6
Introduction
Taylor’s series for functions of a real variable is generalised here to the Laurent series for a function
of a complex variable, which includes terms of the form (z − z0 )−n .
The different types of singularity of a complex function f (z) are discussed and the definition of a
residue at a pole is given. The residue theorem is used to evaluate contour integrals where the only
singularities of f (z) inside the contour are poles.

Prerequisites
• be familiar with binomial and Taylor series

Learning Outcomes
On completion you should be able to . . .
• understand the concept of a Laurent series
• find residues and use the residue theorem

1. Taylor and Laurent series

Many of the results in the area of series of real variables can be extended into complex variables. As
an example, the concept of radius of convergence of a series is extended to the concept of a circle
of convergence. If the circle of convergence of a series of complex numbers is |z − z0 | = ρ then
the series will converge if |z − z0 | < ρ.
For example, consider the function
\[ f(z) = \frac{1}{1 - z} \]
It has a singularity at z = 1. We can obtain the Maclaurin series, centred at z = 0, as
\[ f(z) = 1 + z + z^2 + z^3 + \dots \]
The circle of convergence is |z| = 1.
The radius of convergence for a series centred on z = z0 is the distance between z0 and the nearest
singularity.
Laurent series
One of the shortcomings of Taylor series is that the circle of convergence is often only a part of the
region in which f (z) is analytic.
As an example, the series
\[ 1 + z + z^2 + z^3 + \dots \quad\text{converges to}\quad f(z) = \frac{1}{1 - z} \]
only inside the circle |z| = 1 even though f(z) is analytic everywhere except at z = 1.
The Laurent series is an attempt to represent f (z) as a series over as large a region as possible.
We expand the series around a point of singularity up to, but not including, the singularity itself.
Figure 16 shows an annulus of convergence r1 < |z − z0 | < r2 within which the Laurent series
(which is an extension of the Taylor series) will converge. The extension includes negative powers of
(z − z0 ).
Figure 16
Now, we state Laurent’s theorem in Key Point 4.
Key Point 4
Laurent’s Theorem
If f(z) is analytic throughout a closed annulus D centred at z = z0 then at any point z inside D we
can write
\[ f(z) = a_0 + a_1(z - z_0) + a_2(z - z_0)^2 + \dots + b_1(z - z_0)^{-1} + b_2(z - z_0)^{-2} + \dots \]
where the coefficients $a_n$ and $b_n$ (for each n) are given by
\[ a_n = \frac{1}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz, \qquad b_n = \frac{1}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{1-n}}\,dz, \]
the integral being taken around any simple closed path C lying inside D and encircling the inner
boundary. (Refer to Figure 16.)
Example 15
Expand $f(z) = \frac{1}{1 - z}$ in terms of negative powers of z which will be valid if
|z| > 1.
Solution

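First note that $1 - z = -z\left(1 - \frac{1}{z}\right)$ so that
\[
\begin{aligned}
f(z) &= -\frac{1}{z}\cdot\frac{1}{1 - \frac{1}{z}} = -\frac{1}{z}\left(1 + \frac{1}{z} + \frac{1}{z^2} + \frac{1}{z^3} + \dots\right) \\
&= -\frac{1}{z} - \frac{1}{z^2} - \frac{1}{z^3} - \frac{1}{z^4} - \dots
\end{aligned}
\]
This is valid for $\left|\frac{1}{z}\right| < 1$, that is, $\frac{1}{|z|} < 1$ or |z| > 1. Note that we used a binomial expansion rather
than the theorem itself. Also note that together with the earlier result we are now able to expand
$f(z) = \frac{1}{1 - z}$ everywhere, except for |z| = 1.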
Task
This Task concerns $f(z) = \frac{1}{1 + z}$.
(a) Using the binomial series, expand f(z) in terms of non-negative powers of z:
Your solution
Answer
\[ f(z) = (1 + z)^{-1} = 1 - z + z^2 - z^3 + \dots \]
(b) State the values of z for which this expansion is valid:
Your solution
Answer
|z| < 1 (standard result for a GP).
(c) Using the identity $1 + z = z\left(1 + \frac{1}{z}\right)$ expand $f(z) = \frac{1}{1 + z}$ in terms of negative powers of z
and state the values of z for which your expansion is valid:
Your solution
Answer
\[ f(z) = \frac{1}{z\left(1 + \frac{1}{z}\right)} = \frac{1}{z}\left(1 + \frac{1}{z}\right)^{-1} = \frac{1}{z}\left(1 - \frac{1}{z} + \frac{1}{z^2} - \frac{1}{z^3} + \dots\right) = \frac{1}{z} - \frac{1}{z^2} + \frac{1}{z^3} - \frac{1}{z^4} + \dots \]
Valid for $\left|\frac{1}{z}\right| < 1$ i.e. |z| > 1.
2. Classifying singularities
If the function f (z) has a singularity at z = z0 , and in a neighbourhood of z0 (i.e. a region of the
complex plane which contains z0 ) there are no other singularities, then z0 is an isolated singularity
of f (z).
The principal part of the Laurent series is the part containing negative powers of (z − z0). If the
principal part has a finite number of terms say
\[ \frac{b_1}{z - z_0} + \frac{b_2}{(z - z_0)^2} + \dots + \frac{b_m}{(z - z_0)^m} \quad\text{and}\quad b_m \ne 0 \]
then f(z) has a pole of order m at z = z0 (we have written $b_1$ for $a_{-1}$, $b_2$ for $a_{-2}$ etc. for simplicity).
Note that if $b_1 = b_2 = \dots = b_{m-1} = 0$ and $b_m \ne 0$, the pole is still of order m.
A pole of order 1 is called a simple pole whilst a pole of order 2 is called a double pole. If the
principal part of the Laurent series has an infinite number of terms then z = z0 is called an isolated
essential singularity of f(z).
The function
\[ f(z) = \frac{i}{z(z - i)} \equiv \frac{1}{z - i} - \frac{1}{z} \]
has a simple pole at z = 0 and another simple pole at z = i. The function $e^{1/(z-2)}$ has an isolated
essential singularity at z = 2. Some complex functions have non-isolated singularities called branch
points. An example of such a function is $\sqrt{z}$.
Task
Classify the singularities of the function $f(z) = \frac{2}{z} - \frac{1}{z^2} + \frac{1}{z + i} + \frac{3}{(z - i)^4}$.
Your solution
Answer
A pole of order 2 at z = 0, a simple pole at z = −i and a pole of order 4 at z = i.
Exercises
1. Expand $f(z) = \frac{1}{2 - z}$ in terms of negative powers of z to give a series which will be valid if
|z| > 2.
2. Classify the singularities of the function: $f(z) = \frac{1}{z^2} + \frac{1}{(z + i)^2} - \frac{2}{(z + i)^3}$.
Answers
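1. $2 - z = -z\left(1 - \frac{2}{z}\right)$ so that:
\[ f(z) = \frac{-1}{z\left(1 - \frac{2}{z}\right)} = -\frac{1}{z}\left(1 - \frac{2}{z}\right)^{-1} = -\frac{1}{z}\left(1 + \frac{2}{z} + \frac{4}{z^2} + \frac{8}{z^3} + \dots\right) = -\frac{1}{z} - \frac{2}{z^2} - \frac{4}{z^3} - \frac{8}{z^4} - \dots \]
This is valid for $\left|\frac{2}{z}\right| < 1$ or |z| > 2.
2. A double pole at z = 0 and a pole of order 3 at z = −i.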
3. The residue theorem

Suppose f(z) is a function which is analytic inside and on a closed contour C, except for a pole of
order m at z = z0, which lies inside C. To evaluate $\oint_C f(z)\,dz$ we can expand f(z) in a Laurent
series in powers of (z − z0). If we let Γ be a circle of centre z0 lying inside C then, as we saw in
Section 262, $\oint_C f(z)\,dz = \oint_\Gamma f(z)\,dz$.
From Key Point 1 in Section 26.4 we know that the integral of each of the positive and negative
powers of (z − z0) is zero with the exception of $\frac{b_1}{z - z_0}$ and this has value $2\pi i\,b_1$. Since it is the only
coefficient remaining after the integration, it is called the residue of f(z) at z = z0. It is given by
\[ b_1 = \frac{1}{2\pi i} \oint_C f(z)\,dz. \]
Calculating the residue, for any given function f (z) is an important task and we examine some results
concerning its determination for functions with simple poles, double poles and poles of order m.
Finding the residue
If f(z) has a simple pole at z = z0 then $f(z) = \frac{b_1}{z - z_0} + a_0 + a_1(z - z_0) + a_2(z - z_0)^2 + \dots$
so that $(z - z_0)f(z) = b_1 + a_0(z - z_0) + a_1(z - z_0)^2 + a_2(z - z_0)^3 + \dots$
Taking limits as z → z0, $\lim_{z \to z_0} \{(z - z_0)f(z)\} = b_1$.
For example, let $f(z) = \frac{1}{z^2 + 1} \equiv \frac{1}{(z + i)(z - i)} \equiv \frac{-\frac{1}{2i}}{z + i} + \frac{\frac{1}{2i}}{z - i}$.
There are simple poles at z = −i and z = i. The residue at z = i is
\[ \lim_{z \to i} \left\{(z - i)\,\frac{1}{(z + i)(z - i)}\right\} = \lim_{z \to i} \frac{1}{z + i} = \frac{1}{2i}. \]
Similarly, the residue at z = −i is
\[ \lim_{z \to -i} \left\{(z + i)\,\frac{1}{(z + i)(z - i)}\right\} = \lim_{z \to -i} \frac{1}{z - i} = \frac{-1}{2i}. \]
Task
This Task concerns $f(z) = \frac{1}{z^2 + 4}$.
(a) Identify the singularities of f(z):
Your solution
Answer
$f(z) = \frac{1}{(z + 2i)(z - 2i)} = \frac{-\frac{1}{4i}}{z + 2i} + \frac{\frac{1}{4i}}{z - 2i}$. There are simple poles at z = −2i and z = 2i.
(b) Now find the residues of f(z) at z = 2i and at z = −2i:

Your solution
Answer
\[ \lim_{z \to 2i} \left\{(z - 2i)\,\frac{1}{(z + 2i)(z - 2i)}\right\} = \lim_{z \to 2i} \frac{1}{z + 2i} = \frac{1}{4i}. \]
Similarly at z = −2i.
\[ \lim_{z \to -2i} \left\{(z + 2i)\,\frac{1}{(z + 2i)(z - 2i)}\right\} = \lim_{z \to -2i} \frac{1}{z - 2i} = -\frac{1}{4i}. \]
In general the residue at a pole of order m at z = z0 is
\[ b_1 = \frac{1}{(m - 1)!} \lim_{z \to z_0} \left\{\frac{d^{m-1}}{dz^{m-1}}\Big[(z - z_0)^m f(z)\Big]\right\}. \]
As an example, if $f(z) = \frac{z^2 + 1}{(z + 1)^3}$, f(z) has a pole of order 3 at z = −1 (m = 3).
We need first
\[ \frac{d^2}{dz^2}\left[(z + 1)^3\,\frac{z^2 + 1}{(z + 1)^3}\right] = \frac{d^2}{dz^2}\Big[z^2 + 1\Big] = \frac{d}{dz}\Big[2z\Big] = 2. \]
Then $b_1 = \frac{1}{2!} \times 2 = 1$.
2!
We have a useful result (Key Point 5) which allows us to evaluate contour integrals quickly when
f (z) has only poles inside the contour.
Key Point 5
The Residue Theorem
\[ \oint_C f(z)\,dz = 2\pi i \times (\text{sum of the residues at the poles inside } C). \]
Example 16
Let $f(z) = \frac{1}{z^2 + 1}$. Find the integrals $\oint_{C_1} f(z)\,dz$, $\oint_{C_2} f(z)\,dz$ and $\oint_{C_3} f(z)\,dz$ in which C1 is
the circle |z − i| = 1, C2 is the circle |z + i| = 1, and C3 is any path enclosing
both z = i and z = −i. See Figure 17.
Figure 17
Solution
Figure 17 shows that only the pole at z = i lies inside C1. The residue at this pole is $\frac{1}{2i}$, as we
found earlier. Hence $\oint_{C_1} f(z)\,dz = 2\pi i \times \frac{1}{2i} = \pi$.
Also, the residue at z = −i, the only pole inside C2, is $-\frac{1}{2i}$. Hence
\[ \oint_{C_2} f(z)\,dz = -2\pi i \times \frac{1}{2i} = -\pi. \]
Note that the contour C3 encloses both poles so that $\oint_{C_3} f(z)\,dz = 2\pi i\left(\frac{1}{2i} - \frac{1}{2i}\right) = 0$.
Exercises
1. Identify the singularities of $f(z) = \frac{1}{z^2(z^2 + 9)}$ and find the residue at each of them.
2. Find the integral $\oint_C f(z)\,dz$ where $f(z) = \frac{1}{z^2 + 4}$ and C is
(a) the circle |z − 2i| = 1;
(b) the circle |z + 2i| = 1;
(c) any closed path enclosing both z = 2i and z = −2i.
Answers
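1. Double pole at z = 0, simple poles at z = 3i and z = −3i.
Residue at z = 3i
\[ = \lim_{z \to 3i} \left\{(z - 3i)\,\frac{1}{z^2(z + 3i)(z - 3i)}\right\} = \lim_{z \to 3i} \frac{1}{z^2(z + 3i)} = \frac{1}{9i^2} \times \frac{1}{6i} = -\frac{1}{54i} = \frac{1}{54}\,i. \]
Residue at z = −3i
\[ = \lim_{z \to -3i} \left\{(z + 3i)\,\frac{1}{z^2(z + 3i)(z - 3i)}\right\} = \lim_{z \to -3i} \frac{1}{z^2(z - 3i)} = \frac{1}{9i^2} \times \frac{1}{-6i} = -\frac{1}{54}\,i. \]
For the double pole at z = 0 we find $\frac{d}{dz}\Big[(z - 0)^2 f(z)\Big] = \frac{d}{dz}\left[\frac{1}{z^2 + 9}\right] = \frac{-2z}{(z^2 + 9)^2}$.
Then, $\lim_{z \to 0} \frac{-2z}{(z^2 + 9)^2} = 0$.
2. [Figure: circle C1 about z = 2i, circle C2 about z = −2i, and a path C3 enclosing both]
\[ f(z) = \frac{1}{(z + 2i)(z - 2i)} \]
(a) Only the pole at z = 2i lies inside C1. The residue there is $\lim_{z \to 2i} \frac{1}{z + 2i} = \frac{1}{4i}$.
Hence $\oint_{C_1} f(z)\,dz = 2\pi i \times \frac{1}{4i} = \frac{\pi}{2}$.
(b) Only the pole at z = −2i lies inside C2. The residue there is $\lim_{z \to -2i} \frac{1}{z - 2i} = -\frac{1}{4i}$.
Hence $\oint_{C_2} f(z)\,dz = 2\pi i \times \left(-\frac{1}{4i}\right) = -\frac{\pi}{2}$.
(c) The contour C3 encloses both poles so that
\[ \oint_{C_3} f(z)\,dz = 2\pi i\left(\frac{1}{4i} - \frac{1}{4i}\right) = 0. \]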
Contents 27
Multiple Integration
27.1 Introduction to Surface Integrals
27.2 Multiple Integrals over Non-rectangular Regions
27.3 Volume Integrals
27.4 Changing Coordinates
Learning outcomes
In this Workbook you will learn to integrate a function of two variables over various
rectangular and non-rectangular areas. You will learn how to do this for various other
coordinate systems. You will learn to integrate a function of three variables over a volume.
Introduction to
Surface Integrals 27.1
Introduction
Often in Engineering it is necessary to find the sum of a quantity over an area or surface. This can be
achieved by means of a surface integral also known as a double integral i.e. a function is integrated
twice, once with respect to one variable and subsequently with respect to another variable. This
Section looks at the concept of the double integral and how to evaluate a double integral over a
rectangular area.
Prerequisites
Before starting this Section you should . . .
• thoroughly understand the various techniques of integration
• be familiar with the concept of a function of two variables

Learning Outcomes
• understand the concept of a surface integral
• integrate a function over a rectangular region

1. An example of a surface integral

An engineer involved with the construction of a dam to hold back the water in a reservoir needs to be
able to calculate the total force the water exerts on the dam so that the dam is built with sufficient
strength.
In order to calculate this force, two results are required:
(a) The pressure p of the water is proportional to the depth. That is
p = kd (1)
where k is a constant.
(b) The force on an area subjected to constant pressure is given by
force = pressure × area (2)
The diagram shows the face of the dam. The depth of water is h and δA is a small area in the face
of the dam with coordinates (x, y).
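Figure 1: the face of the dam, showing the water depth h and a small area δA at (x, y), a depth h − y below the water surface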
Using (1), the pressure at δA ≈ k(h − y). Using (2), the force on an area δA ≈ k(h − y)δA.
Both of these expressions are approximate as y is slightly different at the top of δA to the bottom.
Now
Total force on dam = sum of forces on all areas δA making up the face of the dam
$$\approx \sum_{\text{all } \delta A} k(h-y)\,\delta A$$
For a better approximation let δA become smaller, and for the exact result find the limit as δA → 0.
Then
$$\text{Total force on the dam} = \lim_{\delta A\to 0}\sum k(h-y)\,\delta A = \int_A k(h-y)\,dA$$
where $\int_A k(h-y)\,dA$ stands for the surface integral of k(h − y) over the area A. Surface integrals
are evaluated using double integrals. The following Section shows a double integral being developed
in the case of the volume under a surface.
2. Single and double integrals
As has been seen in Section 14.3, the area under the curve y = f(x) between x = a and x = b is
given by $\int_a^b f(x)\,dx$ (assuming that the curve lies above the axis for all x in the range a ≤ x ≤ b).
This is illustrated by the figure below.
Figure 2: the area required, under the curve y = f(x) between x = a and x = b
In a similar manner, the volume under a surface (given by a function of two variables z = f (x, y))
and above the xy plane can be found by integrating the function z = f (x, y) twice, once with respect
to x and once with respect to y.
Figure 3: the surface f(x, y) above the rectangle a ≤ x ≤ b, c ≤ y ≤ d
The above figure shows the part of a surface given by f (x, y) which lies above the rectangle a ≤
x ≤ b, c ≤ y ≤ d. This rectangle is shaded and the volume above this rectangle but below the
surface can be seen.
Figure 4: a vertical slice of thickness δx through the volume under f(x, y), above the rectangle a ≤ x ≤ b, c ≤ y ≤ d
Imagine a vertical slice taken through this volume at right angles to the x-axis (figure above). This
slice has thickness δx and lies at position x. Assuming that δx is small enough that the areas of
both sides (left and right) of this slice are virtually the same, the area of each face of the slice is
given by the integral
$$\int_{y=c}^{y=d} f(x,y)\,dy \qquad \text{(where } x \text{ measures the position of the slice)}$$
and the volume of the slice will be given by
$$\delta x \int_{y=c}^{y=d} f(x,y)\,dy$$
To find the total volume between the surface and the xy plane, this quantity should be summed over
all possible such slices, each for a different value of x. Thus
$$V \approx \sum_i \int_{y=c}^{y=d} f(x_i,y)\,dy\;\delta x$$
When δx becomes infinitesimally small, it can be considered to be dx and the summation will change
into an integral. Hence
$$V = \int_{x=a}^{x=b}\int_{y=c}^{y=d} f(x,y)\,dy\,dx$$
Thus the volume is given by integrating the function twice, once with respect to x and once with
respect to y.
The procedure shown here considers the volume above a rectangular area and below the surface.
The volume beneath the surface over a non-rectangular area can also be found by integrating twice
(see Section 27.2).
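The slicing argument above can be mimicked numerically. The following is a minimal Python sketch, not part of the workbook: the function name volume_by_slices and the step counts nx and ny are illustrative choices, and the midpoint rule is assumed as the way of estimating the face area of each slice.

```python
def volume_by_slices(f, a, b, c, d, nx=200, ny=200):
    """Approximate the volume under z = f(x, y) over a <= x <= b, c <= y <= d."""
    dx = (b - a) / nx
    dy = (d - c) / ny
    total = 0.0
    for i in range(nx):
        x = a + (i + 0.5) * dx  # position of the slice
        # Face area of the slice: midpoint-rule estimate of the inner integral in y
        face_area = sum(f(x, c + (j + 0.5) * dy) for j in range(ny)) * dy
        total += face_area * dx  # volume of the slice, thickness dx
    return total


# A flat surface z = 3 over a 2-by-5 rectangle is a box of volume 30
print(volume_by_slices(lambda x, y: 3.0, 0.0, 2.0, 0.0, 5.0))  # approximately 30
```

Increasing nx and ny plays the role of letting δx (and the corresponding step in y) become smaller in the derivation.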
Key Point 1
Volume Integral
The volume under the surface z = f (x, y) and above a rectangular region in the xy plane (that is
the rectangle a ≤ x ≤ b, c ≤ y ≤ d) is given by the integral:
$$V = \int_{x=a}^{b}\int_{y=c}^{d} f(x,y)\,dy\,dx$$
3. ‘Inner’ and ‘Outer’ integrals

A typical double integral may be expressed as
$$I = \int_{x=a}^{x=b}\int_{y=c}^{y=d} f(x,y)\,dy\,dx$$
where the part in the centre i.e.
$$\int_{y=c}^{y=d} f(x,y)\,dy$$
(known as the inner integral) is the integral of a function of x and y with respect to y. As the
integration takes place with respect to y, the variable x may be regarded as a fixed quantity (a
constant) but for every different value of x, the inner integral will take a different value. Thus, the
inner integral will be a function of x e.g. $g(x) = \int_{y=c}^{y=d} f(x,y)\,dy$.
This inner integral, being a function of x, once evaluated, can take its place within the outer integral
i.e. $I = \int_{x=a}^{x=b} g(x)\,dx$ which can then be integrated with respect to x to give the value of the
double integral.
The limits on the outer integral will be constants; the limits on the inner integral may be constants
(in which case the integration takes place over a rectangular area) or may be functions of the variable
used for the outer integral (in this case x). In this latter case, the integration takes place over a
non-rectangular area (see Section 27.2). In the Examples quoted in this Section or in the early parts
of the next Section, the limits include the name of the relevant variable; this can be omitted once
more familiarity has been gained with the concept. It will be assumed that the limits on the inner
integral apply to the variable used to integrate the inner integral and the limits on the outer integral
apply to the variable used to integrate this outer integral.
4. Integration over rectangular areas

Consider the double integral
$$I = \int_{x=0}^{5}\int_{y=-1}^{1} (2x+y)\,dy\,dx$$
This represents an integral over the rectangle shown below.
Figure 5: the rectangle 0 ≤ x ≤ 5, −1 ≤ y ≤ 1
Here, the inner integral is
$$g(x) = \int_{-1}^{1} (2x+y)\,dy$$
and the outer integral is
$$I = \int_{x=0}^{5} g(x)\,dx$$
Looking in more detail at the inner integral
$$g(x) = \int_{-1}^{1} (2x+y)\,dy$$
the function (2x + y) can be integrated with respect to y (keeping x constant) to give $2xy + \frac{1}{2}y^2 + C$
(where C is a constant and can be omitted as the integral is a definite integral) i.e.
$$g(x) = \left[2xy + \frac{1}{2}y^2\right]_{-1}^{1} = \left(2x + \frac{1}{2}\right) - \left(-2x + \frac{1}{2}\right) = 2x + \frac{1}{2} + 2x - \frac{1}{2} = 4x.$$
This is a function of x as expected. This inner integral can be placed into the outer integral to get
$$I = \int_{x=0}^{5} 4x\,dx$$
which becomes
$$I = \left[2x^2\right]_0^5 = 2\times 5^2 - 2\times 0^2 = 2\times 25 - 0 = 50$$
Hence the double integral
$$I = \int_{x=0}^{5}\int_{y=-1}^{1} (2x+y)\,dy\,dx = 50$$
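As a numerical cross-check of this calculation, the sketch below follows the same two steps: an inner integral g(x) in y, then an outer integral in x. It is only an illustration; the helper name integrate and the choice of the midpoint rule with n strips are assumptions, not something prescribed by the workbook.

```python
def integrate(func, lower, upper, n=400):
    """Midpoint-rule estimate of a single definite integral."""
    h = (upper - lower) / n
    return sum(func(lower + (k + 0.5) * h) for k in range(n)) * h


def g(x):
    # Inner integral: x is held constant while y runs from -1 to 1
    return integrate(lambda y: 2 * x + y, -1.0, 1.0)


# Outer integral: g(x) is integrated from x = 0 to x = 5
I = integrate(g, 0.0, 5.0)
print(g(1.5), I)  # approximately 4 * 1.5 = 6 and 50
```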
Key Point 2
Double Integral
When evaluating a double integral, evaluate the inner integral first and substitute the result into the
outer integral.
Example 1
Evaluate the double integral $I = \int_{x=-1}^{2}\int_{y=-2}^{3} x^2 y\,dy\,dx$
This integral is evaluated over the area shown below.
Figure 6: the rectangle −1 ≤ x ≤ 2, −2 ≤ y ≤ 3
Solution
Here, the inner integral is
$$g(x) = \int_{y=-2}^{3} x^2 y\,dy = \left[x^2\,\frac{y^2}{2}\right]_{-2}^{3} = \frac{9}{2}x^2 - \frac{4}{2}x^2 = \frac{5}{2}x^2$$
and hence the outer integral is
$$I = \int_{x=-1}^{2} \frac{5}{2}x^2\,dx = \left[\frac{5}{2}\cdot\frac{1}{3}x^3\right]_{-1}^{2} = \frac{5}{6}\times 8 - \frac{5}{6}(-1) = \frac{15}{2}$$
Example 2
Use the above approach to evaluate the double integral
$$I = \int_{x=0}^{5}\int_{y=-1}^{1} x^2\cos\frac{\pi y}{2}\,dy\,dx$$
Note that the limits are the same as in a previous case but that the function itself
has changed.
Solution
The inner integral is
$$\int_{y=-1}^{1} x^2\cos\frac{\pi y}{2}\,dy = \left[\frac{2}{\pi}x^2\sin\frac{\pi y}{2}\right]_{-1}^{1} = \frac{2}{\pi}x^2(1) - \frac{2}{\pi}x^2(-1) = \frac{4}{\pi}x^2$$
so the outer integral becomes
$$I = \int_{x=0}^{5}\frac{4}{\pi}x^2\,dx = \left[\frac{4}{3\pi}x^3\right]_0^5 = \frac{4}{3\pi}\,125 - \frac{4}{3\pi}\,0 = \frac{500}{3\pi} \approx 53.1$$
Clearly, variables other than x and y may be used.
Example 3
Evaluate the double integral
$$I = \int_{s=1}^{4}\int_{t=0}^{\pi} s\sin t\,dt\,ds$$
Solution
This integral becomes (dispensing with the step of formally writing the inner integral),
$$\begin{aligned}
I &= \int_{s=1}^{4}\Big[-s\cos t\Big]_0^{\pi}\,ds = \int_1^4 \left[-s\cos\pi + s\cos 0\right]ds = \int_1^4 \left[-s(-1) + s(1)\right]ds\\
&= \int_1^4 2s\,ds = \Big[s^2\Big]_1^4 = 16 - 1 = 15
\end{aligned}$$
Clearly, evaluating the integrals can involve further tools of integration, e.g. integration by parts or
by substitution.
Example 4
$$I = \int_{-1}^{2}\int_{-2}^{3} \frac{xye^{-x}}{y^2+1}\,dy\,dx$$
Here, the limits have not formally been linked with a variable name but the limits on the outer integral
apply to x and the limits on the inner integral apply to y. As the integrations are more complicated,
the inner integral will be evaluated explicitly.
Solution
$$\text{Inner integral} = \int_{-2}^{3}\frac{xye^{-x}}{y^2+1}\,dy$$
which can be evaluated by means of the substitution $U = y^2 + 1$.
If $U = y^2 + 1$ then $dU = 2y\,dy$ so $y\,dy = \frac{1}{2}\,dU$.
Also if y = −2 then U = 5 and if y = 3 then U = 10.
So the inner integral becomes (remembering that x may be treated as a constant)
$$\int_5^{10}\frac{1}{2}\,\frac{xe^{-x}}{U}\,dU = \frac{xe^{-x}}{2}\int_5^{10}\frac{dU}{U} = \frac{xe^{-x}}{2}\Big[\ln U\Big]_5^{10} = \frac{xe^{-x}}{2}\left(\ln 10 - \ln 5\right) = xe^{-x}\,\frac{\ln 2}{2}$$
and so the double integral becomes
$$I = \int_{-1}^{2} xe^{-x}\,\frac{\ln 2}{2}\,dx = \frac{\ln 2}{2}\int_{-1}^{2} xe^{-x}\,dx$$
which can be evaluated by integration by parts.
$$\begin{aligned}
I &= \frac{\ln 2}{2}\left(\Big[-xe^{-x}\Big]_{-1}^{2} - \int_{-1}^{2} 1\times\left(-e^{-x}\right)dx\right) = \frac{\ln 2}{2}\left(-2e^{-2} + (-1)e^{1} + \int_{-1}^{2} e^{-x}\,dx\right)\\
&= \frac{\ln 2}{2}\left(-2e^{-2} - e^{1} + \Big[-e^{-x}\Big]_{-1}^{2}\right)\\
&= \frac{\ln 2}{2}\left(-2e^{-2} - e^{1} - e^{-2} + e^{1}\right) = \frac{\ln 2}{2}\left(-3e^{-2}\right) \approx -0.14
\end{aligned}$$
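A result reached through a substitution followed by integration by parts is worth checking independently. The sketch below, which is an illustration rather than part of the workbook, compares the closed form $\frac{\ln 2}{2}(-3e^{-2})$ with a direct midpoint-rule estimate of the double integral; the names integrand and midpoint_double and the grid size n are assumptions made for the example.

```python
import math


def integrand(x, y):
    return x * y * math.exp(-x) / (y * y + 1.0)


def midpoint_double(f, a, b, c, d, n=400):
    """Midpoint-rule estimate of the double integral of f over a <= x <= b, c <= y <= d."""
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            total += f(x, c + (j + 0.5) * dy)
    return total * dx * dy


closed_form = (math.log(2.0) / 2.0) * (-3.0 * math.exp(-2.0))
numeric = midpoint_double(integrand, -1.0, 2.0, -2.0, 3.0)
print(closed_form, numeric)  # both close to -0.14
```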
Task
Evaluate the following double integral.
$$I = \int_{-1}^{1}\int_{0}^{2}\left(x^2y + 3y^2\right)dy\,dx$$
Your solution
Answer
$$\text{The inner integral} = \int_0^2\left(x^2y + 3y^2\right)dy = \left[\frac{1}{2}x^2y^2 + y^3\right]_0^2 = \frac{1}{2}\times 4x^2 + 8 - (0+0) = 2x^2 + 8$$
This can be put in the outer integral to give
$$I = \int_{-1}^{1}\left(2x^2 + 8\right)dx = \left[\frac{2}{3}x^3 + 8x\right]_{-1}^{1} = \frac{2}{3} + 8 - \left(-\frac{2}{3} - 8\right) = \frac{4}{3} + 16 = \frac{52}{3}$$
Exercises
Evaluate the following double integrals over rectangular areas.
1. $I = \int_{x=0}^{1}\int_{y=0}^{2} xy\,dy\,dx$
2. $I = \int_{-2}^{3}\int_{0}^{4}\left(x^2 + y^2\right)dx\,dy$
3. $I = \int_{0}^{\pi}\int_{-1}^{1} y\sin^2 x\,dy\,dx$
4. $I = \int_{0}^{2}\int_{-1}^{3} st^3\,ds\,dt$
5. $I = \int_{0}^{3}\int_{0}^{1} 5z^2w\left(w^2 - 1\right)^4dw\,dz$ (Requires integration by substitution.)
6. $I = \int_{0}^{2\pi}\int_{0}^{1} ty\sin t\,dy\,dt$ (Requires integration by parts.)
Answers
1. 1, 2. 460/3, 3. 0, 4. 16, 5. 9/2, 6. −π
5. Special cases
If the integrand can be written as
f(x, y) = g(x) h(y)
then the double integral
$$I = \int_a^b\int_c^d g(x)\,h(y)\,dy\,dx$$
can be written as
$$I = \int_a^b g(x)\,dx \times \int_c^d h(y)\,dy$$
i.e. the product of the two individual integrals. For example, the integral
$$I = \int_{x=-1}^{2}\int_{y=-2}^{3} x^2y\,dy\,dx$$
which was evaluated earlier can be written as
$$I = \int_{x=-1}^{2} x^2\,dx \times \int_{y=-2}^{3} y\,dy = \left[\frac{x^3}{3}\right]_{-1}^{2}\left[\frac{y^2}{2}\right]_{-2}^{3} = \left(\frac{8}{3} - \frac{(-1)}{3}\right)\left(\frac{9}{2} - \frac{4}{2}\right) = 3\times\frac{5}{2} = \frac{15}{2}$$
the same result as before.
Key Point 3
Double Integral as a Product
The integral $\int_a^b\int_c^d g(x)h(y)\,dy\,dx$ can be written as $\int_a^b g(x)\,dx \times \int_c^d h(y)\,dy$
Imagine the integral
$$I = \int_{-1}^{1}\int_{0}^{1} xe^{-y^2}\,dy\,dx$$
Approached directly, this would involve evaluating the integral $\int_0^1 xe^{-y^2}\,dy$ which cannot be done
by algebraic means (i.e. it can only be determined numerically).
However, the integral can be re-written as
$$I = \int_{-1}^{1} x\,dx \times \int_0^1 e^{-y^2}\,dy = \left[\frac{1}{2}x^2\right]_{-1}^{1}\times\int_0^1 e^{-y^2}\,dy = 0\times\int_0^1 e^{-y^2}\,dy = 0$$
and the result can be found without the need to evaluate the difficult integral.
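The product form can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the helper integrate (a midpoint rule with n strips) is a hypothetical name, and the comparison value uses the standard identity $\int_0^1 e^{-y^2}dy = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(1)$ via Python's math.erf, which is not something the workbook itself relies on.

```python
import math


def integrate(func, lower, upper, n=2000):
    """Midpoint-rule estimate of a single definite integral."""
    h = (upper - lower) / n
    return sum(func(lower + (k + 0.5) * h) for k in range(n)) * h


# The two factors of the product form
x_part = integrate(lambda x: x, -1.0, 1.0)                  # odd integrand, symmetric range
y_part = integrate(lambda y: math.exp(-y * y), 0.0, 1.0)    # the 'difficult' integral

# The difficult factor has a finite value, found here numerically and via erf
y_reference = math.sqrt(math.pi) / 2.0 * math.erf(1.0)

I = x_part * y_part
print(x_part, y_part, y_reference, I)
```

The y factor is finite (roughly 0.75), so multiplying it by the vanishing x factor gives zero, which is why the difficult integral never needs to be evaluated.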
If the integrand is independent of one of the variables and is simply a function of the other variable,
then only one integration need be carried out.
The integral $I_1 = \int_a^b\int_c^d h(y)\,dy\,dx$ may be written as $I_1 = (b-a)\int_c^d h(y)\,dy$ and the integral
$I_2 = \int_a^b\int_c^d g(x)\,dy\,dx$ may be written as $I_2 = (d-c)\int_a^b g(x)\,dx$ i.e. the integral in the variable
upon which the integrand depends multiplied by the length of the range of integration for the other
variable.
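Example 5
Evaluate the double integral
$$I = \int_{0}^{2}\int_{-1}^{2} y^2\,dy\,dx$$
Solution
As the integral in y can be multiplied by the range of integration in x, the double integral will equal
$$I = (2-0)\int_{-1}^{2} y^2\,dy = 2\left[\frac{y^3}{3}\right]_{-1}^{2} = 2\left[\frac{2^3}{3} - \frac{(-1)^3}{3}\right] = 6$$
Note that the two integrations can be carried out in either order as long as the limits are associated
with the correct variable. For example
$$\begin{aligned}
I &= \int_{x=0}^{1}\int_{y=-1}^{2} x^4y\,dy\,dx = \int_{x=0}^{1}\left[\frac{x^4y^2}{2}\right]_{-1}^{2}dx = \int_{x=0}^{1}\left(2x^4 - \frac{1}{2}x^4\right)dx\\
&= \int_0^1\frac{3}{2}x^4\,dx = \left[\frac{3}{10}x^5\right]_0^1 = \frac{3}{10}\times 1 - \frac{3}{10}\times 0 = \frac{3}{10}
\end{aligned}$$
and
$$\begin{aligned}
I &= \int_{y=-1}^{2}\int_{x=0}^{1} x^4y\,dx\,dy = \int_{y=-1}^{2}\left[\frac{x^5y}{5}\right]_0^1 dy = \int_{-1}^{2}\left[\frac{y}{5} - 0\right]dy\\
&= \int_{-1}^{2}\frac{y}{5}\,dy = \left[\frac{y^2}{10}\right]_{-1}^{2} = \frac{4}{10} - \frac{1}{10} = \frac{3}{10}
\end{aligned}$$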
Task
Evaluate the following integral:
$$I = \int_{0}^{1}\int_{-1}^{1} z(w+1)\,dw\,dz.$$
Your solution
Answer
1
Exercises
1. Evaluate the following integrals:
(a) $I = \int_{0}^{\pi/2}\int_{0}^{1} (y\cos x)\,dy\,dx$
(b) $I = \int_{-8}^{3}\int_{-1}^{1} y^2\,dy\,dx$
(c) $I = \int_{0}^{1}\int_{0}^{5} (s+1)^4\,dt\,ds$
2. Evaluate the integrals $\int_{-1}^{3}\int_{0}^{2} x^3y\,dy\,dx$ and $\int_{0}^{2}\int_{-1}^{3} x^3y\,dx\,dy$ and show that they are equal.
As explained in the text, the order in which these integrations are carried out does not matter
for integrations over rectangular areas.
Answers
1. (a) 1/2, (b) 22/3, (c) 31
2. 40
6. Applications of surface integration over rectangular areas
Force on a dam
At the beginning of this Section, the total force on a dam was given by the surface integral
$$\int_A k(h-y)\,dA$$
Imagine that the dam is rectangular in profile with a width of 100 m and a height h of 40 m. The
expression dA is replaced by dx dy and the limits on the variables x and y are 0 to 100 m and 0
to 40 m respectively. The constant k may be assumed to be $10^4$ kg m$^{-2}$ s$^{-2}$. The surface integral
becomes the double integral
$$\int_0^{40}\int_0^{100} k(h-y)\,dx\,dy \quad\text{that is}\quad \int_0^{40}\int_0^{100} 10^4(40-y)\,dx\,dy$$
As the integrand in this double integral does not contain x, the integral may be written
$$\begin{aligned}
\int_0^{40}\int_0^{100} 10^4(40-y)\,dx\,dy &= (100-0)\int_0^{40} 10^4(40-y)\,dy\\
&= 100\times 10^4\left[40y - \frac{y^2}{2}\right]_0^{40}\\
&= 10^6\left[(40\times 40 - 40^2/2) - 0\right]\\
&= 10^6\times 800 = 8\times 10^8\ \text{N}
\end{aligned}$$
that is the total force is 800 meganewtons.
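The same figure can be reproduced by summing the force on thin horizontal strips of the dam face, which is the sum that the surface integral is the limit of. This is a sketch only: the function name dam_force, its parameters and the number of strips n are illustrative assumptions, with k, the width and the height taken from the text.

```python
def dam_force(k, width, height, n=1000):
    """Sum k * (height - y) * (strip area) over n horizontal strips of the dam face."""
    dy = height / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * dy            # height of the middle of the strip
        pressure = k * (height - y)   # pressure is proportional to depth (height - y)
        total += pressure * width * dy
    return total


force = dam_force(k=1.0e4, width=100.0, height=40.0)
exact = 1.0e4 * 100.0 * 40.0 ** 2 / 2.0   # k * width * h^2 / 2 from the integral above
print(force, exact)  # both close to 8e8 newtons
```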
Centre of pressure
We wish to find the centre of pressure (xp , yp ) of a plane area immersed vertically in a fluid. Take
the x axis to be in the surface of the fluid and the y axis to be vertically down, so that the plane
Oxy contains the area.
Figure 7: a plane area immersed vertically in a fluid, with O and the x axis in the surface, the y axis vertically down, and a small element δA at (x, y)
We require the following results:
(a) The pressure p is proportional to the depth h, so that p = ωh where ω is a constant.

(b) The force F on an area δA subjected to constant pressure p is given by F = pδA
Consider a small element of area δA at the position shown. The pressure at δA is ωy. Then the
force acting on δA is ωyδA. Hence the total force acting on the area A is $\int_A \omega y\,dA = \omega\int_A y\,dA$.
Moment of force on δA about Oy = ωxyδA
Total moment of force on A about Oy = $\omega\int_A xy\,dA$
Moment of force on δA about Ox = ωy²δA
Total moment of force on A about Ox = $\omega\int_A y^2\,dA$
Taking moments about Oy:
total force × xp = total moment
$$\left(\omega\int_A y\,dA\right)x_p = \omega\int_A xy\,dA \qquad\text{so}\qquad x_p\int_A y\,dA = \int_A xy\,dA$$
Taking moments about Ox:
total force × yp = total moment
$$\left(\omega\int_A y\,dA\right)y_p = \omega\int_A y^2\,dA \qquad\text{so}\qquad y_p\int_A y\,dA = \int_A y^2\,dA$$
Hence
$$x_p = \frac{\int_A xy\,dA}{\int_A y\,dA} \qquad\text{and}\qquad y_p = \frac{\int_A y^2\,dA}{\int_A y\,dA}.$$
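Example 6
A rectangle of sides a and b is immersed vertically in a fluid with one of its edges
in the surface as shown in Figure 8. Where is the centre of pressure?
Figure 8: the rectangle with one edge along the x axis in the fluid surface, and O at one corner
Solution
To express the surface integral as double integrals we will use cartesian coordinates and vertical
slices. We need the following integrals.
$$\int_A y\,dA = \int_0^b\int_0^a y\,dy\,dx = \int_0^b\left[\frac{1}{2}y^2\right]_0^a dx = \int_0^b\frac{1}{2}a^2\,dx = \left[\frac{1}{2}a^2x\right]_0^b = \frac{1}{2}a^2b$$
$$\int_A xy\,dA = \int_0^b\int_0^a xy\,dy\,dx = \int_0^b\left[\frac{1}{2}xy^2\right]_0^a dx = \int_0^b\frac{1}{2}xa^2\,dx = \left[\frac{1}{4}x^2a^2\right]_0^b = \frac{1}{4}a^2b^2$$
$$\int_A y^2\,dA = \int_0^b\int_0^a y^2\,dy\,dx = \int_0^b\left[\frac{1}{3}y^3\right]_0^a dx = \int_0^b\frac{1}{3}a^3\,dx = \left[\frac{1}{3}a^3x\right]_0^b = \frac{1}{3}a^3b$$
Hence
$$y_p = \frac{\int_A y^2\,dA}{\int_A y\,dA} = \frac{\frac{1}{3}a^3b}{\frac{1}{2}a^2b} = \frac{2}{3}a \qquad\text{and}\qquad x_p = \frac{\int_A xy\,dA}{\int_A y\,dA} = \frac{\frac{1}{4}a^2b^2}{\frac{1}{2}a^2b} = \frac{1}{2}b$$
The centre of pressure is $\left(\frac{1}{2}b, \frac{2}{3}a\right)$, so is at a depth of $\frac{2}{3}a$.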
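The formulae for (xp, yp) can be tried out on a rectangle with particular dimensions. The sketch below is illustrative only: the helper names surface_integral and centre_of_pressure, the grid size n and the sample values a = 3, b = 2 are assumptions made for the example, and the three surface integrals are estimated with the midpoint rule.

```python
def surface_integral(f, x_max, y_max, n=200):
    """Midpoint-rule estimate of the integral of f(x, y) over 0 <= x <= x_max, 0 <= y <= y_max."""
    dx = x_max / n
    dy = y_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        for j in range(n):
            total += f(x, (j + 0.5) * dy)
    return total * dx * dy


def centre_of_pressure(a, b):
    """Rectangle 0 <= x <= b, 0 <= y <= a, with y measured vertically downwards."""
    force_integral = surface_integral(lambda x, y: y, b, a)
    xp = surface_integral(lambda x, y: x * y, b, a) / force_integral
    yp = surface_integral(lambda x, y: y * y, b, a) / force_integral
    return xp, yp


xp, yp = centre_of_pressure(a=3.0, b=2.0)
print(xp, yp)  # close to b/2 = 1 and 2a/3 = 2
```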
Areas and moments
The surface integral $\int_A f(x,y)\,dA$ can represent a number of physical quantities, depending on the
function f(x, y) that is used.
Properties:
(a) If f (x, y) = 1 then the integral represents the area of A.

(b) If f (x, y) = x then the integral represents the first moment of A about the y axis.
(c) If f (x, y) = y then the integral represents the first moment of A about the x axis.
(d) If f (x, y) = x2 then the integral represents the second moment of A about the y axis.
(e) If f (x, y) = y 2 then the integral represents the second moment of A about the x axis.
(f) If f (x, y) = x2 + y 2 then the integral represents the second moment of A about the z
axis.
Example 7
Given a rectangular lamina of length ℓ, width b, thickness t (small) and density ρ
(see Figure 9), find the second moment of area of this lamina (moment of inertia)
about the x-axis.
Figure 9: the rectangular lamina in the xy plane with one corner at O, of width b in the y direction, density ρ and thickness t (small)
Solution
By property (e) above, the moment of inertia is given by
$$\int_0^b\int_0^{\ell} y^2\rho t\,dx\,dy = \rho t(\ell - 0)\int_0^b y^2\,dy = \ell\rho t\left[\frac{y^3}{3}\right]_0^b = \ell\rho t\,\frac{b^3}{3}$$
As the mass of the lamina is M = ℓbtρ, the moment of inertia simplifies to $\frac{1}{3}Mb^2$. The t and ρ are
included in the integral to make it a moment of inertia rather than simply a second moment.
Task
By a similar method to that in Example 7, find the moment of inertia of the same
lamina about the y-axis.
Your solution
Answer
From property (d) above, the moment of inertia (or second moment of area) is given by the integral
$$\int_0^{\ell}\int_0^b x^2\rho t\,dy\,dx = \rho t(b - 0)\int_0^{\ell} x^2\,dx = b\rho t\left[\frac{x^3}{3}\right]_0^{\ell} = b\rho t\,\frac{\ell^3}{3}$$
As the mass of the lamina is M = ℓbρt, the moment of inertia simplifies to $\frac{1}{3}M\ell^2$. Again, the t and
ρ are included in the integral to make it a moment of inertia rather than simply a second moment.
Exercises
By making use of the form of the integrand, evaluate the following double integrals:
1. $I = \int_{0}^{\pi}\int_{0}^{1} y\cos^2 x\,dy\,dx$
2. $I = \int_{-8}^{3}\int_{-1}^{1} y^2\,dy\,dx$
3. $I = \int_{0}^{1}\int_{0}^{5} (s+1)^4\,dt\,ds$
Answers 1. π/4, 2. 22/3, 3. 31
Multiple Integrals
over Non-rectangular
Regions 27.2
Introduction
In the previous Section we saw how to evaluate double integrals over simple rectangular regions. We
now see how to extend this to non-rectangular regions.
In this Section we introduce functions as the limits of integration; these functions define the region
over which the integration is performed. These regions can be non-rectangular. Extra care must now
be taken when changing the order of integration. Producing a sketch of the region is often very
helpful.
Prerequisites
• have a thorough understanding of the various techniques of integration
• be familiar with the concept of a function of two variables
• have completed Section 27.1
• be able to sketch a function in the plane

Learning Outcomes
• evaluate double integrals over non-rectangular regions

1. Functions as limits of integration

In Section 27.1 double integrals of the form
$$I = \int_{x=a}^{x=b}\int_{y=c}^{y=d} f(x,y)\,dy\,dx$$
were considered. They represent an integral over a rectangular region in the xy plane. If the limits
of integration of the inner integral are replaced with functions G1, G2,
$$I = \int_{x=a}^{x=b}\int_{G_1(x)}^{G_2(x)} f(x,y)\,dy\,dx$$
then the region described will not, in general, be a rectangle. The region will be a shape bounded
by the curves (or lines) which these functions G1 and G2 describe.
As was indicated in Section 27.1
$$I = \int_{x=a}^{x=b}\int_{G_1(x)}^{G_2(x)} f(x,y)\,dy\,dx$$
can be interpreted as the volume lying above the region in the xy plane defined by G1(x) and G2(x),
bounded above by the surface z = f(x, y). Not all double integrals are interpreted as volumes but
this is often the case. If z = f(x, y) < 0 anywhere in the relevant region, then the double integral
no longer represents a volume.
Key Point 4
Double Integral Over General Region
$$I = \int_{x=a}^{x=b}\int_{G_1(x)}^{G_2(x)} f(x,y)\,dy\,dx$$
1. The functions G1 , G2 which are the limits for the inner integral are functions of the variable
of the outer integral. This must be the case for the integral to make sense.
2. The limits of the outer integral are constant.
3. Integration over rectangular regions can be thought of as the special case where G1 and G2
are constant functions.
Example 8
Evaluate the integral $I = \int_{x=0}^{1}\int_{y=0}^{1-x} 2xy\,dy\,dx$
Solution
Figure 10: the surface f(x, y) above the region of integration
Figure 11: the triangle in the xy plane bounded by the axes and the line y = 1 − x
Projecting the relevant part of the surface (Figure 10) down to the xy plane produces the triangle
shown in Figure 11. The extremes that x takes are x = 0 and x = 1 and so these are the limits
on the outer integral. For any value of x, the variable y varies between y = 0 (at the bottom) and
y = 1 − x (at the top). Thus if the volume, shown in the diagram, under the function f(x, y),
bounded by this triangle is required then the following integral is to be calculated.
$$\int_{x=0}^{1}\int_{y=0}^{1-x} f(x,y)\,dy\,dx$$
Once the correct limits have been determined, the integration is carried out in exactly the same
manner as in Section 27.1.
First consider the inner integral $g(x) = \int_{y=0}^{1-x} 2xy\,dy$
Integrating 2xy with respect to y gives $xy^2 + C$ so $g(x) = \Big[xy^2\Big]_{y=0}^{1-x} = x(1-x)^2$
Note that, as is required, this is a function of x, the variable of the outer integral. Now the outer
integral is
$$\begin{aligned}
I &= \int_{x=0}^{1} x(1-x)^2\,dx\\
&= \int_{x=0}^{1}\left(x^3 - 2x^2 + x\right)dx = \left[\frac{x^4}{4} - \frac{2x^3}{3} + \frac{x^2}{2}\right]_{x=0}^{1} = \frac{1}{4} - \frac{2}{3} + \frac{1}{2} = \frac{1}{12}
\end{aligned}$$
Regions do not have to be bounded only by straight lines. Also the integrals may involve other tools
of integration, such as substitution or integration by parts. Drawing a sketch of the limit functions
in the plane and shading the region is a valuable tool when evaluating such integrals.
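When the inner limits are functions of x, a numerical check only needs the strip in y to be recomputed for each x. The following is a minimal sketch under stated assumptions: the function name double_integral and the arguments lower and upper (standing for G1 and G2) are illustrative, and the midpoint rule with n strips in each direction is assumed. It is applied to the integral of Example 8.

```python
def double_integral(f, a, b, lower, upper, n=400):
    """Midpoint-rule estimate of the integral of f(x, y) over a <= x <= b, lower(x) <= y <= upper(x)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        y0 = lower(x)                 # G1(x): bottom of the vertical strip at this x
        dy = (upper(x) - y0) / n      # strip height depends on x
        inner = sum(f(x, y0 + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total


# Example 8: f = 2xy over the triangle 0 <= x <= 1, 0 <= y <= 1 - x
example_8 = double_integral(lambda x, y: 2.0 * x * y, 0.0, 1.0,
                            lambda x: 0.0, lambda x: 1.0 - x)
print(example_8)  # close to 1/12
```

Passing constant functions for lower and upper recovers the rectangular case described in Key Point 4.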
Example 9
Evaluate the volume under the surface given by z = f(x, y) = 2x sin(y), over
the region bounded above by the curve y = x² and below by the line y = 0, for
0 ≤ x ≤ 1.
Figure 12: the region between the curve y = x² and the x axis for 0 ≤ x ≤ 1
Solution
First sketch the curve y = x² and identify the region. This is the shaded region in Figure 12. The
required integral is
$$\begin{aligned}
I &= \int_{x=0}^{1}\int_{y=0}^{x^2} 2x\sin(y)\,dy\,dx\\
&= \int_{x=0}^{1}\Big[-2x\cos(y)\Big]_{y=0}^{x^2}dx\\
&= \int_{x=0}^{1}\left(-2x\cos\left(x^2\right) + 2x\right)dx\\
&= \int_{x=0}^{1}\left(1 - \cos\left(x^2\right)\right)2x\,dx
\end{aligned}$$
Making the substitution u = x² so du = 2x dx and noting that the limits x = 0, 1 map to u = 0, 1,
gives
$$\begin{aligned}
I &= \int_{u=0}^{1}\left(1 - \cos(u)\right)du\\
&= \Big[u - \sin(u)\Big]_{u=0}^{1}\\
&= 1 - \sin(1)\\
&\approx 0.1585
\end{aligned}$$
Example 10
Evaluate the volume under the surface given by $z = f(x,y) = x^2 + \frac{1}{2}y$, over the
region bounded by the curves y = 2x and y = x².
Figure 13: the region between the line y = 2x and the curve y = x²
Solution
The sketch of the region is shown in Figure 13. The required integral is
$$I = \int_{x=a}^{b}\int_{y=x^2}^{2x}\left(x^2 + \frac{1}{2}y\right)dy\,dx$$
To determine the limits for the integration with respect to x, the points where the curves intersect
are required. These points are the solutions of the equation 2x = x², so the required limits are
x = 0 and x = 2. Then the volume is given by
$$\begin{aligned}
I &= \int_{x=0}^{2}\int_{y=x^2}^{2x}\left(x^2 + \frac{1}{2}y\right)dy\,dx\\
&= \int_{x=0}^{2}\left[x^2y + \frac{1}{4}y^2\right]_{y=x^2}^{2x}dx\\
&= \int_{x=0}^{2}\left(x^2 + 2x^3 - \frac{5}{4}x^4\right)dx\\
&= \left[\frac{x^3}{3} + \frac{x^4}{2} - \frac{x^5}{4}\right]_{x=0}^{2}\\
&= \frac{8}{3}
\end{aligned}$$
Example 11
(a) Evaluate the volume under z = f(x, y) = 5x²y, over the half of the
unit circle that lies above the x-axis (Figure 14).
Figure 14: the upper half of the unit circle, −1 ≤ x ≤ 1
(b) Repeat (a) for z = f(x, y) = 1.
Solution
(a) This region is bounded by the circle y² + x² = 1 and the line y = 0. Since only positive
values of y are required, the equation of the circle can be written $y = \sqrt{1-x^2}$. Then
the required volume is given by
$$\begin{aligned}
I &= \int_{x=-1}^{1}\int_{y=0}^{\sqrt{1-x^2}} 5x^2y\,dy\,dx = \int_{x=-1}^{1}\left[\frac{5}{2}x^2y^2\right]_{y=0}^{\sqrt{1-x^2}}dx\\
&= \int_{x=-1}^{1}\frac{5}{2}x^2\left(1-x^2\right)dx = \frac{5}{2}\left[\frac{x^3}{3} - \frac{x^5}{5}\right]_{-1}^{1} = \frac{2}{3}
\end{aligned}$$
(b)
$$\begin{aligned}
I &= \int_{x=-1}^{1}\int_{y=0}^{\sqrt{1-x^2}} 1\,dy\,dx = \int_{x=-1}^{1}\Big[y\Big]_{y=0}^{\sqrt{1-x^2}}dx\\
&= \int_{x=-1}^{1}\sqrt{1-x^2}\,dx = \frac{\pi}{2} \quad\text{(by substituting } x = \sin\theta\text{)}
\end{aligned}$$
Note that by putting f(x, y) = 1 we have found the volume of a semi-circular lamina of uniform
height 1. This result is numerically the same as the area of the region in Figure 14. (This is a
general result.)
Task
Evaluate the following double integral over a non-rectangular region.
$$\int_{x=0}^{1}\int_{y=x}^{1}\left(x^2 + y^2\right)dy\,dx$$
(a) First sketch the region of the xy-plane determined by the limits:
Your solution
Answer
[Figure: the triangle in the xy-plane bounded by the lines y = x, y = 1 and x = 0]
(b) Now evaluate the inner integral over the triangle:
Your solution
Answer
In the triangle, x varies between x = 0 and x = 1. For every value of x, y varies between y = x
and y = 1.
The inner integral is given by
$$\begin{aligned}
\text{Inner Integral} &= \int_{y=x}^{1}\left(x^2 + y^2\right)dy = \left[x^2y + \frac{1}{3}y^3\right]_x^1\\
&= x^2\times 1 + \frac{1}{3}\times 1^3 - \left(x^2\times x + \frac{1}{3}x^3\right)\\
&= x^2 + \frac{1}{3} - \frac{4}{3}x^3
\end{aligned}$$
(c) Finally evaluate the outer integral:
Your solution
Answer
The inner integral is placed in the outer integral to give
$$\begin{aligned}
\text{Outer Integral} &= \int_0^1\left(x^2 - \frac{4}{3}x^3 + \frac{1}{3}\right)dx = \left[\frac{1}{3}x^3 - \frac{1}{3}x^4 + \frac{1}{3}x\right]_0^1\\
&= \left(\frac{1}{3} - \frac{1}{3} + \frac{1}{3}\right) - 0\\
&= \frac{1}{3}
\end{aligned}$$
Note that the above Task is simply one of integrating a function over a region; there is no reference
to a volume here. Another like this now follows.
Task
Integrate the function z = x²y over the trapezium with vertices at (0, 0), (1, 1),
(1, 2) and (0, 4).
Your solution
Answer
The integration takes place over the trapezium shown.
[Figure: the trapezium with vertices (0, 0), (1, 1), (1, 2) and (0, 4)]
Considering variable x on the outer integral and variable y on the inner integral, the trapezium has
an extent in x of x = 0 to x = 1. So, the limits on the outer integral (limits on x) are x = 0 and
x = 1.
For each value of x, y varies from y = x (line joining (0, 0) to (1, 1)) to y = 4 − 2x (line joining
(1, 2) and (0, 4)). So the limits on the inner integral (limits on y) are y = x to y = 4 − 2x.
The double integral thus becomes
$$\int_{x=0}^{1}\int_{y=x}^{4-2x} x^2y\,dy\,dx$$
The inner integral is
$$\int_{y=x}^{4-2x} x^2y\,dy = \left[x^2\,\frac{y^2}{2}\right]_{y=x}^{4-2x} = x^2\,\frac{(4-2x)^2}{2} - x^2\,\frac{x^2}{2} = 8x^2 - 8x^3 + \frac{3}{2}x^4$$
Putting this into the outer integral gives
$$\int_{x=0}^{1}\left(8x^2 - 8x^3 + \frac{3}{2}x^4\right)dx = \left[\frac{8}{3}x^3 - 2x^4 + \frac{3}{10}x^5\right]_0^1 = \left(\frac{8}{3} - 2 + \frac{3}{10}\right) - 0 = \frac{29}{30}$$
Exercises
Evaluate the following integrals
1. $\int_{x=0}^{1}\int_{y=3x}^{x^2+2} xy\,dy\,dx$
2. $\int_{x=1}^{2}\int_{y=x^2+2}^{3x} xy\,dy\,dx$ [Hint: Note how the same curves can define different regions.]
3. $\int_{x=1}^{2}\int_{y=1}^{x^2}\frac{x}{y}\,dy\,dx$ [Hint: use integration by parts for the outer integral.]
Answers
1. 11/24
2. 9/8
3. $4\ln 2 - \frac{3}{2} \approx 1.27$
Splitting the region of integration

Sometimes it is difficult or impossible to represent the region of integration by means of consistent
limits on x and y. Instead, it is possible to divide the region of integration into two (or more)
sub-regions, carry out a multiple integral on each region and add the integrals together. For example,
suppose it is necessary to integrate the function g(x, y) over the triangle defined by the three points
(0, 0), (1, 4) and (2, −2).
Figure 15: the triangle with vertices A(0, 0), B(1, 4) and C(2, −2), and the point D(1, −1) on AC
It is not possible to represent the triangle ABC by means of limits on an inner integral and an outer
integral. However, it can be split into the triangle ABD and the triangle BCD. D is chosen to be
the point on AC directly beneath B, that is, line BD is parallel to the y-axis so that x is constant
along it. Note that the sides of triangle ABC are defined by sections of the lines y = 4x, y = −x
and y = −6x + 10.
In triangle ABD, the variable x takes values between x = 0 and x = 1. For each value of x, y can
take values between y = −x (bottom) and y = 4x (top). Hence, the integral of the function g(x, y) over
triangle ABD is
$$I_1 = \int_{x=0}^{x=1}\int_{y=-x}^{y=4x} g(x,y)\,dy\,dx$$
Similarly, the integral of g(x, y) over triangle BCD is
$$I_2 = \int_{x=1}^{x=2}\int_{y=-x}^{y=-6x+10} g(x,y)\,dy\,dx$$
and the integral over the full triangle is
$$I = I_1 + I_2 = \int_{x=0}^{x=1}\int_{y=-x}^{y=4x} g(x,y)\,dy\,dx + \int_{x=1}^{x=2}\int_{y=-x}^{y=-6x+10} g(x,y)\,dy\,dx$$
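Example 12
Integrate the function g(x, y) = xy over the triangle ABC.
Solution
Over triangle ABD, the integral is
$$\begin{aligned}
I_1 &= \int_{x=0}^{x=1}\int_{y=-x}^{y=4x} xy\,dy\,dx\\
&= \int_{x=0}^{x=1}\left[\frac{1}{2}xy^2\right]_{y=-x}^{y=4x}dx = \int_{x=0}^{x=1}\left(8x^3 - \frac{1}{2}x^3\right)dx\\
&= \int_{x=0}^{x=1}\frac{15}{2}x^3\,dx = \left[\frac{15}{8}x^4\right]_0^1 = \frac{15}{8} - 0 = \frac{15}{8}
\end{aligned}$$
Over triangle BCD, the integral is
$$\begin{aligned}
I_2 &= \int_{x=1}^{x=2}\int_{y=-x}^{y=-6x+10} xy\,dy\,dx\\
&= \int_{x=1}^{x=2}\left[\frac{1}{2}xy^2\right]_{y=-x}^{y=-6x+10}dx = \int_{x=1}^{x=2}\left(\frac{1}{2}x(-6x+10)^2 - \frac{1}{2}x(-x)^2\right)dx\\
&= \frac{1}{2}\int_{x=1}^{x=2}\left(36x^3 - 120x^2 + 100x - x^3\right)dx = \frac{1}{2}\int_{x=1}^{x=2}\left(35x^3 - 120x^2 + 100x\right)dx\\
&= \frac{1}{2}\left[\frac{35}{4}x^4 - 40x^3 + 50x^2\right]_1^2 = 10 - \frac{75}{8} = \frac{5}{8}
\end{aligned}$$
So the total integral is $I_1 + I_2 = \frac{15}{8} + \frac{5}{8} = \frac{5}{2}$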
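The splitting of triangle ABC into ABD and BCD translates directly into two numerical integrals that are added together. The sketch below is an illustration, not part of the workbook: the helper double_integral (a midpoint rule with n strips in each direction) and its argument names are assumptions.

```python
def double_integral(f, a, b, lower, upper, n=300):
    """Midpoint-rule estimate of the integral of f(x, y) over a <= x <= b, lower(x) <= y <= upper(x)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        y0 = lower(x)
        dy = (upper(x) - y0) / n
        total += sum(f(x, y0 + (j + 0.5) * dy) for j in range(n)) * dy * dx
    return total


def g(x, y):
    return x * y


# Triangle ABD: 0 <= x <= 1, between y = -x and y = 4x
I1 = double_integral(g, 0.0, 1.0, lambda x: -x, lambda x: 4.0 * x)
# Triangle BCD: 1 <= x <= 2, between y = -x and y = -6x + 10
I2 = double_integral(g, 1.0, 2.0, lambda x: -x, lambda x: -6.0 * x + 10.0)
print(I1, I2, I1 + I2)  # close to 15/8, 5/8 and 5/2
```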
2. Order of integration
All of the preceding Examples and Tasks have been integrals of the form
$$I = \int_{x=a}^{x=b}\int_{G_1(x)}^{G_2(x)} f(x,y)\,dy\,dx$$
These integrals represent taking vertical slices through the volume that are parallel to the yz-plane.
That is, vertically through the xy-plane.
Just as for integration over rectangular regions, the order of integration can be changed and the
region can be sliced parallel to the xz-plane. If the inner integral is taken with respect to x then an
integral of the following form is obtained:
$$I = \int_{y=c}^{y=d}\int_{H_1(y)}^{H_2(y)} f(x,y)\,dx\,dy$$
Key Point 5
Changing Order of Integration
1. The integrand f (x, y) is not altered by changing the order of integration.
2. The limits will, in general, be different.
Example 13
The following integral was evaluated in Example 9.
$$I = \int_{x=0}^{1}\int_{y=0}^{x^2} 2x\sin(y)\,dy\,dx = 1 - \sin(1)$$
Change the order of integration and confirm that the new integral gives the same
result.
Figure 16: the region between the curve y = x² and the x axis for 0 ≤ x ≤ 1
Solution
The integral is taken over the region which is bounded by the curve y = x². Expressed as a function
of y this curve is $x = \sqrt{y}$. Now consider this curve as bounding the region from the left, then the
line x = 1 bounds the region to the right. These then are the limit functions for the inner integral
$H_1(y) = \sqrt{y}$ and $H_2(y) = 1$. Then the limits for the outer integral are c = 0 ≤ y ≤ 1 = d. The
following integral is obtained
$$\begin{aligned}
I &= \int_{y=0}^{1}\int_{x=\sqrt{y}}^{1} 2x\sin(y)\,dx\,dy = \int_{y=0}^{1}\Big[x^2\sin(y)\Big]_{x=\sqrt{y}}^{x=1}dy = \int_{y=0}^{1}(1-y)\sin(y)\,dy\\
&= \Big[-(1-y)\cos(y)\Big]_{y=0}^{1} - \int_{y=0}^{1}\cos(y)\,dy, \quad\text{using integration by parts}\\
&= 1 - \Big[\sin(y)\Big]_{y=0}^{1} = 1 - \sin(1)
\end{aligned}$$
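Both orders of integration can be compared numerically. In the sketch below (illustrative only) the helper iterated_integral takes the outer variable first and the inner variable second, so reversing the order means swapping the roles of x and y and supplying the new limit functions H1 and H2; the helper name, its arguments and the midpoint rule are assumptions made for the example.

```python
import math


def iterated_integral(f, a, b, lower, upper, n=300):
    """Midpoint-rule estimate: outer variable u from a to b, inner variable v from lower(u) to upper(u)."""
    du = (b - a) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        v0 = lower(u)
        dv = (upper(u) - v0) / n
        total += sum(f(u, v0 + (j + 0.5) * dv) for j in range(n)) * dv * du
    return total


# Original order (Example 9): outer x from 0 to 1, inner y from 0 to x^2
dy_dx = iterated_integral(lambda x, y: 2.0 * x * math.sin(y), 0.0, 1.0,
                          lambda x: 0.0, lambda x: x * x)
# Reversed order (Example 13): outer y from 0 to 1, inner x from sqrt(y) to 1
dx_dy = iterated_integral(lambda y, x: 2.0 * x * math.sin(y), 0.0, 1.0,
                          lambda y: math.sqrt(y), lambda y: 1.0)
exact = 1.0 - math.sin(1.0)
print(dy_dx, dx_dy, exact)  # all close to 0.1585
```

The integrand is the same in both calls; only the limits, and which variable is treated as the outer one, have changed, as stated in Key Point 5.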
Task
The double integral $I = \int_0^1\int_x^1 e^{y^2}\,dy\,dx$ involves an inner integral which is
impossible to integrate. Show that if the order of integration is reversed, the
integral can be expressed as $I = \int_0^1\int_0^y e^{y^2}\,dx\,dy$. Hence evaluate the integral I.
Your solution
Answer
The following diagram shows the changing description of the boundary as the order of integration
is changed.
[Figure: the triangle bounded by x = 0, y = 1 and y = x, described first by x = 0 to x = 1 with y = x to y = 1, and after changing order by y = 0 to y = 1 with x = 0 to x = y]
$$\begin{aligned}
I &= \int_0^1\int_x^1 e^{y^2}\,dy\,dx = \int_0^1\int_0^y e^{y^2}\,dx\,dy = \int_0^1\Big[xe^{y^2}\Big]_0^y dy\\
&= \int_0^1 ye^{y^2}\,dy = \left[\frac{1}{2}e^{y^2}\right]_0^1 = \frac{1}{2}(e - 1)
\end{aligned}$$
3. Evaluating surface integrals using polar coordinates

Areas with circular boundaries often lead to double integrals with awkward limits, and these integrals
can be difficult to evaluate. In such cases it is easier to work with polar (r, θ) rather than Cartesian
(x, y) coordinates.
Polar coordinates
Figure 17: the point P at distance r from the origin O, with the line OP at angle θ to the positive x axis
The polar coordinates of the point P are the distance r from P to the origin O and the angle θ that
the line OP makes with the positive x axis. The following are used to transform between polar and
rectangular coordinates.
1. Given (x, y), (r, θ) are found using $r = \sqrt{x^2 + y^2}$ and $\tan\theta = \dfrac{y}{x}$.
2. Given (r, θ), (x, y) are found using x = r cos θ and y = r sin θ.
Note that we also have the relation r² = x² + y².
Finding surface integrals with polar coordinates

The area of integration A is covered with coordinate circles given by r = constant and coordinate
lines given by θ = constant.
The elementary areas δA are almost rectangles having width δr and length determined by the length
of the part of the circle of radius r between θ and θ + δθ; the arc length of this part of the circle is rδθ.
So δA ≈ rδrδθ. Thus to evaluate $\int_A f(x,y)\,dA$ we sum f(r, θ) rδrδθ for all δA.
$$\int_A f(x,y)\,dA = \int_{\theta=\theta_A}^{\theta=\theta_B}\int_{r=r_1(\theta)}^{r=r_2(\theta)} f(r,\theta)\,r\,dr\,d\theta$$
Key Point 6
Polar Coordinates
In double integration using polar coordinates, the variable r appears in f(r, θ) and in r dr dθ. As
explained above, this r is required because the elementary area element becomes larger further away
from the origin.
Figure 18: the region of integration between the lines θ = θA and θ = θB and the curves r = r1(θ) and r = r2(θ), showing an element δA bounded by lines θ = constant and circles r = constant
Note that the use of polar coordinates is a special case of the use of a change of variables. Further
cases of change of variables will be considered in Section 27.4.
Example 14
Z πZ
3
2
Evaluate r cos θ drdθ and sketch the region of integration. Note that it is
0 0
the function cos θ which is being integrated over the region and the r comes from
the rdrdθ.
y
θ = π/3
r=2
r=0
θ
x
θ=0
Figure 19
34 HELM (2006):
®
Solution
The evaluation is similar to that for cartesian coordinates. The inner integral with respect to r, is
evaluated first with θ constant. Then the outer θ integral is evaluated.
\[
\begin{aligned}
\int_0^{\pi/3}\int_0^2 r\cos\theta\,dr\,d\theta &= \int_0^{\pi/3}\left[\tfrac{1}{2}r^2\cos\theta\right]_0^2 d\theta \\
&= \int_0^{\pi/3} 2\cos\theta\,d\theta \\
&= \Big[2\sin\theta\Big]_0^{\pi/3} = 2\sin\frac{\pi}{3} = \sqrt{3}
\end{aligned}
\]
With θ constant r varies between 0 and 2, so the bounding curves of the polar strip start at r = 0 and end at r = 2. As θ varies between 0 and π/3 a sector of a circular disc is swept out. This sector is the region of integration shown above.
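The result of Example 14 can be checked numerically. The following is a minimal sketch, not part of the original Workbook: the helper name `polar_double_integral` and the choice of a midpoint rule with n = 400 subdivisions are assumptions made for illustration. It sums f(r, θ) r δr δθ over small polar elements, exactly as described before Key Point 6.

```python
import math

def polar_double_integral(f, theta_a, theta_b, r1, r2, n=400):
    """Midpoint-rule estimate of the integral of f(r, theta) * r dr dtheta.

    r1 and r2 are functions of theta giving the inner and outer radius.
    """
    d_theta = (theta_b - theta_a) / n
    total = 0.0
    for i in range(n):
        theta = theta_a + (i + 0.5) * d_theta
        r_lo, r_hi = r1(theta), r2(theta)
        d_r = (r_hi - r_lo) / n
        for j in range(n):
            r = r_lo + (j + 0.5) * d_r
            # f * r * dr * dtheta is the contribution of one elementary area
            total += f(r, theta) * r * d_r * d_theta
    return total

# Example 14: f = cos(theta) over 0 <= r <= 2, 0 <= theta <= pi/3
example_14 = polar_double_integral(
    lambda r, theta: math.cos(theta),
    0.0, math.pi / 3,
    lambda theta: 0.0, lambda theta: 2.0)
print(example_14, math.sqrt(3))
```

The two printed values should agree closely, the first being the numerical estimate and the second the exact answer √3.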
Example 15
Earlier in this Section, an example concerned integrating the function $f(x, y) = 5x^2y$ over the half of the unit circle which lies above the x-axis. It is also possible to carry out this integration using polar coordinates.
Solution
The semi-circle is characterised by 0 ≤ r ≤ 1 and 0 ≤ θ ≤ π. So the integral may be written
(remembering that x = r cos θ and y = r sin θ)
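\[
\int_0^{\pi}\int_0^1 5(r\cos\theta)^2 (r\sin\theta)\, r\,dr\,d\theta
\]
which can be evaluated as follows
\[
\begin{aligned}
\int_0^{\pi}\int_0^1 5r^4\sin\theta\cos^2\theta\,dr\,d\theta
&= \int_0^{\pi}\Big[r^5\sin\theta\cos^2\theta\Big]_0^1 d\theta \\
&= \int_0^{\pi}\sin\theta\cos^2\theta\,d\theta = \left[-\tfrac{1}{3}\cos^3\theta\right]_0^{\pi} \\
&= -\tfrac{1}{3}\cos^3\pi + \tfrac{1}{3}\cos^3 0 = -\tfrac{1}{3}(-1) + \tfrac{1}{3}(1) = \tfrac{2}{3}
\end{aligned}
\]
This is, of course, the same answer that was obtained using an integration over rectangular coordinates.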
4. Applications of surface integration
Force on a dam
Section 27.1 considered the force on a rectangular dam of width 100 m and height 40 m. Instead,
imagine that the dam is not rectangular in profile but instead has a width of 100 m at the top but
only 80 m at the bottom. The top and bottom of the dam can be given by line segments y = 0 (bottom) and y = 40 while the sides are parts of the lines y = 40 − 4x, i.e. x = 10 − y/4 (left), and y = 40 + 4(x − 100) = 4x − 360, i.e. x = 90 + y/4 (right). (See Figure 20).
Figure 20: profile of the dam, with corners at (0, 40), (100, 40), (90, 0) and (10, 0), bounded by y = 40, y = 0, y = 40 − 4x and y = 4x − 360
Thus the dam exists at heights y between 0 and 40 while for each value of y, the horizontal coordinate x varies between x = 10 − y/4 and x = 90 + y/4. Thus the surface integral representing the total force, i.e. $I = \int_A 10^4(40 - y)\,dA$, becomes the double integral
\[
I = \int_0^{40}\int_{10-y/4}^{90+y/4} 10^4(40 - y)\,dx\,dy
\]
which is evaluated as follows:
\[
\begin{aligned}
I &= \int_0^{40}\int_{10-y/4}^{90+y/4} 10^4(40 - y)\,dx\,dy \\
&= 10^4\int_0^{40}\Big[(40 - y)x\Big]_{10-y/4}^{90+y/4} dy
 = 10^4\int_0^{40}\left[(40 - y)\left(90 + \frac{y}{4}\right) - (40 - y)\left(10 - \frac{y}{4}\right)\right] dy \\
&= 10^4\int_0^{40}(40 - y)\left(80 + \frac{y}{2}\right) dy
 = 10^4\int_0^{40}\left(3200 - 60y - \frac{y^2}{2}\right) dy \\
&= 10^4\left[3200y - 30y^2 - \frac{1}{6}y^3\right]_0^{40}
 = 10^4\left[\left(3200\times 40 - 30\times 40^2 - \frac{1}{6}\,40^3\right) - 0\right] \\
&= 10^4\times\frac{208000}{3} \approx 6.93\times 10^8\ \text{N}
\end{aligned}
\]
i.e. the total force is just under 700 meganewtons.
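As a cross-check on the dam calculation, the sketch below (an illustration only; the function name `dam_force` and the midpoint rule are assumptions, not part of the Workbook) carries out the inner x integration exactly, since the integrand does not depend on x, and then sums over horizontal strips in y.

```python
from fractions import Fraction

def dam_force(n=2000):
    """Midpoint-rule sum in y of 10^4 (40 - y) times the dam width at height y."""
    dy = 40.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * dy
        # width of the dam at height y: from x = 10 - y/4 to x = 90 + y/4
        width = (90 + y / 4) - (10 - y / 4)
        total += 1e4 * (40 - y) * width * dy
    return total

# exact value quoted in the text: 10^4 * 208000 / 3 newtons
exact = Fraction(208000, 3) * 10**4
print(dam_force(), float(exact))
```

Both numbers should be close to 6.93 × 10^8 N.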
Centre of pressure
A plane area in the shape of a quadrant of a circle of radius a is immersed vertically in a fluid with
one bounding radius in the surface. Find the position of the centre of pressure.
Figure 21: the quadrant of radius a, showing the origin O, the x-axis and the angle θ
Note: In subsection 6 of Section 27.1 it was shown that the coordinates of the centre of pressure of
a (thin) object are
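\[
x_p = \frac{\int_A xy\,dA}{\int_A y\,dA} \qquad\text{and}\qquad y_p = \frac{\int_A y^2\,dA}{\int_A y\,dA}
\]
\[
\begin{aligned}
\int_A y\,dA &= \int_0^{\pi/2}\int_0^a r^2\sin\theta\,dr\,d\theta = \int_0^{\pi/2}\left[\tfrac{1}{3}r^3\sin\theta\right]_0^a d\theta \\
&= \tfrac{1}{3}a^3\int_0^{\pi/2}\sin\theta\,d\theta = \tfrac{1}{3}a^3\Big[-\cos\theta\Big]_0^{\pi/2} = \tfrac{1}{3}a^3
\end{aligned}
\]
\[
\begin{aligned}
\int_A xy\,dA &= \int_0^{\pi/2}\int_0^a r^3\cos\theta\sin\theta\,dr\,d\theta = \int_0^{\pi/2}\left[\tfrac{1}{4}r^4\cos\theta\sin\theta\right]_0^a d\theta \\
&= \tfrac{1}{4}a^4\int_0^{\pi/2}\sin\theta\cos\theta\,d\theta = \tfrac{1}{4}a^4\left[\tfrac{1}{2}\sin^2\theta\right]_0^{\pi/2} = \tfrac{1}{8}a^4
\end{aligned}
\]
\[
\begin{aligned}
\int_A y^2\,dA &= \int_0^{\pi/2}\int_0^a r^3\sin^2\theta\,dr\,d\theta = \int_0^{\pi/2}\left[\tfrac{1}{4}r^4\sin^2\theta\right]_0^a d\theta \\
&= \tfrac{1}{4}a^4\int_0^{\pi/2}\sin^2\theta\,d\theta = \tfrac{1}{4}a^4\int_0^{\pi/2}\tfrac{1}{2}(1 - \cos 2\theta)\,d\theta
 = \tfrac{1}{8}a^4\left[\theta - \tfrac{1}{2}\sin 2\theta\right]_0^{\pi/2} = \tfrac{1}{16}\pi a^4
\end{aligned}
\]
Then
\[
x_p = \frac{\int_A xy\,dA}{\int_A y\,dA} = \frac{\tfrac{1}{8}a^4}{\tfrac{1}{3}a^3} = \frac{3}{8}a
\qquad\text{and}\qquad
y_p = \frac{\int_A y^2\,dA}{\int_A y\,dA} = \frac{\tfrac{1}{16}\pi a^4}{\tfrac{1}{3}a^3} = \frac{3}{16}\pi a.
\]
The centre of pressure is at $\left(\tfrac{3}{8}a,\ \tfrac{3}{16}\pi a\right)$.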
Volume of liquid in an elliptic tank

Introduction
A tank in the shape of an elliptic cylinder has a volume of liquid poured into it. It is useful to know
in advance how deep the liquid will be. In order to make this calculation, it is necessary to perform
a multiple integration.
Figure 22: the elliptic tank, showing the thickness b, the semi-axis c, the liquid depth h and the y and z axes
Problem in words
The tank has semi-axes a (horizontal) and c (vertical) and is of constant thickness b. A volume of
liquid V is poured in (assuming that V < πabc, the volume of the tank), filling it to a depth h,
which is to be calculated. Assume 3-D coordinate axes based on a point at the bottom of the tank.
Since the tank is of constant thickness b, the volume of liquid is given by the shaded area multiplied
by b, i.e.
V = b × shaded area
where the shaded area can be expressed as the double integral
\[
\int_{z=0}^{h}\int_{x=x_1}^{x_2} dx\,dz
\]
where the limits x1 and x2 on x can be found from the equation of the ellipse
\[
\frac{x^2}{a^2} + \frac{(z - c)^2}{c^2} = 1
\]
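From the equation of the ellipse
\[
x^2 = a^2\left(1 - \frac{(z - c)^2}{c^2}\right) = \frac{a^2}{c^2}\left(c^2 - (z - c)^2\right) = \frac{a^2}{c^2}\left(2zc - z^2\right)
\quad\text{so}\quad x = \pm\frac{a}{c}\sqrt{2zc - z^2}
\]
Thus
\[
x_1 = -\frac{a}{c}\sqrt{2zc - z^2}, \quad\text{and}\quad x_2 = +\frac{a}{c}\sqrt{2zc - z^2}
\]
Consequently
\[
V = b\int_{z=0}^{h}\int_{x=x_1}^{x_2} dx\,dz = b\int_{z=0}^{h}\Big[x\Big]_{x_1}^{x_2} dz = b\int_{z=0}^{h} 2\,\frac{a}{c}\sqrt{2zc - z^2}\,dz
\]
Now use the substitution z − c = c sin θ so that dz = c cos θ dθ and $\sqrt{2zc - z^2} = \sqrt{c^2 - (z - c)^2} = c\cos\theta$. The limits become
z = 0 gives θ = −π/2
z = h gives $\theta = \sin^{-1}\left(\dfrac{h}{c} - 1\right) = \theta_0$ (say)
\[
\begin{aligned}
V &= b\int_{-\pi/2}^{\theta_0} 2\,\frac{a}{c}\,c\cos\theta\; c\cos\theta\,d\theta \\
&= 2abc\int_{-\pi/2}^{\theta_0}\cos^2\theta\,d\theta \\
&= abc\int_{-\pi/2}^{\theta_0}\left[1 + \cos 2\theta\right] d\theta \\
&= abc\left[\theta + \tfrac{1}{2}\sin 2\theta\right]_{-\pi/2}^{\theta_0} \\
&= abc\left[\theta_0 + \tfrac{1}{2}\sin 2\theta_0 - \left(-\frac{\pi}{2} + 0\right)\right] \\
&= abc\left[\theta_0 + \tfrac{1}{2}\sin 2\theta_0 + \frac{\pi}{2}\right] \qquad (*)
\end{aligned}
\]
which can also be expressed in the form
\[
V = abc\left[\sin^{-1}\left(\frac{h}{c} - 1\right) + \left(\frac{h}{c} - 1\right)\sqrt{1 - \left(\frac{h}{c} - 1\right)^2} + \frac{\pi}{2}\right]
\]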
While (∗) expresses V as a function of θ0 (and therefore h), finding θ0 as a function of V requires a numerical method. For given a, b, c and V, solve equation (∗) numerically to find θ0 and then find h from h = c(1 + sin θ0).
Interpretation
If a = 2 m, b = 1 m, c = 3 m (so the total volume of the tank is 6π m³ ≈ 18.85 m³), and a volume of 7 m³ is to be poured into the tank then
\[
V = abc\left[\theta_0 + \tfrac{1}{2}\sin 2\theta_0 + \frac{\pi}{2}\right]
\]
which becomes
\[
7 = 6\left[\theta_0 + \tfrac{1}{2}\sin 2\theta_0 + \frac{\pi}{2}\right]
\]
and has solution θ0 = −0.205 (3 decimal places).
Finally
\[
h = c(1 + \sin\theta_0) = 3(1 + \sin(-0.205)) = 2.39\ \text{m to 2 d.p.}
\]
compared to the maximum height of 6 m.
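The text does not say which numerical method is used to solve equation (∗). One possibility is bisection, sketched below under that assumption; the function names `tank_volume` and `solve_theta0` are hypothetical. Bisection is applicable because dV/dθ0 = abc(1 + cos 2θ0) is never negative, so V increases steadily as θ0 runs from −π/2 (empty) to π/2 (full).

```python
import math

def tank_volume(theta0, a, b, c):
    """Equation (*): V = abc (theta0 + sin(2 theta0)/2 + pi/2)."""
    return a * b * c * (theta0 + 0.5 * math.sin(2 * theta0) + math.pi / 2)

def solve_theta0(volume, a, b, c, tol=1e-10):
    """Bisection on [-pi/2, pi/2]; V increases with theta0 on this interval."""
    lo, hi = -math.pi / 2, math.pi / 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tank_volume(mid, a, b, c) < volume:
            lo = mid   # too little liquid: the root lies above mid
        else:
            hi = mid   # too much liquid: the root lies below mid
    return 0.5 * (lo + hi)

# Data from the Interpretation: a = 2 m, b = 1 m, c = 3 m, V = 7 m^3
theta0 = solve_theta0(7.0, a=2.0, b=1.0, c=3.0)
depth = 3.0 * (1 + math.sin(theta0))   # h = c (1 + sin theta0)
print(round(theta0, 3), round(depth, 2))
```

The printed values can be compared with θ0 = −0.205 and h = 2.39 m quoted above.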
Exercises
1. Integrate the functions (a) xy and (b) xy + 3y² over the quadrilateral with vertices at (0, 0), (3, 0), (2, 2) and (0, 4).
2. Show that $\int\!\!\int_A f(x, y)\,dy\,dx = \int\!\!\int_A f(x, y)\,dx\,dy$ for f(x, y) = xy² when A is the interior of the triangle with vertices at (0, 0), (2, 0) and (2, 4).
3. By reversing the order of the two integrals, evaluate the integral $\int_{y=0}^{4}\int_{x=y^{1/2}}^{2}\sin x^3\,dx\,dy$
4. Integrate the function f(x, y) = x³ + xy² over the quadrant x ≥ 0, y ≥ 0, x² + y² ≤ 1.

Answers
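1. $\int_{x=0}^{2}\int_{y=0}^{4-x} f(x, y)\,dy\,dx + \int_{x=2}^{3}\int_{y=0}^{6-2x} f(x, y)\,dy\,dx$; (a) $\frac{22}{3} + \frac{3}{2} = \frac{53}{6}$; (b) $\frac{202}{3} + \frac{7}{2} = \frac{425}{6}$
2. Both equal $\frac{256}{15}$
3. $\int_{x=0}^{2}\int_{y=0}^{x^2}\sin x^3\,dy\,dx = \frac{1}{3}(1 - \cos 8) \approx 0.382$
4. $\int_{\theta=0}^{\pi/2}\int_{r=0}^{1} r^4\cos\theta\,dr\,d\theta = \frac{1}{5}$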

Volume Integrals 27.3
Introduction
In the previous two Sections, surface integrals (or double integrals) were introduced i.e. functions
were integrated with respect to one variable and then with respect to another variable. It is often
useful in engineering to extend the process to an integration with respect to three variables i.e. a
volume integral or triple integral. Many of the processes and techniques involved in double integration
are relevant to triple integration.
two variables
Prerequisites
Before starting this Section you should . . . • have studied Sections 27.1 and 27.2 on
double integration
• be able to visualise or sketch a function in

three variables.

Learning Outcomes • evaluate triple integrals


1. Example of volume integral: mass of water in a reservoir
Sections 27.1 and 27.2 introduced an example showing how the force on a dam can be represented by
a double integral. Suppose, instead of the total force on the dam, an engineer wishes to find the total
mass of water in the reservoir behind the dam. The mass of a little element of water (dimensions δx
in length, δy in breadth and δz in height) with density ρ is given by ρδzδyδx (i.e. the mass of the
element is given by its density multiplied by its volume).
The density may vary at different parts of the reservoir e.g. due to temperature variations and
the water expanding at higher temperatures. It is important to realise that the density ρ may be a
function of all three variables, x, y and z. For example, during the spring months, the depths of the
reservoir may be at the cold temperatures of the winter while the parts of the reservoir nearer the
surface may be at higher temperatures representing the fact that they have been influenced by the
warmer air above; this represents the temperature varying with the vertical coordinate z. Also, the
parts of the reservoir near where streams flow in may be extremely cold as melting snow flows into
the reservoir. This represents the density varying with the horizontal coordinates x and y.
Thus the mass of a small element of water is given by ρ(x, y, z) δz δy δx. The mass of water in a column is given by the integral $\int_{-h(x,y)}^{0}\rho(x, y, z)\,dz\,\delta y\,\delta x$ where the level z = 0 represents the surface of the reservoir and the function h(x, y) represents the depth of the reservoir for the particular values of x and y under consideration. [Note that the depth is positive but as it is measured downwards, it represents a negative value of z.]
The mass of water in a slice (aligned parallel to the x-axis) is given by integrating once more with respect to y, i.e. $\int_{y_1(x)}^{y_2(x)}\int_{-h(x,y)}^{0}\rho(x, y, z)\,dz\,dy\,\delta x$. Here the functions y1(x) and y2(x) represent the extreme values of y for the value of x under consideration.
Finally the total mass of water in the reservoir can be found by integrating over all x, i.e.
\[
\int_a^b\int_{y_1(x)}^{y_2(x)}\int_{-h(x,y)}^{0}\rho(x, y, z)\,dz\,dy\,dx.
\]
To find the total mass of water, it is necessary to integrate the density three times, firstly with
respect to z (between limits dependent on x and y), then with respect to y (between limits which
are functions of x) and finally with respect to x (between limits which are constant).
This is an example of a triple or volume integral.
2. Evaluating triple integrals

A triple integral is an integral of the form
\[
\int_a^b\int_{p(x)}^{q(x)}\int_{r(x,y)}^{s(x,y)} f(x, y, z)\,dz\,dy\,dx
\]
The evaluation can be split into an “inner integral” (the integral with respect to z between limits which are functions of x and y), an “intermediate integral” (the integration with respect to y between limits which are functions of x) and an “outer integral” (the integration with respect to x between limits which are constants). Note that there is nothing special about the variable names x, y and z: other variable names could have been used instead.
Triple integrals can be represented in different ways. $\int_V f\,dV$ represents a triple integral where the dV is replaced by dx dy dz (or equivalent) and the limit of V on the integral is replaced by appropriate limits on the three integrals.
Note that the integral $\int_V dV$ (i.e. integrating the function f(x, y, z) = 1) gives the volume of the relevant shape. Hence the alternative name of volume integral.
One special case is where the limits on all the integrals are constants (a constant is, of course, a
special case of a function). This represents an integral over a cuboidal region.
Example 16
Consider a cube V of side 1.
(a) Express the integral $\int_V f\,dV$ (where f is any function of x, y and z) as a triple integral.
(b) Hence evaluate $\int_V (y^2 + z^2)\,dV$
Figure 23: the unit cube bounded by the planes x = 0, x = 1, y = 0, y = 1, z = 0 and z = 1, containing a small element δV
Solution
(a) Consider a little element of length dx, width dy and height dz. Then δV (the volume of
the small element) is the product of these lengths dxdydz. The function is integrated three times.
The first integration represents the integral over the vertical strip from z = 0 to z = 1. The second
integration represents this strip sweeping across from y = 0 to y = 1 and is the integration over
the slice that is swept out by the strip. Finally the integration with respect to x represents this slice
sweeping from x = 0 to x = 1 and is the integration over the entire cube. The integral therefore
becomes
\[
\int_0^1\int_0^1\int_0^1 f(x, y, z)\,dz\,dy\,dx
\]
Solution (contd.)
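(b) In the particular case where the function is f(x, y, z) = y² + z², the integral becomes
\[
\int_0^1\int_0^1\int_0^1 (y^2 + z^2)\,dz\,dy\,dx
\]
The inner integral is
\[
\int_0^1 (y^2 + z^2)\,dz = \left[y^2 z + \tfrac{1}{3}z^3\right]_{z=0}^{1} = y^2\times 1 + \tfrac{1}{3}\times 1 - y^2\times 0 - \tfrac{1}{3}\times 0 = y^2 + \tfrac{1}{3}
\]
This inner integral is now placed into the intermediate integral to give
\[
\int_0^1 \left(y^2 + \tfrac{1}{3}\right) dy = \left[\tfrac{1}{3}y^3 + \tfrac{1}{3}y\right]_{y=0}^{1} = \tfrac{1}{3}\times 1^3 + \tfrac{1}{3}\times 1 - \tfrac{1}{3}\times 0^3 - \tfrac{1}{3}\times 0 = \tfrac{2}{3}
\]
Finally, this intermediate integral can be placed into the outer integral to give
\[
\int_0^1 \tfrac{2}{3}\,dx = \left[\tfrac{2}{3}x\right]_0^1 = \tfrac{2}{3}\times 1 - \tfrac{2}{3}\times 0 = \tfrac{2}{3}
\]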
Example 17
Evaluate $\int_0^1\int_0^2\int_0^3 8xyz\,dz\,dy\,dx$. This represents an integral over the cuboid given by 0 ≤ x ≤ 1, 0 ≤ y ≤ 2, 0 ≤ z ≤ 3.
Solution
The inner integral is given by integrating the function with respect to z while keeping x and y
constant.
\[
\int_0^3 8xyz\,dz = \Big[4xyz^2\Big]_0^3 = 4xy\times 9 - 0 = 36xy
\]
This result is now integrated with respect to y while keeping x constant:
\[
\int_0^2 36xy\,dy = \Big[18xy^2\Big]_0^2 = 18x\times 4 - 0 = 72x
\]
Finally, this result is integrated with respect to x:
\[
\int_0^1 72x\,dx = \Big[36x^2\Big]_0^1 = 36\times 1 - 0 = 36
\]
Hence, $\int_0^1\int_0^2\int_0^3 8xyz\,dz\,dy\,dx = 36$
More generally, the limits on the inner integral may be functions of the “intermediate” and “outer” variables and the limits on the intermediate integral may be functions of the “outer” variable.
Example 18
V is the tetrahedron bounded by the planes x = 0, y = 0, z = 0 and x + y + z = 4 (see Figure 24).
Figure 24: the tetrahedron, meeting the axes at x = 4, y = 4 and z = 4, with top face z = 4 − x − y, base z = 0 with edge y = 4 − x, and a small element δV
(a) Express $\int_V f(x, y, z)\,dV$ (where f is a function of x, y and z) as a triple integral.
(b) Hence find $\int_V x\,dV$.
Solution
The tetrahedron is divided into a series of slices parallel to the yz-plane and each slice is divided into a series of vertical strips. For each strip, the bottom is at z = 0 and the top is on the plane x + y + z = 4 i.e. z = 4 − x − y. So the integral up each strip is given by $\int_{z=0}^{4-x-y} f(x, y, z)\,dz$ and this (inner) integral will be a function of x and y.
This, in turn, is integrated over all strips which form the slice. For each value of x, one end of the slice will be at y = 0 and the other end at y = 4 − x. So the integral over the slice is $\int_{y=0}^{4-x}\int_{z=0}^{4-x-y} f(x, y, z)\,dz\,dy$ and this (intermediate) integral will be a function of x.
Finally, integration is carried out over x. The limits on x are x = 0 and x = 4. Thus the triple integral is $\int_{x=0}^{4}\int_{y=0}^{4-x}\int_{z=0}^{4-x-y} f(x, y, z)\,dz\,dy\,dx$ and this (outer) integral will be a constant.
Hence $\int_V f(x, y, z)\,dV = \int_{x=0}^{4}\int_{y=0}^{4-x}\int_{z=0}^{4-x-y} f(x, y, z)\,dz\,dy\,dx$.
Solution (contd.)
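In the case where f(x, y, z) = x, the integral becomes
\[
\begin{aligned}
\int_V f(x, y, z)\,dV &= \int_{x=0}^{4}\int_{y=0}^{4-x}\int_{z=0}^{4-x-y} x\,dz\,dy\,dx \\
&= \int_{x=0}^{4}\int_{y=0}^{4-x}\Big[xz\Big]_{z=0}^{4-x-y} dy\,dx \\
&= \int_{x=0}^{4}\int_{y=0}^{4-x}\left[(4 - x - y)x - 0\right] dy\,dx \\
&= \int_{x=0}^{4}\int_{y=0}^{4-x}\left(4x - x^2 - xy\right) dy\,dx \\
&= \int_{x=0}^{4}\left[4xy - x^2 y - \tfrac{1}{2}xy^2\right]_{y=0}^{4-x} dx \\
&= \int_{x=0}^{4}\left[4x(4 - x) - x^2(4 - x) - \tfrac{1}{2}x(4 - x)^2 - 0\right] dx \\
&= \int_{x=0}^{4}\left[16x - 4x^2 - 4x^2 + x^3 - 8x + 4x^2 - \tfrac{1}{2}x^3\right] dx \\
&= \int_{x=0}^{4}\left[8x - 4x^2 + \tfrac{1}{2}x^3\right] dx \\
&= \left[4x^2 - \tfrac{4}{3}x^3 + \tfrac{1}{8}x^4\right]_0^4 = 4\times 4^2 - \tfrac{4}{3}\times 4^3 + \tfrac{1}{8}\times 4^4 - 0 \\
&= 64 - \frac{256}{3} + 32 = \frac{192 - 256 + 96}{3} = \frac{32}{3}
\end{aligned}
\]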
Key Point 7
Triple Integration
The procedure for carrying out a triple integral is very similar to that for a double integral except
that the procedure requires three stages rather than two.
Example 19
Find the integral of x over the shape shown in Figure 25. It represents half (positive
x) of a cylinder centered at x = y = 0 with radius 1 and vertical extent from z = 0
to z = 1.
Figure 25: the half cylinder x ≥ 0 of radius 1, extending from z = 0 to z = 1
Solution
In terms of x, the shape goes from x = 0 to x = 1. For each value of x, y goes from $-\sqrt{1 - x^2}$ to $\sqrt{1 - x^2}$. The variable z varies from z = 0 to z = 1. Hence the triple integral is
\[
\begin{aligned}
I &= \int_{x=0}^{1}\int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\int_{z=0}^{1} x\,dz\,dy\,dx \\
&= \int_{x=0}^{1}\int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\Big[xz\Big]_{z=0}^{1} dy\,dx
 = \int_{x=0}^{1}\int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}}\left[x - 0\right] dy\,dx
 = \int_{x=0}^{1}\int_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}} x\,dy\,dx \\
&= \int_{x=0}^{1}\Big[xy\Big]_{y=-\sqrt{1-x^2}}^{\sqrt{1-x^2}} dx = \int_{x=0}^{1} 2x\sqrt{1 - x^2}\,dx
\end{aligned}
\]
This outer integral can be evaluated by means of the substitution U = 1 − x² i.e. dU = −2x dx and noting that U = 1 when x = 0 and U = 0 when x = 1 i.e.
\[
I = \int_{x=0}^{1} 2x\sqrt{1 - x^2}\,dx = -\int_1^0 U^{1/2}\,dU = \int_0^1 U^{1/2}\,dU = \left[\tfrac{2}{3}U^{3/2}\right]_0^1 = \tfrac{2}{3} - 0 = \tfrac{2}{3}
\]
It is important to note that the three integrations can be carried out in whatever order is most
convenient. The result does not depend on the order in which the integrals are carried out. However,
when the order of the integrations is changed, it is necessary to consider carefully what the limits
should be on each integration. Simply moving the limits from one integration to another will only
work in the case of integration over a cuboid (i.e. where all limits are constants).
Key Point 8
Order of Integration for Triple Integrals
1. The three integrations can be carried out in whichever order is most convenient.
2. When changing the order of the integrations, it is important to reconsider the limits on each
integration; a diagram can often help.
Example 20
For the triangular prism in Figure 26, with ends given by the planes y = 0 and
y = 2 and remaining faces given by the planes x = 0, z = 0 and x + 4z = 4, find
the integral of x over the prism, by
(a) Integrating first with respect to z, then y and finally x, and
(b) Changing the order of the integrations to x first, then y, then z.
Figure 26: the triangular prism, with intercepts x = 4 and z = 1 and extending from y = 0 to y = 2
Solution
For every value of x and y, the vertical coordinate z varies from z = 0 to z = 1 − x/4. Hence the limits on z are z = 0 and z = 1 − x/4. For every value of x, the limits on y are y = 0 to y = 2. The limits on x are x = 0 and x = 4 (the limits on the figure). Hence the triple integral is $\int_0^4\int_0^2\int_0^{1-x/4} x\,dz\,dy\,dx$ which can be evaluated as follows
\[
\begin{aligned}
I &= \int_0^4\int_0^2\int_0^{1-x/4} x\,dz\,dy\,dx = \int_0^4\int_0^2\Big[xz\Big]_0^{1-x/4} dy\,dx \\
&= \int_0^4\int_0^2\left[x\left(1 - \frac{x}{4}\right) - 0\right] dy\,dx = \int_0^4\int_0^2\left(x - \tfrac{1}{4}x^2\right) dy\,dx \\
&= \int_0^4\left[\left(x - \tfrac{1}{4}x^2\right)y\right]_0^2 dx = \int_0^4\left[\left(x - \tfrac{1}{4}x^2\right)\times 2 - \left(x - \tfrac{1}{4}x^2\right)\times 0\right] dx \\
&= \int_0^4\left(2x - \tfrac{1}{2}x^2\right) dx = \left[x^2 - \tfrac{1}{6}x^3\right]_0^4 = 4^2 - \tfrac{1}{6}\times 4^3 - 0 = 16 - \frac{32}{3} = \frac{16}{3}
\end{aligned}
\]
Solution (contd.)
Now, if the order of the integrations is changed, it is necessary to re-derive the limits on the integrals. For every combination of y and z, x varies between x = 0 (left) and x = 4 − 4z (right). Hence the limits on x are x = 0 and x = 4 − 4z. The limits on y are y = 0 and y = 2 (for all z). The limits of z are z = 0 (bottom) and z = 1 (top).
So the triple integral becomes $\int_0^1\int_0^2\int_0^{4-4z} x\,dx\,dy\,dz$ which can be evaluated as follows
\[
\begin{aligned}
I &= \int_0^1\int_0^2\int_0^{4-4z} x\,dx\,dy\,dz = \int_0^1\int_0^2\left[\tfrac{1}{2}x^2\right]_0^{4-4z} dy\,dz \\
&= \int_0^1\int_0^2 \tfrac{1}{2}(4 - 4z)^2\,dy\,dz = \int_0^1\int_0^2 (8 - 16z + 8z^2)\,dy\,dz \\
&= \int_0^1\Big[(8 - 16z + 8z^2)\,y\Big]_0^2 dz = 2\int_0^1 (8 - 16z + 8z^2)\,dz \\
&= 2\left[8z - 8z^2 + \tfrac{8}{3}z^3\right]_0^1 = 2\left(8 - 8 + \tfrac{8}{3} - 0\right) = \frac{16}{3}
\end{aligned}
\]
Key Point 9
Limits of Integration
While for different orders of integration the integral will always evaluate to the same value, the limits
of integration will in general be different.
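Key Point 9 can be illustrated numerically with the prism of Example 20. In the sketch below (an illustration; the function names and the midpoint rule with n = 50 are assumptions), the same integrand x is summed in the two orders used in the Example, each with its own limits.

```python
def prism_z_first(n=50):
    """Order dz dy dx: 0 <= x <= 4, 0 <= y <= 2, 0 <= z <= 1 - x/4."""
    dx, dy = 4.0 / n, 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        dz = (1.0 - x / 4.0) / n        # z limits depend on x
        for j in range(n):
            for k in range(n):
                total += x * dz * dy * dx
    return total

def prism_x_first(n=50):
    """Order dx dy dz: 0 <= z <= 1, 0 <= y <= 2, 0 <= x <= 4 - 4z."""
    dz, dy = 1.0 / n, 2.0 / n
    total = 0.0
    for k in range(n):
        z = (k + 0.5) * dz
        dx = (4.0 - 4.0 * z) / n        # x limits depend on z
        for j in range(n):
            for i in range(n):
                x = (i + 0.5) * dx
                total += x * dx * dy * dz
    return total

print(prism_z_first(), prism_x_first(), 16 / 3)
```

Both estimates should lie close to 16/3 even though the limits used in the two functions are different.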
Task
Evaluate the triple integral: $\int_0^2\int_0^3\int_0^2 x^3 y^2 z\,dx\,dy\,dz$
Your solution
Answer
\[
\int_0^2 x^3 y^2 z\,dx = \left[\tfrac{1}{4}x^4 y^2 z\right]_0^2 = \tfrac{1}{4}\,2^4 y^2 z - 0 = 4y^2 z
\]
This is put into the intermediate integral i.e.
\[
\int_0^3 4y^2 z\,dy = \left[\tfrac{4}{3}y^3 z\right]_0^3 = \tfrac{4}{3}\,3^3 z - 0 = 36z
\]
Finally, this is put in the outer integral to give
\[
I = \int_0^2 36z\,dz = \Big[18z^2\Big]_0^2 = 18\times 2^2 - 0 = 72
\]
Exercises
Evaluate the following triple integrals
1. $\int_0^2\int_0^x\int_0^{x+z} (x + y + z)\,dy\,dz\,dx$
2. $\int_2^4\int_{-1}^{3}\int_{x/2-2}^{2-x/2} (x + y)\,dz\,dy\,dx$
Answer
1. 14  2. $\frac{88}{3}$
Task
Find the volume of the solid prism shown in the diagram below. Check that when
the order of integration is changed, the volume remains unaltered.
[Diagram: the prism, with sloping face y + z = 1 meeting the axes at y = 1 and z = 1, and end face x = 3]
Your solution
Answer
The volume is given by the triple integral $\iiint dV$.
Putting z on the outer integral, y on the intermediate integral and x on the inner integral,
the limits on z are z = 0 to z = 1. For each value of z, y varies from y = 0 (base) to y = 1 − z
on the sloping face. For each combination of y and z, x varies from x = 0 to x = 3. Thus, the
volume is given by
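\[
\begin{aligned}
V &= \iiint dV = \int_{z=0}^{1}\int_{y=0}^{1-z}\int_{x=0}^{3} dx\,dy\,dz \\
&= \int_{z=0}^{1}\int_{y=0}^{1-z}\Big[x\Big]_0^3 dy\,dz = \int_{z=0}^{1}\int_{y=0}^{1-z} 3\,dy\,dz \\
&= \int_{z=0}^{1}\Big[3y\Big]_{y=0}^{1-z} dz = \int_{z=0}^{1}\left(3(1 - z) - 0\right) dz = \int_{z=0}^{1}(3 - 3z)\,dz \\
&= \left[3z - \tfrac{3}{2}z^2\right]_0^1 = 3 - \tfrac{3}{2} - (0 - 0) = \tfrac{3}{2} = 1.5
\end{aligned}
\]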
Answers continued
Now, the three integrations can be carried out in a different order. For example, with x on the outer
integral, z on the intermediate integral and y on the inner integral, the limits on x are x = 0 to
x = 3; for each value of x, z varies from z = 0 to z = 1 and for each combination of x and z, y
varies from y = 0 to y = 1 − z. The volume is therefore given by
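\[
\begin{aligned}
V &= \iiint dV = \int_{x=0}^{3}\int_{z=0}^{1}\int_{y=0}^{1-z} dy\,dz\,dx \\
&= \int_{x=0}^{3}\int_{z=0}^{1}\Big[y\Big]_{y=0}^{1-z} dz\,dx \\
&= \int_{x=0}^{3}\int_{z=0}^{1}\left[1 - z\right] dz\,dx \\
&= \int_{x=0}^{3}\left[z - \frac{z^2}{2}\right]_{z=0}^{1} dx \\
&= \int_{x=0}^{3}\left[\left(1 - \frac{1^2}{2}\right) - 0\right] dx \\
&= \int_{x=0}^{3}\frac{1}{2}\,dx \\
&= \left[\frac{x}{2}\right]_{x=0}^{3} = \frac{3}{2} - 0 = 1.5
\end{aligned}
\]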
There are in all six ways (3!) to order the three integrations; each order gives the same answer of
1.5.
Exercise
Find the volume of the solid shown in the diagram below. Check that when the order of integration
is changed, the volume remains unaltered.
[Diagram: the solid, with top face z = 6 and curved face y = 4 − x², meeting the axes at x = 2, y = 4 and z = 6]
Answer
32
3. Higher order integrals

A function may be integrated over four or more variables. For example, the integral
\[
\int_{w=0}^{1}\int_{x=0}^{1}\int_{y=0}^{1}\int_{z=0}^{1-x} (w + y)\,dz\,dy\,dx\,dw
\]
represents the function w + y being integrated over the variables w, x, y and z. This is an example
of a quadruple integral.
The methods of evaluating quadruple integrals are very similar to those for double and triple integrals.
Start the integration from the inside and gradually work outwards. Quintuple (five variable) and
higher-order integrals also exist and the techniques are similar.
Example 21
Evaluate the quadruple integral $\int_{w=0}^{1}\int_{x=0}^{1}\int_{y=0}^{1}\int_{z=0}^{1-x} (w + y)\,dz\,dy\,dx\,dw$.
Solution
The first integral, with respect to z gives
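\[
\int_0^{1-x} (w + y)\,dz = \Big[(w + y)z\Big]_0^{1-x} = (w + y)(1 - x) - 0 = (w + y)(1 - x).
\]
The second integral, with respect to y gives
\[
\int_0^1 (w + y)(1 - x)\,dy = \left[\left(wy + \tfrac{1}{2}y^2\right)(1 - x)\right]_0^1 = \left(w + \tfrac{1}{2}\right)(1 - x) - 0 = \left(w + \tfrac{1}{2}\right)(1 - x).
\]
The third integral, with respect to x gives
\[
\int_0^1 \left(w + \tfrac{1}{2}\right)(1 - x)\,dx = \left[\left(w + \tfrac{1}{2}\right)\left(x - \frac{x^2}{2}\right)\right]_0^1 = \left(w + \tfrac{1}{2}\right)\tfrac{1}{2} - 0 = \tfrac{1}{2}\left(w + \tfrac{1}{2}\right) = \tfrac{1}{2}w + \tfrac{1}{4}.
\]
Finally, integrating with respect to w gives
\[
\int_0^1 \left(\tfrac{1}{2}w + \tfrac{1}{4}\right) dw = \left[\tfrac{1}{4}w^2 + \tfrac{1}{4}w\right]_0^1 = \tfrac{1}{4} + \tfrac{1}{4} - 0 = \tfrac{1}{2}
\]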
Exercise
Evaluate the quadruple integral $\int_0^1\int_{-1}^{1}\int_{-1}^{1}\int_0^{1-y^2} (x + y^2)\,dz\,dy\,dx\,dw$.
Answer
$\frac{8}{15}$
4. Applications of triple and higher integrals
The integral $\iiint f(x, y, z)\,dz\,dy\,dx$ (or $\int_V f(x, y, z)\,dV$) may represent many physical quantities depending on the function f(x, y, z) and the limits used.
Volume
The integral $\int_V 1\,dV$ (i.e. the integral of the function f(x, y, z) = 1) with appropriate limits gives the volume of the solid described by V. This is sometimes more convenient than finding the volume by means of a double integral.
Mass
The integral $\iiint \rho(x, y, z)\,dz\,dy\,dx$ (or $\int_V \rho(x, y, z)\,dV$), with appropriate limits, gives the mass of the solid bounded by V.
Mass of water in a reservoir

The introduction to this Section concerned the mass of water in a reservoir. Imagine that the reservoir
is rectangular in profile and that the width along the dam (i.e. measured in the x direction) is 100 m.
Imagine also that the length of the reservoir (measured away from the dam i.e. in the y direction) is
400 m. The depth of the reservoir is given by 40 − y/10 m i.e. the reservoir is 40 m deep along the
dam and the depth reduces to zero at the end away from the dam.
The density of the water can be approximated by ρ(z) = a − b × z where a = 998 kg m$^{-3}$ and b = 0.05 kg m$^{-4}$, i.e. at the surface (z = 0) the water has density 998 kg m$^{-3}$ (corresponding to a temperature of 20 °C) while 40 m down, i.e. at z = −40, the water has a density of 1000 kg m$^{-3}$ (corresponding to the lower temperature of 4 °C).
Figure 27: the reservoir behind the dam, extending 100 m in the x direction and 400 m in the y direction, with the bed at z = −40 at the dam
The mass of water in the reservoir is given by the integral of the function ρ(z) = a − b × z. For
each value of x and y, the limits on z will be from y/10 − 40 (bottom) to 0 (top). Limits on y will
be 0 to 400 m while the limits of x will be 0 to 100 m. The mass of water is therefore given by the
integral
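\[
M = \int_0^{100}\int_0^{400}\int_{y/10-40}^{0} (a - bz)\,dz\,dy\,dx
\]
which evaluates as follows:
\[
\begin{aligned}
M &= \int_0^{100}\int_0^{400}\left[az - \frac{b}{2}z^2\right]_{y/10-40}^{0} dy\,dx \\
&= \int_0^{100}\int_0^{400}\left[0 - a\left(\frac{y}{10} - 40\right) + \frac{b}{2}\left(\frac{y}{10} - 40\right)^2\right] dy\,dx \\
&= \int_0^{100}\int_0^{400}\left[40a - \frac{a}{10}y + \frac{b}{200}y^2 - 4by + 800b\right] dy\,dx \\
&= \int_0^{100}\left[40ay - \frac{a}{20}y^2 + \frac{b}{600}y^3 - 2by^2 + 800by\right]_0^{400} dx \\
&= \int_0^{100}\left[16000a - 8000a + \frac{320000}{3}b - 320000b + 320000b\right] dx \\
&= \int_0^{100}\left[8000a + \frac{320000}{3}b\right] dx \\
&= 8\times 10^5 a + \frac{3.2}{3}\times 10^7 b = 7.984\times 10^8 + \frac{0.16}{3}\times 10^7 = 7.989\times 10^8\ \text{kg}
\end{aligned}
\]
So the mass of water in the reservoir is 7.989 × 10⁸ kg.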

Notes :
1. In practice, the profile of the reservoir would not be rectangular and the depth would not vary
so smoothly.
2. The variation of the density of water with height is only a minor factor so it would only be taken into account when a very exact answer was required. Assuming that the water had a uniform density of ρ = 998 kg m$^{-3}$ would give a total mass of 7.984 × 10⁸ kg while assuming a uniform density of ρ = 1000 kg m$^{-3}$ gives a total mass of 8 × 10⁸ kg.
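The reservoir mass can be checked by summing density × volume over small elements. The sketch below is illustrative only: the function name `reservoir_mass` and the midpoint rule are assumptions, while the constants a and b and the geometry are those given in the text. Since nothing depends on x, the x integration is replaced by a factor of 100.

```python
A_DENSITY = 998.0   # kg m^-3, surface density a from the text
B_DENSITY = 0.05    # kg m^-4, density gradient b from the text

def reservoir_mass(n=400):
    """Midpoint sums over y and z; the x integration contributes a factor of 100."""
    dy = 400.0 / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * dy
        z_bottom = y / 10.0 - 40.0          # bed of the reservoir at this y
        dz = (0.0 - z_bottom) / n
        for k in range(n):
            z = z_bottom + (k + 0.5) * dz
            total += (A_DENSITY - B_DENSITY * z) * dz * dy
    return 100.0 * total

# closed form from the worked solution: 8e5 * a + (3.2e7 / 3) * b
closed_form = 8e5 * A_DENSITY + (3.2e7 / 3) * B_DENSITY
print(reservoir_mass(), closed_form)
```

Both values should be close to 7.989 × 10⁸ kg. Setting `B_DENSITY` to zero gives the uniform-density figure of 7.984 × 10⁸ kg mentioned in Note 2.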
Centre of mass
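The expressions for the centre of mass $(\bar{x}, \bar{y}, \bar{z})$ of a solid of density ρ(x, y, z) are given below
\[
\bar{x} = \frac{\int \rho(x, y, z)\,x\,dV}{\int \rho(x, y, z)\,dV} \qquad
\bar{y} = \frac{\int \rho(x, y, z)\,y\,dV}{\int \rho(x, y, z)\,dV} \qquad
\bar{z} = \frac{\int \rho(x, y, z)\,z\,dV}{\int \rho(x, y, z)\,dV}
\]
In the (fairly common) case where the density ρ does not vary with position and is constant, these results simplify to
\[
\bar{x} = \frac{\int x\,dV}{\int dV} \qquad
\bar{y} = \frac{\int y\,dV}{\int dV} \qquad
\bar{z} = \frac{\int z\,dV}{\int dV}
\]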
Example 22
A tetrahedron is enclosed by the planes x = 0, y = 0, z = 0 and x + y + z = 4.
Find (a) the volume of this tetrahedron, (b) the position of the centre of mass.
Solution
(a) Note that this tetrahedron was considered in Example 18, see Figure 24. It was shown that in this case the volume integral $\int_V f(x, y, z)\,dV$ becomes $\int_{x=0}^{4}\int_{y=0}^{4-x}\int_{z=0}^{4-x-y} f(x, y, z)\,dz\,dy\,dx$. The volume is given by
\[
\begin{aligned}
V &= \int_V dV = \int_{x=0}^{4}\int_{y=0}^{4-x}\int_{z=0}^{4-x-y} dz\,dy\,dx \\
&= \int_{x=0}^{4}\int_{y=0}^{4-x}\Big[z\Big]_{z=0}^{4-x-y} dy\,dx \\
&= \int_{x=0}^{4}\int_{y=0}^{4-x} (4 - x - y)\,dy\,dx \\
&= \int_{x=0}^{4}\left[4y - xy - \tfrac{1}{2}y^2\right]_{y=0}^{4-x} dx \\
&= \int_{x=0}^{4}\left(8 - 4x + \tfrac{1}{2}x^2\right) dx \\
&= \left[8x - 2x^2 + \tfrac{1}{6}x^3\right]_0^4 = 32 - 32 + \frac{64}{6} = \frac{32}{3}
\end{aligned}
\]
Thus the volume of the tetrahedron is $\frac{32}{3} \approx 10.7$.
(b) The x coordinate of the centre of mass i.e. $\bar{x}$ is given by $\bar{x} = \dfrac{\int x\,dV}{\int dV}$.
The denominator $\int dV$ is the formula for the volume i.e. $\frac{32}{3}$ while the numerator $\int x\,dV$ was calculated in an earlier Example to be $\frac{32}{3}$.
Thus $\bar{x} = \dfrac{\int x\,dV}{\int dV} = \dfrac{32/3}{32/3} = 1$.
By symmetry (or by evaluating relevant integrals), it can be shown that $\bar{y} = \bar{z} = 1$ i.e. the centre of mass is at (1, 1, 1).
Moment of inertia
The moment of inertia I of a particle of mass M about an axis PQ is defined as
I = Mass × Distance² or I = M d²
where d is the perpendicular distance from the particle to the axis.
To find the moment of inertia of a larger object, it is necessary to carry out a volume integration over all such particles. The distance of a particle at (x, y, z) from the z-axis is given by $\sqrt{x^2 + y^2}$ so the moment of inertia of an object about the z-axis is given by
\[
I_z = \int_V \rho(x, y, z)(x^2 + y^2)\,dV
\]
Similarly, the moments of inertia about the x-axis and y-axis are given by
\[
I_x = \int_V \rho(x, y, z)(y^2 + z^2)\,dV \quad\text{and}\quad I_y = \int_V \rho(x, y, z)(x^2 + z^2)\,dV
\]
In the case where the density is constant over the object, so ρ(x, y, z) = ρ, these formulae reduce to
\[
I_x = \rho\int_V (y^2 + z^2)\,dV, \quad I_y = \rho\int_V (x^2 + z^2)\,dV \quad\text{and}\quad I_z = \rho\int_V (x^2 + y^2)\,dV
\]
When possible, the moment of inertia is expressed in terms of M, the mass of the object.
Example 23
Find the moment of inertia (about the x-axis) of the cube of side 1, mass M and
density ρ shown in Example 16, page 43.
Solution
For the cube,
Mass = Volume × Density i.e. M = 1³ × ρ = ρ
The moment of inertia (about the x-axis) is given by
\[
I_x = \rho\int_V (y^2 + z^2)\,dV = \rho\int_0^1\int_0^1\int_0^1 (y^2 + z^2)\,dz\,dy\,dx
\]
This integral was shown to equal $\frac{2}{3}$ in Example 16. Thus
\[
I_x = \tfrac{2}{3}\rho = \tfrac{2}{3}M
\]
By applying symmetry, it can also be shown that the moments of inertia about the y- and z-axes are also equal to $\frac{2}{3}M$.
Radioactive decay
Introduction
A cube of an impure radioactive ore is of side 10 cm. The number of radioactive decays taking place per cubic metre per second is given by $R = 10^{23}(0.1 - z)e^{-t/1000}$. The dependence on time represents a half-life of 693 seconds while the dependence on the vertical coordinate z represents some gravitational stratification. The value z = 0 represents the bottom of the cube and z = 0.1 represents the top of the cube. (Note that the dimensions are in metres so 10 cm becomes 0.1 m.)
What is the total number of decays taking place over the cube in the 100 seconds between t = 0
and t = 100?
Solution
The total number of decays is given by the quadruple integral
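\[
N = \int_{x=0}^{0.1}\int_{y=0}^{0.1}\int_{z=0}^{0.1}\int_{t=0}^{100} 10^{23}(0.1 - z)e^{-t/1000}\,dt\,dz\,dy\,dx
\]
which may be evaluated as follows
\[
\begin{aligned}
N &= \int_{x=0}^{0.1}\int_{y=0}^{0.1}\int_{z=0}^{0.1}\int_{t=0}^{100} 10^{23}(0.1 - z)e^{-t/1000}\,dt\,dz\,dy\,dx \\
&= \int_{x=0}^{0.1}\int_{y=0}^{0.1}\int_{z=0}^{0.1}\left[-1000\times 10^{23}(0.1 - z)e^{-t/1000}\right]_{t=0}^{100} dz\,dy\,dx \\
&= \int_{x=0}^{0.1}\int_{y=0}^{0.1}\int_{z=0}^{0.1} 10^{26}(0.1 - z)(1 - e^{-0.1})\,dz\,dy\,dx \\
&= \int_{x=0}^{0.1}\int_{y=0}^{0.1}\int_{z=0}^{0.1} 9.5\times 10^{24}(0.1 - z)\,dz\,dy\,dx \\
&= \int_{x=0}^{0.1}\int_{y=0}^{0.1} 9.5\times 10^{24}\left[0.1z - 0.5z^2\right]_{z=0}^{0.1} dy\,dx \\
&= \int_{x=0}^{0.1}\int_{y=0}^{0.1} 9.5\times 10^{24}\,[0.005]\,dy\,dx \\
&= 0.005\times 9.5\times 10^{24}\int_{x=0}^{0.1}\int_{y=0}^{0.1} dy\,dx \\
&= 0.005\times 9.5\times 10^{24}\times 0.1\times 0.1 = 4.75\times 10^{20}
\end{aligned}
\]
Thus the number of decays is approximately equal to 4.75 × 10²⁰.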
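Because the integrand is a product of a function of t and a function of z, and all limits are constants, the quadruple integral factorises into four single integrals. The sketch below uses that observation (the variable names are assumptions made for illustration) to evaluate N without rounding 1 − e^(−0.1) to two significant figures as the worked solution does.

```python
import math

# integral of e^(-t/1000) from t = 0 to t = 100
time_factor = 1000.0 * (1.0 - math.exp(-0.1))
# integral of (0.1 - z) from z = 0 to z = 0.1
z_factor = 0.1 * 0.1 - 0.5 * 0.1 ** 2
# integrals over x and y, each of length 0.1 m
xy_factor = 0.1 * 0.1

decays = 1e23 * time_factor * z_factor * xy_factor
print(decays)
```

The result is slightly above the 4.75 × 10²⁰ quoted in the solution, the small difference coming from the rounding of 10²⁶(1 − e^(−0.1)) to 9.5 × 10²⁴.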
Task
For the solid prism shown below (the subject of the Task on page 50) find
(a) the coordinates of the centre of mass
(b) the moment of inertia about the x- , y- and z-axes.
[Diagram: the prism, with sloping face y + z = 1 meeting the axes at y = 1 and z = 1, and end face x = 3]
Your solution
(a)
Answer
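The x, y and z coordinates of the centre of mass of a solid of constant density are given on page 55 by
\[
\bar{x} = \frac{\int x\,dV}{\int dV} \qquad \bar{y} = \frac{\int y\,dV}{\int dV} \qquad \bar{z} = \frac{\int z\,dV}{\int dV}
\]
For the triangular prism, the task on page 50 showed that the denominator $\int dV$ has value 1.5. The numerator of the expression for $\bar{x}$ is given by
\[
\begin{aligned}
\int x\,dV &= \int_{z=0}^{1}\int_{y=0}^{1-z}\int_{x=0}^{3} x\,dx\,dy\,dz = \int_{z=0}^{1}\int_{y=0}^{1-z}\left[\frac{x^2}{2}\right]_0^3 dy\,dz = \int_{z=0}^{1}\int_{y=0}^{1-z}\frac{9}{2}\,dy\,dz \\
&= \int_{z=0}^{1}\left[\frac{9}{2}y\right]_{y=0}^{1-z} dz = \int_{z=0}^{1}\left(\frac{9}{2}(1 - z) - 0\right) dz = \int_{z=0}^{1}\left(\frac{9}{2} - \frac{9}{2}z\right) dz \\
&= \left[\frac{9}{2}z - \frac{9}{4}z^2\right]_0^1 = \frac{9}{2} - \frac{9}{4} - (0 - 0) = \frac{9}{4} = 2.25
\end{aligned}
\]
So, $\bar{x} = \dfrac{2.25}{1.5} = 1.5$. By similar integration it can be shown that $\bar{y} = \frac{1}{3}$, $\bar{z} = \frac{1}{3}$.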
Your solution
(b)
Answer
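The moment of inertia about the x-axis, $I_x$, is given by $I_x = \rho\int_V (y^2 + z^2)\,dV$ which for the solid under consideration is given by
\[
\begin{aligned}
I_x &= \rho\int_{x=0}^{3}\int_{y=0}^{1}\int_{z=0}^{1-y} (y^2 + z^2)\,dz\,dy\,dx = \rho\int_{x=0}^{3}\int_{y=0}^{1}\left(y^2 - y^3 + \frac{(1 - y)^3}{3}\right) dy\,dx \\
&= \rho\int_{x=0}^{3}\frac{1}{6}\,dx = \frac{1}{2}\rho
\end{aligned}
\]
Now, the mass M of the solid is given by M = ρ × Volume = $\frac{3}{2}\rho$ (where the volume had been calculated in a previous example) so
\[
I_x = \frac{1}{2}\rho = \frac{1}{2}\rho\times\frac{M}{\frac{3}{2}\rho} = \frac{1}{3}M
\]
Similarly, the moment of inertia about the y-axis, $I_y$, is given by $I_y = \rho\int_V (x^2 + z^2)\,dV$ which for the solid under consideration is given by
\[
\begin{aligned}
I_y &= \rho\int_{x=0}^{3}\int_{y=0}^{1}\int_{z=0}^{1-y} (x^2 + z^2)\,dz\,dy\,dx = \rho\int_{x=0}^{3}\int_{y=0}^{1}\left(x^2(1 - y) + \frac{(1 - y)^3}{3}\right) dy\,dx \\
&= \rho\int_{x=0}^{3}\left(\frac{1}{2}x^2 + \frac{1}{12}\right) dx = \frac{19}{4}\rho
\end{aligned}
\]
and so $I_y = \dfrac{19}{4}\rho = \dfrac{19}{4}\rho\times\dfrac{M}{\frac{3}{2}\rho} = \dfrac{19}{6}M$. Finally, by symmetry, $I_z = I_y = \dfrac{19}{6}M$.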
Exercise
For the solid shown below (the subject of the Task on page 47) find the centre of mass and the
moment of inertia about the x-, y- and z-axes.
[Diagram: the solid, with top face z = 6 and curved face y = 4 − x², meeting the axes at x = 2, y = 4 and z = 6]
Answer
$(\bar{x}, \bar{y}, \bar{z}) = (0.75, 1.6, 3)$, $I_x = 15.66M$, $I_y = 12.8M$, $I_z = 4.46M$
Task
A cube of side 2 is made of laminated material so that, with the origin at one
corner, the density of the material is kx.
(a) First find the mass M of the cube:
Your solution
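Answer
The integrations over the cube are of the form $\int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2}\,dV$.
The mass M is given by
\[
\begin{aligned}
M &= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} \rho\,dz\,dy\,dx \\
&= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} kx\,dz\,dy\,dx \\
&= \int_{x=0}^{2}\int_{y=0}^{2} 2kx\,dy\,dx = \int_{x=0}^{2} 4kx\,dx = \Big[2kx^2\Big]_0^2 = 8k
\end{aligned}
\]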
(b) Now find the position of the centre of mass of the cube:
Your solution
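Answer
The x-coordinate of the centre of mass will be given by $\dfrac{\int \rho x\,dV}{M}$ where the numerator is given by
\[
\begin{aligned}
\int \rho x\,dV &= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} \rho x\,dz\,dy\,dx = \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} kx^2\,dz\,dy\,dx \\
&= \int_{x=0}^{2}\int_{y=0}^{2} 2kx^2\,dy\,dx = \int_{x=0}^{2} 4kx^2\,dx = \left[\frac{4}{3}kx^3\right]_0^2 = \frac{32}{3}k
\end{aligned}
\]
So $\bar{x} = \dfrac{\frac{32}{3}k}{8k} = \dfrac{4}{3}$.
The y-coordinate of the centre of mass is given by $\dfrac{\int \rho y\,dV}{M}$ where the numerator is given by
\[
\begin{aligned}
\int \rho y\,dV &= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} \rho y\,dz\,dy\,dx = \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} kxy\,dz\,dy\,dx \\
&= \int_{x=0}^{2}\int_{y=0}^{2} 2kxy\,dy\,dx = \int_{x=0}^{2} 4kx\,dx = \Big[2kx^2\Big]_0^2 = 8k
\end{aligned}
\]
So $\bar{y} = \dfrac{8k}{8k} = 1$.
By symmetry (the density depends only on x), $\bar{z} = \bar{y} = 1$.
The coordinates of the centre of mass are $\left(\frac{4}{3}, 1, 1\right)$.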
(c) Finally find the moments of inertia about the x-, y- and z-axes:
Your solution
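Answer
The moment of inertia about the x-axis is given by $I_x = \int_V \rho\,(y^2 + z^2)\, dV$ (page 58). In this case,
\[
\begin{aligned}
I_x &= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} kx(y^2 + z^2)\, dz\,dy\,dx \\
    &= \int_{x=0}^{2}\int_{y=0}^{2} kx\left[ y^2 z + \frac{1}{3}z^3 \right]_{z=0}^{2} dy\,dx
     = \int_{x=0}^{2}\int_{y=0}^{2} kx\left( 2y^2 + \frac{8}{3} \right) dy\,dx \\
    &= \int_{x=0}^{2} kx\left[ \frac{2}{3}y^3 + \frac{8}{3}y \right]_{y=0}^{2} dx
     = \int_{x=0}^{2} kx\left( \frac{32}{3} \right) dx \\
    &= \left[ \frac{16}{3}kx^2 \right]_0^2 = \frac{64}{3}k = \frac{8}{3}M
\end{aligned}
\]
where the last step involves substituting that the mass $M = 8k$.
Similarly, the moment of inertia about the y-axis is given by $I_y = \int_V \rho\,(x^2 + z^2)\, dV$, i.e.
\[
\begin{aligned}
I_y &= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} kx(x^2 + z^2)\, dz\,dy\,dx \\
    &= \int_{x=0}^{2}\int_{y=0}^{2}\int_{z=0}^{2} k(x^3 + xz^2)\, dz\,dy\,dx
     = \int_{x=0}^{2}\int_{y=0}^{2} k\left[ x^3 z + \frac{1}{3}xz^3 \right]_{z=0}^{2} dy\,dx \\
    &= \int_{x=0}^{2}\int_{y=0}^{2} k\left( 2x^3 + \frac{8}{3}x \right) dy\,dx
     = \int_{x=0}^{2} k\left( 4x^3 + \frac{16}{3}x \right) dx \\
    &= k\left[ x^4 + \frac{8}{3}x^2 \right]_{x=0}^{2} = k\left( 16 + \frac{32}{3} \right) = \frac{80}{3}k = \frac{10}{3}M
\end{aligned}
\]
By symmetry, $I_z = I_y = \frac{10}{3}M$.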
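The results of this Task can be checked with a short numerical sketch. It assumes $k = 1$ (the ratios do not depend on $k$) and uses a midpoint rule on a uniform grid; the helper name `cube_integral` is an illustrative choice.

```python
def cube_integral(f, n=20, side=2.0):
    """Midpoint-rule estimate of the integral of f over the cube [0, side]^3."""
    h = side / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(f(x, y, z) for x in pts for y in pts for z in pts) * h ** 3


k = 1.0  # assumed value of the density constant


def density(x, y, z):
    return k * x


M = cube_integral(density)
x_bar = cube_integral(lambda x, y, z: density(x, y, z) * x) / M
y_bar = cube_integral(lambda x, y, z: density(x, y, z) * y) / M
Ix = cube_integral(lambda x, y, z: density(x, y, z) * (y * y + z * z))
Iy = cube_integral(lambda x, y, z: density(x, y, z) * (x * x + z * z))

# Compare with M = 8k, centre of mass (4/3, 1, 1), Ix = 8M/3 and Iy = 10M/3.
print(M, x_bar, y_bar, Ix / M, Iy / M)
```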

Changing Coordinates 27.4

Introduction
We have seen how changing the variable of integration of a single integral or changing the
coordinate system for multiple integrals can make integrals easier to evaluate. In this Section we
introduce the Jacobian. The Jacobian gives a general method for transforming the coordinates of
any multiple integral.
Prerequisites
Before starting this Section you should . . .
• be familiar with the concept of a function of several variables
• be able to evaluate the determinant of a matrix

Learning Outcomes
On completion you should be able to . . .
• decide which coordinate transformation simplifies an integral
• determine the Jacobian for a coordinate transformation
• evaluate multiple integrals using a transformation of coordinates
1. Changing variables in multiple integrals

When the method of substitution is used to solve an integral of the form $\int_a^b f(x)\, dx$, three parts
of the integral are changed: the limits, the function and the infinitesimal $dx$. So if the substitution
is of the form $x = x(u)$, the $u$ limits, $c$ and $d$, are found by solving $a = x(c)$ and $b = x(d)$, and the
function is expressed in terms of $u$ as $f(x(u))$.
[Figure 28: x(u) plotted against u, with increments δu and the corresponding increments δx marked]
Figure 28 shows why the $dx$ needs to be changed. While the $\delta u$ is the same length for all $u$, the $\delta x$
change as $u$ changes. The rate at which they change is precisely $\frac{d}{du}x(u)$. This gives the relation
\[ \delta x = \frac{dx}{du}\,\delta u \]
Hence the transformed integral can be written as
\[ \int_a^b f(x)\, dx = \int_c^d f(x(u))\,\frac{dx}{du}\, du \]
Here the $\frac{dx}{du}$ is playing the part of the Jacobian that we will define.
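To see the three changes (limits, function and $dx$) working together, here is a minimal numerical sketch. The integrand $\sqrt{x}$ on $1 \le x \le 4$ and the substitution $x = u^2$ are assumptions chosen purely for illustration; neither appears in the workbook.

```python
import math


def midpoint(f, a, b, n=2000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


# Original integral: f(x) = sqrt(x) for 1 <= x <= 4.
lhs = midpoint(math.sqrt, 1.0, 4.0)

# Substitution x = u**2, so dx/du = 2u.
# New limits: 1 = x(c) gives c = 1 and 4 = x(d) gives d = 2.
rhs = midpoint(lambda u: math.sqrt(u ** 2) * 2 * u, 1.0, 2.0)

# Both estimates should be close to the exact value 14/3.
print(lhs, rhs)
```

Leaving out the factor `2 * u` in the second integrand gives a different value, which is the point Figure 28 makes geometrically.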
Another change of coordinates that you have seen is the transformation from cartesian coordinates
$(x, y)$ to polar coordinates $(r, \theta)$.
Recall that a double integral in polar coordinates is expressed as
\[ \iint f(x, y)\, dx\,dy = \iint g(r, \theta)\, r\, dr\,d\theta \]
[Figure 29: two area elements A and B in polar coordinates, each of radial width δr and angular width δθ, at radii r1 and r2]
We can see from Figure 29 that the area elements change in size as $r$ increases. The circumference
of a circle of radius $r$ is $2\pi r$, so the length of an arc spanned by an angle $\theta$ is $2\pi r \times \dfrac{\theta}{2\pi} = r\theta$. Hence
the area elements in polar coordinates are approximated by rectangles of width $\delta r$ and length $r\,\delta\theta$.
Thus under the transformation from cartesian to polar coordinates we have the relation
\[ \delta x\,\delta y \to r\,\delta r\,\delta\theta \]
that is, $r\,\delta r\,\delta\theta$ plays the same role as $\delta x\,\delta y$. This is why the $r$ term appears in the integrand. Here $r$
is playing the part of the Jacobian.
2. The Jacobian
Given an integral of the form $\iint_A f(x, y)\, dx\,dy$, assume we have a change of variables of the form
$x = x(u, v)$ and $y = y(u, v)$. Then the Jacobian of the transformation is defined as
\[
J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}
\]

Key Point 10
Jacobian in Two Variables
For given transformations $x = x(u, v)$ and $y = y(u, v)$ the Jacobian is
\[
J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}
\]
Notice the pattern occurring in the x, y, u and v. Across a row of the determinant the numerators
are the same and down a column the denominators are the same.
Notation
Different textbooks use different notation for the Jacobian. The following are equivalent.
\[ J(u, v) = J(x, y; u, v) = J\!\left( \frac{x, y}{u, v} \right) = \frac{\partial(x, y)}{\partial(u, v)} \]
The Jacobian correctly describes how area elements change under such a transformation. The required
relationship is
\[ dx\,dy \to |J(u, v)|\, du\,dv \]
that is, $|J(u, v)|\, du\,dv$ plays the role of $dx\,dy$.
Key Point 11
Jacobian for Transforming Areas
When transforming area elements employing the Jacobian it is the modulus of the Jacobian that
must be used.
Example 24
Find the area of the circle of radius R.
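[Figure 30: circle of radius R centred at the origin, with r running from 0 to R and θ from 0 to 2π]
Solution
Let A be the region bounded by a circle of radius R centred at the origin. Then the area of this
region is $\int_A dA$. We will calculate this area by changing to polar coordinates, so consider the usual
transformation $x = r\cos\theta$, $y = r\sin\theta$ from cartesian to polar coordinates. First we require all the
partial derivatives
\[ \frac{\partial x}{\partial r} = \cos\theta \qquad \frac{\partial y}{\partial r} = \sin\theta \qquad \frac{\partial x}{\partial \theta} = -r\sin\theta \qquad \frac{\partial y}{\partial \theta} = r\cos\theta \]
Thus
\[
\begin{aligned}
J(r, \theta) &= \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix}
 = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix}
 = \cos\theta \times r\cos\theta - (-r\sin\theta) \times \sin\theta \\
 &= r\left( \cos^2\theta + \sin^2\theta \right) = r
\end{aligned}
\]
This confirms the previous result for polar coordinates, $dx\,dy \to r\,dr\,d\theta$. The limits on $r$ are $r = 0$
(centre) to $r = R$ (edge). The limits on $\theta$ are $\theta = 0$ to $\theta = 2\pi$, i.e. starting to the right and going
once round anticlockwise. The required area is
\[ \int_A dA = \int_0^{2\pi}\!\int_0^R |J(r, \theta)|\, dr\,d\theta = \int_0^{2\pi}\!\int_0^R r\, dr\,d\theta = 2\pi\,\frac{R^2}{2} = \pi R^2 \]
Note that here $r > 0$ so $|J(r, \theta)| = J(r, \theta) = r$.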
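The Jacobian of a two-variable transformation can also be estimated numerically, which gives an independent check on a hand calculation. The sketch below uses central differences; the helper name `jacobian_2d`, the step size and the radius R = 3 are assumptions made for the illustration.

```python
import math


def jacobian_2d(x, y, u, v, eps=1e-6):
    """Central-difference estimate of the 2x2 Jacobian determinant
    (dx/du)(dy/dv) - (dx/dv)(dy/du) at the point (u, v)."""
    xu = (x(u + eps, v) - x(u - eps, v)) / (2 * eps)
    xv = (x(u, v + eps) - x(u, v - eps)) / (2 * eps)
    yu = (y(u + eps, v) - y(u - eps, v)) / (2 * eps)
    yv = (y(u, v + eps) - y(u, v - eps)) / (2 * eps)
    return xu * yv - xv * yu


def x_polar(r, theta):
    return r * math.cos(theta)


def y_polar(r, theta):
    return r * math.sin(theta)


# At r = 2 the Jacobian should be close to 2, whatever the angle.
J_sample = jacobian_2d(x_polar, y_polar, 2.0, 0.7)

# Area of a circle of radius R: sum |J| dr dtheta over a midpoint grid.
R = 3.0  # example radius
n = 60
hr, ht = R / n, 2 * math.pi / n
area = sum(
    abs(jacobian_2d(x_polar, y_polar, (i + 0.5) * hr, (j + 0.5) * ht)) * hr * ht
    for i in range(n)
    for j in range(n)
)
print(J_sample, area, math.pi * R ** 2)
```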
Example 25
The diamond shaped region A in Figure 31(a) is bounded by the lines x + 2y = 2,
x − 2y = 2, x + 2y = −2 and x − 2y = −2. We wish to evaluate the integral
\[ I = \iint_A (3x + 6y)^2\, dA \]
over this region. Since the region A is neither vertically nor horizontally simple,
evaluating I without changing coordinates would require separating the region into
two simple triangular regions. So we use a change of coordinates to transform A
to a square region in Figure 31(b) and evaluate I.
[Figure 31: (a) the diamond-shaped region A in the xy-plane, bounded by x + 2y = ±2 and x − 2y = ±2; (b) the square region A′ in the uv-plane, bounded by u = ±2 and v = ±2]
Solution
By considering the equations of the boundary lines of region A it is easy to see that the change of
coordinates
\[ u = x + 2y \quad (1) \qquad\qquad v = x - 2y \quad (2) \]
will transform the boundary lines to u = 2, u = −2, v = 2 and v = −2. These values of u and v
are the new limits of integration. The region A will be transformed to the square region A′ shown
above.
We require the inverse transformations so that we can substitute for x and y in terms of u and v.
By adding (1) and (2) we obtain u + v = 2x and by subtracting (2) from (1) we obtain u − v = 4y,
thus the required change of coordinates is
\[ x = \frac{1}{2}(u + v) \qquad y = \frac{1}{4}(u - v) \]
Substituting for x and y in the integrand $(3x + 6y)^2$ of I gives
\[ \left( \frac{3}{2}(u + v) + \frac{6}{4}(u - v) \right)^2 = (3u)^2 = 9u^2 \]
We have the new limits of integration and the new form of the integrand; we now require the
Jacobian. The required partial derivatives are
\[ \frac{\partial x}{\partial u} = \frac{1}{2} \qquad \frac{\partial x}{\partial v} = \frac{1}{2} \qquad \frac{\partial y}{\partial u} = \frac{1}{4} \qquad \frac{\partial y}{\partial v} = -\frac{1}{4} \]
Then the Jacobian is
\[ J(u, v) = \begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\[1ex] \frac{1}{4} & -\frac{1}{4} \end{vmatrix} = -\frac{1}{4} \]
Then the area element transforms as $dA = dx\,dy \to |J(u, v)|\, du\,dv = \frac{1}{4}\, du\,dv$. Using the new limits, integrand and the Jacobian, the integral
can be written
\[ I = \int_{-2}^{2}\!\int_{-2}^{2} \frac{9}{4}u^2\, du\,dv. \]
You should evaluate this integral and check that I = 48.
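The value I = 48 can be confirmed numerically in both coordinate systems. This is a sketch under stated assumptions: it describes the diamond as $|x| + 2|y| \le 2$ (equivalent to the four boundary lines), uses a midpoint rule, and the helper name `midpoint_2d` is illustrative.

```python
def midpoint_2d(f, a, b, lower, upper, n=200):
    """Midpoint-rule estimate of the integral of f(s, t) over
    a <= s <= b and lower(s) <= t <= upper(s)."""
    hs = (b - a) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * hs
        t0, t1 = lower(s), upper(s)
        ht = (t1 - t0) / n
        for j in range(n):
            t = t0 + (j + 0.5) * ht
            total += f(s, t) * hs * ht
    return total


# Direct integration over the diamond |x| + 2|y| <= 2 in the xy-plane.
direct = midpoint_2d(
    lambda x, y: (3 * x + 6 * y) ** 2,
    -2.0, 2.0,
    lambda x: -(2 - abs(x)) / 2,
    lambda x: (2 - abs(x)) / 2,
)

# Transformed integral over the square -2 <= u, v <= 2, including |J| = 1/4.
transformed = midpoint_2d(
    lambda u, v: 9 * u * u * 0.25,
    -2.0, 2.0,
    lambda u: -2.0,
    lambda u: 2.0,
)
print(direct, transformed)  # both should be close to 48
```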
Task
This Task concerns using a transformation to evaluate $\iint (x^2 + y^2)\, dx\,dy$.
(a) Given the transformations u = x + y, v = x − y express x and y in terms of u and v to find the
inverse transformations:
Your solution
Answer
$u = x + y$ (1)
$v = x - y$ (2)
Add equations (1) and (2): $u + v = 2x$
Subtract equation (2) from equation (1): $u - v = 2y$
So $x = \frac{1}{2}(u + v)$, $y = \frac{1}{2}(u - v)$
(b) Find the Jacobian J(u, v) for the transformation in part (a):
Your solution
Answer
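Evaluating the partial derivatives, $\dfrac{\partial x}{\partial u} = \dfrac{1}{2}$, $\dfrac{\partial x}{\partial v} = \dfrac{1}{2}$, $\dfrac{\partial y}{\partial u} = \dfrac{1}{2}$ and $\dfrac{\partial y}{\partial v} = -\dfrac{1}{2}$, so the Jacobian is
\[
\begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}
= \begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\[1ex] \frac{1}{2} & -\frac{1}{2} \end{vmatrix}
= -\frac{1}{4} - \frac{1}{4} = -\frac{1}{2}
\]
(c) Express the integral $I = \iint \left( x^2 + y^2 \right) dx\,dy$ in terms of u and v, using the transformations
introduced in (a) and the Jacobian found in (b):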
Your solution
Answer
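On letting $x = \frac{1}{2}(u + v)$, $y = \frac{1}{2}(u - v)$ and $dx\,dy = |J|\, du\,dv = \frac{1}{2}\, du\,dv$, the integral
$\iint \left( x^2 + y^2 \right) dx\,dy$ becomes
\[
\begin{aligned}
I &= \iint \left( \frac{1}{4}(u + v)^2 + \frac{1}{4}(u - v)^2 \right) \times \frac{1}{2}\, du\,dv \\
  &= \iint \frac{1}{2}\left( u^2 + v^2 \right) \times \frac{1}{2}\, du\,dv \\
  &= \iint \frac{1}{4}\left( u^2 + v^2 \right) du\,dv
\end{aligned}
\]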
(d) Find the limits on u and v for the rectangle with vertices (x, y) = (0, 0), (2, 2), (−1, 5), (−3, 3):
Your solution
Answer
For (0, 0), u = 0 and v = 0
For (2, 2), u = 4 and v = 0
For (−1, 5), u = 4 and v = −6
For (−3, 3), u = 0 and v = −6
Thus, the limits on u are u = 0 to u = 4 while the limits on v are v = −6 to v = 0.
(e) Finally evaluate I:
Your solution
Answer
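The integral is
\[
\begin{aligned}
I &= \int_{v=-6}^{0}\int_{u=0}^{4} \frac{1}{4}\left( u^2 + v^2 \right) du\,dv \\
  &= \frac{1}{4}\int_{v=-6}^{0} \left[ \frac{1}{3}u^3 + uv^2 \right]_{u=0}^{4} dv = \int_{v=-6}^{0} \left( \frac{16}{3} + v^2 \right) dv \\
  &= \left[ \frac{16}{3}v + \frac{1}{3}v^3 \right]_{-6}^{0} = 0 - \left( \frac{16}{3} \times (-6) + \frac{1}{3} \times (-216) \right) = 104
\end{aligned}
\]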
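As a check on the final value, the transformed integral can be summed numerically. The sketch below deliberately starts from the original integrand $x^2 + y^2$, substitutes the inverse transformations from part (a), and multiplies by $|J| = \frac{1}{2}$ from part (b); the function name and grid size are illustrative assumptions.

```python
def integrate_uv(n=200):
    """Midpoint-rule estimate of the integral of (x^2 + y^2) |J| over
    0 <= u <= 4, -6 <= v <= 0, with x = (u + v)/2 and y = (u - v)/2."""
    hu, hv = 4.0 / n, 6.0 / n
    abs_jacobian = 0.5  # |J| found in part (b)
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = -6.0 + (j + 0.5) * hv
            x, y = (u + v) / 2, (u - v) / 2
            total += (x * x + y * y) * abs_jacobian * hu * hv
    return total


I_numeric = integrate_uv()
print(I_numeric)  # should be close to 104
```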
3. The Jacobian in 3 dimensions

When changing the coordinate system of a triple integral
\[ I = \iiint_V f(x, y, z)\, dV \]
we need to extend the above definition of the Jacobian to 3 dimensions.
Key Point 12
Jacobian in Three Variables
For given transformations $x = x(u, v, w)$, $y = y(u, v, w)$ and $z = z(u, v, w)$ the Jacobian is
\[
J(u, v, w) = \begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2ex]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2ex]
\dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w}
\end{vmatrix}
\]
The same pattern persists as in the 2-dimensional case (see Key Point 10). Across a row of the
determinant the numerators are the same and down a column the denominators are the same.
The volume element $dV = dx\,dy\,dz$ becomes $dV = |J(u, v, w)|\, du\,dv\,dw$. As before the limits and
integrand must also be transformed.
Example 26
Use spherical coordinates to find the volume of a sphere of radius R.
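[Figure 32: spherical polar coordinates of the point (x, y, z), showing the distance r from the origin, the angle φ measured from the z-axis and the angle θ measured in the xy-plane]
Solution
The change of coordinates from Cartesian to spherical polar coordinates is given by the transformation
equations
\[ x = r\cos\theta\sin\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\phi \]
We now need the nine partial derivatives
\[
\begin{aligned}
\frac{\partial x}{\partial r} &= \cos\theta\sin\phi & \frac{\partial x}{\partial \theta} &= -r\sin\theta\sin\phi & \frac{\partial x}{\partial \phi} &= r\cos\theta\cos\phi \\
\frac{\partial y}{\partial r} &= \sin\theta\sin\phi & \frac{\partial y}{\partial \theta} &= r\cos\theta\sin\phi & \frac{\partial y}{\partial \phi} &= r\sin\theta\cos\phi \\
\frac{\partial z}{\partial r} &= \cos\phi & \frac{\partial z}{\partial \theta} &= 0 & \frac{\partial z}{\partial \phi} &= -r\sin\phi
\end{aligned}
\]
Hence we have
\[
J(r, \theta, \phi) = \begin{vmatrix}
\cos\theta\sin\phi & -r\sin\theta\sin\phi & r\cos\theta\cos\phi \\
\sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\
\cos\phi & 0 & -r\sin\phi
\end{vmatrix}
\]
Expanding along the third row,
\[
J(r, \theta, \phi) = \cos\phi \begin{vmatrix} -r\sin\theta\sin\phi & r\cos\theta\cos\phi \\ r\cos\theta\sin\phi & r\sin\theta\cos\phi \end{vmatrix}
+ 0 - r\sin\phi \begin{vmatrix} \cos\theta\sin\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi \end{vmatrix}
\]
Check that this gives $J(r, \theta, \phi) = -r^2\sin\phi$. Notice that $J(r, \theta, \phi) \le 0$ for $0 \le \phi \le \pi$, so
$|J(r, \theta, \phi)| = r^2\sin\phi$. The limits are found as follows. The variable $\phi$ is related to ‘latitude’ with
$\phi = 0$ representing the ‘North Pole’, $\phi = \pi/2$ representing the equator and $\phi = \pi$ representing
the ‘South Pole’.
The variable $\theta$ is related to ‘longitude’ with values of 0 to $2\pi$ covering every point for each value of
$\phi$. Thus limits on $\phi$ are 0 to $\pi$ and limits on $\theta$ are 0 to $2\pi$. The limits on $r$ are $r = 0$ (centre) to
$r = R$ (surface).
To find the volume of the sphere we then integrate the volume element $dV = r^2\sin\phi\, dr\,d\theta\,d\phi$
between these limits.
\[
\begin{aligned}
\text{Volume} &= \int_0^{\pi}\!\int_0^{2\pi}\!\int_0^{R} r^2\sin\phi\, dr\,d\theta\,d\phi = \int_0^{\pi}\!\int_0^{2\pi} \frac{1}{3}R^3\sin\phi\, d\theta\,d\phi \\
 &= \int_0^{\pi} \frac{2\pi}{3}R^3\sin\phi\, d\phi = \frac{4}{3}\pi R^3
\end{aligned}
\]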
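The claim $J(r, \theta, \phi) = -r^2\sin\phi$ can be spot-checked numerically at any chosen point. The sketch below builds the 3 × 3 matrix of partial derivatives by central differences; the helper names `det3` and `jacobian_3d` and the sample point are assumptions made for illustration.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def jacobian_3d(transform, point, eps=1e-6):
    """Central-difference estimate of the Jacobian determinant of
    transform(u, v, w) -> (x, y, z) at the given point."""
    columns = []
    for k in range(3):
        hi, lo = list(point), list(point)
        hi[k] += eps
        lo[k] -= eps
        f_hi, f_lo = transform(*hi), transform(*lo)
        # Derivatives of x, y, z with respect to the k-th variable.
        columns.append([(f_hi[r] - f_lo[r]) / (2 * eps) for r in range(3)])
    # Row r holds the derivatives of coordinate r with respect to each variable.
    rows = [[columns[k][r] for k in range(3)] for r in range(3)]
    return det3(rows)


def spherical(r, theta, phi):
    return (
        r * math.cos(theta) * math.sin(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(phi),
    )


r, theta, phi = 2.0, 0.8, 1.1  # arbitrary sample point
J = jacobian_3d(spherical, (r, theta, phi))
print(J, -r ** 2 * math.sin(phi))  # the two values should agree closely
```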
Example 27
Find the volume integral of the function f (x, y, z) = x − y over the parallelepiped
with the vertices of the base at
(x, y, z) = (0, 0, 0), (2, 0, 0), (3, 1, 0) and (1, 1, 0)
and the vertices of the upper face at
(x, y, z) = (0, 1, 2), (2, 1, 2), (3, 2, 2) and (1, 2, 2).
[Figure 33: the parallelepiped, drawn on x-, y- and z-axes]
Solution
This will be a difficult integral to derive limits for in terms of x, y and z. However, it can be noted
that the base is described by z = 0 while the upper face is described by z = 2. Similarly, the front
face is described by 2y − z = 0 with the back face being described by 2y − z = 2. Finally the left
face satisfies 2x − 2y + z = 0 while the right face satisfies 2x − 2y + z = 4.
The above suggests a change of variable with the new variables satisfying u = 2x−2y+z, v = 2y−z
and w = z and the limits on u being 0 to 4, the limits on v being 0 to 2 and the limits on w being
0 to 2.
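Inverting the relationship between u, v, w and x, y and z, gives
\[ x = \frac{1}{2}(u + v) \qquad y = \frac{1}{2}(v + w) \qquad z = w \]
The Jacobian is given by
\[
J(u, v, w) = \begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2ex]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2ex]
\dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w}
\end{vmatrix}
= \begin{vmatrix} \frac{1}{2} & \frac{1}{2} & 0 \\[1ex] 0 & \frac{1}{2} & \frac{1}{2} \\[1ex] 0 & 0 & 1 \end{vmatrix}
= \frac{1}{4}
\]
Note that the function $f(x, y, z) = x - y$ equals $\frac{1}{2}(u + v) - \frac{1}{2}(v + w) = \frac{1}{2}(u - w)$. Thus the integral
is
\[
\begin{aligned}
\int_{w=0}^{2}\int_{v=0}^{2}\int_{u=0}^{4} \frac{1}{2}(u - w)\,\frac{1}{4}\, du\,dv\,dw
 &= \int_{w=0}^{2}\int_{v=0}^{2}\int_{u=0}^{4} \frac{1}{8}(u - w)\, du\,dv\,dw \\
 &= \int_{w=0}^{2}\int_{v=0}^{2} \left[ \frac{1}{16}u^2 - \frac{1}{8}uw \right]_0^4 dv\,dw \\
 &= \int_{w=0}^{2}\int_{v=0}^{2} \left( 1 - \frac{1}{2}w \right) dv\,dw \\
 &= \int_{w=0}^{2} \left[ v - \frac{vw}{2} \right]_0^2 dw \\
 &= \int_{w=0}^{2} (2 - w)\, dw \\
 &= \left[ 2w - \frac{1}{2}w^2 \right]_0^2 \\
 &= 4 - \frac{4}{2} - 0 \\
 &= 2
\end{aligned}
\]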
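Two parts of this example lend themselves to a quick check: that the change of variables really maps the eight vertices onto the corners of a box, and that the transformed integral equals 2. The sketch below does both; the variable names and grid size are illustrative assumptions.

```python
from itertools import product

# The eight vertices of the parallelepiped, as given in the example.
vertices = [
    (0, 0, 0), (2, 0, 0), (3, 1, 0), (1, 1, 0),
    (0, 1, 2), (2, 1, 2), (3, 2, 2), (1, 2, 2),
]


def to_uvw(x, y, z):
    """The change of variables u = 2x - 2y + z, v = 2y - z, w = z."""
    return (2 * x - 2 * y + z, 2 * y - z, z)


mapped = sorted(to_uvw(*p) for p in vertices)
box_corners = sorted(product((0, 4), (0, 2), (0, 2)))
print(mapped == box_corners)  # True if the solid maps onto the box

# Midpoint sum of f |J| = (1/2)(u - w) * (1/4) over the box.
n = 8
hu, hv, hw = 4 / n, 2 / n, 2 / n
total = 0.0
for i in range(n):
    u = (i + 0.5) * hu
    for j in range(n):
        for k in range(n):
            w = (k + 0.5) * hw
            total += 0.5 * (u - w) * 0.25 * hu * hv * hw
print(total)  # should be close to 2
```

Because the transformed integrand is linear in u and w, the midpoint rule is exact here apart from rounding.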
Task
Find the Jacobian for the following transformation:
x = 2u + 3v − w, y = v − 5w, z = u + 4w
Your solution
Answer
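Evaluating the partial derivatives,
\[
\begin{aligned}
\frac{\partial x}{\partial u} &= 2, & \frac{\partial x}{\partial v} &= 3, & \frac{\partial x}{\partial w} &= -1, \\
\frac{\partial y}{\partial u} &= 0, & \frac{\partial y}{\partial v} &= 1, & \frac{\partial y}{\partial w} &= -5, \\
\frac{\partial z}{\partial u} &= 1, & \frac{\partial z}{\partial v} &= 0, & \frac{\partial z}{\partial w} &= 4
\end{aligned}
\]
so the Jacobian is
\[
\begin{vmatrix}
\dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2ex]
\dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2ex]
\dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w}
\end{vmatrix}
= \begin{vmatrix} 2 & 3 & -1 \\ 0 & 1 & -5 \\ 1 & 0 & 4 \end{vmatrix}
= 2\begin{vmatrix} 1 & -5 \\ 0 & 4 \end{vmatrix} + 1\begin{vmatrix} 3 & -1 \\ 1 & -5 \end{vmatrix}
= 2 \times 4 + 1 \times (-14) = -6
\]
where expansion of the determinant has taken place down the first column.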
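The expansion down the first column can be written as a few lines of code. This is a minimal sketch with illustrative function names; exact fractions are used so that the second example (the matrix from Example 27) gives an exact result.

```python
from fractions import Fraction


def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c


def det3_first_column(m):
    """3x3 determinant by cofactor expansion down the first column."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * det2(e, f, h, i) - d * det2(b, c, h, i) + g * det2(b, c, e, f)


# Matrix of partial derivatives for x = 2u + 3v - w, y = v - 5w, z = u + 4w.
J_task = det3_first_column([[2, 3, -1], [0, 1, -5], [1, 0, 4]])

# Matrix from Example 27: x = (u + v)/2, y = (v + w)/2, z = w.
half = Fraction(1, 2)
J_example27 = det3_first_column([[half, half, 0], [0, half, half], [0, 0, 1]])

print(J_task, J_example27)
```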
Volume of liquid in an ellipsoidal tank

Introduction
An ellipsoidal tank (elliptical when viewed from along x-, y- or z-axes) has a volume of liquid poured
into it. It is useful to know in advance how deep the liquid will be. In order to make this calculation,
it is necessary to perform a multiple integration and calculate a Jacobian.
[Figure 34: the ellipsoidal tank, with semi-axes a and b marked]
Problem in words
The metal tank is in the form of an ellipsoid, with semi-axes a, b and c. A volume V of liquid is
poured into the tank ($V < \frac{4}{3}\pi abc$, the volume of the ellipsoid) and the problem is to calculate the
depth, h, of the liquid.
The volume of the shaded region is expressed as the triple integral
\[ V = \int_{z=0}^{h}\int_{y=y_1}^{y_2}\int_{x=x_1}^{x_2} dx\,dy\,dz \]
where the limits of integration
\[ x_1 = -a\sqrt{1 - \frac{y^2}{b^2} - \frac{(z - c)^2}{c^2}} \quad \text{and} \quad x_2 = +a\sqrt{1 - \frac{y^2}{b^2} - \frac{(z - c)^2}{c^2}} \]
come from rearranging the equation of the ellipsoid $\left( \dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{(z - c)^2}{c^2} = 1 \right)$, and the limits
\[ y_1 = -\frac{b}{c}\sqrt{c^2 - (z - c)^2} \quad \text{and} \quad y_2 = +\frac{b}{c}\sqrt{c^2 - (z - c)^2} \]
come from the equation of an ellipse in the y-z plane $\left( \dfrac{y^2}{b^2} + \dfrac{(z - c)^2}{c^2} = 1 \right)$.
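To calculate V, use the substitutions
\[
\begin{aligned}
x &= a\tau\cos\phi \left( 1 - \frac{(z - c)^2}{c^2} \right)^{\frac{1}{2}} \\
y &= b\tau\sin\phi \left( 1 - \frac{(z - c)^2}{c^2} \right)^{\frac{1}{2}} \\
z &= z
\end{aligned}
\]
now expressing the triple integral as
\[ V = \int_{z=0}^{h}\int_{\phi=\phi_1}^{\phi_2}\int_{\tau=\tau_1}^{\tau_2} J\, d\tau\,d\phi\,dz \]
where J is the Jacobian of the transformation calculated from
\[
J = \begin{vmatrix}
\dfrac{\partial x}{\partial \tau} & \dfrac{\partial x}{\partial \phi} & \dfrac{\partial x}{\partial z} \\[2ex]
\dfrac{\partial y}{\partial \tau} & \dfrac{\partial y}{\partial \phi} & \dfrac{\partial y}{\partial z} \\[2ex]
\dfrac{\partial z}{\partial \tau} & \dfrac{\partial z}{\partial \phi} & \dfrac{\partial z}{\partial z}
\end{vmatrix}
\]
and reduces to
\[
\begin{aligned}
J &= \frac{\partial x}{\partial \tau}\frac{\partial y}{\partial \phi} - \frac{\partial x}{\partial \phi}\frac{\partial y}{\partial \tau} \qquad \text{since} \quad \frac{\partial z}{\partial \tau} = \frac{\partial z}{\partial \phi} = 0 \\
  &= \left\{ a\cos\phi \left( 1 - \frac{(z - c)^2}{c^2} \right)^{\frac{1}{2}} \right\} \left\{ b\tau\cos\phi \left( 1 - \frac{(z - c)^2}{c^2} \right)^{\frac{1}{2}} \right\} \\
  &\quad - \left\{ -a\tau\sin\phi \left( 1 - \frac{(z - c)^2}{c^2} \right)^{\frac{1}{2}} \right\} \left\{ b\sin\phi \left( 1 - \frac{(z - c)^2}{c^2} \right)^{\frac{1}{2}} \right\} \\
  &= ab\tau \left( \cos^2\phi + \sin^2\phi \right) \left( 1 - \frac{(z - c)^2}{c^2} \right) \\
  &= ab\tau \left( 1 - \frac{(z - c)^2}{c^2} \right)
\end{aligned}
\]
To determine limits of integration for $\phi$, note that the substitutions above are similar to a cylindrical
polar co-ordinate system, and so $\phi$ goes from 0 to $2\pi$. For $\tau$, setting $\tau = 0 \Rightarrow x = 0$ and $y = 0$, i.e.
the z-axis.
Setting $\tau = 1$ gives
\[ \frac{x^2}{a^2} = \cos^2\phi \left( 1 - \frac{(z - c)^2}{c^2} \right) \qquad (1) \]
and
\[ \frac{y^2}{b^2} = \sin^2\phi \left( 1 - \frac{(z - c)^2}{c^2} \right) \qquad (2) \]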
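Summing both sides of Equations (1) and (2) gives
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = \left( \cos^2\phi + \sin^2\phi \right) \left( 1 - \frac{(z - c)^2}{c^2} \right) \]
or
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{(z - c)^2}{c^2} = 1 \]
which is the equation of the ellipsoid, i.e. the outer edge of the volume. Therefore the range of $\tau$
should be 0 to 1. Now
\[
\begin{aligned}
V &= ab \int_{z=0}^{h}\int_{\phi=0}^{2\pi}\int_{\tau=0}^{1} \left( 1 - \frac{(z - c)^2}{c^2} \right) \tau\, d\tau\,d\phi\,dz \\
  &= \frac{ab}{c^2} \int_{z=0}^{h}\int_{\phi=0}^{2\pi} (2zc - z^2) \left[ \frac{\tau^2}{2} \right]_{\tau=0}^{1} d\phi\,dz \\
  &= \frac{ab}{2c^2} \int_{z=0}^{h} (2zc - z^2) \Big[ \phi \Big]_{\phi=0}^{2\pi} dz \\
  &= \frac{\pi ab}{c^2} \left[ cz^2 - \frac{z^3}{3} \right]_{z=0}^{h} \\
  &= \frac{\pi ab}{c^2} \left( ch^2 - \frac{h^3}{3} \right)
\end{aligned}
\]
Interpretation
Suppose the tank has actual dimensions of a = 2 m, b = 0.5 m and c = 3 m and a volume of 7 m$^3$
is to be poured into it. (The total volume of the tank is $4\pi$ m$^3$ ≈ 12.57 m$^3$.) Then, from above
\[ V = \frac{\pi ab}{c^2} \left( ch^2 - \frac{h^3}{3} \right) \]
which becomes
\[ 7 = \frac{\pi}{9} \left( 3h^2 - \frac{h^3}{3} \right) \]
with solution h = 3.23 m (2 d.p.), compared to the maximum height of the ellipsoid of 6 m.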
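The cubic for h has no convenient closed-form solution, so in practice it is solved numerically. The sketch below uses bisection, which is one possible method (the workbook does not say which method was used); it relies on V increasing with h for $0 \le h \le 2c$. The function names are illustrative.

```python
import math


def liquid_volume(h, a, b, c):
    """Volume of liquid of depth h in the ellipsoidal tank (formula derived above)."""
    return math.pi * a * b / c ** 2 * (c * h ** 2 - h ** 3 / 3)


def depth_for_volume(V, a, b, c, tol=1e-9):
    """Solve liquid_volume(h) = V for h by bisection on 0 <= h <= 2c."""
    lo, hi = 0.0, 2 * c
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if liquid_volume(mid, a, b, c) < V:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


h = depth_for_volume(7.0, a=2.0, b=0.5, c=3.0)
full = liquid_volume(6.0, 2.0, 0.5, 3.0)  # tank filled to the top
print(round(h, 2), full)
```

Setting h = 2c in the formula recovers the full ellipsoid volume $\frac{4}{3}\pi abc$, which is a useful sanity check on the derivation.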
Exercises
1. The function $f = x^2 + y^2$ is to be integrated over an elliptical cone with base being the ellipse
$x^2/4 + y^2 = 1$, $z = 0$ and apex (point) at (0, 0, 5). The integral can be made simpler by means
of the change of variables $x = 2\left(1 - \frac{w}{5}\right)\tau\cos\theta$, $y = \left(1 - \frac{w}{5}\right)\tau\sin\theta$, $z = w$.
[Figure: the elliptical cone, with apex (0, 0, 5) and base passing through (−2, 0, 0), (2, 0, 0) and (0, −1, 0)]
(a) Find the limits on the variables $\tau$, $\theta$ and $w$.
(b) Find the Jacobian $J(\tau, \theta, w)$ for this transformation.
(c) Express the integral $\iiint (x^2 + y^2)\, dx\,dy\,dz$ in terms of $\tau$, $\theta$ and $w$.
(d) Evaluate this integral. [Hint: it may be worth noting that $\cos^2\theta \equiv \frac{1}{2}(1 + \cos 2\theta)$.]
Note: This integral has relevance in topics such as moments of inertia.

2. Using cylindrical polar coordinates, integrate the function $f = z\sqrt{x^2 + y^2}$ over the volume
between the surfaces $z = 0$ and $z = 1 + x^2 + y^2$ for $0 \le x^2 + y^2 \le 1$.
3. A torus (doughnut) has major radius R and minor radius r. Using the transformation $x =
(R + \tau\cos\alpha)\cos\theta$, $y = (R + \tau\cos\alpha)\sin\theta$, $z = \tau\sin\alpha$, find the volume of the torus. [Hints:
limits on $\alpha$ and $\theta$ are 0 to $2\pi$, limits on $\tau$ are 0 to r. Show that the Jacobian is $\tau(R + \tau\cos\alpha)$.]
[Figure: the torus on x-, y- and z-axes, showing the radii R and r, the distance τ and the angles θ and α]
4. Find the Jacobian for the following transformations.
(a) $x = u^2 + vw$, $y = 2v + u^2 w$, $z = uvw$
(b) Cylindrical polar coordinates: $x = \rho\cos\theta$, $y = \rho\sin\theta$, $z = z$
[Figure: cylindrical polar coordinates of the point (x, y, z), showing ρ, θ and z]
Answers
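1. (a) $\tau$: 0 to 1, $\theta$: 0 to $2\pi$, $w$: 0 to 5
(b) $2\left(1 - \dfrac{w}{5}\right)^2 \tau$
(c) $2\displaystyle\int_{\tau=0}^{1}\int_{\theta=0}^{2\pi}\int_{w=0}^{5} \left(1 - \frac{w}{5}\right)^4 \tau^3 \left(4\cos^2\theta + \sin^2\theta\right) dw\,d\theta\,d\tau$
(d) $\dfrac{5}{2}\pi$
2. $\dfrac{92}{105}\pi$
3. $2\pi^2 R r^2$
4. (a) $4u^2 v - 2u^4 w + u^2 v w^2 - 2v^2 w$, (b) $\rho$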
Contents 28
Differential
Vector Calculus
28.1 Background to Vector Calculus 2
28.2 Differential Vector Calculus 17
28.3 Orthogonal Curvilinear Coordinates 37
Learning outcomes
In this Workbook you will learn about scalar and vector fields and how physical quantities
can be represented by such fields. You will be able to 'differentiate' such fields i.e. to find
how rapidly the scalar or vector field varies with position. Depending on whether the
original function and the intended derivative are scalars or vectors, there are three such
derivatives known as the 'gradient', the 'divergence' and the 'curl'. You will be able to
evaluate these derivatives for given fields. In addition, you will be able to work out the
derivatives while using polar coordinate systems.
Background to Vector
Calculus 28.1
Introduction
Vector Calculus is the study of the various derivatives and integrals of a scalar or vector function of
the variables defining position (x,y,z) and possibly also time (t). This Section considers functions of
several variables and introduces scalar and vector fields.
Prerequisites
Before starting this Section you should . . .
• . . . two variables
• be familiar with the concept of partial differentiation
• be familiar with the concept of vectors

Learning Outcomes
On completion you should be able to . . .
• state the properties of scalar and vector fields
• work with a vector function of a variable
1. Functions of several variables and partial derivatives

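These functions were first studied in Workbook 18. As a reminder:
• a function of the two independent variables x and y may be written as $f(x, y)$
• the first and second order partial derivatives are $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$.
Consider, for example, the function $f(x, y) = x^2 + 5xy + 3y^4 + 1$. The first and second partial
derivatives are
\[
\begin{aligned}
\frac{\partial f}{\partial x} &= 2x + 5y \quad \text{(differentiating with respect to $x$ keeping $y$ constant)} \\
\frac{\partial f}{\partial y} &= 5x + 12y^3 \quad \text{(differentiating with respect to $y$ keeping $x$ constant)} \\
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) = \frac{\partial}{\partial x}(2x + 5y) = 2 \\
\frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) = \frac{\partial}{\partial y}\left( 5x + 12y^3 \right) = 36y^2 \\
\frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right) = \frac{\partial}{\partial y}(2x + 5y) = 5
\end{aligned}
\]
The number of independent variables is not restricted to two. For example, if u is a function of the
three variables x, y and z, say $u = x^2 + y^2 + z^2$, then:
\[ \frac{\partial u}{\partial x} = 2x, \quad \frac{\partial u}{\partial y} = 2y, \quad \frac{\partial u}{\partial z} = 2z, \quad \frac{\partial^2 u}{\partial x^2} = 2, \quad \frac{\partial^2 u}{\partial y^2} = 2, \quad \frac{\partial^2 u}{\partial z^2} = 2 \]
Similarly, if u is a function of the four variables x, y, z and t, say $u = xy^2 z^3 e^t$, then
\[ \frac{\partial u}{\partial x} = y^2 z^3 e^t, \quad \frac{\partial u}{\partial t} = xy^2 z^3 e^t, \quad \frac{\partial^2 u}{\partial z^2} = 6xy^2 z e^t, \quad \text{etc.} \]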
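Partial derivatives worked out by hand can be spot-checked with finite differences. The sketch below does this for $f(x, y) = x^2 + 5xy + 3y^4 + 1$ at an arbitrarily chosen point; the helper name `partial`, the step size and the sample point are assumptions for illustration.

```python
def f(x, y):
    return x ** 2 + 5 * x * y + 3 * y ** 4 + 1


def partial(g, var, x, y, eps=1e-4):
    """Central-difference estimate of the partial derivative of g(x, y)
    with respect to var ('x' or 'y'), holding the other variable constant."""
    if var == "x":
        return (g(x + eps, y) - g(x - eps, y)) / (2 * eps)
    return (g(x, y + eps) - g(x, y - eps)) / (2 * eps)


x0, y0 = 1.5, -0.5  # arbitrary sample point

fx = partial(f, "x", x0, y0)   # compare with 2x + 5y
fy = partial(f, "y", x0, y0)   # compare with 5x + 12y^3
# Mixed derivative: differentiate the x-derivative with respect to y.
fxy = partial(lambda x, y: partial(f, "x", x, y), "y", x0, y0)

print(fx, 2 * x0 + 5 * y0)
print(fy, 5 * x0 + 12 * y0 ** 3)
print(fxy)  # the hand calculation gives 5
```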
2. Vector functions of a variable

Vectors were first studied in Workbook 9. A vector is a quantity that has magnitude and direction and
combines together with other vectors according to the triangle law. Examples are (i) a velocity of
60 mph West and (ii) a force of 98.1 newtons vertically downwards.
It is often convenient to express vectors in terms of $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$, which are unit vectors in the x, y and
z directions respectively. Examples are $\mathbf{a} = 3\mathbf{i} + 4\mathbf{j}$ and $\mathbf{b} = 2\mathbf{i} - 2\mathbf{j} + \mathbf{k}$.
The magnitudes of these vectors are $|\mathbf{a}| = \sqrt{3^2 + 4^2} = 5$ and $|\mathbf{b}| = \sqrt{2^2 + (-2)^2 + 1^2} = 3$ respectively.
In this case $\mathbf{a}$ and $\mathbf{b}$ are constant vectors, but a vector could be a function of an independent
variable such as t (which may represent time in certain applications).
Example 1
A particle is at the point A(3, 0). At time t = 0 it starts moving at a constant
speed of 2 m s$^{-1}$ in a direction parallel to the positive y-axis. Find expressions for
the position vector, $\mathbf{r}$, of the particle at time t, together with its velocity $\mathbf{v} = \dfrac{d\mathbf{r}}{dt}$
and acceleration $\mathbf{a} = \dfrac{d^2\mathbf{r}}{dt^2}$.
Solution
In the first second of its motion the particle moves 2 metres to B and it moves a further 2 metres in
each subsequent second, to C, D, . . .. Because it moves parallel to the y-axis its velocity is v = 2j.
As its velocity is constant its acceleration is a = 0.
The position of the particle at t = 0, 1, 2, 3 is given in the table.
Time t 0 1 2 3
Position r 3i 3i + 2j 3i + 4j 3i + 6j
In general, after t seconds, the position vector of the particle is r = 3i + 2tj
Example 2
The position vector of a particle at time t is given by $\mathbf{r} = 2t\,\mathbf{i} + t^2\,\mathbf{j}$. Find its
equation in Cartesian form and sketch the path followed by the particle.
Tabulating $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ at different times t:
Time t   0   1        2         3         4
x        0   2        4         6         8
y        0   1        4         9         16
r        0   2i + j   4i + 4j   6i + 9j   8i + 16j
Solution
To find the Cartesian equation of the curve we eliminate t between $x = 2t$ and $y = t^2$. Re-arrange
$x = 2t$ as $t = \frac{1}{2}x$. Then $y = t^2 = \left( \frac{1}{2}x \right)^2 = \frac{1}{4}x^2$, which is a parabola. This is the path followed by
the particle. See Figure 1.
Figure 1: Path followed by a particle
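In general, a three-dimensional vector function of one variable t is of the form
\[ \mathbf{u} = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}. \]
Such functions may be differentiated one or more times and the rules of differentiation are derived
from those for ordinary scalar functions. In particular, if $\mathbf{u}$ and $\mathbf{v}$ are vector functions of t and if c is
a constant, then:
Rule 1. $\dfrac{d}{dt}(\mathbf{u} + \mathbf{v}) = \dfrac{d\mathbf{u}}{dt} + \dfrac{d\mathbf{v}}{dt}$
Rule 2. $\dfrac{d}{dt}(c\,\mathbf{u}) = c\,\dfrac{d\mathbf{u}}{dt}$
Rule 3. $\dfrac{d}{dt}(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u} \cdot \dfrac{d\mathbf{v}}{dt} + \dfrac{d\mathbf{u}}{dt} \cdot \mathbf{v}$
Rule 4. $\dfrac{d}{dt}(\mathbf{u} \times \mathbf{v}) = \mathbf{u} \times \dfrac{d\mathbf{v}}{dt} + \dfrac{d\mathbf{u}}{dt} \times \mathbf{v}$
Also, if a particle moves so that its position vector at time t is $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ then the
velocity of the particle is
\[ \mathbf{v} = \frac{d\mathbf{r}}{dt} = \dot{\mathbf{r}} = \frac{dx(t)}{dt}\,\mathbf{i} + \frac{dy(t)}{dt}\,\mathbf{j} + \frac{dz(t)}{dt}\,\mathbf{k} = \dot{x}\,\mathbf{i} + \dot{y}\,\mathbf{j} + \dot{z}\,\mathbf{k} \]
and its acceleration is
\[ \mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d^2\mathbf{r}}{dt^2} = \ddot{\mathbf{r}} = \frac{d^2x(t)}{dt^2}\,\mathbf{i} + \frac{d^2y(t)}{dt^2}\,\mathbf{j} + \frac{d^2z(t)}{dt^2}\,\mathbf{k} = \ddot{x}\,\mathbf{i} + \ddot{y}\,\mathbf{j} + \ddot{z}\,\mathbf{k} \]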
Example 3
Find the derivative (with respect to t) of the position vector r = t2 i + 3tj + 4k.
Also find a unit vector tangential to the curve traced out by the position vector at
the point where t = 2.
Solution
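Differentiating $\mathbf{r}$ with respect to t,
\[ \dot{\mathbf{r}} = \frac{d\mathbf{r}}{dt} = 2t\,\mathbf{i} + 3\,\mathbf{j} \]
so
\[ \dot{\mathbf{r}}(2) = 4\,\mathbf{i} + 3\,\mathbf{j} \]
A unit vector in this direction, which is tangential to the curve, is
\[ \frac{\dot{\mathbf{r}}(2)}{|\dot{\mathbf{r}}(2)|} = \frac{4\,\mathbf{i} + 3\,\mathbf{j}}{\sqrt{4^2 + 3^2}} = \frac{4}{5}\,\mathbf{i} + \frac{3}{5}\,\mathbf{j} \]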
Example 4
For the position vectors (i) r = 3i + 2tj and (ii) r = 2ti + t2 j use the general
expressions for velocity and acceleration to confirm the values of v and a found
earlier in Examples 1 and 2.
Solution
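(i) $\mathbf{r} = 3\,\mathbf{i} + 2t\,\mathbf{j}$. Then
\[ \mathbf{v} = \frac{d\mathbf{r}}{dt} = \dot{\mathbf{r}} = \frac{d}{dt}(3\,\mathbf{i} + 2t\,\mathbf{j}) = \frac{d(3)}{dt}\,\mathbf{i} + \frac{d(2t)}{dt}\,\mathbf{j} = 0\,\mathbf{i} + 2\,\mathbf{j} = 2\,\mathbf{j} \]
and
\[ \mathbf{a} = \frac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}} = \frac{d}{dt}(2\,\mathbf{j}) = \frac{d(2)}{dt}\,\mathbf{j} = 0\,\mathbf{j} = \mathbf{0} \]
which agree with those found earlier.
(ii) $\mathbf{r} = 2t\,\mathbf{i} + t^2\,\mathbf{j}$. Then
\[ \mathbf{v} = \frac{d\mathbf{r}}{dt} = \dot{\mathbf{r}} = \frac{d}{dt}(2t\,\mathbf{i} + t^2\,\mathbf{j}) = \frac{d(2t)}{dt}\,\mathbf{i} + \frac{d(t^2)}{dt}\,\mathbf{j} = 2\,\mathbf{i} + 2t\,\mathbf{j} \]
and
\[ \mathbf{a} = \frac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}} = \frac{d}{dt}(2\,\mathbf{i} + 2t\,\mathbf{j}) = \frac{d(2)}{dt}\,\mathbf{i} + \frac{d(2t)}{dt}\,\mathbf{j} = 0\,\mathbf{i} + 2\,\mathbf{j} = 2\,\mathbf{j} \]
which agree with those found earlier.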
Example 5
A particle of mass m = 1 kg has position vector r. The torque (moment of force)
H relative to the origin acting on the particle as a result of a force F is defined as
H = r × F , where, by Newton’s second law, F = mr̈. The angular momentum
(moment of momentum) L of the particle is defined as L = r × mṙ . Find L and
H for the particle where (i) r = 3i + 2tj and (ii) r = 2ti + t2 j, and show that in
each case the torque law H = L̇ is satisfied.
Solution
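(i) Here $\mathbf{r} = 3\,\mathbf{i} + 2t\,\mathbf{j}$ so $\dot{\mathbf{r}} = 2\,\mathbf{j}$ and $\mathbf{a} = \ddot{\mathbf{r}} = \mathbf{0}$. Then
\[ \mathbf{L} = \mathbf{r} \times m\dot{\mathbf{r}} = (3\,\mathbf{i} + 2t\,\mathbf{j}) \times 2\,\mathbf{j} = 6\,\mathbf{k} \quad \text{so} \quad \dot{\mathbf{L}} = \frac{d}{dt}(6)\,\mathbf{k} = \mathbf{0} \]
and
\[ \mathbf{H} = \mathbf{r} \times \mathbf{F} = \mathbf{r} \times m\ddot{\mathbf{r}} = (3\,\mathbf{i} + 2t\,\mathbf{j}) \times \mathbf{0} = \mathbf{0} \quad \text{giving } \mathbf{H} = \dot{\mathbf{L}} \text{ as required.} \]
(ii) Here $\mathbf{r} = 2t\,\mathbf{i} + t^2\,\mathbf{j}$ so $\dot{\mathbf{r}} = 2\,\mathbf{i} + 2t\,\mathbf{j}$ and $\mathbf{a} = \ddot{\mathbf{r}} = 2\,\mathbf{j}$. Then
\[ \mathbf{L} = \mathbf{r} \times m\dot{\mathbf{r}} = (2t\,\mathbf{i} + t^2\,\mathbf{j}) \times (2\,\mathbf{i} + 2t\,\mathbf{j}) = (4t^2 - 2t^2)\,\mathbf{k} = 2t^2\,\mathbf{k} \quad \text{so} \quad \dot{\mathbf{L}} = 4t\,\mathbf{k} \]
and
\[ \mathbf{H} = \mathbf{r} \times \mathbf{F} = \mathbf{r} \times m\ddot{\mathbf{r}} = (2t\,\mathbf{i} + t^2\,\mathbf{j}) \times 2\,\mathbf{j} = 4t\,\mathbf{k} \quad \text{giving } \mathbf{H} = \dot{\mathbf{L}} \text{ as required.} \]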
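The torque law for case (ii) can be checked numerically by computing the cross products directly and differentiating $\mathbf{L}$ by finite differences. This is a sketch only: the helper names, the step size and the sample time t = 1.5 are assumptions, and m = 1 kg as in the example.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


m = 1.0  # mass in kg


def r(t):
    return (2 * t, t * t, 0.0)      # position 2t i + t^2 j


def r_dot(t):
    return (2.0, 2 * t, 0.0)        # velocity 2 i + 2t j


def r_ddot(t):
    return (0.0, 2.0, 0.0)          # acceleration 2 j


def L(t):
    """Angular momentum L = r x (m r_dot)."""
    return cross(r(t), tuple(m * c for c in r_dot(t)))


def H(t):
    """Torque H = r x F with F = m r_ddot."""
    return cross(r(t), tuple(m * c for c in r_ddot(t)))


def L_dot(t, eps=1e-6):
    """Central-difference estimate of dL/dt."""
    hi, lo = L(t + eps), L(t - eps)
    return tuple((p - q) / (2 * eps) for p, q in zip(hi, lo))


t0 = 1.5
print(H(t0), L_dot(t0))  # the k-components should both be close to 4t = 6
```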
Task
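A particle moves so that its position vector is $\mathbf{r} = 12t\,\mathbf{i} + (19t - 5t^2)\,\mathbf{j}$.
(a) Find $\dfrac{d\mathbf{r}}{dt}$ and $\dfrac{d^2\mathbf{r}}{dt^2}$.
(b) When is the $\mathbf{j}$-component of $\dfrac{d\mathbf{r}}{dt}$ equal to zero?
(c) Find a unit vector normal to its trajectory when t = 1.
Your solution
Answer
(a) $\dfrac{d\mathbf{r}}{dt} = 12\,\mathbf{i} + (19 - 10t)\,\mathbf{j}$, $\dfrac{d^2\mathbf{r}}{dt^2} = -10\,\mathbf{j}$.
(b) The $\mathbf{j}$-component of $\dfrac{d\mathbf{r}}{dt}$ (also written $\dot{\mathbf{r}}$) is zero when t = 1.9.
(c) When t = 1, $\dot{\mathbf{r}} = 12\,\mathbf{i} + 9\,\mathbf{j}$. A vector perpendicular to this is $9\,\mathbf{i} - 12\,\mathbf{j}$. Its magnitude
is $\sqrt{81 + 144} = 15$. So a unit vector in this direction is $\frac{9}{15}\,\mathbf{i} - \frac{12}{15}\,\mathbf{j} = \frac{3}{5}\,\mathbf{i} - \frac{4}{5}\,\mathbf{j}$. The unit
vector $-\frac{3}{5}\,\mathbf{i} + \frac{4}{5}\,\mathbf{j}$ is also a solution.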
Task
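A particle moving at a constant speed around a circle moves so that
\[ \mathbf{r} = \cos(\pi t)\,\mathbf{i} + \sin(\pi t)\,\mathbf{j} \]
(a) Find $\dfrac{d\mathbf{r}}{dt}$ and $\dfrac{d^2\mathbf{r}}{dt^2}$.
(b) Find $\mathbf{r} \cdot \dfrac{d\mathbf{r}}{dt}$ and $\mathbf{r} \times \dfrac{d^2\mathbf{r}}{dt^2}$.
Your solution
Answer
(a) $\dfrac{d\mathbf{r}}{dt} = -\pi\sin \pi t\,\mathbf{i} + \pi\cos \pi t\,\mathbf{j}$, $\quad \dfrac{d^2\mathbf{r}}{dt^2} = -\pi^2\cos \pi t\,\mathbf{i} - \pi^2\sin \pi t\,\mathbf{j} = -\pi^2\,\mathbf{r}$
(b) $\mathbf{r} \cdot \dfrac{d\mathbf{r}}{dt} = -\pi\cos \pi t\sin \pi t + \pi\cos \pi t\sin \pi t = 0 \;\Rightarrow\; \dfrac{d\mathbf{r}}{dt}$ is perpendicular to $\mathbf{r}$
\[
\mathbf{r} \times \frac{d^2\mathbf{r}}{dt^2} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \cos \pi t & \sin \pi t & 0 \\ -\pi^2\cos \pi t & -\pi^2\sin \pi t & 0 \end{vmatrix} = \mathbf{0} \;\Rightarrow\; \frac{d^2\mathbf{r}}{dt^2} \text{ is parallel to } \mathbf{r}.
\]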
Task
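If $\mathbf{r} = \sin(2t)\,\mathbf{i} + \cos(2t)\,\mathbf{j} + t^2\,\mathbf{k}$ and $(1 + t^2)\,|\ddot{\mathbf{r}}|^2 = c\,|\dot{\mathbf{r}}|^2$, find the value of c.
Your solution
Answer
$\dot{\mathbf{r}} = 2\cos(2t)\,\mathbf{i} - 2\sin(2t)\,\mathbf{j} + 2t\,\mathbf{k}$, $\quad \ddot{\mathbf{r}} = -4\sin(2t)\,\mathbf{i} - 4\cos(2t)\,\mathbf{j} + 2\,\mathbf{k}$
$|\ddot{\mathbf{r}}|^2 = 16\sin^2(2t) + 16\cos^2(2t) + 4 = 20$, $\quad |\dot{\mathbf{r}}|^2 = 4\cos^2(2t) + 4\sin^2(2t) + 4t^2 = 4(1 + t^2)$
$\therefore\; 20(1 + t^2) = 4c(1 + t^2)$ so that $c = 5$.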
3. Scalar fields
A scalar field is a distribution of scalar values over a region of space (which may be 1D, 2D or 3D)
so that a scalar value is associated with each point of space. Examples of scalar fields follow.
1. [Figure 2: Temperature in a plate, one side held at 100°C the other at 0°C]
2. [Figure 3: Height of land above sea level]
3. The mean annual rainfall at different locations in Britain.
4. The light intensity near a 100 watt light bulb.
To define a scalar field we need to:
• Describe the region of space where it is found (this is the domain)
• Give a rule to show how the value of the scalar is related to every point in the domain.
Consider the scalar field defined by φ(x, y) = x + y over the rectangle 0 ≤ x ≤ 4, 0 ≤ y ≤ 2. We

can calculate, and plot, values of φ at different (x, y) points. For example φ(0, 2) = 0 + 2 = 2,
φ(4, 1) = 4 + 1 = 5 and so on.
[Figure 4: The scalar field φ(x, y) = x + y, with values of φ marked at grid points over 0 ≤ x ≤ 4, 0 ≤ y ≤ 2]

Contours
A contour on a map is a curve joining points that are the same height above sea level. These contours
give far more information about the shape of the land than selected spot heights.
For example, the contours near the top of a hill might look like those shown in Figure 5 where the
numbers are the values of the heights above sea level.
In general for a scalar field φ(x, y, z) , contour curves are the family of curves given by φ = c , for
different values of the constant c.
[Figure 5: Contour lines, labelled 60, 50, 40, 30, 20 and 10]
Example 6
Describe contour curves for the following scalar fields and sketch typical contours
for (a) and (b).
(a) $\phi(x, y) = x + y$
(b) $\phi(x, y) = 9 - x^2 - y^2$
(c) $\phi(x, y, z) = \dfrac{1}{x^2 + y^2 + z^2}$
Solution
(a) The contour curves for $\phi(x, y) = x + y$ are $x + y = c$ or $y = -x + c$.
These are straight lines of gradient −1. See Figure 6(a).
(b) For $\phi(x, y) = 9 - x^2 - y^2$, the contour curves are $9 - x^2 - y^2 = c$, or $x^2 + y^2 = 9 - c$.
See Figure 6(b). These are circles, centred at the origin, of radius $\sqrt{9 - c}$.
[Figure 6: Contours for (a) x + y (b) 9 − x^2 − y^2]
(c) For the three-dimensional scalar field $\phi(x, y, z) = \dfrac{1}{x^2 + y^2 + z^2}$ the contour surfaces are
$\dfrac{1}{x^2 + y^2 + z^2} = c$ or $x^2 + y^2 + z^2 = \dfrac{1}{c}$. These are spheres, centred at the origin and of
radius $\dfrac{1}{\sqrt{c}}$.
Task
Describe the contours for the following scalar fields
(a) $\phi = y - x$ (b) $\phi = x^2 + y^2$ (c) $\phi = y - x^2$
Your solution
Answer
(a) Straight lines of gradient 1, (b) Circles, centred at the origin, (c) Parabolas $y = x^2 + c$.
Key Point 1
A scalar field F (in three-dimensional space) returns a real value for the function F for every point
(x, y, z) in the domain of the field.
4. Vector fields
A vector field is a distribution of vectors over a region of space such that a vector is associated with
each point of the region. Examples are:
1. The velocity of water flowing in a river (Figure 7).
Figure 7: Velocity of water in a river
2. The gravitational pull of the Earth (Figure 8). At every point there is a gravitational pull
towards the centre of the Earth.
Figure 8: Gravitational pull of the Earth
Note: the length of the vector is used to indicate its magnitude (i.e. greater near the centre
of the Earth.)
3. The flow of heat in a metal plate insulated on its sides (Figure 9). Heat flows from the hot
portion on the left to the cool portion on the right.
Figure 9: Flow of heat in a metal plate (from the 100° side to the 0° side)
To define a vector field we need to :
• Describe the region of space where the vectors are found (the domain)
• Give a rule for associating a vector with each point of the domain.
Note that in the case of the heat flowing in a plate, the temperature can be described by a scalar
field while the flow of heat is described by a vector field.
Consider the flow of water in different situations.
(a) In a pond where the water is motionless everywhere, the velocity at all points is zero.
That is, v(x, y, z) = 0 , or for brevity, v = 0.
(b) Consider a straight river with steady flow downstream (see Figure 10). The surface
velocity v can be seen by watching the motion of a light floating object, such as a leaf.
The leaf will float downstream parallel to the bank so v will be a multiple of j. However,
the speed is usually smallest near the bank and fastest in the middle of the river. In this
simple model, the velocity v is assumed to be independent of the depth z. That is, v
varies, in the i, or x, direction so that v will be of the form v = f (x)j.
Figure 10: Flow in a straight river
(c) In a more realistic model v would vary as we move downstream and would be different at
different depths due to, for example, rocks or bends. The velocity at any point could also
depend on when the observation was made (for example the speed would be higher shortly
after heavy rain) and so in general the velocity would be a function of the four variables
x, y, z and t, and be of the form v = f1 (x, y, z, t)i + f2 (x, y, z, t)j + f3 (x, y, z, t)k, for
suitable functions f1 , f2 and f3 .
Example 7
Sketch sample vectors at the points (3, 2), (−2, 2), (−3, −1), (−1, 4) for the
two-dimensional vector field defined by v = xi + 2j.
Solution
At (3, 2), v = 3i + 2j
At (−2, 2), v = −2i + 2j
At (−3, −1), v = −3i + 2j
At (−1, 4), v = −i + 2j
Plotting these vectors v gives the arrows in Figure 11.
Figure 11: Sample vectors for the vector field v = xi + 2j
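The evaluation in Example 7 can also be sketched in a few lines of Python. This is only an illustration: the function name `v` and the list `sample_points` are choices made here, and each vector is represented as a pair (i-component, j-component).

```python
def v(x, y):
    """Two-dimensional vector field v = x i + 2 j, returned as (i-component, j-component)."""
    return (x, 2)


# The four sample points used in Example 7.
sample_points = [(3, 2), (-2, 2), (-3, -1), (-1, 4)]

for point in sample_points:
    # Each printed pair is the arrow that would be drawn with its tail at `point`.
    print(point, "->", v(*point))
```

Note that the j-component is 2 everywhere, so only the horizontal part of each arrow changes from point to point.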
It is possible to construct curves which start from and are in the same direction as any one vector
and are guided by the direction of successive vectors. Starting at different points gives a set of
non-intersecting lines called, depending on the context, vector field lines, lines of flow, streamlines
or lines of force.
For example, consider the vector field F = −yi + xj; F can be calculated at various points in the
xy plane. Some of the individual vectors can be seen in Figure 12(a) while Figure 12(b) shows the
corresponding field lines. For this function F the field lines are circles centred at the origin.
Figure 12: (a) Vectors at various points (b) Converted to field lines
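A field line can be traced numerically by repeatedly taking a small step in the direction of the local vector. The sketch below does this for F = −yi + xj using a midpoint step; the function names, the step length and the starting point (1, 0) are assumptions made for illustration, not part of the Workbook.

```python
import math


def F(x, y):
    """The field F = -y i + x j as a pair (i-component, j-component)."""
    return (-y, x)


def trace_field_line(field, start, step=0.001, n_steps=5000):
    """Follow the direction of `field` from `start` using fixed-length midpoint steps."""
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        fx, fy = field(x, y)
        norm = math.hypot(fx, fy)
        if norm == 0:
            break  # the direction is undefined where the field vanishes
        # Half step along the current direction, then use the direction found there.
        mx = x + 0.5 * step * fx / norm
        my = y + 0.5 * step * fy / norm
        gx, gy = field(mx, my)
        gnorm = math.hypot(gx, gy)
        if gnorm == 0:
            break
        x += step * gx / gnorm
        y += step * gy / gnorm
        path.append((x, y))
    return path


path = trace_field_line(F, (1.0, 0.0))
radii = [math.hypot(px, py) for px, py in path]
print(min(radii), max(radii))
```

The distance from the origin stays very close to 1 along the whole path, which is consistent with the field lines being circles centred at the origin.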
Example 8
The Earth is affected by the gravitational force field of the Sun. This vector
field is such that each vector $\mathbf{F}$ is directed towards the Sun and has magnitude
proportional to $\dfrac{1}{r^2}$, where $r = \sqrt{x^2 + y^2 + z^2}$ is the distance from the Sun to the
Earth. Derive an equation for $\mathbf{F}$ and sketch some field lines.
Solution
The field has magnitude proportional to $r^{-2} = (x^2 + y^2 + z^2)^{-1}$ and points directly towards the
Sun (the origin), i.e. parallel to a unit vector pointing towards the origin. At the point given by
$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, a unit vector pointing towards the origin is
$$\frac{-x\mathbf{i} - y\mathbf{j} - z\mathbf{k}}{\left|-x\mathbf{i} - y\mathbf{j} - z\mathbf{k}\right|} = \frac{-x\mathbf{i} - y\mathbf{j} - z\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}}.$$
Multiplying the unit vector by the required magnitude $r^{-2} = (x^2 + y^2 + z^2)^{-1}$ (and by a constant
of proportionality $c$) gives
$$\mathbf{F} = c\,\frac{-x\mathbf{i} - y\mathbf{j} - z\mathbf{k}}{(x^2 + y^2 + z^2)^{3/2}}.$$
Figure 13 shows some field lines for $\mathbf{F}$.
Figure 13: Gravitational field of the Sun (the Earth and the Sun are marked)
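The inverse-square behaviour of the field in Example 8 can be checked numerically. In this sketch the constant of proportionality is set to c = 1 and the two test points are chosen arbitrarily so that one is twice as far from the origin as the other; the names `gravity` and `magnitude` are illustrative.

```python
import math


def gravity(x, y, z, c=1.0):
    """F = c (-x i - y j - z k) / (x^2 + y^2 + z^2)^(3/2); not defined at the origin."""
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (-c * x / r_cubed, -c * y / r_cubed, -c * z / r_cubed)


def magnitude(vec):
    """Euclidean length of a vector given as a tuple of components."""
    return math.sqrt(sum(component ** 2 for component in vec))


near = magnitude(gravity(1.0, 2.0, 2.0))  # distance r = 3 from the origin
far = magnitude(gravity(2.0, 4.0, 4.0))   # distance r = 6 from the origin
print(near / far)
```

Doubling the distance reduces the magnitude by a factor of 4, as expected for a field proportional to 1/r², and every component has the opposite sign to the corresponding coordinate, so the vector points towards the origin.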
Key Point 2
A vector field F (x, y, z) (in three-dimensional coordinates) returns a vector F 0 = F (x0 , y0 , z0 ) for
every point (x0 , y0 , z0 ) in the domain of the field.
Exercises
1. Which of the following are scalar fields and which are vector fields?
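(a) $F = x^2 - yz$
(b) $G = \dfrac{2x - z}{\sqrt{x^2 + y^2 + z^2 + 1}}$
(c) $\mathbf{f} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$
(d) $H = \dfrac{y - 1}{z^2 + 1}\,x + \dfrac{z - 1}{x^2 + 1}\,y + \dfrac{x - 1}{y^2 + 1}\,z$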
(e) g = (y + z)i
2. Draw streamlines for the vector fields
(a) f = i + 2j
(b) $\mathbf{g} = \mathbf{i} + y^2\mathbf{j}$
Answers
1. (a), (b) and (d) are scalar fields as the quantities defined are scalars.
(c) and (e) are vector fields as the quantities defined are vectors.
2. (a) The vectors point in the same direction everywhere.
(b) As |y| increases, the y-component increases.
Differential Vector
Calculus 28.2
Introduction
A vector field or a scalar field can be differentiated with respect to position in three ways to produce
another vector field or scalar field. This Section studies the three derivatives, that is: (i) the gradient
of a scalar field (ii) the divergence of a vector field and (iii) the curl of a vector field.
Prerequisites
Before starting this Section you should . . .
• be familiar with the concept of a function of two variables
• be familiar with the concept of partial differentiation
• be familiar with scalar and vector fields

Learning Outcomes
On completion you should be able to . . .
• find the divergence, gradient or curl of a vector or scalar field
1. The gradient of a scalar field
Consider the height φ above sea level at various points on a hill. Some contours for such a hill are
shown in Figure 14.
Figure 14: “Contour map” of a hill (contours φ = 10, 20, 30, 40, 50, 60; points A, B, C and D are marked)

We are interested in how φ changes from one point to another. Starting from A and making
a displacement d the change in height (φ ) depends on the direction of the displacement. The
magnitude of each d is the same.
Displacement Change in φ
AB 40 − 30 = 10
AC 40 − 30 = 10
AD 30 − 30 = 0
AE 20 − 30 = −10
The change in φ clearly depends on the direction of the displacement. For the paths shown φ
increases most rapidly along AB, does not increase at all along AD (as A and D are both on the
same contour and so are both at the same height) and decreases along AE.
The direction in which φ changes fastest is along the line of greatest slope which is orthogonal (i.e.
perpendicular) to the contours. Hence, at each point of a scalar field we can define a vector field
giving the magnitude and direction of the greatest rate of change of φ locally.
A vector field, called the gradient, written grad φ, can be associated with a scalar field φ so that
at every point the direction of the vector field is orthogonal to the scalar field contour which is the
direction of the maximum rate of change of φ.
For a second example consider a metal plate heated at one corner and cooled by an ice bag at the
opposite corner. All edges and surfaces are insulated. After a while a steady state situation exists in
which the temperature φ at any point remains the same. Some temperature contours are shown in
Figure 15.
Figure 15: Temperature contours and heat flow lines for a metal plate. (a) Contours from 35 near the heat source down to 5 near the ice bag (b) The same contours with the heat flow lines added
The direction of the heat flow is along the flow lines which are orthogonal to the contours (see the
dashed lines in Figure 15(b)); the magnitude of this heat flow is proportional to that of grad φ, and
the flow is directed along −grad φ, from hot to cold.
Definition
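The gradient of the scalar field $\phi = f(x, y, z)$ is
$$\operatorname{grad}\phi = \nabla\phi = \frac{\partial\phi}{\partial x}\mathbf{i} + \frac{\partial\phi}{\partial y}\mathbf{j} + \frac{\partial\phi}{\partial z}\mathbf{k}.$$
Often, instead of grad φ, the notation ∇φ is used. (∇ is a vector differential operator called ‘del’ or
‘nabla’, defined by
$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}.$$
As a vector differential operator, it retains the characteristics
of a vector while also carrying out differentiation.)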
The vector grad φ gives the magnitude and direction of the greatest rate of change of φ at any
point, and is always orthogonal to the contours of φ. For example, in Figure 14, grad φ points in
the direction of AB while the contour line is parallel to AD i.e. perpendicular to AB. Similarly, in
Figure 15(b), the various intersections of the contours with the lines representing grad φ occur at
right-angles.
For the hill considered earlier the direction and magnitude of grad φ are shown at various points
in Figure 16. Note that the magnitude of grad φ is greatest (as indicated by the length of the arrow)
when the hill is at its steepest (as indicated by the closeness of the contours).
Figure 16: Grad φ and the steepest ascent direction for a hill (contours φ = 10, 20, 30, 40, 50, 60)
Key Point 3
φ is a scalar field but grad φ is a vector field.
Example 9
Find grad φ for
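(a) $\phi = x^2 - 3y$ (b) $\phi = xy^2z^3$
Solution
(a) $$\operatorname{grad}\phi = \frac{\partial}{\partial x}(x^2 - 3y)\,\mathbf{i} + \frac{\partial}{\partial y}(x^2 - 3y)\,\mathbf{j} + \frac{\partial}{\partial z}(x^2 - 3y)\,\mathbf{k} = 2x\mathbf{i} + (-3)\mathbf{j} + 0\mathbf{k} = 2x\mathbf{i} - 3\mathbf{j}$$
(b) $$\operatorname{grad}\phi = \frac{\partial}{\partial x}(xy^2z^3)\,\mathbf{i} + \frac{\partial}{\partial y}(xy^2z^3)\,\mathbf{j} + \frac{\partial}{\partial z}(xy^2z^3)\,\mathbf{k} = y^2z^3\mathbf{i} + 2xyz^3\mathbf{j} + 3xy^2z^2\mathbf{k}$$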
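A gradient found by hand can be checked numerically with central differences. The sketch below does this for Example 9(b) at the arbitrarily chosen point (1, 2, 3), where each component of the exact answer happens to equal 108. The helper `gradient` and the step size `h` are assumptions of this sketch and are not part of the Workbook.

```python
def gradient(f, point, h=1e-5):
    """Central-difference estimate of grad f at `point` (a tuple of coordinates)."""
    grad = []
    for axis in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return tuple(grad)


def phi(x, y, z):
    """Scalar field of Example 9(b)."""
    return x * y**2 * z**3


def exact_gradient(x, y, z):
    """Hand-calculated result: y^2 z^3 i + 2xyz^3 j + 3xy^2 z^2 k."""
    return (y**2 * z**3, 2 * x * y * z**3, 3 * x * y**2 * z**2)


point = (1.0, 2.0, 3.0)
print(gradient(phi, point))
print(exact_gradient(*point))
```

The two printed triples should agree to several decimal places; a large discrepancy would indicate an error in the hand calculation.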
Example 10
For $f = x^2 + y^2$ find grad f at the point A(1, 2). Show that the direction of
grad f is orthogonal to the contour at this point.
Solution
$$\operatorname{grad} f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} = 2x\mathbf{i} + 2y\mathbf{j} + 0\mathbf{k} = 2x\mathbf{i} + 2y\mathbf{j}$$
and at A(1, 2) this equals $2 \times 1\,\mathbf{i} + 2 \times 2\,\mathbf{j} = 2\mathbf{i} + 4\mathbf{j}$.
Since $f = x^2 + y^2$ the contours are defined by $x^2 + y^2 = \text{constant}$, so the contours are circles
centred at the origin. The vector grad f at A(1, 2) points directly away from the origin and hence
grad f and the contour are orthogonal; see Figure 17. Note that $\mathbf{r}(A) = \mathbf{i} + 2\mathbf{j} = \tfrac{1}{2}\operatorname{grad} f$.
Figure 17: Grad f is perpendicular to the contour lines
The rate of change of a function φ in a given direction (specified as a unit vector a) is given by the
scalar product (grad φ) · a. This scalar quantity is called the directional derivative.
Note:
• a along a contour implies a is perpendicular to grad φ which implies a · grad φ = 0.
• a perpendicular to a contour implies a · grad φ is a maximum.
Task
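Given $\phi = x^2y^2z^2$, find
(a) grad φ
(b) grad φ at (−1, 1, 1) and a unit vector in this direction.
(c) the derivative of φ at (2, 1, −1) in the direction of
(i) $\mathbf{i}$ (ii) $\mathbf{d} = \tfrac{3}{5}\mathbf{i} + \tfrac{4}{5}\mathbf{k}$.
Your solution
Answer
(a) $$\operatorname{grad}\phi = \frac{\partial\phi}{\partial x}\mathbf{i} + \frac{\partial\phi}{\partial y}\mathbf{j} + \frac{\partial\phi}{\partial z}\mathbf{k} = 2xy^2z^2\mathbf{i} + 2x^2yz^2\mathbf{j} + 2x^2y^2z\mathbf{k}$$
(b) At (−1, 1, 1), $\operatorname{grad}\phi = -2\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}$.
A unit vector in this direction is
$$\frac{\operatorname{grad}\phi}{|\operatorname{grad}\phi|} = \frac{-2\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}}{\sqrt{(-2)^2 + 2^2 + 2^2}} = \frac{1}{2\sqrt{3}}(-2\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}) = -\frac{1}{\sqrt{3}}\mathbf{i} + \frac{1}{\sqrt{3}}\mathbf{j} + \frac{1}{\sqrt{3}}\mathbf{k}$$
(c) At (2, 1, −1), $\operatorname{grad}\phi = 4\mathbf{i} + 8\mathbf{j} - 8\mathbf{k}$.
(i) To find the derivative of φ in the direction of i take the scalar product
$(4\mathbf{i} + 8\mathbf{j} - 8\mathbf{k}) \cdot \mathbf{i} = 4 \times 1 + 0 + 0 = 4$. So the derivative in the direction of i is 4.
(ii) To find the derivative of φ in the direction of $\mathbf{d} = \tfrac{3}{5}\mathbf{i} + \tfrac{4}{5}\mathbf{k}$ take the scalar product
$$(4\mathbf{i} + 8\mathbf{j} - 8\mathbf{k}) \cdot \left(\tfrac{3}{5}\mathbf{i} + \tfrac{4}{5}\mathbf{k}\right) = 4 \times \tfrac{3}{5} + 0 + (-8) \times \tfrac{4}{5} = \tfrac{12}{5} - \tfrac{32}{5} = -4.$$
So the derivative in the direction of d is −4.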
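The directional derivative can also be estimated directly, without forming grad φ, by differencing φ along the chosen unit vector. The following sketch applies this to part (c) of the Task; the function name `directional_derivative` and the step size `h` are illustrative assumptions.

```python
def phi(x, y, z):
    """Scalar field used in the Task: phi = x^2 y^2 z^2."""
    return x**2 * y**2 * z**2


def directional_derivative(f, point, direction, h=1e-5):
    """Central-difference rate of change of f at `point` along a unit vector `direction`."""
    forward = [p + h * d for p, d in zip(point, direction)]
    backward = [p - h * d for p, d in zip(point, direction)]
    return (f(*forward) - f(*backward)) / (2 * h)


point = (2.0, 1.0, -1.0)
along_i = directional_derivative(phi, point, (1.0, 0.0, 0.0))
along_d = directional_derivative(phi, point, (0.6, 0.0, 0.8))  # d = (3/5) i + (4/5) k
print(along_i, along_d)
```

The two estimates should be close to 4 and −4 respectively, in agreement with the scalar products (grad φ) · i and (grad φ) · d.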
Exercises
1. Find grad φ for the following scalar fields
(a) $\phi = y - x$, (b) $\phi = y - x^2$, (c) $\phi = x^2 + y^2 + z^2$.
2. Find grad φ for each of the following two-dimensional scalar fields given that $\mathbf{r} = x\mathbf{i} + y\mathbf{j}$ and
$r = \sqrt{x^2 + y^2}$ (you should express your answer in terms of $\mathbf{r}$).
(a) $\phi = r$, (b) $\phi = \ln r$, (c) $\phi = \dfrac{1}{r}$, (d) $\phi = r^n$.
3. If $\phi = x^3y^2z$, find
(a) ∇φ
(b) a unit vector normal to the contour at the point (1, 1, 1).
(c) the rate of change of φ at (1, 1, 1) in the direction of i.
(d) the rate of change of φ at (1, 1, 1) in the direction of the unit vector $\mathbf{n} = \dfrac{1}{\sqrt{3}}(\mathbf{i} + \mathbf{j} + \mathbf{k})$.
4. Find a unit vector which is normal to the sphere $x^2 + (y - 1)^2 + (z + 1)^2 = 2$ at the point
(0, 0, 0).
5. Find unit vectors normal to $\phi_1 = y - x^2$ and $\phi_2 = x + y - 2$. Hence find the angle between
the curves $y = x^2$ and $y = 2 - x$ at their point of intersection in the first quadrant.
Answers
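1. (a) $\dfrac{\partial}{\partial x}(y - x)\,\mathbf{i} + \dfrac{\partial}{\partial y}(y - x)\,\mathbf{j} = -\mathbf{i} + \mathbf{j}$
(b) $-2x\mathbf{i} + \mathbf{j}$
(c) $\left[\dfrac{\partial}{\partial x}(x^2 + y^2 + z^2)\right]\mathbf{i} + \left[\dfrac{\partial}{\partial y}(x^2 + y^2 + z^2)\right]\mathbf{j} + \left[\dfrac{\partial}{\partial z}(x^2 + y^2 + z^2)\right]\mathbf{k} = 2x\mathbf{i} + 2y\mathbf{j} + 2z\mathbf{k}$
2. (a) $\dfrac{\mathbf{r}}{r}$, (b) $\dfrac{\mathbf{r}}{r^2}$, (c) $-\dfrac{\mathbf{r}}{r^3}$, (d) $n r^{n-2}\mathbf{r}$
3. (a) $3x^2y^2z\,\mathbf{i} + 2x^3yz\,\mathbf{j} + x^3y^2\,\mathbf{k}$, (b) $\dfrac{1}{\sqrt{14}}(3\mathbf{i} + 2\mathbf{j} + \mathbf{k})$, (c) 3, (d) $2\sqrt{3}$
4. The vector field ∇φ where $\phi = x^2 + (y - 1)^2 + (z + 1)^2$ is $2x\mathbf{i} + 2(y - 1)\mathbf{j} + 2(z + 1)\mathbf{k}$.
The value that this vector field takes at the point (0, 0, 0) is $-2\mathbf{j} + 2\mathbf{k}$, which is a vector normal
to the sphere.
Dividing this vector by its magnitude forms a unit vector: $\dfrac{1}{\sqrt{2}}(-\mathbf{j} + \mathbf{k})$
5. 108° or 72° (intersect at (1, 1)) [At intersection, $\operatorname{grad}\phi_1 = -2\mathbf{i} + \mathbf{j}$ and $\operatorname{grad}\phi_2 = \mathbf{i} + \mathbf{j}$.]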
2. The divergence of a vector field

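Consider the vector field $\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}$.
In 3D Cartesian coordinates the divergence of $\mathbf{F}$ is defined to be
$$\operatorname{div}\mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}.$$
Note that $\mathbf{F}$ is a vector field but div F is a scalar.
In terms of the differential operator ∇, div F = ∇ · F since
$$\nabla\cdot\mathbf{F} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot(F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}) = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}.$$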
Physical Significance of the Divergence
The meaning of the divergence is most easily understood by considering the behaviour of a fluid
and hence is relevant to engineering topics such as thermodynamics. The divergence (of the vector
field representing velocity) at a point in a fluid (liquid or gas) is a measure of the rate per unit volume
at which the fluid is flowing away from the point. A negative divergence is a convergence indicating a
flow towards the point. Physically positive divergence means that either the fluid is expanding or that
fluid is being supplied by a source external to the field. Conversely convergence means a contraction
or the presence of a sink through which fluid is removed from the field. The lines of flow diverge
from a source and converge to a sink.
If there is no gain or loss of fluid anywhere then div v = 0 which is the equation of continuity
for an incompressible fluid.
The divergence also enters engineering topics such as electromagnetism. A magnetic field (B) has
the property ∇ · B = 0, that is there are no isolated sources or sinks of magnetic field (no magnetic
monopoles).
Key Point 4
F is a vector field but div F is a scalar field.
Example 11
Find the divergence of the following vector fields.
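(a) $\mathbf{F} = x^2\mathbf{i} + y^2\mathbf{j} + z^2\mathbf{k}$
(b) $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$
(c) $\mathbf{v} = -x\mathbf{i} + y\mathbf{j} + 2\mathbf{k}$
Solution
(a) $\operatorname{div}\mathbf{F} = \dfrac{\partial}{\partial x}(x^2) + \dfrac{\partial}{\partial y}(y^2) + \dfrac{\partial}{\partial z}(z^2) = 2x + 2y + 2z$
(b) $\operatorname{div}\mathbf{r} = \dfrac{\partial}{\partial x}(x) + \dfrac{\partial}{\partial y}(y) + \dfrac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$
(c) $\operatorname{div}\mathbf{v} = \dfrac{\partial}{\partial x}(-x) + \dfrac{\partial}{\partial y}(y) + \dfrac{\partial}{\partial z}(2) = -1 + 1 + 0 = 0$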
Example 12
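Find the value of $a$ for which $\mathbf{F} = (2x^2y + z^2)\mathbf{i} + (xy^2 - x^2z)\mathbf{j} + (axyz - 2x^2y^2)\mathbf{k}$
is incompressible.
Solution
$\mathbf{F}$ is incompressible if div F = 0.
$$\operatorname{div}\mathbf{F} = \frac{\partial}{\partial x}(2x^2y + z^2) + \frac{\partial}{\partial y}(xy^2 - x^2z) + \frac{\partial}{\partial z}(axyz - 2x^2y^2) = 4xy + 2xy + axy = (6 + a)xy$$
which is zero for all x and y if $a = -6$.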
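The result of Example 12 can be checked numerically by estimating div F with central differences for different values of a. In the sketch below the helper names `make_field` and `divergence`, the step size and the test point (1.5, −2, 0.7) are all assumptions made for illustration.

```python
def make_field(a):
    """Return the vector field of Example 12 for a given value of the constant a."""
    def F(x, y, z):
        return (2 * x**2 * y + z**2,
                x * y**2 - x**2 * z,
                a * x * y * z - 2 * x**2 * y**2)
    return F


def divergence(F, point, h=1e-5):
    """Central-difference estimate of dF1/dx + dF2/dy + dF3/dz at `point`."""
    total = 0.0
    for axis in range(3):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        total += (F(*forward)[axis] - F(*backward)[axis]) / (2 * h)
    return total


point = (1.5, -2.0, 0.7)
div_incompressible = divergence(make_field(-6), point)  # expected to vanish
div_other = divergence(make_field(0), point)            # (6 + 0) x y = -18 here
print(div_incompressible, div_other)
```

Only the choice a = −6 makes the divergence vanish at a general point; for any other value the estimate follows (6 + a)xy.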
Task
Find the divergence of the following vector field, in general terms and at the
point (1, 0, 3).
$\mathbf{F}_1 = x^3\mathbf{i} + y^3\mathbf{j} + z^3\mathbf{k}$
Your solution
Answer
$3x^2 + 3y^2 + 3z^2$, 30
Task
Find the divergence of $\mathbf{F}_2 = x^2y\,\mathbf{i} - 2xy^2\,\mathbf{j}$, in general terms and at (1, 0, 3).
Your solution
Answer
$-2xy$, 0
Task
Find the divergence of $\mathbf{F}_3 = x^2z\,\mathbf{i} - 2y^3z^3\,\mathbf{j} + xyz^2\,\mathbf{k}$, in general terms and
at the point (1, 0, 3).
Your solution
Answer
$2xz - 6y^2z^3 + 2xyz$, 6
3. The curl of a vector field

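The curl of the vector field given by $\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}$ is defined as the vector field
$$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{vmatrix}
= \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k}$$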
Physical significance of curl
The divergence of a vector field represents the outflow rate from a point; however the curl of a vector
field represents the rotation at a point.
Consider the flow of water down a river (Figure 18). The surface velocity v of the water is revealed
by watching a light floating object such as a leaf. You will notice two types of motion. First the
leaf floats down the river following the streamlines of v, but it may also rotate. This rotation may
be quite fast near the bank, but slow or zero in midstream. Rotation occurs when the velocity, and
hence the drag, is greater on one side of the leaf than the other.
Figure 18: Rotation of a leaf in a stream

Note that for a two-dimensional vector field, such as v described here, curl v is perpendicular to the
motion, and this is the direction of the axis about which the leaf rotates. The magnitude of curl v
is related to the speed of rotation.
For motion in three dimensions a particle will tend to rotate about the axis that points in the direction
of curl v, with its magnitude measuring the speed of rotation.
If, at any point P, curl v = 0 then there is no rotation at P and v is said to be irrotational at P. If
curl v = 0 at all points of the domain of v then the vector field is an irrotational vector field.
Key Point 5
Note that F is a vector field and that curl F is also a vector field.
Example 13
Find curl v for the following two-dimensional vector fields
(a) v = xi + 2j (b) v = −yi + xj
If v represents the surface velocity of the flow of water, describe the motion of a
floating leaf.
Solution

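(a) $$\nabla\times\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ x & 2 & 0 \end{vmatrix}
= \left(\frac{\partial}{\partial y}(0) - \frac{\partial}{\partial z}(2)\right)\mathbf{i} + \left(\frac{\partial}{\partial z}(x) - \frac{\partial}{\partial x}(0)\right)\mathbf{j} + \left(\frac{\partial}{\partial x}(2) - \frac{\partial}{\partial y}(x)\right)\mathbf{k} = \mathbf{0}$$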
A floating leaf will travel along the streamlines without rotating.
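(b) $$\nabla\times\mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ -y & x & 0 \end{vmatrix}
= \left(\frac{\partial}{\partial y}(0) - \frac{\partial}{\partial z}(x)\right)\mathbf{i} + \left(\frac{\partial}{\partial z}(-y) - \frac{\partial}{\partial x}(0)\right)\mathbf{j} + \left(\frac{\partial}{\partial x}(x) - \frac{\partial}{\partial y}(-y)\right)\mathbf{k}
= 0\mathbf{i} + 0\mathbf{j} + 2\mathbf{k} = 2\mathbf{k}$$
A floating leaf will travel along the streamlines (anti-clockwise around the origin) and will rotate
anti-clockwise (as seen from above).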
An analogy of the right-hand screw rule is that a positive (anti-clockwise) rotation in the xy plane
represents a positive z-component of the curl. Similar results apply for the other components.
Example 14
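(a) Find the curl of $\mathbf{u} = x^2\mathbf{i} + y^2\mathbf{j}$. When is $\mathbf{u}$ irrotational?
(b) Given $\mathbf{F} = (xy - xz)\mathbf{i} + 3x^2\mathbf{j} + yz\mathbf{k}$, find curl F at the origin (0, 0, 0)
and at the point P = (1, 2, 3).
Solution
(a) $$\operatorname{curl}\mathbf{u} = \nabla\times\mathbf{u} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ x^2 & y^2 & 0 \end{vmatrix}
= \left(\frac{\partial}{\partial y}(0) - \frac{\partial}{\partial z}(y^2)\right)\mathbf{i} + \left(\frac{\partial}{\partial z}(x^2) - \frac{\partial}{\partial x}(0)\right)\mathbf{j} + \left(\frac{\partial}{\partial x}(y^2) - \frac{\partial}{\partial y}(x^2)\right)\mathbf{k}
= 0\mathbf{i} + 0\mathbf{j} + 0\mathbf{k} = \mathbf{0}$$
curl u = 0 so u is irrotational everywhere.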
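(b) $$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ xy - xz & 3x^2 & yz \end{vmatrix}$$
$$= \left(\frac{\partial}{\partial y}(yz) - \frac{\partial}{\partial z}(3x^2)\right)\mathbf{i} + \left(\frac{\partial}{\partial z}(xy - xz) - \frac{\partial}{\partial x}(yz)\right)\mathbf{j} + \left(\frac{\partial}{\partial x}(3x^2) - \frac{\partial}{\partial y}(xy - xz)\right)\mathbf{k}
= z\mathbf{i} - x\mathbf{j} + 5x\mathbf{k}$$
At the point (0, 0, 0), curl F = 0. At the point (1, 2, 3), $\operatorname{curl}\mathbf{F} = 3\mathbf{i} - \mathbf{j} + 5\mathbf{k}$.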
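A curl calculated by hand can be checked with a central-difference estimate of each partial derivative in the determinant. The sketch below does this for the field of Example 14(b); the helper `curl` and its step size `h` are assumptions of this illustration rather than anything defined in the Workbook.

```python
def F(x, y, z):
    """Vector field of Example 14(b): (xy - xz) i + 3x^2 j + yz k."""
    return (x * y - x * z, 3 * x**2, y * z)


def curl(field, point, h=1e-5):
    """Central-difference estimate of curl(field) at `point`, as (i, j, k) components."""
    def d(component, axis):
        # Partial derivative of one component of the field along one coordinate axis.
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        return (field(*forward)[component] - field(*backward)[component]) / (2 * h)

    return (d(2, 1) - d(1, 2),   # dF3/dy - dF2/dz
            d(0, 2) - d(2, 0),   # dF1/dz - dF3/dx
            d(1, 0) - d(0, 1))   # dF2/dx - dF1/dy


curl_at_origin = curl(F, (0.0, 0.0, 0.0))
curl_at_P = curl(F, (1.0, 2.0, 3.0))
print(curl_at_origin, curl_at_P)
```

The estimates should be close to (0, 0, 0) at the origin and to (3, −1, 5) at P, matching zi − xj + 5xk evaluated at those points.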
Current associated with a magnetic field
Introduction
In a magnetic field $\mathbf{B}$, an associated current is given by:
$$\mathbf{I} = \frac{1}{\mu_0}(\nabla\times\mathbf{B})$$
Problem in words
Given the magnetic field B = B0 xk find the associated current I.
Figure 19: Magnetic field profile

We need to evaluate the curl of B.

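$$\nabla\times\mathbf{B} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ 0 & 0 & B_0x \end{vmatrix} = 0\mathbf{i} - B_0\mathbf{j} + 0\mathbf{k} = -B_0\mathbf{j}$$
and so $\mathbf{I} = -\dfrac{B_0}{\mu_0}\mathbf{j}$.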
Interpretation
The current is perpendicular to the field and to the direction of variation of the field.
Task
Find the curl of the following two-dimensional vector field (a) in general terms and
(b) at the point (1, 2).
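$\mathbf{F}_2 = y^2\mathbf{i} + xy\mathbf{j}$
Your solution
Answer
(a) $$\nabla\times\mathbf{F}_2 = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ y^2 & xy & 0 \end{vmatrix} = 0\mathbf{i} + 0\mathbf{j} + (y - 2y)\mathbf{k} = -y\mathbf{k}$$
(b) $-2\mathbf{k}$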
Exercises
1. Find the curl of each of the following two-dimensional vector fields. Give each in general terms
and also at the point (1, 2).
(a) $\mathbf{F}_1 = 2x\mathbf{i} + 2y\mathbf{j}$
(b) $\mathbf{F}_3 = x^2y^3\mathbf{i} - x^3y^2\mathbf{j}$
2. Find the curl of each of the following three-dimensional vector fields. Give each in general
terms and also at the point (2, 1, 3).
(a) $\mathbf{F}_1 = y^2z^3\mathbf{i} + 2xyz^3\mathbf{j} + 3xy^2z^2\mathbf{k}$
(b) $\mathbf{F}_2 = (xy + z^2)\mathbf{i} + x^2\mathbf{j} + (xz - 2)\mathbf{k}$
3. The surface water velocity on a straight uniform river 20 metres wide is modelled by the vector
$\mathbf{v} = \tfrac{1}{50}x(20 - x)\mathbf{j}$ where x is the distance from the west bank (see diagram).
[Diagram: river 20 m wide, with x measured across the river in the direction of i]
(a) Find the velocity v at each bank and at midstream.

(b) Find ∇ × v at each bank and at midstream.
4. The velocity field on the surface of an emptying bathroom sink can be modelled by two
functions, the first describing the swirling vortex of radius a near the plughole and the second
describing the more gently rotating fluid outside the vortex region. These functions are
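$$\mathbf{u}(x, y) = w(-y\mathbf{i} + x\mathbf{j}), \qquad \sqrt{x^2 + y^2} \le a$$
$$\mathbf{v}(x, y) = \frac{wa^2(-y\mathbf{i} + x\mathbf{j})}{x^2 + y^2}, \qquad \sqrt{x^2 + y^2} \ge a$$
Find (a) curl u and (b) curl v.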

Answers
1. (a) 0; (b) $-6x^2y^2\mathbf{k}$, $-24\mathbf{k}$
2. (a) 0; (b) zj + xk, 3j + 2k
3. (a) 0; 0; 2j, (b) +0.4k; −0.4k; 0
4. (a) 2wk; (b) 0
4. The Laplacian
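The Laplacian of a function φ is written as $\nabla^2\phi$ and is defined as: Laplacian φ = div grad φ, that is
$$\nabla^2\phi = \nabla\cdot\nabla\phi = \nabla\cdot\left(\frac{\partial\phi}{\partial x}\mathbf{i} + \frac{\partial\phi}{\partial y}\mathbf{j} + \frac{\partial\phi}{\partial z}\mathbf{k}\right) = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2}$$
The equation $\nabla^2\phi = 0$, that is $\dfrac{\partial^2\phi}{\partial x^2} + \dfrac{\partial^2\phi}{\partial y^2} + \dfrac{\partial^2\phi}{\partial z^2} = 0$, is known as Laplace’s equation and has
applications in many branches of engineering including Heat Flow, Electrical and Magnetic Fields
and Fluid Mechanics.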
Example 15
Find the Laplacian of $u = x^2y^2z + 2xz$.
Solution
$$\nabla^2u = \frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} + \frac{\partial^2u}{\partial z^2} = 2y^2z + 2x^2z + 0 = 2(x^2 + y^2)z$$
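The Laplacian in Example 15 can be checked with the standard second-difference approximation to each second partial derivative. The helper `laplacian`, its step size and the test point (1, 2, 3) are assumptions made for this sketch.

```python
def u(x, y, z):
    """Scalar field of Example 15: u = x^2 y^2 z + 2xz."""
    return x**2 * y**2 * z + 2 * x * z


def laplacian(f, point, h=1e-3):
    """Sum of second central differences of f along each coordinate axis."""
    centre = f(*point)
    total = 0.0
    for axis in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        total += (f(*forward) - 2 * centre + f(*backward)) / h**2
    return total


point = (1.0, 2.0, 3.0)
numeric = laplacian(u, point)
exact = 2 * (point[0]**2 + point[1]**2) * point[2]  # 2(x^2 + y^2) z
print(numeric, exact)
```

At (1, 2, 3) the exact value is 2(1 + 4) × 3 = 30, and the numerical estimate should agree with it closely.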
5. Examples involving grad, div, curl and the Laplacian
The vector differential operators can be combined in several ways as the following examples show.
Example 16
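If $\mathbf{A} = 2yz\mathbf{i} - x^2y\mathbf{j} + xz^2\mathbf{k}$, $\mathbf{B} = x^2\mathbf{i} + yz\mathbf{j} - xy\mathbf{k}$ and $\phi = 2x^2yz^3$, find
(a) $(\mathbf{A}\cdot\nabla)\phi$ (b) $\mathbf{A}\cdot\nabla\phi$ (c) $\mathbf{B}\times\nabla\phi$ (d) $\nabla^2\phi$
Solution
(a)
$$\begin{aligned}
(\mathbf{A}\cdot\nabla)\phi &= \left[(2yz\mathbf{i} - x^2y\mathbf{j} + xz^2\mathbf{k})\cdot\left(\frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k}\right)\right]\phi \\
&= \left(2yz\frac{\partial}{\partial x} - x^2y\frac{\partial}{\partial y} + xz^2\frac{\partial}{\partial z}\right)2x^2yz^3 \\
&= 2yz\frac{\partial}{\partial x}(2x^2yz^3) - x^2y\frac{\partial}{\partial y}(2x^2yz^3) + xz^2\frac{\partial}{\partial z}(2x^2yz^3) \\
&= 2yz(4xyz^3) - x^2y(2x^2z^3) + xz^2(6x^2yz^2) \\
&= 8xy^2z^4 - 2x^4yz^3 + 6x^3yz^4
\end{aligned}$$
(b)
$$\nabla\phi = \frac{\partial}{\partial x}(2x^2yz^3)\mathbf{i} + \frac{\partial}{\partial y}(2x^2yz^3)\mathbf{j} + \frac{\partial}{\partial z}(2x^2yz^3)\mathbf{k} = 4xyz^3\mathbf{i} + 2x^2z^3\mathbf{j} + 6x^2yz^2\mathbf{k}$$
So
$$\mathbf{A}\cdot\nabla\phi = (2yz\mathbf{i} - x^2y\mathbf{j} + xz^2\mathbf{k})\cdot(4xyz^3\mathbf{i} + 2x^2z^3\mathbf{j} + 6x^2yz^2\mathbf{k}) = 8xy^2z^4 - 2x^4yz^3 + 6x^3yz^4$$
which agrees with the result of part (a).
(c) $\nabla\phi = 4xyz^3\mathbf{i} + 2x^2z^3\mathbf{j} + 6x^2yz^2\mathbf{k}$ so
$$\mathbf{B}\times\nabla\phi = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x^2 & yz & -xy \\ 4xyz^3 & 2x^2z^3 & 6x^2yz^2 \end{vmatrix}
= \mathbf{i}(6x^2y^2z^3 + 2x^3yz^3) + \mathbf{j}(-4x^2y^2z^3 - 6x^4yz^2) + \mathbf{k}(2x^4z^3 - 4xy^2z^4)$$
(d) $$\nabla^2\phi = \frac{\partial^2}{\partial x^2}(2x^2yz^3) + \frac{\partial^2}{\partial y^2}(2x^2yz^3) + \frac{\partial^2}{\partial z^2}(2x^2yz^3) = 4yz^3 + 0 + 12x^2yz$$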
Example 17
For each of the expressions below determine whether the quantity can be formed
and, if so, whether it is a scalar or a vector.
(a) grad(div A)
(b) grad(grad φ)
(c) curl(div F )
(d) div [ curl (A×grad φ) ]
Solution
(a) A is a vector and div A can be calculated and is a scalar. Hence, grad(div A) can be
formed and is a vector.
(b) φ is a scalar so grad φ can be formed and is a vector. As grad φ is a vector, it is not
possible to take grad(grad φ).
(c) F is a vector and hence div F is a scalar. It is not possible to take the curl of a scalar
so curl(div F ) does not exist.
(d) φ is a scalar so grad φ exists and is a vector. A×grad φ exists and is also a vector, as is
curl (A×grad φ). The divergence can be taken of this last vector to give
div [ curl (A×grad φ) ] which is a scalar.
6. Identities involving grad, div and curl

There are numerous identities involving the vector derivatives; a selection are given in Table 1.
Table 1
1 div (φA) = grad φ · A + φ div A or ∇ · (φA) = (∇φ) · A + φ(∇ · A)
2 curl (φA) = grad φ × A + φ curl A or ∇ × (φA) = (∇φ) × A + φ(∇ × A)
3 div (A × B) = B · curl A − A · curl B or ∇ · (A × B) = B · (∇ × A) − A · (∇ × B)
4 curl (A × B) = (B · grad) A − (A · grad) B + A div B − B div A or ∇ × (A × B) = (B · ∇)A − (A · ∇)B + A(∇ · B) − B(∇ · A)
5 grad (A · B) = (B · grad) A + (A · grad) B + A × curl B + B × curl A or ∇(A · B) = (B · ∇)A + (A · ∇)B + A × (∇ × B) + B × (∇ × A)
6 curl (grad φ) = 0 or ∇ × (∇φ) = 0
7 div (curl A) = 0 or ∇ · (∇ × A) = 0
Example 18
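Show for any vector field $\mathbf{A} = A_1\mathbf{i} + A_2\mathbf{j} + A_3\mathbf{k}$, that div curl A = 0.
Solution
$$\begin{aligned}
\operatorname{div}\operatorname{curl}\mathbf{A} &= \operatorname{div}\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ A_1 & A_2 & A_3 \end{vmatrix} \\
&= \operatorname{div}\left[\left(\frac{\partial A_3}{\partial y} - \frac{\partial A_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial A_1}{\partial z} - \frac{\partial A_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y}\right)\mathbf{k}\right] \\
&= \frac{\partial}{\partial x}\left(\frac{\partial A_3}{\partial y} - \frac{\partial A_2}{\partial z}\right) + \frac{\partial}{\partial y}\left(\frac{\partial A_1}{\partial z} - \frac{\partial A_3}{\partial x}\right) + \frac{\partial}{\partial z}\left(\frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y}\right) \\
&= \frac{\partial^2A_3}{\partial x\partial y} - \frac{\partial^2A_2}{\partial x\partial z} + \frac{\partial^2A_1}{\partial y\partial z} - \frac{\partial^2A_3}{\partial y\partial x} + \frac{\partial^2A_2}{\partial z\partial x} - \frac{\partial^2A_1}{\partial z\partial y} \\
&= 0
\end{aligned}$$
N.B. This assumes $\dfrac{\partial^2A_3}{\partial x\partial y} = \dfrac{\partial^2A_3}{\partial y\partial x}$ etc.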
Example 19
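Verify identity 1 for the vector $\mathbf{A} = 2xy\mathbf{i} - 3z\mathbf{k}$ and the function $\phi = xy^2$.
Solution
$\phi\mathbf{A} = 2x^2y^3\mathbf{i} - 3xy^2z\mathbf{k}$ so
$$\nabla\cdot(\phi\mathbf{A}) = \nabla\cdot(2x^2y^3\mathbf{i} - 3xy^2z\mathbf{k}) = \frac{\partial}{\partial x}(2x^2y^3) + \frac{\partial}{\partial z}(-3xy^2z) = 4xy^3 - 3xy^2$$
So LHS $= 4xy^3 - 3xy^2$.
$$\nabla\phi = \frac{\partial}{\partial x}(xy^2)\mathbf{i} + \frac{\partial}{\partial y}(xy^2)\mathbf{j} + \frac{\partial}{\partial z}(xy^2)\mathbf{k} = y^2\mathbf{i} + 2xy\mathbf{j}$$
so $(\nabla\phi)\cdot\mathbf{A} = (y^2\mathbf{i} + 2xy\mathbf{j})\cdot(2xy\mathbf{i} - 3z\mathbf{k}) = 2xy^3$.
$\nabla\cdot\mathbf{A} = \nabla\cdot(2xy\mathbf{i} - 3z\mathbf{k}) = 2y - 3$ so $\phi\,\nabla\cdot\mathbf{A} = 2xy^3 - 3xy^2$, giving
$$(\nabla\phi)\cdot\mathbf{A} + \phi(\nabla\cdot\mathbf{A}) = 2xy^3 + (2xy^3 - 3xy^2) = 4xy^3 - 3xy^2$$
So RHS $= 4xy^3 - 3xy^2 =$ LHS.
So $\nabla\cdot(\phi\mathbf{A}) = (\nabla\phi)\cdot\mathbf{A} + \phi(\nabla\cdot\mathbf{A})$ in this case.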
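Identity 1 can also be spot-checked numerically at a single point by estimating every derivative with central differences. The sketch below uses the field and function of Example 19; the helper names, the step size and the test point (1.2, 0.5, −0.7) are assumptions, and agreement at one point illustrates the identity rather than proving it.

```python
def phi(x, y, z):
    """Scalar function of Example 19: phi = x y^2."""
    return x * y**2


def A(x, y, z):
    """Vector field of Example 19: A = 2xy i - 3z k."""
    return (2 * x * y, 0.0, -3 * z)


def phi_A(x, y, z):
    """The product field phi * A."""
    return tuple(phi(x, y, z) * component for component in A(x, y, z))


def partial(f, point, axis, h=1e-5):
    """Central-difference partial derivative of a scalar function f along `axis`."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (f(*forward) - f(*backward)) / (2 * h)


def divergence(field, point):
    """Sum of d(field_i)/d(x_i) over the three coordinates."""
    return sum(partial(lambda *p, i=i: field(*p)[i], point, i) for i in range(3))


def gradient(scalar, point):
    """Tuple of the three partial derivatives of a scalar function."""
    return tuple(partial(scalar, point, i) for i in range(3))


point = (1.2, 0.5, -0.7)
lhs = divergence(phi_A, point)
rhs = (sum(g * a for g, a in zip(gradient(phi, point), A(*point)))
       + phi(*point) * divergence(A, point))
print(lhs, rhs)
```

Both sides should be close to 4xy³ − 3xy² evaluated at the test point, which is −0.3.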
Task
If $\mathbf{F} = x^2y\mathbf{i} - 2xz\mathbf{j} + 2yz\mathbf{k}$, find
(a) ∇ · F
(b) ∇ × F
(c) ∇(∇ · F )
(d) ∇ · (∇ × F )
(e) ∇ × (∇ × F )
Your solution
Answer
(a) 2xy + 2y,
(b) $(2x + 2z)\mathbf{i} - (x^2 + 2z)\mathbf{k}$,
(c) 2yi + (2 + 2x)j (using answer to (a)),
(d) 0 (using answer to (b)),
(e) (2 + 2x)j (using answer to (b))
Task
If $\phi = 2xz - y^2z$, find
(a) $\nabla\phi$
(b) $\nabla^2\phi = \nabla\cdot(\nabla\phi)$
(c) $\nabla\times(\nabla\phi)$
Your solution
Answer
(a) $2z\mathbf{i} - 2yz\mathbf{j} + (2x - y^2)\mathbf{k}$, (b) $-2z$, (c) $\mathbf{0}$ where (b) and (c) use the answer to (a).
Exercise
Which of the following combinations of grad, div and curl can be formed? If a quantity can be
formed, state whether it is a scalar or a vector.
(a) div (grad φ)

(b) div (div A)
(c) curl (curl F )
(d) div (curl F )
(e) curl (grad φ)
(f) curl (div A)
(g) div (A · B)
(h) grad (φ1 φ2 )
(i) curl (div (A× grad φ))
Answers
(a), (d) are scalars;
(c), (e), (h) are vectors;
(b), (f), (g) and (i) are not defined.
Orthogonal
Curvilinear
Coordinates 28.3
Introduction
The derivatives div, grad and curl from Section 28.2 can be carried out using coordinate systems other
than the rectangular Cartesian coordinates. This Section shows how to calculate these derivatives in
other coordinate systems. Two coordinate systems - cylindrical polar coordinates and spherical polar
coordinates - will be illustrated.

Prerequisites
Before starting this Section you should . . .
• be able to find the gradient, divergence and curl of a field in Cartesian coordinates
• be familiar with polar coordinates

Learning Outcomes
On completion you should be able to . . .
• find the divergence, gradient or curl of a vector or scalar field expressed in terms of orthogonal curvilinear coordinates

1. Orthogonal curvilinear coordinates
The results shown in Section 28.2 have been given in terms of the familiar Cartesian (x, y, z)
coordinate system. However, other coordinate systems can be used to better describe some physical
situations. A set of coordinates u = u(x, y, z), v = v(x, y, z) and w = w(x, y, z) where the
directions at any point indicated by u, v and w are orthogonal (perpendicular) to each other is referred to
as a set of orthogonal curvilinear coordinates. With each coordinate is associated a scale factor
$h_u$, $h_v$ or $h_w$ respectively, where
$$h_u = \sqrt{\left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2 + \left(\frac{\partial z}{\partial u}\right)^2}$$
(with similar expressions for $h_v$ and $h_w$). The scale factor gives a measure of how a change in the
coordinate changes the position of a point.
Two commonly-used sets of orthogonal curvilinear coordinates are cylindrical polar coordinates
and spherical polar coordinates. These are similar to the plane polar coordinates introduced in
17.2 but represent extensions to three dimensions.
Cylindrical polar coordinates

This corresponds to plane polar (ρ, φ) coordinates with an added z-coordinate directed out of the
xy plane. Normally the variables ρ and φ are used instead of r and θ to give the three coordinates
ρ, φ and z. A cylinder has equation ρ = constant.
The relationship between the coordinate systems is given by
$$x = \rho\cos\phi, \qquad y = \rho\sin\phi, \qquad z = z$$
(i.e. the same z is used by the two coordinate systems). See Figure 20(a).
Figure 20: Cylindrical polar coordinates. (a) The coordinates ρ, φ and z of the point (x, y, z) (b) The unit vectors ρ̂, φ̂ and k̂ at the point (x, y, z)
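The scale factors $h_\rho$, $h_\phi$ and $h_z$ are given as follows
$$h_\rho = \sqrt{\left(\frac{\partial x}{\partial\rho}\right)^2 + \left(\frac{\partial y}{\partial\rho}\right)^2 + \left(\frac{\partial z}{\partial\rho}\right)^2} = \sqrt{(\cos\phi)^2 + (\sin\phi)^2 + 0} = 1$$
$$h_\phi = \sqrt{\left(\frac{\partial x}{\partial\phi}\right)^2 + \left(\frac{\partial y}{\partial\phi}\right)^2 + \left(\frac{\partial z}{\partial\phi}\right)^2} = \sqrt{(-\rho\sin\phi)^2 + (\rho\cos\phi)^2 + 0} = \rho$$
$$h_z = \sqrt{\left(\frac{\partial x}{\partial z}\right)^2 + \left(\frac{\partial y}{\partial z}\right)^2 + \left(\frac{\partial z}{\partial z}\right)^2} = \sqrt{0^2 + 0^2 + 1^2} = 1$$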
Spherical polar coordinates

In this system a point is referred to by its distance from the origin r and two angles φ and θ. The
angle θ is the angle between the positive z-axis and the line from the origin to the point. The angle
φ is the angle from the x-axis to the projection of the point in the xy plane.
A useful analogy is of latitude, longitude and height on Earth.
• The variable r plays the role of height (but height measured above the centre of Earth rather
than from the surface).
• The variable θ plays the role of latitude but is modified so that θ = 0 represents the North
Pole, θ = 90° = π/2 represents the equator and θ = 180° = π represents the South Pole.
• The variable φ plays the role of longitude.
A sphere has equation r = constant.

The relationship between the coordinate systems is given by
$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$
See Figure 21.
Figure 21: Spherical polar coordinates (r, θ and φ for the point (x, y, z))
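The scale factors $h_r$, $h_\theta$ and $h_\phi$ are given by
$$h_r = \sqrt{\left(\frac{\partial x}{\partial r}\right)^2 + \left(\frac{\partial y}{\partial r}\right)^2 + \left(\frac{\partial z}{\partial r}\right)^2} = \sqrt{(\sin\theta\cos\phi)^2 + (\sin\theta\sin\phi)^2 + (\cos\theta)^2} = 1$$
$$h_\theta = \sqrt{\left(\frac{\partial x}{\partial\theta}\right)^2 + \left(\frac{\partial y}{\partial\theta}\right)^2 + \left(\frac{\partial z}{\partial\theta}\right)^2} = \sqrt{(r\cos\theta\cos\phi)^2 + (r\cos\theta\sin\phi)^2 + (-r\sin\theta)^2} = r$$
$$h_\phi = \sqrt{\left(\frac{\partial x}{\partial\phi}\right)^2 + \left(\frac{\partial y}{\partial\phi}\right)^2 + \left(\frac{\partial z}{\partial\phi}\right)^2} = \sqrt{(-r\sin\theta\sin\phi)^2 + (r\sin\theta\cos\phi)^2 + 0} = r\sin\theta$$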
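The scale factors can be estimated numerically straight from the coordinate transformation, which gives a quick check on the closed forms for both cylindrical and spherical polar coordinates. The function names, the step size and the sample coordinate values below are assumptions made for this sketch.

```python
import math


def cylindrical_to_cartesian(rho, phi, z):
    """x = rho cos(phi), y = rho sin(phi), z = z."""
    return (rho * math.cos(phi), rho * math.sin(phi), z)


def spherical_to_cartesian(r, theta, phi):
    """x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def scale_factors(transform, coords, h=1e-6):
    """Length of the derivative of (x, y, z) with respect to each coordinate in turn."""
    factors = []
    for axis in range(3):
        forward = list(coords)
        backward = list(coords)
        forward[axis] += h
        backward[axis] -= h
        plus = transform(*forward)
        minus = transform(*backward)
        derivatives = [(a - b) / (2 * h) for a, b in zip(plus, minus)]
        factors.append(math.sqrt(sum(d * d for d in derivatives)))
    return tuple(factors)


cyl = scale_factors(cylindrical_to_cartesian, (3.0, 0.4, 5.0))  # expect (1, rho, 1)
sph = scale_factors(spherical_to_cartesian, (2.0, 0.8, 1.1))    # expect (1, r, r sin(theta))
print(cyl, sph)
```

For the sample values the estimates should be close to (1, 3, 1) in the cylindrical case and to (1, 2, 2 sin 0.8) in the spherical case.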
2. Vector derivatives in orthogonal coordinates
Given an orthogonal coordinate system u, v, w with unit vectors û, v̂ and ŵ and scale factors, hu ,
hv and hw , it is possible to find the derivatives ∇f , ∇ · F and ∇ × F .
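It is found that
$$\operatorname{grad} f = \nabla f = \frac{1}{h_u}\frac{\partial f}{\partial u}\hat{u} + \frac{1}{h_v}\frac{\partial f}{\partial v}\hat{v} + \frac{1}{h_w}\frac{\partial f}{\partial w}\hat{w}$$
If $\mathbf{F} = F_u\hat{u} + F_v\hat{v} + F_w\hat{w}$
then
$$\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{1}{h_uh_vh_w}\left[\frac{\partial}{\partial u}(F_uh_vh_w) + \frac{\partial}{\partial v}(F_vh_uh_w) + \frac{\partial}{\partial w}(F_wh_uh_v)\right]$$
Also if $\mathbf{F} = F_u\hat{u} + F_v\hat{v} + F_w\hat{w}$
then
$$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \frac{1}{h_uh_vh_w}\begin{vmatrix} h_u\hat{u} & h_v\hat{v} & h_w\hat{w} \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\ h_uF_u & h_vF_v & h_wF_w \end{vmatrix}$$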
Key Point 6
In orthogonal curvilinear coordinates, the vector derivatives ∇f , ∇ · F and ∇ × F include the scale
factors hu , hv and hw .
3. Cylindrical polar coordinates

In cylindrical polar coordinates (ρ, φ, z), the three unit vectors are ρ̂, φ̂ and ẑ (see Figure 20(b) on
page 38) with scale factors
hρ = 1, hφ = ρ, hz = 1.
The quantities ρ and φ are related to x and y by x = ρ cos φ and y = ρ sin φ. The unit vectors are
ρ̂ = cos φi + sin φj and φ̂ = − sin φi + cos φj. In cylindrical polar coordinates,
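$$\operatorname{grad} f = \nabla f = \frac{\partial f}{\partial\rho}\hat{\rho} + \frac{1}{\rho}\frac{\partial f}{\partial\phi}\hat{\phi} + \frac{\partial f}{\partial z}\hat{z}$$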
The scale factor ρ is necessary in the φ-component because the derivatives with respect to φ are
distorted by the distance from the axis ρ = 0.
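If $\mathbf{F} = F_\rho\hat{\rho} + F_\phi\hat{\phi} + F_z\hat{z}$ then
$$\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho F_\rho) + \frac{\partial}{\partial\phi}(F_\phi) + \frac{\partial}{\partial z}(\rho F_z)\right]$$
$$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ F_\rho & \rho F_\phi & F_z \end{vmatrix}.$$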
Example 20
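Working in cylindrical polar coordinates, find ∇f for $f = \rho^2 + z^2$
Solution
If $f = \rho^2 + z^2$ then $\dfrac{\partial f}{\partial\rho} = 2\rho$, $\dfrac{\partial f}{\partial\phi} = 0$ and $\dfrac{\partial f}{\partial z} = 2z$ so $\nabla f = 2\rho\hat{\rho} + 2z\hat{z}$.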
Example 21
Working in cylindrical polar coordinates find
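(a) ∇f for $f = \rho^3\sin\phi$
(b) Show that the result for (a) is consistent with that found working in
Cartesian coordinates.
Solution
(a) If $f = \rho^3\sin\phi$ then $\dfrac{\partial f}{\partial\rho} = 3\rho^2\sin\phi$, $\dfrac{\partial f}{\partial\phi} = \rho^3\cos\phi$ and $\dfrac{\partial f}{\partial z} = 0$ and hence, using the factor $\dfrac{1}{\rho}$ in the φ-component,
$\nabla f = 3\rho^2\sin\phi\,\hat{\rho} + \rho^2\cos\phi\,\hat{\phi}$.
(b) $f = \rho^3\sin\phi = \rho^2\,\rho\sin\phi = (x^2 + y^2)y = x^2y + y^3$ so $\nabla f = 2xy\mathbf{i} + (x^2 + 3y^2)\mathbf{j}$.
Using cylindrical polar coordinates, from (a) we have
$$\begin{aligned}
\nabla f &= 3\rho^2\sin\phi\,\hat{\rho} + \rho^2\cos\phi\,\hat{\phi} \\
&= 3\rho^2\sin\phi(\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) + \rho^2\cos\phi(-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}) \\
&= (3\rho^2\sin\phi\cos\phi - \rho^2\sin\phi\cos\phi)\mathbf{i} + (3\rho^2\sin^2\phi + \rho^2\cos^2\phi)\mathbf{j} \\
&= 2\rho^2\sin\phi\cos\phi\,\mathbf{i} + (3\rho^2\sin^2\phi + \rho^2\cos^2\phi)\mathbf{j} = 2xy\mathbf{i} + (3y^2 + x^2)\mathbf{j}
\end{aligned}$$
So the results using Cartesian and cylindrical polar coordinates are consistent.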
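The consistency check in Example 21(b) can be repeated numerically: evaluate the cylindrical components of ∇f, rotate them into i and j components using ρ̂ = cos φ i + sin φ j and φ̂ = −sin φ i + cos φ j, and compare with the Cartesian gradient. The function names and the sample values ρ = 1.7, φ = 0.6 are assumptions of this sketch.

```python
import math


def grad_cylindrical(rho, phi):
    """(rho-hat, phi-hat) components of grad f for f = rho^3 sin(phi), from part (a)."""
    return (3 * rho**2 * math.sin(phi), rho**2 * math.cos(phi))


def to_cartesian(components, phi):
    """Convert (rho-hat, phi-hat) components into (i, j) components."""
    f_rho, f_phi = components
    return (f_rho * math.cos(phi) - f_phi * math.sin(phi),
            f_rho * math.sin(phi) + f_phi * math.cos(phi))


def grad_cartesian(x, y):
    """Gradient of f = x^2 y + y^3 worked out in Cartesian coordinates."""
    return (2 * x * y, x**2 + 3 * y**2)


rho, phi = 1.7, 0.6
x, y = rho * math.cos(phi), rho * math.sin(phi)
converted = to_cartesian(grad_cylindrical(rho, phi), phi)
direct = grad_cartesian(x, y)
print(converted, direct)
```

The two pairs should agree to rounding error, whichever sample point is used.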
Example 22
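Find ∇ · F for $\mathbf{F} = F_\rho\hat{\rho} + F_\phi\hat{\phi} + F_z\hat{z} = \rho^3\hat{\rho} + \rho z\hat{\phi} + \rho z\sin\phi\,\hat{z}$. Show that the
results are consistent with those found using Cartesian coordinates.
Solution
Here, $F_\rho = \rho^3$, $F_\phi = \rho z$ and $F_z = \rho z\sin\phi$ so
$$\begin{aligned}
\nabla\cdot\mathbf{F} &= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho F_\rho) + \frac{\partial}{\partial\phi}(F_\phi) + \frac{\partial}{\partial z}(\rho F_z)\right] \\
&= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho^4) + \frac{\partial}{\partial\phi}(\rho z) + \frac{\partial}{\partial z}(\rho^2z\sin\phi)\right] \\
&= \frac{1}{\rho}\left[4\rho^3 + 0 + \rho^2\sin\phi\right] \\
&= 4\rho^2 + \rho\sin\phi
\end{aligned}$$
Converting to Cartesian coordinates,
$$\begin{aligned}
\mathbf{F} &= F_\rho\hat{\rho} + F_\phi\hat{\phi} + F_z\hat{z} = \rho^3\hat{\rho} + \rho z\hat{\phi} + \rho z\sin\phi\,\hat{z} \\
&= \rho^3(\cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}) + \rho z(-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}) + \rho z\sin\phi\,\mathbf{k} \\
&= (\rho^3\cos\phi - \rho z\sin\phi)\mathbf{i} + (\rho^3\sin\phi + \rho z\cos\phi)\mathbf{j} + \rho z\sin\phi\,\mathbf{k} \\
&= \left[\rho^2(\rho\cos\phi) - \rho\sin\phi\,z\right]\mathbf{i} + \left[\rho^2(\rho\sin\phi) + \rho\cos\phi\,z\right]\mathbf{j} + \rho\sin\phi\,z\,\mathbf{k} \\
&= \left[(x^2 + y^2)x - yz\right]\mathbf{i} + \left[(x^2 + y^2)y + xz\right]\mathbf{j} + yz\mathbf{k} \\
&= (x^3 + xy^2 - yz)\mathbf{i} + (x^2y + y^3 + xz)\mathbf{j} + yz\mathbf{k}
\end{aligned}$$
So
$$\begin{aligned}
\nabla\cdot\mathbf{F} &= \frac{\partial}{\partial x}(x^3 + xy^2 - yz) + \frac{\partial}{\partial y}(x^2y + y^3 + xz) + \frac{\partial}{\partial z}(yz) \\
&= (3x^2 + y^2) + (x^2 + 3y^2) + y = 4x^2 + 4y^2 + y \\
&= 4(x^2 + y^2) + y \\
&= 4\rho^2 + \rho\sin\phi
\end{aligned}$$
So ∇ · F is the same in both coordinate systems.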
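The cylindrical divergence formula can be evaluated numerically by replacing each derivative with a central difference. The sketch below applies it to the field of Example 22 and compares the result with 4ρ² + ρ sin φ; the helper `div_cylindrical`, the step size and the sample point are assumptions made for illustration.

```python
import math


def F_cyl(rho, phi, z):
    """Components (F_rho, F_phi, F_z) of the field in Example 22."""
    return (rho**3, rho * z, rho * z * math.sin(phi))


def div_cylindrical(F, rho, phi, z, h=1e-5):
    """(1/rho) [ d(rho F_rho)/d rho + d(F_phi)/d phi + d(rho F_z)/dz ] by central differences."""
    d_rho = ((rho + h) * F(rho + h, phi, z)[0]
             - (rho - h) * F(rho - h, phi, z)[0]) / (2 * h)
    d_phi = (F(rho, phi + h, z)[1] - F(rho, phi - h, z)[1]) / (2 * h)
    d_z = rho * (F(rho, phi, z + h)[2] - F(rho, phi, z - h)[2]) / (2 * h)
    return (d_rho + d_phi + d_z) / rho


rho, phi, z = 1.5, 0.9, 2.0
numeric = div_cylindrical(F_cyl, rho, phi, z)
closed_form = 4 * rho**2 + rho * math.sin(phi)
print(numeric, closed_form)
```

The two values should agree closely for any sample point with ρ > 0; the formula is not usable on the axis ρ = 0 because of the factor 1/ρ.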
Example 23
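Find ∇ × F for $\mathbf{F} = \rho^2\hat{\rho} + z\sin\phi\,\hat{\phi} + 2z\cos\phi\,\hat{z}$.
Solution
$$\begin{aligned}
\nabla\times\mathbf{F} &= \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ F_\rho & \rho F_\phi & F_z \end{vmatrix} = \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ \rho^2 & \rho z\sin\phi & 2z\cos\phi \end{vmatrix} \\
&= \frac{1}{\rho}\left[\hat{\rho}\left(\frac{\partial}{\partial\phi}(2z\cos\phi) - \frac{\partial}{\partial z}(\rho z\sin\phi)\right) + \rho\hat{\phi}\left(\frac{\partial}{\partial z}(\rho^2) - \frac{\partial}{\partial\rho}(2z\cos\phi)\right) + \hat{z}\left(\frac{\partial}{\partial\rho}(\rho z\sin\phi) - \frac{\partial}{\partial\phi}(\rho^2)\right)\right] \\
&= \frac{1}{\rho}\left[\hat{\rho}(-2z\sin\phi - \rho\sin\phi) + \rho\hat{\phi}(0) + \hat{z}(z\sin\phi)\right] \\
&= -\frac{(2z + \rho)\sin\phi}{\rho}\,\hat{\rho} + \frac{z\sin\phi}{\rho}\,\hat{z}
\end{aligned}$$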
Divergence of a magnetic field
Introduction
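A magnetic field $\mathbf{B}$ must satisfy $\nabla\cdot\mathbf{B} = 0$. An associated current is given by:
$$\mathbf{I} = \frac{1}{\mu_0}(\nabla\times\mathbf{B})$$
Problem in words
For the magnetic field (in cylindrical polar coordinates ρ, φ, z)
$$\mathbf{B} = B_0\frac{\rho}{1 + \rho^2}\hat{\phi} + \alpha\hat{z}$$
show that the divergence of $\mathbf{B}$ is zero and find the associated current.
We must
(a) show that $\nabla\cdot\mathbf{B} = 0$ (b) find the current $\mathbf{I} = \dfrac{1}{\mu_0}(\nabla\times\mathbf{B})$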
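(a) Express $\mathbf{B}$ as $(B_\rho, B_\phi, B_z)$; then
$$\begin{aligned}
\nabla\cdot\mathbf{B} &= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho B_\rho) + \frac{\partial B_\phi}{\partial\phi} + \rho\frac{\partial B_z}{\partial z}\right] \\
&= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(0) + \frac{\partial}{\partial\phi}\left(B_0\frac{\rho}{1 + \rho^2}\right) + \rho\frac{\partial}{\partial z}(\alpha)\right] \\
&= \frac{1}{\rho}[0 + 0 + 0] = 0 \quad\text{as required.}
\end{aligned}$$
(b) To find the current evaluate
$$\begin{aligned}
\mathbf{I} = \frac{1}{\mu_0}(\nabla\times\mathbf{B}) &= \frac{1}{\mu_0}\frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ B_\rho & \rho B_\phi & B_z \end{vmatrix} = \frac{1}{\mu_0}\frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ 0 & B_0\dfrac{\rho^2}{1 + \rho^2} & \alpha \end{vmatrix} \\
&= \frac{1}{\mu_0\rho}\left[0\hat{\rho} + 0\rho\hat{\phi} + B_0\frac{\partial}{\partial\rho}\left(\frac{\rho^2}{1 + \rho^2}\right)\hat{z}\right] \\
&= \frac{1}{\mu_0\rho}\,\frac{2\rho}{(1 + \rho^2)^2}\,B_0\hat{z} = \frac{2B_0}{\mu_0(1 + \rho^2)^2}\,\hat{z}
\end{aligned}$$
where the quotient rule gives $\dfrac{\partial}{\partial\rho}\left(\dfrac{\rho^2}{1 + \rho^2}\right) = \dfrac{2\rho(1 + \rho^2) - 2\rho^3}{(1 + \rho^2)^2} = \dfrac{2\rho}{(1 + \rho^2)^2}$.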
Interpretation
The magnetic field is in the form of a helix with the current pointing along its axis (Fig 22). Such
an arrangement is often used for the magnetic containment of charged particles in a fusion reactor.
Figure 22: The magnetic field forms a helix (field B winding around the z-axis, current I along the z-axis)
Example 24
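A magnetic field $\mathbf{B}$ is given by $\mathbf{B} = \rho^{-2}\hat{\phi} + k\hat{z}$. Find ∇ · B and ∇ × B.
Solution
$$\nabla\cdot\mathbf{B} = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(0) + \frac{\partial}{\partial\phi}(\rho^{-2}) + \frac{\partial}{\partial z}(k\rho)\right] = \frac{1}{\rho}[0 + 0 + 0] = 0$$
$$\nabla\times\mathbf{B} = \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ B_\rho & \rho B_\phi & B_z \end{vmatrix} = \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\phi} & \hat{z} \\ \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ 0 & \rho^{-1} & k \end{vmatrix} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho^{-1})\,\hat{z} = -\rho^{-3}\hat{z}$$
All magnetic fields satisfy ∇ · B = 0, i.e. there are no magnetic monopoles. There is a class of
magnetic fields known as potential fields that also satisfy ∇ × B = 0; the field in this Example is not
one of them, since its curl is non-zero for finite ρ (the field $\rho^{-1}\hat{\phi} + k\hat{z}$, for which $\rho B_\phi = 1$, would be).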
Task
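Using cylindrical polar coordinates, find ∇f for f = ρ² z sin φ
Your solution
Answer
\[
\frac{\partial}{\partial\rho}[\rho^2 z\sin\phi]\,\hat{\rho} + \frac{1}{\rho}\frac{\partial}{\partial\phi}[\rho^2 z\sin\phi]\,\hat{\phi} + \frac{\partial}{\partial z}[\rho^2 z\sin\phi]\,\hat{z}
= 2\rho z\sin\phi\,\hat{\rho} + \rho z\cos\phi\,\hat{\phi} + \rho^2\sin\phi\,\hat{z}
\]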
Task
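Using cylindrical polar coordinates, find ∇f for f = z sin 2φ
Your solution
Answer
\[
\frac{\partial}{\partial\rho}[z\sin 2\phi]\,\hat{\rho} + \frac{1}{\rho}\frac{\partial}{\partial\phi}[z\sin 2\phi]\,\hat{\phi} + \frac{\partial}{\partial z}[z\sin 2\phi]\,\hat{z}
= \frac{2}{\rho}\,z\cos 2\phi\,\hat{\phi} + \sin 2\phi\,\hat{z}
\]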
Task
Find ∇ · F for F = ρ cos φ ρ̂ − ρ sin φ φ̂ + ρz ẑ
i.e. F_ρ = ρ cos φ, F_φ = −ρ sin φ, F_z = ρz
(a) First find the derivatives $\frac{\partial}{\partial\rho}[\rho F_\rho]$, $\frac{\partial}{\partial\phi}[F_\phi]$, $\frac{\partial}{\partial z}[\rho F_z]$:
Your solution
Answer
2ρ cos φ, −ρ cos φ, ρ²
(b) Now combine these to find ∇ · F :
Your solution
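Answer
\[
\begin{aligned}
\nabla\cdot\mathbf{F} &= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho F_\rho) + \frac{\partial}{\partial\phi}(F_\phi) + \frac{\partial}{\partial z}(\rho F_z)\right] \\
&= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho^2\cos\phi) + \frac{\partial}{\partial\phi}(-\rho\sin\phi) + \frac{\partial}{\partial z}(\rho^2 z)\right] \\
&= \frac{1}{\rho}\left[2\rho\cos\phi - \rho\cos\phi + \rho^2\right] \\
&= \cos\phi + \rho
\end{aligned}
\]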
Task
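Find ∇ × F for F = F_ρ ρ̂ + F_φ φ̂ + F_z ẑ = ρ³ ρ̂ + ρz φ̂ + ρz sin φ ẑ. Show that the
results are consistent with those found using Cartesian coordinates.
(a) Find the curl ∇ × F :

Your solution
Answer
\[
\frac{1}{\rho}
\begin{vmatrix}
\hat{\rho} & \rho\hat{\phi} & \hat{z} \\
\dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\
\rho^3 & \rho^2 z & \rho z\sin\phi
\end{vmatrix}
= (z\cos\phi - \rho)\,\hat{\rho} - z\sin\phi\,\hat{\phi} + 2z\,\hat{z}
\]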
(b) Find F in Cartesian coordinates:

Your solution
Answer
Use ρ̂ = cos φi+sin φj, φ̂ = − sin φi+cos φj to get F = (x3 +xy 2 −yz)i+(x2 y +y 3 +xz)j +yzk
(c) Hence find ∇ × F in Cartesian coordinates:

Your solution
Answer
(z − x)i − yj + 2zk
(d) Using ρ̂ = cos φi + sin φj and φ̂ = − sin φi + cos φj, show that the solution to part (a) is equal
to the solution for part (c):
Your solution
Exercises
1. For F = ρρ̂ + (ρ sin θ + z)φ̂ + ρz ẑ, find ∇ · F and ∇ × F .
2. For f = ρ2 z 2 cos 2φ, find ∇ × (∇f ).

Answers
1. 1 + cos θ + ρ, −ρ̂ − z cos θφ̂ + (2ρ sin φ + z)ẑ
2. 0
4. Spherical polar coordinates

In spherical polar coordinates (r, θ, φ), the 3 unit vectors are r̂, θ̂ and φ̂ with scale factors hr = 1,
hθ = r, hφ = r sin θ. The quantities r, θ and φ are related to x, y and z by x = r sin θ cos φ,
y = r sin θ sin φ and z = r cos θ. In spherical polar coordinates,
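\[
\operatorname{grad} f = \nabla f = \frac{\partial f}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial f}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\,\hat{\phi}
\]
If F = F_r r̂ + F_θ θ̂ + F_φ φ̂
then
\[
\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\,F_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,F_\theta) + \frac{\partial}{\partial\phi}(rF_\phi)\right]
\]
\[
\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\hat{r} & r\hat{\theta} & r\sin\theta\,\hat{\phi} \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
F_r & rF_\theta & r\sin\theta\,F_\phi
\end{vmatrix}
\]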
Example 25
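In spherical polar coordinates, find ∇f for
(a) f = r   (b) f = 1/r   (c) f = r² sin(φ + θ)
[Note: parts (a) and (b) relate to Exercises 2(a) and 2(c) on page 22.]
Solution
(a)
\[
\begin{aligned}
\nabla f &= \frac{\partial f}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial f}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\,\hat{\phi} \\
&= \frac{\partial(r)}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial(r)}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial(r)}{\partial\phi}\,\hat{\phi} \\
&= 1\,\hat{r} = \hat{r}
\end{aligned}
\]
(b)
\[
\begin{aligned}
\nabla f &= \frac{\partial f}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial f}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\,\hat{\phi} \\
&= \frac{\partial(\frac{1}{r})}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial(\frac{1}{r})}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial(\frac{1}{r})}{\partial\phi}\,\hat{\phi} \\
&= -\frac{1}{r^2}\,\hat{r}
\end{aligned}
\]
(c)
\[
\begin{aligned}
\nabla f &= \frac{\partial f}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial f}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\,\hat{\phi} \\
&= \frac{\partial(r^2\sin(\phi+\theta))}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial(r^2\sin(\phi+\theta))}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial(r^2\sin(\phi+\theta))}{\partial\phi}\,\hat{\phi} \\
&= 2r\sin(\phi+\theta)\,\hat{r} + \frac{1}{r}\,r^2\cos(\phi+\theta)\,\hat{\theta} + \frac{1}{r\sin\theta}\,r^2\cos(\phi+\theta)\,\hat{\phi} \\
&= 2r\sin(\phi+\theta)\,\hat{r} + r\cos(\phi+\theta)\,\hat{\theta} + \frac{r\cos(\phi+\theta)}{\sin\theta}\,\hat{\phi}
\end{aligned}
\]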
Electric potential
Introduction
There is a scalar quantity V , called the electric potential, which satisfies
∇V = −E where E is the electric field.
It is often easier to handle scalar fields rather than vector fields. It is therefore convenient to work
with V and then derive E from it.
Problem in words
Given the electric potential, find the electric field.
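For a point charge, Q, the potential V is given by
\[
V = \frac{Q}{4\pi\varepsilon_0 r}
\]
Verify, using spherical polar coordinates, that E = −∇V is indeed
\[
\mathbf{E} = \frac{Q}{4\pi\varepsilon_0 r^2}\,\hat{r}
\]
In spherical polar coordinates:
\[
\begin{aligned}
\nabla V &= \frac{\partial V}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial V}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial V}{\partial\phi}\,\hat{\phi} \\
&= \frac{\partial V}{\partial r}\,\hat{r} \quad\text{as the other partial derivatives are zero} \\
&= \frac{\partial}{\partial r}\left(\frac{Q}{4\pi\varepsilon_0 r}\right)\hat{r} \\
&= -\frac{Q}{4\pi\varepsilon_0 r^2}\,\hat{r}
\end{aligned}
\]
Interpretation
So $\mathbf{E} = \dfrac{Q}{4\pi\varepsilon_0 r^2}\,\hat{r}$ as required.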
This is a form of Coulomb’s Law. A positive charge will experience a positive repulsion radially
outwards in the field of another positive charge.
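As an informal check of E = −∇V, the sketch below differentiates the point-charge potential numerically in Cartesian coordinates and compares the result with the radial inverse-square form. The constant K stands in for Q/(4πε₀) and is set to 1 purely for illustration; the function names and step size are likewise assumptions of this example.

```python
import math

K = 1.0  # stands in for Q / (4*pi*eps0); assumed value, for illustration only

def V(x, y, z):
    """Point-charge potential V = K / r."""
    return K / math.sqrt(x*x + y*y + z*z)

def E_numeric(x, y, z, h=1e-6):
    """E = -grad V, estimated with central differences."""
    return (-(V(x + h, y, z) - V(x - h, y, z)) / (2*h),
            -(V(x, y + h, z) - V(x, y - h, z)) / (2*h),
            -(V(x, y, z + h) - V(x, y, z - h)) / (2*h))

def E_exact(x, y, z):
    """K * r_hat / r^2 written in Cartesian components, i.e. K*(x, y, z)/r^3."""
    r = math.sqrt(x*x + y*y + z*z)
    return (K*x / r**3, K*y / r**3, K*z / r**3)

point = (1.0, 2.0, 2.0)  # here r = 3
print(E_numeric(*point))
print(E_exact(*point))
```

Both lines should show a vector pointing along (1, 2, 2), i.e. radially outwards, with magnitude K/r².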
Example 26
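Using spherical polar coordinates, find ∇ · F for the following vector functions.
(a) F = r r̂   (b) F = r² sin θ r̂   (c) F = r sin θ r̂ + r² sin φ θ̂ + r cos θ φ̂
Solution
(a)
\[
\begin{aligned}
\nabla\cdot\mathbf{F} &= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\,F_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,F_\theta) + \frac{\partial}{\partial\phi}(rF_\phi)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\times r) + \frac{\partial}{\partial\theta}(r\sin\theta\times 0) + \frac{\partial}{\partial\phi}(r\times 0)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^3\sin\theta) + \frac{\partial}{\partial\theta}(0) + \frac{\partial}{\partial\phi}(0)\right]
= \frac{1}{r^2\sin\theta}\left[3r^2\sin\theta + 0 + 0\right] = 3
\end{aligned}
\]
Note: in Cartesian coordinates, the corresponding vector is F = xi + yj + zk with
∇ · F = 1 + 1 + 1 = 3 (hence consistency).
(b)
\[
\begin{aligned}
\nabla\cdot\mathbf{F} &= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\,F_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,F_\theta) + \frac{\partial}{\partial\phi}(rF_\phi)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\; r^2\sin\theta) + \frac{\partial}{\partial\theta}(r\sin\theta\times 0) + \frac{\partial}{\partial\phi}(r\times 0)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^4\sin^2\theta) + \frac{\partial}{\partial\theta}(0) + \frac{\partial}{\partial\phi}(0)\right] \\
&= \frac{1}{r^2\sin\theta}\left[4r^3\sin^2\theta + 0 + 0\right] = 4r\sin\theta
\end{aligned}
\]
(c)
\[
\begin{aligned}
\nabla\cdot\mathbf{F} &= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\,F_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,F_\theta) + \frac{\partial}{\partial\phi}(rF_\phi)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\; r\sin\theta) + \frac{\partial}{\partial\theta}(r\sin\theta\times r^2\sin\phi) + \frac{\partial}{\partial\phi}(r\times r\cos\theta)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^3\sin^2\theta) + \frac{\partial}{\partial\theta}(r^3\sin\theta\sin\phi) + \frac{\partial}{\partial\phi}(r^2\cos\theta)\right] \\
&= \frac{1}{r^2\sin\theta}\left[3r^2\sin^2\theta + r^3\cos\theta\sin\phi + 0\right] = 3\sin\theta + r\cot\theta\sin\phi
\end{aligned}
\]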
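Parts (a) and (b) can be cross-checked against a Cartesian calculation. The sketch below is illustrative only: it uses r̂ = (x, y, z)/r and r sin θ = √(x² + y²), so that r r̂ = (x, y, z) and r² sin θ r̂ = √(x² + y²) (x, y, z), and estimates each divergence by central differences at an arbitrarily chosen point. The function names, the sample point and the step size are assumptions of this example.

```python
import math

def divergence(field, x, y, z, h=1e-5):
    """Central-difference estimate of div(field) at (x, y, z)."""
    dFx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2*h)
    dFy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2*h)
    dFz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2*h)
    return dFx + dFy + dFz

def F_a(x, y, z):
    """Part (a): F = r r_hat = (x, y, z)."""
    return (x, y, z)

def F_b(x, y, z):
    """Part (b): F = r^2 sin(theta) r_hat = sqrt(x^2 + y^2) * (x, y, z)."""
    s = math.hypot(x, y)  # equals r sin(theta)
    return (s*x, s*y, s*z)

# Arbitrary sample point away from the z-axis (assumed for illustration)
r, theta, phi = 2.0, 0.9, 0.4
x = r*math.sin(theta)*math.cos(phi)
y = r*math.sin(theta)*math.sin(phi)
z = r*math.cos(theta)

print(divergence(F_a, x, y, z), 3.0)
print(divergence(F_b, x, y, z), 4*r*math.sin(theta))
```

Each printed pair should agree to within the finite-difference error.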
Example 27
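Find ∇ × F for the following vector fields F.
(a) F = r^k r̂, where k is a constant   (b) F = r² cos θ r̂ + sin θ θ̂ + sin² θ φ̂
Solution
(a)
\[
\begin{aligned}
\nabla\times\mathbf{F} &= \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\hat{r} & r\hat{\theta} & r\sin\theta\,\hat{\phi} \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
F_r & rF_\theta & r\sin\theta\,F_\phi
\end{vmatrix}
= \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\hat{r} & r\hat{\theta} & r\sin\theta\,\hat{\phi} \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
r^k & r\times 0 & r\sin\theta\times 0
\end{vmatrix} \\
&= \frac{1}{r^2\sin\theta}\left[\left(\frac{\partial}{\partial\theta}(0) - \frac{\partial}{\partial\phi}(0)\right)\hat{r}
+ \left(\frac{\partial}{\partial\phi}(r^k) - \frac{\partial}{\partial r}(0)\right)r\hat{\theta}
+ \left(\frac{\partial}{\partial r}(0) - \frac{\partial}{\partial\theta}(r^k)\right)r\sin\theta\,\hat{\phi}\right] \\
&= 0\,\hat{r} + 0\,\hat{\theta} + 0\,\hat{\phi} = \mathbf{0}
\end{aligned}
\]
(b)
\[
\begin{aligned}
\nabla\times\mathbf{F} &= \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\hat{r} & r\hat{\theta} & r\sin\theta\,\hat{\phi} \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
F_r & rF_\theta & r\sin\theta\,F_\phi
\end{vmatrix}
= \frac{1}{r^2\sin\theta}
\begin{vmatrix}
\hat{r} & r\hat{\theta} & r\sin\theta\,\hat{\phi} \\
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\
r^2\cos\theta & r\times\sin\theta & r\sin\theta\times\sin^2\theta
\end{vmatrix} \\
&= \frac{1}{r^2\sin\theta}\left[\left(\frac{\partial}{\partial\theta}(r\sin^3\theta) - \frac{\partial}{\partial\phi}(r\sin\theta)\right)\hat{r}
+ \left(\frac{\partial}{\partial\phi}(r^2\cos\theta) - \frac{\partial}{\partial r}(r\sin^3\theta)\right)r\hat{\theta}
+ \left(\frac{\partial}{\partial r}(r\sin\theta) - \frac{\partial}{\partial\theta}(r^2\cos\theta)\right)r\sin\theta\,\hat{\phi}\right] \\
&= \frac{1}{r^2\sin\theta}\left[\left(3r\sin^2\theta\cos\theta + 0\right)\hat{r} + \left(0 - \sin^3\theta\right)r\hat{\theta} + \left(\sin\theta + r^2\sin\theta\right)r\sin\theta\,\hat{\phi}\right] \\
&= \frac{3\sin\theta\cos\theta}{r}\,\hat{r} - \frac{\sin^2\theta}{r}\,\hat{\theta} + \frac{(1 + r^2)}{r}\sin\theta\,\hat{\phi}
\end{aligned}
\]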
Task
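Using spherical polar coordinates, find ∇f for
(a) f = r⁴
(b) f = r/(r² + 1)
(c) f = r² sin 2θ cos φ
Your solution
Answer
(a) 4r³ r̂,
(b) $\dfrac{1 - r^2}{(1 + r^2)^2}\,\hat{r}$,
(c)
\[
\begin{aligned}
&\frac{\partial}{\partial r}(r^2\sin 2\theta\cos\phi)\,\hat{r} + \frac{1}{r}\frac{\partial}{\partial\theta}(r^2\sin 2\theta\cos\phi)\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}(r^2\sin 2\theta\cos\phi)\,\hat{\phi} \\
&\quad= 2r\sin 2\theta\cos\phi\,\hat{r} + 2r\cos 2\theta\cos\phi\,\hat{\theta} - 2r\cos\theta\sin\phi\,\hat{\phi}
\end{aligned}
\]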
Exercises
1. For F = r sin θr̂ + r cos φθ̂ + r sin φφ̂, find ∇ · F and ∇ × F .
2. For F = r−4 cos θr̂ + r−4 sin θθ̂, find ∇ · F and ∇ × F .
3. For F = r2 cos θr̂ + cos φθ̂ find ∇ · (∇ × F ).
Answers
1. cos φ(cot θ + cosec θ) + 3 sin θ,   cot(θ/2) sin φ r̂ − 2 sin φ θ̂ + (2 cos φ − cos θ) φ̂
2. 0,   −2r⁻⁵ sin θ φ̂
3. 0
Contents 29
Integral Vector
Calculus
29.1 Line Integrals Involving Vectors 2
29.2 Surface and Volume Integrals 34
29.3 Integral Vector Theorems 54
Learning outcomes
In this Workbook you will learn how to integrate functions involving vectors. You will learn
how to evaluate line integrals i.e. where a scalar or a vector is summed along a line or
contour. You will be able to evaluate surface and volume integrals where a function
involving vectors is summed over a surface or volume. You will learn about some theorems
relating to line, surface or volume integrals, i.e. Stokes' theorem, Gauss' theorem and
Green's theorem.
Line Integrals
Involving Vectors 29.1
Introduction
Workbook 28 considered the differentiation of scalar and vector fields. Here we consider how to integrate
such fields along a line. Firstly, integrals involving scalars along a line will be considered. Subsequently,
line integrals involving vectors will be considered. These can be scalar or vector depending on the
form of integral used. Of particular interest are the integrals of conservative vector fields.
Prerequisites
Before starting this Section you should . . .
• have a thorough understanding of the basic
• be familiar with the operators div, grad and curl

Learning Outcomes
• integrate a scalar or vector quantity along a line

2 HELM (2006):
Workbook 29: Integral Vector Calculus
®
1. Line integrals
Workbook 28 was concerned with evaluating an integral over all points within a rectangle or other shape
(or over a cuboid or other volume). In a related manner, an integral can take place over a line or
curve running through a two-dimensional (or three-dimensional) shape. Line integrals may involve
scalar or vector fields. Those involving scalar fields are dealt with first.
Line integrals in two dimensions

A line integral in two dimensions may be written as
\[
\int_C F(x, y)\,dw
\]
There are three main features determining this integral:

F(x, y)  This is the scalar function to be integrated e.g. F(x, y) = x² + 4y².
C  This is the curve along which integration takes place, e.g. y = x² or x = sin y
or x = t − 1; y = t² (where x and y are expressed in terms of a parameter t).
dw  This states the variable of the integration. Three main cases are dx, dy and ds.
Here 's' is arc length and so indicates position along the curve C.
ds may be written as
\[
ds = \sqrt{(dx)^2 + (dy)^2} \quad\text{or}\quad ds = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.
\]
A fourth case is when F(x, y) dw has the form F₁ dx + F₂ dy. This is a combination
of the cases dx and dy.
The integral $\int_C F(x, y)\,ds$ represents the area beneath the surface z = F(x, y) but above the curve C.
The integrals $\int_C F(x, y)\,dx$ and $\int_C F(x, y)\,dy$ represent the projections of this area onto the xz
and yz planes respectively.
A particular case of the integral $\int_C F(x, y)\,ds$ is the integral $\int_C 1\,ds$. This is a means of calculating
the length along a curve i.e. an arc length.
Figure 1: Representation of a line integral and its projections onto the xz and yz planes
The technique with a line integral is to express all quantities in the integral in terms of a single
variable. Often, if the integral is with respect to x or y, the curve C and the
function F may be expressed in terms of the relevant variable. If the integral is carried out
with respect to s, normally everything is expressed in terms of x. If x and y are given in terms of
a parameter t, normally everything is expressed in terms of t.
Example 1
Find $\int_C x(1 + 4y)\,dx$ where C is the curve y = x², starting from x = 0, y = 0
and ending at x = 1, y = 1.
Solution
As this integral concerns only points along C and the integration is carried out with respect to x,
y may be replaced by x². The limits on x will be 0 to 1. So the integral becomes
\[
\begin{aligned}
\int_C x(1 + 4y)\,dx &= \int_{x=0}^{1} x\left(1 + 4x^2\right)dx = \int_{x=0}^{1}\left(x + 4x^3\right)dx \\
&= \left[\frac{x^2}{2} + x^4\right]_0^1 = \left(\frac{1}{2} + 1\right) - (0) = \frac{3}{2}
\end{aligned}
\]
Example 2
Find $\int_C x(1 + 4y)\,dy$ where C is the curve y = x², starting from
x = 0, y = 0 and ending at x = 1, y = 1. This is the same as Example 1 other
than dx being replaced by dy.
Solution
As this integral concerns only points along C and the integration is carried out with respect to y,
everything may be expressed in terms of y, i.e. x may be replaced by y^{1/2}. The limits on y will
be 0 to 1. So the integral becomes
\[
\begin{aligned}
\int_C x(1 + 4y)\,dy &= \int_{y=0}^{1} y^{1/2}(1 + 4y)\,dy = \int_{y=0}^{1}\left(y^{1/2} + 4y^{3/2}\right)dy \\
&= \left[\frac{2}{3}y^{3/2} + \frac{8}{5}y^{5/2}\right]_0^1 = \left(\frac{2}{3} + \frac{8}{5}\right) - (0) = \frac{34}{15}
\end{aligned}
\]
Example 3
Find $\int_C x(1 + 4y)\,ds$ where C is the curve y = x², starting from x = 0, y = 0
and ending at x = 1, y = 1. Once again, this is the same as the previous
two examples other than the integration being carried out with respect to s, the
coordinate along the curve C.
Solution
As this integral is with respect to s, all parts of the integral can be expressed in terms of x. Along
y = x²,
\[
ds = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = \sqrt{1 + (2x)^2}\,dx = \sqrt{1 + 4x^2}\,dx
\]
So, the integral is
\[
\int_C x(1 + 4y)\,ds = \int_{x=0}^{1} x\left(1 + 4x^2\right)\sqrt{1 + 4x^2}\,dx = \int_{x=0}^{1} x\left(1 + 4x^2\right)^{3/2}dx
\]
This can be evaluated using the transformation U = 1 + 4x² so dU = 8x dx i.e. x dx = dU/8.
When x = 0, U = 1 and when x = 1, U = 5.
The integral therefore equals
\[
\begin{aligned}
\int_{x=0}^{1} x\left(1 + 4x^2\right)^{3/2}dx &= \frac{1}{8}\int_{U=1}^{5} U^{3/2}\,dU \\
&= \frac{1}{8}\times\frac{2}{5}\left[U^{5/2}\right]_1^5 = \frac{1}{20}\left(5^{5/2} - 1\right) \approx 2.745
\end{aligned}
\]
Note that the results for Examples 1, 2 and 3 are all different: Example 3 is the area between a curve
and a surface above; Examples 1 and 2 give projections of this area onto other planes.
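The three results can be reproduced numerically. The following sketch is only an illustration (the helper name `simpson` and the choice of 1000 sub-intervals are assumptions of this example): it parameterises the curve y = x² by x, so that dy = 2x dx and ds = √(1 + 4x²) dx, and applies composite Simpson's rule to each integrand.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i*h)
    return total * h / 3

def F(x, y):
    """Integrand shared by Examples 1 to 3."""
    return x * (1 + 4*y)

# Along C: y = x^2 for 0 <= x <= 1
I_dx = simpson(lambda x: F(x, x**2), 0.0, 1.0)                            # Example 1
I_dy = simpson(lambda x: F(x, x**2) * 2*x, 0.0, 1.0)                      # Example 2, dy = 2x dx
I_ds = simpson(lambda x: F(x, x**2) * math.sqrt(1 + 4*x**2), 0.0, 1.0)    # Example 3

print(I_dx, 3/2)
print(I_dy, 34/15)
print(I_ds, (5**2.5 - 1) / 20)
```

Each line prints the numerical estimate next to the exact value found above, which makes the difference between the dx, dy and ds integrals easy to see.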
Example 4
Find $\int_C xy\,dx$ where, on C, x and y are given by x = 3t², y = t³ − 1 for t starting
at t = 0 and progressing to t = 1.
Solution
Everything can be expressed in terms of t, the parameter. Here x = 3t² so dx = 6t dt. The limits
on t are t = 0 and t = 1. The integral becomes
\[
\begin{aligned}
\int_C xy\,dx &= \int_{t=0}^{1} 3t^2(t^3 - 1)\,6t\,dt = \int_{t=0}^{1}\left(18t^6 - 18t^3\right)dt \\
&= \left[\frac{18}{7}t^7 - \frac{18}{4}t^4\right]_0^1 = \frac{18}{7} - \frac{9}{2} - 0 = -\frac{27}{14}
\end{aligned}
\]
Key Point 1
A line integral is normally evaluated by expressing all variables in terms of one variable.
In general
\[
\int_C f(x, y)\,ds \ne \int_C f(x, y)\,dy \ne \int_C f(x, y)\,dx
\]
Task
For F(x, y) = 2x + y², find $\int_C F(x, y)\,dx$, $\int_C F(x, y)\,dy$ and $\int_C F(x, y)\,ds$
where C is the line y = 2x from (0, 0) to (1, 2).
Express each integral as a simple integral with respect to a single variable and hence evaluate each
integral:
Your solution
Answer
\[
\int_{x=0}^{1}(2x + 4x^2)\,dx = \frac{7}{3}, \qquad
\int_{y=0}^{2}(y + y^2)\,dy = \frac{14}{3}, \qquad
\int_{x=0}^{1}(2x + 4x^2)\sqrt{5}\,dx = \frac{7}{3}\sqrt{5}
\]
Task
Find $\int_C F(x, y)\,dx$, $\int_C F(x, y)\,dy$ and $\int_C F(x, y)\,ds$ where F(x, y) = 1 and C
is the curve y = ½x² − ¼ ln x from (1, ½) to (2, 2 − ¼ ln 2).
Your solution
Answer
\[
\int_1^2 1\,dx = 1, \qquad
\int_{1/2}^{2 - \frac{1}{4}\ln 2} 1\,dy = \frac{3}{2} - \frac{1}{4}\ln 2, \qquad
\int_1^2\left(x + \frac{1}{4x}\right)dx = \frac{3}{2} + \frac{1}{4}\ln 2.
\]
Task
Find $\int_C F(x, y)\,dx$, $\int_C F(x, y)\,dy$ and $\int_C F(x, y)\,ds$ where F(x, y) = sin 2x and
C is the curve y = sin x from (0, 0) to (π/2, 1).
Your solution
Answer
\[
\int_0^{\pi/2}\sin 2x\,dx = 1, \qquad \int_0^{\pi/2} 2\sin x\cos^2 x\,dx = \frac{2}{3},
\]
\[
\int_0^{\pi/2}\sin 2x\,\sqrt{1 + \cos^2 x}\,dx; \quad\text{using the substitution } u = 1 + \cos^2 x \text{ gives } \tfrac{2}{3}(2\sqrt{2} - 1).
\]
2. Line integrals of scalar products

Integrals of the form $\int_C \mathbf{F}\cdot d\mathbf{r}$, referred to at the end of the previous sub-section, occur in applications
such as the following.
Figure 2: Schematic for cyclist travelling from A to B into a head wind
Consider a cyclist riding along the road from A to B (Figure 2). Suppose it is necessary to find the
total work the cyclist has to do in overcoming a wind of velocity v.
On moving from S to T, the work done is given by 'Force × distance' = F × δr cos θ where F, the
force, is directly proportional to v, but in the opposite direction, and δr cos θ is the component of
the distance travelled in the direction of the wind.
So, the work done travelling δr is −kv · δr. Letting δr become infinitesimally small, the work done
becomes −kv · dr and the total work is $-k\int_A^B \mathbf{v}\cdot d\mathbf{r}$.
This is an example of the integral along a line of the scalar product of a vector field and a vector
describing the line. The term scalar line integral is often used for integrals of this form. The vector
dr may be considered to be dx i + dy j + dz k.
Multiplying out the scalar product, the 'scalar line integral' of the vector F
along contour C is given by $\int_C \mathbf{F}\cdot d\mathbf{r}$ and equals $\int_C \{F_x\,dx + F_y\,dy + F_z\,dz\}$ in three dimensions
($\int_C \{F_x\,dx + F_y\,dy\}$ in two dimensions).
If the contour C has its start and end points in the same positions i.e. it represents a closed contour,
the symbol $\oint_C$ rather than $\int_C$ is used, i.e. $\oint_C \mathbf{F}\cdot d\mathbf{r}$.
As before, to evaluate the line integral, express the path and the function F in terms of either x, y
and z, or in terms of a parameter t. Note that in examples t often represents time.
Example 5
Find $\int_C \{2xy\,dx - 5x\,dy\}$ where C is the curve y = x³ with x varying from x = 0
to x = 1.
[This is the integral $\int_C \mathbf{F}\cdot d\mathbf{r}$ where F = 2xy i − 5x j and dr = dx i + dy j.]
Solution
It is possible to split this integral into two different integrals and express the first term as a function
of x and the second term as a function of y. However, it is also possible to express everything in
terms of x. Note that on C, y = x³ so dy = 3x² dx and the integral becomes
\[
\begin{aligned}
\int_C \{2xy\,dx - 5x\,dy\} &= \int_{x=0}^{1}\left(2x\,x^3\,dx - 5x\,3x^2\,dx\right) = \int_0^1\left(2x^4 - 15x^3\right)dx \\
&= \left[\frac{2}{5}x^5 - \frac{15}{4}x^4\right]_0^1 = \frac{2}{5} - \frac{15}{4} - 0 = -\frac{67}{20}
\end{aligned}
\]
Key Point 2
An integral of the form $\int_C \mathbf{F}\cdot d\mathbf{r}$ may be expressed as $\int_C \{F_x\,dx + F_y\,dy + F_z\,dz\}$. Knowing the
expression for the path C, every term in the integral can be further expressed in terms of one of
the variables x, y or z or in terms of a parameter t and hence integrated.
If an integral is two-dimensional there are no terms involving z.

The integral $\int_C \mathbf{F}\cdot d\mathbf{r}$ evaluates to a scalar.
Example 6
Three paths from (0, 0) to (1, 2) are defined by
(a) C1 : y = 2x
(b) C2 : y = 2x2
(c) C3 : y = 0 from (0, 0) to (1, 0) and x = 1 from (1, 0) to (1, 2)
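Sketch each path and find $\int \mathbf{F}\cdot d\mathbf{r}$, where F = y² i + xy j, along each path.
Solution
(a) $\int \mathbf{F}\cdot d\mathbf{r} = \int\{y^2\,dx + xy\,dy\}$. Along y = 2x, dy/dx = 2 so dy = 2dx. Then
\[
\begin{aligned}
\int_{C_1}\mathbf{F}\cdot d\mathbf{r} &= \int_{x=0}^{1}\left\{(2x)^2\,dx + x(2x)(2\,dx)\right\} \\
&= \int_0^1\left(4x^2 + 4x^2\right)dx = \int_0^1 8x^2\,dx = \left[\frac{8}{3}x^3\right]_0^1 = \frac{8}{3}
\end{aligned}
\]
Figure 3(a): Integration along path C1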
(b) $\int \mathbf{F}\cdot d\mathbf{r} = \int\{y^2\,dx + xy\,dy\}$. Along y = 2x², dy/dx = 4x so dy = 4x dx. Then
\[
\int_{C_2}\mathbf{F}\cdot d\mathbf{r} = \int_{x=0}^{1}\left\{\left(2x^2\right)^2 dx + x\left(2x^2\right)(4x\,dx)\right\} = \int_0^1 12x^4\,dx = \left[\frac{12}{5}x^5\right]_0^1 = \frac{12}{5}
\]
Figure 3(b): Integration along path C2
Note that the answer is different to part (a), i.e., the line integral depends upon the path taken.
(c) As the contour C3 has two distinct parts with different equations, it is necessary to break the
full contour OA into the two parts, namely OB and BA where B is the point (1, 0). Hence
\[
\int_{C_3}\mathbf{F}\cdot d\mathbf{r} = \int_O^B\mathbf{F}\cdot d\mathbf{r} + \int_B^A\mathbf{F}\cdot d\mathbf{r}
\]
Along OB, y = 0 so dy = 0. Then
\[
\int_O^B\mathbf{F}\cdot d\mathbf{r} = \int_{x=0}^{1}\left(0^2\,dx + x\times 0\times 0\right) = \int_0^1 0\,dx = 0
\]
Along BA, x = 1 so dx = 0. Then
\[
\int_B^A\mathbf{F}\cdot d\mathbf{r} = \int_{y=0}^{2}\left(y^2\times 0 + 1\times y\times dy\right) = \int_0^2 y\,dy = \left[\frac{1}{2}y^2\right]_0^2 = 2.
\]
Hence $\int_{C_3}\mathbf{F}\cdot d\mathbf{r} = 0 + 2 = 2$
Figure 3(c): Integration along path C3

Once again, the result is path dependent.
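The path dependence found in Example 6 can be reproduced numerically. The sketch below is an illustration only; the helper names `simpson` and `work`, and the particular parameterisations of the paths, are assumptions of this example. Each path is described by a parameter t, and the integral of F · (dr/dt) is evaluated with composite Simpson's rule.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i*h)
    return total * h / 3

def work(F, r, drdt, t0, t1):
    """Integral of F(r(t)) . r'(t) dt for a path r(t), t0 <= t <= t1."""
    def integrand(t):
        x, y = r(t)
        fx, fy = F(x, y)
        dx, dy = drdt(t)
        return fx*dx + fy*dy
    return simpson(integrand, t0, t1)

def F(x, y):
    """Vector field of Example 6: F = y^2 i + xy j."""
    return (y**2, x*y)

# C1: y = 2x, C2: y = 2x^2, both parameterised by x = t
I1 = work(F, lambda t: (t, 2*t), lambda t: (1.0, 2.0), 0.0, 1.0)
I2 = work(F, lambda t: (t, 2*t**2), lambda t: (1.0, 4*t), 0.0, 1.0)
# C3: along y = 0 from (0,0) to (1,0), then along x = 1 from (1,0) to (1,2)
I3 = (work(F, lambda t: (t, 0.0), lambda t: (1.0, 0.0), 0.0, 1.0)
      + work(F, lambda t: (1.0, t), lambda t: (0.0, 1.0), 0.0, 2.0))

print(I1, 8/3)
print(I2, 12/5)
print(I3, 2.0)
```

The three estimates differ from one another, matching the values 8/3, 12/5 and 2 obtained above.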
Key Point 3
In general, the value of the line integral depends on the path of integration as well as the end points.
Example 7
Find $\int_A^O \mathbf{F}\cdot d\mathbf{r}$, where F = y² i + xy j (as in Example 6) and the path from A
to O is the straight line from (1, 2) to (0, 0), that is the reverse of C1 in Example
6(a).
Deduce $\oint_C \mathbf{F}\cdot d\mathbf{r}$, the integral around the closed path C formed by the parabola
y = 2x² from (0, 0) to (1, 2) and the line y = 2x from (1, 2) to (0, 0).
Solution
Reversing the path swaps the limits of integration; this results in a change of sign for the value of
the integral.
\[
\int_A^O\mathbf{F}\cdot d\mathbf{r} = -\int_O^A\mathbf{F}\cdot d\mathbf{r} = -\frac{8}{3}
\]
The integral along the parabola (calculated in Example 6(b)) evaluates to 12/5. Writing C4 for the straight line from A to O,
\[
\oint_C\mathbf{F}\cdot d\mathbf{r} = \int_{C_2}\mathbf{F}\cdot d\mathbf{r} + \int_{C_4}\mathbf{F}\cdot d\mathbf{r} = \frac{12}{5} - \frac{8}{3} = -\frac{4}{15} \approx -0.267
\]
Example 8
Consider the vector field
F = y 2 z 3 i + 2xyz 3 j + 3xy 2 z 2 k
Let C1 and C2 be the curves from O = (0, 0, 0) to A = (1, 1, 1), given by
C1 : x = t, y = t, z=t (0 ≤ t ≤ 1)
C2 : x = t2 , y = t, z = t2 (0 ≤ t ≤ 1)
(a) Evaluate the scalar integral of the vector field along each path.
(b) Find the value of $\oint_C \mathbf{F}\cdot d\mathbf{r}$ where C is the closed path along C1 from
O to A and back along C2 from A to O.
Solution
(a) The path C1 is given in terms of the parameter t by x = t, y = t and z = t. Hence

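\[
\frac{dx}{dt} = \frac{dy}{dt} = \frac{dz}{dt} = 1 \quad\text{and}\quad \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}
\]
Now by substituting for x = y = z = t in F we have
F = t⁵ i + 2t⁵ j + 3t⁵ k
Hence $\mathbf{F}\cdot\dfrac{d\mathbf{r}}{dt} = t^5 + 2t^5 + 3t^5 = 6t^5$. The values of t = 0 and t = 1 correspond to the
start and end point of C1 and so these are the required limits of integration. Now
\[
\int_{C_1}\mathbf{F}\cdot d\mathbf{r} = \int_0^1\mathbf{F}\cdot\frac{d\mathbf{r}}{dt}\,dt = \int_0^1 6t^5\,dt = \left[t^6\right]_0^1 = 1
\]
For the path C2 the parameterisation is x = t², y = t and z = t² so $\dfrac{d\mathbf{r}}{dt} = 2t\,\mathbf{i} + \mathbf{j} + 2t\,\mathbf{k}$.
Substituting x = t², y = t and z = t² in F we have
\[
\mathbf{F} = t^8\,\mathbf{i} + 2t^9\,\mathbf{j} + 3t^8\,\mathbf{k} \quad\text{and}\quad \mathbf{F}\cdot\frac{d\mathbf{r}}{dt} = 2t^9 + 2t^9 + 6t^9 = 10t^9
\]
\[
\int_{C_2}\mathbf{F}\cdot d\mathbf{r} = \int_0^1 10t^9\,dt = \left[t^{10}\right]_0^1 = 1
\]
(b) For the closed path C
\[
\oint_C\mathbf{F}\cdot d\mathbf{r} = \int_{C_1}\mathbf{F}\cdot d\mathbf{r} - \int_{C_2}\mathbf{F}\cdot d\mathbf{r} = 1 - 1 = 0
\]
(Note that the line integral round a closed path is not necessarily zero - see Example 7.)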
Further points on Example 8
Vector Field    Path      Line Integral
F               C1        1
F               C2        1
F               closed    0
Note that the line integral of F is 1 for both paths C1 and C2 . In fact, this would hold for any path
from (0, 0, 0) to (1, 1, 1).
The field F is an example of a conservative vector field; these are discussed in detail in the
next subsection.
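The claim that any path from (0, 0, 0) to (1, 1, 1) gives the same value can be explored numerically. The sketch below is an illustration only: C1 and C2 are the paths of Example 8, while the third path, x = t, y = t², z = t³, is an extra path chosen for this example and does not appear in the text. The helper names are also assumptions.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i*h)
    return total * h / 3

def F(x, y, z):
    """Vector field of Example 8."""
    return (y**2 * z**3, 2*x*y*z**3, 3*x*y**2*z**2)

def line_integral(field, r, drdt, t0=0.0, t1=1.0):
    """Integral of field(r(t)) . r'(t) dt along the path r(t)."""
    def integrand(t):
        f = field(*r(t))
        d = drdt(t)
        return f[0]*d[0] + f[1]*d[1] + f[2]*d[2]
    return simpson(integrand, t0, t1)

paths = {
    "C1": (lambda t: (t, t, t),        lambda t: (1.0, 1.0, 1.0)),
    "C2": (lambda t: (t*t, t, t*t),    lambda t: (2*t, 1.0, 2*t)),
    # Extra path, not in the text: x = t, y = t^2, z = t^3
    "C3": (lambda t: (t, t**2, t**3),  lambda t: (1.0, 2*t, 3*t**2)),
}

results = {name: line_integral(F, r, dr) for name, (r, dr) in paths.items()}
for name, value in results.items():
    print(name, value)
```

All three estimates should be close to 1. Agreement on a few sample paths is consistent with path independence but does not establish it; that requires one of the properties of conservative fields.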
In $\int_C \mathbf{F}\cdot d\mathbf{r}$, the vector field F may be the gradient of a scalar field or the curl of a vector field.
Task
Consider the vector field
G = xi + (4x − y)j
Let C1 and C2 be the curves from O = (0, 0, 0) to A = (1, 1, 1), given by
C1 : x = t, y = t, z=t (0 ≤ t ≤ 1)
C2 : x = t2 , y = t, z = t2 (0 ≤ t ≤ 1)
(a) Evaluate the scalar integral $\int_C \mathbf{G}\cdot d\mathbf{r}$ of the vector field along each
path.
(b) Find the value of $\oint_C \mathbf{G}\cdot d\mathbf{r}$ where C is the closed path along C1 from
O to A and back along C2 from A to O.
Your solution
Answer
(a) The path C1 is given in terms of the parameter t by x = t, y = t and z = t. Hence
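\[
\frac{dx}{dt} = \frac{dy}{dt} = \frac{dz}{dt} = 1 \quad\text{and}\quad \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}
\]
Substituting for x = y = z = t in G we have
\[
\mathbf{G} = t\,\mathbf{i} + 3t\,\mathbf{j} \quad\text{and}\quad \mathbf{G}\cdot\frac{d\mathbf{r}}{dt} = t + 3t = 4t
\]
The limits of integration are t = 0 and t = 1, then
\[
\int_{C_1}\mathbf{G}\cdot d\mathbf{r} = \int_0^1\mathbf{G}\cdot\frac{d\mathbf{r}}{dt}\,dt = \int_0^1 4t\,dt = \left[2t^2\right]_0^1 = 2
\]
For the path C2 the parameterisation is x = t², y = t and z = t² so $\dfrac{d\mathbf{r}}{dt} = 2t\,\mathbf{i} + \mathbf{j} + 2t\,\mathbf{k}$.
Substituting x = t², y = t and z = t² in G we have
\[
\mathbf{G} = t^2\,\mathbf{i} + \left(4t^2 - t\right)\mathbf{j} \quad\text{and}\quad \mathbf{G}\cdot\frac{d\mathbf{r}}{dt} = 2t^3 + 4t^2 - t
\]
\[
\int_{C_2}\mathbf{G}\cdot d\mathbf{r} = \int_0^1\left(2t^3 + 4t^2 - t\right)dt = \left[\frac{1}{2}t^4 + \frac{4}{3}t^3 - \frac{1}{2}t^2\right]_0^1 = \frac{4}{3}
\]
(b) For the closed path C
\[
\oint_C\mathbf{G}\cdot d\mathbf{r} = \int_{C_1}\mathbf{G}\cdot d\mathbf{r} - \int_{C_2}\mathbf{G}\cdot d\mathbf{r} = 2 - \frac{4}{3} = \frac{2}{3}
\]
(Note that the integral around the closed path is non-zero, unlike Example 8.)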
Example 9
Find $\int_C \nabla(x^2 y)\cdot d\mathbf{r}$ where C is the contour y = 2x − x² from (0, 0) to (2, 0).
Solution
Note that ∇(x²y) = 2xy i + x² j so the integral is $\int_C \{2xy\,dx + x^2\,dy\}$.
On y = 2x − x², dy = (2 − 2x) dx so the integral becomes
\[
\begin{aligned}
\int_C\{2xy\,dx + x^2\,dy\} &= \int_{x=0}^{2}\left\{2x(2x - x^2)\,dx + x^2(2 - 2x)\,dx\right\} \\
&= \int_0^2\left(6x^2 - 4x^3\right)dx = \left[2x^3 - x^4\right]_0^2 = 0
\end{aligned}
\]
Task
Evaluate $\int_C \mathbf{F}\cdot d\mathbf{r}$, where F = (x − y)i + (x + y)j along each of the following
paths
(a) C1 : from (1, 1) to (2, 4) along the straight line y = 3x − 2:

(b) C2 : from (1, 1) to (2, 4) along the parabola y = x2 :
(c) C3 : along the straight line x = 1 from (1, 1) to (1, 4) then along the
straight line y = 4 from (1, 4) to (2, 4).
Your solution
Answer
(a) $\int_1^2 (10x - 4)\,dx = 11$,
(b) $\int_1^2 (x + x^2 + 2x^3)\,dx = \dfrac{34}{3}$ (this differs from (a), showing path dependence),
(c) $\int_1^4 (1 + y)\,dy + \int_1^2 (x - 4)\,dx = 8$
Task
For the function F and paths in the last Task, deduce $\oint \mathbf{F}\cdot d\mathbf{r}$ for the closed
paths
(a) C1 followed by the reverse of C2 .

(b) C2 followed by the reverse of C3 .
(c) C3 followed by the reverse of C1 .
Your solution
Answer
(a) −1/3, (b) 10/3, (c) −3. (Note that all these are non-zero.)
Exercises
1. Consider $\int_C \mathbf{F}\cdot d\mathbf{r}$, where F = 3x²y² i + (2x³y − 1)j. Find the value of the line integral along
each of the paths from (0, 0) to (1, 4).
(a) y = 4x (b) y = 4x² (c) y = 4x^{1/2} (d) y = 4x³
2. Consider the vector field F = 2xi + (xz − 2)j + xyk and the two curves between (0, 0, 0) and
(1, −1, 2) defined by
C1 : x = t², y = −t, z = 2t for 0 ≤ t ≤ 1.
C2 : x = t − 1, y = 1 − t, z = 2t − 2 for 1 ≤ t ≤ 2.
(a) Find $\int_{C_1}\mathbf{F}\cdot d\mathbf{r}$, $\int_{C_2}\mathbf{F}\cdot d\mathbf{r}$
(b) Find $\oint_C \mathbf{F}\cdot d\mathbf{r}$ where C is the closed path from (0, 0, 0) to (1, −1, 2) along C1 and back
to (0, 0, 0) along C2.
3. Consider the vector field G = x²z i + y²z j + ⅓(x³ + y³)k and the two curves between (0, 0, 0)
and (1, −1, 2) defined by
C1 : x = t², y = −t, z = 2t for 0 ≤ t ≤ 1.
C2 : x = t − 1, y = 1 − t, z = 2t − 2 for 1 ≤ t ≤ 2.
(a) Find $\int_{C_1}\mathbf{G}\cdot d\mathbf{r}$, $\int_{C_2}\mathbf{G}\cdot d\mathbf{r}$
(b) Find $\oint_C \mathbf{G}\cdot d\mathbf{r}$ where C is the closed path from (0, 0, 0) to (1, −1, 2) along C1 and back
to (0, 0, 0) along C2.
4. Find $\int_C \mathbf{F}\cdot d\mathbf{r}$ along y = 2x from (0, 0) to (2, 4) for
(a) F = ∇(x²y)
(b) F = ∇ × (½x²y² k)
Answers
1. All are 12, and in fact the integral would be 12 for any path from (0,0) to (1,4).
2. (a) 2, 5/3, (b) 1/3.
3. (a) 0, 0, (b) 0.
4. 16, −24.
3. Conservative vector fields

For some line integrals in the previous section, the integral depended only on the vector field F and
the start and end points of the line but not on the actual path of the line between the start and end
points. However, for other line integrals, the result depended on the actual details of the path of the
line.
Vector fields are classified according to whether the line integrals are path dependent or path indepen-
dent. Those vector fields for which all line integrals between all pairs of points are path independent
are called conservative vector fields.
There are five properties of a conservative vector field (P1 to P5). It is impossible to check the value
of every line integral over every path, but instead it is possible to use any one of these five properties
(and in particular property P3 below) to determine whether a vector field is conservative. They are
also used to simplify calculations with conservative vector fields.
P1 The line integral $\int_A^B \mathbf{F}\cdot d\mathbf{r}$ depends only on the end points A and B and is independent of
the actual path taken.
P2 The line integral around any closed curve is zero. That is $\oint_C \mathbf{F}\cdot d\mathbf{r} = 0$ for all C.
P3 The curl of a conservative vector field F is zero i.e. ∇ × F = 0.

P4 For any conservative vector field F, it is possible to find a scalar field φ such that ∇φ = F.
Then, $\int_C \mathbf{F}\cdot d\mathbf{r} = \phi(B) - \phi(A)$ where A and B are the start and end points of contour C.
[This is sometimes called the Fundamental Theorem of Line Integrals and is comparable with
the Fundamental Theorems of Calculus.]
P5 All gradient fields are conservative. That is, F = ∇φ is a conservative vector field for any
scalar field φ.
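Property P3 lends itself to a quick numerical spot check. The sketch below is illustrative only: it estimates the curl by central differences at a single, arbitrarily chosen point for two fields that appear in this Section, F = y² i + xy j (Example 6) and F = y²z³ i + 2xyz³ j + 3xy²z² k (Example 8). The function names, the sample point and the step size are assumptions of this example, and a vanishing curl at one point is evidence rather than proof.

```python
def curl(F, x, y, z, h=1e-5):
    """Central-difference estimate of curl F at the point (x, y, z)."""
    def d(i, axis):
        # Partial derivative of component i of F with respect to coordinate `axis`
        p = [x, y, z]
        m = [x, y, z]
        p[axis] += h
        m[axis] -= h
        return (F(*p)[i] - F(*m)[i]) / (2*h)
    return (d(2, 1) - d(1, 2),   # dFz/dy - dFy/dz
            d(0, 2) - d(2, 0),   # dFx/dz - dFz/dx
            d(1, 0) - d(0, 1))   # dFy/dx - dFx/dy

def F_ex6(x, y, z):
    """Example 6 field, embedded in three dimensions."""
    return (y**2, x*y, 0.0)

def F_ex8(x, y, z):
    """Example 8 field."""
    return (y**2 * z**3, 2*x*y*z**3, 3*x*y**2*z**2)

point = (0.7, -1.2, 0.5)  # arbitrary sample point
print(curl(F_ex6, *point))  # z-component should be close to -y = 1.2
print(curl(F_ex8, *point))  # all components should be close to 0
```

In practice the symbolic calculation of ∇ × F remains the reliable test; the numerical version is mainly useful for catching algebra slips.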
Example 10
The following vector fields were considered in the Examples of the previous sub-
section.
1. F1 = y² i + xy j (Example 6)   2. F2 = 2x i + 2y j (Example 7)
3. F3 = y²z³ i + 2xyz³ j + 3xy²z² k (Example 8)
4. F4 = x i + (4x − y) j (Task on page 13)
Determine which of these vector fields are conservative e.g. by referring to the
answers given in the solution. For those that are conservative find a scalar field φ
such that F = ∇φ and use property P4 to verify the line integrals found.
Solution
1. Two different values were obtained for line integrals over the paths C1 and C2. Hence, by P1,
F1 is not conservative. [It is also possible to reach this conclusion from P3 by finding that
∇ × F1 = −y k ≠ 0.]
2. Both line integrals from (0, 0) to (4, 2) had the same value i.e. 20 and for the closed path the
line integral was 0. This alone does not mean that F 2 is conservative as there could be other untried
paths giving different values. So by using P3

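\[
\nabla\times\mathbf{F}_2 =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
2x & 2y & 0
\end{vmatrix}
= \mathbf{i}(0 - 0) - \mathbf{j}(0 - 0) + \mathbf{k}(0 - 0) = \mathbf{0}
\]
As ∇ × F2 = 0, P3 gives that F2 is a conservative vector field.
Now, find a φ such that F2 = ∇φ. Then $\dfrac{\partial\phi}{\partial x}\mathbf{i} + \dfrac{\partial\phi}{\partial y}\mathbf{j} = 2x\,\mathbf{i} + 2y\,\mathbf{j}$.
Thus
\[
\left.
\begin{aligned}
\frac{\partial\phi}{\partial x} = 2x &\;\Rightarrow\; \phi = x^2 + f(y) \\
\frac{\partial\phi}{\partial y} = 2y &\;\Rightarrow\; \phi = y^2 + g(x)
\end{aligned}
\right\}
\;\Rightarrow\; \phi = x^2 + y^2 \ (+\text{ constant})
\]
Using P4:
\[
\int_{(0,0)}^{(4,2)}\mathbf{F}_2\cdot d\mathbf{r} = \int_{(0,0)}^{(4,2)}(\nabla\phi)\cdot d\mathbf{r} = \phi(4, 2) - \phi(0, 0) = (4^2 + 2^2) - (0^2 + 0^2) = 20.
\]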
3. The fact that line integrals along two different paths between the same start and end points have the
same value is consistent with F3 being a conservative field according to P1. So too is the fact that the integral
around a closed path is zero according to P2. However, neither fact can be used to conclude that
F3 is a conservative field. This can be done by showing that ∇ × F3 = 0.
Now,
\[
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
y^2z^3 & 2xyz^3 & 3xy^2z^2
\end{vmatrix}
= (6xyz^2 - 6xyz^2)\,\mathbf{i} - (3y^2z^2 - 3y^2z^2)\,\mathbf{j} + (2yz^3 - 2yz^3)\,\mathbf{k} = \mathbf{0}.
\]
As ∇ × F3 = 0, P3 gives that F3 is a conservative field.
To find φ that satisfies ∇φ = F3, it is necessary to satisfy
\[
\left.
\begin{aligned}
\frac{\partial\phi}{\partial x} = y^2z^3 &\;\rightarrow\; \phi = xy^2z^3 + f(y, z) \\
\frac{\partial\phi}{\partial y} = 2xyz^3 &\;\rightarrow\; \phi = xy^2z^3 + g(x, z) \\
\frac{\partial\phi}{\partial z} = 3xy^2z^2 &\;\rightarrow\; \phi = xy^2z^3 + h(x, y)
\end{aligned}
\right\}
\;\rightarrow\; \phi = xy^2z^3
\]
Using P4: $\int_{(0,0,0)}^{(1,1,1)}\mathbf{F}_3\cdot d\mathbf{r} = \phi(1, 1, 1) - \phi(0, 0, 0) = 1 - 0 = 1$.
4. As the integral along C1 is 2 and the integral along C2 (same start and end points but different
intermediate points) is 4/3, F4 is not a conservative field using P1.
Note that ∇ × F4 = 4k ≠ 0 so, using P3, this is an independent conclusion that F4 is not
conservative.
Work done moving a charge in an electric field

Introduction
If a charge, q, is moved through an electric field, E, from A to B, then the work
required is given by the line integral
\[
W_{AB} = -q\int_A^B\mathbf{E}\cdot d\mathbf{l}
\]
Problem in words
Compare the work done in moving a charge through the electric field around a point charge in a
vacuum via two different paths.
An electric field E is given by
\[
\begin{aligned}
\mathbf{E} &= \frac{Q}{4\pi\varepsilon_0 r^2}\,\hat{r} \\
&= \frac{Q}{4\pi\varepsilon_0(x^2 + y^2 + z^2)}\times\frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}} \\
&= \frac{Q(x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k})}{4\pi\varepsilon_0(x^2 + y^2 + z^2)^{3/2}}
\end{aligned}
\]
where r is the position vector with magnitude r and unit vector r̂, and 1/(4πε₀) is a combination of
constants of proportionality, where ε₀ = 10⁻⁹/36π F m⁻¹.
Given that Q = 10⁻⁸ C, find the work done in bringing a charge of q = 10⁻¹⁰ C from the point
A = (10, 10, 0) to the point B = (1, 1, 0) (where the dimensions are in metres)
(a) by the direct straight line y = x, z = 0

(b) by the straight line pair via C = (10, 1, 0)
Figure 4: Two routes (a and b) along which a charge can move through an electric field
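Route (b) comprises two straight lines from A = (10, 10, 0) to B = (1, 1, 0) via C = (10, 1, 0) (see
Figure 4).
(a) Here Q/(4πε₀) = 90 so
\[
\mathbf{E} = \frac{90[x\,\mathbf{i} + y\,\mathbf{j}]}{(x^2 + y^2)^{3/2}}
\]
as z = 0 over the region of interest. The work done
\[
\begin{aligned}
W_{AB} &= -q\int_A^B\mathbf{E}\cdot d\mathbf{l} \\
&= -10^{-10}\int_A^B\frac{90}{(x^2 + y^2)^{3/2}}\,[x\,\mathbf{i} + y\,\mathbf{j}]\cdot[dx\,\mathbf{i} + dy\,\mathbf{j}]
\end{aligned}
\]
Using y = x, dy = dx
\[
\begin{aligned}
W_{AB} &= -10^{-10}\int_{x=10}^{1}\frac{90}{(2x^2)^{3/2}}\{x\,dx + x\,dx\} \\
&= -10^{-10}\int_{10}^{1}\frac{90}{2\sqrt{2}}\,x^{-3}\,2x\,dx \\
&= -\frac{90\times 10^{-10}}{\sqrt{2}}\int_{10}^{1}x^{-2}\,dx \\
&= -\frac{9\times 10^{-9}}{\sqrt{2}}\left[-x^{-1}\right]_{10}^{1} \\
&= \frac{9\times 10^{-9}}{\sqrt{2}}\left[x^{-1}\right]_{10}^{1} \\
&= \frac{9\times 10^{-9}}{\sqrt{2}}\,[1 - 0.1] \\
&= 5.73\times 10^{-9}\ \text{J}
\end{aligned}
\]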
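(b) The first part of the path is A to C where x = 10, dx = 0 and y goes from 10 to 1.
\[
\begin{aligned}
W_{AC} &= -q\int_A^C\mathbf{E}\cdot d\mathbf{l} \\
&= -10^{-10}\int_{y=10}^{1}\frac{90}{(100 + y^2)^{3/2}}\,[x\,\mathbf{i} + y\,\mathbf{j}]\cdot[0\,\mathbf{i} + dy\,\mathbf{j}] \\
&= -10^{-10}\int_{10}^{1}\frac{90y\,dy}{(100 + y^2)^{3/2}} \\
&= -10^{-10}\int_{u=200}^{101}\frac{45\,du}{u^{3/2}} \quad\text{substituting } u = 100 + y^2,\ du = 2y\,dy \\
&= -45\times 10^{-10}\int_{200}^{101}u^{-3/2}\,du \\
&= -45\times 10^{-10}\left[-2u^{-1/2}\right]_{200}^{101} \\
&= 45\times 10^{-10}\left[\frac{2}{\sqrt{101}} - \frac{2}{\sqrt{200}}\right] = 2.59\times 10^{-10}\ \text{J}
\end{aligned}
\]
The second part is C to B, where y = 1, dy = 0 and x goes from 10 to 1.
\[
\begin{aligned}
W_{CB} &= -10^{-10}\int_{x=10}^{1}\frac{90}{(x^2 + 1)^{3/2}}\,[x\,\mathbf{i} + y\,\mathbf{j}]\cdot[dx\,\mathbf{i} + 0\,\mathbf{j}] \\
&= -10^{-10}\int_{10}^{1}\frac{90x\,dx}{(x^2 + 1)^{3/2}} \\
&= -10^{-10}\int_{u=101}^{2}\frac{45\,du}{u^{3/2}} \quad\text{substituting } u = x^2 + 1,\ du = 2x\,dx \\
&= -45\times 10^{-10}\int_{101}^{2}u^{-3/2}\,du \\
&= -45\times 10^{-10}\left[-2u^{-1/2}\right]_{101}^{2} \\
&= 45\times 10^{-10}\left[\frac{2}{\sqrt{2}} - \frac{2}{\sqrt{101}}\right] = 5.468\times 10^{-9}\ \text{J}
\end{aligned}
\]
The sum of the two components W_AC and W_CB is 5.73 × 10⁻⁹ J. Therefore the work
done over the two routes is identical.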
Interpretation
In fact, the work done is independent of the route taken as the electric field E around a point charge
in a vacuum is a conservative field.
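The equality of the work done along the two routes can be confirmed numerically. The sketch below is an illustration only: it takes Q/(4πε₀) = 90 and q = 10⁻¹⁰ C from the problem, treats each straight segment as r(t) = start + t (end − start) for 0 ≤ t ≤ 1, and integrates −q E · dl with composite Simpson's rule. The helper names and the number of sub-intervals are assumptions of this example.

```python
import math

COULOMB_FACTOR = 90.0   # Q / (4*pi*eps0) for Q = 1e-8 C, as used in the text
q = 1e-10               # charge being moved, in coulombs

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i*h)
    return total * h / 3

def E(x, y):
    """Electric field in the plane z = 0."""
    r3 = (x*x + y*y) ** 1.5
    return (COULOMB_FACTOR * x / r3, COULOMB_FACTOR * y / r3)

def work(start, end):
    """W = -q * integral of E . dl along the straight segment from start to end."""
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0
    def integrand(t):
        ex, ey = E(x0 + t*dx, y0 + t*dy)
        return ex*dx + ey*dy
    return -q * simpson(integrand, 0.0, 1.0)

A, B, C = (10.0, 10.0), (1.0, 1.0), (10.0, 1.0)
W_direct = work(A, B)                 # route (a)
W_via_C = work(A, C) + work(C, B)     # route (b)

print(W_direct, W_via_C, 8.1e-9 / math.sqrt(2))
```

Both routes should give approximately 5.73 × 10⁻⁹ J, in line with the analytic result 8.1 × 10⁻⁹/√2 J.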
Example 11
1. Show that $I = \int_{(0,0)}^{(2,1)}\left\{(2xy + 1)\,dx + (x^2 - 2y)\,dy\right\}$ is independent of the
path taken.
2. Find I using property P1.
3. Find I using property P4.

4. Find $I = \oint_C\left\{(2xy + 1)\,dx + (x^2 - 2y)\,dy\right\}$ where C is
(a) the circle x² + y² = 1

(b) the square with vertices (0, 0), (1, 0), (1, 1), (0, 1).
Solution
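1. The integral $I = \int_{(0,0)}^{(2,1)}\left\{(2xy + 1)\,dx + (x^2 - 2y)\,dy\right\}$ may be re-written $\int_C \mathbf{F}\cdot d\mathbf{r}$ where
F = (2xy + 1)i + (x² − 2y)j.
\[
\text{Now } \nabla\times\mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
2xy + 1 & x^2 - 2y & 0
\end{vmatrix}
= 0\,\mathbf{i} + 0\,\mathbf{j} + 0\,\mathbf{k} = \mathbf{0}
\]
As ∇ × F = 0, F is a conservative field and I is independent of the path taken between (0, 0)
and (2, 1).
2. As I is independent of the path taken from (0, 0) to (2, 1), it can be evaluated along any
such path. One possibility is the straight line y = ½x. On this line, dy = ½ dx. The integral
I becomes
\[
\begin{aligned}
I &= \int_{(0,0)}^{(2,1)}\left\{(2xy + 1)\,dx + (x^2 - 2y)\,dy\right\} \\
&= \int_{x=0}^{2}\left\{\left(2x\times\tfrac{1}{2}x + 1\right)dx + \left(x^2 - x\right)\tfrac{1}{2}\,dx\right\} \\
&= \int_0^2\left(\tfrac{3}{2}x^2 - \tfrac{1}{2}x + 1\right)dx \\
&= \left[\tfrac{1}{2}x^3 - \tfrac{1}{4}x^2 + x\right]_0^2 = 4 - 1 + 2 - 0 = 5
\end{aligned}
\]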
3. If F = ∇φ then
∂φ

= 2xy + 1 → φ = x2 y + x + f (y) 

∂x 

→ φ = x2 y + x − y 2 + C.
∂φ 
= x2 − 2y → φ = x2 y − y 2 + g(x) 


∂y
These are consistent if φ = x2 y + x − y 2 (plus a constant which may be omitted since it
cancels).
So I = φ(2, 1) − φ(0, 0) = (4 + 2 − 1) − 0 = 5
4. As F is a conservative field, all integrals around a closed contour are zero.
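Path independence in Example 11 can also be seen numerically. The sketch below, in Python, approximates the line integral along two different paths from (0, 0) to (2, 1) and compares both with the potential difference. The function name `line_integral` and the second path (a parabola) are illustrative assumptions, not taken from the Workbook.

```python
def line_integral(path, n=20000):
    """Approximate the integral of (2xy + 1)dx + (x^2 - 2y)dy along path(t), 0 <= t <= 1.

    The path is cut into n short chords and the integrand is sampled at
    the midpoint of each chord.
    """
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += (2 * xm * ym + 1) * (x1 - x0) + (xm * xm - 2 * ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total


def phi(x, y):
    """Potential found in part 3 of the Example."""
    return x * x * y + x - y * y


straight = line_integral(lambda t: (2 * t, t))      # the line y = x/2
parabola = line_integral(lambda t: (2 * t, t * t))  # assumed alternative path y = x^2/4
potential_difference = phi(2, 1) - phi(0, 0)

print(straight, parabola, potential_difference)
```

Both approximations should be close to 5, the value of the potential difference.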
Exercises
1. Determine whether the following vector fields are conservative
(a) $\mathbf{F} = (x-y)\mathbf{i} + (x+y)\mathbf{j}$
(b) $\mathbf{F} = 3x^2y^2\mathbf{i} + (2x^3y-1)\mathbf{j}$
(c) $\mathbf{F} = 2x\mathbf{i} + (xz-2)\mathbf{j} + xy\mathbf{k}$
(d) $\mathbf{F} = x^2z\mathbf{i} + y^2z\mathbf{j} + \frac{1}{3}(x^3+y^3)\mathbf{k}$
2. Consider the integral $\int_C \mathbf{F}\cdot d\mathbf{r}$ with $\mathbf{F} = 3x^2y^2\mathbf{i} + (2x^3y-1)\mathbf{j}$. From Exercise 1(b) F is a
conservative vector field. Find a scalar field $\phi$ so that $\nabla\phi = \mathbf{F}$. Hence use P4 to evaluate the
integral $\int_C \mathbf{F}\cdot d\mathbf{r}$ where C is a path with start-point (0, 0) and end-point (1, 4).
3. For the following conservative vector fields F, find a scalar field $\phi$ such that $\nabla\phi = \mathbf{F}$ and
hence evaluate $I = \int_C \mathbf{F}\cdot d\mathbf{r}$ for the contours C indicated.
(a) $\mathbf{F} = (4x^3y-2x)\mathbf{i} + (x^4-2y)\mathbf{j}$; any path from (0, 0) to (2, 1).
(b) $\mathbf{F} = (e^x+y^3)\mathbf{i} + (3xy^2)\mathbf{j}$; closed path starting from any point on the circle $x^2+y^2=1$.
(c) $\mathbf{F} = (y^2+\sin z)\mathbf{i} + 2xy\mathbf{j} + x\cos z\,\mathbf{k}$; any path from (1, 1, 0) to (2, 0, π).
(d) $\mathbf{F} = \frac{1}{x}\mathbf{i} + 4y^3z^2\mathbf{j} + 2y^4z\mathbf{k}$; any path from (1, 1, 1) to (1, 2, 3).
Answers
1. (a) No, (b) Yes, (c) No, (d) Yes
2. $x^3y^2 - y + C$, 12
3. (a) $x^4y - x^2 - y^2$, 11; (b) $e^x + xy^3$, 0; (c) $xy^2 + x\sin z$, −1; (d) $\ln x + y^4z^2$, 143
4. Vector line integrals
It is also possible to form the less commonly used integrals $\int_C f(x,y,z)\,d\mathbf{r}$ and $\int_C \mathbf{F}(x,y,z)\times d\mathbf{r}$.
Each of these integrals evaluates to a vector.
Remembering that $d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$, an integral of the form $\int_C f(x,y,z)\,d\mathbf{r}$ becomes
\[
\int_C f(x,y,z)\,dx\,\mathbf{i} + \int_C f(x,y,z)\,dy\,\mathbf{j} + \int_C f(x,y,z)\,dz\,\mathbf{k}.
\]
The first term can be evaluated by
expressing y and z in terms of x. Similarly the second and third terms can be evaluated by expressing
all terms as functions of y and z respectively. Alternatively, all variables can be expressed in terms
of a parameter t. If an integral is two-dimensional, the term in z will be absent.
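Example 12
Evaluate the integral $\int_C xy^2\,d\mathbf{r}$ where C represents the contour $y = x^2$ from (0, 0)
to (1, 1).
Solution
This is a two-dimensional integral so the term in z will be absent.
\[
\begin{aligned}
I &= \int_C xy^2\,d\mathbf{r} \\
&= \int_C xy^2\,(dx\,\mathbf{i} + dy\,\mathbf{j}) \\
&= \int_C xy^2\,dx\,\mathbf{i} + \int_C xy^2\,dy\,\mathbf{j} \\
&= \int_{x=0}^{1} x(x^2)^2\,dx\,\mathbf{i} + \int_{y=0}^{1} y^{1/2}y^2\,dy\,\mathbf{j} \\
&= \int_0^1 x^5\,dx\,\mathbf{i} + \int_0^1 y^{5/2}\,dy\,\mathbf{j} \\
&= \left[\tfrac{1}{6}x^6\right]_0^1\mathbf{i} + \left[\tfrac{2}{7}y^{7/2}\right]_0^1\mathbf{j} \\
&= \tfrac{1}{6}\mathbf{i} + \tfrac{2}{7}\mathbf{j}
\end{aligned}
\]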
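Example 13
Find $I = \int_C x\,d\mathbf{r}$ for the contour C given parametrically by $x = \cos t$, $y = \sin t$,
$z = t - \pi$ starting at $t = 0$ and going to $t = 2\pi$, i.e. the contour starts at
(1, 0, −π) and finishes at (1, 0, π).
Solution
The integral becomes $\int_C x\,(dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k})$.
Now, $x = \cos t$, $y = \sin t$, $z = t - \pi$ so $dx = -\sin t\,dt$, $dy = \cos t\,dt$ and $dz = dt$. So
\[
\begin{aligned}
I &= \int_0^{2\pi} \cos t\,(-\sin t\,dt\,\mathbf{i} + \cos t\,dt\,\mathbf{j} + dt\,\mathbf{k}) \\
&= -\int_0^{2\pi} \cos t\sin t\,dt\,\mathbf{i} + \int_0^{2\pi}\cos^2 t\,dt\,\mathbf{j} + \int_0^{2\pi}\cos t\,dt\,\mathbf{k} \\
&= -\frac{1}{2}\int_0^{2\pi}\sin 2t\,dt\,\mathbf{i} + \frac{1}{2}\int_0^{2\pi}(1+\cos 2t)\,dt\,\mathbf{j} + \Big[\sin t\Big]_0^{2\pi}\mathbf{k} \\
&= \frac{1}{4}\Big[\cos 2t\Big]_0^{2\pi}\mathbf{i} + \frac{1}{2}\left[t + \frac{1}{2}\sin 2t\right]_0^{2\pi}\mathbf{j} + 0\mathbf{k} \\
&= 0\mathbf{i} + \pi\mathbf{j} = \pi\mathbf{j}
\end{aligned}
\]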
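The parametric approach used in Examples 12 and 13 translates directly into a numerical procedure. The following Python sketch sums f times the chord vector over many short chords of the curve. The function name `vector_line_integral` and the chord-midpoint sampling are illustrative assumptions rather than a method prescribed by the Workbook.

```python
import math


def vector_line_integral(f, r, t0, t1, n=20000):
    """Approximate the vector integral of f dr along the curve r(t), t0 <= t <= t1.

    The curve is split into n short chords; f is sampled at each chord
    midpoint and multiplied by the chord vector (dx, dy, dz).
    """
    total = [0.0, 0.0, 0.0]
    p0 = r(t0)
    for k in range(1, n + 1):
        p1 = r(t0 + (t1 - t0) * k / n)
        mid = [0.5 * (a + b) for a, b in zip(p0, p1)]
        fm = f(*mid)
        for i in range(3):
            total[i] += fm * (p1[i] - p0[i])
        p0 = p1
    return total


# Example 12: f = x y^2 on y = x^2 from (0, 0) to (1, 1), with z = 0
ex12 = vector_line_integral(lambda x, y, z: x * y * y,
                            lambda t: (t, t * t, 0.0), 0.0, 1.0)

# Example 13: f = x on x = cos t, y = sin t, z = t - pi
ex13 = vector_line_integral(lambda x, y, z: x,
                            lambda t: (math.cos(t), math.sin(t), t - math.pi),
                            0.0, 2 * math.pi)

print(ex12)  # components close to 1/6, 2/7, 0
print(ex13)  # components close to 0, pi, 0
```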
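Integrals of the form $\int_C \mathbf{F}\times d\mathbf{r}$ can be evaluated as follows. The vector field $\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}$
and $d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$ so
\[
\begin{aligned}
\mathbf{F}\times d\mathbf{r} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ F_1 & F_2 & F_3 \\ dx & dy & dz \end{vmatrix}
= (F_2\,dz - F_3\,dy)\mathbf{i} + (F_3\,dx - F_1\,dz)\mathbf{j} + (F_1\,dy - F_2\,dx)\mathbf{k} \\
&= (F_3\mathbf{j} - F_2\mathbf{k})\,dx + (F_1\mathbf{k} - F_3\mathbf{i})\,dy + (F_2\mathbf{i} - F_1\mathbf{j})\,dz
\end{aligned}
\]
There are a maximum of six terms involved in one such integral; the exact details may dictate which
form to use.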
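Example 14
Evaluate the integral $\int_C (x^2\mathbf{i} + 3xy\mathbf{j})\times d\mathbf{r}$ where C represents the curve $y = 2x^2$
from (0, 0) to (1, 2).
Solution
Note that the z components of F and dr are both zero. So
\[
\mathbf{F}\times d\mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x^2 & 3xy & 0 \\ dx & dy & 0 \end{vmatrix} = (x^2\,dy - 3xy\,dx)\mathbf{k}
\]
and
\[
\int_C (x^2\mathbf{i} + 3xy\mathbf{j})\times d\mathbf{r} = \int_C (x^2\,dy - 3xy\,dx)\mathbf{k}
\]
Now, on C, $y = 2x^2$ so $dy = 4x\,dx$ and
\[
\begin{aligned}
\int_C (x^2\mathbf{i} + 3xy\mathbf{j})\times d\mathbf{r} &= \int_C \{x^2\,dy - 3xy\,dx\}\mathbf{k} \\
&= \int_{x=0}^{1} \left\{x^2\times 4x\,dx - 3x\times 2x^2\,dx\right\}\mathbf{k} \\
&= \int_0^1 -2x^3\,dx\,\mathbf{k} \\
&= -\left[\tfrac{1}{2}x^4\right]_0^1\mathbf{k} \\
&= -\tfrac{1}{2}\mathbf{k}
\end{aligned}
\]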
Force on a loop from a magnetic field
Introduction
A current I in a magnetic field B is subject to a force F given by
\[
\mathbf{F} = I\,d\mathbf{l}\times\mathbf{B}
\]
where the current can be regarded as having magnitude I and flowing (positive charge) in the
direction given by the vector dl. The force is known as the Lorentz force and is responsible for the
workings of an electric motor. If current flows around a loop, the total force on the loop is given by
the integral of F around the loop, i.e.
\[
\mathbf{F} = \oint (I\,d\mathbf{l}\times\mathbf{B}) = -I\oint (\mathbf{B}\times d\mathbf{l})
\]
where the closed path of the integral represents one circuit of the loop.
Figure 5: The magnetic field through a loop of current

Problem in words
A current of 1 amp flows around a circuit in the shape of the unit circle in the Oxy plane. A field
of 1 gauss in the positive z-direction is present. Find the total force on the circuit loop.
Choose an origin at the centre of the circuit and use polar coordinates to describe the position of
any point on the circuit and the length of a small element.
Calculate the line integral around the circuit representing the force using the given values of current
and magnetic field.
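The circuit is described parametrically by
\[
x = \cos\theta, \qquad y = \sin\theta, \qquad z = 0
\]
with
\[
d\mathbf{l} = -\sin\theta\,d\theta\,\mathbf{i} + \cos\theta\,d\theta\,\mathbf{j}, \qquad \mathbf{B} = B\mathbf{k}
\]
since B is constant. Therefore, the force on the circuit is given by
\[
\mathbf{F} = -IB\oint \mathbf{k}\times d\mathbf{l} = -\oint \mathbf{k}\times d\mathbf{l} \qquad (\text{since } I = 1\ \text{A and } B = 1\ \text{G})
\]
where
\[
\mathbf{k}\times d\mathbf{l} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 0 & 1 \\ -\sin\theta\,d\theta & \cos\theta\,d\theta & 0 \end{vmatrix}
= \left(-\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j}\right)d\theta
\]
So
\[
\begin{aligned}
\mathbf{F} &= -\int_{\theta=0}^{2\pi}\left(-\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j}\right)d\theta \\
&= \Big[\sin\theta\,\mathbf{i} - \cos\theta\,\mathbf{j}\Big]_{\theta=0}^{2\pi} \\
&= (0-0)\,\mathbf{i} - (1-1)\,\mathbf{j} = \mathbf{0}
\end{aligned}
\]
Hence there is no net force on the loop.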

Interpretation
At any given point of the circle, the force on the point opposite is of the same magnitude but opposite
direction, and so cancels, leaving a zero net force.
Tip: Use symmetry argument to avoid detailed calculations whenever possible!
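The cancellation described above can be illustrated by adding up the force contributions I dl × B on many short chords of the circle. This is a minimal Python sketch; the helper `cross`, the number of chords and the conversion 1 gauss = 10⁻⁴ tesla are the only ingredients not written out in the text.

```python
import math


def cross(a, b):
    """Vector product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


current = 1.0              # amperes
field = (0.0, 0.0, 1e-4)   # 1 gauss expressed in tesla, along the z-direction
n = 3600                   # number of chords approximating the unit circle

force = [0.0, 0.0, 0.0]
for k in range(n):
    th0 = 2 * math.pi * k / n
    th1 = 2 * math.pi * (k + 1) / n
    # chord element dl between two neighbouring points on the circle
    dl = (math.cos(th1) - math.cos(th0), math.sin(th1) - math.sin(th0), 0.0)
    dF = cross(dl, field)
    for i in range(3):
        force[i] += current * dF[i]

print(force)  # every component is zero up to rounding error
```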
Magnetic field from a line current

Introduction
A charge Q, moving at a steady velocity v, produces a magnetic field given by
\[
d\mathbf{B} = \frac{Q\mu_0}{4\pi r^2}\,(\mathbf{v}\times\hat{\mathbf{r}})
\]
where $\mu_0$ is the permeability of free space ($4\pi\times10^{-7}$ H m$^{-1}$), r is the position vector from the point
of interest, P, to the line current, r is its magnitude and $\hat{\mathbf{r}}$ is the unit vector along r.
If, instead of a single charge, a current is used, then it is necessary to integrate over all charges in
the current. So, the total magnetic field due to the current is given by
\[
\mathbf{B} = \int_C^D d\mathbf{B} = \int_C^D \frac{\mu_0 I}{4\pi}\,\frac{d\mathbf{l}\times\hat{\mathbf{r}}}{r^2}
\]
where I dl is the continuous form of Qv, $\hat{\mathbf{r}}$ is a unit vector along r and the current extends from C
to D. Note that the field is perpendicular to both the current and the line from the current to P.
Problem in words
Find the magnetic field strength (or magnetic flux density), measured in tesla (T), due to a current
I directed vertically downwards, starting at z = c and ending at z = −d. What is the field 1 m from
the current when c = 5 m, d = 10 m and I = 1 A?
Figure 6: The current element dl and point P where the field is calculated
Here
\[
I\,d\mathbf{l} = -I\mathbf{k}\,dz
\]
i.e. $d\mathbf{l} = -\mathbf{k}\,dz$ (pointing downwards). Imagine (without loss of generality) a point P a distance h
from the line current and a distance z below a typical line element of the current. The increment of
field is given by
\[
d\mathbf{B} = \frac{\mu_0 I}{4\pi(h^2+z^2)}\,d\mathbf{l}\times\hat{\mathbf{r}}
\]
where $\sqrt{h^2+z^2}$ is the distance of P from the typical element. Since $d\mathbf{l} = -\mathbf{k}\,dz$ and $\hat{\mathbf{r}}$ is a unit
vector, the magnitude of the vector product is
\[
|d\mathbf{l}\times\hat{\mathbf{r}}| = \sin\phi\,dz = \frac{h\,dz}{\sqrt{h^2+z^2}}
\]
and is in a direction which (by the right-hand rule) points OUT of the page to the right of the
line and IN to it on the left. Knowing the direction of the field, now calculate the magnitude: the
increment of field is given by
\[
dB = \frac{\mu_0 I}{4\pi(h^2+z^2)}\,\frac{h\,dz}{\sqrt{h^2+z^2}} = \frac{\mu_0 I}{4\pi}\,h(h^2+z^2)^{-3/2}\,dz
\]
so that the total field at a point is
\[
B = \int_{z=-d}^{c} \frac{\mu_0 I}{4\pi}\,h(h^2+z^2)^{-3/2}\,dz
\]
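This integral can be evaluated by means of the substitution $z = h\tan u$ where
\[
\begin{aligned}
z = h\tan u \ &\Rightarrow\ dz = h\sec^2 u\,du \\
z = c \ &\Rightarrow\ u = \tan^{-1}(c/h) = u_c \\
z = -d \ &\Rightarrow\ u = -\tan^{-1}(d/h) = u_d
\end{aligned}
\]
Substituting into the total field integral gives
\[
\begin{aligned}
B &= \frac{\mu_0 I}{4\pi}\int_{u_d}^{u_c} h(h^2\sec^2 u)^{-3/2}\,h\sec^2 u\,du \\
&= \frac{\mu_0 I}{4\pi}\int_{u_d}^{u_c} \frac{\cos u\,du}{h} \\
&= \frac{\mu_0 I}{4\pi h}\Big[\sin u\Big]_{u_d}^{u_c} \\
&= \frac{\mu_0 I}{4\pi h}\left(\frac{c/h}{\sqrt{1+(c/h)^2}} + \frac{d/h}{\sqrt{1+(d/h)^2}}\right) \qquad \text{as } \sin(\tan^{-1}y) = \frac{y}{\sqrt{1+y^2}} \\
&= \frac{\mu_0 I}{4\pi h}\left(\frac{c}{\sqrt{h^2+c^2}} + \frac{d}{\sqrt{h^2+d^2}}\right)
\end{aligned}
\]
and $\mathbf{B} = B\hat{\boldsymbol{\theta}}$ where $\hat{\boldsymbol{\theta}}$ is a unit vector in a circumferential direction around the line current. Now if
I = 1 A, c = 5 m, d = 10 m and h = 1 m the magnetic field becomes
\[
B = 10^{-7}\left(\frac{5}{\sqrt{26}} + \frac{10}{\sqrt{101}}\right) = 1.98\times10^{-7}\ \text{T} = 1.98\ \text{milli-gauss}.
\]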
Interpretation
Note that if c and d → ∞ then
\[
B = \frac{\mu_0 I}{4\pi h}\left(\frac{c}{\sqrt{h^2+c^2}} + \frac{d}{\sqrt{h^2+d^2}}\right) \rightarrow \frac{\mu_0 I}{4\pi h}\,[2] = \frac{\mu_0 I}{2\pi h}
\]
i.e. the field lines are circles around the line current and the magnetic field strength is inversely
proportional to the distance of the point of interest P from the current.
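As a cross-check on the substitution, the closed-form result can be compared with a direct numerical summation of the field integral. The Python sketch below does this for the values I = 1 A, h = 1 m, c = 5 m, d = 10 m used above; the function names and the midpoint rule are illustrative assumptions.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, H/m


def field_closed_form(current, h, c, d):
    """Field magnitude from the final formula obtained by the substitution z = h tan u."""
    return MU0 * current / (4 * math.pi * h) * (
        c / math.hypot(h, c) + d / math.hypot(h, d))


def field_numeric(current, h, c, d, n=20000):
    """Midpoint-rule approximation of the integral of h (h^2 + z^2)^(-3/2) from z = -d to z = c."""
    dz = (c + d) / n
    total = 0.0
    for k in range(n):
        z = -d + (k + 0.5) * dz
        total += h * (h * h + z * z) ** -1.5 * dz
    return MU0 * current / (4 * math.pi) * total


b_formula = field_closed_form(1.0, 1.0, 5.0, 10.0)
b_sum = field_numeric(1.0, 1.0, 5.0, 10.0)
b_infinite = MU0 * 1.0 / (2 * math.pi * 1.0)  # limiting value for an infinitely long wire

print(b_formula, b_sum, b_infinite)
```

The finite-wire value should sit just below the infinite-wire limit, consistent with the Interpretation.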
A scalar or vector involved in a vector line integral may itself be a vector derivative as this next
Example illustrates.
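Example 15
Find the vector line integral $\int_C (\nabla\cdot\mathbf{F})\,d\mathbf{r}$ where F is the vector $x^2\mathbf{i} + 2xy\mathbf{j} + 2xz\mathbf{k}$
and C is the curve $y = x^2$, $z = x^3$ from x = 0 to x = 1 i.e. from (0, 0, 0) to
(1, 1, 1).
Solution
As $\mathbf{F} = x^2\mathbf{i} + 2xy\mathbf{j} + 2xz\mathbf{k}$, $\nabla\cdot\mathbf{F} = 2x + 2x + 2x = 6x$.
The integral
\[
\begin{aligned}
\int_C (\nabla\cdot\mathbf{F})\,d\mathbf{r} &= \int_C 6x\,(dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}) \\
&= \int_C 6x\,dx\,\mathbf{i} + \int_C 6x\,dy\,\mathbf{j} + \int_C 6x\,dz\,\mathbf{k}
\end{aligned}
\]
The first term is
\[
\int_C 6x\,dx\,\mathbf{i} = \int_{x=0}^{1} 6x\,dx\,\mathbf{i} = \Big[3x^2\Big]_0^1\mathbf{i} = 3\mathbf{i}
\]
In the second term, as $y = x^2$ on C, dy may be replaced by $2x\,dx$ so
\[
\int_C 6x\,dy\,\mathbf{j} = \int_{x=0}^{1} 6x\times 2x\,dx\,\mathbf{j} = \int_0^1 12x^2\,dx\,\mathbf{j} = \Big[4x^3\Big]_0^1\mathbf{j} = 4\mathbf{j}
\]
In the third term, as $z = x^3$ on C, dz may be replaced by $3x^2\,dx$ so
\[
\int_C 6x\,dz\,\mathbf{k} = \int_{x=0}^{1} 6x\times 3x^2\,dx\,\mathbf{k} = \int_0^1 18x^3\,dx\,\mathbf{k} = \left[\tfrac{9}{2}x^4\right]_0^1\mathbf{k} = \tfrac{9}{2}\mathbf{k}
\]
On summing, $\int_C (\nabla\cdot\mathbf{F})\,d\mathbf{r} = 3\mathbf{i} + 4\mathbf{j} + \tfrac{9}{2}\mathbf{k}$.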
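Task
Find the vector line integral $\int_C f\,d\mathbf{r}$ where $f = x^2$ and C is
(a) the curve $y = x^{1/2}$ from (0, 0) to (9, 3).
(b) the line $y = x/3$ from (0, 0) to (9, 3).
Your solution
Answer
(a) $\int_0^9 \left(x^2\mathbf{i} + \tfrac{1}{2}x^{3/2}\mathbf{j}\right)dx = 243\mathbf{i} + \tfrac{243}{5}\mathbf{j}$, (b) $\int_0^9 \left(x^2\mathbf{i} + \tfrac{1}{3}x^2\mathbf{j}\right)dx = 243\mathbf{i} + 81\mathbf{j}$.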
Task
Evaluate the vector line integral $\int_C \mathbf{F}\times d\mathbf{r}$ when C represents the contour
$y = 4 - 4x$, $z = 2 - 2x$ from (0, 4, 2) to (1, 0, 0) and F is the vector field $(x - z)\mathbf{j}$.
Your solution
Answer
$\int_0^1 \{(4 - 6x)\mathbf{i} + (2 - 3x)\mathbf{k}\}\,dx = \mathbf{i} + \tfrac{1}{2}\mathbf{k}$
Exercises
1. Evaluate the vector line integral $\int_C (\nabla\cdot\mathbf{F})\,d\mathbf{r}$ in the case where $\mathbf{F} = x\mathbf{i} + xy\mathbf{j} + xy^2\mathbf{k}$ and
C is the contour described by $x = 2t$, $y = t^2$, $z = 1 - t$ for t starting at t = 0 and going to
t = 1.
2. When C is the contour $y = x^3$, $z = 0$, from (0, 0, 0) to (1, 1, 0), evaluate the vector line
integrals
(a) $\int_C \{\nabla(xy)\}\times d\mathbf{r}$
(b) $\int_C \{\nabla\times(x^2\mathbf{i} + y^2\mathbf{k})\}\times d\mathbf{r}$
Answers
1. $4\mathbf{i} + \tfrac{7}{3}\mathbf{j} - 2\mathbf{k}$,
2. (a) 0, (b) k
Surface and Volume Integrals 29.2
Introduction
A vector or scalar field - including one formed from a vector derivative (div, grad or curl) - can be
integrated over a surface or volume. This Section shows how to carry out such operations.

Prerequisites
• be familiar with vector derivatives
• be familiar with double and triple integrals
Learning Outcomes
• carry out operations involving integrations of scalar and vector fields
1. Surface integrals involving vectors

The unit normal
For the surface of any three-dimensional shape, it is possible to find a vector lying perpendicular to
the surface and with magnitude 1. The unit vector points outwards from the surface and is usually
denoted by n̂.
Example 16
If S is the surface of the sphere $x^2 + y^2 + z^2 = a^2$ find the unit normal $\hat{\mathbf{n}}$.
Solution
The unit normal at the point P(x, y, z) points away from the centre of the sphere i.e. it lies in
the direction of $x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$. To make this a unit vector it must be divided by its magnitude
$\sqrt{x^2+y^2+z^2}$ i.e. the unit vector
\[
\begin{aligned}
\hat{\mathbf{n}} &= \frac{x}{\sqrt{x^2+y^2+z^2}}\mathbf{i} + \frac{y}{\sqrt{x^2+y^2+z^2}}\mathbf{j} + \frac{z}{\sqrt{x^2+y^2+z^2}}\mathbf{k} \\
&= \frac{x}{a}\mathbf{i} + \frac{y}{a}\mathbf{j} + \frac{z}{a}\mathbf{k}
\end{aligned}
\]
where $a = \sqrt{x^2+y^2+z^2}$.
Figure 7: A unit normal $\hat{\mathbf{n}}$ to a sphere
Example 17
For the cube 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1, find the unit normal n̂ for each
face.
Solution
On the face given by x = 0, the unit normal points in the negative x-direction. Hence the unit
normal is −i. Similarly :-
On the face x = 1 the unit normal is i. On the face y = 0 the unit normal is −j.
On the face y = 1 the unit normal is j. On the face z = 0 the unit normal is −k.
On the face z = 1 the unit normal is k.
dS and the unit normal

The quantity dS is a vector representing an element of the surface, with magnitude du dv and direction
perpendicular to the surface.
If the plane in question is the Oxy plane, then $d\mathbf{S} = \hat{\mathbf{n}}\,du\,dv = \mathbf{k}\,dx\,dy$.
Figure 8: The vector dS being an element of a surface, with magnitude du dv
If the plane in question is not one of the three coordinate planes (Oxy, Oxz, Oyz), appropriate
adjustments must be made to express dS in terms of two of dx, dy and dz.
Example 18
The rectangle OABC lies in the plane z = y (Figure 9).
The vertices are O = (0, 0, 0), A = (1, 0, 0), B = (1, 1, 1) and C = (0, 1, 1).
Find a unit vector $\hat{\mathbf{n}}$ normal to the plane and an appropriate vector dS expressed
in terms of dx and dy.
Figure 9: The plane z = y passing through OABC
Solution
Note that two vectors in the rectangle are $\overrightarrow{OA} = \mathbf{i}$ and $\overrightarrow{OC} = \mathbf{j} + \mathbf{k}$. A vector perpendicular to the
plane is $\mathbf{i}\times(\mathbf{j}+\mathbf{k}) = -\mathbf{j} + \mathbf{k}$. However, this vector is of magnitude $\sqrt{2}$ so the unit normal vector
is $\hat{\mathbf{n}} = \frac{1}{\sqrt{2}}(-\mathbf{j}+\mathbf{k}) = -\frac{1}{\sqrt{2}}\mathbf{j} + \frac{1}{\sqrt{2}}\mathbf{k}$.
The vector dS is therefore $\left(-\frac{1}{\sqrt{2}}\mathbf{j} + \frac{1}{\sqrt{2}}\mathbf{k}\right)du\,dv$ where du and dv are increments in the plane of the
rectangle OABC. Now, one increment, say du, may point in the x-direction while dv will point in a
direction up the plane, parallel to OC. Thus $du = dx$ and (by Pythagoras) $dv = \sqrt{(dy)^2 + (dz)^2}$.
However, as z = y, dz = dy and hence $dv = \sqrt{2}\,dy$.
Thus, $d\mathbf{S} = \left(-\frac{1}{\sqrt{2}}\mathbf{j} + \frac{1}{\sqrt{2}}\mathbf{k}\right)dx\,\sqrt{2}\,dy = (-\mathbf{j}+\mathbf{k})\,dx\,dy$.
Note: the factor of $\sqrt{2}$ could also have been found by comparing the area of rectangle OABC,
i.e. $\sqrt{2}$, with the area of its projection in the Oxy plane i.e. OADE with area 1.
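The normal vector in Example 18 comes from a single vector product, which is easy to reproduce. The short Python sketch below uses the two edge vectors OA and OC from the Solution; the helper name `cross` is an illustrative choice.

```python
import math


def cross(a, b):
    """Vector product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


oa = (1.0, 0.0, 0.0)   # the edge vector OA = i
oc = (0.0, 1.0, 1.0)   # the edge vector OC = j + k

normal = cross(oa, oc)                           # a vector perpendicular to the plane z = y
length = math.sqrt(sum(c * c for c in normal))   # its magnitude, sqrt(2)
unit_normal = tuple(c / length for c in normal)

print(normal, length, unit_normal)
```

Multiplying the unit normal by the scale factor √2 recovers the un-normalised vector −j + k, which is the coefficient of dx dy in dS.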
Integrating a scalar field

A function can be integrated over a surface by constructing a double integral and integrating in a
manner similar to that shown in Sections 27.1 and 27.2. Often, such integrals can be carried out
with respect to an element containing the unit normal.
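Example 19
Evaluate the integral
\[
\int_A \frac{1}{1+x^2}\,d\mathbf{S}
\]
over the area A where A is the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, z = 0.
Solution
In this integral, dS becomes $\mathbf{k}\,dx\,dy$ i.e. the unit normal times the surface element. Thus the integral
is
\[
\begin{aligned}
\int_{y=0}^{1}\int_{x=0}^{1} \frac{\mathbf{k}}{1+x^2}\,dx\,dy &= \mathbf{k}\int_{y=0}^{1}\Big[\tan^{-1}x\Big]_0^1\,dy \\
&= \mathbf{k}\int_{y=0}^{1}\left(\frac{\pi}{4} - 0\right)dy = \frac{\pi}{4}\mathbf{k}\int_{y=0}^{1}dy \\
&= \frac{\pi}{4}\mathbf{k}
\end{aligned}
\]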
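Example 20
Find $\iint_S u\,d\mathbf{S}$ where $u = r^2 = x^2 + y^2 + z^2$ and S is the surface of the unit cube
0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1.
Solution
The unit cube has six faces and the unit normal vector $\hat{\mathbf{n}}$ points in a different direction on each face.
The surface integral must be evaluated for each face separately and the results summed.
On the face x = 0, the unit normal $\hat{\mathbf{n}} = -\mathbf{i}$ and the surface integral is
\[
\begin{aligned}
\int_{y=0}^{1}\int_{z=0}^{1} (0^2 + y^2 + z^2)(-\mathbf{i})\,dz\,dy &= -\mathbf{i}\int_{y=0}^{1}\left[y^2z + \tfrac{1}{3}z^3\right]_{z=0}^{1}dy \\
&= -\mathbf{i}\int_{y=0}^{1}\left(y^2 + \tfrac{1}{3}\right)dy = -\mathbf{i}\left[\tfrac{1}{3}y^3 + \tfrac{1}{3}y\right]_0^1 = -\tfrac{2}{3}\mathbf{i}
\end{aligned}
\]
On the face x = 1, the unit normal $\hat{\mathbf{n}} = \mathbf{i}$ and the surface integral is
\[
\begin{aligned}
\int_{y=0}^{1}\int_{z=0}^{1} (1^2 + y^2 + z^2)(\mathbf{i})\,dz\,dy &= \mathbf{i}\int_{y=0}^{1}\left[z + y^2z + \tfrac{1}{3}z^3\right]_{z=0}^{1}dy \\
&= \mathbf{i}\int_{y=0}^{1}\left(y^2 + \tfrac{4}{3}\right)dy = \mathbf{i}\left[\tfrac{1}{3}y^3 + \tfrac{4}{3}y\right]_0^1 = \tfrac{5}{3}\mathbf{i}
\end{aligned}
\]
The net contribution from the faces x = 0 and x = 1 is $-\tfrac{2}{3}\mathbf{i} + \tfrac{5}{3}\mathbf{i} = \mathbf{i}$.
Due to the symmetry of the scalar field u and the unit cube, the net contribution from the faces
y = 0 and y = 1 is j while the net contribution from the faces z = 0 and z = 1 is k.
The sum i.e. the surface integral $\iint_S u\,d\mathbf{S} = \mathbf{i} + \mathbf{j} + \mathbf{k}$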
Key Point 4
A scalar function integrated with respect to a unit normal gives a vector quantity.
When the surface does not lie in one of the planes Oxy, Oxz, Oyz, extra care must be taken when
finding dS.
Example 21
Find $\iint_S f\,d\mathbf{S}$ where f is the function 2x and S is the surface of the triangle
bounded by (0, 0, 0), (0, 1, 1) and (1, 0, 1). (See Figure 10.)
Figure 10: The triangle defining the area S (area $\frac{\sqrt{3}}{2}$; its projection in the Oxy plane has area $\frac{1}{2}$)
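Solution
The unit vector $\hat{\mathbf{n}}$ is perpendicular to two vectors in the plane e.g. $(\mathbf{j}+\mathbf{k})$ and $(\mathbf{i}+\mathbf{k})$. The vector
$(\mathbf{j}+\mathbf{k})\times(\mathbf{i}+\mathbf{k}) = \mathbf{i} + \mathbf{j} - \mathbf{k}$ and has magnitude $\sqrt{3}$. Hence the normal vector $\hat{\mathbf{n}} = \frac{1}{\sqrt{3}}\mathbf{i} + \frac{1}{\sqrt{3}}\mathbf{j} - \frac{1}{\sqrt{3}}\mathbf{k}$.
As the area of the triangle is $\frac{\sqrt{3}}{2}$ and the area of its projection in the Oxy plane is $\frac{1}{2}$, the vector
\[
d\mathbf{S} = \frac{\sqrt{3}/2}{1/2}\,\hat{\mathbf{n}}\,dy\,dx = (\mathbf{i} + \mathbf{j} - \mathbf{k})\,dy\,dx.
\]
Thus
\[
\begin{aligned}
\iint_S f\,d\mathbf{S} &= (\mathbf{i} + \mathbf{j} - \mathbf{k})\int_{x=0}^{1}\int_{y=0}^{1-x} 2x\,dy\,dx \\
&= (\mathbf{i} + \mathbf{j} - \mathbf{k})\int_{x=0}^{1}\Big[2xy\Big]_{y=0}^{1-x}dx \\
&= (\mathbf{i} + \mathbf{j} - \mathbf{k})\int_{x=0}^{1}(2x - 2x^2)\,dx \\
&= (\mathbf{i} + \mathbf{j} - \mathbf{k})\left[x^2 - \tfrac{2}{3}x^3\right]_0^1 = \tfrac{1}{3}(\mathbf{i} + \mathbf{j} - \mathbf{k})
\end{aligned}
\]
The scalar function being integrated may be the divergence of a suitable vector function.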
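Example 22
Find $\iint_S (\nabla\cdot\mathbf{F})\,d\mathbf{S}$ where $\mathbf{F} = 2x\mathbf{i} + yz\mathbf{j} + xy\mathbf{k}$ and S is the surface of the
triangle with vertices at (0, 0, 0), (1, 0, 0) and (1, 1, 0).
Solution
Note that $\nabla\cdot\mathbf{F} = 2 + z = 2$ as z = 0 everywhere along S. As the triangle lies in the Oxy plane,
the normal vector $\hat{\mathbf{n}} = \mathbf{k}$ and $d\mathbf{S} = \mathbf{k}\,dy\,dx$.
Thus,
\[
\iint_S (\nabla\cdot\mathbf{F})\,d\mathbf{S} = \int_{x=0}^{1}\int_{y=0}^{x} 2\,dy\,dx\,\mathbf{k} = \int_0^1\Big[2y\Big]_0^x dx\,\mathbf{k} = \int_0^1 2x\,dx\,\mathbf{k} = \Big[x^2\Big]_0^1\mathbf{k} = \mathbf{k}
\]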
Task
Evaluate the integral $\iint_S 4x\,d\mathbf{S}$ where S represents the trapezium with vertices
at (0, 0), (3, 0), (2, 1) and (0, 1).
(a) Find the vector dS:
Your solution
Answer
$\mathbf{k}\,dx\,dy$
(b) Write the surface integral as a double integral:
Your solution
Answer
$\int_{y=0}^{1}\int_{x=0}^{3-y} 4x\,dx\,dy\,\mathbf{k}$. The range of values of y is y = 0 to y = 1.
For each value of y, x varies from x = 0 to x = 3 − y
(c) Evaluate this double integral:
Your solution
Answer
$\frac{38}{3}\mathbf{k}$
Exercises
1. Evaluate the integral $\iint_S xy\,d\mathbf{S}$ where S is the triangle with vertices at (0, 0, 4), (0, 2, 0) and
(1, 0, 0).
2. Find the integral $\iint_S xyz\,d\mathbf{S}$ where S is the surface of the unit cube 0 ≤ x ≤ 1, 0 ≤ y ≤ 1,
0 ≤ z ≤ 1.
3. Evaluate the integral $\iint_S \left\{\nabla\cdot(x^2\mathbf{i} + yz\mathbf{j} + x^2y\mathbf{k})\right\}d\mathbf{S}$ where S is the rectangle with vertices
at (1, 0, 0), (1, 1, 0), (1, 1, 1) and (1, 0, 1).
Answers 1. $\tfrac{2}{3}\mathbf{i} + \tfrac{1}{3}\mathbf{j} + \tfrac{1}{6}\mathbf{k}$, 2. $\tfrac{1}{4}(\mathbf{i} + \mathbf{j} + \mathbf{k})$, 3. $\tfrac{5}{2}\mathbf{i}$
Integrating a vector field

In a similar manner to the case of a scalar field, a vector field may be integrated over a surface.
Two common integrals are $\int_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}$ and $\int_S \mathbf{F}(\mathbf{r})\times d\mathbf{S}$ which integrate to a scalar and a vector
respectively. Again, when dS is expressed appropriately, the expression will reduce to a double
integral. The form $\int_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}$ has many important applications, e.g. flux.
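Example 23
Evaluate the integral
\[
\int_A \left(x^2y\,\mathbf{i} + z\,\mathbf{j} + (2x+y)\,\mathbf{k}\right)\cdot d\mathbf{S}
\]
over the area A where A is the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, z = 0.
Solution
On A, the unit normal is k and $d\mathbf{S} = dx\,dy\,\mathbf{k}$ so the integral becomes
\[
\begin{aligned}
\int_A \left(x^2y\,\mathbf{i} + z\,\mathbf{j} + (2x+y)\,\mathbf{k}\right)\cdot(\mathbf{k}\,dx\,dy)
&= \int_{y=0}^{1}\int_{x=0}^{1}(2x+y)\,dx\,dy = \int_{y=0}^{1}\Big[x^2 + xy\Big]_{x=0}^{1}dy \\
&= \int_{y=0}^{1}(1+y)\,dy = \left[y + \tfrac{1}{2}y^2\right]_0^1 = \tfrac{3}{2}
\end{aligned}
\]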
Example 24
Evaluate $\int_A \mathbf{r}\cdot d\mathbf{S}$ where A represents the surface of the unit cube 0 ≤ x ≤ 1,
0 ≤ y ≤ 1, 0 ≤ z ≤ 1 and r represents the vector $x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$.
Solution
The vector dS (in the direction of the normal vector) will be a constant vector on each face, but
will be different for each face.
On the face x = 0 (left), $d\mathbf{S} = -dy\,dz\,\mathbf{i}$ and the integral on this face is
\[
\int_{z=0}^{1}\int_{y=0}^{1}(0\mathbf{i} + y\mathbf{j} + z\mathbf{k})\cdot(-dy\,dz\,\mathbf{i}) = \int_{z=0}^{1}\int_{y=0}^{1} 0\,dy\,dz = 0
\]
Similarly on the face y = 0 (front), $d\mathbf{S} = -dx\,dz\,\mathbf{j}$ and the integral on this face is
\[
\int_{z=0}^{1}\int_{x=0}^{1}(x\mathbf{i} + 0\mathbf{j} + z\mathbf{k})\cdot(-dx\,dz\,\mathbf{j}) = \int_{z=0}^{1}\int_{x=0}^{1} 0\,dx\,dz = 0
\]
Furthermore on the face z = 0 (bottom), $d\mathbf{S} = -dx\,dy\,\mathbf{k}$ and the integral on this face is
\[
\int_{x=0}^{1}\int_{y=0}^{1}(x\mathbf{i} + y\mathbf{j} + 0\mathbf{k})\cdot(-dx\,dy\,\mathbf{k}) = \int_{x=0}^{1}\int_{y=0}^{1} 0\,dx\,dy = 0
\]
On these three faces, the contribution to the integral is zero. However, on the face x = 1 (right),
$d\mathbf{S} = +dy\,dz\,\mathbf{i}$ and the integral on this face is
\[
\int_{z=0}^{1}\int_{y=0}^{1}(1\mathbf{i} + y\mathbf{j} + z\mathbf{k})\cdot(+dy\,dz\,\mathbf{i}) = \int_{z=0}^{1}\int_{y=0}^{1} 1\,dy\,dz = 1
\]
Similarly, on the face y = 1 (back), $d\mathbf{S} = +dx\,dz\,\mathbf{j}$ and the integral on this face is
\[
\int_{z=0}^{1}\int_{x=0}^{1}(x\mathbf{i} + 1\mathbf{j} + z\mathbf{k})\cdot(+dx\,dz\,\mathbf{j}) = \int_{z=0}^{1}\int_{x=0}^{1} 1\,dx\,dz = 1
\]
and finally, on the face z = 1 (top), $d\mathbf{S} = +dx\,dy\,\mathbf{k}$ and the integral on this face is
\[
\int_{y=0}^{1}\int_{x=0}^{1}(x\mathbf{i} + y\mathbf{j} + 1\mathbf{k})\cdot(+dx\,dy\,\mathbf{k}) = \int_{y=0}^{1}\int_{x=0}^{1} 1\,dx\,dy = 1
\]
Adding together the contributions from the various faces gives $\int_A \mathbf{r}\cdot d\mathbf{S} = 0 + 0 + 0 + 1 + 1 + 1 = 3$
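The face-by-face bookkeeping of Example 24 can be automated. The following Python sketch loops over the six faces of the unit cube, takes the outward normal on each one and sums F · dS on a grid of small squares. The function name `flux_through_unit_cube` and the grid size are illustrative assumptions.

```python
def flux_through_unit_cube(F, n=50):
    """Approximate the outward flux of F(x, y, z) through the surface of the unit cube.

    Each face is divided into n x n small squares and F is sampled at the
    centre of each square; only the component along the face normal contributes.
    """
    h = 1.0 / n
    total = 0.0
    for axis in range(3):                       # 0: x-faces, 1: y-faces, 2: z-faces
        for side, sign in ((0.0, -1.0), (1.0, 1.0)):
            for i in range(n):
                for j in range(n):
                    point = [(i + 0.5) * h, (j + 0.5) * h]
                    point.insert(axis, side)    # fix the coordinate normal to the face
                    total += sign * F(*point)[axis] * h * h
    return total


# The field r = x i + y j + z k of Example 24
flux_r = flux_through_unit_cube(lambda x, y, z: (x, y, z))
print(flux_r)
```

The printed value should agree with the total of 3 found above, with only the faces x = 1, y = 1 and z = 1 contributing.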
Magnetic flux
Introduction
The magnetic flux through a surface is given by $\iint_S \mathbf{B}\cdot d\mathbf{S}$ where S is the surface under consideration,
B is the magnetic field and dS is the vector normal to the surface.
Problem in words
The field generated by an infinitely long vertical wire on the z-axis is given by
\[
\mathbf{B} = \frac{\mu_0 I}{2\pi}\,\frac{-y\mathbf{i} + x\mathbf{j}}{x^2 + y^2}
\]
Find the flux through a rectangular region (with sides parallel to the axes) on the plane y = 0.
Find the integral $\iint_S \mathbf{B}\cdot d\mathbf{S}$ over the surface, $x_1 \le x \le x_2$, $z_1 \le z \le z_2$. (See Figure 11, which
shows part of the plane y = 0 for which the flux is found and a single magnetic field line. The
strength of the field is inversely proportional to the distance from the axis.)
Figure 11: The surface S defined by $x_1 \le x \le x_2$, $z_1 \le z \le z_2$
On y = 0, $\mathbf{B} = \dfrac{\mu_0 I}{2\pi x}\mathbf{j}$ and $d\mathbf{S} = dx\,dz\,\mathbf{j}$ so $\mathbf{B}\cdot d\mathbf{S} = \dfrac{\mu_0 I}{2\pi x}\,dx\,dz$ and the flux is given by
the double integral
\[
\begin{aligned}
\int_{z=z_1}^{z_2}\int_{x=x_1}^{x_2}\frac{\mu_0 I}{2\pi x}\,dx\,dz &= \frac{\mu_0 I}{2\pi}\int_{z=z_1}^{z_2}\Big[\ln x\Big]_{x_1}^{x_2}dz \\
&= \frac{\mu_0 I}{2\pi}\int_{z=z_1}^{z_2}(\ln x_2 - \ln x_1)\,dz \\
&= \frac{\mu_0 I}{2\pi}\Big[z(\ln x_2 - \ln x_1)\Big]_{z=z_1}^{z_2} = \frac{\mu_0 I}{2\pi}(z_2 - z_1)\ln\left(\frac{x_2}{x_1}\right)
\end{aligned}
\]
Interpretation
The magnetic flux increases in direct proportion to the extent of the side parallel to the axis (i.e.
along the z-direction) but logarithmically with respect to the extent of the side perpendicular to the
axis (i.e. along the x-axis).
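The flux formula and its two scaling behaviours can be explored numerically. In the Python sketch below the sample values of I, x1, x2, z1 and z2 are hypothetical (the problem is stated in symbols only), and the function names are illustrative.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, H/m


def flux_closed_form(current, x1, x2, z1, z2):
    """Flux from the final formula: (mu0 I / 2 pi) (z2 - z1) ln(x2 / x1)."""
    return MU0 * current / (2 * math.pi) * (z2 - z1) * math.log(x2 / x1)


def flux_numeric(current, x1, x2, z1, z2, n=20000):
    """Midpoint-rule approximation of the double integral of mu0 I / (2 pi x) dx dz."""
    dx = (x2 - x1) / n
    inner = sum(1.0 / (x1 + (k + 0.5) * dx) for k in range(n)) * dx
    return MU0 * current / (2 * math.pi) * inner * (z2 - z1)


# Hypothetical sample values: 1 A, 0.1 m <= x <= 0.2 m, 0 <= z <= 1 m
flux_a = flux_closed_form(1.0, 0.1, 0.2, 0.0, 1.0)
flux_b = flux_numeric(1.0, 0.1, 0.2, 0.0, 1.0)
flux_tall = flux_closed_form(1.0, 0.1, 0.2, 0.0, 2.0)   # side along z doubled
flux_wide = flux_closed_form(1.0, 0.1, 0.4, 0.0, 1.0)   # ratio x2/x1 squared

print(flux_a, flux_b, flux_tall / flux_a, flux_wide / flux_a)
```

Doubling the extent along z doubles the flux, while squaring the ratio x2/x1 also only doubles it, which is the logarithmic dependence noted in the Interpretation.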
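Example 25
If $\mathbf{F} = x^2\mathbf{i} + y^2\mathbf{j} + z^2\mathbf{k}$, evaluate $\iint_S \mathbf{F}\times d\mathbf{S}$ where S is the part of the plane
z = 0 bounded by x = ±1, y = ±1.
Solution
Here $d\mathbf{S} = dx\,dy\,\mathbf{k}$ and hence
\[
\mathbf{F}\times d\mathbf{S} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x^2 & y^2 & z^2 \\ 0 & 0 & dx\,dy \end{vmatrix} = y^2\,dx\,dy\,\mathbf{i} - x^2\,dx\,dy\,\mathbf{j}
\]
and
\[
\iint_S \mathbf{F}\times d\mathbf{S} = \int_{y=-1}^{1}\int_{x=-1}^{1} y^2\,dx\,dy\,\mathbf{i} - \int_{y=-1}^{1}\int_{x=-1}^{1} x^2\,dx\,dy\,\mathbf{j}
\]
The integral
\[
\int_{y=-1}^{1}\int_{x=-1}^{1} y^2\,dx\,dy = \int_{y=-1}^{1}\Big[y^2x\Big]_{x=-1}^{1}dy = \int_{y=-1}^{1} 2y^2\,dy = \left[\tfrac{2}{3}y^3\right]_{-1}^{1} = \tfrac{4}{3}
\]
Similarly $\int_{y=-1}^{1}\int_{x=-1}^{1} x^2\,dx\,dy = \tfrac{4}{3}$.
Thus $\iint_S \mathbf{F}\times d\mathbf{S} = \tfrac{4}{3}\mathbf{i} - \tfrac{4}{3}\mathbf{j}$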
Key Point 5
(a) An integral of the form $\int_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}$ evaluates to a scalar.
(b) An integral of the form $\int_S \mathbf{F}(\mathbf{r})\times d\mathbf{S}$ evaluates to a vector.
The vector function involved may be the gradient of a scalar or the curl of a vector.
Example 26
Integrate $\iint_S (\nabla\phi)\cdot d\mathbf{S}$ where $\phi = x^2 + 2yz$ and S is the area between y = 0 and
$y = x^2$ for 0 ≤ x ≤ 1 and z = 0. (See Figure 12.)
Figure 12: The area S between y = 0 and $y = x^2$, for 0 ≤ x ≤ 1 and z = 0
Solution
Here $\nabla\phi = 2x\mathbf{i} + 2z\mathbf{j} + 2y\mathbf{k}$ and $d\mathbf{S} = \mathbf{k}\,dy\,dx$. Thus $(\nabla\phi)\cdot d\mathbf{S} = 2y\,dy\,dx$ and
\[
\begin{aligned}
\iint_S (\nabla\phi)\cdot d\mathbf{S} &= \int_{x=0}^{1}\int_{y=0}^{x^2} 2y\,dy\,dx \\
&= \int_{x=0}^{1}\Big[y^2\Big]_{y=0}^{x^2}dx = \int_{x=0}^{1} x^4\,dx \\
&= \left[\tfrac{1}{5}x^5\right]_0^1 = \tfrac{1}{5}
\end{aligned}
\]
For integrals of the form $\iint_S \mathbf{F}\cdot d\mathbf{S}$, non-Cartesian coordinates e.g. cylindrical polar or spherical
polar coordinates may be used. Once again, it is necessary to include any scale factors along with
the unit normal.
Example 27
Using cylindrical polar coordinates (see Section 28.3), find the integral $\int_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}$
for $\mathbf{F} = \rho z\hat{\boldsymbol{\rho}} + z\sin^2\phi\,\hat{\mathbf{z}}$ and S being the complete surface (including ends) of the
cylinder ρ ≤ a, 0 ≤ z ≤ 1. (See Figure 13.)
Figure 13: The cylinder ρ ≤ a, 0 ≤ z ≤ 1 defining S
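Solution
The integral $\int_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S}$ must be evaluated separately for the curved surface and the ends.
For the curved surface, $d\mathbf{S} = \hat{\boldsymbol{\rho}}\,a\,d\phi\,dz$ (with the a coming from ρ, the scale factor for φ, and the fact
that ρ = a on the curved surface). Thus, $\mathbf{F}\cdot d\mathbf{S} = a^2z\,d\phi\,dz$ and, as z runs from 0 to 1 on the cylinder,
\[
\begin{aligned}
\iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S} &= \int_{z=0}^{1}\int_{\phi=0}^{2\pi} a^2z\,d\phi\,dz \\
&= 2\pi a^2\int_{z=0}^{1} z\,dz = 2\pi a^2\left[\tfrac{1}{2}z^2\right]_0^1 = \pi a^2
\end{aligned}
\]
On the bottom, z = 0 so F = 0 and the contribution to the integral is zero.
On the top, z = 1 and $d\mathbf{S} = \hat{\mathbf{z}}\,\rho\,d\rho\,d\phi$ and $\mathbf{F}\cdot d\mathbf{S} = \rho z\sin^2\phi\,d\phi\,d\rho = \rho\sin^2\phi\,d\phi\,d\rho$ and
\[
\begin{aligned}
\iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S} &= \int_{\rho=0}^{a}\int_{\phi=0}^{2\pi}\rho\sin^2\phi\,d\phi\,d\rho \\
&= \pi\int_{\rho=0}^{a}\rho\,d\rho = \tfrac{1}{2}\pi a^2
\end{aligned}
\]
So $\iint_S \mathbf{F}(\mathbf{r})\cdot d\mathbf{S} = \pi a^2 + \tfrac{1}{2}\pi a^2 = \tfrac{3}{2}\pi a^2$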
The current continuity equation
Introduction
When an electric current flows at a constant rate through a conductor, then the current continuity
equation states that
\[
\oint_S \mathbf{J}\cdot d\mathbf{S} = 0
\]
where J is the current density (or current flow per unit area) and S is a closed surface. The equation
is an expression of the fact that, under these conditions, the current flow into a closed volume equals
the flow out.
Problem in words
A person is standing nearby when lightning strikes the ground. Find the voltage between the feet of
the person.
Figure 14: Lightning: a current dissipating into the ground
The current from the lightning dissipates radially (see Fig 14).
(a) Find a relationship between the current I and current density J at a distance r from the
strike by integrating the current density over the hemisphere: $I = \int_S \mathbf{J}\cdot d\mathbf{S}$
(b) Find the field E from the equation $E = \dfrac{\rho I}{2\pi r^2}$ where $E = |\mathbf{E}|$ and I is the current.
(c) Find V from the integral $\int_{R_1}^{R_2} \mathbf{E}\cdot d\mathbf{r}$
Imagine a hemisphere of radius r level with the surface of the ground so that the point of lightning
strike is at its centre. By symmetry, the pattern of current flow from the point of strike will be
uniform radial lines, and the magnitude of J will be a constant, i.e. over the curved surface of the
hemisphere J = J r̂.
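Since the amount of current entering the hemisphere is I, then it follows that the current leaving
must be the same i.e.
\[
\begin{aligned}
I &= \int_{S_c} \mathbf{J}\cdot d\mathbf{S} \qquad (\text{where } S_c \text{ is the curved surface of the hemisphere}) \\
&= \int_{S_c} (J\hat{\mathbf{r}})\cdot(dS\,\hat{\mathbf{r}}) \\
&= J\int_{S_c} dS \\
&= 2\pi r^2 J \qquad [= \text{surface area } (2\pi r^2) \times \text{flux } (J)]
\end{aligned}
\]
since the surface area of a sphere is $4\pi r^2$. Therefore
\[
J = \frac{I}{2\pi r^2}
\]
Note that if the current density J is uniformly radial over the curved surface, then the electric field
E must be also, i.e. $\mathbf{E} = E\hat{\mathbf{r}}$. Using Ohm's law
\[
\mathbf{J} = \sigma\mathbf{E} \qquad\text{or}\qquad \mathbf{E} = \rho\mathbf{J}
\]
where σ = conductivity = 1/ρ. Hence
\[
E = \frac{\rho I}{2\pi r^2}
\]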
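The voltage difference between two points at radii $R_1$ and $R_2$ from the lightning strike is found by
integrating E between them, so that
\[
\begin{aligned}
V &= \int_{R_1}^{R_2} \mathbf{E}\cdot d\mathbf{r} \\
&= \int_{R_1}^{R_2} E\,dr \\
&= \frac{\rho I}{2\pi}\int_{R_1}^{R_2}\frac{dr}{r^2} \\
&= \frac{\rho I}{2\pi}\left[\frac{-1}{r}\right]_{R_1}^{R_2} \\
&= \frac{\rho I}{2\pi}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{\rho I}{2\pi}\left(\frac{R_2 - R_1}{R_1R_2}\right)
\end{aligned}
\]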
Interpretation
Suppose the lightning strength is current I = 10, 000 A, the person is 12 m away with feet 0.35 m
apart, and the resistivity of the ground is 80 Ω m. Clearly, the worst case (i.e. maximum voltage)
would occur when the difference between R1 and R2 is greatest, i.e. R1 =12 m and R2 =12.35 m
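which would be the case if both feet were on the same radial line. The voltage produced under these
circumstances is
\[
\begin{aligned}
V &= \frac{\rho I}{2\pi}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \\
&= \frac{80\times10000}{2\pi}\left(\frac{1}{12} - \frac{1}{12.35}\right) \\
&= 300\ \text{V}
\end{aligned}
\]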
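The voltage formula is simple enough to evaluate directly, which makes it easy to try other distances. The Python sketch below uses the figures from the Interpretation; the function name `step_voltage` and the second, more distant case are illustrative assumptions.

```python
import math


def step_voltage(resistivity, current, r1, r2):
    """Voltage between radii r1 and r2 from the strike: (rho I / 2 pi)(1/r1 - 1/r2)."""
    return resistivity * current / (2 * math.pi) * (1.0 / r1 - 1.0 / r2)


# Values from the text: rho = 80 ohm m, I = 10 000 A, feet at 12 m and 12.35 m
v_near = step_voltage(80.0, 10000.0, 12.0, 12.35)

# Hypothetical comparison: the same stance but ten times further away
v_far = step_voltage(80.0, 10000.0, 120.0, 120.35)

print(round(v_near), round(v_far, 1))
```

The first value rounds to about 300 V; the second shows how quickly the voltage between the feet falls with distance from the strike.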
Task
For $\mathbf{F} = (x^2+y^2)\mathbf{i} + (x^2+z^2)\mathbf{j} + 2xz\mathbf{k}$ and S being the rectangle bounded by
(1, 0, 1), (1, 0, −1), (−1, 0, −1) and (−1, 0, 1) find the integral $\int_S \mathbf{F}\cdot d\mathbf{S}$
Your solution
Answer
$d\mathbf{S} = dx\,dz\,\mathbf{j}$, $\quad\int_{-1}^{1}\int_{-1}^{1}(x^2+z^2)\,dx\,dz = \tfrac{8}{3}$
Task
For $\mathbf{F} = (x^2+y^2)\mathbf{i} + (x^2+z^2)\mathbf{j} + 2xz\mathbf{k}$ and S being the rectangle bounded by
(1, 0, 1), (1, 0, −1), (−1, 0, −1) and (−1, 0, 1) (i.e. the same F and S as in the
previous Task), find the integral $\int_S \mathbf{F}\times d\mathbf{S}$
Your solution
Answer
$\int_{-1}^{1}\int_{-1}^{1}(-2xz)\,\mathbf{i}\,dx\,dz + \int_{-1}^{1}\int_{-1}^{1}(x^2+0)\,\mathbf{k}\,dx\,dz = \tfrac{4}{3}\mathbf{k}$
Exercises
1. Evaluate the integral $\iint_S \nabla\phi\cdot d\mathbf{S}$ for $\phi = x^2z\sin y$ and S being the rectangle bounded by
(0, 0, 0), (1, 0, 1), (1, π, 1) and (0, π, 0).
2. Evaluate the integral $\iint_S (\nabla\times\mathbf{F})\times d\mathbf{S}$ where $\mathbf{F} = xe^y\mathbf{i} + ze^y\mathbf{j}$ and S represents the unit
square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
3. Using spherical polar coordinates, evaluate the integral $\iint_S \mathbf{F}\cdot d\mathbf{S}$ where $\mathbf{F} = r\cos\theta\,\hat{\mathbf{r}}$ and S
is the curved surface of the top half of the sphere r = a.
Answers 1. $-\tfrac{2}{3}$, 2. $(e-1)\mathbf{j}$, 3. $\pi a^3$
2. Volume integrals involving vectors

Integrating a scalar function of a vector over a volume is essentially the same procedure as in
Section 27.3. In 3D cartesian coordinates the volume element dV is dx dy dz. The scalar function may be
the divergence of a vector function.
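Example 28
Integrate $\nabla\cdot\mathbf{F}$ over the unit cube 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 where F is
the vector function $x^2y\,\mathbf{i} + (x - z)\mathbf{j} + 2xz^2\mathbf{k}$.
Solution
\[
\nabla\cdot\mathbf{F} = \frac{\partial}{\partial x}(x^2y) + \frac{\partial}{\partial y}(x - z) + \frac{\partial}{\partial z}(2xz^2) = 2xy + 4xz
\]
The integral is
\[
\begin{aligned}
\int_{x=0}^{1}\int_{y=0}^{1}\int_{z=0}^{1}(2xy + 4xz)\,dz\,dy\,dx &= \int_{x=0}^{1}\int_{y=0}^{1}\Big[2xyz + 2xz^2\Big]_0^1 dy\,dx \\
&= \int_{x=0}^{1}\int_{y=0}^{1}(2xy + 2x)\,dy\,dx = \int_{x=0}^{1}\Big[xy^2 + 2xy\Big]_0^1 dx \\
&= \int_{x=0}^{1} 3x\,dx = \left[\tfrac{3}{2}x^2\right]_0^1 = \tfrac{3}{2}
\end{aligned}
\]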
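The triple integral in Example 28 can be confirmed with a simple midpoint sum over small cubes. This is a minimal Python sketch; the function name `integrate_unit_cube` and the grid size are illustrative assumptions.

```python
def integrate_unit_cube(g, n=20):
    """Midpoint-rule approximation of the volume integral of g(x, y, z) over the unit cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            for k in range(n):
                z = (k + 0.5) * h
                total += g(x, y, z)
    return total * h ** 3


def div_F(x, y, z):
    """Divergence of F = x^2 y i + (x - z) j + 2 x z^2 k, as found in the Solution."""
    return 2 * x * y + 4 * x * z


result = integrate_unit_cube(div_F)
print(result)
```

Because the integrand is a product of linear factors in each variable, the midpoint sum reproduces the value 3/2 up to rounding error.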
Key Point 6
The volume integral of a scalar function (including the divergence of a vector) is a scalar.
Task
Using spherical polar coordinates and the vector field $\mathbf{F} = r^2\hat{\mathbf{r}} + r^2\sin\theta\,\hat{\boldsymbol{\theta}}$, evaluate
the integral $\iiint_V \nabla\cdot\mathbf{F}\,dV$ over the sphere given by r ≤ a.
Your solution
Answer
$\nabla\cdot\mathbf{F} = 4r + 2r\cos\theta$, $\quad\int_{r=0}^{a}\int_{\theta=0}^{\pi}\int_{\phi=0}^{2\pi}\{(4r + 2r\cos\theta)\,r^2\sin\theta\}\,d\phi\,d\theta\,dr = 4\pi a^4$
The $r^2\sin\theta$ term comes from the Jacobian for the transformation from spherical to cartesian
coordinates (see Sections 27.4 and 28.3).
Exercises
1. Evaluate $\iiint_V \nabla\cdot\mathbf{F}\,dV$ when F is the vector field $yz\mathbf{i} + xy\mathbf{j}$ and V is the unit cube 0 ≤ x ≤ 1,
0 ≤ y ≤ 1, 0 ≤ z ≤ 1.
ZZZ
2 2 z 2 y
2. For the vector field F = (x y+sin z)i+(xy +e )j+(z +x )k, find the integral ∇·F dV
V
where V is the volume inside the tetrahedron bounded by x = 0, y = 0, z = 0 and x+y+z = 1.
Answers 1. $\tfrac{1}{2}$, 2. $\tfrac{1}{20}$
Less commonly, a vector function is integrated over a volume. The procedure is similar, but care should be
taken with the various components. It may help to think in terms of a separate volume integral for
each component. The vector function may be of the form $\nabla f$ or $\nabla\times\mathbf{F}$.
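Example 29
Integrate the function $\mathbf{F} = x^2\mathbf{i} + 2\mathbf{j}$ over the prism given by 0 ≤ x ≤ 1, 0 ≤ y ≤ 2,
0 ≤ z ≤ (1 − x). (See Figure 15.)
Figure 15: The prism bounded by 0 ≤ x ≤ 1, 0 ≤ y ≤ 2, 0 ≤ z ≤ (1 − x)
Solution
The integral is
\[
\begin{aligned}
\int_{x=0}^{1}\int_{y=0}^{2}\int_{z=0}^{1-x}(x^2\mathbf{i} + 2\mathbf{j})\,dz\,dy\,dx &= \int_{x=0}^{1}\int_{y=0}^{2}\Big[x^2z\,\mathbf{i} + 2z\,\mathbf{j}\Big]_{z=0}^{1-x}dy\,dx \\
&= \int_{x=0}^{1}\int_{y=0}^{2}\left\{x^2(1-x)\mathbf{i} + 2(1-x)\mathbf{j}\right\}dy\,dx \\
&= \int_{x=0}^{1}\int_{y=0}^{2}\left\{(x^2 - x^3)\mathbf{i} + (2 - 2x)\mathbf{j}\right\}dy\,dx \\
&= \int_{x=0}^{1}\left\{(2x^2 - 2x^3)\mathbf{i} + (4 - 4x)\mathbf{j}\right\}dx \\
&= \left[\left(\tfrac{2}{3}x^3 - \tfrac{1}{2}x^4\right)\mathbf{i} + (4x - 2x^2)\mathbf{j}\right]_0^1 \\
&= \tfrac{1}{6}\mathbf{i} + 2\mathbf{j}
\end{aligned}
\]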
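Example 30
For $\mathbf{F} = x^2y\,\mathbf{i} + y^2\mathbf{j}$ evaluate $\iiint_V (\nabla\times\mathbf{F})\,dV$ where V is the volume under the
plane z = x + y + 2 (and above z = 0) for −1 ≤ x ≤ 1, −1 ≤ y ≤ 1.
Solution
\[
\nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] x^2y & y^2 & 0 \end{vmatrix} = -x^2\mathbf{k}
\]
so
\[
\begin{aligned}
\iiint_V (\nabla\times\mathbf{F})\,dV &= \int_{x=-1}^{1}\int_{y=-1}^{1}\int_{z=0}^{x+y+2}(-x^2)\mathbf{k}\,dz\,dy\,dx \\
&= \int_{x=-1}^{1}\int_{y=-1}^{1}\Big[(-x^2)z\,\mathbf{k}\Big]_{z=0}^{x+y+2}dy\,dx \\
&= \int_{x=-1}^{1}\int_{y=-1}^{1}\left(-x^3 - x^2y - 2x^2\right)dy\,dx\,\mathbf{k} \\
&= \int_{x=-1}^{1}\left[-x^3y - \tfrac{1}{2}x^2y^2 - 2x^2y\right]_{y=-1}^{1}dx\,\mathbf{k} \\
&= \int_{x=-1}^{1}\left(-2x^3 - 0 - 4x^2\right)dx\,\mathbf{k} = \left[-\tfrac{1}{2}x^4 - \tfrac{4}{3}x^3\right]_{-1}^{1}\mathbf{k} = -\tfrac{8}{3}\mathbf{k}
\end{aligned}
\]
Figure 16: The plane defined by z = x + y + 2, for z > 0, −1 ≤ x ≤ 1, −1 ≤ y ≤ 1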
Key Point 7
The volume integral of a vector function (including the gradient of a scalar or the curl of a vector)
is a vector.
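Task
Evaluate the integral $\int_V \mathbf{F}\,dV$ for the case where $\mathbf{F} = x\mathbf{i} + y^2\mathbf{j} + z\mathbf{k}$ and V is the
cube −1 ≤ x ≤ 1, −1 ≤ y ≤ 1, −1 ≤ z ≤ 1.
Your solution
Answer
$\int_{x=-1}^{1}\int_{y=-1}^{1}\int_{z=-1}^{1}(x\mathbf{i} + y^2\mathbf{j} + z\mathbf{k})\,dz\,dy\,dx = \tfrac{8}{3}\mathbf{j}$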
Exercises
1. For $f = x^2 + yz$, and V the volume bounded by y = 0, x + y = 1 and −x + y = 1 for
−1 ≤ z ≤ 1, find the integral $\iiint_V (\nabla f)\,dV$.
2. Evaluate the integral $\int_V (\nabla\times\mathbf{F})\,dV$ for the case where $\mathbf{F} = xz\mathbf{i} + (x^3 + y^3)\mathbf{j} - 4y\mathbf{k}$ and V is
the cube −1 ≤ x ≤ 1, −1 ≤ y ≤ 1, −1 ≤ z ≤ 1.
Answers
1. $\tfrac{2}{3}\mathbf{k}$,
2. $-32\mathbf{i} + 8\mathbf{k}$
Integral Vector Theorems 29.3
Introduction
Various theorems exist equating integrals involving vectors. You have already met the fundamental
theorem of line integrals. Those involving line, surface and volume integrals are introduced here.
They are the multivariable calculus equivalent of the fundamental theorem of calculus for single
variables (“integration and differentiation are the reverse of each other”).
Often, use of these theorems can make certain vector integrals easier. This Section introduces the
theorems known as Gauss’ theorem, Stokes’ theorem and Green’s theorem.
Prerequisites
Before starting this Section you should . . .
• be able to find the gradient of a scalar field and the divergence and curl of a vector field
• be familiar with the integration of vector functions
Learning Outcomes
• use vector integral theorems to facilitate vector integration
1. Stokes’ theorem
This is a theorem that equates a line integral to a surface integral. For any vector field F and a
contour C which bounds an area S,
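$$\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{r}$$
Figure 17: A surface for Stokes’ theorem (surface S with normal element dS, bounded by the contour C)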
Notes
(a) dS is a vector perpendicular to the surface and dr is the line element along the contour C.
(b) Both sides of the equation are scalars.
(c) The theorem is often a useful way of calculating a line integral along a contour composed of
several distinct parts (e.g. a square or other figure).
(d) ∇ × F is a vector field representing the curl of the vector field F and may, alternatively, be
written as curl F .
Justification of Stokes’ theorem

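Imagine that the surface is divided into a set of infinitesimally small rectangles ABCD where the
axes are adjusted so that AB and CD lie parallel to the new x-axis, i.e. AB = δx, and BC and AD
lie parallel to the new y-axis, i.e. BC = δy.
Now, $\oint_C \mathbf{F}\cdot d\mathbf{r}$ is calculated around one such rectangle.
The contributions along AB, BC, CD and DA are
$$
\begin{aligned}
\mathbf{F}(x, y, z)\cdot(\delta x\,\mathbf{i}) &= F_x(x, y, z)\,\delta x,\\
\mathbf{F}(x+\delta x, y, z)\cdot(\delta y\,\mathbf{j}) &= F_y(x+\delta x, y, z)\,\delta y,\\
\mathbf{F}(x, y+\delta y, z)\cdot(-\delta x\,\mathbf{i}) &= -F_x(x, y+\delta y, z)\,\delta x,\\
\mathbf{F}(x, y, z)\cdot(-\delta y\,\mathbf{j}) &= -F_y(x, y, z)\,\delta y.
\end{aligned}
$$
Thus,
$$
\begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{r} &\approx \big(F_x(x, y, z) - F_x(x, y+\delta y, z)\big)\,\delta x + \big(F_y(x+\delta x, y, z) - F_y(x, y, z)\big)\,\delta y\\
&\approx \frac{\partial F_y}{\partial x}\,\delta x\,\delta y - \frac{\partial F_x}{\partial y}\,\delta x\,\delta y\\
&\approx (\nabla\times\mathbf{F})_z\,\delta S \approx (\nabla\times\mathbf{F})\cdot d\mathbf{S}
\end{aligned}
$$
as dS is perpendicular to the x- and y-axes.
Thus, for each small rectangle, $\oint_C \mathbf{F}\cdot d\mathbf{r} \approx (\nabla\times\mathbf{F})\cdot d\mathbf{S}$.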
When the contributions over all the rectangles are summed, the line integrals for the inner parts of
the rectangles cancel and all that remains is the line integral around the outside of the shape. The
surface integrals sum. Hence, the theorem applies for the area S bounded by the contour C.
While the above does not comprise a formal proof of Stokes’ theorem, it gives an appreciation of
where the theorem comes from.
Figure 18: Line integral cancellation and non-cancellation (contributions along shared inner edges cancel; the contribution along the outer boundary does not cancel)
Key Point 8
Stokes’ Theorem
$$\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{r}$$
The closed contour integral of the scalar product of a vector function with the vector along the
contour is equal to the integral of the scalar product of the curl of that vector function and the unit
normal, over the corresponding surface.
Example 31
Verify Stokes’ theorem for the vector function F = y²i − (x + z)j + yzk and the
unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 for z = 0.
Solution
If F = y²i − (x + z)j + yzk then ∇ × F = (z + 1)i + (−1 − 2y)k = i + (−1 − 2y)k (as z = 0).
Note that dS = dxdyk so that (∇ × F) · dS = (−1 − 2y)dydx.
Thus
$$
\begin{aligned}
\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} &= \int_{x=0}^{1}\int_{y=0}^{1} (-1-2y)\,dy\,dx\\
&= \int_{x=0}^{1}\Big[-y - y^2\Big]_{y=0}^{1} dx = \int_{x=0}^{1} (-2)\,dx\\
&= \Big[-2x\Big]_{0}^{1} = -2 + 0 = -2
\end{aligned}
$$
To evaluate $\oint_C \mathbf{F}\cdot d\mathbf{r}$, consider it separately on the four sides.
When y = 0, F = −xj and dr = dxi so F · dr = 0 so the contribution to the integral is zero.
When x = 1, F = y²i − j and dr = dyj so F · dr = −dy so the contribution to the integral is
$$\int_{y=0}^{1} (-dy) = \Big[-y\Big]_{0}^{1} = -1.$$
When y = 1, F = i − xj and dr = −dxi so F · dr = −dx so the contribution to the integral is
$$\int_{x=0}^{1} (-dx) = \Big[-x\Big]_{0}^{1} = -1.$$
When x = 0, F = y²i and dr = −dyj so F · dr = 0 so the contribution to the integral is zero.
The integral $\oint_C \mathbf{F}\cdot d\mathbf{r}$ is the sum of the contributions i.e. 0 − 1 − 1 + 0 = −2.
Thus
$$\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{r} = -2$$
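The verification in Example 31 can also be checked numerically. The following is a minimal sketch in Python, not part of the original Workbook: the helper names `midpoint` and `side_integral` are illustrative, and the midpoint rule is simply one convenient choice of quadrature (it happens to be exact, up to rounding, for the linear integrands that occur here).

```python
def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def F(x, y, z=0.0):
    """Vector field of Example 31: F = y^2 i - (x + z) j + yz k."""
    return (y**2, -(x + z), y * z)


# Surface side: (curl F) . k = -1 - 2y, integrated over the unit square.
surface = midpoint(lambda x: midpoint(lambda y: -1 - 2 * y, 0, 1), 0, 1)


def side_integral(p, q):
    """Integral of F . dr along the straight side from p to q (in z = 0)."""
    dx, dy = q[0] - p[0], q[1] - p[1]

    def integrand(t):
        fx, fy, _ = F(p[0] + t * dx, p[1] + t * dy)
        return fx * dx + fy * dy

    return midpoint(integrand, 0, 1)


# Corners of the unit square taken anticlockwise, as in the Solution.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
line = sum(side_integral(corners[i], corners[(i + 1) % 4]) for i in range(4))

print(surface, line)  # both should be close to -2
```

Taking the corners anticlockwise matches the choice dS = dxdyk; reversing the order of the corners would change the sign of the line integral only.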
Example 32
Using cylindrical polar coordinates verify Stokes’ theorem for the function F = ρ²φ̂
and the circle ρ = a and the surface ρ ≤ a. (It is effectively plane-polar coordinates
here as this example just considers the plane z = 0.)
Solution
Firstly, find $\oint_C \mathbf{F}\cdot d\mathbf{r}$. This can be done by integrating along the contour ρ = a from φ = 0 to
φ = 2π. Here F = a²φ̂ (as ρ = a) and dr = a dφ φ̂ (remembering the scale factor) so F · dr = a³dφ
and hence
$$\oint_C \mathbf{F}\cdot d\mathbf{r} = \int_{0}^{2\pi} a^3\,d\phi = 2\pi a^3$$
As F = ρ²φ̂, ∇ × F = 3ρẑ and (∇ × F) · dS = 3ρ × ρ dρ dφ, as dS = ρ dρ dφ ẑ.
Thus
$$
\begin{aligned}
\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} &= \int_{\phi=0}^{2\pi}\int_{\rho=0}^{a} 3\rho\times\rho\,d\rho\,d\phi = \int_{\phi=0}^{2\pi}\int_{\rho=0}^{a} 3\rho^2\,d\rho\,d\phi\\
&= \int_{\phi=0}^{2\pi}\Big[\rho^3\Big]_{\rho=0}^{a} d\phi = \int_{0}^{2\pi} a^3\,d\phi = 2\pi a^3
\end{aligned}
$$
Hence
$$\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = 2\pi a^3$$
Example 33
Find the closed line integral $\oint_C \mathbf{F}\cdot d\mathbf{r}$ for the vector field F = y²i + (x² − z)j + 2xyk
and for the contour ABCDEFGHA in Figure 19.
Figure 19: Closed contour ABCDEFGHA in the x-y plane, with vertices A(0, 0), B(6, 0), C(6, 4), D(2, 4), E(5, 7), F(1, 7), G(1, 4) and H(0, 4)
Solution
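To find the line integral directly would require eight line integrals i.e. along AB, BC, CD, DE,
EF, FG, GH and HA. It is easier to carry out a surface integral to find $\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}$ which
is equal to the required line integral $\oint_C \mathbf{F}\cdot d\mathbf{r}$ by Stokes’ theorem.
As F = y²i + (x² − z)j + 2xyk,
$$
\nabla\times\mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k}\\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\
y^2 & x^2 - z & 2xy
\end{vmatrix}
= (2x+1)\,\mathbf{i} - 2y\,\mathbf{j} + (2x-2y)\,\mathbf{k}
$$
As the contour lies in the x-y plane, the unit normal is k and dS = dxdyk.
Hence (∇ × F) · dS = (2x − 2y)dxdy.
To work out $\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}$, it is necessary to divide the area inside the contour into two smaller
areas i.e. the rectangle ABCDGH and the trapezium DEFG. On ABCDGH, the integral is
$$
\begin{aligned}
\int_{y=0}^{4}\int_{x=0}^{6} (2x-2y)\,dx\,dy &= \int_{y=0}^{4}\Big[x^2 - 2xy\Big]_{x=0}^{6} dy = \int_{y=0}^{4} (36-12y)\,dy\\
&= \Big[36y - 6y^2\Big]_{0}^{4} = 36\times 4 - 6\times 16 - 0 = 48
\end{aligned}
$$
On DEFG, the integral is
$$
\begin{aligned}
\int_{y=4}^{7}\int_{x=1}^{y-2} (2x-2y)\,dx\,dy &= \int_{y=4}^{7}\Big[x^2 - 2xy\Big]_{x=1}^{y-2} dy = \int_{y=4}^{7} (-y^2+2y+3)\,dy\\
&= \Big[-\tfrac13 y^3 + y^2 + 3y\Big]_{4}^{7} = -\frac{343}{3} + 49 + 21 + \frac{64}{3} - 16 - 12 = -51
\end{aligned}
$$
So the full integral is $\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = 48 - 51 = -3$.
By Stokes’ theorem,
$$\oint_C \mathbf{F}\cdot d\mathbf{r} = \iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = -3$$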
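As a cross-check on Example 33, the eight line integrals that the Solution avoids can be approximated numerically. The sketch below is illustrative only: the function names are invented here, the vertex coordinates are read from Figure 19, and the midpoint rule with 2000 subintervals per side is an arbitrary choice.

```python
# Vertices A, B, C, D, E, F, G, H of the contour, taken in order (anticlockwise).
VERTICES = [(0, 0), (6, 0), (6, 4), (2, 4), (5, 7), (1, 7), (1, 4), (0, 4)]


def field(x, y):
    """In-plane components of F = y^2 i + (x^2 - z) j + 2xy k on z = 0."""
    return y**2, x**2


def segment_integral(p, q, n=2000):
    """Midpoint-rule estimate of the integral of F . dr from p to q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        fx, fy = field(p[0] + t * dx, p[1] + t * dy)
        total += (fx * dx + fy * dy) / n
    return total


def closed_line_integral(vertices):
    """Sum the segment integrals around the closed polygon."""
    pairs = zip(vertices, vertices[1:] + vertices[:1])
    return sum(segment_integral(p, q) for p, q in pairs)


line_integral = closed_line_integral(VERTICES)
print(line_integral)  # should be close to -3, the value found from Stokes' theorem
```

The k-component 2xy of F plays no part here because dz = 0 everywhere on the contour.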
From Stokes’ theorem, it can be seen that surface integrals of the form $\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S}$ depend
only on the contour bounding the surface and not on the internal part of the surface.
Task
Verify Stokes’ theorem for the vector field F = x²i + 2xyj + zk and the triangle
with vertices at (0, 0, 0), (3, 0, 0) and (3, 1, 0).
First find the normal vector dS:
Your solution
Answer
dxdyk
Then find the vector ∇ × F :
Your solution
Answer
2yk
Now evaluate the double integral $\int_{x=0}^{3}\int_{y=0}^{x/3} (\nabla\times\mathbf{F})\cdot d\mathbf{S}$:
Your solution
Answer
1
Finally find the integral $\oint_C \mathbf{F}\cdot d\mathbf{r}$ along the 3 sides of the triangle and so verify that the two sides of
the equation in the theorem are equal:
Your solution
Answer
9, 3, −11, whose sum is 1. Both sides are 1.
Exercises
1. Using plane-polar coordinates (or cylindrical polar coordinates with z = 0), verify Stokes’
theorem for the vector field F = ρρ̂ + ρ cos(πρ/2) φ̂ and the semi-circle ρ ≤ 1, −π/2 ≤ φ ≤ π/2.
2. Verify Stokes’ theorem for the vector field F = 2xi + (y 2 − z)j + xzk and the contour around
the rectangle with vertices at (0, −2, 0),(2, −2, 0), (2, 0, 1) and (0, 0, 1).
3. Verify Stokes’ theorem for the vector field F = −yi + xj + zk and for the contour starting
from the origin and going to (1, 0, 0), (0, 0, 0), (1, 1, 0) and (1, 1, 1) before returning to the
origin.
(a) Find the surface integral over the triangle (0, 0, 0), (1, 0, 0), (1, 1, 0).
(b) Find the surface integral over the triangle (1, 0, 0), (1, 1, 0), (1, 1, 1).
(c) Find the line integrals along the four parts of the contour.
(d) Show that the two sides of the equation of the theorem are equal.
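4. Use Stokes’ theorem to evaluate the integral
$$\oint_C \mathbf{F}\cdot d\mathbf{r} \quad\text{where}\quad \mathbf{F} = \left(\sin\!\left(\frac{1}{x} + 1\right) + 5y\right)\mathbf{i} + \left(2x - e^{y^2}\right)\mathbf{j}$$
and C is the contour starting at (0, 0) and going to (5, 0), (5, 2), (6, 2), (6, 5), (3, 5), (3, 2),
(0, 2) and returning to (0, 0).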
Answers
1. Both sides are 0,
2. Both sides are −2
3. (a) 1 (b) 0 (as F is perpendicular to dS) (c) 0, 1, 1, −1 (d) Both sides are 1
4. −57, [∇ × F = −3k].
2. Gauss’ theorem
This is sometimes also known as the divergence theorem and is similar to Stokes’ theorem but
equates a surface integral to a volume integral. Gauss’ theorem states that for a volume V , bounded
by a closed surface S, any ‘well-behaved’ vector field F satisfies
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV$$
Notes
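1. dS is the vector element of area: its magnitude is the element of surface area and its direction is that of the unit normal pointing outwards.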
2. Both sides of the equation are scalars.
3. The theorem is often a useful way of calculating a surface integral over a surface composed of
several distinct parts (e.g. a cube).
4. ∇ · F is a scalar field representing the divergence of the vector field F and may, alternatively,
be written as div F .
5. Gauss’ theorem can be justified in a manner similar to that used for Stokes’ theorem (i.e. by
proving it for a small volume element, then summing up the volume elements and allowing the
internal surface contributions to cancel.)
Key Point 9
Gauss’ Theorem
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV$$
The closed surface integral of the scalar product of a vector function with the unit normal (or flux of
a vector function through a surface) is equal to the integral of the divergence of that vector function
over the corresponding volume.
Example 34
Verify Gauss’ theorem for the unit cube 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 and
the function F = xi + zj
Solution
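To find $\iint_S \mathbf{F}\cdot d\mathbf{S}$, the integral must be evaluated for all six faces of the cube and the results
summed.
On the left face, x = 0, F = zj and dS = −i dydz so F · dS = 0 and
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_{0}^{1}\int_{0}^{1} 0\,dy\,dz = 0$$
On the right face, x = 1, F = i + zj and dS = i dydz so F · dS = 1 dydz and
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_{0}^{1}\int_{0}^{1} 1\,dy\,dz = 1$$
On the front face, y = 0, F = xi + zj and dS = −j dxdz so F · dS = −z dxdz and
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = -\int_{0}^{1}\int_{0}^{1} z\,dx\,dz = -\frac12$$
On the back face, y = 1, F = xi + zj and dS = j dxdz so F · dS = z dxdz and
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_{0}^{1}\int_{0}^{1} z\,dx\,dz = \frac12$$
On the bottom face, z = 0, F = xi and dS = −k dxdy so F · dS = 0 dxdy and
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_{0}^{1}\int_{0}^{1} 0\,dx\,dy = 0$$
On the top face, z = 1, F = xi + j and dS = k dxdy so F · dS = 0 dxdy and
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \int_{0}^{1}\int_{0}^{1} 0\,dx\,dy = 0$$
Thus, summing over all six faces, $\iint_S \mathbf{F}\cdot d\mathbf{S} = 0 + 1 - \frac12 + \frac12 + 0 + 0 = 1$.
To find $\iiint_V \nabla\cdot\mathbf{F}\,dV$ note that $\nabla\cdot\mathbf{F} = \dfrac{\partial}{\partial x}x + \dfrac{\partial}{\partial y}z = 1 + 0 = 1$.
So $\iiint_V \nabla\cdot\mathbf{F}\,dV = \int_{0}^{1}\int_{0}^{1}\int_{0}^{1} 1\,dx\,dy\,dz = 1$.
So $\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV = 1$.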
Note that the volume integral required just one triple integral while the surface integral required six
double integrals. Reducing the number of integrals is often the motivation for using Gauss’ theorem.
Example 35
Use Gauss’ theorem to evaluate the surface integral $\iint_S \mathbf{F}\cdot d\mathbf{S}$ where F is the
vector field x²yi + 2xyj + z³k and S is the surface of the unit cube 0 ≤ x ≤ 1,
0 ≤ y ≤ 1, 0 ≤ z ≤ 1.
Solution
Note that to carry out the surface integral directly will involve the evaluation of six double integrals
(one for each face of the cube). However, by Gauss’ theorem, the same result comes from the
volume integral $\iiint_V \nabla\cdot\mathbf{F}\,dV$. As ∇ · F = 2xy + 2x + 3z², the surface integral becomes the
triple integral
$$
\begin{aligned}
\int_{0}^{1}\int_{0}^{1}\int_{0}^{1} (2xy + 2x + 3z^2)\,dx\,dy\,dz
&= \int_{0}^{1}\int_{0}^{1}\Big[x^2y + x^2 + 3xz^2\Big]_{x=0}^{1} dy\,dz = \int_{0}^{1}\int_{0}^{1} (y + 1 + 3z^2)\,dy\,dz\\
&= \int_{0}^{1}\Big[\tfrac12 y^2 + y + 3yz^2\Big]_{y=0}^{1} dz = \int_{0}^{1}\left(\tfrac12 + 1 + 3z^2\right)dz = \int_{0}^{1}\left(\tfrac32 + 3z^2\right)dz\\
&= \Big[\tfrac32 z + z^3\Big]_{0}^{1} = \frac52
\end{aligned}
$$
The six double integrals also sum to 5/2 but this approach requires a greater amount of work.
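Both routes in Example 35 can be compared numerically. The following Python sketch is an illustration rather than part of the Workbook: `volume_integral` and `outward_flux` are names invented here, and the midpoint rule on a uniform grid is an assumed (and fairly crude) quadrature.

```python
def F(x, y, z):
    """Vector field of Example 35: F = x^2 y i + 2xy j + z^3 k."""
    return (x**2 * y, 2 * x * y, z**3)


def div_F(x, y, z):
    """Divergence of F, as found in the Solution."""
    return 2 * x * y + 2 * x + 3 * z**2


def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def volume_integral(g, n=40):
    """Midpoint-rule estimate of the integral of g over the unit cube."""
    pts = midpoints(n)
    return sum(g(x, y, z) for x in pts for y in pts for z in pts) / n**3


def outward_flux(field, n=200):
    """Midpoint-rule estimate of the outward flux through the six faces."""
    pts = midpoints(n)
    total = 0.0
    for u in pts:
        for v in pts:
            total += field(1, u, v)[0] - field(0, u, v)[0]  # faces x = 1 and x = 0
            total += field(u, 1, v)[1] - field(u, 0, v)[1]  # faces y = 1 and y = 0
            total += field(u, v, 1)[2] - field(u, v, 0)[2]  # faces z = 1 and z = 0
    return total / n**2


print(volume_integral(div_F), outward_flux(F))  # both should be close to 5/2
```

On each pair of opposite faces the outward normals are opposite, which is why the contributions appear as differences.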
Gauss’ law
Introduction
From Gauss’ theorem, it is possible to derive a result which can be used to gain insight into situations
involved in Electrical Engineering. Knowing the electric field on a closed surface, it is possible to
find the electric charge within this surface. Alternatively, in a sufficiently symmetrical situation, it is
possible to find the electric field produced by a given charge distribution.
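Gauss’ theorem states
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV$$
If F = E, the electric field,
$$\nabla\cdot\mathbf{F} = \nabla\cdot\mathbf{E} = \frac{q}{\varepsilon_0}$$
where q is the amount of charge per unit volume and ε0 is the permittivity of free space: ε0 =
10⁻⁹/36π F m⁻¹ ≈ 8.84 × 10⁻¹² F m⁻¹. Gauss’ theorem becomes
$$\iint_S \mathbf{E}\cdot d\mathbf{S} = \iiint_V \frac{q}{\varepsilon_0}\,dV = \frac{1}{\varepsilon_0}\iiint_V q\,dV = \frac{Q}{\varepsilon_0}$$
i.e.
$$\iint_S \mathbf{E}\cdot d\mathbf{S} = \frac{Q}{\varepsilon_0}$$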
which is known as Gauss’ law. Note: this is one of the important Maxwell’s Laws. Using this law,
the charge within a given surface can be found from ε0 times the surface integral of E · dS over that
surface. In certain symmetrical circumstances, it can also be used to find the electric field produced
by a given charge distribution.
Problem in words
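A point charge lies at the centre of a cube. Given the electric field, find the magnitude of the charge,
using Gauss’ law.
Consider the cube −1/2 ≤ x ≤ 1/2, −1/2 ≤ y ≤ 1/2, −1/2 ≤ z ≤ 1/2 where the dimensions are in metres. A
point charge Q lies at the centre of the cube. If the electric field on the top face (z = 1/2) is given by
$$\mathbf{E} = 10\,\frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{(x^2+y^2+z^2)^{3/2}}$$
find the charge Q from Gauss’ law.
$$\left[\ \text{Hint:}\quad \int_{x=-\frac12}^{\frac12}\int_{y=-\frac12}^{\frac12}\left(x^2+y^2+\tfrac14\right)^{-\frac32} dy\,dx = \frac{4\pi}{3}\ \right]$$
From Gauss’ law
$$\iint_S \mathbf{E}\cdot d\mathbf{S} = \frac{Q}{\varepsilon_0}$$
so
$$Q = \varepsilon_0\iint_S \mathbf{E}\cdot d\mathbf{S} = 6\varepsilon_0\iint_{S(\text{top})} \mathbf{E}\cdot d\mathbf{S}$$
since, using the symmetry of the six faces of the cube, it is possible to integrate over just one of
them (here the top face is chosen) and multiply by 6. On the top face
$$\mathbf{E} = 10\,\frac{x\mathbf{i} + y\mathbf{j} + \tfrac12\mathbf{k}}{\left(x^2+y^2+\tfrac14\right)^{3/2}}$$
and
dS = (element of surface area) × (unit normal to it) = dx dy k
So
$$\mathbf{E}\cdot d\mathbf{S} = 10\,\frac{\tfrac12}{\left(x^2+y^2+\tfrac14\right)^{3/2}}\,dy\,dx = 5\left(x^2+y^2+\tfrac14\right)^{-\frac32} dy\,dx$$
Now
$$
\begin{aligned}
\iint_{S(\text{top})} \mathbf{E}\cdot d\mathbf{S} &= \int_{x=-\frac12}^{\frac12}\int_{y=-\frac12}^{\frac12} 5\left(x^2+y^2+\tfrac14\right)^{-\frac32} dy\,dx\\
&= 5\times\frac{4\pi}{3} \quad\text{(using the hint)}\\
&= \frac{20\pi}{3}
\end{aligned}
$$
So, from Gauss’ law above,
$$Q = 6\varepsilon_0\times\frac{20\pi}{3} = 40\pi\varepsilon_0 \approx 10^{-9}\ \text{C}$$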
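The hint integral and the resulting charge can be checked numerically. This is a minimal sketch, not taken from the Workbook: the function name, the grid size and the value used for ε0 (8.854 × 10⁻¹² F m⁻¹, slightly more precise than the rounded figure quoted above) are assumptions made for illustration.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in F/m (assumed value)


def top_face_integral(n=400):
    """Midpoint-rule estimate of the hint integral over -1/2 <= x, y <= 1/2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = -0.5 + (i + 0.5) * h
        for j in range(n):
            y = -0.5 + (j + 0.5) * h
            total += (x * x + y * y + 0.25) ** -1.5
    return total * h * h


hint = top_face_integral()   # compare with 4*pi/3
flux_top = 5 * hint          # E . dS integrated over the top face
Q = 6 * EPS0 * flux_top      # six identical faces, by symmetry

print(hint, 4 * math.pi / 3)
print(Q)  # of the order of 1e-9 coulombs
```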
Interpretation
Gauss’ law can be used to find a charge from its effects elsewhere.
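The form of $\mathbf{E} = 10\,\dfrac{x\mathbf{i} + y\mathbf{j} + \tfrac12\mathbf{k}}{\left(x^2+y^2+\tfrac14\right)^{3/2}}$ comes from the fact that E is radial and equals $10\,\dfrac{\mathbf{r}}{r^3} = 10\,\dfrac{\hat{\mathbf{r}}}{r^2}$.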
Example 36
Verify Gauss’ theorem for the vector field F = y²j − xzk and the triangular prism
with vertices at (0, 0, 0), (2, 0, 0), (0, 0, 1), (0, 4, 0), (2, 4, 0) and (0, 4, 1) (see
Figure 20).
Figure 20: The triangular prism defined by six vertices
Solution
As F = y²j − xzk, ∇ · F = 0 + 2y − x = 2y − x.
Thus
$$
\begin{aligned}
\iiint_V \nabla\cdot\mathbf{F}\,dV &= \int_{x=0}^{2}\int_{y=0}^{4}\int_{z=0}^{1-x/2} (2y - x)\,dz\,dy\,dx\\
&= \int_{x=0}^{2}\int_{y=0}^{4}\Big[2yz - xz\Big]_{z=0}^{1-x/2} dy\,dx = \int_{x=0}^{2}\int_{y=0}^{4}\left(2y - xy - x + \tfrac12 x^2\right)dy\,dx\\
&= \int_{x=0}^{2}\Big[y^2 - \tfrac12 xy^2 - xy + \tfrac12 x^2y\Big]_{y=0}^{4} dx = \int_{x=0}^{2} (16 - 12x + 2x^2)\,dx\\
&= \Big[16x - 6x^2 + \tfrac23 x^3\Big]_{0}^{2} = \frac{40}{3}
\end{aligned}
$$
To work out $\iint_S \mathbf{F}\cdot d\mathbf{S}$, it is necessary to consider the contributions from the five faces separately.
On the front face, y = 0, F = −xzk and dS = −j dzdx thus F · dS = 0 and the contribution to the
integral is zero.
On the back face, y = 4, F = 16j − xzk and dS = j dzdx thus F · dS = 16 dzdx and the contribution to the
integral is
$$\int_{x=0}^{2}\int_{z=0}^{1-x/2} 16\,dz\,dx = \int_{x=0}^{2}\Big[16z\Big]_{z=0}^{1-x/2} dx = \int_{x=0}^{2} 16(1 - x/2)\,dx = \Big[16x - 4x^2\Big]_{0}^{2} = 16.$$
On the left face, x = 0, F = y²j and dS = −i dydz thus F · dS = 0 and the contribution to the integral
is zero.
On the bottom face, z = 0, F = y²j and dS = −k dxdy thus F · dS = 0 and the contribution to the
integral is zero.
On the top right face, z = 1 − x/2, F = y²j + (½x² − x)k and the unit normal n̂ = (1/√5)i + (2/√5)k.
Thus dS = [(1/√5)i + (2/√5)k] dydw where dw measures the distance along the slope for a constant y.
As dw = (√5/2)dx, dS = (½i + k) dydx thus F · dS = (½x² − x) dydx and the contribution to the integral is
$$\int_{x=0}^{2}\int_{y=0}^{4}\left(\tfrac12 x^2 - x\right)dy\,dx = \int_{x=0}^{2} (2x^2 - 4x)\,dx = \Big[\tfrac23 x^3 - 2x^2\Big]_{0}^{2} = -\frac83.$$
Adding together the contributions, $\iint_S \mathbf{F}\cdot d\mathbf{S} = 0 + 16 + 0 + 0 - \frac83 = \frac{40}{3}$.
Thus $\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV = \frac{40}{3}$.
Gauss’ theorem also applies using orthogonal curvilinear coordinates.
Field strength around a charged line

Problem in words
Find the electric field strength at a given distance from a uniformly charged line.

Determine the electric field at a distance r from a uniformly charged line (charge per unit length ρL ).
You may assume that the field points directly away from the line, using symmetry arguments.
Figure 21: Field strength around a charged line (cylinder of length l surrounding the line)

Imagine a cylinder of radius r and length l whose axis is the charged line. From Gauss’ law
$$\iint_S \mathbf{E}\cdot d\mathbf{S} = \frac{Q}{\varepsilon_0}$$
As the charge per unit length is ρL, then the right-hand side equals ρL l/ε0. On the left-hand side,
the integral can be expressed as the sum
$$\iint_S \mathbf{E}\cdot d\mathbf{S} = \iint_{S(\text{ends})} \mathbf{E}\cdot d\mathbf{S} + \iint_{S(\text{curved})} \mathbf{E}\cdot d\mathbf{S}$$
Looking first at the circular ends of the cylinder, the fact that the field lines point (radially) away
from the charged line implies that the electric field is in the plane of these circles and has no normal
component. Therefore E · dS will be zero.
Next, over the curved surface of the cylinder, the electric field is normal to it, and the symmetry
of the problem implies that the strength of the electric field will be constant (here denoted E).
Therefore the integral = Total curved surface area × Field strength = 2πrlE.
So, going back to Gauss’ law
$$\iint_{S(\text{ends})} \mathbf{E}\cdot d\mathbf{S} + \iint_{S(\text{curved})} \mathbf{E}\cdot d\mathbf{S} = \frac{Q}{\varepsilon_0}$$
or
$$0 + 2\pi r l E = \frac{\rho_L l}{\varepsilon_0}$$
Interpretation
Hence, the field strength E is given by
$$E = \frac{\rho_L}{2\pi\varepsilon_0 r}$$
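The result for the charged line is easily evaluated on a computer. The short Python sketch below is illustrative only: the function name and the sample values (1 nC per metre, distances of 1 m and 2 m) are not from the Workbook, and ε0 is taken as 8.854 × 10⁻¹² F m⁻¹.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in F/m (assumed value)


def line_charge_field(rho_L, r):
    """Field strength E = rho_L / (2 pi eps0 r) at distance r > 0 from the line."""
    if r <= 0:
        raise ValueError("the distance r from the line must be positive")
    return rho_L / (2 * math.pi * EPS0 * r)


# Hypothetical example: 1 nC per metre, observed at 1 m and at 2 m.
E_near = line_charge_field(1e-9, 1.0)
E_far = line_charge_field(1e-9, 2.0)
print(E_near, E_far)  # doubling the distance halves the field strength
```

Note that the length l of the imagined cylinder cancels, so it does not appear as a parameter.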
Field strength on a cylinder
Problem in words
Given the electric field E on the surface of a cylinder, use Gauss’ law to find the charge per unit
length.
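On the surface of a long cylinder of radius a, the electric field is given by
$$\mathbf{E} = \frac{\rho_L}{2\pi\varepsilon_0}\,\frac{(a + b\cos\theta)\,\hat{\mathbf{r}} - b\sin\theta\,\hat{\boldsymbol{\theta}}}{a^2 + 2ab\cos\theta + b^2}$$
(using cylindrical polar co-ordinates) due to a line of charge a distance b (< a) from the centre of
the cylinder. Using Gauss’ law, find the charge per unit length.
Find the integral $\iint_S \mathbf{E}\cdot d\mathbf{S}$ and, by equating this to Q/ε0, find the enclosed charge Q (and hence the charge per unit length), using the
result
$$\int_{0}^{2\pi}\frac{a + b\cos\theta}{a^2 + 2ab\cos\theta + b^2}\,d\theta = \frac{2\pi}{a}$$
Consider a cylindrical section of length l. As in the previous example, there are no contributions from the ends
of the cylinder since the electric field has no normal component here. However, on the curved surface
$$d\mathbf{S} = a\,d\theta\,dz\,\hat{\mathbf{r}}$$
So
$$\mathbf{E}\cdot d\mathbf{S} = \frac{\rho_L}{2\pi\varepsilon_0}\,\frac{a + b\cos\theta}{a^2 + 2ab\cos\theta + b^2}\,a\,d\theta\,dz$$
Integrating over the curved surface of the cylinder
$$
\begin{aligned}
\iint_S \mathbf{E}\cdot d\mathbf{S} &= \int_{z=0}^{l}\int_{\theta=0}^{2\pi}\frac{a\rho_L}{2\pi\varepsilon_0}\,\frac{a + b\cos\theta}{a^2 + 2ab\cos\theta + b^2}\,d\theta\,dz\\
&= \frac{a\rho_L l}{2\pi\varepsilon_0}\int_{0}^{2\pi}\frac{a + b\cos\theta}{a^2 + 2ab\cos\theta + b^2}\,d\theta\\
&= \frac{\rho_L l}{\varepsilon_0} \quad\text{using the given result.}
\end{aligned}
$$
From Gauss’ law
$$\frac{\rho_L l}{\varepsilon_0} = \frac{Q}{\varepsilon_0} \quad\text{so}\quad Q = \rho_L l$$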
Interpretation
Therefore the charge per unit length on the line of charge is given by ρL (i.e. the charge per unit
length is constant).
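The quoted result for the angular integral can be tested numerically for particular values of a and b with b < a. The following is a sketch under stated assumptions: the function name and the sample values of a and b are illustrative, and the midpoint rule is used because it converges very quickly for smooth periodic integrands.

```python
import math


def angular_integral(a, b, n=2000):
    """Midpoint-rule estimate of the integral over 0..2*pi of
    (a + b cos t) / (a^2 + 2ab cos t + b^2)."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        c = math.cos((i + 0.5) * h)
        total += (a + b * c) / (a * a + 2 * a * b * c + b * b)
    return total * h


# Hypothetical sample values with b < a.
print(angular_integral(2.0, 1.0), 2 * math.pi / 2.0)
print(angular_integral(3.0, 0.5), 2 * math.pi / 3.0)
```

The agreement for different b reflects the physical conclusion: the flux through the cylinder does not depend on where inside it the line of charge lies.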
Task
Verify Gauss’ theorem for the vector field F = xi − yj + zk and the unit cube
0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1.
(a) Find the scalar ∇ · F.
(b) Evaluate the integral $\int_{z=0}^{1}\int_{y=0}^{1}\int_{x=0}^{1} \nabla\cdot\mathbf{F}\,dx\,dy\,dz$.
(c) For each side, evaluate the normal vector dS and the surface integral $\iint_S \mathbf{F}\cdot d\mathbf{S}$.
(d) Show that the two sides of the statement of the theorem are equal.
Your solution
Answer
(a) 1
(b) 1
(c) −dxdyk, 0; dxdyk, 1; −dxdzj, 0; dxdzj, −1; −dydzi, 0; dydzi, 1
(d) Both sides are 1.
Exercises
1. Verify Gauss’ theorem for the vector field F = 4xzi − y²j + yzk and the cuboid 0 ≤ x ≤ 2,
0 ≤ y ≤ 3, 0 ≤ z ≤ 4.
2. Verify Gauss’ theorem, using cylindrical polar coordinates, for the vector field F = ρ⁻²ρ̂ over
the cylinder 0 ≤ ρ ≤ r0, −1 ≤ z ≤ 1 for
(a) r0 = 1
(b) r0 = 2
3. For S being the surface of the tetrahedron with vertices at (0, 0, 0), (1, 0, 0), (0, 1, 0) and
(0, 0, 1), find the surface integral
$$\iint_S (x\mathbf{i} + yz\mathbf{j})\cdot d\mathbf{S}$$
(a) directly
(b) by using Gauss’ theorem
Hint: When evaluating directly, show that the unit normal on the sloping face is (1/√3)(i + j + k)
and that dS = (i + j + k)dxdy
Answers
1. Both sides are 156,
2. Both sides equal (a) 4π, (b) 2π,
3. Both sides equal 5/24.
3. Green’s Identities (3D)

Like Gauss’ theorem, Green’s identities equate surface integrals to volume integrals. However, Green’s
identities are concerned with two scalar fields u(x, y, z) and w(x, y, z). Two statements of Green’s
identities are as follows
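$$\iint_S (u\nabla w)\cdot d\mathbf{S} = \iiint_V \left[\nabla u\cdot\nabla w + u\nabla^2 w\right] dV \qquad [1]$$
and
$$\iint_S \{u\nabla w - w\nabla u\}\cdot d\mathbf{S} = \iiint_V \left[u\nabla^2 w - w\nabla^2 u\right] dV \qquad [2]$$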
Proof of Green’s identities

Green’s identities can be derived from Gauss’ theorem and a vector derivative identity.
Vector identity (1) from subsection 6 of 28.2 states that ∇ · (φA) = (∇φ) · A + φ(∇ · A).
Letting φ = u and A = ∇w,

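$$\nabla\cdot(u\nabla w) = (\nabla u)\cdot(\nabla w) + u\big(\nabla\cdot(\nabla w)\big) = (\nabla u)\cdot(\nabla w) + u\nabla^2 w$$
Gauss’ theorem states
$$\iint_S \mathbf{F}\cdot d\mathbf{S} = \iiint_V \nabla\cdot\mathbf{F}\,dV$$
Now, letting F = u∇w,
$$
\begin{aligned}
\iint_S (u\nabla w)\cdot d\mathbf{S} &= \iiint_V \nabla\cdot(u\nabla w)\,dV\\
&= \iiint_V \left[(\nabla u)\cdot(\nabla w) + u\nabla^2 w\right] dV
\end{aligned}
$$
This is Green’s identity [1]. Reversing the roles of u and w,
$$\iint_S (w\nabla u)\cdot d\mathbf{S} = \iiint_V \left[(\nabla w)\cdot(\nabla u) + w\nabla^2 u\right] dV$$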
Subtracting the last two equations yields Green’s identity [2].
Key Point 10
Green’s Identities
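$$[1]\qquad \iint_S (u\nabla w)\cdot d\mathbf{S} = \iiint_V \left[\nabla u\cdot\nabla w + u\nabla^2 w\right] dV$$
$$[2]\qquad \iint_S \{u\nabla w - w\nabla u\}\cdot d\mathbf{S} = \iiint_V \left[u\nabla^2 w - w\nabla^2 u\right] dV$$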
Example 37
Verify Green’s first identity for u = (x − x²)y, w = xy + z² and the unit cube,
0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1.
Solution
As w = xy + z², ∇w = yi + xj + 2zk. Thus u∇w = (xy − x²y)(yi + xj + 2zk) and the surface
integral is of this quantity (scalar product with dS) integrated over the surface of the unit cube.
On the three faces x = 0, x = 1, y = 0, the vector u∇w = 0 and so the contribution to the surface
integral is zero.
On the face y = 1, u∇w = (x − x²)(i + xj + 2zk) and dS = j so (u∇w) · dS = x² − x³ and the
contribution to the integral is
$$\int_{x=0}^{1}\int_{z=0}^{1} (x^2 - x^3)\,dz\,dx = \int_{0}^{1} (x^2 - x^3)\,dx = \left[\frac{x^3}{3} - \frac{x^4}{4}\right]_{0}^{1} = \frac{1}{12}.$$
On the face z = 0, u∇w = (x − x²)y(yi + xj) and dS = −k so (u∇w) · dS = 0 and the contribution
to the integral is zero.
On the face z = 1, u∇w = (x − x²)y(yi + xj + 2k) and dS = k so (u∇w) · dS = 2y(x − x²) and
the contribution to the integral is
$$\int_{x=0}^{1}\int_{y=0}^{1} 2y(x - x^2)\,dy\,dx = \int_{x=0}^{1}\Big[y^2(x - x^2)\Big]_{y=0}^{1} dx = \int_{0}^{1} (x - x^2)\,dx = \frac16.$$
Thus, $\iint_S (u\nabla w)\cdot d\mathbf{S} = 0 + 0 + 0 + \frac{1}{12} + 0 + \frac16 = \frac14$.
Now evaluate $\iiint_V \left[\nabla u\cdot\nabla w + u\nabla^2 w\right] dV$.
Note that ∇u = (1 − 2x)yi + (x − x²)j and ∇²w = 2 so
∇u · ∇w + u∇²w = (1 − 2x)y² + (x − x²)x + 2(x − x²)y = x² − x³ + 2xy − 2x²y + y² − 2xy²
and the integral
$$
\begin{aligned}
\iiint_V \left[\nabla u\cdot\nabla w + u\nabla^2 w\right] dV &= \int_{z=0}^{1}\int_{y=0}^{1}\int_{x=0}^{1} (x^2 - x^3 + 2xy - 2x^2y + y^2 - 2xy^2)\,dx\,dy\,dz\\
&= \int_{z=0}^{1}\int_{y=0}^{1}\left[\frac{x^3}{3} - \frac{x^4}{4} + x^2y - \frac23 x^3y + xy^2 - x^2y^2\right]_{x=0}^{1} dy\,dz\\
&= \int_{z=0}^{1}\int_{y=0}^{1}\left(\frac{1}{12} + \frac{y}{3}\right)dy\,dz = \int_{z=0}^{1}\left[\frac{y}{12} + \frac{y^2}{6}\right]_{y=0}^{1} dz\\
&= \int_{z=0}^{1}\frac14\,dz = \left[\frac{z}{4}\right]_{z=0}^{1} = \frac14
\end{aligned}
$$
Hence $\iint_S (u\nabla w)\cdot d\mathbf{S} = \iiint_V \left[\nabla u\cdot\nabla w + u\nabla^2 w\right] dV = \frac14$
Green’s theorem in the plane

This states that
$$\oint_C (P\,dx + Q\,dy) = \iint_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy$$
S is a 2-D surface with perimeter C; P (x, y) and Q(x, y) are scalar functions.
This should not be confused with Green’s identities.
Justification of Green’s theorem in the plane

Green’s theorem in the plane can be derived from Stokes’ theorem.
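$$\iint_S (\nabla\times\mathbf{F})\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{r}$$
Now let F be the vector field P(x, y)i + Q(x, y)j i.e. there is no dependence on z and there are no
components in the z-direction. Now
$$
\nabla\times\mathbf{F} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k}\\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z}\\
P(x,y) & Q(x,y) & 0
\end{vmatrix}
= \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)\mathbf{k}
$$
and dS = dxdyk giving $(\nabla\times\mathbf{F})\cdot d\mathbf{S} = \left(\dfrac{\partial Q}{\partial x} - \dfrac{\partial P}{\partial y}\right) dx\,dy$.
Thus Stokes’ theorem becomes
$$\iint_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy = \oint_C \mathbf{F}\cdot d\mathbf{r}$$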
and Green’s theorem in the plane follows.
Key Point 11
Green’s Theorem in the Plane
$$\oint_C (P\,dx + Q\,dy) = \iint_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy$$
Example 38
Evaluate the line integral $\oint_C \left\{(4x^2 + y - 3)\,dx + (3x^2 + 4y^2 - 2)\,dy\right\}$ around the
rectangle 0 ≤ x ≤ 3, 0 ≤ y ≤ 1.
Solution
The integral could be accomplished by four line integrals but it is easier to note that
[(4x² + y − 3)dx + (3x² + 4y² − 2)dy] is of the form P dx + Q dy with P = 4x² + y − 3 and
Q = 3x² + 4y² − 2. It is thus of a suitable form for Green’s theorem in the plane.
Note that ∂Q/∂x = 6x and ∂P/∂y = 1.
Green’s theorem in the plane becomes
$$
\begin{aligned}
\oint_C \left\{(4x^2 + y - 3)\,dx + (3x^2 + 4y^2 - 2)\,dy\right\} &= \int_{y=0}^{1}\int_{x=0}^{3} (6x - 1)\,dx\,dy\\
&= \int_{y=0}^{1}\Big[3x^2 - x\Big]_{x=0}^{3} dy = \int_{y=0}^{1} 24\,dy = 24
\end{aligned}
$$
The same result could be gained by evaluating four line integrals.
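The four line integrals mentioned in Example 38 can be approximated numerically to confirm the value 24. The sketch below is illustrative and not from the Workbook: the helper `segment_integral` and the choice of the midpoint rule with 2000 subintervals per side are assumptions.

```python
def P(x, y):
    return 4 * x**2 + y - 3


def Q(x, y):
    return 3 * x**2 + 4 * y**2 - 2


def segment_integral(p, q, n=2000):
    """Midpoint-rule estimate of the integral of P dx + Q dy from p to q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        x, y = p[0] + t * dx, p[1] + t * dy
        total += (P(x, y) * dx + Q(x, y) * dy) / n
    return total


# Corners of the rectangle 0 <= x <= 3, 0 <= y <= 1, taken anticlockwise.
corners = [(0, 0), (3, 0), (3, 1), (0, 1)]
sides = [segment_integral(corners[i], corners[(i + 1) % 4]) for i in range(4)]
print(sides, sum(sides))  # the sum should be close to 24
```

Worked by hand, the four sides contribute 27, 25 + 4/3, −30 and 2/3 respectively, which is noticeably more effort than the single double integral.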
Example 39
Verify Green’s theorem in the plane for the integral $\oint_C \left\{4z\,dy + (y^2 - 2)\,dz\right\}$ and
the triangular contour starting at the origin O = (0, 0, 0) and going to A = (0, 2, 0)
and B = (0, 0, 1) before returning to the origin.
Solution
The whole of the contour is in the plane x = 0 and Green’s theorem in the plane becomes
$$\oint_C (P\,dy + Q\,dz) = \iint_S \left(\frac{\partial Q}{\partial y} - \frac{\partial P}{\partial z}\right) dy\,dz$$
(a) Firstly evaluate $\oint_C \left\{4z\,dy + (y^2 - 2)\,dz\right\}$.
On OA, z = 0 and dz = 0. As the integrand is zero, the integral will also be zero.
On AB, z = 1 − y/2 and dz = −½dy. The integral is
$$\int_{y=2}^{0}\left[(4 - 2y)\,dy - \tfrac12 (y^2 - 2)\,dy\right] = \int_{2}^{0}\left(5 - 2y - \tfrac12 y^2\right)dy = \Big[5y - y^2 - \tfrac16 y^3\Big]_{2}^{0} = -\frac{14}{3}$$
On BO, y = 0 and dy = 0. The integral is $\int_{1}^{0} (-2)\,dz = \Big[-2z\Big]_{1}^{0} = 2$.
Summing, $\oint_C \left\{4z\,dy + (y^2 - 2)\,dz\right\} = -\frac83$
(b) Secondly evaluate $\iint_S \left(\dfrac{\partial Q}{\partial y} - \dfrac{\partial P}{\partial z}\right) dy\,dz$
In this example, P = 4z and Q = y² − 2. Thus ∂P/∂z = 4 and ∂Q/∂y = 2y. Hence,
$$
\begin{aligned}
\iint_S \left(\frac{\partial Q}{\partial y} - \frac{\partial P}{\partial z}\right) dy\,dz &= \int_{y=0}^{2}\int_{z=0}^{1-y/2} (2y - 4)\,dz\,dy\\
&= \int_{y=0}^{2}\Big[2yz - 4z\Big]_{z=0}^{1-y/2} dy = \int_{y=0}^{2}\left(-y^2 + 4y - 4\right)dy\\
&= \Big[-\tfrac13 y^3 + 2y^2 - 4y\Big]_{0}^{2} = -\frac83
\end{aligned}
$$
Conclusion:
$$\oint_C (P\,dy + Q\,dz) = \iint_S \left(\frac{\partial Q}{\partial y} - \frac{\partial P}{\partial z}\right) dy\,dz = -\frac83$$
One very useful, special case of Green’s theorem in the plane is when Q = x and P = −y. The
theorem becomes
$$\oint_C \{-y\,dx + x\,dy\} = \iint_S \big(1 - (-1)\big)\,dx\,dy$$
The right-hand side becomes $\iint_S 2\,dx\,dy$ i.e. 2A where A is the area inside the contour C. Hence
$$A = \frac12\oint_C \{x\,dy - y\,dx\}$$
This result is known as the area theorem.
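When the contour C is a polygon, the area theorem can be applied edge by edge: along the straight edge from (x0, y0) to (x1, y1) the integral of x dy − y dx equals x0 y1 − x1 y0, so the area is half the sum of these terms over all edges. The Python sketch below illustrates this; the function name `polygon_area` is invented here, and the test shapes are the rectangle of Example 38 and a simple triangle.

```python
def polygon_area(vertices):
    """Signed area from A = (1/2) * closed integral of (x dy - y dx).

    The result is positive when the vertices are listed anticlockwise
    and negative when they are listed clockwise.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0  # exact value of the edge integral
    return total / 2


rectangle = [(0, 0), (3, 0), (3, 1), (0, 1)]  # rectangle 0 <= x <= 3, 0 <= y <= 1
triangle = [(0, 0), (3, 0), (3, 1)]
print(polygon_area(rectangle), polygon_area(triangle))
```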
Example 40
Verify the area theorem $A = \frac12\oint_C \{x\,dy - y\,dx\}$ for the segment of the circle
x² + y² = 4 lying above the line y = 1.
Solution
Firstly, the area of the segment ADBC can be found by subtracting the area of the triangle OADB
from the area of the sector OACB. The triangle has area ½ × 2√3 × 1 = √3. The sector has area
(π/3) × 2² = (4/3)π. Thus segment ADBC has area (4/3)π − √3.
Now, evaluate the integral $\oint_C \{x\,dy - y\,dx\}$ around the segment. Along the line, y = 1, dy = 0 so
the integral $\int_C \{x\,dy - y\,dx\}$ becomes
$$\int_{-\sqrt3}^{\sqrt3} (x\times 0 - 1\times dx) = \int_{-\sqrt3}^{\sqrt3} (-dx) = -2\sqrt3.$$
Along the arc of the circle, y = √(4 − x²) = (4 − x²)^{1/2} so dy = −x(4 − x²)^{−1/2}dx. The integral
$\int_C \{x\,dy - y\,dx\}$ becomes, using the substitution x = 2 sin θ in the second step,
$$
\begin{aligned}
\int_{\sqrt3}^{-\sqrt3}\left\{-x^2(4 - x^2)^{-1/2} - (4 - x^2)^{1/2}\right\}dx &= \int_{-\sqrt3}^{\sqrt3}\frac{4}{\sqrt{4 - x^2}}\,dx\\
&= \int_{-\pi/3}^{\pi/3} 4\,\frac{1}{2\cos\theta}\,2\cos\theta\,d\theta\\
&= \int_{-\pi/3}^{\pi/3} 4\,d\theta = \frac83\pi
\end{aligned}
$$
So, $\frac12\oint_C \{x\,dy - y\,dx\} = \frac12\left(\frac83\pi - 2\sqrt3\right) = \frac43\pi - \sqrt3$.
Hence both sides of the theorem equal (4/3)π − √3.
Task
Verify Green’s theorem in the plane when applied to the integral
$$\oint_C \{(5x + 2y - 7)\,dx + (3x - 4y + 5)\,dy\}$$
where C represents the perimeter of the trapezium with vertices at (0, 0), (3, 0),
(6, 1) and (1, 1).
First let P = 5x + 2y − 7 and Q = 3x − 4y + 5 and find ∂Q/∂x − ∂P/∂y:
Your solution
Answer
1
Now find $\iint_S \left(\dfrac{\partial Q}{\partial x} - \dfrac{\partial P}{\partial y}\right) dx\,dy$ over the trapezium:
Your solution
Answer
4
Now find $\int (P\,dx + Q\,dy)$ along the four sides of the trapezium:
Your solution
Answer
1.5, 66, −62.5, −1 whose sum is 4.
Finally show that the two sides of the statement of Green’s theorem are equal:
Your solution
Answer
Both sides are 4.
Exercises
1. Verify Green’s identity [1] for the functions u = xyz, w = y 2 and the unit cube 0 ≤ x ≤ 1,
0 ≤ y ≤ 1, 0 ≤ z ≤ 1.
2. Verify the area theorem for
(a) The area above y = 0, but below y = 1 − x2 .

(b) The segment of the circle x2 + y 2 = 1, to the upper left of the line y = 1 − x.
Answer
2. (a) 4/3, (b) π/4 − 1/2.
Contents 30
Introduction to Numerical Methods
30.1 Rounding Error and Conditioning
30.2 Gaussian Elimination
30.3 LU Decomposition
30.4 Matrix Norms
30.5 Iterative Methods for Systems of Equations
Learning outcomes
In this Workbook you will learn about some of the issues involved with using
a computer to carry out numerical calculations for engineering problems.
For example, the effect of rounding error will be discussed.
Most of this Workbook will consider methods for solving systems of equations.
In particular you will see how methods can be adapted so that rounding error
becomes less of a problem.
Rounding Error and
Conditioning 30.1
Introduction
In this first Section concerning numerical methods we will discuss some of the issues involved with
doing arithmetic on a computer. This is an important aspect of engineering. Numbers cannot,
in general, be represented exactly; they are typically stored to a certain number of significant
figures. The associated rounding error and its accumulation are important issues which need to
be appreciated if we are to trust computational output.
We will also look at ill-conditioned problems which can have an unfortunate effect on rounding error.

Prerequisites
Before starting this Section you should . . .
• recall the formula for solving quadratic equations
Learning Outcomes
On completion you should be able to . . .
• round real numbers and know what the associated rounding error is
• understand how rounding error can grow in calculations
• explain what constitutes an ill-conditioned problem
1. Numerical methods
Many mathematical problems which arise in the modelling of engineering situations are too difficult,
or too lengthy, to tackle by hand. Instead it is often good enough to resort to an approximation given
by a computer. Indeed, the process of modelling a “real world” situation with a piece of mathematics
will involve some approximation, so it may make things no worse to seek an approximate solution of
the theoretical problem.
Evidently there are certain issues here. Computers do not know what a function is, or a vector,
or an integral, or a polynomial. Loosely speaking, all computers can do is remember long lists of
numbers and then process them (very quickly!). Mathematical concepts must be posed as something
numerical if a computer is to be given a chance to help. For this reason a topic known as numerical
analysis has grown in recent decades which is devoted to the study of how to get a machine to address
a mathematical problem.
Key Point 1
“Numerical methods” are methods devised to solve mathematical problems on a computer.
2. Rounding
In general, a computer is unable to store every decimal place of a real number. Real numbers are
rounded. To round a number to n significant figures we look at the (n + 1)th digit in the decimal
expansion of the number.
• If the (n + 1)th digit is 0, 1, 2, 3 or 4 then we round down: that is, we simply chop to n
places. (In other words we neglect the (n + 1)th digit and any digits to its right.)
• If the (n + 1)th digit is 5, 6, 7, 8 or 9 then we round up: we add 1 to the nth digit
and then chop to n places.
For example
1/3 = 0.3333 rounded to 4 significant figures,
8/3 = 2.66667 rounded to 6 significant figures,
π = 3.142 rounded to 4 significant figures.
An alternative way of stating the above is as follows
1/3 = 0.3333 rounded to 4 decimal places,
8/3 = 2.66667 rounded to 5 decimal places,
π = 3.142 rounded to 3 decimal places.

Sometimes the phrases “significant figures” and “decimal places” are abbreviated as “s.f.” or
“sig. fig.” and “d.p.” respectively.
Example 1
Write down each of these numbers rounding them to 4 decimal places:
0.12345, −0.44444, 0.5555555, 0.000127351, 0.000005
Solution
0.1235, −0.4444, 0.5556, 0.0001, 0.0000
Example 2
Write down each of these numbers, rounding them to 4 significant figures:
0.12345, −0.44444, 0.5555555, 0.000127351, 25679
Solution
0.1235, −0.4444, 0.5556, 0.0001274, 25680
Task
Write down each of these numbers, rounding them to 3 decimal places:
0.87264, 0.1543, 0.889412, −0.5555
Your solution
Answer
0.873, 0.154, 0.889, −0.556
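The rounding rule used in Examples 1 and 2 and in the Task above can be mimicked on a computer. The sketch below uses Python’s standard `decimal` module with the `ROUND_HALF_UP` mode; this is a deliberate choice made for illustration, because Python’s built-in `round` works on binary floating-point numbers and rounds exact ties to the nearest even digit, so it does not always reproduce the “5 rounds up” convention. The function names `round_dp` and `round_sf` are invented here, and the numbers are passed as strings so that the decimal digits are exactly those written down.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_dp(text, places):
    """Round the decimal number written in `text` to `places` decimal places."""
    step = Decimal(1).scaleb(-places)  # e.g. 0.0001 when places = 4
    return Decimal(text).quantize(step, rounding=ROUND_HALF_UP)


def round_sf(text, figures):
    """Round the decimal number written in `text` to `figures` significant figures."""
    value = Decimal(text)
    if value == 0:
        return value
    # adjusted() is the power of ten of the leading non-zero digit.
    step = Decimal(1).scaleb(value.adjusted() - figures + 1)
    return value.quantize(step, rounding=ROUND_HALF_UP)


for s in ["0.12345", "-0.44444", "0.5555555", "0.000127351", "0.000005"]:
    print(s, "->", round_dp(s, 4))      # Example 1: 4 decimal places

for s in ["0.12345", "-0.44444", "0.5555555", "0.000127351", "25679"]:
    print(s, "->", round_sf(s, 4))      # Example 2: 4 significant figures
```

For a negative number this mode rounds the magnitude, which agrees with the Task answer −0.5555 → −0.556.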
Rounding error
Clearly, rounding a number introduces an error. Suppose we know that some quantity x is such that
x = 0.762143 (6 d.p.)
Based on what we know about the rounding process we can deduce that
x = 0.762143 ± 0.5 × 10⁻⁶.
This is typical of what can occur when dealing with numerical methods. We do not know what
value x takes, but we have an error bound describing the furthest x can be from the stated value
0.762143. Error bounds are necessarily pessimistic. It is very likely that x is closer to 0.762143 than
0.5 × 10⁻⁶, but we cannot assume this; we have to assume the worst case if we are to be certain
that the error bound is safe.
Key Point 2
Rounding a number to n decimal places introduces an error that is no larger (in magnitude) than
$$\frac12\times 10^{-n}$$
Note that successive rounding can increase the associated rounding error, for example
12.3456 = 12.3 (1 d.p.),

12.3456 = 12.346 (3 d.p.) = 12.35 (2 d.p.) = 12.4 (1 d.p.).
Accumulated rounding error

Rounding error can sometimes grow as calculations progress. Consider these examples.
Example 3
Let x = 22/7 and y = π. It follows that, to 9 decimal places
x = 3.142857143
y = 3.141592654
x+y = 6.284449797
x−y = 0.001264489
(i) Round x and y to 7 significant figures. Find x + y and x − y.
(ii) Round x and y to 3 significant figures. Find x + y and x − y.
Solution
(i) To 7 significant figures x = 3.142857 and y = 3.141593 and it follows that, with this rounding
of the numbers
x + y = 6.284450
x − y = 0.001264.
The outputs (x + y and x − y) are accurate to as many decimal places as the inputs (x
and y). Notice however that the difference x − y is now only accurate to 4 significant figures.
(ii) To 3 significant figures x = 3.14 and y = 3.14 and it follows that, with this rounding of the
numbers
x + y = 6.28
x − y = 0.
This time we have no significant figures accurate in x − y.
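The loss of significant figures in Example 3 can be reproduced numerically. The following is a minimal sketch, assuming round-half-up rounding via Python's `decimal` module; the helper names `round_sig` and `sum_and_difference` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value, sig):
    """Round a decimal string or Decimal to `sig` significant figures (half up)."""
    d = Decimal(value)
    if d == 0:
        return d
    exponent = d.adjusted() - sig + 1  # decimal position of the last kept digit
    return d.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_UP)

def sum_and_difference(sig):
    """Round x = 22/7 and y = pi (given to 9 d.p.) and return (x + y, x - y)."""
    x = round_sig("3.142857143", sig)
    y = round_sig("3.141592654", sig)
    return x + y, x - y

print(sum_and_difference(7))  # sum 6.284450, difference 0.001264
print(sum_and_difference(3))  # sum 6.28, difference 0.00
```

The sum keeps roughly as many correct figures as the inputs, while the difference of two nearly equal numbers loses them.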
In Example 3 there was loss of accuracy in calculating x−y. This shows how rounding error can grow
with even simple arithmetic operations. We need to be careful when developing numerical methods
that rounding error does not grow. What follows is another case when there can be a loss of accurate
significant figures.
Task
This Task involves solving the quadratic equation
\[ x^2 + 30x + 1 = 0 \]
(a) Use the quadratic formula to show that the two solutions of $x^2 + 30x + 1 = 0$
are $x = -15 \pm \sqrt{224}$.
(b) Write down the two solutions to as many decimal places as your calculator
will allow.
(c) Now round $\sqrt{224}$ to 4 significant figures and recalculate the two solutions.
(d) How many accurate significant figures are there in the solutions you obtained
with the rounded approximation to $\sqrt{224}$?
Your solution
Answer
(a) From the quadratic formula
\[ x = \frac{-30 \pm \sqrt{30^2 - 4}}{2} = -15 \pm \sqrt{15^2 - 1} = -15 \pm \sqrt{224} \]
as required.
(b) $-15 + \sqrt{224} = -0.03337045291$ is one solution and $-15 - \sqrt{224} = -29.96662955$ is the
other, to 10 significant figures.
(c) Rounding $\sqrt{224}$ to 4 significant figures gives
\[ -15 + \sqrt{224} = -15 + 14.97 = -0.03, \qquad -15 - \sqrt{224} = -15 - 14.97 = -29.97 \]
(d) The first of these is only accurate to 1 sig. fig., the second is accurate to 4 sig. fig.
Task
In the previous Task it was found that rounding to 4 sig. fig. led to a result with
a large error for the smaller root of the quadratic equation. Use the fact that for
the general quadratic
\[ ax^2 + bx + c = 0 \]
the product of the two roots is $\frac{c}{a}$ to determine the smaller root with improved
accuracy.
Your solution
Answer
Here a = 1, b = 30, c = 1 so the product of the roots $= \frac{c}{a} = 1$. So starting from the rounded
value −29.97 for the larger root we obtain the smaller root to be $\frac{1}{-29.97} \approx -0.03337$ with 4 sig.
fig. accuracy.
(This indirect method is often built into computer software to increase accuracy.)
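One common way of arranging this indirect calculation is sketched below in Python. It is an illustration under stated assumptions rather than the method of any particular software package: the function name `quadratic_roots` is hypothetical, real roots are assumed, and b and c are assumed not both zero.

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0 without subtracting nearly equal numbers."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots")
    # The square root is given the same sign as b, so the two terms are added.
    q = -(b + math.copysign(math.sqrt(disc), b)) / 2
    larger_magnitude = q / a
    smaller_magnitude = c / q  # uses: product of the roots = c / a
    return larger_magnitude, smaller_magnitude

big, small = quadratic_roots(1, 30, 1)
print(big, small)

# The hand calculation from the Task, starting from the 4 sig. fig. value -29.97.
print(round(1 / -29.97, 5))  # -0.03337
```

The first root is found directly and the second from the product of the roots, so neither step involves the cancellation seen in the previous Task.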
3. Well-conditioned and ill-conditioned problems

Suppose we have a mathematical problem that depends on some input data. Now imagine altering
the input data by a tiny amount. If the corresponding solution always varies by a correspondingly
tiny amount then we say that the problem is well-conditioned. If a tiny change in the input results
in a large change in the output we say that the problem is ill-conditioned. The following Example
should help.
Example 4
Show that the evaluation of the function $f(x) = x^2 - x - 1500$ near x = 39
is an ill-conditioned problem.
Solution
Consider f (39) = −18 and f (39.1) = −10.29. In changing x from 39 to 39.1 we have altered
it by about 0.25%. But the percentage change in f is greater than 40%. This demonstrates the
ill-conditioned nature of the problem.
Task
Work out the derivative $\frac{df}{dx}$ for the function used in Example 4 and so explain why
the numerical results show the calculation of f to be ill-conditioned near x = 39.
Your solution
Answer
We have $f = x^2 - x - 1500$ and $\frac{df}{dx} = 2x - 1$. At x = 39 the value of f is −18 and, using calculus,
the value of $\frac{df}{dx}$ is 77. Thus x = 39 is very close to a zero of f (i.e. a root of the quadratic equation
f (x) = 0). The fractional change in f is thus very large even for a small change in x. The
values f (38.6) = −48.64 and f (39.4) = 12.96 lead us to an estimate of
\[ \frac{12.96 - (-48.64)}{39.4 - 38.6} \]
for $\frac{df}{dx}$. This ratio gives the value 77.0, which agrees exactly with our result from the calculus. Note,
however, that an exact result of this kind is not usually obtained; it is due to the simple quadratic
form of f for this example.
One reason that this matters is because of rounding error. Suppose that, in the Example above, all we
know is that x is equal to 39 to 2 significant figures. Then we have no chance at all of evaluating f
with confidence, for consider these values
f (38.6) = −48.64
f (39) = −18
f (39.4) = 12.96.
All of the arguments on the left-hand sides are equal to 39 to 2 significant figures so all the values
on the right-hand sides are contenders for f (x). The ill-conditioned nature of the problem leaves us
with some serious doubts concerning the value of f .
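The figures quoted in Example 4 and in the list above can be checked with a few lines of Python. This is only a sketch using ordinary floating-point arithmetic; the helper `percent_change` is an illustrative name.

```python
def f(x):
    return x**2 - x - 1500

def percent_change(x_old, x_new):
    """Percentage change in f, relative to the size of f at x_old."""
    return 100 * (f(x_new) - f(x_old)) / abs(f(x_old))

change_in_x = 100 * (39.1 - 39) / 39     # about 0.26 %
change_in_f = percent_change(39, 39.1)   # about 42.8 %
print(round(change_in_x, 2), round(change_in_f, 1))

# Three inputs that all equal 39 to 2 significant figures.
for x in (38.6, 39, 39.4):
    print(x, round(f(x), 2))
```

A change of about a quarter of one percent in x moves f by more than 40%, and the three candidate inputs give values of f that differ even in sign.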
It is enough for the time being to be aware that ill-conditioned problems exist. We will discuss this
sort of thing again, and how to combat it in a particular case, in a later Section of this Workbook.
Exercises
1. Round each of these numbers to the number of places or figures indicated
(a) 23.56712 (to 2 decimal places).

(b) −15432.1 (to 3 significant figures).
2. Suppose we wish to calculate
\[ y = \sqrt{x+1} - \sqrt{x}, \]
for relatively large values of x. The following table gives values of y for a range of x-values

x          $\sqrt{x+1} - \sqrt{x}$
100        0.04987562112089
1000       0.01580743742896
10000      0.00499987500625
100000     0.00158113487726

(a) For each x shown in the table, and working to 6 significant figures, evaluate $\sqrt{x+1}$ and
then $\sqrt{x}$. Find $\sqrt{x+1} - \sqrt{x}$ by taking the difference of your two rounded numbers. Are
your answers accurate to 6 significant figures?
(b) For each x shown in the table, and working to 4 significant figures, evaluate $\sqrt{x+1}$ and
then $\sqrt{x}$. Find $\sqrt{x+1} - \sqrt{x}$ by taking the difference of your two rounded numbers. Are
your answers accurate to 4 significant figures?
3. The larger solution of the quadratic equation
\[ x^2 + 168x + 1 = 0 \]
is $-84 + \sqrt{7055}$ which is equal to −0.0059525919 to 10 decimal places. Round the value
$\sqrt{7055}$ to 4 significant figures and then use this rounded value to calculate the larger solution
of the quadratic equation. How many accurate significant figures does your answer have?
4. Consider the function
\[ f(x) = x^2 + x - 1975 \]
and suppose we want to evaluate it for some x.
(a) Let x = 20. Evaluate f (x) and then evaluate f again having altered x by just 1%.
What is the percentage change in f ? Is the problem of evaluating f (x), for x = 20, a
well-conditioned one?
(b) Let x = 44. Evaluate f (x) and then evaluate f again having altered x by just 1%.
What is the percentage change in f ? Is the problem of evaluating f (x), for x = 44, a
well-conditioned one?
(Answer: the problem in part (a) is well-conditioned, the problem in part (b) is ill-conditioned.)
Answers
1. 23.57, −15400.
2. The answers are tabulated below. The 2nd and 3rd columns give values for $\sqrt{x+1}$ and $\sqrt{x}$
respectively, rounded to 10 decimal places. The 4th column shows the values of $\sqrt{x+1} - \sqrt{x}$
also to 10 decimal places. Column (a) deals with part (a) of the question and finds the
difference after rounding the numbers in the 2nd and 3rd columns to 6 significant figures.
Column (b) deals with part (b) of the question and finds the difference after rounding the
numbers in the 2nd and 3rd columns to 4 significant figures.

x        $\sqrt{x+1}$      $\sqrt{x}$        $\sqrt{x+1}-\sqrt{x}$   (a)      (b)
100      10.0498756211     10.0000000000     0.0498756211            0.0499   0.0500
1000     31.6385840391     31.6227766017     0.0158074374            0.0158   0.0200
10000    100.0049998750    100.0000000000    0.0049998750            0.0050   0.0000
100000   316.2293471517    316.2277660168    0.0015811349            0.0010   0.0000

Clearly the answers in columns (a) and (b) are not accurate to 6 and 4 figures respectively.
Indeed the last two entries in column (b) are accurate to no figures at all!
3. $\sqrt{7055} = 83.99$ to 4 significant figures. Using this value to find the larger solution of the
quadratic equation gives
−84 + 83.99 = −0.01 .
The number of accurate significant figures is 0 because the accurate answer is −0.006 and ‘1’
is not the leading digit (it is ‘6’).
4. (a) f (20) = −1555 and f (20.2) = −1546.76 so the percentage change in f on changing
x = 20 by 1% is
\[ \frac{-1555 - (-1546.76)}{-1555} \times 100\% = 0.53\% \]
to 2 decimal places.
(b) f (44) = 5 and f (44.44) = 44.3536 so the percentage change in f on changing x = 44
by 1% is
\[ \frac{5 - 44.3536}{5} \times 100\% = -787.07\% \]
Clearly then, the evaluation of f (20) is well-conditioned and that of f (44) is ill-conditioned.
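Columns (a) and (b) in the answer to Exercise 2 can be regenerated with the following Python sketch. It assumes round-half-up rounding with the `decimal` module; `round_sig` and `rounded_difference` are illustrative names only.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value, sig):
    """Round a Decimal (or decimal string) to `sig` significant figures (half up)."""
    d = Decimal(value)
    if d == 0:
        return d
    exponent = d.adjusted() - sig + 1
    return d.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_UP)

def rounded_difference(x, sig):
    """sqrt(x + 1) - sqrt(x), with both square roots first rounded to `sig` figures."""
    root_a = round_sig(Decimal(x + 1).sqrt(), sig)
    root_b = round_sig(Decimal(x).sqrt(), sig)
    return root_a - root_b

for x in (100, 1000, 10000, 100000):
    # column (a) uses 6 significant figures, column (b) uses 4
    print(x, rounded_difference(x, 6), rounded_difference(x, 4))
```

As x grows the two square roots agree in more and more leading digits, so fewer correct digits survive the subtraction.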

Gaussian Elimination 30.2
Introduction
In this Section we will reconsider the Gaussian elimination approach discussed in Workbook 8, and we
will see how rounding error can grow if we are not careful in our implementation of the approach. A
method called partial pivoting, which helps stop rounding error from growing, will be introduced.
Prerequisites
Before starting this Section you should . . .
• revise matrices, especially matrix solution of equations
• recall Gaussian elimination
• be able to find the inverse of a 2 × 2 matrix

Learning Outcomes
On completion you should be able to . . .
• carry out Gaussian elimination with partial pivoting

1. Gaussian elimination
Recall from Workbook 8 that the basic idea with Gaussian (or Gauss) elimination is to replace the matrix of
coefficients with a matrix that is easier to deal with. Usually the nicer matrix is of upper triangular
form which allows us to find the solution by back substitution. For example, suppose we have
\[ \begin{aligned} x_1 + 3x_2 - 5x_3 &= 2 \\ 3x_1 + 11x_2 - 9x_3 &= 4 \\ -x_1 + x_2 + 6x_3 &= 5 \end{aligned} \]
which we can abbreviate using an augmented matrix to
\[ \left(\begin{array}{rrr|r} \boxed{1} & 3 & -5 & 2 \\ 3 & 11 & -9 & 4 \\ -1 & 1 & 6 & 5 \end{array}\right). \]
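We use the boxed element to eliminate any non-zeros below it. This involves the following row
operations
\[ \left(\begin{array}{rrr|r} \boxed{1} & 3 & -5 & 2 \\ 3 & 11 & -9 & 4 \\ -1 & 1 & 6 & 5 \end{array}\right) \begin{array}{l} \\ R_2 - 3 \times R_1 \\ R_3 + R_1 \end{array} \;\Rightarrow\; \left(\begin{array}{rrr|r} 1 & 3 & -5 & 2 \\ 0 & 2 & 6 & -2 \\ 0 & 4 & 1 & 7 \end{array}\right). \]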
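And the next step is to use the 2 to eliminate the non-zero below it. This requires the final row
operation
\[ \left(\begin{array}{rrr|r} 1 & 3 & -5 & 2 \\ 0 & 2 & 6 & -2 \\ 0 & 4 & 1 & 7 \end{array}\right) \begin{array}{l} \\ \\ R_3 - 2 \times R_2 \end{array} \;\Rightarrow\; \left(\begin{array}{rrr|r} 1 & 3 & -5 & 2 \\ 0 & 2 & 6 & -2 \\ 0 & 0 & -11 & 11 \end{array}\right). \]
This is the augmented form for an upper triangular system. Writing the system in extended form we
have
\[ \begin{aligned} x_1 + 3x_2 - 5x_3 &= 2 \\ 2x_2 + 6x_3 &= -2 \\ -11x_3 &= 11 \end{aligned} \]
which is easy to solve from the bottom up, by back substitution.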
Example 5
Solve the system
\[ \begin{aligned} x_1 + 3x_2 - 5x_3 &= 2 \\ 2x_2 + 6x_3 &= -2 \\ -11x_3 &= 11 \end{aligned} \]
Solution
The bottom equation implies that $x_3 = -1$. The middle equation then gives us that
\[ 2x_2 = -2 - 6x_3 = -2 + 6 = 4 \quad \therefore \quad x_2 = 2 \]
and finally, from the top equation,
\[ x_1 = 2 - 3x_2 + 5x_3 = 2 - 6 - 5 = -9. \]
Therefore the solution to the problem stated at the beginning of this Section is
\[ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -9 \\ 2 \\ -1 \end{pmatrix}. \]
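The elimination and back substitution steps just carried out by hand can be sketched in Python as follows. This is a minimal illustration using exact fractions and no pivoting, so it assumes that no zero appears on the diagonal; the function names are hypothetical.

```python
from fractions import Fraction

def eliminate(aug):
    """Reduce an augmented matrix to upper triangular form (no pivoting)."""
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    for k in range(n):
        for r in range(k + 1, n):
            factor = rows[r][k] / rows[k][k]  # assumes a non-zero diagonal entry
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[k])]
    return rows

def back_substitute(rows):
    """Solve an upper triangular augmented system from the bottom row up."""
    n = len(rows)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - known) / rows[i][i]
    return x

upper = eliminate([[1, 3, -5, 2], [3, 11, -9, 4], [-1, 1, 6, 5]])
solution = back_substitute(upper)
print(upper)
print(solution)  # x1 = -9, x2 = 2, x3 = -1
```

Exact fractions are used here so that the output matches the hand calculation; rounding effects are the subject of the partial pivoting discussion.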
The following Task will act as useful revision of the Gaussian elimination procedure.
Task
Carry out row operations to reduce the matrix
\[ \begin{pmatrix} 2 & -1 & 4 \\ 4 & 3 & -1 \\ -6 & 8 & -2 \end{pmatrix} \]
into upper triangular form.
Your solution
Answer
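The row operations required to eliminate the non-zeros below the diagonal in the first column are
as follows
\[ \begin{pmatrix} 2 & -1 & 4 \\ 4 & 3 & -1 \\ -6 & 8 & -2 \end{pmatrix} \begin{array}{l} \\ R_2 - 2 \times R_1 \\ R_3 + 3 \times R_1 \end{array} \;\Rightarrow\; \begin{pmatrix} 2 & -1 & 4 \\ 0 & 5 & -9 \\ 0 & 5 & 10 \end{pmatrix} \]
Next we use the 5 on the diagonal to eliminate the 5 below it:
\[ \begin{pmatrix} 2 & -1 & 4 \\ 0 & 5 & -9 \\ 0 & 5 & 10 \end{pmatrix} \begin{array}{l} \\ \\ R_3 - R_2 \end{array} \;\Rightarrow\; \begin{pmatrix} 2 & -1 & 4 \\ 0 & 5 & -9 \\ 0 & 0 & 19 \end{pmatrix} \]
which is in the required upper triangular form.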
2. Partial pivoting
Partial pivoting is a refinement of the Gaussian elimination procedure which helps to prevent the
growth of rounding error.
An example to motivate the idea

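Consider the example
\[ \begin{pmatrix} 10^{-4} & 1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]
First of all let us work out the exact answer to this problem
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 10^{-4} & 1 \\ -1 & 2 \end{pmatrix}^{-1} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2 \times 10^{-4} + 1} \begin{pmatrix} 2 & -1 \\ 1 & 10^{-4} \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2 \times 10^{-4} + 1} \begin{pmatrix} 1 \\ 1 + 10^{-4} \end{pmatrix} = \begin{pmatrix} 0.999800\ldots \\ 0.999900\ldots \end{pmatrix}. \]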
Now we compare this exact result with the output from Gaussian elimination. Let us suppose, for
sake of argument, that all numbers are rounded to 3 significant figures. Eliminating the one non-zero
element below the diagonal, and remembering that we are only dealing with 3 significant figures, we
obtain
\[ \begin{pmatrix} 10^{-4} & 1 \\ 0 & 10^{4} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 10^{4} \end{pmatrix}. \]
The bottom equation gives $x_2 = 1$, and the top equation therefore gives $x_1 = 0$. Something has
gone seriously wrong, for this value for $x_1$ is nowhere near the true value 0.9998. . . found without
rounding. The problem has been caused by using a small number ($10^{-4}$) to eliminate a number much
larger in magnitude (−1) below it.
The general idea with partial pivoting is to try to avoid using a small number to eliminate much
larger numbers.
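Suppose we swap the rows
\[ \begin{pmatrix} -1 & 2 \\ 10^{-4} & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
and proceed as normal, still using just 3 significant figures. This time eliminating the non-zero below
the diagonal gives
\[ \begin{pmatrix} -1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
which leads to $x_2 = 1$ and $x_1 = 1$, which is an excellent approximation to the exact values, given
that we are only using 3 significant figures.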
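The two calculations above can be imitated in Python by asking the `decimal` module to round every arithmetic result to 3 significant figures. This is a sketch under that assumption (the module's default rounding mode is used, and the function name `solve_2x2` is illustrative); it is not a general-purpose solver.

```python
from decimal import Decimal, localcontext

def solve_2x2(rows, digits=3):
    """Gaussian elimination on a 2 x 2 system, rounding each operation to `digits` s.f."""
    with localcontext() as ctx:
        ctx.prec = digits  # every arithmetic result is rounded to this many figures
        (a11, a12, b1), (a21, a22, b2) = [[Decimal(v) for v in row] for row in rows]
        m = a21 / a11              # multiplier used to eliminate a21
        a22 = a22 - m * a12
        b2 = b2 - m * b1
        x2 = b2 / a22              # back substitution
        x1 = (b1 - a12 * x2) / a11
        return x1, x2

# Rows are (coefficient, coefficient, right-hand side).
without_pivoting = solve_2x2([("1e-4", "1", "1"), ("-1", "2", "1")])
with_pivoting = solve_2x2([("-1", "2", "1"), ("1e-4", "1", "1")])
print(without_pivoting)  # x1 comes out as 0
print(with_pivoting)     # x1 comes out as 1
```

Only the order of the rows differs between the two calls, yet the first loses all accuracy in $x_1$ while the second agrees with the exact solution to 3 significant figures.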
Partial pivoting in general

At each step the aim in Gaussian elimination is to use an element on the diagonal to eliminate all
the non-zeros below. In partial pivoting we look at all of these elements (the diagonal and the ones
below) and swap the rows (if necessary) so that the element on the diagonal is not very much smaller
than the other elements.
Key Point 3
Partial Pivoting
This involves scanning a column from the diagonal down. If the diagonal entry is very much smaller
than any of the others we swap rows. Then we proceed with Gaussian elimination in the usual way.
In practice on a computer we swap rows to ensure that the diagonal entry is always the largest
possible (in magnitude). For calculations we can carry out by hand it is usually only necessary to
worry about partial pivoting if a zero crops up in a place which stops Gaussian elimination working.
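Consider this example
\[ \begin{pmatrix} 1 & -3 & 2 & 1 \\ 2 & -6 & 1 & 4 \\ -1 & 2 & 3 & 4 \\ 0 & -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} -4 \\ 1 \\ 12 \\ 0 \end{pmatrix}. \]
The first step is to use the 1 in the top left corner to eliminate all the non-zeros below it in the
augmented matrix
\[ \left(\begin{array}{rrrr|r} \boxed{1} & -3 & 2 & 1 & -4 \\ 2 & -6 & 1 & 4 & 1 \\ -1 & 2 & 3 & 4 & 12 \\ 0 & -1 & 1 & 1 & 0 \end{array}\right) \begin{array}{l} \\ R_2 - 2 \times R_1 \\ R_3 + R_1 \\ \\ \end{array} \;\Rightarrow\; \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & \boxed{0} & -3 & 2 & 9 \\ 0 & -1 & 5 & 5 & 8 \\ 0 & -1 & 1 & 1 & 0 \end{array}\right). \]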
What we would like to do now is to use the boxed element to eliminate all the non-zeros below it.
But clearly this is impossible. We need to apply partial pivoting. We look down the column starting
at the diagonal entry and see that the two possible candidates for the swap are both equal to −1.
Either will do so let us swap the second and fourth rows to give
\[ \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & -1 & 1 & 1 & 0 \\ 0 & -1 & 5 & 5 & 8 \\ 0 & 0 & -3 & 2 & 9 \end{array}\right). \]
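That was the partial pivoting step. Now we proceed with Gaussian elimination
\[ \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & -1 & 1 & 1 & 0 \\ 0 & -1 & 5 & 5 & 8 \\ 0 & 0 & -3 & 2 & 9 \end{array}\right) \begin{array}{l} \\ \\ R_3 - R_2 \\ \\ \end{array} \;\Rightarrow\; \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & -1 & 1 & 1 & 0 \\ 0 & 0 & 4 & 4 & 8 \\ 0 & 0 & -3 & 2 & 9 \end{array}\right). \]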
The arithmetic is simpler if we cancel a factor of 4 out of the third row to give
\[ \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & -1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 2 \\ 0 & 0 & -3 & 2 & 9 \end{array}\right). \]
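And the elimination phase is completed by removing the −3 from the final row as follows
\[ \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & -1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 2 \\ 0 & 0 & -3 & 2 & 9 \end{array}\right) \begin{array}{l} \\ \\ \\ R_4 + 3 \times R_3 \end{array} \;\Rightarrow\; \left(\begin{array}{rrrr|r} 1 & -3 & 2 & 1 & -4 \\ 0 & -1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 5 & 15 \end{array}\right). \]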
This system is upper triangular so back substitution can be used now to work out that x4 = 3,
x3 = −1, x2 = 2 and x1 = 1.
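A computer implementation would typically pick, at each step, the row whose entry in the current column is largest in magnitude. The sketch below follows that rule with exact fractions; it therefore chooses different row swaps from the hand calculation above but reaches the same solution. The function name is an illustrative assumption.

```python
from fractions import Fraction

def solve_with_partial_pivoting(A, b):
    """Gaussian elimination with partial pivoting, then back substitution."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        # Scan column k from the diagonal down for the entry of largest magnitude.
        pivot_row = max(range(k, n), key=lambda r: abs(M[r][k]))
        if M[pivot_row][k] == 0:
            raise ValueError("matrix is singular")
        M[k], M[pivot_row] = M[pivot_row], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            M[r] = [mr - factor * mk for mr, mk in zip(M[r], M[k])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

A = [[1, -3, 2, 1], [2, -6, 1, 4], [-1, 2, 3, 4], [0, -1, 1, 1]]
b = [-4, 1, 12, 0]
x = solve_with_partial_pivoting(A, b)
print(x)  # x1 = 1, x2 = 2, x3 = -1, x4 = 3
```

Because the zero that blocked the hand calculation is never chosen as a pivot, no special case is needed for it.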
The Task below is a case in which partial pivoting is required.
[For a large system which can be solved by Gauss elimination see Engineering Example 1 on page
62].
Task
Transform the matrix
\[ \begin{pmatrix} 1 & -2 & 4 \\ -3 & 6 & -11 \\ 4 & 3 & 5 \end{pmatrix} \]
into upper triangular form using Gaussian elimination (with partial pivoting when
necessary).
Your solution
Answer
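The row operations required to eliminate the non-zeros below the diagonal in the first column are
\[ \begin{pmatrix} 1 & -2 & 4 \\ -3 & 6 & -11 \\ 4 & 3 & 5 \end{pmatrix} \begin{array}{l} \\ R_2 + 3 \times R_1 \\ R_3 - 4 \times R_1 \end{array} \;\Rightarrow\; \begin{pmatrix} 1 & -2 & 4 \\ 0 & 0 & 1 \\ 0 & 11 & -11 \end{pmatrix} \]
which puts a zero on the diagonal. We are forced to use partial pivoting and swapping the second
and third rows gives
\[ \begin{pmatrix} 1 & -2 & 4 \\ 0 & 11 & -11 \\ 0 & 0 & 1 \end{pmatrix} \]
which is in the required upper triangular form.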
Key Point 4
When To Use Partial Pivoting
1. When carrying out Gaussian elimination on a computer, we would usually swap rows
so that the element on the diagonal is as large (in magnitude) as possible. This helps stop
the growth of rounding error.
2. When doing hand calculations (not involving rounding) there are two reasons we might pivot
(a) If the element on the diagonal is zero, we have to swap rows so as to put a non-zero on
the diagonal.
(b) Sometimes we might swap rows so that there is a “nicer” non-zero number on the
diagonal than there would be without pivoting. For example, if the number on the
diagonal can be arranged to be a 1 then no awkward fractions will be introduced when
we carry out row operations related to Gaussian elimination.
Exercises
1. Solve the following system by back substitution
\[ \begin{aligned} x_1 + 2x_2 - x_3 &= 3 \\ 5x_2 + 6x_3 &= -2 \\ 7x_3 &= -14 \end{aligned} \]
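2. (a) Show that the exact solution of the system of equations
\[ \begin{pmatrix} 10^{-5} & 1 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 10 \end{pmatrix} \quad \text{is} \quad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -0.99998 \\ 2.00001 \end{pmatrix}. \]
(b) Working to 3 significant figures, and using Gaussian elimination without pivoting, find an
approximation to $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. Show that the rounding error causes the approximation to $x_1$ to be
a very poor one.
(c) Working to 3 significant figures, and using Gaussian elimination with pivoting, find an
approximation to $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. Show that the approximation this time is a good one.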
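3. Carry out row operations (with partial pivoting if necessary) to reduce these matrices to upper
triangular form.
\[ \text{(a)} \begin{pmatrix} 1 & -2 & 4 \\ -4 & -3 & -3 \\ -1 & 13 & 1 \end{pmatrix}, \quad \text{(b)} \begin{pmatrix} 0 & -1 & 2 \\ 1 & -4 & 2 \\ -2 & 5 & -4 \end{pmatrix}, \quad \text{(c)} \begin{pmatrix} -3 & 10 & 1 \\ 1 & -3 & 2 \\ -2 & 10 & -4 \end{pmatrix}. \]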
(Hint: before tackling (c) you might like to consider point 2(b) in Key Point 4.)
Answers
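1. From the last equation we see that $x_3 = -2$. Using this information in the second equation
gives us $x_2 = 2$. Finally, the first equation implies that $x_1 = -3$.
2. (a) The formula
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]
can be used to show that
\[ x_1 = -\frac{50000}{50001} = -0.99998 \quad \text{and} \quad x_2 = \frac{200005}{100002} = 2.00001 \]
as required.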
50001 100002
(b) Carrying out the elimination without pivoting, and rounding to 3 significant figures we
find that x2 = 2.00 and that, therefore, x1 = 0. This is a very poor approximation to x1 .
(c) To apply partial pivoting we swap the two rows and then eliminate the bottom left element.
Consequently we find that, after rounding the system of equations to 3 significant figures,
x2 = 2.00 and x1 = −1.00. These give excellent agreement with the exact answers.
3.
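(a) The row operations required to eliminate the non-zeros below the diagonal in the first
column are as follows
\[ \begin{pmatrix} 1 & -2 & 4 \\ -4 & -3 & -3 \\ -1 & 13 & 1 \end{pmatrix} \begin{array}{l} \\ R_2 + 4 \times R_1 \\ R_3 + 1 \times R_1 \end{array} \;\Rightarrow\; \begin{pmatrix} 1 & -2 & 4 \\ 0 & -11 & 13 \\ 0 & 11 & 5 \end{pmatrix} \]
Next we use the element in the middle of the matrix to eliminate the value underneath
it. This gives
\[ \begin{pmatrix} 1 & -2 & 4 \\ 0 & -11 & 13 \\ 0 & 0 & 18 \end{pmatrix} \]
which is of the required upper triangular form.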
(b) We must swap the rows to put a non-zero in the top left position (this is the partial
pivoting step). Swapping the first and second rows gives the matrix
\[ \begin{pmatrix} 1 & -4 & 2 \\ 0 & -1 & 2 \\ -2 & 5 & -4 \end{pmatrix}. \]
We carry out one row operation to eliminate the non-zero in the bottom left entry as
follows
\[ \begin{pmatrix} 1 & -4 & 2 \\ 0 & -1 & 2 \\ -2 & 5 & -4 \end{pmatrix} \begin{array}{l} \\ \\ R_3 + 2 \times R_1 \end{array} \;\Rightarrow\; \begin{pmatrix} 1 & -4 & 2 \\ 0 & -1 & 2 \\ 0 & -3 & 0 \end{pmatrix} \]
Next we use the middle element to eliminate the non-zero value underneath it. This
gives
\[ \begin{pmatrix} 1 & -4 & 2 \\ 0 & -1 & 2 \\ 0 & 0 & -6 \end{pmatrix} \]
which is of the required upper triangular form.
(c) If we swap the first and second rows of the matrix then we do not have to deal with
fractions. Having done this the row operations required to eliminate the non-zeros below
the diagonal in the first column are as follows
\[ \begin{pmatrix} 1 & -3 & 2 \\ -3 & 10 & 1 \\ -2 & 10 & -4 \end{pmatrix} \begin{array}{l} \\ R_2 + 3 \times R_1 \\ R_3 + 2 \times R_1 \end{array} \;\Rightarrow\; \begin{pmatrix} 1 & -3 & 2 \\ 0 & 1 & 7 \\ 0 & 4 & 0 \end{pmatrix} \]
Next we use the element in the middle of the matrix to eliminate the non-zero value
underneath it. This gives
\[ \begin{pmatrix} 1 & -3 & 2 \\ 0 & 1 & 7 \\ 0 & 0 & -28 \end{pmatrix} \]
which is of the required upper triangular form.

LU Decomposition 30.3

Introduction
In this Section we consider another direct method for obtaining the solution of systems of equations
in the form AX = B.

Prerequisites
Before starting this Section you should . . .
• revise matrices and their use in systems of equations
• revise determinants

Learning Outcomes
On completion you should be able to . . .
• find an LU decomposition of simple matrices and apply it to solve systems of equations
• determine when an LU decomposition is unavailable and when it is possible to
circumvent the problem
1. LU decomposition
Suppose we have the system of equations
AX = B.
The motivation for an LU decomposition is based on the observation that systems of equations
involving triangular coefficient matrices are easier to deal with. Indeed, the whole point of Gaussian
elimination is to replace the coefficient matrix with one that is triangular. The LU decomposition is
another approach designed to exploit triangular systems.
We suppose that we can write
A = LU
where L is a lower triangular matrix and U is an upper triangular matrix. Our aim is to find L and
U and once we have done so we have found an LU decomposition of A.
Key Point 5
An LU decomposition of a matrix A is the product of a lower triangular matrix and an upper
triangular matrix that is equal to A.
It turns out that we need only consider lower triangular matrices L that have 1s down the diagonal.
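Here is an example. Let
\[ A = \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix} = LU \quad \text{where} \quad L = \begin{pmatrix} 1 & 0 & 0 \\ L_{21} & 1 & 0 \\ L_{31} & L_{32} & 1 \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ 0 & U_{22} & U_{23} \\ 0 & 0 & U_{33} \end{pmatrix}. \]
Multiplying out LU and setting the answer equal to A gives
\[ \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} & L_{21}U_{13} + U_{23} \\ L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & L_{31}U_{13} + L_{32}U_{23} + U_{33} \end{pmatrix} = \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix}. \]
Now we use this to find the entries in L and U . Fortunately this is not nearly as hard as it might at
first seem. We begin by running along the top row to see that
\[ U_{11} = 1, \quad U_{12} = 2, \quad U_{13} = 4. \]
Now consider the second row
\[ \begin{aligned} L_{21}U_{11} = 3 \quad &\therefore \quad L_{21} \times 1 = 3 \quad \therefore \quad L_{21} = 3, \\ L_{21}U_{12} + U_{22} = 8 \quad &\therefore \quad 3 \times 2 + U_{22} = 8 \quad \therefore \quad U_{22} = 2, \\ L_{21}U_{13} + U_{23} = 14 \quad &\therefore \quad 3 \times 4 + U_{23} = 14 \quad \therefore \quad U_{23} = 2. \end{aligned} \]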
Notice how, at each step, the equation being considered has only one unknown in it, and other
quantities that we have already found. This pattern continues on the last row
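\[ \begin{aligned} L_{31}U_{11} = 2 \quad &\therefore \quad L_{31} \times 1 = 2 \quad \therefore \quad L_{31} = 2, \\ L_{31}U_{12} + L_{32}U_{22} = 6 \quad &\therefore \quad 2 \times 2 + L_{32} \times 2 = 6 \quad \therefore \quad L_{32} = 1, \\ L_{31}U_{13} + L_{32}U_{23} + U_{33} = 13 \quad &\therefore \quad (2 \times 4) + (1 \times 2) + U_{33} = 13 \quad \therefore \quad U_{33} = 3. \end{aligned} \]
We have shown that
\[ A = \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{pmatrix} \]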
and this is an LU decomposition of A.
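The row-by-row calculation above translates directly into a short routine. The following Python sketch assumes that L has 1s on its diagonal, as in the text, and that no zero turns up on the diagonal of U; the function name `lu_decompose` is illustrative.

```python
from fractions import Fraction

def lu_decompose(A):
    """Return (L, U) with A = LU, L unit lower triangular, U upper triangular."""
    n = len(A)
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):          # work through A one row at a time
        for j in range(n):
            # Part of entry (i, j) of LU that is already known.
            known = sum(L[i][k] * U[k][j] for k in range(min(i, j)))
            if j >= i:
                U[i][j] = Fraction(A[i][j]) - known
            else:
                if U[j][j] == 0:
                    raise ZeroDivisionError("zero on the diagonal of U")
                L[i][j] = (Fraction(A[i][j]) - known) / U[j][j]
    return L, U

L, U = lu_decompose([[1, 2, 4], [3, 8, 14], [2, 6, 13]])
print(L)
print(U)
```

At each step only one unknown is involved, exactly as in the hand calculation, which is why a single pass over the entries suffices.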
Task
Find an LU decomposition of $\begin{pmatrix} 3 & 1 \\ -6 & -4 \end{pmatrix}$.
Your solution
Answer
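Let
\[ \begin{pmatrix} 3 & 1 \\ -6 & -4 \end{pmatrix} = LU = \begin{pmatrix} 1 & 0 \\ L_{21} & 1 \end{pmatrix} \begin{pmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{pmatrix} = \begin{pmatrix} U_{11} & U_{12} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} \end{pmatrix} \]
then, comparing the left and right hand sides row by row implies that $U_{11} = 3$, $U_{12} = 1$, $L_{21}U_{11} = -6$
which implies $L_{21} = -2$ and $L_{21}U_{12} + U_{22} = -4$ which implies that $U_{22} = -2$. Hence
\[ \begin{pmatrix} 3 & 1 \\ -6 & -4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 0 & -2 \end{pmatrix} \]
is an LU decomposition of $\begin{pmatrix} 3 & 1 \\ -6 & -4 \end{pmatrix}$.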
Task
Find an LU decomposition of
\[ \begin{pmatrix} 3 & 1 & 6 \\ -6 & 0 & -16 \\ 0 & 8 & -17 \end{pmatrix}. \]
Your solution
Answer
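Using material from the worked example in the notes we set
\[ \begin{pmatrix} 3 & 1 & 6 \\ -6 & 0 & -16 \\ 0 & 8 & -17 \end{pmatrix} = \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} & L_{21}U_{13} + U_{23} \\ L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & L_{31}U_{13} + L_{32}U_{23} + U_{33} \end{pmatrix} \]
and comparing elements row by row we see that
\[ \begin{array}{lll} U_{11} = 3, & U_{12} = 1, & U_{13} = 6, \\ L_{21} = -2, & U_{22} = 2, & U_{23} = -4, \\ L_{31} = 0, & L_{32} = 4, & U_{33} = -1 \end{array} \]
and it follows that
\[ \begin{pmatrix} 3 & 1 & 6 \\ -6 & 0 & -16 \\ 0 & 8 & -17 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 4 & 1 \end{pmatrix} \begin{pmatrix} 3 & 1 & 6 \\ 0 & 2 & -4 \\ 0 & 0 & -1 \end{pmatrix} \]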
is an LU decomposition of the given matrix.
2. Using LU decomposition to solve systems of equations

Once a matrix A has been decomposed into lower and upper triangular parts it is possible to obtain
the solution to AX = B in a direct way. The procedure can be summarised as follows
• Given A, find L and U so that A = LU . Hence LU X = B.
• Let Y = U X so that LY = B. Solve this triangular system for Y .
• Finally solve the triangular system U X = Y for X.
The benefit of this approach is that we only ever need to solve triangular systems. The cost is that
we have to solve two of them.
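The two triangular solves in this procedure can be sketched in Python as follows. The factors L and U are those found for the worked example in subsection 1, the right-hand side (3, 13, 4) is chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def forward_substitute(L, b):
    """Solve LY = B for Y, working from the top row down."""
    y = []
    for i, row in enumerate(L):
        known = sum(row[j] * y[j] for j in range(i))
        y.append((Fraction(b[i]) - known) / row[i])
    return y

def back_substitute(U, y):
    """Solve UX = Y for X, working from the bottom row up."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - known) / U[i][i]
    return x

L = [[1, 0, 0], [3, 1, 0], [2, 1, 1]]
U = [[1, 2, 4], [0, 2, 2], [0, 0, 3]]
B = [3, 13, 4]

Y = forward_substitute(L, B)  # intermediate vector Y = UX
X = back_substitute(U, Y)
print(Y, X)
```

Once L and U are known, a new right-hand side B needs only these two cheap substitution passes, not a fresh elimination.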
[Here we solve only small systems; a large system is presented in Engineering Example 1 on page 62.]
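Example 6
Find the solution $X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ of the system
\[ \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 13 \\ 4 \end{pmatrix}. \]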
Solution
• The first step is to calculate the LU decomposition of the coefficient matrix on the left-hand
side. In this case that job has already been done since this is the matrix we considered earlier.
We found that
\[ L = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \end{pmatrix}, \quad U = \begin{pmatrix} 1 & 2 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{pmatrix}. \]
• The next step is to solve LY = B for the vector $Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$. That is we consider
\[ LY = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 13 \\ 4 \end{pmatrix} = B \]
which can be solved by forward substitution. From the top equation we see that $y_1 = 3$.
The middle equation states that $3y_1 + y_2 = 13$ and hence $y_2 = 4$. Finally the bottom line
says that $2y_1 + y_2 + y_3 = 4$ from which we see that $y_3 = -6$.
• Now that we have found Y we finish the procedure by solving U X = Y for X. That is we
solve
\[ UX = \begin{pmatrix} 1 & 2 & 4 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ -6 \end{pmatrix} = Y \]
by using back substitution. Starting with the bottom equation we see that $3x_3 = -6$ so
clearly $x_3 = -2$. The middle equation implies that $2x_2 + 2x_3 = 4$ and it follows that $x_2 = 4$.
The top equation states that $x_1 + 2x_2 + 4x_3 = 3$ and consequently $x_1 = 3$.
Therefore we have found that the solution to the system of simultaneous equations
\[ \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 13 \\ 4 \end{pmatrix} \quad \text{is} \quad X = \begin{pmatrix} 3 \\ 4 \\ -2 \end{pmatrix}. \]
Task
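Use the LU decomposition you found earlier in the last Task (page 24) to solve
\[ \begin{pmatrix} 3 & 1 & 6 \\ -6 & 0 & -16 \\ 0 & 8 & -17 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 4 \\ 17 \end{pmatrix}. \]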
Your solution
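Answer
We found earlier that the coefficient matrix is equal to
\[ LU = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 4 & 1 \end{pmatrix} \begin{pmatrix} 3 & 1 & 6 \\ 0 & 2 & -4 \\ 0 & 0 & -1 \end{pmatrix}. \]
First we solve LY = B for Y . We have
\[ \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 4 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 4 \\ 17 \end{pmatrix}. \]
The top line implies that $y_1 = 0$. The middle line states that $-2y_1 + y_2 = 4$ and therefore $y_2 = 4$.
The last line tells us that $4y_2 + y_3 = 17$ and therefore $y_3 = 1$.
Finally we solve U X = Y for X. We have
\[ \begin{pmatrix} 3 & 1 & 6 \\ 0 & 2 & -4 \\ 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 4 \\ 1 \end{pmatrix}. \]
The bottom line shows that $x_3 = -1$. The middle line then shows that $x_2 = 0$, and then the top
line gives us that $x_1 = 2$. The required solution is $X = \begin{pmatrix} 2 \\ 0 \\ -1 \end{pmatrix}$.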
3. Do matrices always have an LU decomposition?

No. Sometimes it is impossible to write a matrix in the form “lower triangular”×“upper triangular”.
Why not?
An invertible matrix A has an LU decomposition provided that all its leading submatrices have
non-zero determinants. The $k$th leading submatrix of A is denoted $A_k$ and is the k × k matrix found
by looking only at the top k rows and leftmost k columns. For example if
\[ A = \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix} \]
then the leading submatrices are
\[ A_1 = 1, \quad A_2 = \begin{pmatrix} 1 & 2 \\ 3 & 8 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 1 & 2 & 4 \\ 3 & 8 & 14 \\ 2 & 6 & 13 \end{pmatrix}. \]
The fact that this matrix A has an LU decomposition can be guaranteed in advance because none
of these determinants is zero:
\[ \begin{aligned} |A_1| &= 1, \\ |A_2| &= (1 \times 8) - (2 \times 3) = 2, \\ |A_3| &= \begin{vmatrix} 8 & 14 \\ 6 & 13 \end{vmatrix} - 2 \begin{vmatrix} 3 & 14 \\ 2 & 13 \end{vmatrix} + 4 \begin{vmatrix} 3 & 8 \\ 2 & 6 \end{vmatrix} = 20 - (2 \times 11) + (4 \times 2) = 6 \end{aligned} \]
(where the 3 × 3 determinant was found by expanding along the top row).
Example 7
Show that $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 1 & 3 & 4 \end{pmatrix}$ does not have an LU decomposition.
Solution
The second leading submatrix has determinant equal to
\[ \begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} = (1 \times 4) - (2 \times 2) = 0 \]
which means that an LU decomposition is not possible in this case.
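The determinant test can be automated for small matrices. The sketch below computes the determinants of the leading submatrices by cofactor expansion along the top row, which is adequate for hand-sized examples only; the function names are illustrative, and the matrix passed in is assumed to be invertible, as in the statement of the test.

```python
def det(M):
    """Determinant by cofactor expansion along the top row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def leading_determinants(A):
    """Determinants |A_1|, |A_2|, ..., |A_n| of the leading submatrices of A."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def passes_lu_test(A):
    """True if every leading submatrix of A has a non-zero determinant."""
    return all(d != 0 for d in leading_determinants(A))

A = [[1, 2, 4], [3, 8, 14], [2, 6, 13]]
C = [[1, 2, 3], [2, 4, 5], [1, 3, 4]]   # the matrix of Example 7
print(leading_determinants(A), passes_lu_test(A))
print(leading_determinants(C), passes_lu_test(C))
```

For the matrix of Example 7 the second determinant in the list is zero, in agreement with the hand calculation.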
Task
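Which, if any, of these matrices have an LU decomposition?
\[ \text{(a) } A = \begin{pmatrix} 3 & 2 \\ 0 & 1 \end{pmatrix}, \quad \text{(b) } A = \begin{pmatrix} 0 & 1 \\ 3 & 2 \end{pmatrix}, \quad \text{(c) } A = \begin{pmatrix} 1 & -3 & 7 \\ -2 & 6 & 1 \\ 0 & 3 & -2 \end{pmatrix}. \]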
Your solution
(a)
Answer
|A1 | = 3 and |A2 | = |A| = 3. Neither of these is zero, so A does have an LU decomposition.
Your solution
(b)
Answer
|A1 | = 0 so A does not have an LU decomposition.
Your solution
(c)
Answer
|A1 | = 1, |A2 | = 6 − 6 = 0, so A does not have an LU decomposition.
Can we get around this problem?

Yes. It is always possible to re-order the rows of an invertible matrix so that all of the leading submatrices
have non-zero determinants.
Example 8
Reorder the rows of $A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 1 & 3 & 4 \end{pmatrix}$ so that the reordered matrix has an LU
decomposition.
Solution
Swapping the first and second rows does not help us since the second leading submatrix will still
have a zero determinant. Let us swap the second and third rows and consider
\[ B = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 4 \\ 2 & 4 & 5 \end{pmatrix}. \]
The leading submatrices are
\[ B_1 = 1, \quad B_2 = \begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}, \quad B_3 = B. \]
Now $|B_1| = 1$, $|B_2| = 3 \times 1 - 2 \times 1 = 1$ and (expanding along the first row)
\[ |B_3| = 1(15 - 16) - 2(5 - 8) + 3(4 - 6) = -1 + 6 - 6 = -1. \]
All three of these determinants are non-zero and we conclude that B does have an LU decomposition.
 
Task
Reorder the rows of $A = \begin{pmatrix} 1 & -3 & 7 \\ -2 & 6 & 1 \\ 0 & 3 & -2 \end{pmatrix}$ so that the reordered matrix has an
LU decomposition.
Your solution
Answer
Let us swap the second and third rows and consider
\[ B = \begin{pmatrix} 1 & -3 & 7 \\ 0 & 3 & -2 \\ -2 & 6 & 1 \end{pmatrix}. \]
The leading submatrices are
\[ B_1 = 1, \quad B_2 = \begin{pmatrix} 1 & -3 \\ 0 & 3 \end{pmatrix}, \quad B_3 = B \]
which have determinants 1, 3 and 45 respectively. All of these are non-zero and we conclude that
B does indeed have an LU decomposition.
Exercises
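1. Calculate LU decompositions for each of these matrices
\[ \text{(a) } A = \begin{pmatrix} 2 & 1 \\ -4 & -6 \end{pmatrix} \quad \text{(b) } A = \begin{pmatrix} 2 & 1 & -4 \\ 2 & 2 & -2 \\ 6 & 3 & -11 \end{pmatrix} \quad \text{(c) } A = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 8 & 5 \\ 1 & 11 & 4 \end{pmatrix} \]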
2. Check each answer in Question 1, by multiplying out LU to show that the product equals A.
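3. Using the answers obtained in Question 1, solve the following systems of equations.
\[ \text{(a) } \begin{pmatrix} 2 & 1 \\ -4 & -6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \]
\[ \text{(b) } \begin{pmatrix} 2 & 1 & -4 \\ 2 & 2 & -2 \\ 6 & 3 & -11 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \\ 11 \end{pmatrix} \]
\[ \text{(c) } \begin{pmatrix} 1 & 3 & 2 \\ 2 & 8 & 5 \\ 1 & 11 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix} \]
4. Consider $A = \begin{pmatrix} 1 & 6 & 2 \\ 2 & 12 & 5 \\ -1 & -3 & -1 \end{pmatrix}$
(a) Show that A does not have an LU decomposition.
(b) Re-order the rows of A and find an LU decomposition of the new matrix.
(c) Hence solve
\[ \begin{aligned} x_1 + 6x_2 + 2x_3 &= 9 \\ 2x_1 + 12x_2 + 5x_3 &= -4 \\ -x_1 - 3x_2 - x_3 &= 17 \end{aligned} \]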
Answers
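1. (a) We let
\[ \begin{pmatrix} 2 & 1 \\ -4 & -6 \end{pmatrix} = LU = \begin{pmatrix} 1 & 0 \\ L_{21} & 1 \end{pmatrix} \begin{pmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{pmatrix} = \begin{pmatrix} U_{11} & U_{12} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} \end{pmatrix}. \]
Comparing the left-hand and right-hand sides row by row gives us that $U_{11} = 2$, $U_{12} = 1$,
$L_{21}U_{11} = -4$ which implies that $L_{21} = -2$ and, finally, $L_{21}U_{12} + U_{22} = -6$ from which
we see that $U_{22} = -4$. Hence
\[ \begin{pmatrix} 2 & 1 \\ -4 & -6 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 0 & -4 \end{pmatrix} \]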
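(b) We let
\[ \begin{pmatrix} 2 & 1 & -4 \\ 2 & 2 & -2 \\ 6 & 3 & -11 \end{pmatrix} = LU = \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} & L_{21}U_{13} + U_{23} \\ L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & L_{31}U_{13} + L_{32}U_{23} + U_{33} \end{pmatrix}. \]
Looking at the top row we see that $U_{11} = 2$, $U_{12} = 1$ and $U_{13} = -4$. Now, from the
second row, $L_{21} = 1$, $U_{22} = 1$ and $U_{23} = 2$. The last three unknowns come from the
bottom row: $L_{31} = 3$, $L_{32} = 0$ and $U_{33} = 1$. Hence
\[ \begin{pmatrix} 2 & 1 & -4 \\ 2 & 2 & -2 \\ 6 & 3 & -11 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & -4 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \]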
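(c) We let
\[ \begin{pmatrix} 1 & 3 & 2 \\ 2 & 8 & 5 \\ 1 & 11 & 4 \end{pmatrix} = LU = \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} & L_{21}U_{13} + U_{23} \\ L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & L_{31}U_{13} + L_{32}U_{23} + U_{33} \end{pmatrix}. \]
Looking at the top row we see that $U_{11} = 1$, $U_{12} = 3$ and $U_{13} = 2$. Now, from the
second row, $L_{21} = 2$, $U_{22} = 2$ and $U_{23} = 1$. The last three unknowns come from the
bottom row: $L_{31} = 1$, $L_{32} = 4$ and $U_{33} = -2$. Hence
\[ \begin{pmatrix} 1 & 3 & 2 \\ 2 & 8 & 5 \\ 1 & 11 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 4 & 1 \end{pmatrix} \begin{pmatrix} 1 & 3 & 2 \\ 0 & 2 & 1 \\ 0 & 0 & -2 \end{pmatrix} \]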
2. Direct multiplication provides the necessary check.
3.
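(a) We begin by solving
\[ \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \]
Clearly $y_1 = 1$ and therefore $y_2 = 4$. The values $y_1$ and $y_2$ appear on the right-hand side
of the second system we need to solve:
\[ \begin{pmatrix} 2 & 1 \\ 0 & -4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \end{pmatrix} \]
The second equation implies that $x_2 = -1$ and therefore, from the first equation, $x_1 = 1$.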
(b) We begin by solving the system
\[ \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \\ 11 \end{pmatrix}. \]
Starting with the top equation we see that $y_1 = 4$. The second equation then implies
that $y_2 = -4$ and then, from the third equation, $y_3 = -1$. These values now appear on
the right-hand side of the second system
\[ \begin{pmatrix} 2 & 1 & -4 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 4 \\ -4 \\ -1 \end{pmatrix}. \]
The bottom equation shows us that $x_3 = -1$. Moving up to the middle equation we
obtain $x_2 = -2$. The top equation yields $x_1 = 1$.
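(c) We begin by solving the system
\[ \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 4 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}. \]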
Starting with the top equation we see that y1 = 2. The second equation then implies
that y2 = −1 and then, from the third equation, y3 = 2. These values now appear on
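the right-hand side of the second system
\[ \begin{pmatrix} 1 & 3 & 2 \\ 0 & 2 & 1 \\ 0 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix}. \]
The bottom equation shows us that x3 = −1. Moving up to the middle equation we
obtain x2 = 0. The top equation yields x1 = 4.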
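The two-stage solve used in these answers (first LY = B from the top down, then UX = Y from the bottom up) can be sketched as follows. This is an illustrative sketch only; the names `forward_substitute` and `back_substitute` are hypothetical, and the factors are those found for part (c).

```python
def forward_substitute(lower, b):
    """Solve L y = b for lower-triangular L, working from the top equation down."""
    y = []
    for i, row in enumerate(lower):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def back_substitute(upper, y):
    """Solve U x = y for upper-triangular U, working from the bottom equation up."""
    n = len(upper)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(upper[i][k] * x[k] for k in range(i + 1, n))) / upper[i][i]
    return x


L = [[1, 0, 0], [2, 1, 0], [1, 4, 1]]
U = [[1, 3, 2], [0, 2, 1], [0, 0, -2]]
y = forward_substitute(L, [2, 3, 0])
x = back_substitute(U, y)
print(y)  # [2.0, -1.0, 2.0]
print(x)  # [4.0, 0.0, -1.0]
```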
4.
(a) The second leading submatrix has determinant 1 × 12 − 6 × 2 = 0 and this implies that
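A has no LU decomposition.
(b) Swapping the second and third rows gives $\begin{pmatrix} 1 & 6 & 2 \\ -1 & -3 & -1 \\ 2 & 12 & 5 \end{pmatrix}$. We let
\[ \begin{pmatrix} 1 & 6 & 2 \\ -1 & -3 & -1 \\ 2 & 12 & 5 \end{pmatrix} = LU = \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ L_{21}U_{11} & L_{21}U_{12} + U_{22} & L_{21}U_{13} + U_{23} \\ L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & L_{31}U_{13} + L_{32}U_{23} + U_{33} \end{pmatrix}. \]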
Looking at the top row we see that U11 = 1, U12 = 6 and U13 = 2. Now, from the
second row, L21 = −1, U22 = 3 and U23 = 1. The last three unknowns come from the
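bottom row: L31 = 2, L32 = 0 and U33 = 1. Hence
\[ \begin{pmatrix} 1 & 6 & 2 \\ -1 & -3 & -1 \\ 2 & 12 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 6 & 2 \\ 0 & 3 & 1 \\ 0 & 0 & 1 \end{pmatrix} \]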
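(c) We begin by solving the system
\[ \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 9 \\ 17 \\ -4 \end{pmatrix}. \]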
(Note that the second and third rows of the right-hand side vector have been swapped
too.) Starting with the top equation we see that y1 = 9. The second equation then
implies that y2 = 26 and then, from the third equation, y3 = −22. These values now
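appear on the right-hand side of the second system
\[ \begin{pmatrix} 1 & 6 & 2 \\ 0 & 3 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 9 \\ 26 \\ -22 \end{pmatrix}. \]
The bottom equation shows us that x3 = −22. Moving up to the middle equation we
obtain x2 = 16. The top equation yields x1 = −43.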

Matrix Norms 30.4
Introduction
A matrix norm is a number defined in terms of the entries of the matrix. The norm is a useful
quantity which can give important information about a matrix.
Prerequisites
Before starting this Section you should . . .
• be familiar with matrices and their use in writing systems of equations
• revise material on matrix inverses, be able to find the inverse of a 2 × 2 matrix, and know
when no inverse exists
• revise Gaussian elimination and partial pivoting
• be aware of the discussion of ill-conditioned and well-conditioned problems earlier in
Section 30.1
Learning Outcomes
On completion you should be able to . . .
• calculate norms and condition numbers of small matrices
• adjust certain systems of equations with a view to better conditioning
1. Matrix norms
The norm of a square matrix A is a non-negative real number denoted $\|A\|$. There are several
different ways of defining a matrix norm, but they all share the following properties:
1. $\|A\| \ge 0$ for any square matrix A.
2. $\|A\| = 0$ if and only if the matrix A = 0.
3. $\|kA\| = |k|\,\|A\|$, for any scalar k.
4. $\|A + B\| \le \|A\| + \|B\|$.
5. $\|AB\| \le \|A\|\,\|B\|$.
The norm of a matrix is a measure of how large its elements are. It is a way of determining the
“size” of a matrix that is not necessarily related to how many rows or columns the matrix has.
Key Point 6
Matrix Norm
The norm of a matrix is a real number which is a measure of the magnitude of the matrix.
Anticipating the places where we will use norms later, it is sufficient at this stage to restrict our
attention to matrices with only real-valued entries. There is no need to consider complex numbers
at this stage.
In the definitions of norms below we will use this notation for the elements of an n × n matrix A
where
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn} \end{pmatrix} \]
The subscripts on a have the row number first, then the column number. The fact that $a_{rc}$
(row r, column c) is reminiscent of the word “arc” may be a help in remembering how the notation goes.
In this Section we will define three commonly used norms. We distinguish them with a subscript. All
three of them satisfy the five conditions listed above, but we will not concern ourselves with verifying
that fact.
The 1-norm
\[ \|A\|_1 = \max_{1 \le j \le n} \left( \sum_{i=1}^{n} |a_{ij}| \right) \]
(the maximum absolute column sum). Put simply, we sum the absolute values down each column
and then take the biggest answer.
Example 9
Calculate the 1-norm of $A = \begin{pmatrix} 1 & -7 \\ -2 & -3 \end{pmatrix}$.
Solution
The absolute column sums of A are 1 + |−2| = 1 + 2 = 3 and |−7| + |−3| = 7 + 3 = 10. The
larger of these is 10 and therefore $\|A\|_1 = 10$.
Example 10  
Calculate the 1-norm of $B = \begin{pmatrix} 5 & -4 & 2 \\ -1 & 2 & 3 \\ -2 & 1 & 0 \end{pmatrix}$.
Solution
Summing the absolute values down the columns of B we find that
\[ \|B\|_1 = \max(5 + 1 + 2,\; 4 + 2 + 1,\; 2 + 3 + 0) = \max(8, 7, 5) = 8 \]
Key Point 7
The 1-norm of a square matrix is the maximum of the absolute column sums.
(A useful reminder is that “1” is a tall, thin character and a column is a tall, thin quantity.)
The infinity-norm
\[ \|A\|_\infty = \max_{1 \le i \le n} \left( \sum_{j=1}^{n} |a_{ij}| \right) \]
(the maximum absolute row sum). Put simply, we sum the absolute values along each row and then
take the biggest answer.
Example 11
Calculate the infinity-norm of $A = \begin{pmatrix} 1 & -7 \\ -2 & -3 \end{pmatrix}$.
Solution
The absolute row sums of A are 1 + |−7| = 8 and |−2| + |−3| = 5. The larger of these is 8 and
therefore $\|A\|_\infty = 8$.
Example 12  
Calculate the infinity-norm of $B = \begin{pmatrix} 5 & -4 & 2 \\ -1 & 2 & 3 \\ -2 & 1 & 0 \end{pmatrix}$.
Solution
Summing the absolute values along the rows of B we find that
\[ \|B\|_\infty = \max(5 + 4 + 2,\; 1 + 2 + 3,\; 2 + 1 + 0) = \max(11, 6, 3) = 11 \]
Key Point 8
The infinity-norm of a square matrix is the maximum of the absolute row sums.
(A useful reminder is that “∞” is a short, wide character and a row is a short, wide quantity.)
The Euclidean norm
\[ \|A\|_E = \sqrt{ \sum_{i=1}^{n} \sum_{j=1}^{n} (a_{ij})^2 } \]
(the square root of the sum of all the squares). This is similar to ordinary “Pythagorean” length
where the size of a vector is found by taking the square root of the sum of the squares of all the
elements.
Example 13
Calculate the Euclidean norm of $A = \begin{pmatrix} 1 & -7 \\ -2 & -3 \end{pmatrix}$.
Solution
\[ \|A\|_E = \sqrt{1^2 + (-7)^2 + (-2)^2 + (-3)^2} = \sqrt{1 + 49 + 4 + 9} = \sqrt{63} \approx 7.937. \]
Example 14  
Calculate the Euclidean norm of $B = \begin{pmatrix} 5 & -4 & 2 \\ -1 & 2 & 3 \\ -2 & 1 & 0 \end{pmatrix}$.
Solution
\[ \|B\|_E = \sqrt{25 + 16 + 4 + 1 + 4 + 9 + 4 + 1 + 0} = \sqrt{64} = 8. \]
Key Point 9
The Euclidean norm of a square matrix is the square root of the sum of all the squares of the
elements.
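The three definitions translate directly into short functions. The sketch below is illustrative only: the names `norm_1`, `norm_inf` and `norm_euclid` are hypothetical, a matrix is assumed to be stored as a list of rows, and the matrices are A and B from Examples 9 to 14.

```python
import math


def norm_1(a):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def norm_inf(a):
    """Maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)


def norm_euclid(a):
    """Square root of the sum of the squares of all the entries."""
    return math.sqrt(sum(v * v for row in a for v in row))


A = [[1, -7], [-2, -3]]
B = [[5, -4, 2], [-1, 2, 3], [-2, 1, 0]]
print(norm_1(A), norm_inf(A), norm_euclid(A))  # 10, 8 and the square root of 63
print(norm_1(B), norm_inf(B), norm_euclid(B))  # 8 11 8.0
```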
Task
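Calculate the norms indicated of these matrices
\[ A = \begin{pmatrix} 2 & -8 \\ 3 & 1 \end{pmatrix} \text{ (1-norm)}, \qquad B = \begin{pmatrix} 3 & 6 & -1 \\ 3 & 1 & 0 \\ 2 & 4 & -7 \end{pmatrix} \text{ (infinity-norm)}, \]
\[ C = \begin{pmatrix} 1 & 7 & 3 \\ 4 & -2 & -2 \\ -2 & -1 & 1 \end{pmatrix} \text{ (Euclidean norm)}. \]
Your solution
Answer
\[ \|A\|_1 = \max(2 + 3,\; 8 + 1) = 9, \]
\[ \|B\|_\infty = \max(3 + 6 + 1,\; 3 + 1 + 0,\; 2 + 4 + 7) = 13, \]
\[ \|C\|_E = \sqrt{1^2 + 7^2 + 3^2 + 4^2 + (-2)^2 + (-2)^2 + (-2)^2 + (-1)^2 + 1^2} = \sqrt{89} \approx 9.434 \]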
Other norms
Any definition you can think of which satisfies the five conditions mentioned at the beginning of this
Section is a definition of a norm. There are many possibilities, but the three given above are
among the most commonly used.
2. Condition numbers
The condition number of an invertible matrix A is defined to be
\[ \kappa(A) = \|A\|\,\|A^{-1}\|. \]
This quantity is always bigger than (or equal to) 1.
We must use the same type of norm twice on the right-hand side of the above equation. Sometimes
the notation is adjusted to make it clear which norm is being used, for example if we use the infinity
norm we might write
\[ \kappa_\infty(A) = \|A\|_\infty\,\|A^{-1}\|_\infty. \]
Example 15
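Use the norm indicated to calculate the condition number of the given matrices.
(a) $A = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}$; 1-norm. (b) $A = \begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix}$; Euclidean norm.
(c) $B = \begin{pmatrix} -3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 2 \end{pmatrix}$; infinity-norm.
Solution
(a) $\|A\|_1 = \max(2 + 1,\; 3 + 1) = 4$,
\[ A^{-1} = \frac{1}{-2 - 3} \begin{pmatrix} -1 & -3 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} \frac{1}{5} & \frac{3}{5} \\ \frac{1}{5} & -\frac{2}{5} \end{pmatrix} \]
$\therefore \|A^{-1}\|_1 = \max(\frac{1}{5} + \frac{1}{5},\; \frac{3}{5} + \frac{2}{5}) = 1$.
Therefore $\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = 4 \times 1 = 4$.
(b) $\|A\|_E = \sqrt{2^2 + 3^2 + 1^2 + (-1)^2} = \sqrt{15}$. We can re-use $A^{-1}$ from above to see that
\[ \|A^{-1}\|_E = \sqrt{ \left(\frac{1}{5}\right)^2 + \left(\frac{3}{5}\right)^2 + \left(\frac{1}{5}\right)^2 + \left(\frac{-2}{5}\right)^2 } = \sqrt{\frac{15}{25}}. \]
Therefore $\kappa_E(A) = \|A\|_E \|A^{-1}\|_E = \sqrt{15} \times \sqrt{\frac{15}{25}} = \frac{15}{\sqrt{25}} = \frac{15}{5} = 3$.
(c) $\|B\|_\infty = \max(3, 4, 2) = 4$.
\[ B^{-1} = \begin{pmatrix} -\frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & 0 & \frac{1}{2} \end{pmatrix} \]
so $\|B^{-1}\|_\infty = \max(\frac{1}{3}, \frac{1}{4}, \frac{1}{2}) = \frac{1}{2}$. Therefore $\kappa_\infty(B) = \|B\|_\infty \|B^{-1}\|_\infty = 4 \times \frac{1}{2} = 2$.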
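For 2 × 2 matrices the whole calculation is short enough to sketch in code. The following is an illustration only, using exact fractions so that the result can be compared with the hand calculation; the names `inverse_2x2`, `norm_1` and `cond_1` are hypothetical.

```python
from fractions import Fraction


def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix via the determinant formula."""
    (p, q), (r, s) = a
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("matrix is not invertible, so it has no condition number")
    return [[s / det, -q / det], [-r / det, p / det]]


def norm_1(a):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def cond_1(a):
    """Condition number using the 1-norm for both factors."""
    return norm_1(a) * norm_1(inverse_2x2(a))


A = [[2, 3], [1, -1]]   # matrix from part (a) of Example 15
print(cond_1(A))        # 4
```

Note that the same norm function is applied to both the matrix and its inverse, as the definition requires.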
Task
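Calculate the condition numbers of these matrices, using the norm indicated
\[ A = \begin{pmatrix} 2 & -8 \\ 3 & 1 \end{pmatrix} \text{ (1-norm)}, \qquad B = \begin{pmatrix} 3 & 6 \\ 1 & 0 \end{pmatrix} \text{ (infinity-norm)}. \]
Your solution
Answer
$A^{-1} = \frac{1}{2 + 24} \begin{pmatrix} 1 & 8 \\ -3 & 2 \end{pmatrix}$ so $\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \max(5, 9) \times \max(\frac{4}{26}, \frac{10}{26}) = 9 \times \frac{10}{26} = \frac{45}{13}$.
$B^{-1} = \frac{1}{0 - 6} \begin{pmatrix} 0 & -6 \\ -1 & 3 \end{pmatrix}$ so $\kappa_\infty(B) = \|B\|_\infty \|B^{-1}\|_\infty = \max(9, 1) \times \max(1, \frac{4}{6}) = 9$.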
Condition numbers and conditioning

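As the name might suggest, the condition number gives us information regarding how
well-conditioned a problem is. Consider this example
\[ \begin{pmatrix} 1 & 10^4 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 10^4 \\ 1 \end{pmatrix}. \]
It is not hard to verify that the exact solution to this problem is
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \frac{10000}{10002} \\ \frac{10001}{10002} \end{pmatrix} = \begin{pmatrix} 0.999800\ldots \\ 0.999900\ldots \end{pmatrix}. \]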
Example 16
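Using the 1-norm find the condition number of $\begin{pmatrix} 1 & 10^4 \\ -1 & 2 \end{pmatrix}$.
Solution
Firstly, $\|A\|_1 = 2 + 10^4$. Also
\[ A^{-1} = \frac{1}{2 + 10^4} \begin{pmatrix} 2 & -10^4 \\ 1 & 1 \end{pmatrix} \quad \therefore \quad \|A^{-1}\|_1 = \frac{1}{2 + 10^4}(1 + 10^4). \]
Hence $\kappa_1(A) = 1 + 10^4 = 10001$.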
The fact that this number is large is the indication that the problem involving A is an ill-conditioned
one. Suppose we consider finding its solution by Gaussian elimination, using 3 significant figures
throughout. Eliminating the non-zero in the bottom left corner gives
\[ \begin{pmatrix} 1 & 10^4 \\ 0 & 10^4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 10^4 \\ 10^4 \end{pmatrix} \]
which implies that x2 = 1 and x1 = 0. This is a poor approximation to the true solution and partial
pivoting will not help. We have altered the problem by a relatively tiny amount (that is, by neglecting
the fourth significant figure) and the result $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ has changed by a large amount. In other words
the problem is ill-conditioned.
One way that systems of equations can be made better conditioned is to scale the equations so that
all the rows have largest elements that are about the same size. In the matrix
$A = \begin{pmatrix} 1 & 10^4 \\ -1 & 2 \end{pmatrix}$ the first row’s largest element is $10^4$, while the second row has
largest element equal to 2. This is not a happy situation.
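If we divide the first equation through by $10^4$ then we have
\[ \begin{pmatrix} 10^{-4} & 1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]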
then the top row has largest entry equal to 1, and the bottom row still has 2 as its largest entry.
These two values are of comparable size.
The solution to the system was found via pivoting (using 3 significant figures) in the Section con-
cerning Gaussian elimination to be x1 = x2 = 1, a pretty good approximation to the exact values.
The matrix in this second version of the problem is much better conditioned.
Example 17
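Using the 1-norm find the condition number of $\begin{pmatrix} 10^{-4} & 1 \\ -1 & 2 \end{pmatrix}$.
Solution
The 1-norm of A is easily seen to be $\|A\|_1 = 3$. We also need
\[ A^{-1} = \frac{1}{2 \times 10^{-4} + 1} \begin{pmatrix} 2 & -1 \\ 1 & 10^{-4} \end{pmatrix} \quad \therefore \quad \|A^{-1}\|_1 = \frac{3}{2 \times 10^{-4} + 1}. \]
Hence
\[ \kappa_1(A) = \frac{9}{2 \times 10^{-4} + 1} \approx 8.998 \]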
This condition number is much smaller than the earlier value of 10001, and this shows us that the
second version of the system of equations is better conditioned.
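The effect of scaling the first equation can be checked numerically. The sketch below is illustrative only: it repeats a small 2 × 2 helper (hypothetical names `inverse_2x2`, `norm_1`, `cond_1`) and uses exact fractions so that the two condition numbers can be compared with Examples 16 and 17.

```python
from fractions import Fraction


def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix via the determinant formula."""
    (p, q), (r, s) = a
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]


def norm_1(a):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def cond_1(a):
    """Condition number using the 1-norm for both factors."""
    return norm_1(a) * norm_1(inverse_2x2(a))


A_original = [[1, 10**4], [-1, 2]]
# First row divided through by 10^4 (kept exact with Fraction).
A_scaled = [[Fraction(1, 10**4), 1], [-1, 2]]

print(cond_1(A_original))                  # 10001
print(round(float(cond_1(A_scaled)), 3))   # 8.998
```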
Exercises
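1. Calculate the indicated norm of the following matrices
(a) $A = \begin{pmatrix} 2 & -2 \\ 1 & -3 \end{pmatrix}$; 1-norm.
(b) $A = \begin{pmatrix} 2 & -2 \\ 1 & -3 \end{pmatrix}$; infinity-norm.
(c) $B = \begin{pmatrix} 2 & -3 \\ 1 & -2 \end{pmatrix}$; Euclidean norm.
(d) $C = \begin{pmatrix} 1 & -2 & 3 \\ 1 & 5 & 6 \\ 2 & -1 & 3 \end{pmatrix}$; infinity-norm.
(e) $C = \begin{pmatrix} 1 & -2 & 3 \\ 1 & 5 & 6 \\ 2 & -1 & 3 \end{pmatrix}$; 1-norm.
2. Use the norm indicated to calculate the condition number of the given matrices.
(a) $D = \begin{pmatrix} 4 & -2 \\ 6 & 0 \end{pmatrix}$; 1-norm.
(b) $E = \begin{pmatrix} -1 & 5 \\ 4 & 2 \end{pmatrix}$; Euclidean norm.
(c) $F = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{pmatrix}$; infinity-norm.
3. Why is it not sensible to ask what the condition number of $\begin{pmatrix} -1 & 3 \\ 2 & -6 \end{pmatrix}$ is?
4. Verify that the inverse of $G = \begin{pmatrix} 2 & 4 & -1 \\ 2 & 5 & 2 \\ -1 & -1 & 1 \end{pmatrix}$ is $\frac{1}{5} \begin{pmatrix} -7 & 3 & -13 \\ 4 & -1 & 6 \\ -3 & 2 & -2 \end{pmatrix}$.
Hence find the condition number of G using the 1-norm.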
5. (a) Calculate the condition number (use any norm you choose) of the coefficient matrix of
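the system
\[ \begin{pmatrix} 1 & 10^4 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix} \]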
and hence conclude that the problem as stated is ill-conditioned.
(b) Multiply one of the equations through by a suitably chosen constant so as to make the
system better conditioned. Calculate the condition number of the coefficient matrix in
your new system of equations.
Answers
1. (a) $\|A\|_1 = \max(2 + 1,\; 2 + 3) = 5$.
(b) $\|A\|_\infty = \max(2 + |-2|,\; 1 + |-3|) = 4$.
(c) $\|B\|_E = \sqrt{4 + 9 + 1 + 4} = \sqrt{18}$
(d) $\|C\|_\infty = \max(1 + 2 + 3,\; 1 + 5 + 6,\; 2 + 1 + 3) = 12$.
(e) $\|C\|_1 = \max(1 + 1 + 2,\; 2 + 5 + 1,\; 3 + 6 + 3) = 12$.
2. (a) To work out the condition number we need to find
\[ D^{-1} = \frac{1}{12} \begin{pmatrix} 0 & 2 \\ -6 & 4 \end{pmatrix}. \]
Given this we work out the condition number as the product of two norms as follows
\[ \kappa_1(D) = \|D\|_1 \|D^{-1}\|_1 = 10 \times \frac{1}{2} = 5. \]
(b) To work out the condition number we need to find
\[ E^{-1} = \frac{1}{-22} \begin{pmatrix} 2 & -5 \\ -4 & -1 \end{pmatrix}. \]
Given this we work out the condition number as the product of two norms as follows
\[ \kappa_E(E) = \|E\|_E \|E^{-1}\|_E = 6.782330 \times 0.308288 = 2.090909. \]
(c) Here $F^{-1} = \begin{pmatrix} \frac{1}{6} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & 0 & 1 \end{pmatrix}$ so that $\kappa_\infty(F) = \|F\|_\infty \|F^{-1}\|_\infty = 6 \times 1 = 6$.
3. The matrix is not invertible.
 
4. Verification is done by a direct multiplication to show that $GG^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.
Using the 1-norm we find that $\kappa_1(G) = \|G\|_1 \|G^{-1}\|_1 = 10 \times \frac{21}{5} = 42$.
5.
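(a) The inverse of the coefficient matrix is
\[ \frac{1}{3 - 2 \times 10^4} \begin{pmatrix} 3 & -10^4 \\ -2 & 1 \end{pmatrix} = \frac{-1}{19997} \begin{pmatrix} 3 & -10000 \\ -2 & 1 \end{pmatrix}. \]
Using the 1-norm the condition number of the coefficient matrix is
\[ (3 + 10^4) \times \frac{1}{19997}(1 + 10^4) = 5002.75 \]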
to 6 significant figures. This is a large condition number, and the given problem is not
well-conditioned.
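(b) Now we multiply the top equation through by $10^{-4}$ so that the system of equations
becomes
\[ \begin{pmatrix} 10^{-4} & 1 \\ 2 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 10^{-4} \\ 3 \end{pmatrix} \]
and the inverse of this new coefficient matrix is
\[ \frac{1}{3 \times 10^{-4} - 2} \begin{pmatrix} 3 & -1 \\ -2 & 10^{-4} \end{pmatrix} = \frac{-1}{1.9997} \begin{pmatrix} 3 & -1 \\ -2 & 0.0001 \end{pmatrix}. \]
Using the 1-norm again we find that the condition number of the new coefficient matrix
is
\[ 4 \times \frac{1}{1.9997}(5) = 10.0015 \]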
to 6 significant figures. This much smaller condition number implies that the second
problem is better conditioned.
Iterative Methods for
Systems of Equations 30.5
Introduction
There are occasions when direct methods (like Gaussian elimination or the use of an LU decompo-
sition) are not the best way to solve a system of equations. An alternative approach is to use an
iterative method. In this Section we will discuss some of the issues involved with iterative methods.
Prerequisites
Before starting this Section you should . . .
• revise matrices, especially the material in Workbook 8
• revise determinants
• revise matrix norms
Learning Outcomes
On completion you should be able to . . .
• approximate the solutions of simple systems of equations by iterative methods
• assess convergence properties of iterative methods
1. Iterative methods
Suppose we have the system of equations
AX = B.
The aim here is to find a sequence of approximations which gradually approach X. We will denote
these approximations
\[ X^{(0)},\; X^{(1)},\; X^{(2)},\; \dots,\; X^{(k)},\; \dots \]
where $X^{(0)}$ is our initial “guess”, and the hope is that after a short while these successive iterates
will be so close to each other that the process can be deemed to have converged to the required
solution X.
Key Point 10
An iterative method is one in which a sequence of approximations (or iterates) is produced. The
method is successful if these iterates converge to the true solution of the given problem.
It is convenient to split the matrix A into three parts. We write

A=L+D+U
where L consists of the elements of A strictly below the diagonal and zeros elsewhere; D is a diagonal
matrix consisting of the diagonal entries of A; and U consists of the elements of A strictly above
the diagonal. Note that L and U here are not the same matrices as appeared in the LU
decomposition! The current L and U are much easier to find.
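For example
\[ \underbrace{\begin{pmatrix} 3 & -4 \\ 2 & 1 \end{pmatrix}}_{A} = \underbrace{\begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}}_{L} + \underbrace{\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}}_{D} + \underbrace{\begin{pmatrix} 0 & -4 \\ 0 & 0 \end{pmatrix}}_{U} \]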
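and
\[ \underbrace{\begin{pmatrix} 2 & -6 & 1 \\ 3 & -2 & 0 \\ 4 & -1 & 7 \end{pmatrix}}_{A} = \underbrace{\begin{pmatrix} 0 & 0 & 0 \\ 3 & 0 & 0 \\ 4 & -1 & 0 \end{pmatrix}}_{L} + \underbrace{\begin{pmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 7 \end{pmatrix}}_{D} + \underbrace{\begin{pmatrix} 0 & -6 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}}_{U} \]
and, more generally for 3 × 3 matrices
\[ \underbrace{\begin{pmatrix} \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet \end{pmatrix}}_{A} = \underbrace{\begin{pmatrix} 0 & 0 & 0 \\ \bullet & 0 & 0 \\ \bullet & \bullet & 0 \end{pmatrix}}_{L} + \underbrace{\begin{pmatrix} \bullet & 0 & 0 \\ 0 & \bullet & 0 \\ 0 & 0 & \bullet \end{pmatrix}}_{D} + \underbrace{\begin{pmatrix} 0 & \bullet & \bullet \\ 0 & 0 & \bullet \\ 0 & 0 & 0 \end{pmatrix}}_{U}. \]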
The Jacobi iteration

The simplest iterative method is called Jacobi iteration and the basic idea is to use the
A = L + D + U partitioning of A to write AX = B in the form
\[ DX = -(L + U)X + B. \]
We use this equation as the motivation to define the iterative process
\[ DX^{(k+1)} = -(L + U)X^{(k)} + B \]
which gives $X^{(k+1)}$ as long as D has no zeros down its diagonal, that is as long as D is invertible.
This is Jacobi iteration.
Key Point 11
The Jacobi iteration for approximating the solution of AX = B where A = L + D + U is given
by
\[ X^{(k+1)} = -D^{-1}(L + U)X^{(k)} + D^{-1}B \]
Example 18  
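Use the Jacobi iteration to approximate the solution $X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ of
\[ \begin{pmatrix} 8 & 2 & 4 \\ 3 & 5 & 1 \\ 2 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix}. \]
Use the initial guess $X^{(0)} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$.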
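Solution
In this case $D = \begin{pmatrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{pmatrix}$ and $L + U = \begin{pmatrix} 0 & 2 & 4 \\ 3 & 0 & 1 \\ 2 & 1 & 0 \end{pmatrix}$.
First iteration.
The first iteration is $DX^{(1)} = -(L + U)X^{(0)} + B$, or in full
\[ \begin{pmatrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} = \begin{pmatrix} 0 & -2 & -4 \\ -3 & 0 & -1 \\ -2 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(0)} \\ x_2^{(0)} \\ x_3^{(0)} \end{pmatrix} + \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix} = \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix}, \]
since the initial guess was $x_1^{(0)} = x_2^{(0)} = x_3^{(0)} = 0$.
Taking this information row by row we see that
\[ \begin{aligned} 8x_1^{(1)} &= -16 & \therefore\; x_1^{(1)} &= -2 \\ 5x_2^{(1)} &= 4 & \therefore\; x_2^{(1)} &= 0.8 \\ 4x_3^{(1)} &= -12 & \therefore\; x_3^{(1)} &= -3 \end{aligned} \]
Thus the first Jacobi iteration gives us $X^{(1)} = \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} = \begin{pmatrix} -2 \\ 0.8 \\ -3 \end{pmatrix}$ as an approximation to X.
Second iteration.
The second iteration is $DX^{(2)} = -(L + U)X^{(1)} + B$, or in full
\[ \begin{pmatrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} = \begin{pmatrix} 0 & -2 & -4 \\ -3 & 0 & -1 \\ -2 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} + \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix}. \]
\[ \begin{aligned} 8x_1^{(2)} &= -2x_2^{(1)} - 4x_3^{(1)} - 16 = -2(0.8) - 4(-3) - 16 = -5.6 & \therefore\; x_1^{(2)} &= -0.7 \\ 5x_2^{(2)} &= -3x_1^{(1)} - x_3^{(1)} + 4 = -3(-2) - (-3) + 4 = 13 & \therefore\; x_2^{(2)} &= 2.6 \\ 4x_3^{(2)} &= -2x_1^{(1)} - x_2^{(1)} - 12 = -2(-2) - 0.8 - 12 = -8.8 & \therefore\; x_3^{(2)} &= -2.2 \end{aligned} \]
Therefore the second iterate approximating X is $X^{(2)} = \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} = \begin{pmatrix} -0.7 \\ 2.6 \\ -2.2 \end{pmatrix}$.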
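Third iteration.
The third iteration is $DX^{(3)} = -(L + U)X^{(2)} + B$, or in full
\[ \begin{pmatrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(3)} \\ x_2^{(3)} \\ x_3^{(3)} \end{pmatrix} = \begin{pmatrix} 0 & -2 & -4 \\ -3 & 0 & -1 \\ -2 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} + \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix} \]
\[ \begin{aligned} 8x_1^{(3)} &= -2x_2^{(2)} - 4x_3^{(2)} - 16 = -2(2.6) - 4(-2.2) - 16 = -12.4 & \therefore\; x_1^{(3)} &= -1.55 \\ 5x_2^{(3)} &= -3x_1^{(2)} - x_3^{(2)} + 4 = -3(-0.7) - (-2.2) + 4 = 8.3 & \therefore\; x_2^{(3)} &= 1.66 \\ 4x_3^{(3)} &= -2x_1^{(2)} - x_2^{(2)} - 12 = -2(-0.7) - 2.6 - 12 = -13.2 & \therefore\; x_3^{(3)} &= -3.3 \end{aligned} \]
Therefore the third iterate approximating X is $X^{(3)} = \begin{pmatrix} x_1^{(3)} \\ x_2^{(3)} \\ x_3^{(3)} \end{pmatrix} = \begin{pmatrix} -1.55 \\ 1.66 \\ -3.3 \end{pmatrix}$.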
More iterations ...

Three iterations is plenty when doing these calculations by hand! But the repetitive nature of the
process is ideally suited to its implementation on a computer. It turns out that the next few iterates
are
\[ X^{(4)} = \begin{pmatrix} -0.765 \\ 2.39 \\ -2.64 \end{pmatrix}, \quad X^{(5)} = \begin{pmatrix} -1.277 \\ 1.787 \\ -3.215 \end{pmatrix}, \quad X^{(6)} = \begin{pmatrix} -0.839 \\ 2.209 \\ -2.808 \end{pmatrix}, \]
to 3 d.p. Carrying on even further $X^{(20)} = \begin{pmatrix} x_1^{(20)} \\ x_2^{(20)} \\ x_3^{(20)} \end{pmatrix} = \begin{pmatrix} -0.9959 \\ 2.0043 \\ -2.9959 \end{pmatrix}$, to 4 d.p. After about 40
iterations successive iterates are equal to 4 d.p. Continuing the iteration even further causes the
iterates to agree to more and more decimal places. The method converges to the exact answer
\[ X = \begin{pmatrix} -1 \\ 2 \\ -3 \end{pmatrix}. \]
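Since the repetitive nature of the process suits a computer, here is a minimal sketch of one way to code it. The function name `jacobi_step` is hypothetical; each new component is computed from the previous iterate only, which is exactly the row-by-row form of $DX^{(k+1)} = -(L + U)X^{(k)} + B$.

```python
def jacobi_step(a, b, x):
    """One Jacobi sweep: every new component uses only the old iterate x."""
    n = len(a)
    return [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)]


A = [[8, 2, 4], [3, 5, 1], [2, 1, 4]]   # system from Example 18
B = [-16, 4, -12]

x = [0.0, 0.0, 0.0]                     # initial guess X(0)
for k in range(1, 7):
    x = jacobi_step(A, B, x)
    print(k, [round(v, 3) for v in x])  # compare with X(1), ..., X(6) above
```

Running the loop for more sweeps shows the slow drift of the iterates towards the exact solution.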
The following Task involves calculating just two iterations of the Jacobi method.
Task
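Carry out two iterations of the Jacobi method to approximate the solution of
\[ \begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
with the initial guess $X^{(0)} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.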
Your solution
First iteration:
Answer
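The first iteration is $DX^{(1)} = -(L + U)X^{(0)} + B$, that is,
\[ \begin{pmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(0)} \\ x_2^{(0)} \\ x_3^{(0)} \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
from which it follows that $X^{(1)} = \begin{pmatrix} 0.75 \\ 1 \\ 1.25 \end{pmatrix}$.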
Your solution
Second iteration:
Answer
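The second iteration is $DX^{(2)} = -(L + U)X^{(1)} + B$, that is,
\[ \begin{pmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
from which it follows that $X^{(2)} = \begin{pmatrix} 0.8125 \\ 1 \\ 1.1875 \end{pmatrix}$.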
Notice that at each iteration the first thing we do is get a new approximation for x1 and then we
continue to use the old approximation to x1 in subsequent calculations for that iteration! Only at
the next iteration do we use the new value. Similarly, we continue to use an old approximation to x2
even after we have worked out a new one. And so on.
Given that the iterative process is supposed to improve our approximations why not use the better
values straight away? This observation is the motivation for what follows.
Gauss-Seidel iteration
The approach here is very similar to that used in Jacobi iteration. The only difference is that we use
new approximations to the entries of X as soon as they are available. As we will see in the Example
below, this means rearranging (L + D + U )X = B slightly differently from what we did for Jacobi.
We write
\[ (D + L)X = -UX + B \]
and use this as the motivation to define the iteration
\[ (D + L)X^{(k+1)} = -UX^{(k)} + B. \]
Key Point 12
The Gauss-Seidel iteration for approximating the solution of AX = B is given by
\[ X^{(k+1)} = -(D + L)^{-1}UX^{(k)} + (D + L)^{-1}B \]
Example 19 which follows revisits the system of equations we saw earlier in this Section in Example
18.
Example 19
 
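Use the Gauss-Seidel iteration to approximate the solution $X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ of
\[ \begin{pmatrix} 8 & 2 & 4 \\ 3 & 5 & 1 \\ 2 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix}. \]
Use the initial guess $X^{(0)} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$.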
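Solution
In this case $D + L = \begin{pmatrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{pmatrix}$ and $U = \begin{pmatrix} 0 & 2 & 4 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$.
First iteration.
The first iteration is $(D + L)X^{(1)} = -UX^{(0)} + B$, or in full
\[ \begin{pmatrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} = \begin{pmatrix} 0 & -2 & -4 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(0)} \\ x_2^{(0)} \\ x_3^{(0)} \end{pmatrix} + \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix} = \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix}, \]
since the initial guess was $x_1^{(0)} = x_2^{(0)} = x_3^{(0)} = 0$.
\[ \begin{aligned} 8x_1^{(1)} &= -16 & \therefore\; x_1^{(1)} &= -2 \\ 3x_1^{(1)} + 5x_2^{(1)} &= 4 \quad \therefore\; 5x_2^{(1)} = -3(-2) + 4 & \therefore\; x_2^{(1)} &= 2 \\ 2x_1^{(1)} + x_2^{(1)} + 4x_3^{(1)} &= -12 \quad \therefore\; 4x_3^{(1)} = -2(-2) - 2 - 12 & \therefore\; x_3^{(1)} &= -2.5 \end{aligned} \]
(Notice how the new approximations to x1 and x2 were used immediately after they were found.)
Thus the first Gauss-Seidel iteration gives us $X^{(1)} = \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} = \begin{pmatrix} -2 \\ 2 \\ -2.5 \end{pmatrix}$ as an approximation to
X.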
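Second iteration.
The second iteration is $(D + L)X^{(2)} = -UX^{(1)} + B$, or in full
\[ \begin{pmatrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} = \begin{pmatrix} 0 & -2 & -4 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} + \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix} \]
\[ \begin{aligned} 8x_1^{(2)} &= -2x_2^{(1)} - 4x_3^{(1)} - 16 & \therefore\; x_1^{(2)} &= -1.25 \\ 3x_1^{(2)} + 5x_2^{(2)} &= -x_3^{(1)} + 4 & \therefore\; x_2^{(2)} &= 2.05 \\ 2x_1^{(2)} + x_2^{(2)} + 4x_3^{(2)} &= -12 & \therefore\; x_3^{(2)} &= -2.8875 \end{aligned} \]
Therefore the second iterate approximating X is $X^{(2)} = \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} = \begin{pmatrix} -1.25 \\ 2.05 \\ -2.8875 \end{pmatrix}$.
Third iteration.
The third iteration is $(D + L)X^{(3)} = -UX^{(2)} + B$, or in full
\[ \begin{pmatrix} 8 & 0 & 0 \\ 3 & 5 & 0 \\ 2 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(3)} \\ x_2^{(3)} \\ x_3^{(3)} \end{pmatrix} = \begin{pmatrix} 0 & -2 & -4 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} + \begin{pmatrix} -16 \\ 4 \\ -12 \end{pmatrix}. \]
\[ \begin{aligned} 8x_1^{(3)} &= -2x_2^{(2)} - 4x_3^{(2)} - 16 & \therefore\; x_1^{(3)} &= -1.0687 \\ 3x_1^{(3)} + 5x_2^{(3)} &= -x_3^{(2)} + 4 & \therefore\; x_2^{(3)} &= 2.0187 \\ 2x_1^{(3)} + x_2^{(3)} + 4x_3^{(3)} &= -12 & \therefore\; x_3^{(3)} &= -2.9703 \end{aligned} \]
to 4 d.p. Therefore the third iterate approximating X is
\[ X^{(3)} = \begin{pmatrix} x_1^{(3)} \\ x_2^{(3)} \\ x_3^{(3)} \end{pmatrix} = \begin{pmatrix} -1.0687 \\ 2.0187 \\ -2.9703 \end{pmatrix}. \]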
More iterations ...
Again, there is little to be learned from pushing this further by hand. Putting the procedure on a
computer and seeing how it progresses is instructive, however, and the iteration continues as follows:
\[ X^{(4)} = \begin{pmatrix} -1.0195 \\ 2.0058 \\ -2.9917 \end{pmatrix}, \quad X^{(5)} = \begin{pmatrix} -1.0056 \\ 2.0017 \\ -2.9976 \end{pmatrix}, \quad X^{(6)} = \begin{pmatrix} -1.0016 \\ 2.0005 \\ -2.9993 \end{pmatrix}, \]
\[ X^{(7)} = \begin{pmatrix} -1.0005 \\ 2.0001 \\ -2.9998 \end{pmatrix}, \quad X^{(8)} = \begin{pmatrix} -1.0001 \\ 2.0000 \\ -2.9999 \end{pmatrix}, \quad X^{(9)} = \begin{pmatrix} -1.0000 \\ 2.0000 \\ -3.0000 \end{pmatrix} \]
(to 4 d.p.). Subsequent iterates are equal to $X^{(9)}$ to this number of decimal places. The Gauss-Seidel
iteration has converged to 4 d.p. in 9 iterations. It took the Jacobi method almost 40 iterations to
achieve this!
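A minimal sketch of the Gauss-Seidel sweep is given below for comparison with the hand calculation. The function name `gauss_seidel_step` is hypothetical. The only change from a Jacobi sweep is that the working list is updated in place, so each new component is used as soon as it is available.

```python
def gauss_seidel_step(a, b, x):
    """One Gauss-Seidel sweep: new components are used as soon as they are found."""
    n = len(a)
    new = list(x)
    for i in range(n):
        # new[j] already holds the updated value for j < i and the old value for j > i.
        s = sum(a[i][j] * new[j] for j in range(n) if j != i)
        new[i] = (b[i] - s) / a[i][i]
    return new


A = [[8, 2, 4], [3, 5, 1], [2, 1, 4]]   # system from Example 19
B = [-16, 4, -12]

x = [0.0, 0.0, 0.0]                     # initial guess X(0)
for k in range(1, 10):
    x = gauss_seidel_step(A, B, x)
    print(k, [round(v, 4) for v in x])  # compare with X(1), ..., X(9) above
```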
Task
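Carry out two iterations of the Gauss-Seidel method to approximate the solution
of
\[ \begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
with the initial guess $X^{(0)} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.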
Your solution
First iteration
Answer
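The first iteration is $(D + L)X^{(1)} = -UX^{(0)} + B$, that is,
\[ \begin{pmatrix} 4 & 0 & 0 \\ -1 & 4 & 0 \\ -1 & -1 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(0)} \\ x_2^{(0)} \\ x_3^{(0)} \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
from which it follows that $X^{(1)} = \begin{pmatrix} 0.75 \\ 0.9375 \\ 1.1719 \end{pmatrix}$.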
Your solution
Second iteration
Answer
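The second iteration is $(D + L)X^{(2)} = -UX^{(1)} + B$, that is,
\[ \begin{pmatrix} 4 & 0 & 0 \\ -1 & 4 & 0 \\ -1 & -1 & 4 \end{pmatrix} \begin{pmatrix} x_1^{(2)} \\ x_2^{(2)} \\ x_3^{(2)} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ x_3^{(1)} \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
from which it follows that $X^{(2)} = \begin{pmatrix} 0.7773 \\ 0.9873 \\ 1.1912 \end{pmatrix}$.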
2. Do these iterative methods always work?

No. It is not difficult to invent examples where the iteration fails to approach the solution of AX = B.
The key point is related to matrix norms seen in the preceding Section.
The two iterative methods we encountered above are both special cases of the general form
\[ X^{(k+1)} = MX^{(k)} + N. \]
1. For the Jacobi method we choose $M = -D^{-1}(L + U)$ and $N = D^{-1}B$.
2. For the Gauss-Seidel method we choose $M = -(D + L)^{-1}U$ and $N = (D + L)^{-1}B$.
The following Key Point gives the main result.
Key Point 13
For the iterative process $X^{(k+1)} = MX^{(k)} + N$ the iteration will converge to a solution if the norm
of M is less than 1.
Care is required in understanding what Key Point 13 says. Remember that there are lots of different
ways of defining the norm of a matrix (we saw three of them). If you can find a norm (any norm)
such that the norm of M is less than 1, then the iteration will converge. It doesn’t matter if there
are other norms which give a value greater than 1, all that matters is that there is one norm that is
less than 1.
Key Point 13 above makes no reference to the starting “guess” $X^{(0)}$. The convergence of the iteration
is independent of where you start! (Of course, if we start with a really bad initial guess then we can
expect to need lots of iterations.)
Task
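Show that the Jacobi iteration used to approximate the solution of
\[ \begin{pmatrix} 4 & -1 & -1 \\ 1 & -5 & -2 \\ -1 & 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \]
is certain to converge. (Hint: calculate the norm of $-D^{-1}(L + U)$.)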
Your solution
Answer
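The Jacobi iteration matrix is
\[ -D^{-1}(L + U) = \begin{pmatrix} 4 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 2 \end{pmatrix}^{-1} \begin{pmatrix} 0 & 1 & 1 \\ -1 & 0 & 2 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0.25 & 0 & 0 \\ 0 & -0.2 & 0 \\ 0 & 0 & 0.5 \end{pmatrix} \begin{pmatrix} 0 & 1 & 1 \\ -1 & 0 & 2 \\ 1 & 0 & 0 \end{pmatrix} \]
\[ = \begin{pmatrix} 0 & 0.25 & 0.25 \\ 0.2 & 0 & -0.4 \\ 0.5 & 0 & 0 \end{pmatrix} \]
and the infinity norm of this matrix is the maximum of 0.25 + 0.25, 0.2 + 0.4 and 0.5, that is
\[ \| -D^{-1}(L + U) \|_\infty = 0.6 \]
which is less than 1 and therefore the iteration will converge.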
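The same check can be sketched in code: build the Jacobi iteration matrix entry by entry and take its infinity-norm. The names `jacobi_matrix` and `norm_inf` are hypothetical, and the sketch assumes no zero appears on the diagonal of A.

```python
def jacobi_matrix(a):
    """Entries of M = -D^{-1}(L + U): zero on the diagonal, -a_ij / a_ii elsewhere."""
    n = len(a)
    return [[0.0 if i == j else -a[i][j] / a[i][i] for j in range(n)] for i in range(n)]


def norm_inf(m):
    """Maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in m)


A = [[4, -1, -1], [1, -5, -2], [-1, 0, 2]]
M = jacobi_matrix(A)
for row in M:
    print(row)
print(norm_inf(M) < 1)  # True, so the Jacobi iteration is certain to converge
```

If the test fails for the infinity-norm, the result above still allows convergence to be established with a different norm, so a `False` here would not by itself prove divergence.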
Guaranteed convergence
If the matrix has the property that it is strictly diagonally dominant, which means that the diagonal
entry is larger in magnitude than the absolute sum of the other entries on that row, then both Jacobi
and Gauss-Seidel are guaranteed to converge. The reason for this is that if A is strictly diagonally
dominant then the iteration matrix M will have an infinity norm that is less than 1.
A small system is the subject of Example 20 below. A large system with slow convergence is the
subject of Engineering Example 1 on page 62.
Example 20
Show that $A = \begin{pmatrix} 4 & -1 & -1 \\ 1 & -5 & -2 \\ -1 & 0 & 2 \end{pmatrix}$ is strictly diagonally dominant.
Solution
Looking at the diagonal entry of each row in turn we see that
\[ \begin{aligned} 4 &> |-1| + |-1| = 2 \\ |-5| &> 1 + |-2| = 3 \\ 2 &> |-1| + 0 = 1 \end{aligned} \]
and this means that the matrix is strictly diagonally dominant.
Given that A above is strictly diagonally dominant it is certain that both Jacobi and Gauss-Seidel
will converge.
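The row-by-row test of Example 20 can be sketched as a short function. The name `is_strictly_diagonally_dominant` is hypothetical and the matrix is assumed to be stored as a list of rows.

```python
def is_strictly_diagonally_dominant(a):
    """True if, in every row, |diagonal entry| exceeds the sum of the other absolute entries."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )


A = [[4, -1, -1], [1, -5, -2], [-1, 0, 2]]
print(is_strictly_diagonally_dominant(A))                 # True
print(is_strictly_diagonally_dominant([[1, 2], [3, 4]]))  # False
```

A `False` result only means that this particular guarantee does not apply; the iterations may still converge.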
What’s so special about strict diagonal dominance?

In many applications we can be certain that the coefficient matrix A will be strictly diagonally
dominant. We will see examples of this in Workbooks 32 and 33 when we consider approximating
solutions of differential equations.
Exercises
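1. Consider the system
\[ \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ -5 \end{pmatrix} \]
(a) Use the starting guess $X^{(0)} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ in an implementation of the Jacobi method to
show that $X^{(1)} = \begin{pmatrix} 1.5 \\ -3 \end{pmatrix}$. Find $X^{(2)}$ and $X^{(3)}$.
(b) Use the starting guess $X^{(0)} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ in an implementation of the Gauss-Seidel method
to show that $X^{(1)} = \begin{pmatrix} 1.5 \\ -3.25 \end{pmatrix}$. Find $X^{(2)}$ and $X^{(3)}$.
(Hint: it might help you to know that the exact solution is $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 3 \\ -4 \end{pmatrix}$.)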
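2. (a) Show that the Jacobi iteration applied to the system
\[ \begin{pmatrix} 5 & -1 & 0 & 0 \\ -1 & 5 & -1 & 0 \\ 0 & -1 & 5 & -1 \\ 0 & 0 & -1 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 7 \\ -10 \\ -6 \\ 16 \end{pmatrix} \]
can be written
\[ X^{(k+1)} = \begin{pmatrix} 0 & 0.2 & 0 & 0 \\ 0.2 & 0 & 0.2 & 0 \\ 0 & 0.2 & 0 & 0.2 \\ 0 & 0 & 0.2 & 0 \end{pmatrix} X^{(k)} + \begin{pmatrix} 1.4 \\ -2 \\ -1.2 \\ 3.2 \end{pmatrix}. \]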
(b) Show that the method is certain to converge and calculate the first three iterations using
zero starting values.
(Hint: the exact solution to the stated problem is $\begin{pmatrix} 1 \\ -2 \\ -1 \\ 3 \end{pmatrix}$.)
Answers
1. (a) $2x_1^{(1)} = 2 - 1x_2^{(0)} = 3$
and therefore $x_1^{(1)} = 1.5$
$2x_2^{(1)} = -5 - 1x_1^{(0)} = -6$
which implies that $x_2^{(1)} = -3$. These two values give the required entries in $X^{(1)}$. A
second and third iteration follow in a similar way to give
\[ X^{(2)} = \begin{pmatrix} 2.5 \\ -3.25 \end{pmatrix} \quad \text{and} \quad X^{(3)} = \begin{pmatrix} 2.625 \\ -3.75 \end{pmatrix} \]
(b) $2x_1^{(1)} = 2 - 1x_2^{(0)} = 3$
and therefore $x_1^{(1)} = 1.5$. This new approximation to x1 is used straight away when
finding a new approximation to $x_2^{(1)}$.
$2x_2^{(1)} = -5 - 1x_1^{(1)} = -6.5$
which implies that $x_2^{(1)} = -3.25$. These two values give the required entries in $X^{(1)}$. A
second and third iteration follow in a similar way to give
\[ X^{(2)} = \begin{pmatrix} 2.625 \\ -3.8125 \end{pmatrix} \quad \text{and} \quad X^{(3)} = \begin{pmatrix} 2.906250 \\ -3.953125 \end{pmatrix} \]
where $X^{(3)}$ is given to 6 decimal places.

   
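2. (a) In this case $D = \begin{pmatrix} 5 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 5 \end{pmatrix}$ and therefore $D^{-1} = \begin{pmatrix} 0.2 & 0 & 0 & 0 \\ 0 & 0.2 & 0 & 0 \\ 0 & 0 & 0.2 & 0 \\ 0 & 0 & 0 & 0.2 \end{pmatrix}$.
So the iteration matrix is
\[ M = -D^{-1}(L + U) = -D^{-1} \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & -1 & 0 \\ 0 & -1 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0.2 & 0 & 0 \\ 0.2 & 0 & 0.2 & 0 \\ 0 & 0.2 & 0 & 0.2 \\ 0 & 0 & 0.2 & 0 \end{pmatrix} \]
and the Jacobi iteration takes the form
\[ X^{(k+1)} = MX^{(k)} + D^{-1} \begin{pmatrix} 7 \\ -10 \\ -6 \\ 16 \end{pmatrix} = \begin{pmatrix} 0 & 0.2 & 0 & 0 \\ 0.2 & 0 & 0.2 & 0 \\ 0 & 0.2 & 0 & 0.2 \\ 0 & 0 & 0.2 & 0 \end{pmatrix} X^{(k)} + \begin{pmatrix} 1.4 \\ -2 \\ -1.2 \\ 3.2 \end{pmatrix} \]
as required.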
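2. (b) The infinity-norm of the iteration matrix M is max(0.2, 0.4, 0.4, 0.2) = 0.4, which is less
than 1, so the method is certain to converge.
Using the starting values $x_1^{(0)} = x_2^{(0)} = x_3^{(0)} = x_4^{(0)} = 0$, the first iteration of the Jacobi method
gives
\[ \begin{aligned} x_1^{(1)} &= 0.2x_2^{(0)} + 1.4 = 1.4 \\ x_2^{(1)} &= 0.2(x_1^{(0)} + x_3^{(0)}) - 2 = -2 \\ x_3^{(1)} &= 0.2(x_2^{(0)} + x_4^{(0)}) - 1.2 = -1.2 \\ x_4^{(1)} &= 0.2x_3^{(0)} + 3.2 = 3.2 \end{aligned} \]
The second iteration is
\[ \begin{aligned} x_1^{(2)} &= 0.2x_2^{(1)} + 1.4 = 1 \\ x_2^{(2)} &= 0.2(x_1^{(1)} + x_3^{(1)}) - 2 = -1.96 \\ x_3^{(2)} &= 0.2(x_2^{(1)} + x_4^{(1)}) - 1.2 = -0.96 \\ x_4^{(2)} &= 0.2x_3^{(1)} + 3.2 = 2.96 \end{aligned} \]
And the third iteration is
\[ \begin{aligned} x_1^{(3)} &= 0.2x_2^{(2)} + 1.4 = 1.008 \\ x_2^{(3)} &= 0.2(x_1^{(2)} + x_3^{(2)}) - 2 = -1.992 \\ x_3^{(3)} &= 0.2(x_2^{(2)} + x_4^{(2)}) - 1.2 = -1 \\ x_4^{(3)} &= 0.2x_3^{(2)} + 3.2 = 3.008 \end{aligned} \]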
Detecting a train on a track
Introduction
One means of detecting trains is the ‘track circuit’ which uses current fed along the rails to detect
the presence of a train. A voltage is applied to the rails at one end of a section of track and a relay is
attached across the other end, so that the relay is energised if no train is present, whereas the wheels
of a train will short circuit the relay, causing it to de-energise. Any failure in the power supply or a
breakage in a wire will also cause the relay to de-energise, for the system is fail safe. Unfortunately,
there is always leakage between the rails, so this arrangement is slightly complicated to analyse.
Problem in words
A 1000 m track circuit is modelled as ten sections each 100 m long. The resistance of 100 m of one
rail may be taken to be 0.017 ohms, and the leakage resistance across a 100 m section taken to be
30 ohms. The detecting relay and the wires to it have a resistance of 10 ohms, and the wires from
the supply to the rail connection have a resistance of 5 ohms for the pair. The voltage applied at
the supply is 4V . See diagram below. What is the current in the relay?
Figure 1: [Circuit diagram. A 4 volt supply feeds the rails through 5 ohm; each 100 m rail section
is 0.017 ohm, each leakage path between the rails is 30 ohm, and the relay and wires are 10 ohm.
The currents are labelled i1, i2, . . . , i11.]
There are many ways to apply Kirchhoff’s laws to solve this, but one which gives a simple set of
equations in a suitable form to solve is shown below. i1 is the current in the first section of rail (i.e.
the one close to the supply), i2 , i3 , . . . i10 , the current in the successive sections of rail and i11 the
current in the wires to the relay. The leakage current between the first and second sections of rail
is i1 − i2 so that the voltage across the rails there is 30(i1 − i2 ) volts. The first equation below
uses this and the voltage drop in the feed wires, the next nine equations compare the voltage drop
across successive sections of track with the drop in the (two) rails, and the last equation compares
the voltage drop across the last section with that in the relay wires.
$$\begin{aligned}
30(i_1 - i_2) + 5.034\,i_1 &= 4 \\
30(i_1 - i_2) &= 0.034\,i_2 + 30(i_2 - i_3) \\
30(i_2 - i_3) &= 0.034\,i_3 + 30(i_3 - i_4) \\
&\;\;\vdots \\
30(i_9 - i_{10}) &= 0.034\,i_{10} + 30(i_{10} - i_{11}) \\
30(i_{10} - i_{11}) &= 10\,i_{11}
\end{aligned}$$
These can be reformulated in matrix form as Ai = v, where v is the 11 × 1 column vector with first
entry 4 and the other entries zero, i is the column vector with entries i1 , i2 , . . . , i11 and A is the
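matrix

$$A = \begin{pmatrix}
35.034 & -30 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-30 & 60.034 & -30 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -30 & 60.034 & -30 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -30 & 60.034 & -30 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -30 & 60.034 & -30 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -30 & 60.034 & -30 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -30 & 60.034 & -30 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -30 & 60.034 & -30 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -30 & 60.034 & -30 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -30 & 60.034 & -30 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -30 & 40
\end{pmatrix}$$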
Find the current i11 in the relay when the input is 4 V, by Gaussian elimination or by performing an
L-U decomposition of A.
We solve Ai = v as above, although actually we only want to know i11 . Letting M be the matrix A
with the column v added at the right, as in Section 30.2, then performing Gaussian elimination on
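M, working to four decimal places, gives

$$M = \left(\begin{array}{ccccccccccc|c}
35.0340 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 4.0000 \\
0 & 34.3447 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3.4252 \\
0 & 0 & 33.8291 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2.9919 \\
0 & 0 & 0 & 33.4297 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 2.6532 \\
0 & 0 & 0 & 0 & 33.1118 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 2.3810 \\
0 & 0 & 0 & 0 & 0 & 32.8534 & -30.0000 & 0 & 0 & 0 & 0 & 2.1572 \\
0 & 0 & 0 & 0 & 0 & 0 & 32.6396 & -30.0000 & 0 & 0 & 0 & 1.9698 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 32.4601 & -30.0000 & 0 & 0 & 1.8105 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 32.3077 & -30.0000 & 0 & 1.6733 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 32.1769 & -30.0000 & 1.5538 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 12.0296 & 1.4487
\end{array}\right)$$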
from which we can calculate that the solution i is

$$i = \begin{pmatrix} 0.5356 \\ 0.4921 \\ 0.4492 \\ 0.4068 \\ 0.3649 \\ 0.3234 \\ 0.2822 \\ 0.2414 \\ 0.2008 \\ 0.1605 \\ 0.1204 \end{pmatrix}$$
so the current in the relay is 0.1204 amps, or 0.12 A to two decimal places.
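The elimination above can be checked with a short Python sketch. Because A is tridiagonal, Gaussian elimination only needs to touch the three diagonals. The function name `solve_tridiagonal` and the list layout are illustrative assumptions, not notation from the Workbook, and no pivoting is attempted (which is reasonable here only because A is diagonally dominant).

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Gaussian elimination for a tridiagonal system, without pivoting.

    lower[k] sits to the left of diag[k] (lower[0] is unused);
    upper[k] sits to the right of diag[k] (upper[-1] is unused).
    """
    n = len(diag)
    d = list(diag)
    b = list(rhs)
    # Forward elimination: remove the sub-diagonal entry in each row.
    for k in range(1, n):
        m = lower[k] / d[k - 1]
        d[k] -= m * upper[k - 1]
        b[k] -= m * b[k - 1]
    # Back substitution.
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (b[k] - upper[k] * x[k + 1]) / d[k]
    return x


# Entries of A and v as given in the text.
diag = [35.034] + [60.034] * 9 + [40.0]
lower = [0.0] + [-30.0] * 10
upper = [-30.0] * 10 + [0.0]
v = [4.0] + [0.0] * 10

currents = solve_tridiagonal(lower, diag, upper, v)
print("relay current i11 =", round(currents[-1], 4), "A")
```

The sketch works in full floating-point precision rather than to four decimal places, so its later digits may differ slightly from the hand-rounded values quoted in the text.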
You can alternatively solve this problem by an L-U decomposition by finding matrices L and U such
that A = LU. Here we have

$$L = \begin{pmatrix}
1.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.8563 & 1.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & -0.8735 & 1.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -0.8868 & 1.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -0.8974 & 1.0000 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -0.9060 & 1.0000 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -0.9131 & 1.0000 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & -0.9191 & 1.0000 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.9242 & 1.0000 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.9286 & 1.0000 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -0.9323 & 1.0000
\end{pmatrix}$$
and

$$U = \begin{pmatrix}
35.0340 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 34.3447 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 33.8291 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 33.4297 & -30.0000 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 33.1118 & -30.0000 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 32.8534 & -30.0000 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 32.6395 & -30.0000 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 32.4601 & -30.0000 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 32.3076 & -30.0000 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 32.1768 & -30.0000 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 12.0295
\end{pmatrix}$$

Therefore

$$U i = \begin{pmatrix} 4.0000 \\ 3.4240 \\ 2.9892 \\ 2.6514 \\ 2.3783 \\ 2.1547 \\ 1.9673 \\ 1.8079 \\ 1.6705 \\ 1.5519 \\ 1.4464 \end{pmatrix}
\quad\text{and hence}\quad
i = \begin{pmatrix} 0.5352 \\ 0.4917 \\ 0.4487 \\ 0.4064 \\ 0.3644 \\ 0.3230 \\ 0.2819 \\ 0.2411 \\ 0.2006 \\ 0.1603 \\ 0.1202 \end{pmatrix}$$

and again the current is found to be 0.12 amps.
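As a rough check on the factors L and U, the following Python sketch performs a decomposition with ones on the diagonal of L and no row interchanges. The function name `lu_decompose` and the way A is assembled are illustrative assumptions; only the entries of A come from the text.

```python
def lu_decompose(a):
    """Return (L, U) with A = LU, L unit lower triangular, no pivoting."""
    n = len(a)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [row[:] for row in a]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]   # multiplier stored in L
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U


# Assemble the 11 x 11 matrix A of the track circuit problem.
n = 11
A = [[0.0] * n for _ in range(n)]
for k in range(n):
    A[k][k] = 60.034
    if k > 0:
        A[k][k - 1] = -30.0
    if k < n - 1:
        A[k][k + 1] = -30.0
A[0][0] = 35.034
A[n - 1][n - 1] = 40.0

L, U = lu_decompose(A)
print("L[2,1] =", round(L[1][0], 4), " U[2,2] =", round(U[1][1], 4))
```

Multiplying the computed L and U back together should recover A to rounding error, which is a convenient way of testing any hand calculation of the factors.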
You can try to solve the equation Ai = v by Jacobi or Gauss-Seidel iteration but in both cases it will
take very many iterations (over 200 to get four decimal places). Convergence is very slow because the
norms of the relevant matrices in the iteration are only just less than 1. Convergence is nevertheless
assured because the matrix A is diagonally dominant.
Contents 31
Numerical Methods of Approximation
31.1 Polynomial Approximations
31.2 Numerical Integration
31.3 Numerical Differentiation
31.4 Nonlinear Equations
Learning outcomes
In this Workbook you will learn about some numerical methods widely used in
engineering applications.
You will learn how certain data may be modelled, how integrals and derivatives may be
approximated and how estimates for the solutions of non-linear equations may be found.
Polynomial
Approximations 31.1
Introduction
Polynomials are functions with useful properties. Their relatively simple form makes them an ideal
candidate to use as approximations for more complex functions. In this second Workbook on
Numerical Methods, we begin by showing some ways in which certain functions of interest may be
approximated by polynomials.

Prerequisites
Before starting this Section you should . . .
• revise material on maxima and minima of functions of two variables
• be familiar with polynomials and Taylor series

Learning Outcomes
• interpolate data with polynomials
• find the least squares best fit straight line to experimental data
2 HELM (2006):
Workbook 31: Numerical Methods of Approximation
®
1. Polynomials
A polynomial in x is a function of the form
$$p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \qquad (a_n \neq 0,\ n \text{ a non-negative integer})$$
where a0 , a1 , a2 , . . . , an are constants. We say that this polynomial p has degree equal to n. (The
degree of a polynomial is the highest power to which the argument, here it is x, is raised.) Such
functions are relatively simple to deal with, for example they are easy to differentiate and integrate. In
this Section we will show ways in which a function of interest can be approximated by a polynomial.
First we briefly ensure that we are certain what a polynomial is.
Example 1
Which of these functions are polynomials in x? In the case(s) where f is a polynomial, give its degree.
(a) $f(x) = x^2 - 2 - \frac{1}{x}$, (b) $f(x) = x^4 + x - 6$, (c) $f(x) = 1$,
(d) $f(x) = mx + c$, where m and c are constants, (e) $f(x) = 1 - x^6 + 3x^3 - 5x^3$
Solution
(a) This is not a polynomial because of the $\frac{1}{x}$ term (no negative powers of the argument are
allowed in polynomials).
(b) This is a polynomial in x of degree 4.
(c) This is a polynomial of degree 0.
(d) This straight line function is a polynomial in x of degree 1 if $m \neq 0$ and of degree 0 if $m = 0$.
(e) This is a polynomial in x of degree 6.
Task
Which of these functions are polynomials in x? In the case(s) where f is a polynomial, give its degree.
(a) $f(x) = (x-1)(x+3)$ (b) $f(x) = 1 - x^7$ (c) $f(x) = 2 + 3e^x - 4e^{2x}$
(d) $f(x) = \cos(x) + \sin^2(x)$
Your solution
Answer
(a) This function, like all quadratics, is a polynomial of degree 2.
(b) This is a polynomial of degree 7.
(c) and (d) These are not polynomials in x. Their Maclaurin expansions have infinitely many terms.
Section 31.1: Polynomial Approximations
We have in fact already seen, in Workbook 16, one way in which some functions may be approximated
by polynomials. We review this next.
2. Taylor series
In Workbook 16 we encountered Maclaurin series and their generalisation, Taylor series. Taylor series are
a useful way of approximating functions by polynomials. The Taylor series expansion of a function
f(x) about x = a may be stated

$$f(x) = f(a) + (x-a)f'(a) + \tfrac{1}{2}(x-a)^2 f''(a) + \tfrac{1}{3!}(x-a)^3 f'''(a) + \dots$$
(The special case called Maclaurin series arises when a = 0.)
The general idea when using this formula in practice is to consider only points x which are near to a.
Given this it follows that (x − a) will be small, (x − a)2 will be even smaller, (x − a)3 will be smaller
still, and so on. This gives us confidence to simply neglect the terms beyond a certain power, or, to
put it another way, to truncate the series.
Example 2
Find the Taylor polynomial of degree 2 about the point x = 1, for the function
f (x) = ln(x).
Solution
In this case a = 1 and we need to evaluate the following terms
$$f(a) = \ln(a) = \ln(1) = 0, \qquad f'(a) = 1/a = 1, \qquad f''(a) = -1/a^2 = -1.$$
Hence
$$\ln(x) \approx 0 + (x-1) - \frac{1}{2}(x-1)^2 = -\frac{3}{2} + 2x - \frac{x^2}{2}$$
which will be reasonably accurate for x close to 1, as you can readily check on a calculator or
computer. For example, for all x between 0.9 and 1.1, the polynomial and logarithm agree to at
least 3 decimal places.
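The accuracy claim in Example 2 can be checked numerically. The sketch below samples the interval from 0.9 to 1.1 and records the largest gap between ln(x) and the degree 2 Taylor polynomial; the function name `taylor_ln_degree2` and the sampling step of 0.01 are illustrative assumptions.

```python
import math


def taylor_ln_degree2(x):
    """Degree 2 Taylor polynomial of ln(x) about x = 1."""
    return (x - 1.0) - 0.5 * (x - 1.0) ** 2


# Sample 0.90, 0.91, ..., 1.10 and record the worst disagreement.
xs = [0.9 + 0.01 * k for k in range(21)]
worst = max(abs(math.log(x) - taylor_ln_degree2(x)) for x in xs)
print("largest error on [0.9, 1.1]:", worst)
```

The largest gap occurs at the ends of the interval, as expected, since the neglected terms grow with the distance of x from 1.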
One drawback with this approach is that we need to find (possibly many) derivatives of f . Also, there
can be some doubt over what is the best choice of a. The statement of Taylor series is an extremely
useful piece of theory, but it can sometimes have limited appeal as a means of approximating functions
by polynomials.
Next we will consider two alternative approaches.
3. Polynomial approximations - exact data

Here and in subsections 4 and 5 we consider cases where, rather than knowing an expression for
the function, we have a list of point values. Sometimes it is good enough to find a polynomial that
passes near these points (like putting a straight line through experimental data). Such a polynomial
is an approximating polynomial and this case follows in subsection 4. Here and in subsection 5 we
deal with the case where we want a polynomial to pass exactly through the given data, that is, an
interpolating polynomial.
Lagrange interpolation
Suppose that we know (or choose to sample) a function f exactly at a few points and that we want
to approximate how the function behaves between those points. In its simplest form this is equivalent
to a dot-to-dot puzzle (see Figure 1(a)), but it is often more desirable to seek a curve that does not
have “corners” in it (see Figure 1(b)).
Figure 1: (a) Linear, or “dot-to-dot”, interpolation, with corners at all of the data points. (b) A smoother interpolation of the data points.
Let us suppose that the data are in the form $(x_1, f_1), (x_2, f_2), (x_3, f_3), \dots$; these are the points
plotted as crosses on the diagrams above. (For technical reasons, and those of common sense, we
suppose that the x-values in the data are all distinct.)
Our aim is to find a polynomial which passes exactly through the given data points. We want to find
p(x) such that
$$p(x_1) = f_1, \quad p(x_2) = f_2, \quad p(x_3) = f_3, \quad \dots$$
There is a mathematical trick we can use to achieve this. We define Lagrange polynomials L1 ,
L2 , L3 , . . . which have the following properties:
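$$\begin{aligned}
L_1(x) &= 1 \text{ at } x = x_1, & L_1(x) &= 0 \text{ at } x = x_2, x_3, x_4, \dots \\
L_2(x) &= 1 \text{ at } x = x_2, & L_2(x) &= 0 \text{ at } x = x_1, x_3, x_4, \dots \\
L_3(x) &= 1 \text{ at } x = x_3, & L_3(x) &= 0 \text{ at } x = x_1, x_2, x_4, \dots \\
&\;\;\vdots & &\;\;\vdots
\end{aligned}$$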
Each of these functions acts like a filter which “turns off” if you evaluate it at a data point other
than its own. For example if you evaluate L2 at any data point other than x2 , you will get zero.
Furthermore, if you evaluate any of these Lagrange polynomials at its own data point, the value you
get is 1. These two properties are enough to be able to write down what p(x) must be:
$$p(x) = f_1 L_1(x) + f_2 L_2(x) + f_3 L_3(x) + \dots$$
and this does work, because if we evaluate p at one of the data points, let us take x2 for example,
then
$$p(x_2) = f_1 \underbrace{L_1(x_2)}_{=0} + f_2 \underbrace{L_2(x_2)}_{=1} + f_3 \underbrace{L_3(x_2)}_{=0} + \dots = f_2$$
as required. The filtering property of the Lagrange polynomials picks out exactly the right f -value
for the current x-value. Between the data points, the expression for p above will give a smooth
polynomial curve.
This is all very well as long as we can work out what the Lagrange polynomials are. It is not hard to
check that the following definitions have the right properties.
Key Point 1
Lagrange Polynomials
$$L_1(x) = \frac{(x-x_2)(x-x_3)(x-x_4)\dots}{(x_1-x_2)(x_1-x_3)(x_1-x_4)\dots}$$
$$L_2(x) = \frac{(x-x_1)(x-x_3)(x-x_4)\dots}{(x_2-x_1)(x_2-x_3)(x_2-x_4)\dots}$$
$$L_3(x) = \frac{(x-x_1)(x-x_2)(x-x_4)\dots}{(x_3-x_1)(x_3-x_2)(x_3-x_4)\dots}$$
and so on.
The numerator of Li (x) does not contain (x − xi ).
The denominator of Li (x) does not contain (xi − xi ).
In each case the numerator ensures that the filtering property is in place, that is that the functions
switch off at data points other than their own. The denominators make sure that the value taken at
the remaining data point is equal to 1.
Figure 2: the Lagrange polynomials L1 and L2 plotted against x
Figure 2 shows L1 and L2 in the case where there are five data points (the x positions of these data
points are shown as large dots). Notice how both L1 and L2 are equal to zero at four of the data
points and that L1 (x1 ) = 1 and L2 (x2 ) = 1.
In an implementation of this idea, things are simplified by the fact that we do not generally require
an expression for p(x). (This is good news, for imagine trying to multiply out all the algebra in the
expressions for L1 , L2 , . . . .) What we do generally require is p evaluated at some specific value.
The following Example should help show how this can be done.
Example 3
Let p(x) be the polynomial of degree 3 which interpolates the data
x 0.8 1 1.4 1.6
f (x) −1.82 −1.73 −1.40 −1.11
Evaluate p(1.1).
Solution
We are interested in the Lagrange polynomials at the point x = 1.1 so we consider
$$L_1(1.1) = \frac{(1.1-x_2)(1.1-x_3)(1.1-x_4)}{(x_1-x_2)(x_1-x_3)(x_1-x_4)} = \frac{(1.1-1)(1.1-1.4)(1.1-1.6)}{(0.8-1)(0.8-1.4)(0.8-1.6)} = -0.15625.$$
Similar calculations for the other Lagrange polynomials give
$$L_2(1.1) = 0.93750, \quad L_3(1.1) = 0.31250, \quad L_4(1.1) = -0.09375,$$
and we find that our interpolated polynomial, evaluated at x = 1.1, is
$$\begin{aligned}
p(1.1) &= f_1 L_1(1.1) + f_2 L_2(1.1) + f_3 L_3(1.1) + f_4 L_4(1.1) \\
&= (-1.82)(-0.15625) + (-1.73)(0.9375) + (-1.4)(0.3125) + (-1.11)(-0.09375) \\
&= -1.670938 \\
&= -1.67 \text{ to the number of decimal places to which the data were given.}
\end{aligned}$$
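The calculation in Example 3 can be sketched in Python as follows. The function names `lagrange_basis` and `lagrange_interpolate` are illustrative choices, and the code uses zero-based indices, so `lagrange_basis(xs, 0, x)` corresponds to L1(x) in the notation of the text.

```python
def lagrange_basis(xs, i, x):
    """Value at x of the Lagrange polynomial belonging to data point xs[i]."""
    value = 1.0
    for j, xj in enumerate(xs):
        if j != i:  # skip the polynomial's own data point
            value *= (x - xj) / (xs[i] - xj)
    return value


def lagrange_interpolate(xs, fs, x):
    """Evaluate the interpolating polynomial at x without expanding it."""
    return sum(f * lagrange_basis(xs, i, x) for i, f in enumerate(fs))


xs = [0.8, 1.0, 1.4, 1.6]
fs = [-1.82, -1.73, -1.40, -1.11]
p = lagrange_interpolate(xs, fs, 1.1)
print(round(p, 2))
```

As the text points out, no expression for p(x) is ever multiplied out; the polynomial is only evaluated at the one value required.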
Key Point 2
Quote the answer only to the same number of decimal places as the given data (or to fewer places).
Task
Let p(x) be the polynomial of degree 3 which interpolates the data
x 0.1 0.2 0.3 0.4
f (x) 0.91 0.70 0.43 0.52
Evaluate p(0.15).
Your solution
Answer
$$L_1(0.15) = \frac{(0.15-x_2)(0.15-x_3)(0.15-x_4)}{(x_1-x_2)(x_1-x_3)(x_1-x_4)} = \frac{(0.15-0.2)(0.15-0.3)(0.15-0.4)}{(0.1-0.2)(0.1-0.3)(0.1-0.4)} = 0.3125.$$
$$L_2(0.15) = 0.9375, \quad L_3(0.15) = -0.3125, \quad L_4(0.15) = 0.0625,$$
$$\begin{aligned}
p(0.15) &= f_1 L_1(0.15) + f_2 L_2(0.15) + f_3 L_3(0.15) + f_4 L_4(0.15) \\
&= 0.91 \times 0.3125 + 0.7 \times 0.9375 + 0.43 \times (-0.3125) + 0.52 \times 0.0625 \\
&= 0.838750 \\
&= 0.84, \text{ to 2 decimal places.}
\end{aligned}$$
The next Example is very much the same as Example 3 and the Task above. Try not to let the
specific application, and the slight change of notation, confuse you.
Example 4
A designer wants a curve on a diagram he is preparing to pass through the points
x 0.25 0.5 0.75 1
y 0.32 0.65 0.43 0.10
He decides to do this by using an interpolating polynomial p(x). What is the
y-value corresponding to x = 0.8?
Solution
$$L_1(0.8) = \frac{(0.8-x_2)(0.8-x_3)(0.8-x_4)}{(x_1-x_2)(x_1-x_3)(x_1-x_4)} = \frac{(0.8-0.5)(0.8-0.75)(0.8-1)}{(0.25-0.5)(0.25-0.75)(0.25-1)} = 0.032.$$
$$L_2(0.8) = -0.176, \quad L_3(0.8) = 1.056, \quad L_4(0.8) = 0.088,$$
$$\begin{aligned}
p(0.8) &= y_1 L_1(0.8) + y_2 L_2(0.8) + y_3 L_3(0.8) + y_4 L_4(0.8) \\
&= 0.32 \times 0.032 + 0.65 \times (-0.176) + 0.43 \times 1.056 + 0.1 \times 0.088 \\
&= 0.358720 \\
&= 0.36 \text{ to 2 decimal places.}
\end{aligned}$$
In this next Task there are five points to interpolate. It therefore takes a polynomial of degree 4 to
interpolate the data and this means we must use five Lagrange polynomials.
Task
The hull drag f of a racing yacht as a function of the hull speed, v, is known to
be
v 0.0 0.5 1.0 1.5 2.0
f 0.00 19.32 90.62 175.71 407.11
(Here, the units for f and v are N and m s$^{-1}$, respectively.)
Use Lagrange interpolation to fit this data and hence approximate the drag corresponding
to a hull speed of 2.5 m s$^{-1}$.
Your solution
Answer
We are interested in the Lagrange polynomials at the point v = 2.5 so we consider
$$\begin{aligned}
L_1(2.5) &= \frac{(2.5-v_2)(2.5-v_3)(2.5-v_4)(2.5-v_5)}{(v_1-v_2)(v_1-v_3)(v_1-v_4)(v_1-v_5)} \\
&= \frac{(2.5-0.5)(2.5-1.0)(2.5-1.5)(2.5-2.0)}{(0.0-0.5)(0.0-1.0)(0.0-1.5)(0.0-2.0)} = 1.0
\end{aligned}$$
$$L_2(2.5) = -5.0, \quad L_3(2.5) = 10.0, \quad L_4(2.5) = -10.0, \quad L_5(2.5) = 5.0$$
$$\begin{aligned}
p(2.5) &= f_1 L_1(2.5) + f_2 L_2(2.5) + f_3 L_3(2.5) + f_4 L_4(2.5) + f_5 L_5(2.5) \\
&= 0.00 \times 1.0 + 19.32 \times (-5.0) + 90.62 \times 10.0 + 175.71 \times (-10.0) + 407.11 \times 5.0 \\
&= 1088.05
\end{aligned}$$
This gives us the approximation that the hull drag on the yacht at 2.5 m s$^{-1}$ is about 1100 N.
The following Example has time t as the independent variable, and two quantities, x and y, as
dependent variables to be interpolated. We will see however that exactly the same approach as
before works.
Example 5
An animator working on a computer generated cartoon has decided that her main
character’s right index finger should pass through the following (x, y) positions on
the screen at the following times t
t 0 0.2 0.4 0.6
x 1.00 1.20 1.30 1.25
y 2.00 2.10 2.30 2.60
Use Lagrange polynomials to interpolate these data and hence find the (x, y)
position at time t = 0.5. Give x and y to 2 decimal places.
Solution
In this case t is the independent variable, and there are two dependent variables: x and y. We are
interested in the Lagrange polynomials at the time t = 0.5 so we consider
$$L_1(0.5) = \frac{(0.5-t_2)(0.5-t_3)(0.5-t_4)}{(t_1-t_2)(t_1-t_3)(t_1-t_4)} = \frac{(0.5-0.2)(0.5-0.4)(0.5-0.6)}{(0-0.2)(0-0.4)(0-0.6)} = 0.0625$$
$$L_2(0.5) = -0.3125, \quad L_3(0.5) = 0.9375, \quad L_4(0.5) = 0.3125$$
These values for the Lagrange polynomials can be used for both of the interpolations we need to
do. For the x-value we obtain
$$\begin{aligned}
x(0.5) &= x_1 L_1(0.5) + x_2 L_2(0.5) + x_3 L_3(0.5) + x_4 L_4(0.5) \\
&= 1.00 \times 0.0625 + 1.20 \times (-0.3125) + 1.30 \times 0.9375 + 1.25 \times 0.3125 \\
&= 1.30 \text{ to 2 decimal places}
\end{aligned}$$
and for the y-value we get
$$\begin{aligned}
y(0.5) &= y_1 L_1(0.5) + y_2 L_2(0.5) + y_3 L_3(0.5) + y_4 L_4(0.5) \\
&= 2.00 \times 0.0625 + 2.10 \times (-0.3125) + 2.30 \times 0.9375 + 2.60 \times 0.3125 \\
&= 2.44 \text{ to 2 decimal places}
\end{aligned}$$
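Example 5 relies on the fact that the Lagrange polynomial values depend only on the times, not on the quantities being interpolated. A minimal Python sketch of that reuse is given below; the variable names (`weights`, `x_at_t`, `y_at_t`) are illustrative and not taken from the Workbook.

```python
ts = [0.0, 0.2, 0.4, 0.6]
xs = [1.00, 1.20, 1.30, 1.25]
ys = [2.00, 2.10, 2.30, 2.60]
t = 0.5

# Compute L1(t), ..., L4(t) once; they depend only on the times ts.
weights = []
for i, ti in enumerate(ts):
    w = 1.0
    for j, tj in enumerate(ts):
        if j != i:
            w *= (t - tj) / (ti - tj)
    weights.append(w)

# The same weights serve both dependent variables.
x_at_t = sum(w * x for w, x in zip(weights, xs))
y_at_t = sum(w * y for w, y in zip(weights, ys))
print(round(x_at_t, 2), round(y_at_t, 2))
```

A useful side check is that the weights always add up to 1, because interpolating constant data must return that constant.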
Error in Lagrange interpolation

When using Lagrange interpolation through n points $(x_1, f_1), (x_2, f_2), \dots, (x_n, f_n)$ the error in the
estimate of f(x) is given by
$$E(x) = \frac{(x-x_1)(x-x_2)\dots(x-x_n)}{n!}\, f^{(n)}(\eta) \quad \text{where } \eta \in [x, x_1, x_n]$$
N.B. The value of η is not known precisely, only the interval in which it lies. Normally x will lie in
the interval [x1 , xn ] (that’s interpolation). If x lies outside the interval [x1 , xn ] then that’s called
extrapolation and a larger error is likely.
Of course we will not normally know what f is (indeed no f may exist for experimental data).
However, sometimes f can at least be estimated. In the following (somewhat artificial) example we
will be told f and use it to check that the above error formula is reasonable.
Example 6
In an experiment to determine the relationship between power gain (G) and power
output (P ) in an amplifier, the following data were recorded.
P 5 7 8 11
G 0.00 1.46 2.04 3.42
(a) Use Lagrange interpolation to fit an appropriate quadratic, q(x), to estimate
the gain when the output is 6.5. Give your answer to an appropriate accuracy.
(b) Given that $G \equiv 10 \log_{10}(P/5)$ show that the actual error which occurred in
the Lagrange interpolation in (a) lies within the theoretical error limits.
Solution
For a quadratic, q(x), we need to fit three points and those most appropriate (nearest 6.5) are for
P at 5, 7, 8:
$$\begin{aligned}
q(6.5) &= \frac{(6.5-7)(6.5-8)}{(5-7)(5-8)} \times 0.00 + \frac{(6.5-5)(6.5-8)}{(7-5)(7-8)} \times 1.46 + \frac{(6.5-5)(6.5-7)}{(8-5)(8-7)} \times 2.04 \\
&= 0 + 1.6425 - 0.5100 \\
&= 1.1325 \text{ working to 4 d.p.} \\
&\approx 1.1 \text{ (rounding to sensible accuracy)}
\end{aligned}$$
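(b) We use the error formula
$$E(x) = \frac{(x-x_1)\dots(x-x_n)}{n!}\, f^{(n)}(\eta), \quad \eta \in [x, x_1, \dots, x_n]$$
Here $f(P) \equiv G(P) = 10\log_{10}(P/5)$ and n = 3:
$$\frac{d}{dP}\log_{10}(P/5) = \frac{d}{dP}\left(\log_{10}P - \log_{10}5\right) = \frac{d}{dP}\log_{10}P = \frac{d}{dP}\left(\frac{\ln P}{\ln 10}\right) = \frac{1}{\ln 10}\,\frac{1}{P}$$
So
$$\frac{d^3}{dP^3}\log_{10}(P/5) = \frac{1}{\ln 10}\,\frac{2}{P^3}.$$
Substituting for $f^{(3)}(\eta)$, which carries the extra factor of 10 from the definition of G:
$$E(6.5) = \frac{(6.5-5)(6.5-7)(6.5-8)}{6} \times \frac{10}{\ln 10} \times \frac{2}{\eta^3} = \frac{1.6286}{\eta^3}, \quad \eta \in [5, 8]$$
Taking η = 5: $E_{\max} = 0.0131$
Taking η = 8: $E_{\min} = 0.0031$
Taking x = 6.5:
$$E_{\text{actual}} = G(6.5) - q(6.5) = 10\log_{10}(6.5/5) - 1.1325 = 1.1394 - 1.1325 = 0.0069$$
The theory is satisfied because $E_{\min} < E_{\text{actual}} < E_{\max}$.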
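The check in Example 6 can be repeated in a few lines of Python. This is only a sketch: the variable names (`prefactor`, `e_min`, `e_max`, `e_actual`) are illustrative, and the values are computed in full precision rather than rounded at each stage, so the last digit may differ from the figures quoted in the Solution.

```python
import math

P = [5.0, 7.0, 8.0]        # the three points nearest 6.5
G = [0.00, 1.46, 2.04]
x = 6.5

# Quadratic Lagrange interpolation q(6.5).
q = 0.0
for i in range(3):
    term = G[i]
    for j in range(3):
        if j != i:
            term *= (x - P[j]) / (P[i] - P[j])
    q += term

# Error formula with n = 3 and f'''(eta) = (10 / ln 10) * 2 / eta**3.
prefactor = (x - 5) * (x - 7) * (x - 8) / math.factorial(3) * 10 / math.log(10) * 2
e_max = prefactor / 5 ** 3   # eta = 5 gives the largest error
e_min = prefactor / 8 ** 3   # eta = 8 gives the smallest error
e_actual = 10 * math.log10(x / 5) - q

print(q, e_min, e_actual, e_max)
```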
Task
(a) Use Lagrange interpolation to estimate f(8) to appropriate accuracy given the
table of values below, by means of the appropriate cubic interpolating polynomial.
x 2 5 7 9 10
f (x) 0.980067 0.877583 0.764842 0.621610 0.540302
Your solution
Answer
The most appropriate cubic passes through x at 5, 7, 9, 10, so with x = 8 we take $x_1 = 5$, $x_2 = 7$, $x_3 = 9$, $x_4 = 10$:
$$\begin{aligned}
p(8) &= \frac{(8-7)(8-9)(8-10)}{(5-7)(5-9)(5-10)} \times 0.877583 + \frac{(8-5)(8-9)(8-10)}{(7-5)(7-9)(7-10)} \times 0.764842 \\
&\quad + \frac{(8-5)(8-7)(8-10)}{(9-5)(9-7)(9-10)} \times 0.621610 + \frac{(8-5)(8-7)(8-9)}{(10-5)(10-7)(10-9)} \times 0.540302 \\
&= -\frac{1}{20} \times 0.877583 + \frac{1}{2} \times 0.764842 + \frac{3}{4} \times 0.621610 - \frac{1}{5} \times 0.540302 \\
&= 0.696689
\end{aligned}$$
Suitable accuracy is 0.6967 (rounded to 4 d.p.).
(b) Given that the table in (a) represents f (x) ≡ cos(x/10), calculate theoretical bounds for the
estimate obtained:
Your solution
Answer
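$$E(8) = \frac{(8-5)(8-7)(8-9)(8-10)}{4!}\, f^{(4)}(\eta), \quad 5 \le \eta \le 10$$
$$f(\eta) = \cos\left(\frac{\eta}{10}\right) \quad\text{so}\quad f^{(4)}(\eta) = \frac{1}{10^4}\cos\left(\frac{\eta}{10}\right)$$
$$E(8) = \frac{1}{4 \times 10^4}\cos\left(\frac{\eta}{10}\right), \quad \eta \in [5, 10]$$
$$E_{\min} = \frac{1}{4 \times 10^4}\cos(1), \qquad E_{\max} = \frac{1}{4 \times 10^4}\cos(0.5)$$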
This leads to
0.696689 + 0.000014 ≤ True Value ≤ 0.696689 + 0.000022
⇒ 0.696703 ≤ True Value ≤ 0.696711
We can conclude that the True Value is 0.69670 or 0.69671 to 5 d.p. or 0.6967 to 4 d.p. (actual
value is 0.696707).
4. Polynomial approximations - experimental data

You may well have experience in carrying out an experiment and then trying to get a straight line to
pass as near as possible to the data plotted on graph paper. This process of adjusting a clear ruler
over the page until it looks “about right” is fine for a rough approximation, but it is not especially
scientific. Any software you use which provides a “best fit” straight line must obviously employ a
less haphazard approach.
Here we show one way in which best fit straight lines may be found.
Best fit straight lines
Let us consider the situation mentioned above of trying to get a straight line y = mx + c to be as
near as possible to experimental data in the form (x1 , f1 ), (x2 , f2 ), (x3 , f3 ), . . . .
Figure 3: the data points (x1, f1), (x2, f2), (x3, f3) and the straight line y = mx + c
We want to minimise the overall distance between the crosses (the data points) and the straight line.
There are a few different approaches, but the one we adopt here involves minimising the quantity
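$$\begin{aligned}
R &= \underbrace{(mx_1 + c - f_1)^2}_{\substack{\text{vertical distance} \\ \text{between line and} \\ \text{the point } (x_1, f_1)}} + \underbrace{(mx_2 + c - f_2)^2}_{\substack{\text{second data point} \\ \text{distance}}} + \underbrace{(mx_3 + c - f_3)^2}_{\substack{\text{third data point} \\ \text{distance}}} + \dots \\
&= \sum \left(mx_n + c - f_n\right)^2.
\end{aligned}$$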
Each term in the sum measures the vertical distance between a data point and the straight line.
(Squaring the distances ensures that distances above and below the line do not cancel each other
out. It is because we are minimising the distances squared that the straight line we will find is called
the least squares best fit straight line.)
In order to minimise R we can imagine sliding the clear ruler around on the page until the line looks
right; that is we can imagine varying the slope m and y-intercept c of the line. We therefore think
of R as a function of the two variables m and c and, as we know from our earlier work on maxima
and minima of functions, the minimisation is achieved when
$$\frac{\partial R}{\partial c} = 0 \quad\text{and}\quad \frac{\partial R}{\partial m} = 0.$$
(We know that this will correspond to a minimum because R has no maximum, for whatever value
R takes we can always make it bigger by moving the line further away from the data points.)
Differentiating R with respect to c and m gives
$$\frac{\partial R}{\partial c} = 2(mx_1 + c - f_1) + 2(mx_2 + c - f_2) + 2(mx_3 + c - f_3) + \dots = 2\sum \left(mx_n + c - f_n\right)$$
and
$$\frac{\partial R}{\partial m} = 2(mx_1 + c - f_1)x_1 + 2(mx_2 + c - f_2)x_2 + 2(mx_3 + c - f_3)x_3 + \dots = 2\sum \left(mx_n + c - f_n\right)x_n,$$
respectively. Setting both of these quantities equal to zero (and cancelling the factor of 2) gives a
pair of simultaneous equations for m and c. This pair of equations is given in the Key Point below.
Key Point 3
The least squares best fit straight line to the experimental data
(x1 , f1 ), (x2 , f2 ), (x3 , f3 ), . . . (xn , fn )
is
y = mx + c
where m and c are found by solving the pair of equations
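$$c\left(\sum_{1}^{n} 1\right) + m\left(\sum_{1}^{n} x_n\right) = \sum_{1}^{n} f_n,$$
$$c\left(\sum_{1}^{n} x_n\right) + m\left(\sum_{1}^{n} x_n^2\right) = \sum_{1}^{n} x_n f_n.$$
(The term $\sum_{1}^{n} 1$ is simply equal to the number of data points, n.)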
Example 7
An experiment is carried out and the following data obtained:
xn 0.24 0.26 0.28 0.30
fn 1.25 0.80 0.66 0.20
Obtain the least squares best fit straight line, y = mx + c, to these data. Give c
and m to 2 decimal places.
Solution
For a hand calculation, tabulating the data makes sense:
        x_n     f_n     x_n^2     x_n f_n
        0.24    1.25    0.0576    0.3000
        0.26    0.80    0.0676    0.2080
        0.28    0.66    0.0784    0.1848
        0.30    0.20    0.0900    0.0600
  Σ     1.08    2.91    0.2936    0.7528
The quantity $\sum 1$ counts the number of data points and in this case is equal to 4.
It follows that the pair of equations for m and c are:
4c + 1.08m = 2.91
1.08c + 0.2936m = 0.7528
Solving these gives c = 5.17 and m = −16.45 and we see that the least squares best fit straight
line to the given data is
y = 5.17 − 16.45x
Figure 4 shows how well the straight line fits the experimental data.
Figure 4: the data points and the best fit straight line, plotted against x
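The hand calculation in Example 7 can be sketched in Python as below. The function name `least_squares_line` is an illustrative choice; it forms the four sums of Key Point 3 and solves the resulting pair of equations for m and c by elimination.

```python
def least_squares_line(xs, fs):
    """Return (m, c) for the least squares best fit line y = m x + c."""
    n = len(xs)                          # this is the sum of 1 over the data
    sx = sum(xs)
    sf = sum(fs)
    sxx = sum(x * x for x in xs)
    sxf = sum(x * f for x, f in zip(xs, fs))
    # Solve  c*n  + m*sx  = sf
    #        c*sx + m*sxx = sxf
    det = n * sxx - sx * sx
    m = (n * sxf - sx * sf) / det
    c = (sf * sxx - sx * sxf) / det
    return m, c


xs = [0.24, 0.26, 0.28, 0.30]
fs = [1.25, 0.80, 0.66, 0.20]
m, c = least_squares_line(xs, fs)
print("c =", round(c, 2), " m =", round(m, 2))
```

The determinant `det` is zero only if all the x-values coincide, in which case no unique best fit line exists and the sketch would fail with a division by zero.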
Example 8
Find the best fit straight line to the following experimental data:
xn 0.00 1.00 2.00 3.00 4.00
fn 1.00 3.85 6.50 9.35 12.05
Solution
In order to work out all of the quantities appearing in the pair of equations we tabulate our calcu-
lations as follows
        x_n      f_n      x_n^2     x_n f_n
        0.00     1.00     0.00      0.00
        1.00     3.85     1.00      3.85
        2.00     6.50     4.00      13.00
        3.00     9.35     9.00      28.05
        4.00     12.05    16.00     48.20
  Σ     10.00    32.75    30.00     93.10
The quantity $\sum 1$ counts the number of data points and is in this case equal to 5.
Hence our pair of equations is
5c + 10m = 32.75
10c + 30m = 93.10
10c + 30m = 93.10
Solving these equations gives c = 1.03 and m = 2.76 and this means that our best fit straight line
to the given data is
y = 1.03 + 2.76x
Task
An experiment is carried out and the data obtained are as follows:
xn 0.2 0.3 0.5 0.9
fn 5.54 4.02 3.11 2.16
Obtain the least squares best fit straight line, y = mx + c, to these data. Give c
and m to 2 decimal places.
Your solution
Answer
Tabulating the data gives
        x_n     f_n      x_n^2    x_n f_n
        0.2     5.54     0.04     1.108
        0.3     4.02     0.09     1.206
        0.5     3.11     0.25     1.555
        0.9     2.16     0.81     1.944
  Σ     1.9     14.83    1.19     5.813
4c + 1.9m = 14.83
1.9c + 1.19m = 5.813
Solving these gives c = 5.74 and m = −4.28 and we see that the least squares best fit straight line
to the given data is
y = 5.74 − 4.28x
Task
Power output P of a semiconductor laser diode, operating at 35 °C, as a function
of the drive current I is measured to be
I 70 72 74 76
P 1.33 2.08 2.88 3.31
(Here I and P are measured in mA and mW respectively.)
It is known that, above a certain threshold current, the laser power increases
linearly with drive current. Use the least squares approach to fit a straight line,
P = mI + c, to these data. Give c and m to 2 decimal places.
Your solution
Answer
Tabulating the data gives
        I      P       I^2      I × P
        70     1.33    4900     93.1
        72     2.08    5184     149.76
        74     2.88    5476     213.12
        76     3.31    5776     251.56
  Σ     292    9.6     21336    707.54
4c + 292m = 9.6
292c + 21336m = 707.54
Solving these gives c = −22.20 and m = 0.34 and we see that the least squares best fit straight
line to the given data is
P = −22.20 + 0.34I.
5. Polynomial approximations - splines

We complete this Section by briefly describing another approach that can be used in the case where
the data are exact.
Why are splines needed?

Fitting a polynomial to the data (using Lagrange polynomials, for example) works very well when
there are a small number of data points. But if there were 100 data points it would be silly to try
to fit a polynomial of degree 99 through all of them. It would be a great deal of work and anyway
polynomials of high degree can be very oscillatory giving poor approximations between the data points
to the underlying function.
What are splines?

Instead of using a polynomial valid for all x, we use one polynomial for x1 < x < x2 , then a different
polynomial for x2 < x < x3 then a different one again for x3 < x < x4 , and so on.
We have already seen one instance of this approach in this Section. The “dot to dot” interpolation
that we abandoned earlier (Figure 1(a)) is an example of a linear spline. There is a different straight
line between each pair of data points.
The most commonly used splines are cubic splines. We use a different polynomial of degree three
between each pair of data points. Let s = s(x) denote a cubic spline, then
$$s(x) = a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1 \qquad (x_1 < x < x_2)$$
$$\vdots$$
And we need to find a1 , b1 , c1 , d1 , a2 , . . . to determine the full form for the spline s(x). Given the
large number of quantities that have to be assigned (four for every pair of adjacent data points) it
is possible to give s some very nice properties:
• s(x1 ) = f1 , s(x2 ) = f2 , s(x3 ) = f3 , . . . . This is the least we should expect, as it simply states
that s interpolates the given data.
• $s'(x)$ is continuous at the data points. This means that there are no “corners” at the data
points: the whole curve is smooth.
• $s''(x)$ is continuous. This reduces the occurrence of points of inflection appearing at the data
points and leads to a smooth interpolant.
Even with all of these requirements there are still two more properties we can assign to s. A natural
cubic spline is one for which $s''$ is zero at the two end points. The natural cubic spline is, in some
sense, the smoothest possible spline, for it minimises a measure of the curvature.
How is a spline found?
Now that we have described what a natural cubic spline is, we briefly describe how it is found.
Suppose that there are N data points. For a natural cubic spline we require $s''(x_1) = s''(x_N) = 0$
and values of $s''$ taken at the other data points are found from the system of equations in Key Point 4.
Key Point 4
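Cubic Spline Equations
$$\begin{pmatrix}
k_2 & h_2 & & & \\
h_2 & k_3 & h_3 & & \\
 & \ddots & \ddots & \ddots & \\
 & & h_{N-3} & k_{N-2} & h_{N-2} \\
 & & & h_{N-2} & k_{N-1}
\end{pmatrix}
\begin{pmatrix} s''(x_2) \\ s''(x_3) \\ \vdots \\ s''(x_{N-2}) \\ s''(x_{N-1}) \end{pmatrix}
=
\begin{pmatrix} r_2 \\ r_3 \\ \vdots \\ r_{N-2} \\ r_{N-1} \end{pmatrix}$$
in which
$$h_1 = x_2 - x_1, \quad h_2 = x_3 - x_2, \quad h_3 = x_4 - x_3, \quad h_4 = x_5 - x_4, \quad \dots$$
$$k_2 = 2(h_1 + h_2), \quad k_3 = 2(h_2 + h_3), \quad k_4 = 2(h_3 + h_4), \quad \dots$$
$$r_2 = 6\left(\frac{f_3 - f_2}{h_2} - \frac{f_2 - f_1}{h_1}\right), \quad r_3 = 6\left(\frac{f_4 - f_3}{h_3} - \frac{f_3 - f_2}{h_2}\right), \quad \dots$$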
Admittedly the system of equations in Key Point 4 looks unappealing, but this is a “nice” system
of equations. It was pointed out at the end of Workbook 30 that some applications lead to systems of
equations involving matrices which are strictly diagonally dominant. The matrix above is of that
type since each diagonal entry is at least twice as big as the sum of the off-diagonal entries in its row.
Once the system of equations is solved for the second derivatives $s''$, the spline s can be found as
follows:
$$a_i = \frac{s''(x_{i+1}) - s''(x_i)}{6h_i}, \quad b_i = \frac{s''(x_i)}{2}, \quad c_i = \frac{f_{i+1} - f_i}{h_i} - \left(\frac{s''(x_{i+1}) + 2s''(x_i)}{6}\right)h_i, \quad d_i = f_i$$
We now present an Example illustrating this approach.
Example 9
Find the natural cubic spline which interpolates the data
xj 1 3 5 8
fj 0.85 0.72 0.34 0.67
Solution
In the notation now established we have $h_1 = 2$, $h_2 = 2$ and $h_3 = 3$. For a natural cubic spline we
require $s''$ to be zero at $x_1$ and $x_4$. Values of $s''$ at the other data points are found from the system
of equations given in Key Point 4. In this case the matrix is just 2 × 2 and the pair of equations are:
$$\begin{aligned}
h_1 \underbrace{s''(x_1)}_{=0} + 2(h_1 + h_2)s''(x_2) + h_2 s''(x_3) &= 6\left(\frac{f_3 - f_2}{h_2} - \frac{f_2 - f_1}{h_1}\right) \\
h_2 s''(x_2) + 2(h_2 + h_3)s''(x_3) + h_3 \underbrace{s''(x_4)}_{=0} &= 6\left(\frac{f_4 - f_3}{h_3} - \frac{f_3 - f_2}{h_2}\right)
\end{aligned}$$
In this case the equations become
$$\begin{pmatrix} 8 & 2 \\ 2 & 10 \end{pmatrix}\begin{pmatrix} s''(x_2) \\ s''(x_3) \end{pmatrix} = \begin{pmatrix} -0.75 \\ 1.8 \end{pmatrix}$$
Solving the coupled pair of equations leads to
$$s''(x_2) = -0.146053, \qquad s''(x_3) = 0.209211$$
We now find the coefficients $a_1$, $b_1$, etc. from the formulae and deduce that the spline is given by
$$\begin{aligned}
s(x) &= -0.01217(x-1)^3 - 0.016316(x-1) + 0.85 & (1 < x < 3) \\
s(x) &= 0.029605(x-3)^3 - 0.073026(x-3)^2 - 0.162368(x-3) + 0.72 & (3 < x < 5) \\
s(x) &= -0.01162(x-5)^3 + 0.104605(x-5)^2 - 0.099211(x-5) + 0.34 & (5 < x < 8)
\end{aligned}$$
Figure 5 shows how the spline interpolates the data.
[Figure 5: graph of the spline and the data points against $x$]
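The procedure of Key Point 4, followed by the coefficient formulae, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `natural_cubic_spline` and the use of a simple tridiagonal elimination are choices made here, not part of the Workbook. It assumes at least three data points with strictly increasing $x$-values.

```python
def natural_cubic_spline(x, f):
    """Return (s2, coeffs) for the natural cubic spline through (x[i], f[i]).

    s2[i] is s''(x[i]); coeffs[i] = (a, b, c, d) for the piece on [x[i], x[i+1]],
    so that s(t) = a*(t-x[i])**3 + b*(t-x[i])**2 + c*(t-x[i]) + d.
    Hypothetical helper written for illustration only.
    """
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    m = n - 2  # number of interior points (unknown second derivatives)

    # Tridiagonal system of Key Point 4 (0-based indices).
    diag = [2.0 * (h[i] + h[i + 1]) for i in range(m)]
    off = [h[i + 1] for i in range(m - 1)]
    rhs = [6.0 * ((f[i + 2] - f[i + 1]) / h[i + 1] - (f[i + 1] - f[i]) / h[i])
           for i in range(m)]

    # Forward elimination (the matrix is symmetric, so sub- and super-diagonal agree).
    for i in range(1, m):
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]

    # Back substitution; the end values stay zero (natural spline).
    s2 = [0.0] * n
    for i in range(m - 1, -1, -1):
        upper = off[i] * s2[i + 2] if i < m - 1 else 0.0
        s2[i + 1] = (rhs[i] - upper) / diag[i]

    coeffs = []
    for i in range(n - 1):
        a = (s2[i + 1] - s2[i]) / (6.0 * h[i])
        b = s2[i] / 2.0
        c = (f[i + 1] - f[i]) / h[i] - (s2[i + 1] + 2.0 * s2[i]) * h[i] / 6.0
        d = f[i]
        coeffs.append((a, b, c, d))
    return s2, coeffs


# Data of Example 9
s2, coeffs = natural_cubic_spline([1, 3, 5, 8], [0.85, 0.72, 0.34, 0.67])
print([round(v, 6) for v in s2])  # [0.0, -0.146053, 0.209211, 0.0]
```

For a handful of points the hand calculation is just as quick, but the same routine copes with any number of data points because the tridiagonal system is solved in a single forward and backward sweep.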
Task
Find the natural cubic spline which interpolates the data
\[
\begin{array}{c|cccc}
x_j & 1 & 2 & 3 & 5 \\ \hline
f_j & 0.1 & 0.24 & 0.67 & 0.91
\end{array}
\]
Your solution
Answer
In the notation now established we have $h_1 = 1$, $h_2 = 1$ and $h_3 = 2$. For a natural cubic spline we require $s''$ to be zero at $x_1$ and $x_4$. Values of $s''$ at the other data points are found from the system of equations
\[ \underbrace{h_1 s''(x_1)}_{=0} + 2(h_1 + h_2)s''(x_2) + h_2 s''(x_3) = 6\left(\frac{f_3 - f_2}{h_2} - \frac{f_2 - f_1}{h_1}\right) \]
\[ h_2 s''(x_2) + 2(h_2 + h_3)s''(x_3) + \underbrace{h_3 s''(x_4)}_{=0} = 6\left(\frac{f_4 - f_3}{h_3} - \frac{f_3 - f_2}{h_2}\right) \]
In this case the equations become
\[
\begin{bmatrix} 4 & 1 \\ 1 & 6 \end{bmatrix}
\begin{bmatrix} s''(x_2) \\ s''(x_3) \end{bmatrix}
=
\begin{bmatrix} 1.74 \\ -1.86 \end{bmatrix}
\]
Solving the coupled pair of equations leads to $s''(x_2) = 0.534783$ and $s''(x_3) = -0.399130$.
We now find the coefficients $a_1$, $b_1$, etc. from the formulae and deduce that the spline is
\[
\begin{aligned}
s(x) &= 0.08913(x-1)^3 + 0.05087(x-1) + 0.1 && (1 < x < 2) \\
s(x) &= -0.15565(x-2)^3 + 0.267391(x-2)^2 + 0.318261(x-2) + 0.24 && (2 < x < 3) \\
s(x) &= 0.033261(x-3)^3 - 0.199565(x-3)^2 + 0.386087(x-3) + 0.67 && (3 < x < 5)
\end{aligned}
\]
Exercises
1. A political analyst is preparing a dossier involving the following data
\[
\begin{array}{c|cccc}
x & 10 & 15 & 20 & 25 \\ \hline
f(x) & 9.23 & 8.41 & 7.12 & 4.13
\end{array}
\]
She interpolates the data with a polynomial $p(x)$ of degree 3 in order to find an approximation $p(22)$ to $f(22)$. What value does she find for $p(22)$?
2. Estimate $f(2)$ to an appropriate accuracy from the table of values below by means of an appropriate quadratic interpolating polynomial.
\[
\begin{array}{c|cccc}
x & 1 & 3 & 3.5 & 6 \\ \hline
f(x) & 99.8 & 295.5 & 342.9 & 564.6
\end{array}
\]
3. An experiment is carried out and the data obtained as follows
\[
\begin{array}{c|cccc}
x_n & 2 & 3 & 5 & 7 \\ \hline
f_n & 2.2 & 5.4 & 6.5 & 13.2
\end{array}
\]
Obtain the least squares best fit straight line, $y = mx + c$, to these data. (Give $c$ and $m$ to 2 decimal places.)
4. Find the natural cubic spline which interpolates the data
\[
\begin{array}{c|cccc}
x_j & 2 & 4 & 5 & 7 \\ \hline
f_j & 1.34 & 1.84 & 1.12 & 0.02
\end{array}
\]
Answers
1. We are interested in the Lagrange polynomials at the point $x = 22$ so we consider
\[ L_1(22) = \frac{(22 - x_2)(22 - x_3)(22 - x_4)}{(x_1 - x_2)(x_1 - x_3)(x_1 - x_4)} = \frac{(22 - 15)(22 - 20)(22 - 25)}{(10 - 15)(10 - 20)(10 - 25)} = 0.056. \]
Similarly,
\[ L_2(22) = -0.288, \quad L_3(22) = 1.008, \quad L_4(22) = 0.224, \]
and we find that our interpolated polynomial, evaluated at $x = 22$, is
\[
\begin{aligned}
p(22) &= f_1 L_1(22) + f_2 L_2(22) + f_3 L_3(22) + f_4 L_4(22) \\
&= 9.23 \times 0.056 + 8.41 \times (-0.288) + 7.12 \times 1.008 + 4.13 \times 0.224 \\
&= 6.197 \\
&= 6.20, \text{ to 2 decimal places,}
\end{aligned}
\]
which serves as the approximation to $f(22)$.
2.
\[
\begin{aligned}
f(2) &\approx \frac{(2 - 1)(2 - 3)}{(3.5 - 1)(3.5 - 3)} \times 342.9 + \frac{(2 - 1)(2 - 3.5)}{(3 - 1)(3 - 3.5)} \times 295.5 + \frac{(2 - 3)(2 - 3.5)}{(1 - 3)(1 - 3.5)} \times 99.8 \\
&= -274.32 + 443.25 + 29.94 \\
&= 198.87
\end{aligned}
\]
Estimate is 199 (to 3 sig. fig.)
3. We tabulate the data for convenience:
\[
\begin{array}{c|cccc}
 & x_n & f_n & x_n^2 & x_n f_n \\ \hline
 & 2 & 2.2 & 4 & 4.4 \\
 & 3 & 5.4 & 9 & 16.2 \\
 & 5 & 6.5 & 25 & 32.5 \\
 & 7 & 13.2 & 49 & 92.4 \\ \hline
\sum & 17 & 27.3 & 87 & 145.5
\end{array}
\]
It follows that the pair of equations for $m$ and $c$ are as follows:
\[
\begin{aligned}
4c + 17m &= 27.3 \\
17c + 87m &= 145.5
\end{aligned}
\]
Solving these gives $c = -1.67$ and $m = 2.00$, to 2 decimal places, and we see that the least squares best fit straight line to the given data is
\[ y = -1.67 + 2.00x \]
4. In the notation now established we have $h_1 = 2$, $h_2 = 1$ and $h_3 = 2$. For a natural cubic spline we require $s''$ to be zero at $x_1$ and $x_4$. Values of $s''$ at the other data points are found from the system of equations
\[ \underbrace{h_1 s''(x_1)}_{=0} + 2(h_1 + h_2)s''(x_2) + h_2 s''(x_3) = 6\left(\frac{f_3 - f_2}{h_2} - \frac{f_2 - f_1}{h_1}\right) \]
\[ h_2 s''(x_2) + 2(h_2 + h_3)s''(x_3) + \underbrace{h_3 s''(x_4)}_{=0} = 6\left(\frac{f_4 - f_3}{h_3} - \frac{f_3 - f_2}{h_2}\right) \]
In this case the equations become
\[
\begin{bmatrix} 6 & 1 \\ 1 & 6 \end{bmatrix}
\begin{bmatrix} s''(x_2) \\ s''(x_3) \end{bmatrix}
=
\begin{bmatrix} -5.82 \\ 1.02 \end{bmatrix}
\]
Solving the coupled pair of equations leads to
\[ s''(x_2) = -1.026857, \qquad s''(x_3) = 0.341143 \]
We now find the coefficients $a_1$, $b_1$, etc. from the formulae and deduce that the spline is given by
\[
\begin{aligned}
s(x) &= -0.08557(x-2)^3 + 0.592286(x-2) + 1.34 && (2 < x < 4) \\
s(x) &= 0.228(x-4)^3 - 0.513429(x-4)^2 - 0.434571(x-4) + 1.84 && (4 < x < 5) \\
s(x) &= -0.02843(x-5)^3 + 0.170571(x-5)^2 - 0.777429(x-5) + 1.12 && (5 < x < 7)
\end{aligned}
\]

Numerical Integration 31.2
Introduction
In this Section we will present some methods that can be used to approximate integrals. Attention
will be paid to how we ensure that such approximations can be guaranteed to be of a certain level
of accuracy.

Prerequisites
Before starting this Section you should . . .
• review previous material on integrals and integration
Learning Outcomes
On completion you should be able to . . .
• approximate certain integrals
• be able to ensure that these approximations are of some desired accuracy
1. Numerical integration
The aim in this Section is to describe numerical methods for approximating integrals of the form
\[ \int_a^b f(x)\,dx \]
One motivation for this is in the material on probability that appears in Workbook 39. Normal distributions can be analysed by working out
\[ \int_a^b \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx \]
for certain values of $a$ and $b$. It turns out that it is not possible, using the kinds of functions most engineers would care to know about, to write down a function with derivative equal to $\frac{1}{\sqrt{2\pi}}e^{-x^2/2}$, so values of the integral are approximated instead. Tables of numbers giving the value of this integral for different interval widths appeared at the end of Workbook 39, and it is known that these tables are accurate to the number of decimal places given. How can this be known? One aim of this Section is to give a possible answer to that question.
It is clear that, not only do we need a way of approximating integrals, but we also need a way of working out the accuracy of the approximations if we are to be sure that our tables of numbers are to be relied on.
In this Section we address both of these points, beginning with a simple approximation method.
2. The simple trapezium rule

The first approximation we shall look at involves finding the area under a straight line, rather than
the area under a curve f . Figure 6 shows it best.
[Figure 6: graph of $f(x)$ against $x$ between $x = a$ and $x = b$, with the heights $f(a)$ and $f(b)$ and the width $b - a$ marked]
We approximate as follows
\[
\begin{aligned}
\int_a^b f(x)\,dx &= \text{grey shaded area} \\
&\approx \text{area of the trapezium surrounding the shaded region} \\
&= \text{width of trapezium} \times \text{average height of the two sides} \\
&= \tfrac{1}{2}(b - a)\bigl(f(a) + f(b)\bigr)
\end{aligned}
\]
Key Point 5
Simple Trapezium Rule
The simple trapezium rule for approximating $\int_a^b f(x)\,dx$ is given by approximating the area under the graph of $f$ by the area of a trapezium.
The formula is:
\[ \int_a^b f(x)\,dx \approx \tfrac{1}{2}(b - a)\bigl(f(a) + f(b)\bigr) \]
Or, to put it another way that may prove helpful a little later on,
\[ \int_a^b f(x)\,dx \approx \tfrac{1}{2} \times (\text{interval width}) \times \bigl(f(\text{left-hand end}) + f(\text{right-hand end})\bigr) \]
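For readers who like to check calculations on a computer, the formula in Key Point 5 can be written as a one-line function. The following is a minimal sketch in Python; the name `simple_trapezium` is chosen here for illustration and is not part of the Workbook.

```python
import math


def simple_trapezium(f, a, b):
    """Simple trapezium rule: half the interval width times the sum of the end values."""
    return 0.5 * (b - a) * (f(a) + f(b))


# The integral of sin(x) from 0 to pi/4
print(round(simple_trapezium(math.sin, 0.0, math.pi / 4), 5))  # 0.27768
```

Only two evaluations of the integrand are needed, which is why the method is so cheap and also why its accuracy is limited.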
Next we show some instances of implementing this method.
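Example 10
Approximate each of these integrals using the simple trapezium rule
\[ \text{(a)} \int_0^{\pi/4} \sin(x)\,dx \qquad \text{(b)} \int_1^2 e^{-x^2/2}\,dx \qquad \text{(c)} \int_0^2 \cosh(x)\,dx \]
Solution
\[ \text{(a)} \int_0^{\pi/4} \sin(x)\,dx \approx \tfrac{1}{2}(b - a)\bigl(\sin(a) + \sin(b)\bigr) = \tfrac{1}{2}\left(\frac{\pi}{4} - 0\right)\left(0 + \frac{1}{\sqrt{2}}\right) = 0.27768, \]
\[ \text{(b)} \int_1^2 e^{-x^2/2}\,dx \approx \tfrac{1}{2}(b - a)\left(e^{-a^2/2} + e^{-b^2/2}\right) = \tfrac{1}{2}(2 - 1)\left(e^{-1/2} + e^{-2}\right) = 0.37093, \]
\[ \text{(c)} \int_0^2 \cosh(x)\,dx \approx \tfrac{1}{2}(b - a)\bigl(\cosh(a) + \cosh(b)\bigr) = \tfrac{1}{2}(2 - 0)\bigl(1 + \cosh(2)\bigr) = 4.76220, \]
where all three answers are given to 5 decimal places.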
It is important to note that, although we have given these integral approximations to 5 decimal places,
this does not mean that they are accurate to that many places. We will deal with the accuracy of
our approximations later in this Section. Next are some Tasks for you to try.
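Task
Approximate the following integrals using the simple trapezium method
\[ \text{(a)} \int_1^5 \sqrt{x}\,dx \qquad \text{(b)} \int_1^2 \ln(x)\,dx \]
Your solution
Answer
\[ \text{(a)} \int_1^5 \sqrt{x}\,dx \approx \tfrac{1}{2}(b - a)\left(\sqrt{a} + \sqrt{b}\right) = \tfrac{1}{2}(5 - 1)\left(1 + \sqrt{5}\right) = 6.47214 \]
\[ \text{(b)} \int_1^2 \ln(x)\,dx \approx \tfrac{1}{2}(b - a)\bigl(\ln(a) + \ln(b)\bigr) = \tfrac{1}{2}(2 - 1)\bigl(0 + \ln(2)\bigr) = 0.34657 \]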
The answer you obtain for this next Task can be checked against the table of results in Workbook 39 concerning the Normal distribution or in a standard statistics textbook.
Task
Use the simple trapezium method to approximate $\displaystyle\int_0^1 \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx$
Your solution
Answer
We find that
\[ \int_0^1 \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx \approx \tfrac{1}{2}(1 - 0)\,\frac{1}{\sqrt{2\pi}}\left(1 + e^{-1/2}\right) = 0.32046 \]
So we have a means of approximating $\int_a^b f(x)\,dx$. The question remains whether or not it is a good approximation.
How good is the simple trapezium rule?
We define $e_T$, the error in the simple trapezium rule, to be the difference between the actual value of the integral and our approximation to it, that is
\[ e_T = \int_a^b f(x)\,dx - \tfrac{1}{2}(b - a)\bigl(f(a) + f(b)\bigr) \]
It is enough for our purposes here to omit some theory and skip straight to the result of interest. In many different textbooks on the subject it is shown that
\[ e_T = -\tfrac{1}{12}(b - a)^3 f''(c) \]
where $c$ is some number between $a$ and $b$. (The principal drawback with this expression for $e_T$ is that we do not know what $c$ is, but we will find a way to work around that difficulty later.)
It is worth pausing to ask what meaning we can attach to this expression for $e_T$. There are two factors which can influence $e_T$:
1. If $b - a$ is small then, clearly, $e_T$ will most probably also be small. This seems sensible enough: if the integration interval is a small one then there is “less room” to accumulate a large error. (This observation forms part of the motivation for the composite trapezium rule discussed later in this Section.)
2. If $f''$ is small everywhere in $a < x < b$ then $e_T$ will be small. This reflects the fact that we worked out the integral of a straight line function, instead of the integral of $f$. If $f$ is a long way from being a straight line then $f''$ will be large and so we must expect the error $e_T$ to be large too.
We noted above that the expression for $e_T$ is less useful than it might be because it involves the unknown quantity $c$. We perform a trade-off to get around this problem. The expression above gives an exact value for $e_T$, but we do not know enough to evaluate it. So we replace the expression with one we can evaluate, but it will not be exact. We replace $f''(c)$ with a worst case value to obtain an upper bound on $e_T$. This worst case value is the largest (positive or negative) value that $f''(x)$ achieves for $a \le x \le b$. This leads to
\[ |e_T| \le \max_{a \le x \le b} \left|f''(x)\right| \frac{(b - a)^3}{12}. \]
We summarise this in Key Point 6.
Key Point 6
Error in the Simple Trapezium Rule
The error, $|e_T|$, in the simple trapezium approximation to $\int_a^b f(x)\,dx$ is bounded above by
\[ \max_{a \le x \le b} \left|f''(x)\right| \frac{(b - a)^3}{12} \]
Example 11
Work out the error bound (to 6 decimal places) for the simple trapezium method approximations to
\[ \text{(a)} \int_0^{\pi/4} \sin(x)\,dx \qquad \text{(b)} \int_0^2 \cosh(x)\,dx \]
Solution
In each case the trickiest part is working out the maximum value of $|f''(x)|$.
(a) Here $f(x) = \sin(x)$, therefore $f'(x) = \cos(x)$ and $f''(x) = -\sin(x)$. The function $\sin(x)$ takes values between 0 and $\frac{1}{\sqrt{2}}$ when $x$ varies between 0 and $\pi/4$. Hence
\[ |e_T| < \frac{1}{\sqrt{2}} \times \frac{(\pi/4)^3}{12} = 0.028548 \text{ to 6 decimal places.} \]
(b) If $f(x) = \cosh(x)$ then $f''(x) = \cosh(x)$ too. The maximum value of $\cosh(x)$ for $x$ between 0 and 2 will be $\cosh(2) = 3.762196$, to 6 decimal places. Hence, in this case,
\[ |e_T| < (3.762196) \times \frac{(2 - 0)^3}{12} = 2.508130 \text{ to 6 decimal places.} \]
(In Example 11 we used a rounded value of $\cosh(2)$. To be on the safe side, it is best to round this number up to make sure that we still have an upper bound on $e_T$. In this case, of course, rounding up is what we would naturally do, because the seventh decimal place was a 6.)
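Because $\int_0^{\pi/4}\sin(x)\,dx = 1 - \cos(\pi/4)$ is known exactly, the bound in part (a) can be compared with the error actually incurred. The short Python sketch below does this; the function name `trapezium_error_bound` is an illustrative choice made here, and the caller is assumed to supply a valid upper bound on $|f''|$.

```python
import math


def trapezium_error_bound(max_abs_f2, a, b):
    """Key Point 6: upper bound on |e_T| given an upper bound on |f''| over [a, b]."""
    return max_abs_f2 * (b - a) ** 3 / 12.0


a, b = 0.0, math.pi / 4
approx = 0.5 * (b - a) * (math.sin(a) + math.sin(b))   # simple trapezium rule
exact = 1.0 - math.cos(b)                              # exact value of the integral
actual_error = abs(exact - approx)
bound = trapezium_error_bound(1.0 / math.sqrt(2.0), a, b)

print(round(actual_error, 6), round(bound, 6))  # 0.015213 0.028548
```

The actual error is roughly half the bound, which illustrates that the bound is safe but pessimistic.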
Task
Work out the error bound (to 5 significant figures) for the simple trapezium method approximations to
\[ \text{(a)} \int_1^5 \sqrt{x}\,dx \qquad \text{(b)} \int_1^2 \ln(x)\,dx \]
Your solution
(a)
Answer
If $f(x) = \sqrt{x} = x^{1/2}$ then $f'(x) = \frac{1}{2}x^{-1/2}$ and $f''(x) = -\frac{1}{4}x^{-3/2}$.
The negative power here means that $f''$ takes its biggest value in magnitude at the left-hand end of the interval $[1, 5]$ and we see that $\max_{1 \le x \le 5} |f''(x)| = |f''(1)| = \frac{1}{4}$. Therefore
\[ |e_T| < \frac{1}{4} \times \frac{4^3}{12} = 1.3333 \text{ to 5 s.f.} \]
Your solution
(b)
Answer
Here $f(x) = \ln(x)$, hence $f'(x) = 1/x$ and $f''(x) = -1/x^2$.
It follows then that $\max_{1 \le x \le 2} |f''(x)| = 1$ and we conclude that
\[ |e_T| < 1 \times \frac{1^3}{12} = 0.083333 \text{ to 5 s.f.} \]
One deficiency in the simple trapezium rule is that there is nothing we can do to improve it. Having
computed an error bound to measure the quality of the approximation we have no way to go back and
work out a better approximation to the integral. It would be preferable if there were a parameter we
could alter to tune the accuracy of the method. The following approach uses the simple trapezium
method in a way that allows us to improve the accuracy of the answer we obtain.
3. The composite trapezium rule
The general idea here is to split the interval $[a, b]$ into a sequence of $N$ smaller subintervals of equal width $h = (b - a)/N$. Then we apply the simple trapezium rule to each of the subintervals.
Figure 7 below shows the case where $N = 2$ (and therefore $h = \frac{1}{2}(b - a)$). To simplify notation later on we let $f_0 = f(a)$, $f_1 = f(a + h)$ and $f_2 = f(a + 2h) = f(b)$.
[Figure 7: graph of $f(x)$ against $x$ from $a$ to $b$, with the heights $f_0$, $f_1$ and $f_2$ marked]
Applying the simple trapezium rule to each subinterval we get
\[
\begin{aligned}
\int_a^b f(x)\,dx &\approx (\text{area of first trapezium}) + (\text{area of second trapezium}) \\
&= \tfrac{1}{2}h(f_0 + f_1) + \tfrac{1}{2}h(f_1 + f_2) = \tfrac{1}{2}h\bigl(f_0 + 2f_1 + f_2\bigr)
\end{aligned}
\]
where we remember that the width of each of the subintervals is $h$, rather than the $b - a$ we had in the simple trapezium rule.
The next improvement will come from taking $N = 3$ subintervals (Figure 8). Here $h = \frac{1}{3}(b - a)$ is smaller than in Figure 7 above and we denote $f_0 = f(a)$, $f_1 = f(a + h)$, $f_2 = f(a + 2h)$ and $f_3 = f(a + 3h) = f(b)$. (Notice that $f_1$ and $f_2$ mean something different from what they did in the $N = 2$ case.)
[Figure 8: graph of $f(x)$ against $x$ from $a$ to $b$, with the heights $f_0$, $f_1$, $f_2$ and $f_3$ marked]
As Figure 8 shows, the approximation is getting closer to the grey shaded area and in this case we have
\[
\begin{aligned}
\int_a^b f(x)\,dx &\approx \tfrac{1}{2}h(f_0 + f_1) + \tfrac{1}{2}h(f_1 + f_2) + \tfrac{1}{2}h(f_2 + f_3) \\
&= \tfrac{1}{2}h\bigl(f_0 + 2\{f_1 + f_2\} + f_3\bigr).
\end{aligned}
\]
The pattern is probably becoming clear by now, but here is one more improvement. In Figure 9, $N = 4$, $h = \frac{1}{4}(b - a)$ and we denote $f_0 = f(a)$, $f_1 = f(a + h)$, $f_2 = f(a + 2h)$, $f_3 = f(a + 3h)$ and $f_4 = f(a + 4h) = f(b)$.
[Figure 9: graph of $f(x)$ against $x$ from $a$ to $b$, with the heights $f_0$, $f_1$, $f_2$, $f_3$ and $f_4$ marked]
This leads to
\[
\begin{aligned}
\int_a^b f(x)\,dx &\approx \tfrac{1}{2}h(f_0 + f_1) + \tfrac{1}{2}h(f_1 + f_2) + \tfrac{1}{2}h(f_2 + f_3) + \tfrac{1}{2}h(f_3 + f_4) \\
&= \tfrac{1}{2}h\bigl(f_0 + 2\{f_1 + f_2 + f_3\} + f_4\bigr).
\end{aligned}
\]
We generalise this idea into the following Key Point.
Key Point 7
Composite Trapezium Rule
The composite trapezium rule for approximating $\int_a^b f(x)\,dx$ is carried out as follows:
1. Choose $N$, the number of subintervals,
2. $\displaystyle\int_a^b f(x)\,dx \approx \tfrac{1}{2}h\bigl(f_0 + 2\{f_1 + f_2 + \cdots + f_{N-1}\} + f_N\bigr)$,
where
\[ h = \frac{b - a}{N}, \quad f_0 = f(a), \quad f_1 = f(a + h), \ \ldots, \ f_n = f(a + nh), \ \ldots, \]
and $f_N = f(a + Nh) = f(b)$.
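The two steps of Key Point 7 translate directly into a short routine. The following is a minimal Python sketch; the name `composite_trapezium` is an illustrative choice made here rather than anything defined in the Workbook.

```python
def composite_trapezium(f, a, b, n):
    """Composite trapezium rule with n equal subintervals of width h = (b - a) / n."""
    h = (b - a) / n
    # Interior values f_1, ..., f_{n-1} each appear in two neighbouring trapezia.
    interior = sum(f(a + i * h) for i in range(1, n))
    return 0.5 * h * (f(a) + 2.0 * interior + f(b))


# A quick check on a function whose integral is known:
# the integral of x^2 from 0 to 1 is 1/3.
print(composite_trapezium(lambda x: x * x, 0.0, 1.0, 4))  # 0.34375
```

Taking `n = 1` reproduces the simple trapezium rule, and increasing `n` is the tuning parameter that the simple rule lacked.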
Example 12
Using 4 subintervals in the composite trapezium rule, and working to 6 decimal places, approximate
\[ \int_0^2 \cosh(x)\,dx \]
Solution
In this case $h = (2 - 0)/4 = 0.5$.
We require $\cosh(x)$ evaluated at five $x$-values and the results are tabulated below to 6 d.p.
\[
\begin{array}{c|c}
x_n & f_n = \cosh(x_n) \\ \hline
0 & 1.000000 \\
0.5 & 1.127626 \\
1 & 1.543081 \\
1.5 & 2.352410 \\
2 & 3.762196
\end{array}
\]
It follows that
\[
\begin{aligned}
\int_0^2 \cosh(x)\,dx &\approx \tfrac{1}{2}h\bigl(f_0 + f_4 + 2\{f_1 + f_2 + f_3\}\bigr) \\
&= \tfrac{1}{2}(0.5)\bigl(1 + 3.762196 + 2\{1.127626 + 1.543081 + 2.35241\}\bigr) \\
&= 3.702107
\end{aligned}
\]
Task
Using 4 subintervals in the composite trapezium rule approximate
\[ \int_1^2 \ln(x)\,dx \]
Your solution
Answer
In this case $h = (2 - 1)/4 = 0.25$.
We require $\ln(x)$ evaluated at five $x$-values and the results are tabulated below to 6 d.p.
\[
\begin{array}{c|c}
x_n & f_n = \ln(x_n) \\ \hline
1 & 0.000000 \\
1.25 & 0.223144 \\
1.5 & 0.405465 \\
1.75 & 0.559616 \\
2 & 0.693147
\end{array}
\]
It follows that
\[
\begin{aligned}
\int_1^2 \ln(x)\,dx &\approx \tfrac{1}{2}h\bigl(f_0 + f_4 + 2\{f_1 + f_2 + f_3\}\bigr) \\
&= \tfrac{1}{2}(0.25)\bigl(0 + 0.693147 + 2\{0.223144 + 0.405465 + 0.559616\}\bigr) \\
&= 0.383700
\end{aligned}
\]
How good is the composite trapezium rule?
We can work out an upper bound on the error incurred by the composite trapezium method. Fortunately, all we have to do here is apply the method for the error in the simple rule over and over again. Let $e_T^N$ denote the error in the composite trapezium rule with $N$ subintervals. Then
\[
\begin{aligned}
\left|e_T^N\right| &\le \max_{\text{1st subinterval}} \left|f''(x)\right| \frac{h^3}{12} + \max_{\text{2nd subinterval}} \left|f''(x)\right| \frac{h^3}{12} + \ldots + \max_{\text{last subinterval}} \left|f''(x)\right| \frac{h^3}{12} \\
&= \frac{h^3}{12} \underbrace{\left( \max_{\text{1st subinterval}} \left|f''(x)\right| + \max_{\text{2nd subinterval}} \left|f''(x)\right| + \ldots + \max_{\text{last subinterval}} \left|f''(x)\right| \right)}_{N \text{ terms}}.
\end{aligned}
\]
This is all very well as a piece of theory, but it is awkward to use in practice. The process of working out the maximum value of $|f''|$ separately in each subinterval is very time-consuming. We can obtain a more user-friendly, if less accurate, error bound by replacing each term in the last bracket above with the biggest one. Hence we obtain
\[ \left|e_T^N\right| \le \frac{h^3}{12}\, N \max_{a \le x \le b} \left|f''(x)\right| \]
This upper bound can be rewritten by recalling that $Nh = b - a$, and we now summarise the result in a Key Point.
Key Point 8
Error in the Composite Trapezium Rule
The error, $\left|e_T^N\right|$, in the $N$-subinterval composite trapezium approximation to $\int_a^b f(x)\,dx$ is bounded above by
\[ \max_{a \le x \le b} \left|f''(x)\right| \frac{(b - a)h^2}{12} \]
Note: the special case when $N = 1$ is the simple trapezium rule, in which case $b - a = h$ (refer to Key Point 6 to compare).
The formula in Key Point 8 can be used to decide how many subintervals to use to guarantee a specific accuracy.
Example 13
The function $f$ is known to have a second derivative with the property that
\[ |f''(x)| < 12 \]
for $x$ between 0 and 4.
Using the error bound given in Key Point 8 determine how many subintervals are required so that the composite trapezium rule used to approximate
\[ \int_0^4 f(x)\,dx \]
can be guaranteed to be in error by less than $\frac{1}{2} \times 10^{-3}$.
Solution
We require that
\[ 12 \times \frac{(b - a)h^2}{12} < 0.0005 \]
that is
\[ 4h^2 < 0.0005. \]
This implies that $h^2 < 0.000125$ and therefore $h < 0.0111803$.
Now $N = (b - a)/h = 4/h$ and it follows that
\[ N > 357.7708 \]
Clearly, $N$ must be a whole number and we conclude that the smallest number of subintervals which guarantees an error smaller than 0.0005 is $N = 358$.
It is worth remembering that the error bound we are using here is a pessimistic one. We effectively use the same (worst case) value for $f''(x)$ all the way through the integration interval. Odds are that fewer subintervals will give the required accuracy, but the value for $N$ we found here will guarantee a good enough approximation.
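The rearrangement used in Example 13 can be automated. Substituting $h = (b - a)/N$ into the bound of Key Point 8 and requiring it to be less than a tolerance gives $N > \sqrt{\max|f''|\,(b - a)^3 / (12 \times \text{tolerance})}$. The Python sketch below applies this; the function name `trapezium_subintervals` is an illustrative choice made here.

```python
import math


def trapezium_subintervals(max_abs_f2, a, b, tol):
    """Smallest whole number N for which the Key Point 8 bound is below tol.

    max_abs_f2 is assumed to be an upper bound on |f''(x)| for a <= x <= b.
    """
    n_min = math.sqrt(max_abs_f2 * (b - a) ** 3 / (12.0 * tol))
    # N must be a whole number strictly greater than n_min.
    return math.floor(n_min) + 1


# Example 13: |f''| < 12 on [0, 4], error required to be below 0.0005
print(trapezium_subintervals(12.0, 0.0, 4.0, 0.0005))  # 358
```

Because the bound is pessimistic, the returned value is sufficient but not always necessary.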
Next are two Tasks for you to try.
Task
The function $f$ is known to have a second derivative with the property that
\[ |f''(x)| < 14 \]
for $x$ between $-1$ and 4.
Using the error bound given in Key Point 8 determine how many subintervals are required so that the composite trapezium rule used to approximate
\[ \int_{-1}^4 f(x)\,dx \]
can be guaranteed to have an error less than 0.0001.
Your solution
Answer
We require that
\[ 14 \times \frac{(b - a)h^2}{12} < 0.0001 \]
that is
\[ \frac{70h^2}{12} < 0.0001 \]
This implies that $h^2 < 0.0000171429$ and therefore $h < 0.00414039$.
Now $N = (b - a)/h = 5/h$ and it follows that
\[ N > 1207.6147 \]
Clearly, $N$ must be a whole number and we conclude that the smallest number of subintervals which guarantees an error smaller than 0.0001 is $N = 1208$.
Task
It is given that the function $e^{-x^2/2}$ has a second derivative that is never greater than 1 in absolute value.
(a) Use this fact to determine how many subintervals are required for the composite trapezium method to deliver an approximation to
\[ \int_0^1 \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx \]
that is guaranteed to have an error less than $\frac{1}{2} \times 10^{-2}$.
(b) Find an approximation to the integral that is in error by less than $\frac{1}{2} \times 10^{-2}$.
Your solution
(a)
Answer
We require that $\dfrac{1}{\sqrt{2\pi}}\,\dfrac{(b - a)h^2}{12} < 0.005$. This means that $h^2 < 0.150398$, so $h < 0.387812$. Since $N = 1/h$ we need $N > 2.58$, and therefore $N = 3$ is the smallest number of subintervals for which the error bound is less than $\frac{1}{2} \times 10^{-2}$.
Your solution
(b)
Answer
To carry out the composite trapezium rule, with $h = \frac{1}{3}$, we need to evaluate $f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ at $x = 0, h, 2h, 1$. This evaluation gives
\[ f(0) = f_0 = 0.39894, \quad f(h) = f_1 = 0.37738, \quad f(2h) = f_2 = 0.31945 \quad \text{and} \quad f(1) = f_3 = 0.24197, \]
all to 5 decimal places. It follows that
\[ \int_0^1 \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx \approx \tfrac{1}{2}h\bigl(f_0 + f_3 + 2\{f_1 + f_2\}\bigr) = 0.33910 \]
We know from part (a) that this approximation is in error by less than $\frac{1}{2} \times 10^{-2}$.
Example 14
Determine the minimum number of steps needed to guarantee an error not exceeding $\pm 0.001$, when evaluating
\[ \int_0^1 \cosh(x^2)\,dx \]
using the trapezium rule.
Solution
\[ f(x) = \cosh(x^2), \qquad f'(x) = 2x\sinh(x^2), \qquad f''(x) = 2\sinh(x^2) + 4x^2\cosh(x^2) \]
Using the error formula in Key Point 8 (with $b - a = 1$)
\[ E = -\frac{1}{12}h^2\left\{2\sinh(x^2) + 4x^2\cosh(x^2)\right\}, \qquad x \in [0, 1] \]
$|E|_{\max}$ occurs when $x = 1$, so we need
\[ 0.001 > \frac{h^2}{12}\left\{2\sinh(1) + 4\cosh(1)\right\} \]
\[
\begin{aligned}
h^2 &< 0.012/\left\{2\sinh(1) + 4\cosh(1)\right\} \\
\Rightarrow\ h^2 &< 0.001408 \\
\Rightarrow\ h &< 0.037523 \\
\Rightarrow\ n &\ge 26.651 \\
\Rightarrow\ n &= 27 \text{ needed}
\end{aligned}
\]
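Task
Determine the minimum number of strips, $n$, needed to evaluate by the trapezium rule:
\[ \int_0^{\pi/4} \left\{3x^2 - 1.5\sin(2x)\right\} dx \]
such that the error is guaranteed not to exceed $\pm 0.005$.
Your solution
Answer
\[ f(x) = 3x^2 - 1.5\sin(2x), \qquad f''(x) = 6 + 6\sin(2x) \]
$|\text{Error}|$ will be maximum at $x = \dfrac{\pi}{4}$ so that $\sin(2x) = 1$.
\[ E = -\frac{(b - a)}{12}h^2 f^{(2)}(x), \qquad x \in \left[0, \frac{\pi}{4}\right] \]
\[ E = -\frac{\pi}{48}h^2\, 6\{1 + \sin(2x)\}, \qquad x \in \left[0, \frac{\pi}{4}\right] \]
\[ |E|_{\max} = \frac{\pi}{48}h^2(12) = \frac{\pi h^2}{4} \]
We need $\dfrac{\pi h^2}{4} < 0.005 \ \Rightarrow\ h^2 < \dfrac{0.02}{\pi} \ \Rightarrow\ h < 0.07979$.
Now $nh = (b - a) = \dfrac{\pi}{4}$ so $n = \dfrac{\pi}{4h}$.
We need $n > \dfrac{\pi}{4 \times 0.07979} = 9.844$ so $n = 10$ required.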
4. Other methods for approximating integrals
Here we briefly describe other methods that you may have heard, or get to hear, about. In the end
they all amount to the same sort of thing, that is we sample the integrand f at a few points in the
integration interval and then take a weighted average of all these f values. All that is needed to
implement any of these methods is the list of sampling points and the weight that should be attached
to each evaluation. Lists of these points and weights can be found in many books on the subject.
Simpson’s rule
This is based on passing a quadratic through three equally spaced points, rather than passing a
straight line through two points as we did for the simple trapezium rule. The composite Simpson’s
rule is given in the following Key Point.
Key Point 9
Composite Simpson’s Rule
The composite Simpson’s rule for approximating $\int_a^b f(x)\,dx$ is carried out as follows:
1. Choose $N$, which must be an even number of subintervals,
2. Calculate
\[ \int_a^b f(x)\,dx \approx \tfrac{1}{3}h\bigl(f_0 + 4\{f_1 + f_3 + f_5 + \cdots + f_{N-1}\} + 2\{f_2 + f_4 + f_6 + \cdots + f_{N-2}\} + f_N\bigr) \]
where
\[ h = \frac{b - a}{N}, \quad f_0 = f(a), \quad f_1 = f(a + h), \ \ldots, \ f_n = f(a + nh), \ \ldots, \]
and $f_N = f(a + Nh) = f(b)$.
The formula in Key Point 9 is slightly more complicated than the corresponding one for the composite trapezium rule. One way of remembering the rule is to learn the pattern
1 4 2 4 2 4 2 ... 4 2 4 2 4 1
which shows that the end point values are multiplied by 1, the values with odd-numbered subscripts are multiplied by 4 and the interior values with even subscripts are multiplied by 2.
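The 1 4 2 4 ... 2 4 1 pattern is easy to express in code by summing the odd-numbered and even-numbered interior values separately. The following is a minimal Python sketch of Key Point 9; the name `composite_simpson` is chosen here for illustration only.

```python
def composite_simpson(f, a, b, n):
    """Composite Simpson's rule with n (even) subintervals of width h = (b - a) / n."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number of subintervals")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))    # f_1, f_3, ..., f_{n-1}: weight 4
    even = sum(f(a + i * h) for i in range(2, n, 2))   # f_2, f_4, ..., f_{n-2}: weight 2
    return h / 3.0 * (f(a) + 4.0 * odd + 2.0 * even + f(b))


# Simpson's rule integrates cubics exactly: the integral of x^3 from 0 to 2 is 4.
print(composite_simpson(lambda x: x ** 3, 0.0, 2.0, 2))  # 4.0
```

The check on `n` reflects step 1 of the Key Point: the rule works on pairs of subintervals, so an odd `n` is rejected rather than silently mishandled.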
Example 15
Using 4 subintervals in the composite Simpson’s rule approximate
\[ \int_0^2 \cosh(x)\,dx. \]
Solution
In this case $h = (2 - 0)/4 = 0.5$.
We require $\cosh(x)$ evaluated at five $x$-values and the results are tabulated below to 6 d.p.
\[
\begin{array}{c|c}
x_n & f_n = \cosh(x_n) \\ \hline
0 & 1.000000 \\
0.5 & 1.127626 \\
1 & 1.543081 \\
1.5 & 2.352410 \\
2 & 3.762196
\end{array}
\]
It follows that
\[
\begin{aligned}
\int_0^2 \cosh(x)\,dx &\approx \tfrac{1}{3}h\bigl(f_0 + 4f_1 + 2f_2 + 4f_3 + f_4\bigr) \\
&= \tfrac{1}{3}(0.5)\bigl(1 + 4 \times 1.127626 + 2 \times 1.543081 + 4 \times 2.352410 + 3.762196\bigr) \\
&= 3.628083,
\end{aligned}
\]
where this approximation is given to 6 decimal places.
This approximation to $\int_0^2 \cosh(x)\,dx$ is closer to the true value of $\sinh(2)$ (which is 3.626860 to 6 d.p.) than we obtained when using the composite trapezium rule with the same number of subintervals.
Task
Using 4 subintervals in the composite Simpson’s rule approximate
\[ \int_1^2 \ln(x)\,dx. \]
Your solution
Answer
In this case $h = (2 - 1)/4 = 0.25$. There will be five $x$-values and the results are tabulated below to 6 d.p.
\[
\begin{array}{c|c}
x_n & f_n = \ln(x_n) \\ \hline
1.00 & 0.000000 \\
1.25 & 0.223144 \\
1.50 & 0.405465 \\
1.75 & 0.559616 \\
2.00 & 0.693147
\end{array}
\]
It follows that
\[
\begin{aligned}
\int_1^2 \ln(x)\,dx &\approx \tfrac{1}{3}h\bigl(f_0 + 4f_1 + 2f_2 + 4f_3 + f_4\bigr) \\
&= \tfrac{1}{3}(0.25)\bigl(0 + 4 \times 0.223144 + 2 \times 0.405465 + 4 \times 0.559616 + 0.693147\bigr) \\
&= 0.386260 \text{ to 6 d.p.}
\end{aligned}
\]
How good is the composite Simpson’s rule?

Earlier (Key Point 8) we saw a formula for an upper bound on the error in the composite trapezium method. A corresponding result for the composite Simpson’s rule exists and is given in the following Key Point.
Key Point 10
Error in Composite Simpson’s Rule
The error in the $N$-subinterval composite Simpson’s rule approximation to $\int_a^b f(x)\,dx$ is bounded above by
\[ \max_{a \le x \le b} \left|f^{(iv)}(x)\right| \frac{(b - a)h^4}{180} \]
(Here $f^{(iv)}$ is the fourth derivative of $f$ and $h$ is the subinterval width, so $N \times h = b - a$.)
The formula in Key Point 10 can be used to decide how many subintervals to use to guarantee a specific accuracy.
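As with the trapezium rule, the bound can be rearranged to give the number of subintervals directly: substituting $h = (b - a)/N$ shows that we need $N > \bigl(\max|f^{(iv)}|\,(b - a)^5 / (180 \times \text{tolerance})\bigr)^{1/4}$, rounded up to the next even whole number. A minimal Python sketch follows; the function name `simpson_subintervals` is an illustrative choice made here, and the input values in the usage line are those of a function with $|f^{(iv)}| < 5$ on the interval from 1 to 5.

```python
import math


def simpson_subintervals(max_abs_f4, a, b, tol):
    """Smallest even N for which the Key Point 10 bound is below tol.

    max_abs_f4 is assumed to be an upper bound on |f''''(x)| for a <= x <= b.
    """
    n_min = (max_abs_f4 * (b - a) ** 5 / (180.0 * tol)) ** 0.25
    n = math.floor(n_min) + 1          # smallest whole number above n_min
    return n if n % 2 == 0 else n + 1  # Simpson's rule needs an even N


print(simpson_subintervals(5.0, 1.0, 5.0, 0.005))  # 10
```

The fourth power of $h$ in the bound is why so few subintervals are needed compared with the trapezium rule for a similar tolerance.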
Example 16
The function $f$ is known to have a fourth derivative with the property that
\[ \left|f^{(iv)}(x)\right| < 5 \]
for $x$ between 1 and 5. Determine how many subintervals are required so that the composite Simpson’s rule used to approximate
\[ \int_1^5 f(x)\,dx \]
incurs an error that is guaranteed less than 0.005.
Solution
We require that
\[ 5 \times \frac{4h^4}{180} < 0.005 \]
This implies that $h^4 < 0.045$ and therefore $h < 0.460578$.
Now $N = 4/h$ and it follows that
\[ N > 8.684741 \]
For the composite Simpson’s rule $N$ must be an even whole number and we conclude that the smallest number of subintervals which guarantees an error smaller than 0.005 is $N = 10$.
Task
The function $f$ is known to have a fourth derivative with the property that
\[ \left|f^{(iv)}(x)\right| < 12 \]
for $x$ between 2 and 6. Determine how many subintervals are required so that the composite Simpson’s rule used to approximate
\[ \int_2^6 f(x)\,dx \]
incurs an error that is guaranteed less than 0.0005.
Your solution
Answer
We require that
\[ 12 \times \frac{4h^4}{180} < 0.0005 \]
This implies that $h^4 < 0.001875$ and therefore $h < 0.208090$.
Now $N = 4/h$ and it follows that
\[ N > 19.222491 \]
$N$ must be an even whole number and we conclude that the smallest number of subintervals which guarantees an error smaller than 0.0005 is $N = 20$.
The following Task is similar to one that we saw earlier in this Section. Using the composite Simpson’s rule we can achieve greater accuracy, for a similar amount of effort, than we managed using the composite trapezium rule.
Task
It is given that the function $e^{-x^2/2}$ has a fourth derivative that is never greater than 3 in absolute value.
(a) Use this fact to determine how many subintervals are required for the composite Simpson’s rule to deliver an approximation to
\[ \int_0^1 \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx \]
that is guaranteed to have an error less than $\frac{1}{2} \times 10^{-4}$.
Your solution
Answer
We require that $\dfrac{3}{\sqrt{2\pi}}\,\dfrac{(b - a)h^4}{180} < 0.00005$.
This means that $h^4 < 0.00751988$ and therefore $h < 0.294478$. Since $N = 1/h$ we need $N > 3.40$, and because $N$ must be even, $N = 4$ is the smallest number of subintervals for which the error bound is guaranteed to be less than $\frac{1}{2} \times 10^{-4}$.
(b) Find an approximation to the integral that is in error by less than $\frac{1}{2} \times 10^{-4}$.
Your solution
Answer
In this case $h = (1 - 0)/4 = 0.25$. We require $\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ evaluated at five $x$-values and the results are tabulated below to 6 d.p.
\[
\begin{array}{c|c}
x_n & \frac{1}{\sqrt{2\pi}}\,e^{-x_n^2/2} \\ \hline
0 & 0.398942 \\
0.25 & 0.386668 \\
0.5 & 0.352065 \\
0.75 & 0.301137 \\
1 & 0.241971
\end{array}
\]
It follows that
\[
\begin{aligned}
\int_0^1 \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx &\approx \tfrac{1}{3}h\bigl(f_0 + 4f_1 + 2f_2 + 4f_3 + f_4\bigr) \\
&= \tfrac{1}{3}(0.25)\bigl(0.398942 + 4 \times 0.386668 + 2 \times 0.352065 + 4 \times 0.301137 + 0.241971\bigr) \\
&= 0.341355 \text{ to 6 d.p.}
\end{aligned}
\]
We know from part (a) that this approximation is in error by less than $\frac{1}{2} \times 10^{-4}$.
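Example 17
Find out how many strips are needed to be sure that
\[ \int_0^4 \sinh(2t)\,dt \]
is evaluated by Simpson’s rule with error less than $\pm 0.0001$.
Solution
Here $f(x) = \sinh(2x)$, so $f^{(iv)}(x) = 16\sinh(2x)$ and
\[ E = -\frac{(b - a)}{180}h^4 (16)\sinh(2x), \qquad 0 < x < 4 \]
\[ |E| \le \frac{64h^4\sinh(8)}{180} \le 0.0001 \]
\[ \Rightarrow\ h^4 \le \frac{0.0180}{64\sinh(8)} \ \Rightarrow\ h \le 0.0208421 \]
\[ nh = b - a \ \Rightarrow\ n \ge \frac{4}{0.0208421} = 191.92 \]
So $n = 192$ is needed (minimum even number).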
Plastic bottle design
Introduction
Manufacturing containers is a large and varied industry and optimum packaging can save companies
millions of pounds. Although determining the capacity of a container and amount of material needed
can be done by physical experiment, mathematical modelling provides a cost-effective and efficient
means for the designer to experiment.
Problem in words
A manufacturer is designing a new plastic bottle to contain 900 ml of fabric softener. The bottle is
circular in cross section, with a varying radius given by
\[ r = 4 + 0.5z - 0.07z^2 + 0.002z^3 \]
where z is the height above the base in cm.
(a) Find an expression for the volume of the bottle and hence show that the fill level needs to be
approximately 18 cm.
(b) If the wall thickness of the plastic is 1 mm, show that this is always small compared to the
bottle radius.
(c) Hence, find the volume of plastic required to manufacture a bottle which is 20 cm tall (include
the plastic in the base and side walls), using a numerical method.
A graph of the radius against $z$ is shown below:
[Figure 10: graph of the bottle profile, with height $z$ on the vertical axis and radius $r$ on the horizontal axis]
Calculate all lengths in centimetres.
(a) The formula for the volume of a solid of revolution, revolved round the $z$ axis between $z = 0$ and $z = d$, is $\displaystyle\int_0^d \pi r^2\,dz$. We have to evaluate this integral.
(b) To show that the thickness is small relative to the radius we need to find the minimum radius.
(c) Given that the thickness is small compared with the radius, the volume can be taken to be the surface area times the thickness. Now the surface area of the base is easy to calculate, being $\pi \times 4^2$, but we also need to calculate the surface area for the sides, which is much harder. For an element of height $dz$ this is $2\pi r \times (\text{the slant height})$ of the surface between $z$ and $z + dz$. The slant height is, analytically, $\sqrt{1 + \left(\dfrac{dr}{dz}\right)^2} \times dz$, or equivalently the distance between $(r(z), z)$ and $(r(z + dz), z + dz)$, which is easier to use numerically.
Analytically the surface area to height 20 is $\displaystyle\int_0^{20} 2\pi r \sqrt{1 + \left(\frac{dr}{dz}\right)^2}\,dz$; we shall approximate this numerically. This will give the area of the side surface.
(a) We could calculate this integral exactly, as the volume is $\displaystyle\int_0^d \pi\left(4 + 0.5z - 0.07z^2 + 0.002z^3\right)^2 dz$, but here we do this numerically (which can often be a simpler approach and possibly is so here). To do that we need to keep an eye on the likely error, and for this problem we shall ensure the error in the integrals is less than 1 ml. The formula for the error with the trapezium rule, with step $h$ and integrated from 0 to 20 (assuming from the problem that we shall not integrate over a larger range) is $\dfrac{20}{12}h^2 \max|f''|$. Doing this crudely with $f = \pi g^2$ where $g(z) = 4 + 0.5z - 0.07z^2 + 0.002z^3$ we see that
\[ |g(z)| \le 4 + 10 + 28 + 16 = 58 \quad (\text{using only positive signs and } |z| \le 20) \]
\[ |g'(z)| \le 0.5 + 0.14z + 0.006z^2 \le 0.5 + 2.8 + 2.4 = 5.7 < 6, \]
\[ |g''(z)| \le 0.14 + 0.012z \le 0.38. \]
Therefore
\[ |f''| = 2\pi\left|gg'' + (g')^2\right| \le 2\left(58 \times 0.38 + 6^2\right)\pi < 117\pi, \quad \text{so} \quad \frac{20}{12}h^2 \max|f''| \le 613h^2. \]
We need $h^2 < 1/613$, or $h < 0.0403$. We will use $h = 0.02$, and the error will be at most 0.25.
The approximation to the integral from 0 to 18 is
\[ \tfrac{1}{2}\pi g^2(0)\,0.02 + \sum_{i=1}^{899} \pi g^2(0.02i)\,0.02 + \tfrac{1}{2}\pi g^2(18)\,0.02 \]
(recalling the multiplying factor is a half for the first and last entries in the trapezium rule). This yields a value of 899.72, which is certainly within 1 ml of 900.
(b) From the graph the minimum radius looks to be about 2 at about $z = 18$. Looking more exactly (found by solving the quadratic to find the places where the derivative is zero, or by plotting the values and by inspection), the minimum is at $z = 18.93$, when $r = 1.948$ cm. So the thickness is indeed small (always less than 0.06 of the radius at all places).
(c) For the area of the side surface we shall calculate $\displaystyle\int_0^{20} 2\pi r \sqrt{1 + \left(\frac{dr}{dz}\right)^2}\,dz$ numerically, using the trapezium rule with step 0.02 as before. $\sqrt{1 + \left(\dfrac{dr}{dz}\right)^2}\,dz = \sqrt{(dz)^2 + (dr)^2}$, which we shall approximate at point $z_n$ by $\sqrt{(z_{n+1} - z_n)^2 + (r_{n+1} - r_n)^2}$, so evaluating $r(z)$ at intervals of 0.02 gives the approximation
\[
\begin{aligned}
&\pi r(0)\sqrt{(0.02)^2 + \bigl(r(0.02) - r(0)\bigr)^2} + \sum_{i=1}^{999} 2\pi r(0.02i)\sqrt{(0.02)^2 + \bigl(r(0.02(i + 1)) - r(0.02i)\bigr)^2} \\
&\quad + \pi r(20)\sqrt{(0.02)^2 + \bigl(r(20) - r(19.98)\bigr)^2}.
\end{aligned}
\]
Calculating this gives 473 cm$^2$. Approximating the analytical expression by a direct numerical calculation gives 474 cm$^2$. (The answer is between 473.5 and 473.6 cm$^2$, so this variation is understandable and does not indicate an error.) The bottom surface area is $16\pi = 50.3$ cm$^2$, so the total surface area we may take to be $474 + 50 = 524$ cm$^2$, and hence the volume of plastic is $524 \times 0.1 = 52.4$ cm$^3$.
An alternative to using the trapezium rule is Simpson’s rule, which will require many fewer steps.
When using a computer program such as Microsoft Excel, having an efficient method may not be
important for a small problem, but it could be significant when many calculations are needed or
computational power is limited (for example, when using a programmable calculator).
The reader is invited to repeat the calculations for (a) and (c) using Simpson’s rule.
The analytical answer to (a) is given by
\[ \int_0^{18} \pi\left(16 + 4z - 0.31z^2 - 0.054z^3 + 0.0069z^4 - 0.00028z^5 + 0.000004z^6\right) dz \]
which gives 899.7223 to 4 d.p.
Exercises
1. Using 4 subintervals in the composite trapezium rule approximate
\[ \int_1^5 \sqrt{x}\, dx. \]
2. The function $f$ is known to have a second derivative with the property that
\[ |f''(x)| < 12 \]
for $x$ between 2 and 3. Using the error bound given earlier in this Section determine how many
subintervals are required so that the composite trapezium rule used to approximate
\[ \int_2^3 f(x)\, dx \]
can be guaranteed to have an error in it that is less than 0.001.
3. Using 4 subintervals in the composite Simpson rule approximate
\[ \int_1^5 \sqrt{x}\, dx. \]
4. The function $f$ is known to have a fourth derivative with the property that
\[ |f^{(iv)}(x)| < 6 \]
for $x$ between −1 and 5. Determine how many subintervals are required so that the composite
Simpson’s rule used to approximate
\[ \int_{-1}^{5} f(x)\, dx \]
incurs an error that is less than 0.001.
5. Determine the minimum number of steps needed to guarantee an error not exceeding
±0.000001 when numerically evaluating
\[ \int_2^4 \ln(x)\, dx \]
using Simpson’s rule.
Answers
1. In this case $h = (5 - 1)/4 = 1$. We require $\sqrt{x}$ evaluated at five $x$-values and the results are
tabulated below
x_n    f_n = sqrt(x_n)
1      1
2      1.414214
3      1.732051
4      2.000000
5      2.236068
It follows that
\[ \int_1^5 \sqrt{x}\, dx \approx \frac{1}{2} h \left(f_0 + f_4 + 2\{f_1 + f_2 + f_3\}\right) \]
\[ = \frac{1}{2}(1)\left(1 + 2.236068 + 2\{1.414214 + 1.732051 + 2\}\right) \]
\[ = 6.764298. \]
2. We require that $12 \times \dfrac{(b - a)h^2}{12} < 0.001$. Since $b - a = 1$, this implies that $h < 0.0316228$.
The number of subintervals is $N = (b - a)/h = 1/h$, and it follows that $N > 31.6228$.
Clearly, N must be a whole number and we conclude that the smallest number of subintervals
which guarantees an error smaller than 0.001 is N = 32.
3. In this case $h = (5 - 1)/4 = 1$.
We require $\sqrt{x}$ evaluated at five $x$-values and the results are as tabulated in the solution to
Exercise 1. It follows that
\[ \int_1^5 \sqrt{x}\, dx \approx \frac{1}{3} h \left(f_0 + 4f_1 + 2f_2 + 4f_3 + f_4\right) \]
\[ = \frac{1}{3}(1)\left(1 + 4 \times 1.414214 + 2 \times 1.732051 + 4 \times 2.000000 + 2.236068\right) \]
\[ = 6.785675. \]
4. We require that $6 \times \dfrac{6h^4}{180} < 0.001$. This implies that $h^4 < 0.005$ and therefore $h < 0.265915$.
Now $N = 6/h$ and it follows that $N > 22.563619$. We know that N must be an even whole
number and we conclude that the smallest number of subintervals which guarantees an error
smaller than 0.001 is N = 24.
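5. Here $f(x) = \ln(x)$, so that $f^{(4)}(x) = -\dfrac{6}{x^4}$. With $a = 2$ and $b = 4$,
\[ \text{Error} = -\frac{(b - a)h^4 f^{(4)}(x)}{180}, \]
\[ |E| = \frac{2h^4 (6/x^4)}{180}, \quad x \in [2, 4], \]
\[ |E|_{\max} = \frac{h^4}{15} \cdot \frac{1}{2^4} \le 0.000001 \]
\[ \Rightarrow \quad h^4 \le 15 \times 2^4 \times 0.000001 \quad \Rightarrow \quad h \le 0.124467. \]
Now $nh = (b - a)$ so
\[ n \ge \frac{2}{0.124467} \quad \Rightarrow \quad n \ge 16.069568 \quad \Rightarrow \quad n = 18 \text{ (minimum even number)}. \]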
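The answers to Exercises 1 and 3 can be reproduced numerically. The sketch below is one possible Python implementation of the composite trapezium and Simpson rules; the function names `trapezium` and `simpson` are illustrative choices, not part of the workbook.

```python
import math

def trapezium(f, a, b, n):
    """Composite trapezium rule with n equal subintervals."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)

def simpson(f, a, b, n):
    """Composite Simpson rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))    # weights of 4
    even = sum(f(a + i * h) for i in range(2, n, 2))   # weights of 2
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

trap_value = trapezium(math.sqrt, 1.0, 5.0, 4)
simp_value = simpson(math.sqrt, 1.0, 5.0, 4)
print(f"{trap_value:.6f} {simp_value:.6f}")
```

With 4 subintervals these should match the values 6.764298 and 6.785675 found by hand, up to rounding in the tabulated square roots.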
Numerical Differentiation 31.3
Introduction
In this Section we will look at ways in which derivatives of a function may be approximated numerically.

Prerequisites
• review previous material concerning differentiation

Learning Outcomes
• obtain numerical approximations to the first and second derivatives of certain functions
1. Numerical differentiation
This Section deals with ways of numerically approximating derivatives of functions. One reason for
dealing with this now is that we will use it briefly in the next Section. But as we shall see in these
next few pages, the technique is useful in itself.
2. First derivatives
Our aim is to approximate the slope of a curve f at a particular point x = a in terms of f(a) and
the value of f at a nearby point where x = a + h. The shorter broken line in Figure 11 may be thought
of as giving a reasonable approximation to the required slope (shown by the longer broken line), if h
is small enough.
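[Figure 11: the curve f near x = a. The shorter broken line joins the points on the curve at x = a and x = a + h, and its slope approximates $f'(a)$; the longer broken line has slope $f'(a)$.]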
So we might approximate
\[ f'(a) \approx \text{slope of short broken line} = \frac{\text{difference in the } y\text{-values}}{\text{difference in the } x\text{-values}} = \frac{f(a + h) - f(a)}{h}. \]
This is called a one-sided difference or forward difference approximation to the derivative of f .
A second version of this arises on considering a point to the left of a, rather than to the right as we
did above. In this case we obtain the approximation
\[ f'(a) \approx \frac{f(a) - f(a - h)}{h} \]
This is another one-sided difference, called a backward difference, approximation to $f'(a)$.
A third method for approximating the first derivative of f can be seen in Figure 12.
[Figure 12: the curve f near x = a. The shorter broken line joins the points on the curve at x = a − h and x = a + h, and its slope approximates $f'(a)$; the longer broken line has slope $f'(a)$.]
Here we approximate as follows
\[ f'(a) \approx \text{slope of short broken line} = \frac{\text{difference in the } y\text{-values}}{\text{difference in the } x\text{-values}} = \frac{f(a + h) - f(a - h)}{2h} \]
This is called a central difference approximation to $f'(a)$.
Key Point 11
First Derivative Approximations
Three approximations to the derivative $f'(a)$ are
1. the one-sided (forward) difference $\dfrac{f(a + h) - f(a)}{h}$
2. the one-sided (backward) difference $\dfrac{f(a) - f(a - h)}{h}$
3. the central difference $\dfrac{f(a + h) - f(a - h)}{2h}$
In practice, the central difference formula is the most accurate.
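The three formulae translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, and the test function cos(x) at a = π/3 with h = 0.1 is simply a convenient case where the exact derivative, −sin(π/3), is known.

```python
import math

def forward_difference(f, a, h):
    """One-sided (forward) difference approximation to f'(a)."""
    return (f(a + h) - f(a)) / h

def backward_difference(f, a, h):
    """One-sided (backward) difference approximation to f'(a)."""
    return (f(a) - f(a - h)) / h

def central_difference(f, a, h):
    """Central difference approximation to f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

a, h = math.pi / 3, 0.1
exact = -math.sin(a)
fwd = forward_difference(math.cos, a, h)
bwd = backward_difference(math.cos, a, h)
cen = central_difference(math.cos, a, h)
print(f"forward  {fwd:.8f}  error {abs(fwd - exact):.6f}")
print(f"backward {bwd:.8f}  error {abs(bwd - exact):.6f}")
print(f"central  {cen:.8f}  error {abs(cen - exact):.6f}")
```

For the same h, the central difference error should be noticeably smaller than either one-sided error.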
These first, rather artificial, examples will help fix our ideas before we move on to more realistic
applications.
Example 18
Use a forward difference, and the values of h shown, to approximate the derivative
of cos(x) at x = π/3.
(a) h = 0.1 (b) h = 0.01 (c) h = 0.001 (d) h = 0.0001
Work to 8 decimal places throughout.
Solution
(a) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a)}{h} = \dfrac{0.41104381 - 0.5}{0.1} = -0.88956192$
(b) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a)}{h} = \dfrac{0.49131489 - 0.5}{0.01} = -0.86851095$
(c) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a)}{h} = \dfrac{0.49913372 - 0.5}{0.001} = -0.86627526$
(d) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a)}{h} = \dfrac{0.49991339 - 0.5}{0.0001} = -0.86605040$
One advantage of doing a simple example first is that we can compare these approximations with
the ‘exact’ value which is
\[ f'(a) = -\sin(\pi/3) = -\frac{\sqrt{3}}{2} = -0.86602540 \text{ to 8 d.p.} \]
Note that the accuracy levels of the four approximations in Example 18 are:
(a) 1 d.p. (b) 2 d.p. (c) 3 d.p. (d) 3 d.p. (almost 4 d.p.)
The errors to 6 d.p. are:
(a) 0.023537 (b) 0.002486 (c) 0.000250 (d) 0.000025
Notice that the errors reduce by about a factor of 10 each time.
Example 19
Use a central difference, and the value of h shown, to approximate the derivative
of cos(x) at x = π/3.
(a) h = 0.1 (b) h = 0.01 (c) h = 0.001 (d) h = 0.0001
Work to 8 decimal places throughout.
Solution
(a) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a - h)}{2h} = \dfrac{0.41104381 - 0.58396036}{0.2} = -0.86458275$
(b) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a - h)}{2h} = \dfrac{0.49131489 - 0.50863511}{0.02} = -0.86601097$
(c) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a - h)}{2h} = \dfrac{0.49913372 - 0.50086578}{0.002} = -0.86602526$
(d) $f'(a) \approx \dfrac{\cos(a + h) - \cos(a - h)}{2h} = \dfrac{0.49991339 - 0.50008660}{0.0002} = -0.86602540$
This time successive approximations generally have two extra accurate decimal places indicating a
superior formula. This is illustrated again in the following Task.
Task
Let $f(x) = \ln(x)$ and $a = 3$. Using both a forward difference and a central
difference, and working to 8 decimal places, approximate $f'(a)$ using h = 0.1 and
h = 0.01.
(Note that this is another example where we can work out the exact answer, which
in this case is $\frac{1}{3}$.)
Your solution
Answer
Using the forward difference we find, for h = 0.1
\[ f'(a) \approx \frac{\ln(a + h) - \ln(a)}{h} = \frac{1.13140211 - 1.09861229}{0.1} = 0.32789823 \]
and for h = 0.01 we obtain
\[ f'(a) \approx \frac{\ln(a + h) - \ln(a)}{h} = \frac{1.10194008 - 1.09861229}{0.01} = 0.33277901 \]
Using central differences the two approximations to $f'(a)$ are
\[ f'(a) \approx \frac{\ln(a + h) - \ln(a - h)}{2h} = \frac{1.13140211 - 1.06471074}{0.2} = 0.33345687 \]
and
\[ f'(a) \approx \frac{\ln(a + h) - \ln(a - h)}{2h} = \frac{1.10194008 - 1.09527339}{0.02} = 0.33333457 \]
The accurate answer is, of course, 0.33333333
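The different rates at which the two formulae improve can be seen by comparing errors as h is reduced by a factor of 10. The Python sketch below does this for f(x) = ln(x) at a = 3; the helper names are illustrative assumptions.

```python
import math

def forward_difference(f, a, h):
    return (f(a + h) - f(a)) / h

def central_difference(f, a, h):
    return (f(a + h) - f(a - h)) / (2 * h)

a = 3.0
exact = 1.0 / a   # derivative of ln(x) at x = 3

errors = {}
for h in (0.1, 0.01):
    fwd_err = abs(forward_difference(math.log, a, h) - exact)
    cen_err = abs(central_difference(math.log, a, h) - exact)
    errors[h] = (fwd_err, cen_err)
    print(f"h = {h}: forward error {fwd_err:.8f}, central error {cen_err:.8f}")

# Ratio of errors when h shrinks by a factor of 10.
forward_ratio = errors[0.1][0] / errors[0.01][0]
central_ratio = errors[0.1][1] / errors[0.01][1]
print(f"forward ratio {forward_ratio:.1f}, central ratio {central_ratio:.1f}")
```

The forward difference error falls by a factor of roughly 10, while the central difference error falls by a factor of roughly 100, which is consistent with the extra accurate decimal places seen above.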
There is clearly little point in studying this technique if all we ever do is approximate quantities we
could find exactly in another way. The following example is one in which this so-called differencing
method is the best approach.
Example 20
The distance x of a runner from a fixed point is measured (in metres) at intervals
of half a second. The data obtained are
t 0.0 0.5 1.0 1.5 2.0
x 0.00 3.65 6.80 9.90 12.15
Use central differences to approximate the runner’s velocity at times t = 0.5 s and
t = 1.25 s.
Solution
Our aim here is to approximate $x'(t)$. The choice of h is dictated by the available data given in the
table.
Using data with t = 0.5 s at its centre we obtain
\[ x'(0.5) \approx \frac{x(1.0) - x(0.0)}{2 \times 0.5} = 6.80 \text{ m s}^{-1}. \]
Data centred at t = 1.25 s gives us the approximation
\[ x'(1.25) \approx \frac{x(1.5) - x(1.0)}{2 \times 0.25} = 6.20 \text{ m s}^{-1}. \]
Note the value of h used.
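When the data come from a table, the central difference is simply the change in x divided by the change in t between two samples placed symmetrically about the time of interest. A minimal Python sketch follows; the list names and the helper `central_velocity` are hypothetical.

```python
# Measured data from the example: times (s) and distances (m).
t = [0.0, 0.5, 1.0, 1.5, 2.0]
x = [0.00, 3.65, 6.80, 9.90, 12.15]

def central_velocity(times, positions, i_left, i_right):
    """Central difference using samples i_left and i_right.

    The estimate applies at the time midway between the two samples,
    and 2h is the time separation between them.
    """
    return (positions[i_right] - positions[i_left]) / (times[i_right] - times[i_left])

v_at_0_5 = central_velocity(t, x, 0, 2)    # centred at t = 0.5 s, h = 0.5
v_at_1_25 = central_velocity(t, x, 2, 3)   # centred at t = 1.25 s, h = 0.25
print(f"{v_at_0_5:.2f} {v_at_1_25:.2f}")
```

Choosing which pair of samples to use is what fixes h: the pair must straddle the required time symmetrically.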
Task
The velocity v (in m s$^{-1}$) of a rocket measured at half second intervals is
t 0.0 0.5 1.0 1.5 2.0
v 0.000 11.860 26.335 41.075 59.051
Use central differences to approximate the acceleration of the rocket at times
t = 1.0 s and t = 1.75 s.
Your solution
Answer
\[ v'(1.0) \approx \frac{v(1.5) - v(0.5)}{1.0} = 29.215 \text{ m s}^{-2}. \]
Data centred at t = 1.75 s gives us the approximation
\[ v'(1.75) \approx \frac{v(2.0) - v(1.5)}{0.5} = 35.952 \text{ m s}^{-2}. \]
3. Second derivatives
An approach which has been found to work well for second derivatives involves applying the notion
of a central difference three times. We begin with
\[ f''(a) \approx \frac{f'(a + \tfrac{1}{2}h) - f'(a - \tfrac{1}{2}h)}{h}. \]
Next we approximate the two derivatives in the numerator of this expression using central differences
as follows:
\[ f'(a + \tfrac{1}{2}h) \approx \frac{f(a + h) - f(a)}{h} \quad \text{and} \quad f'(a - \tfrac{1}{2}h) \approx \frac{f(a) - f(a - h)}{h}. \]
Combining these three results gives
\[ f''(a) \approx \frac{f'(a + \tfrac{1}{2}h) - f'(a - \tfrac{1}{2}h)}{h} \]
\[ \approx \frac{1}{h}\left\{ \frac{f(a + h) - f(a)}{h} - \frac{f(a) - f(a - h)}{h} \right\} \]
\[ = \frac{f(a + h) - 2f(a) + f(a - h)}{h^2} \]
Key Point 12
Second Derivative Approximation
A central difference approximation to the second derivative $f''(a)$ is
\[ f''(a) \approx \frac{f(a + h) - 2f(a) + f(a - h)}{h^2} \]
Example 21
The distance x of a runner from a fixed point is measured (in metres) at intervals
of half a second. The data obtained are
t 0.0 0.5 1.0 1.5 2.0
x 0.00 3.65 6.80 9.90 12.15
Use a central difference to approximate the runner’s acceleration at t = 1.5 s.
Solution
Our aim here is to approximate $x''(t)$.
\[ x''(1.5) \approx \frac{x(2.0) - 2x(1.5) + x(1.0)}{0.5^2} = -3.40 \text{ m s}^{-2}, \]
from which we see that the runner is slowing down.
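The second-derivative formula needs only three neighbouring values. The short Python sketch below applies it to the runner data; the function name `second_difference` is an illustrative choice.

```python
def second_difference(f_minus, f_centre, f_plus, h):
    """Central difference approximation to f''(a) from f(a-h), f(a), f(a+h)."""
    return (f_plus - 2 * f_centre + f_minus) / h**2

# Runner data at t = 1.0, 1.5 and 2.0 s, with spacing h = 0.5 s.
acceleration = second_difference(6.80, 9.90, 12.15, 0.5)
print(f"{acceleration:.2f}")
```

The negative result corresponds to the runner slowing down at t = 1.5 s.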
Exercises
1. Let $f(x) = \cosh(x)$ and $a = 2$. Let $h = 0.01$ and approximate $f'(a)$ using forward, backward
and central differences. Work to 8 decimal places and compare your answers with the exact
result, which is sinh(2).
2. The distance x, measured in metres, of a downhill skier from a fixed point is measured at
intervals of 0.25 s. The data gathered are
t   0   0.25   0.5    0.75   1      1.25   1.5
x   0   4.3    10.2   17.2   26.2   33.1   39.1
Use a central difference to approximate the skier’s velocity and acceleration at the times
t = 0.25 s, 0.75 s and 1.25 s. Give your answers to 1 decimal place.
Answers
1. Forward: $f'(a) \approx \dfrac{\cosh(a + h) - \cosh(a)}{h} = \dfrac{3.79865301 - 3.76219569}{0.01} = 3.64573199$
Backward: $f'(a) \approx \dfrac{\cosh(a) - \cosh(a - h)}{h} = \dfrac{3.76219569 - 3.72611459}{0.01} = 3.60810972$
Central: $f'(a) \approx \dfrac{\cosh(a + h) - \cosh(a - h)}{2h} = \dfrac{3.79865301 - 3.72611459}{0.02} = 3.62692086$
The accurate result is sinh(2) = 3.62686041.
2. Velocities at the given times approximated by a central difference are:
20.4 m s$^{-1}$, 32.0 m s$^{-1}$ and 25.8 m s$^{-1}$.
Accelerations at these times approximated by a central difference are:
25.6 m s$^{-2}$, 32.0 m s$^{-2}$ and −14.4 m s$^{-2}$.

Nonlinear Equations 31.4

Introduction
In this Section we briefly discuss nonlinear equations (what they are and what their solutions might
be) before noting that many such equations which crop up in applications cannot be solved exactly.
The remainder (and majority) of the Section then goes on to discuss methods for approximating
solutions of nonlinear equations.
Prerequisites
• understand derivatives of simple functions
• understand the quadratic formula
• understand exponentials and logarithms

Learning Outcomes
On completion you should be able to . . .
• approximate roots of equations by the bisection method and by the Newton-Raphson method
• implement an approximate Newton-Raphson method
1. Nonlinear Equations
A linear equation is one related to a straight line, for example $f(x) = mx + c$ describes a straight
line with slope m and the linear equation $f(x) = 0$, involving such an f, is easily solved to give
$x = -c/m$ (as long as $m \neq 0$). If a function f is not represented by a straight line in this way we
say it is nonlinear.
The nonlinear equation $f(x) = 0$ may have just one solution, like in the linear case, or it may have
no solutions at all, or it may have many solutions. For example if $f(x) = x^2 - 9$ then it is easy to
see that there are two solutions x = −3 and x = 3. The nonlinear equation $x^2 + 1 = 0$ has no
solutions at all (unless the application under consideration makes it appropriate to consider complex
numbers).
Our aim in this Section is to approximate (real-valued) solutions of nonlinear equations of the form
f (x) = 0. The definitions of a root of an equation and a zero of a function have been gathered
together in Key Point 13.
Key Point 13
If the value x is such that f (x) = 0 we say that
1. x is a root of the equation f (x) = 0
2. x is a zero of the function f .
Example 22
Find any (real valued) zeros of the following functions. (Give 3 decimal places if
you are unable to give an exact numerical value.)
(a) $f(x) = x^2 + x - 20$ (b) $f(x) = x^2 - 7x + 5$ (c) $f(x) = 2^x - 3$
(d) $f(x) = e^x + 1$ (e) $f(x) = \sin(x)$
Solution
(a) This quadratic factorises easily into f (x) = (x − 4)(x + 5) and so the two zeros of this f are
x = 4, x = −5.
(b) The nonlinear equation $x^2 - 7x + 5 = 0$ requires the quadratic formula and we find that the
two zeros of this f are
\[ x = \frac{7 \pm \sqrt{7^2 - 4 \times 1 \times 5}}{2} = \frac{7 \pm \sqrt{29}}{2} \]
which are equal to x = 0.807 and x = 6.193, to 3 decimal places.
(c) Using the natural logarithm function we see that
\[ x \ln(2) = \ln(3) \]
from which it follows that $x = \ln(3)/\ln(2) = 1.585$, to 3 decimal places.
(d) This f has no zeros because $e^x + 1$ is always positive.
(e) sin(x) has an infinite number of zeros at $x = 0, \pm\pi, \pm 2\pi, \pm 3\pi, \ldots$. To 3 decimal places these
are $x = 0.000, \pm 3.142, \pm 6.283, \pm 9.425, \ldots$.
Task
Find any (real valued) zeros of the following functions.
(a) $f(x) = x^2 + 2x - 15$, (b) $f(x) = x^2 - 3x + 3$,
(c) $f(x) = \ln(x) - 2$, (d) $f(x) = \cos(x)$.
For parts (a) to (c) give your answers to 3 decimal places if you cannot give an
exact answer; your answers to part (d) may be left in terms of π.
Your solution
Answer
(a) This quadratic factorises easily into f (x) = (x − 3)(x + 5) and so the two zeros of this
f are x = 3, x = −5.
(b) The equation $x^2 - 3x + 3 = 0$ requires the quadratic formula and the two zeros of this
f are
\[ x = \frac{3 \pm \sqrt{3^2 - 4 \times 1 \times 3}}{2} = \frac{3 \pm \sqrt{-3}}{2} \]
which are complex values. This f has no real zeros.
(c) Solving ln(x) = 2 gives $x = e^2 = 7.389$, to 3 decimal places.
(d) cos(x) has an infinite number of zeros at $x = \dfrac{\pi}{2},\ \dfrac{\pi}{2} \pm \pi,\ \dfrac{\pi}{2} \pm 2\pi, \ldots$.
Many functions that crop up in engineering applications do not lend themselves to finding zeros
directly as was achieved in the examples above. Instead we approximate zeros of functions, and this
Section now goes on to describe some ways of doing this. Some of what follows will involve revision
of material you have seen in Workbook 12 concerning Applications of Differentiation.
2. The bisection method

Suppose that, by trial and error for example, we know that a single zero of some function f lies
between x = a and x = b. The root is said to be bracketed by a and b. This must mean that f (a)
and f (b) are of opposite signs, that is that f (a)f (b) < 0.
Example 23
The single positive zero of the function $f(x) = x \tanh(\tfrac{1}{2}x) - 1$ models the wave
number of water waves at a certain frequency in water of depth 0.5 (measured
in some units we need not worry about here). Find two points which bracket the
zero of f.
Solution
We simply evaluate f at a selection of x-values.
x      f(x) = x tanh(x/2) − 1
0      0 × tanh(0) − 1 = −1
0.5    0.5 × tanh(0.25) − 1 = 0.5 × 0.2449 − 1 = −0.8775
1      1 × tanh(0.5) − 1 = 1 × 0.4621 − 1 = −0.5379
1.5    1.5 × tanh(0.75) − 1 = 1.5 × 0.6351 − 1 = −0.0473
2      2 × tanh(1) − 1 = 2 × 0.7616 − 1 = 0.5232
From this we can see that f changes sign between 1.5 and 2. Thus we can take a = 1.5 and b = 2
as the bracketing points. That is, the zero of f is in the bracketing interval 1.5 < x < 2.
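The search for a sign change can be automated. The following Python sketch scans equally spaced x-values and returns the first adjacent pair at which f changes sign; the helper `find_bracket` and its arguments are assumptions made for illustration.

```python
import math

def f(x):
    """Wave-number function from the example."""
    return x * math.tanh(0.5 * x) - 1

def find_bracket(func, start, stop, step):
    """Return the first adjacent pair (a, b) where func changes sign, else None."""
    a = start
    fa = func(a)
    while a + step <= stop + 1e-12:      # small allowance for rounding in the sum
        b = a + step
        fb = func(b)
        if fa * fb < 0:                  # opposite signs: a zero is bracketed
            return a, b
        a, fa = b, fb
    return None

bracket = find_bracket(f, 0.0, 2.0, 0.5)
print(bracket)
```

A scan like this only detects zeros where f crosses the axis between sample points, so the step should be small enough that two zeros do not fall in the same subinterval.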
Task
The function $f(x) = \cos(x) - x$ has a single positive zero. Find bracketing points
a and b for the zero of f. Arrange for the difference between a and b to be equal
to $\frac{1}{2}$.
(NB - be careful to use radians on your calculator!)
Your solution
Answer
We evaluate f for a range of values:
x f (x)
0 1
0.5 0.37758
1 −0.459698
Clearly f changes sign between the bracketing values a = 0.5 and b = 1.
(Other answers are valid of course, it depends which values of f you tried.)
The aim with the bisection method is to repeatedly reduce the width of the bracketing interval
a < x < b so that it “pinches” the required zero of f to some desired accuracy. We begin by
describing one iteration of the bisection method in detail.
Let $m = \tfrac{1}{2}(a + b)$, the mid-point of the interval a < x < b. All we need to do now is to see in
which half (the left or the right) of the interval a < x < b the zero lies. We evaluate f(m). There
is a (very slight) chance that f(m) = 0, in which case our job is done and we have found the zero
of f. Much more likely is that we will be in one of the two situations shown in Figure 13 below. If
f(m)f(b) < 0 then we are in the situation shown in (a) and we replace a < x < b with the smaller
bracketing interval m < x < b. If, on the other hand, f(a)f(m) < 0 then we are in the situation
shown in (b) and we replace a < x < b with the smaller bracketing interval a < x < m.
[Figure 13: two sketches of f on the interval a < x < b with mid-point m. In (a) the zero lies between m and b; in (b) the zero lies between a and m.]
Either way, we now have a bracketing interval that is half the size of the one we started with. We
have carried out one iteration of the bisection method. By successively reapplying this approach we
can make the bracketing interval as small as we wish.
Example 24
Carry out one iteration of the bisection method so as to halve the width of the
bracketing interval 1.5 < x < 2 for
\[ f(x) = x \tanh(\tfrac{1}{2}x) - 1. \]
Solution
The mid-point of the bracketing interval is $m = \tfrac{1}{2}(a + b) = \tfrac{1}{2}(1.5 + 2) = 1.75$. We evaluate
\[ f(m) = 1.75 \times \tanh(\tfrac{1}{2} \times 1.75) - 1 = 0.2318, \]
to 4 decimal places. We found earlier (Example 23) that f(a) < 0 and f(b) > 0. The fact
that f(m) is of the opposite sign to f(a) means that the zero of f lies in the bracketing interval
1.5 < x < 1.75.
Task
Carry out one iteration of the bisection method so as to halve the width of the
bracketing interval 0.5 < x < 1 for
f (x) = cos(x) − x.
Your solution
Answer
Here a = 0.5, b = 1. The mid-point of the bracketing interval is $m = \tfrac{1}{2}(a + b) = \tfrac{1}{2}(0.5 + 1) = 0.75$.
We evaluate
\[ f(m) = \cos(0.75) - 0.75 = -0.0183 \]
We found earlier (in the previous Task) that f(a) > 0 and f(b) < 0. The fact that f(m) is of the
opposite sign to f(a) means that the zero of f lies in the bracketing interval 0.5 < x < 0.75.
So we have a way of halving the size of the bracketing interval. By repeatedly applying this approach
we can make the interval smaller and smaller.
The general procedure, involving (possibly) many iterations, is best described as an algorithm:
1. Choose an error tolerance.
2. Let $m = \tfrac{1}{2}(a + b)$, the mid-point of the bracketing interval.
3. There are three possibilities:
(a) f(m) = 0, this is very unlikely in general, but if it does happen then we have found the
zero of f and we can go to step 7,
(b) the zero is between m and b,
(c) the zero is between a and m.
4. If the zero is between m and b, that is if f(m)f(b) < 0 (as in Figure 13(a)) then let a = m.
5. Otherwise the zero must be between a and m (as in Figure 13(b)) so let b = m.
6. If b − a is greater than the required tolerance then go to step 2.
7. End.
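The algorithm above can be sketched in Python as follows. This is one possible implementation under the stated steps, not a definitive one; the function name `bisect` and the choice to return the mid-point of the final interval together with an iteration count are assumptions made here.

```python
import math

def f(x):
    """Wave-number function used in the examples of this Section."""
    return x * math.tanh(0.5 * x) - 1

def bisect(func, a, b, tol):
    """Bisection: halve [a, b] until b - a <= tol.

    Returns (approximate zero, number of iterations performed).
    """
    fb = func(b)
    if func(a) * fb >= 0:
        raise ValueError("a and b must bracket a zero: f(a) f(b) < 0")
    iterations = 0
    while b - a > tol:                 # step 6: repeat while interval too wide
        m = 0.5 * (a + b)              # step 2: mid-point
        fm = func(m)
        iterations += 1
        if fm == 0:                    # step 3(a): landed exactly on the zero
            return m, iterations
        if fm * fb < 0:                # step 4: zero between m and b
            a = m
        else:                          # step 5: zero between a and m
            b, fb = m, fm
    return 0.5 * (a + b), iterations

root, count = bisect(f, 1.5, 2.0, 1e-6)
print(f"{root:.6f} after {count} iterations")
```

Only the sign of f(m) is used at each step, which is why the number of iterations depends on the initial interval and the tolerance but not on the function itself.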
One feature of this method is that we can predict in advance how much effort is required to achieve
a certain level of accuracy.
Example 25
A given problem using the bisection method starts with the bracketing points
a = 1.5 and b = 2. How many iterations will be required so that the error in the
approximation is less than $\tfrac{1}{2} \times 10^{-6}$?
Solution
Before we carry out any iterations we can write that the zero to be approximated is 1.75 ± 0.25 so
that the maximum magnitude of the error in 1.75 may be taken to be equal to 0.25.
Each successive iteration will halve the size of the error, so that after n iterations the error is equal
to
\[ \frac{1}{2^n} \times 0.25 \]
We require that this quantity be less than $\tfrac{1}{2} \times 10^{-6}$. Now,
\[ \frac{1}{2^n} \times 0.25 < \frac{1}{2} \times 10^{-6} \quad \text{implies that} \quad 2^n > \frac{1}{2} \times 10^{6}. \]
The smallest value of n which satisfies this inequality can be found by trial and error, or by using
logarithms to see that $n > (\ln(\tfrac{1}{2}) + 6\ln(10))/\ln(2)$. Either way, the smallest integer which will do
the trick is
n = 19.
It takes 19 iterations of the bisection method to ensure the required accuracy.
Task
A function f is known to have a single zero between the points a = 3.2 and b = 4.
If these values were used as the initial bracketing points in an implementation of
the bisection method, how many iterations would be required to ensure an error
less than $\tfrac{1}{2} \times 10^{-3}$?
Your solution
Answer
We require that
\[ \frac{1}{2^n} \times \frac{4 - 3.2}{2} < \frac{1}{2} \times 10^{-3} \]
or, after a little rearranging,
\[ 2^n > \frac{4}{5} \times 10^{3}. \]
The smallest value of n which satisfies this is n = 10. (This can be found by trial-and-error or by
using logarithms.)
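The trial-and-error search for n is easily mechanised. The Python sketch below repeatedly halves the initial error bound (b − a)/2 until it falls below the required level; the function name `bisection_iterations` is a hypothetical choice.

```python
def bisection_iterations(a, b, max_error):
    """Smallest n for which (b - a) / 2 / 2**n is less than max_error."""
    error = (b - a) / 2      # error bound for the mid-point before any iteration
    n = 0
    while error >= max_error:
        error /= 2           # each bisection iteration halves the bound
        n += 1
    return n

n_example = bisection_iterations(1.5, 2.0, 0.5e-6)
n_task = bisection_iterations(3.2, 4.0, 0.5e-3)
print(n_example, n_task)
```

Both counts can be compared with the values obtained above using logarithms.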
Pros and cons of the bisection method
Pros
• the method is easy to understand and remember
• the method always works (once you find values a and b which bracket a single zero)
• the method allows us to work out how many iterations it will take to achieve a given error
tolerance because we know that the interval will exactly halve at each step
Cons
• the method is very slow
• the method cannot find roots where the curve just touches the x-axis but does not
cross it (e.g. double roots)
The slowness of the bisection method will not be a surprise now that you have worked through an
example or two! Significant effort is involved in evaluating f and then all we do is look at this f -value
and see whether it is positive or negative! We are throwing away hard won information.
Let us be realistic here: the slowness of the bisection method hardly matters if all we are saying is
that it takes a few more fractions of a second of computing time to finish, when compared with a
competing approach. But there are applications in which f may be very expensive (that is, slow) to
calculate and there are applications where engineers need to find zeros of a function many thousands
of times. (Coastal engineers, for example, may employ mathematical wave models that involve finding
the wave number we saw in Example 23 at many different water depths.) It is quite possible that
you will encounter applications where the bisection method is just not good enough.
3. The Newton-Raphson method

You may recall (e.g. 13.3) that the Newton-Raphson method (often simply called Newton’s
method) for approximating a zero of the function f is given by
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]
where $f'$ denotes the first derivative of f and where $x_0$ is an initial guess to the zero of f. A graphical
way of interpreting how this method works is shown in Figure 14.
[Figure 14: successive approximations $x_0, x_1, x_2, x_3$ marked on the x-axis, each found where the tangent to the curve at the previous approximation meets the x-axis.]
At each approximation to the zero of f we extrapolate so that the tangent to the curve meets the
x-axis. This point on the x-axis is the new approximation to the zero of f. As is clear from both
the figure and the mathematical statement of the method above, we require that $f'(x_n) \neq 0$ for
n = 0, 1, 2, . . . .
Example 26
Let us consider the example we met earlier in Example 24. We know that the
single positive zero of
\[ f(x) = x \tanh(\tfrac{1}{2}x) - 1 \]
lies between 1.5 and 2. Use the Newton-Raphson method to approximate the zero
of f.
Solution
We must work out the derivative of f to use Newton-Raphson. Now
\[ f'(x) = \tanh(\tfrac{1}{2}x) + x\left(\tfrac{1}{2}\,\mathrm{sech}^2(\tfrac{1}{2}x)\right) \]
on differentiating a product and recalling that $\dfrac{d}{dx}\tanh(x) = \mathrm{sech}^2(x)$. (To evaluate sech on a
calculator recall that $\mathrm{sech}(x) = \dfrac{1}{\cosh(x)}$.)
We must choose a starting value $x_0$ for the iteration and, given that we know the zero to be between
1.5 and 2, we take $x_0 = 1.75$. The first iteration of Newton-Raphson gives
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 1.75 - \frac{f(1.75)}{f'(1.75)} = 1.75 - \frac{0.231835}{1.145358} = 1.547587, \]
where 6 decimal places are shown. The second iteration gives
\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 1.547587 - \frac{f(1.547587)}{f'(1.547587)} = 1.547587 - \frac{0.004585}{1.09687} = 1.543407. \]
Clearly this method lends itself to implementation on a computer and, for example, using a
spreadsheet package, it is not hard to compute a few more iterations. Here is output from Microsoft Excel
where we have included the two lines of hand-calculation above:
n   x_n        f(x_n)     f'(x_n)    x_{n+1}
0   1.75       0.231835   1.145358   1.547587
1   1.547587   0.004585   1.09687    1.543407
2   1.543407   2.52E-06   1.095662   1.543405
3   1.543405   7.69E-13   1.095661   1.543405
4   1.543405   0          1.095661   1.543405
and all subsequent lines are equal to the last line here. The method has converged (very quickly!)
to 1.543405, to six decimal places.
Earlier, in Example 25, we found that the bisection method would require 19 iterations to achieve 6
decimal place accuracy. The Newton-Raphson method gave an answer good to this number of places
in just two or three iterations.
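The spreadsheet calculation can equally be written as a short program. The Python sketch below is a minimal illustration of the Newton-Raphson iteration for this f; the function names, the stopping tolerance and the iteration limit are assumptions made here rather than anything prescribed by the method.

```python
import math

def f(x):
    return x * math.tanh(0.5 * x) - 1

def f_prime(x):
    # Product rule, with sech^2(u) written as 1 / cosh(u)^2.
    return math.tanh(0.5 * x) + 0.5 * x / math.cosh(0.5 * x) ** 2

def newton(func, dfunc, x0, tol=1e-12, max_iter=50):
    """Return the list of iterates x0, x1, x2, ... produced by Newton-Raphson."""
    history = [x0]
    for _ in range(max_iter):
        x = history[-1]
        slope = dfunc(x)
        if slope == 0:
            raise ZeroDivisionError("f'(x_n) = 0: Newton-Raphson cannot continue")
        history.append(x - func(x) / slope)
        if abs(history[-1] - x) < tol:   # successive iterates agree: stop
            break
    return history

iterates = newton(f, f_prime, 1.75)
for n, x in enumerate(iterates):
    print(n, f"{x:.6f}")
```

The first few printed iterates should agree with the rows of the table above.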
Task
Use the starting value x0 = 0 in an implementation of the Newton-Raphson
method for approximating the zero of
f (x) = cos(x) − x.
(If you are doing these calculations by hand then just perform two or three itera-
tions. Don’t forget to use radians.)
Your solution
Answer
The derivative of f is $f'(x) = -\sin(x) - 1$. The first iteration is
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 0 - \frac{1 - 0}{-0 - 1} = 1 \]
and the second iteration is
\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 1 - \frac{\cos(1) - 1}{-\sin(1) - 1} = 1 - \frac{-0.459698}{-1.841471} = 0.750364, \]
and so on. There is little to be gained in our understanding by doing more iterations by hand, but
using a spreadsheet we find that the method converges rapidly:
n   x_n        f(x_n)     f'(x_n)    x_{n+1}
0   0          1          −1         1
1   1          −0.4597    −1.84147   0.750364
2   0.750364   −0.01892   −1.6819    0.739113
3   0.739113   −4.6E-05   −1.67363   0.739085
4   0.739085   −2.8E-10   −1.67361   0.739085
5   0.739085   0          −1.67361   0.739085
It is often necessary to find zeros of polynomials when studying transfer functions. Here is a Task
involving a polynomial.
Task
The function $f(x) = x^3 + 2x + 4$ has a single zero near $x_0 = -1$. Use this value
of $x_0$ to perform two iterations of the Newton-Raphson method.
Your solution
Answer
Using the starting value $x_0 = -1$ you should find that $f(x_0) = 1$ and $f'(x_0) = 5$. This leads to
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = -1 - \frac{1}{5} = -1.2. \]
The second iteration should give you
\[ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = -1.2 - \frac{-0.128}{6.32} = -1.17975. \]
Subsequent iterations will home in on the zero of f. Using a computer spreadsheet gives:
n   x_n        f(x_n)     f'(x_n)    x_{n+1}
0   −1         1          5          −1.2
1   −1.2       −0.128     6.32       −1.17975
2   −1.17975   −0.00147   6.175408   −1.17951
3   −1.17951   −2E-07     6.173725   −1.17951
4   −1.17951   0          6.173725   −1.17951
where we have recomputed the hand calculations for the first two iterations.
We see that the method converges to the value −1.17951.
Pressure in an ideal multi-component mixture
Introduction
An ideal multi-component mixture consists of
1. n-pentane (5%)
2. n-hexane (15%)
3. n-heptane (50%)
4. n-octane (30%)
In general, the total pressure, P (Pa) of an ideal four-component mixture is related to the boiling
point, T (K) through the formula:
\[ P = x_1 p_1^* + x_2 p_2^* + x_3 p_3^* + x_4 p_4^* \]
where, for component i, the mole fraction is $x_i$ and the vapour pressure is $p_i^*$, given by the formula:
\[ p_i^* = \exp\left(A_i - \frac{B_i}{T + C_i}\right), \quad i = 1, 2, 3, 4 \]
Here $p_i^*$ is in mm Hg (1 mm Hg = 133.32 Pa), T is the absolute temperature (K) and the constants
$A_i$, $B_i$ and $C_i$ are given in the table below.
i   component   x_i    A_i       B_i       C_i
1   n-pentane   0.05   15.8333   2477.07   −39.94
2   n-hexane    0.15   15.8366   2697.55   −48.78
3   n-heptane   0.50   15.8737   2911.32   −56.51
4   n-octane    0.30   15.9426   3120.29   −63.63
Problem 1
For the liquid compositions, $x_i$ given in the table above, plot a graph of the total pressure, P (Pa)
against temperature (K) over the range 250 to 500 K.
Solution
The vapour pressure $p_i^* = \exp\left(A_i - \dfrac{B_i}{T + C_i}\right)$ is expressed in millimetres of mercury, and so its value in
pascals is 133.32 times larger. Therefore, expressed in pascals, we have
\[ P = 133.32 \sum_{i=1}^{4} x_i \exp\left(A_i - \frac{B_i}{T + C_i}\right) \]
Plotting this from T = 250 to 500 gives the following graph
[Figure 15: total pressure (Pa, axis scaled by $10^5$, from 0 to 18) plotted against temperature (K, from 250 to 500).]
Problem 2
Using the Newton-Raphson method, solve the equations to find the boiling points at total pressures
of 1, 2, 5 and 10 bars. Show the sequence of iterations and perform sufficient calculations for
convergence to three significant figures. Display these solutions on the graph of the total pressure,
P (Pa) against temperature T (K).
Solution
We wish to find T when P = 1, 2, 5 and 10 bars, that is, $10^5$, $2 \times 10^5$, $5 \times 10^5$ and $10 \times 10^5$ Pa.
Reading crude approximations to T from the graph gives a starting point for the Newton-Raphson
process. We see that for $10^5$, $2 \times 10^5$, $5 \times 10^5$ and $10 \times 10^5$ Pa, temperature T is roughly 365, 375, 460
and 485 K, respectively, so we shall use these values as the start of the iteration.
In this case it is easy to calculate the derivative of P with respect to T exactly, rather than numerically,
giving
\[ P'(T) = 133.32 \sum_{i=1}^{4} x_i \exp\left(A_i - \frac{B_i}{T + C_i}\right) \times \frac{B_i}{(T + C_i)^2} \]
Therefore to solve the equation $P(T) = y$, we set $T_0$ to be the starting value above and use the
iteration
\[ T_{n+1} = T_n - \frac{P(T_n) - y}{P'(T_n)} \]
For y = 100000 this gives the iterations
T_0   T_1        T_2        T_3        T_4
365   362.7915   362.7349   362.7349   362.7349
We conclude that, to three significant figures T = 363 K when P = 100000 Pa.
For y = 200000 the iterations are
T_0   T_1        T_2        T_3        T_4
375   390.8987   388.8270   388.7854   388.7854
For y = 500000 the iterations are
T_0   T_1        T_2        T_3        T_4        T_5
460   430.3698   430.4640   430.2824   430.2821   430.2821
For y = 1000000 the iterations are
T_0   T_1        T_2        T_3        T_4        T_5
475   469.0037   468.7875   468.7873   468.7873   468.7873
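The iteration for the boiling point can be sketched in Python using the constants from the table. The code below is illustrative only: the names `pressure`, `pressure_derivative` and `boiling_point`, the stopping tolerance and the iteration limit are assumptions made here, and only the 1 bar case is run.

```python
import math

# Mole fractions and constants for the four components (from the table).
X = [0.05, 0.15, 0.50, 0.30]
A = [15.8333, 15.8366, 15.8737, 15.9426]
B = [2477.07, 2697.55, 2911.32, 3120.29]
C = [-39.94, -48.78, -56.51, -63.63]
MMHG_TO_PA = 133.32

def pressure(T):
    """Total pressure P(T) in Pa at absolute temperature T in K."""
    return MMHG_TO_PA * sum(
        x * math.exp(a - b / (T + c)) for x, a, b, c in zip(X, A, B, C)
    )

def pressure_derivative(T):
    """Exact derivative dP/dT in Pa per K."""
    return MMHG_TO_PA * sum(
        x * math.exp(a - b / (T + c)) * b / (T + c) ** 2
        for x, a, b, c in zip(X, A, B, C)
    )

def boiling_point(target_pa, T0, tol=1e-6, max_iter=50):
    """Newton-Raphson solution of P(T) = target_pa starting from T0."""
    T = T0
    for _ in range(max_iter):
        T_next = T - (pressure(T) - target_pa) / pressure_derivative(T)
        if abs(T_next - T) < tol:
            return T_next
        T = T_next
    raise RuntimeError("Newton-Raphson did not converge")

T_1bar = boiling_point(1.0e5, 365.0)
print(f"{T_1bar:.4f}")
```

The other pressures can be handled by calling `boiling_point` with the corresponding target and starting value.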
An approximate Newton-Raphson method

The Newton-Raphson method is an excellent way of approximating zeros of a function, but it requires
you to know the derivative of f . Sometimes it is undesirable, or simply impossible, to work out the
derivative of a function and here we show a way of getting around this.
We approximate the derivative of f. From Section 31.3 we know that
\[ f'(x) \approx \frac{f(x + h) - f(x)}{h} \]
is a one-sided (or forward) approximation to $f'$ and another one, using a central difference, is
\[ f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}. \]
The advantage of the forward difference is that only one extra f -value has to be computed. If f
is especially complicated then this can be a considerable saving when compared with the central
difference which requires two extra evaluations of f . The central difference does have the advantage,
as we saw when we looked at truncation errors, of being a more accurate approximation to f 0 .
The spreadsheet program Microsoft Excel has a built in “solver” command which can use Newton’s
method. (It may be necessary to use the “Add in” feature of Excel to access the solver.) In reality
Excel has no way of working out the derivative of the function and must approximate it. Excel gives
you the option of using a forward or central difference to estimate f 0 .
We now reconsider the problem we met in Examples 24 to 26.
Example 27
We know that the single positive zero of $f(x) = x \tanh(\tfrac{1}{2}x) - 1$ lies between
1.5 and 2. Use the Newton-Raphson method, with an approximation to $f'$, to
approximate the zero of f.
Solution
There is no requirement for $f'$ this time, but the nature of this method is such that we will resort
to a computer straight away. Let us choose h = 0.1 in our approximations to the derivative.
Using the one-sided difference to approximate $f'(x)$ we obtain this sequence of results from the
spreadsheet program:
n   x_n        f(x_n)      [f(x_n + h) − f(x_n)]/h   x_{n+1}
0   1.75       0.231835    1.154355                  1.549165
1   1.549165   0.006316    1.110860                  1.543479
2   1.543479   8.16E−05    1.109359                  1.543406
3   1.543406   1.01E−06    1.109339                  1.543405
4   1.543405   1.24E−08    1.109339                  1.543405
5   1.543405   1.53E−10    1.109339                  1.543405
6   1.543405   1.89E−12    1.109339                  1.543405
7   1.543405   2.31E−14    1.109339                  1.543405
8   1.543405   0           1.109339                  1.543405
And using the (more accurate) central difference gives
n   x_n        f(x_n)      [f(x_n + h) − f(x_n − h)]/(2h)   x_{n+1}
0   1.75       0.231835    1.144649                         1.547462
1   1.547462   0.004448    1.095994                         1.543404
2   1.543404   −1E−06      1.094818                         1.543405
3   1.543405   7.95E−10    1.094819                         1.543405
4   1.543405   −6.1E−13    1.094819                         1.543405
5   1.543405   0           1.094819                         1.543405
We see that each of these approaches leads to the same value (1.543405) that we found with the
Newton-Raphson method.
Task
Use a spreadsheet to recompute the approximations shown in Example 24, for the
following values of h:
h = 0.001, 0.00001, 0.000001.
Your solution
Answer
You should find that as h decreases, the numbers get closer and closer to those shown earlier for the
Newton-Raphson method. For example, when h = 0.0000001 we find that for a one-sided difference
the results are
n   x_n        f(x_n)      [f(x_n + h) − f(x_n)]/h   x_{n+1}
0   1.75       0.231835    1.145358                  1.547587
1   1.547587   0.004585    1.096870                  1.543407
2   1.543407   2.52E−06    1.095662                  1.543405
3   1.543405   8.08E−13    1.095661                  1.543405
4   1.543405   0           1.095661                  1.543405
and those for a central difference with h = 0.0000001 are
n   x_n        f(x_n)      [f(x_n + h) − f(x_n − h)]/(2h)   x_{n+1}
0   1.75       0.231835    1.145358                         1.547587
1   1.547587   0.004585    1.096870                         1.543407
2   1.543407   2.52E−06    1.095662                         1.543405
3   1.543405   7.7E−13     1.095661                         1.543405
4   1.543405   0           1.095661                         1.543405
It is clear that these two tables very closely resemble the Newton-Raphson results seen earlier.
Exercises
1. It is given that the function
$f(x) = x^3 + 2x + 8$
has a single negative zero.
(a) Find two integers a and b which bracket the zero of f .

(b) Perform one iteration of the bisection method so as to halve the size of the bracketing
interval.
2. Consider a simple electronic circuit with an input voltage of 2.0 V, a resistor of resistance 1000
Ω and a diode. It can be shown that the voltage across the diode can be found as the single
positive zero of
$$ f(x) = 1 \times 10^{-14} \exp\left(\frac{x}{0.026}\right) - \frac{2 - x}{1000}. $$
Use one iteration of the Newton-Raphson method, and an initial value of x0 = 0.75 to show
that
x1 = 0.724983
and then work out a second iteration.
3. It is often necessary to find the zeros of polynomials as part of an analysis of transfer functions.
The function
$f(x) = x^3 + 5x - 4$
has a single zero near x0 = 1. Use this value of x0 in an implementation of the Newton-Raphson
method performing two iterations. (Work to at least 6 decimal place accuracy.)
4. The smallest positive zero of
f (x) = x tan(x) + 1
is a measure of how quickly certain evanescent water waves decay, and its value, x0 , is near 3.
Use the forward difference
$$ \frac{f(3.01) - f(3)}{0.01} $$
to estimate $f'(3)$ and use this value in an approximate version of the Newton-Raphson method
to derive one improvement on $x_0$.
Answers
1. (a) By trial and error we find that f (−2) = −4 and f (−1) = 5, from which we see that the
required bracketing interval is a < x < b where a = −2 and b = −1.
(b) For an iteration of the bisection method we find the mid-point m = −1.5. Now f (m) =
1.625 which is of the opposite sign to f (a) and hence the new smaller bracketing interval
is a < x < m.
2. The derivative of f is
$$ f'(x) = \frac{1 \times 10^{-14}}{0.026} \exp\left(\frac{x}{0.026}\right) + \frac{1}{1000}, $$
and therefore the first iteration of Newton-Raphson gives
$$ x_1 = 0.75 - \frac{0.032457}{1.297439} = 0.724983. $$
The second iteration gives
$$ x_2 = 0.724983 - \frac{0.011603}{0.496319} = 0.701605. $$
Using a spreadsheet we can work out some more iterations. The result of this process is
tabulated below
n   x_n        f(x_n)      f'(x_n)    x_{n+1}
2   0.701605   0.003942    0.202547   0.682144
3   0.682144   0.001161    0.096346   0.670092
4   0.670092   0.000230    0.060978   0.666328
5   0.666328   1.56E−05    0.052894   0.666033
6   0.666033   8.63E−08    0.052310   0.666031
7   0.666031   2.68E−12    0.052306   0.666031
8   0.666031   0           0.052306   0.666031
and we conclude that the required zero of f is equal to 0.666031, to 6 decimal places.
3. Using the starting value $x_0 = 1$ you should find that $f(x_0) = 2$ and $f'(x_0) = 8$. This leads to
$$ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 1 - \frac{2}{8} = 0.75. $$
The second iteration should give you
$$ x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 0.75 - \frac{0.171875}{6.6875} = 0.724299. $$
Subsequent iterations can be used to ‘home in’ on the zero of f and, using a computer
spreadsheet program, we find that
n   x_n        f(x_n)      f'(x_n)    x_{n+1}
2   0.724299   0.001469    6.573827   0.724076
3   0.724076   1.09E−07    6.572856   0.724076
4   0.724076   0           6.572856   0.724076
We see that the method converges to the value 0.724076.
4. We begin with
$$ f'(3) \approx \frac{f(3.01) - f(3)}{0.01} = \frac{0.02924345684}{0.01} = 2.924345684, $$
to the displayed number of decimal places, and hence an improvement on $x_0 = 3$ is
$$ x_1 = 3 - \frac{f(3)}{2.924345684} = 2.804277, $$
to 6 decimal places. (It can be shown that the root of f is 2.798386, to 6 decimal places.)
Contents 32
Numerical Initial Value Problems
32.1 Initial Value Problems
32.2 Linear Multistep Methods
32.3 Predictor-Corrector Methods
32.4 Parabolic PDEs
32.5 Hyperbolic PDEs
Learning outcomes
In this Workbook you will learn about numerical methods for approximating solutions
relating to a certain type of application area. Specifically you will see methods that
approximate solutions to differential equations.

Initial Value Problems 32.1

Introduction
Many engineering applications describe the evolution of some process with time. In order to define
such an application we require two distinct pieces of information: we need to know what the process
is and also when or where the application started.
In this Section we begin with a discussion of some of these so-called initial value problems. Then
we look at two numerical methods that can be used to approximate solutions of certain initial value
problems. These two methods will serve as useful instances of a fairly general class of methods which
we will describe in Section 32.2.
Prerequisites
• revise the trapezium method for approximating integrals in 31.2
• review the material concerning approximations to derivatives in 31.3
Learning Outcomes
• recognise an initial value problem
• implement the Euler and trapezium method to approximate the solutions of certain initial value problems
1. Initial value problems

In 19.4 we saw the following initial value problem which arises from Newton’s law of cooling
$$ \frac{d\theta}{dt} = -k(\theta - \theta_s), \qquad \theta(0) = \theta_0. $$
Here $\theta = \theta(t)$ is the temperature of some liquid at time t, $\theta_0$ is the initial temperature at t = 0 and
$\theta_s$ is the surrounding temperature. The constant of proportionality k has units $\mathrm{s}^{-1}$ and depends on the
properties of the liquid.
This initial value problem has two parts: the differential equation $d\theta/dt = -k(\theta - \theta_s)$, which models
the physical process, and the initial condition $\theta(0) = \theta_0$.
Key Point 1
An initial value problem may be made up of two components
1. A mathematical model of the process, stated in the form of a differential equation.
2. An initial value, given at some value of the independent variable.
It should be noted that there are applications in which initial value problems do not model processes
that are time dependent, but we will not dwell on this fact here.
The initial value problem above is such that we can write down an exact or analytic solution (it is
$\theta(t) = \theta_s + (\theta_0 - \theta_s)e^{-kt}$) but there are many applications where it is impossible or undesirable to
seek such a solution. The aim of this Section is to begin to describe numerical methods that can be
used to find approximate solutions of initial value problems.
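As a quick check on the analytic solution quoted above, differentiating it recovers the differential equation, and setting t = 0 recovers the initial condition:

```latex
\begin{align*}
\theta(t) &= \theta_s + (\theta_0 - \theta_s)e^{-kt} \\
\frac{d\theta}{dt} &= -k(\theta_0 - \theta_s)e^{-kt} = -k\bigl(\theta(t) - \theta_s\bigr) \\
\theta(0) &= \theta_s + (\theta_0 - \theta_s) = \theta_0
\end{align*}
```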
Rather than using the application-specific notation given above involving θ we will consider the
following initial value problem in this Section. We seek y = y(t) (or an approximation to it) that
satisfies the differential equation
$$ \frac{dy}{dt} = f(t, y), \qquad (t > 0) $$
and which is subject to the initial condition
y(0) = y0 ,
a known quantity.
Some of the examples we will consider will be such that an analytic solution is readily available, and
this fact can be used as a check on the accuracy of the numerical methods that follow.
2. Numerical solutions
We suppose that the initial value problem
$$ \frac{dy}{dt} = f(t, y), \qquad y(0) = y_0 $$
is such that we are unable (or unwilling) to seek a solution analytically (that is, by hand) and that we
prefer to use a computer to approximate y instead. We begin by asking what we expect a numerical
solution to look like.
Numerical solutions to initial value problems discussed in this Workbook will be in the form of a
sequence of numbers approximating y(t) at a sequence of values of t. The simplest methods choose
the t-values to be equally spaced, and we will stick to these methods. We denote the common
distance between consecutive t-values as h.
Key Point 2
A numerical approximation to the initial value problem
$$ \frac{dy}{dt} = f(t, y), \qquad y(0) = y_0 $$
is a sequence of numbers $y_0, y_1, y_2, y_3, \dots$
The value $y_0$ will be exact, because it is defined by the initial condition.
For n ≥ 1, $y_n$ is the approximation to the exact value y(t) at t = nh.
In Figure 1 the exact solution y(t) is shown as a thick curve and approximations to y(nh) are shown
as crosses.
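[Figure 1: the exact solution y(t) plotted against t, with the approximations y0, y1, y2, y3 marked as crosses at t = 0, t1 = h, t2 = 2h, t3 = 3h]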
The general idea is to take the given initial condition y0 and then use it together with what we know
about the physical process (from the differential equation) to obtain an approximation y1 to y(h).
We will have then carried out the first time step.
Then we use the differential equation to obtain y2 , an approximation to y(2h). Thus the second
time step is completed.
And so on, at the nth time step we find yn , an approximation to y(nh).
Key Point 3
A time step is the procedure carried out to move a numerical approximation one increment forward
in time.
The way in which we choose to “use the differential equation” will define a particular numerical
method, and some ways are better than others. We begin by looking at the simplest method.
3. An explicit method
Guided by the fact that we only seek approximations to y(t) at t-values that are a distance h apart we
could use a forward difference formula to approximate the derivative in the differential equation.
This leads to
$$ \frac{y(t + h) - y(t)}{h} \approx f(t, y) $$
and we use this as the inspiration for the numerical method
$$ y_{n+1} - y_n = h f(nh, y_n) $$
For clarity we denote $f(nh, y_n)$ as $f_n$. The procedure for implementing the method (called Euler’s
method, pronounced “Oil-er’s method”) is summarised in the following Key Point.
Key Point 4
Euler’s method for approximating the solution of
$$ \frac{dy}{dt} = f(t, y), \qquad y(0) = y_0 $$
is as follows. We choose a time step h, then
$$ \begin{aligned}
y(h) &\approx y_1 = y_0 + h f(0, y_0) \\
y(2h) &\approx y_2 = y_1 + h f(h, y_1) \\
y(3h) &\approx y_3 = y_2 + h f(2h, y_2) \\
y(4h) &\approx y_4 = y_3 + h f(3h, y_3) \\
&\;\;\vdots
\end{aligned} $$
In general, y(nh) is approximated by $y_n = y_{n-1} + h f_{n-1}$.
This is called an explicit method, but the reason why will be clearer in a page or two when we
encounter an implicit method. First we look at an Example.
Example 1
Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = -\frac{1}{(t + y)^2}, \qquad y(0) = 0.9 $$
Carry out two time steps of Euler’s method with a step size of h = 0.125 so as to
obtain approximations to y(0.125) and y(0.25).
Solution
In general, Euler’s method may be written $y_{n+1} = y_n + h f_n$ and here $f(t, y) = -1/(t + y)^2$.
For the first time step we require f0 = f (0, y0 ) = f (0, 0.9) = −1.23457 and therefore
y1 = y0 + hf0 = 0.9 + 0.125 × (−1.23457) = 0.745679
For the second time step we require f1 = f (h, y1 ) = f (0.125, 0.745679) = −1.31912 and therefore
y2 = y1 + hf1 = 0.745679 + 0.125 × (−1.31912) = 0.580789
We conclude that
y(0.125) ≈ 0.745679 y(0.25) ≈ 0.580789
where these approximations are given to 6 decimal places.
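The two time steps above can be reproduced with a short routine. The sketch below is a minimal Python illustration of Euler’s method as stated in Key Point 4; the function name `euler` and the choice to return the whole list of approximations are assumptions made for the illustration.

```python
def euler(f, y0, h, steps):
    """Return [y_0, y_1, ..., y_steps] using y_{n+1} = y_n + h f(nh, y_n)."""
    ys = [y0]
    for n in range(steps):
        ys.append(ys[-1] + h * f(n * h, ys[-1]))
    return ys

# The initial value problem of Example 1: dy/dt = -1/(t + y)^2, y(0) = 0.9
ys = euler(lambda t, y: -1.0 / (t + y) ** 2, 0.9, 0.125, 2)
print([round(y, 6) for y in ys])
```

Only the right-hand side f, the initial value, the step size and the number of steps need to change in order to apply the same routine to another problem.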
The simple, repetitive nature of this process makes it ideal for computational implementation, but
this next exercise can be carried out by hand.
Task
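Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = -y^2, \qquad y(0) = 0.5 $$
Carry out two time steps of Euler’s method with a step size of h = 0.01 so as to
obtain approximations to y(0.01) and y(0.02).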
Your solution
Answer
For the first time step we require $f_0 = f(0, y_0) = f(0, 0.5) = -(0.5)^2 = -0.25$ and therefore
$y_1 = y_0 + h f_0 = 0.5 + 0.01 \times (-0.25) = 0.4975$
For the second time step we require $f_1 = f(h, y_1) = f(0.01, 0.4975) = -(0.4975)^2 = -0.24751$
and therefore
$y_2 = y_1 + h f_1 = 0.4975 + 0.01 \times (-0.24751) = 0.495025$
We conclude that
y(0.01) ≈ 0.497500 y(0.02) ≈ 0.495025 to six decimal places.
The following Task involves the so-called logistic approximation that may be used in modelling
population dynamics.
Task
Given the logistic population dynamic model
$$ \frac{dy}{dt} = 2y(1 - y), \qquad y(0) = 1.2 $$
carry out two time steps of Euler’s method with a step size of h = 0.125 to obtain
approximations to y(0.125) and y(0.25).
Your solution
Answer
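For the first time step we require $f_0 = f(0, y_0) = f(0, 1.2) = 2 \times 1.2(1 - 1.2) = -0.48$ and therefore
$y_1 = y_0 + h f_0 = 1.2 + 0.125 \times (-0.48) = 1.14$
For the second time step we require $f_1 = f(h, y_1) = f(0.125, 1.14) = 2 \times 1.14(1 - 1.14) = -0.3192$ and therefore
$y_2 = y_1 + h f_1 = 1.14 + 0.125 \times (-0.3192) = 1.1001$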
We conclude that
y(0.125) ≈ 1.14
y(0.25) ≈ 1.1001
Task
Suppose that the following initial value problem models the population of the United Kingdom:
$$ \frac{dP}{dt} = 2.5 \times 10^{-3} P, \qquad P(0) = 58.043 $$
where P is the population in millions, t is measured in years and t = 0 corresponds
to the year 1996.
(a) Show that Euler’s method applied to this initial value problem leads to
$$ P_n = (1 + 2.5 \times 10^{-3} h)^n \times 58.043 $$
where $P_n$ is the approximation to P(nh).
(b) Use a time step of h equal to 6 months to approximate the predicted population for the year 2050.
Your solution
(a)
Answer
In general $P_{n+1} = P_n + h f_n$ where, in this case, $f(nh, P_n) = 2.5 \times 10^{-3} P_n$, hence
$P_{n+1} = P_n + 2.5 \times 10^{-3} h P_n$ and so $P_{n+1} = (1 + 2.5 \times 10^{-3} h) P_n$
But $P_n$ will have come from the previous time step ($P_n = (1 + 2.5 \times 10^{-3} h) P_{n-1}$) and $P_{n-1}$ will
have come from the time step before that ($P_{n-1} = (1 + 2.5 \times 10^{-3} h) P_{n-2}$). Repeatedly applying
this observation leads to
$$ P_{n+1} = (1 + 2.5 \times 10^{-3} h)^{n+1} \times 58.043, \quad \text{that is,} \quad P_n = (1 + 2.5 \times 10^{-3} h)^n \times 58.043 $$
since $P_0 = P(0) = 58.043$.
Your solution
(b)
Answer
For a time step of 6 months we take $h = \tfrac{1}{2}$ (in years) and we require 108 time steps to cover the
54 years from 1996 to 2050. Hence
UK population (in millions) in 2050 $\approx P(54) \approx P_{108} = (1 + 2.5 \times 10^{-3} \times \tfrac{1}{2})^{108} \times 58.043 = 66.427$
where this approximation is given to 3 decimal places.
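The closed-form expression can be compared with a direct step-by-step application of Euler’s method. The following Python lines are a sketch for that comparison; the function name `euler_population` is an assumption introduced for the illustration.

```python
def euler_population(p0, rate, h, steps):
    """Apply P_{n+1} = P_n + h * rate * P_n repeatedly, starting from p0."""
    p = p0
    for _ in range(steps):
        p = p + h * rate * p
    return p

# 108 half-year steps cover the 54 years from 1996 to 2050
p_loop = euler_population(58.043, 2.5e-3, 0.5, 108)
p_closed = (1 + 2.5e-3 * 0.5) ** 108 * 58.043
print(round(p_loop, 3), round(p_closed, 3))
```

The two values agree up to rounding error, since the loop multiplies by the same factor 108 times.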
Accuracy of Euler’s method

There are two issues to consider when concerning ourselves with the accuracy of our results.
1. How accurately does the differential equation model the physical process?
2. How accurately does the numerical method approximate the solution of the differential equation?
Our aim here is to address only the second of these two questions.
Let us now consider an example with a known solution and consider just how accurate Euler’s method
is. Suppose that
$$ \frac{dy}{dt} = y, \qquad y(0) = 1. $$
We know that the solution to this problem is $y(t) = e^t$, and we now compare exact values with the
values given by Euler’s method. For the sake of argument, let us consider approximations to y(t) at
t = 1. The exact value is y(1) = 2.718282 to 6 decimal places. The following table shows results to
6 decimal places obtained on a spreadsheet program for a selection of choices of h.
h        Euler approximation to y(1) = 2.718282   Difference between exact value and Euler approximation
0.2      y_5 = 2.488320                           0.229962
0.1      y_10 = 2.593742                          0.124539
0.05     y_20 = 2.653298                          0.064984
0.025    y_40 = 2.685064                          0.033218
0.0125   y_80 = 2.701485                          0.016797
Notice that the smaller h is, the more time steps we have to take to get to t = 1. In the table
above each successive implementation of Euler’s method halves h. Interestingly, the error halves
(approximately) as h halves. This observation verifies something we will see in Section 32.2, that
is that the error in Euler’s method is (approximately) proportional to the step size h. This sort of
behaviour is called first-order, and the reason for this name will become clear later.
Key Point 5
Euler’s method is first order. In other words, the error it incurs is approximately proportional to h.
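The first-order behaviour can be observed numerically. The sketch below repeats the experiment for dy/dt = y, y(0) = 1 and forms the ratio of successive errors as h is halved; the function name `euler_at_one` is an assumption made for the illustration.

```python
import math

def euler_at_one(h):
    """Euler approximation to y(1) for dy/dt = y, y(0) = 1, with step size h."""
    steps = round(1 / h)
    y = 1.0
    for _ in range(steps):
        y += h * y
    return y

step_sizes = (0.2, 0.1, 0.05, 0.025, 0.0125)
errors = [math.e - euler_at_one(h) for h in step_sizes]
# Ratio of successive errors; values close to 2 indicate first-order behaviour
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print([round(r, 3) for r in ratios])
```

Ratios that approach 2 as h is halved are consistent with an error approximately proportional to h.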
4. An implicit method
Another approach that can be used to address the initial value problem
$$ \frac{dy}{dt} = f(t, y), \qquad y(0) = y_0 $$
is to consider integrating the differential equation
$$ \frac{dy}{dt} = f(t, y) $$
from t = nh to t = nh + h. This leads to
$$ \Big[\, y(t) \,\Big]_{t=nh}^{t=nh+h} = \int_{nh}^{(n+1)h} f(t, y)\, dt $$
that is,
$$ y(nh + h) - y(nh) = \int_{nh}^{(n+1)h} f(t, y)\, dt $$
and the problem now becomes one of approximating the integral on the right-hand side.
If we approximate the integral using the simple trapezium rule and replace the terms by their approximations we obtain the numerical method
$$ y_{n+1} - y_n = \tfrac{1}{2} h (f_n + f_{n+1}) $$
The procedure for time stepping with this method is much the same as that used for Euler’s method,
but with one difference. Let us imagine applying the method: we are given $y_0$ as the initial condition
and now aim to find $y_1$ from
$$ y_1 = y_0 + \tfrac{h}{2}(f_0 + f_1) = y_0 + \tfrac{h}{2}\{ f(0, y_0) + f(h, y_1) \} $$
And here is the problem: the unknown y1 appears on both sides of the equation. We cannot, in
general, find an explicit expression for y1 and for this reason the numerical method is called an
implicit method.
In practice the particular form of f may allow us to find y1 fairly simply, but in general we have to
approximate y1 for example by using the bisection method, or Newton-Raphson. (Another approach
that can be used involves what is called a predictor-corrector method, in other words, a “guess
and improve” method, and we will discuss this again later in this Workbook.)
And then, of course, we encounter the problem again in the second time step, when calculating y2 .
And again for y3 and so on. There is, in general, a genuine cost in implementing implicit methods,
but they are popular because they have desirable properties, as we will see later in this Workbook.
Key Point 6
The trapezium method for approximating the solution of
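$$ \frac{dy}{dt} = f(t, y), \qquad y(0) = y_0 $$
is as follows. We choose a time step h, then
$$ \begin{aligned}
y(h) &\approx y_1 = y_0 + \tfrac{1}{2} h \{ f(0, y_0) + f(h, y_1) \} \\
y(2h) &\approx y_2 = y_1 + \tfrac{1}{2} h \{ f(h, y_1) + f(2h, y_2) \} \\
y(3h) &\approx y_3 = y_2 + \tfrac{1}{2} h \{ f(2h, y_2) + f(3h, y_3) \} \\
y(4h) &\approx y_4 = y_3 + \tfrac{1}{2} h \{ f(3h, y_3) + f(4h, y_4) \} \\
&\;\;\vdots
\end{aligned} $$
In general, y(nh) is approximated by $y_n = y_{n-1} + \tfrac{1}{2} h (f_{n-1} + f_n)$.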
In Example 2 the implicit nature of the method is not a problem because y does not appear on the
right-hand side of the differential equation. In other words, f = f (t).
Example 2
$$ \frac{dy}{dt} = \frac{1}{t + 1}, \qquad y(0) = 1 $$
Carry out two time steps of the trapezium method with a step size of h = 0.2 so
as to obtain approximations to y(0.2) and y(0.4).
Solution
For the first time step we require $f_0 = f(0) = 1$ and $f_1 = f(0.2) = 0.833333$ and therefore
$y_1 = y_0 + \tfrac{1}{2} h (f_0 + f_1) = 1 + 0.1 \times 1.833333 = 1.183333$
For the second time step we also require $f_2 = f(2h) = f(0.4) = 0.714286$ and therefore
$y_2 = y_1 + \tfrac{1}{2} h (f_1 + f_2) = 1.183333 + 0.1 \times 1.547619 = 1.338095$
We conclude that
y(0.2) ≈ 1.183333 y(0.4) ≈ 1.338095
Example 3 has f dependent on y, so the implicit nature of the trapezium method could be a problem.
However in this case the way in which f depends on y is simple enough for us to be able to rearrange
for an explicit expression for yn+1 .
Example 3
$$ \frac{dy}{dt} = \frac{1}{t^2 + 1} - 2y, \qquad y(0) = 2 $$
Carry out two time steps of the trapezium method with a step size of h = 0.1 so
as to obtain approximations to y(0.1) and y(0.2).
Solution
The trapezium method is $y_{n+1} = y_n + \tfrac{h}{2}(f_n + f_{n+1})$ and in this case $y_{n+1}$ will appear on both sides
because f depends on y. We have
$$ \begin{aligned}
y_{n+1} &= y_n + \frac{h}{2} \left\{ \frac{1}{t_n^2 + 1} - 2y_n + \frac{1}{t_{n+1}^2 + 1} - 2y_{n+1} \right\} \\
&= y_n + \frac{h}{2} \{ g(t_n) - 2y_n + g(t_{n+1}) - 2y_{n+1} \}
\end{aligned} $$
where $g(t) \equiv \dfrac{1}{t^2 + 1}$, which is the part of f that depends on t. On rearranging to get all $y_{n+1}$
terms on the left, we get
$$ (1 + h) y_{n+1} = y_n + \tfrac{1}{2} h \{ g(t_n) - 2y_n + g(t_{n+1}) \} $$
In this case h = 0.1.

For the first time step we require g(0) = 1 and g(0.1) = 0.990099 and therefore
1.1y1 = 2 + 0.05 (1 − 2 × 2 + 0.990099)
Hence y1 = 1.726823, to six decimal places.
For the second time step we also require g(2h) = g(0.2) = 0.961538 and therefore
1.1y2 = 1.726823 + 0.05 (0.990099 − 2 × 1.726823 + 0.961538)
Hence y2 = 1.501566. We conclude that y(0.1) ≈ 1.726823 and y(0.2) ≈ 1.501566 to 6 d.p.
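When f cannot be rearranged by hand, each trapezium step has to be solved approximately. The sketch below is one simple possibility, a fixed-point (repeated substitution) iteration, applied to the problem of Example 3 so that the result can be compared with the values above. It is an illustration under stated assumptions rather than the approach taken in this Workbook: the function name `trapezium_step`, the Euler starting guess and the fixed number of sweeps are all choices made here, and the iteration only converges when h is small enough relative to how strongly f depends on y.

```python
def trapezium_step(f, t, y, h, sweeps=50):
    """One step of y_{n+1} = y_n + (h/2)(f_n + f_{n+1}) by fixed-point iteration."""
    fn = f(t, y)
    y_next = y + h * fn  # Euler's method supplies the initial guess
    for _ in range(sweeps):
        # Substitute the current guess for y_{n+1} into the right-hand side
        y_next = y + 0.5 * h * (fn + f(t + h, y_next))
    return y_next

def f(t, y):
    return 1.0 / (t * t + 1.0) - 2.0 * y

h = 0.1
y1 = trapezium_step(f, 0.0, 2.0, h)
y2 = trapezium_step(f, h, y1, h)
print(round(y1, 6), round(y2, 6))
```

For this f each sweep reduces the error in the guess by a factor of h = 0.1, so far fewer than 50 sweeps would be enough here.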
Task
$$ \frac{dy}{dt} = t - y, \qquad y(0) = 2 $$
Carry out two time steps of the trapezium method with a step size of h = 0.125
so as to obtain approximations to y(0.125) and y(0.25).
Your solution
Answer
The trapezium method is $y_{n+1} = y_n + \tfrac{h}{2}(f_n + f_{n+1})$ and in this case $y_{n+1}$ will appear on both sides
because f depends on y. However, we can rearrange for $y_{n+1}$ to give
$$ 1.0625\, y_{n+1} = y_n + \tfrac{1}{2} h \{ g(t_n) - y_n + g(t_{n+1}) \} $$
where g(t) = t is the part of f that depends on t.
$1.0625\, y_1 = 2 + 0.0625\,(0 - 2 + 0.125)$
Hence $y_1 = 1.772059$ to 6 d.p.
$1.0625\, y_2 = 1.772059 + 0.0625\,(0.125 - 1.772059 + 0.25)$
Hence $y_2 = 1.585640$, to 6 d.p.
Example 4
The current i in a simple circuit involving a resistor of resistance R and an inductance loop of inductance L with applied voltage E satisfies the differential
equation
$$ L \frac{di}{dt} + Ri = E $$
Consider the case where L = 1, R = 100 and E = 1000. Given that i(0) = 0 use
a value of h = 0.001 in implementation of the trapezium method to approximate
the current i at times t = 0.001 and t = 0.002.
Solution
The current i satisfies
$$ \frac{di}{dt} = 1000 - 100i $$
and the trapezium approximation to this is
$$ i_{n+1} - i_n = \frac{h}{2} (2000 - 100 i_{n+1} - 100 i_n) $$
Rearranging this for $i_{n+1}$ gives
$$ i_{n+1} = 0.904762\, i_n + 0.952381 $$
It follows that
i(0.001) ≈ 0.904762 × 0 + 0.952381 = 0.952381

i(0.002) ≈ 0.904762 × 0.952381 + 0.952381 = 1.814059
where these approximations are given to 6 decimal places.
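The rearrangement used in this solution can be written out for a general step size. The lines below sketch the intermediate algebra; the numerical coefficients follow on setting h = 0.001.

```latex
\begin{align*}
i_{n+1} - i_n &= \tfrac{h}{2}\,(2000 - 100\, i_{n+1} - 100\, i_n) \\
(1 + 50h)\, i_{n+1} &= (1 - 50h)\, i_n + 1000h \\
i_{n+1} &= \frac{0.95}{1.05}\, i_n + \frac{1}{1.05} \approx 0.904762\, i_n + 0.952381
\end{align*}
```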
Accuracy of the trapezium method

Let us now consider an example with a known solution and consider just how accurate the trapezium
method is. Suppose that we look at the same test problem we considered when looking at Euler’s
method
$$ \frac{dy}{dt} = y, \qquad y(0) = 1. $$
We know that the solution to this problem is $y(t) = e^t$, and we now compare exact values with the
values given by the trapezium method. For the sake of argument, let us consider approximations to
y(t) at t = 1. The exact value is y(1) = 2.718282 to 6 decimal places. The following table shows
results to 6 decimal places obtained on a spreadsheet program for a selection of choices of h.
h        Trapezium approximation to y(1) = 2.718282   Difference between exact value and trapezium approximation
0.2      y_5 = 2.727413                               0.009131
0.1      y_10 = 2.720551                              0.002270
0.05     y_20 = 2.718848                              0.000567
0.025    y_40 = 2.718423                              0.000142
0.0125   y_80 = 2.718317                              0.000035
Notice that each time h is reduced by a factor of $\tfrac{1}{2}$, the error reduces by a factor of (approximately) $\tfrac{1}{4}$.
This observation verifies something we will see in Section 32.2, that is that the error in the trapezium
approximation is (approximately) proportional to $h^2$. This sort of behaviour is called second-order.
Key Point 7
The trapezium approximation is second order. In other words, the error it incurs is approximately
proportional to $h^2$.
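The second-order behaviour can also be checked numerically. For dy/dt = y the trapezium method rearranges to $y_{n+1} = y_n (1 + \tfrac{1}{2}h)/(1 - \tfrac{1}{2}h)$, which the sketch below applies repeatedly; the function name `trapezium_at_one` is an assumption made for the illustration.

```python
import math

def trapezium_at_one(h):
    """Trapezium approximation to y(1) for dy/dt = y, y(0) = 1, with step size h."""
    steps = round(1 / h)
    growth = (1 + 0.5 * h) / (1 - 0.5 * h)  # y_{n+1} = growth * y_n
    y = 1.0
    for _ in range(steps):
        y *= growth
    return y

step_sizes = (0.2, 0.1, 0.05, 0.025, 0.0125)
errors = [trapezium_at_one(h) - math.e for h in step_sizes]
# Ratio of successive errors; values close to 4 indicate second-order behaviour
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print([round(r, 3) for r in ratios])
```

Ratios close to 4 each time h is halved are consistent with an error approximately proportional to the square of h.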
Exercises
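1. Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = t + y, \qquad y(0) = 3 $$
Carry out two time steps of Euler’s method with a step size of h = 0.05 so as to obtain
approximations to y(0.05) and y(0.1).
2. Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = \frac{1}{t^2 + 1}, \qquad y(0) = 2 $$
Carry out two time steps of the trapezium method with a step size of h = 0.1 so as to obtain
approximations to y(0.1) and y(0.2).
3. Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = t^2 - y, \qquad y(0) = 1.5 $$
Carry out two time steps of the trapezium method with a step size of h = 0.125 so as to obtain
approximations to y(0.125) and y(0.25).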
4. The current i in a simple circuit involving a resistor of resistance R and an inductance loop of
inductance L with applied voltage E satisfies the differential equation
$$ L \frac{di}{dt} + Ri = E $$
Consider the case where L = 1.5, R = 120 and E = 600. Given that i(0) = 0 use a value of
h = 0.0025 in implementation of the trapezium method to approximate the current i at times
t = 0.0025 and t = 0.005.
Answers
1. For the first time step we require f0 = f (0, y0 ) = f (0, 3) = 3 and therefore
y1 = y0 + hf0
= 3 + 0.05 × 3
= 3.15
For the second time step we require f1 = f (h, y1 ) = f (0.05, 3.15) = 3.2 and therefore
y2 = y1 + hf1
= 3.15 + 0.05 × 3.2
= 3.31
We conclude that
y(0.05) ≈ 3.15
y(0.1) ≈ 3.31
2. For the first time step we require $f_0 = f(0) = 1$ and $f_1 = f(0.1) = 0.990099$ and therefore
$y_1 = y_0 + \tfrac{1}{2} h (f_0 + f_1)$
= 2 + 0.05 × 1.990099
= 2.099505
For the second time step we also require $f_2 = f(2h) = f(0.2) = 0.961538$ and therefore
$y_2 = y_1 + \tfrac{1}{2} h (f_1 + f_2)$
= 2.099505 + 0.05 × 1.951637
= 2.197087
We conclude that
y(0.1) ≈ 2.099505
y(0.2) ≈ 2.197087
to six decimal places.
3. The trapezium method is $y_{n+1} = y_n + \tfrac{h}{2}(f_n + f_{n+1})$ and in this case $y_{n+1}$ will appear on both
sides because f depends on y. However, we can rearrange for $y_{n+1}$ to give
$$ 1.0625\, y_{n+1} = y_n + \tfrac{1}{2} h \{ g(t_n) - y_n + g(t_{n+1}) \} $$
where $g(t) = t^2$ is the part of f that depends on t.
$1.0625\, y_1 = 1.5 + 0.0625\,(0 - 1.5 + 0.015625)$
Hence $y_1 = 1.324449$.
$1.0625\, y_2 = 1.324449 + 0.0625\,(0.015625 - 1.324449 + 0.0625)$
Hence $y_2 = 1.173227$.
4. Dividing through by L = 1.5 we find that the current i satisfies
$$ \frac{di}{dt} = 400 - 80i $$
and the trapezium approximation to this is
$$ i_{n+1} - i_n = \frac{h}{2} (800 - 80 i_{n+1} - 80 i_n) $$
Rearranging this for $i_{n+1}$ gives
$$ i_{n+1} = 0.818182\, i_n + 0.909091 $$
It follows that
i(0.0025) ≈ 0.818182 × 0 + 0.909091 = 0.909091

i(0.005) ≈ 0.818182 × 0.909091 + 0.909091 = 1.652893
Linear Multistep
Methods 32.2

Introduction
In the previous Section we saw two methods (Euler and trapezium) for approximating the solutions of
certain initial value problems. In this Section we will see that those two methods are special cases of
a more general collection of techniques called linear multistep methods. Techniques for determining
the properties of these methods will be presented.
Another class of approximations, called Runge-Kutta methods, will also be discussed briefly. These
are not linear multistep methods, but the two are sometimes used in conjunction.

Prerequisites
• review Section 32.1
Learning Outcomes
On completion you should be able to . . .
• implement linear multistep methods to carry out time steps of numerical methods
• evaluate the zero stability of linear multistep methods
• establish the order of linear multistep methods
• implement a Runge-Kutta method
1. General linear multistep methods
Euler’s method and the trapezium method are both special cases of a wider class of so-called linear
multistep methods. The following Key Point gives the most general situation that we will look at.
Key Point 8
The general k-step linear multistep method is given by
$$ \alpha_k y_{n+k} + \dots + \alpha_1 y_{n+1} + \alpha_0 y_n = h \left( \beta_k f_{n+k} + \dots + \beta_1 f_{n+1} + \beta_0 f_n \right) $$
or equivalently
$$ \sum_{j=0}^{k} \alpha_j y_{n+j} = h \sum_{j=0}^{k} \beta_j f_{n+j}. $$
It is always the case that $\alpha_k \neq 0$. Also, at least one of $\alpha_0$ and $\beta_0$ will be non-zero.
A linear multistep method is defined by the choice of the quantities
$k, \alpha_0, \alpha_1, \dots, \alpha_k, \beta_0, \beta_1, \dots, \beta_k$
• If $\beta_k = 0$ the method is called explicit. (Because at each step, when we are trying to find
the newest $y_{n+k}$, there is no appearance of this unknown on the right-hand side.)
• If $\beta_k \neq 0$ the method is called implicit. (Because $y_{n+k}$ now appears on both sides of the
equation: on the right-hand side it appears through $f_{n+k} = f((n + k)h, y_{n+k})$, and we cannot,
in general, rearrange to get an explicit formula for $y_{n+k}$.)
The next Example shows one such choice.
Example 5
Write down the linear multistep scheme defined by the choices k = 1, $\alpha_0 = -1$,
$\alpha_1 = 1$, $\beta_0 = \beta_1 = \tfrac{1}{2}$.
Solution
Here k = 1 so that
$$ \alpha_1 y_{n+1} + \alpha_0 y_n = h(\beta_1 f_{n+1} + \beta_0 f_n) $$
and substituting the values in for the four coefficients gives
$$ y_{n+1} - y_n = h \left( \tfrac{1}{2} f_{n+1} + \tfrac{1}{2} f_n \right) $$

which, as we know, is the trapezium method.
Task
Write down the linear multistep scheme defined by the choices k = 1, α0 = −1,
α1 = 1, β0 = 1 and β1 = 0.
Your solution
Answer
Here k = 1 and we have
α1 yn+1 + α0 yn = h(β1 fn+1 + β0 fn )
and substituting the values in for the four coefficients gives
yn+1 − yn = hfn
which, as we know, is Euler’s method.
Task
Write down the linear multistep scheme defined by the choices k = 2, $\alpha_0 = 0$,
$\alpha_1 = -1$, $\alpha_2 = 1$, $\beta_2 = 0$, $\beta_1 = \tfrac{3}{2}$ and $\beta_0 = -\tfrac{1}{2}$.
Your solution
Answer
Here k = 2 (so we are looking at a 2-step scheme) and we have
$$ \alpha_2 y_{n+2} + \alpha_1 y_{n+1} + \alpha_0 y_n = h(\beta_2 f_{n+2} + \beta_1 f_{n+1} + \beta_0 f_n) $$
Substituting the values in for the six coefficients gives
$$ y_{n+2} - y_{n+1} = \frac{h}{2} (3 f_{n+1} - f_n) $$
which is an example of a scheme that is explicit (because $\beta_k = \beta_2$ is zero).
In the preceding Section we saw several examples implementing the Euler and trapezium methods.
The next Example deals with the explicit 2-step scheme that was the subject of the Task above.
Example 6
A numerical scheme has been used to approximate the solution of
$$ \frac{dy}{dt} = t + y, \qquad y(0) = 3 $$
and has produced the following estimates, to 6 decimal places,
y(0.4) ≈ 4.509822, y(0.45) ≈ 4.755313
Now use the 2-step, explicit linear multistep scheme
yn+2 − yn+1 = h (1.5fn+1 − 0.5fn )
to approximate y(0.5).
Solution
Evidently the value h = 0.05 will serve our purposes and we seek y10 ≈ y(0.5). The values we will
need to use in our implementation of the 2-step scheme are y9 = 4.755313 and
f9 = f (0.45, y9 ) = 5.205313 f8 = f (0.4, y8 ) = 4.909822
to 6 decimal places since f (t, y) = t + y. It follows that
y10 = y9 + 0.05 × (1.5f9 − 0.5f8 )

= 5.022966
And we conclude that y(0.5) ≈ 5.022966, where this approximation has been given to 6 decimal
places.
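The calculation in this Example takes only a few lines of code. The following Python sketch applies the 2-step explicit scheme once, starting from the two estimates given in the Example; the function name `two_step_explicit` and the argument names are assumptions introduced for the illustration.

```python
def two_step_explicit(y_latest, f_latest, f_previous, h):
    """One step of y_{n+2} = y_{n+1} + h (1.5 f_{n+1} - 0.5 f_n)."""
    return y_latest + h * (1.5 * f_latest - 0.5 * f_previous)

def f(t, y):
    return t + y

h = 0.05
y8, y9 = 4.509822, 4.755313  # the given estimates of y(0.4) and y(0.45)
y10 = two_step_explicit(y9, f(0.45, y9), f(0.40, y8), h)
print(round(y10, 6))
```

To continue time stepping, y9 and y10 would play the roles of the two most recent values in the next call.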
Notice that in this implementation of a 2-step method we needed to use the values of the two y
values preceding the one currently being sought. Both y8 and y9 were used in finding y10 .
Similarly, a k-step method will use, in general, k previous y values at each time step.
This means that there is an issue to be resolved in implementing methods that are 2- or higher-step,
because when we start we are only given one starting value y0 . This issue will be dealt with towards
the end of this Section. The following exercise involves a 2-step method, but (like the example
above) it does not encounter the difficulty relating to starting values as it assumes that the numerical
procedure is already underway.
Task
A numerical scheme has been used to approximate the solution of
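$$ \frac{dy}{dt} = \frac{t}{y}, \qquad y(0) = -2 $$
and has produced the following estimates, to 6 decimal places,
y(0.24) ≈ −2.013162, y(0.26) ≈ −2.015546
Now use the 2-step, explicit linear multistep scheme
$$ y_{n+2} - \tfrac{1}{2} y_{n+1} - \tfrac{1}{2} y_n = \tfrac{3}{2} h f_{n+1} $$
to approximate y(0.28).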
Your solution
Answer
Evidently the value h = 0.02 will serve our purposes and we seek y14 ≈ y(0.28). The values we will
need to use in our implementation of the 2-step scheme are y13 = −2.015546, y12 = −2.013162
and
f13 = f (0.26, y13 ) = −0.128997
to 6 decimal places since f (t, y) = t/y. It follows that
$$ y_{14} = \tfrac{1}{2} y_{13} + \tfrac{1}{2} y_{12} + 0.02 \times \tfrac{3}{2} f_{13} = -2.018224 $$
And we conclude that y(0.28) ≈ −2.018224, to 6 decimal places.
Zero stability
We now begin to classify linear multistep methods. Some choices of the coefficients give rise to
schemes that work well, and some do not. One property that is required if we are to obtain reliable
approximations is that the scheme be zero stable. A scheme that is zero stable will not produce
approximations which grow unrealistically with t.
We define the first characteristic polynomial
$$ \rho(z) = \alpha_0 + \alpha_1 z + \alpha_2 z^2 + \dots + \alpha_k z^k $$
where the $\alpha_i$ are the coefficients of the linear multistep method as defined in Key Point 8.
This polynomial appears in the definition of zero stability given in the following Key Point.
Key Point 10
The linear multistep scheme
$$ \sum_{j=0}^{k} \alpha_j y_{n+j} = h \sum_{j=0}^{k} \beta_j f_{n+j} $$
is said to be zero stable if the zeros of the first characteristic polynomial are such that
1. none is larger than 1 in magnitude
2. any zero equal to 1 in magnitude is simple (that is, not repeated)
The second characteristic polynomial is defined in terms of the coefficients on the right-hand
side (the βj ), but its use is beyond the scope of this Workbook.
Example 7
Find the roots of the first characteristic polynomial for each of the examples below
and determine whether or not the method is zero stable.
(a) yn+1 − yn = hfn
(b) yn+1 − 2yn = hfn
(c) yn+2 + 3yn+1 − 4yn = h (2fn+2 + fn+1 + 2fn )
(d) yn+2 − yn+1 = 23 hfn+1
(e) yn+2 − 2yn+1 + yn = h(fn+2 − fn )
(f) yn+2 + 2yn+1 + 5yn = h (fn+2 − fn+1 + 2fn )
Solution
(a) In this case ρ(z) = z − 1 and the single zero of ρ is z = 1. This is a simple (that is, not
repeated) root with magnitude equal to 1, so the method is zero stable.
(b) ρ(z) = z − 2 which has one zero, z = 2. This has magnitude 2 > 1 and therefore the method
is not zero stable.
(c) $\rho(z) = z^2 + 3z - 4 = (z - 1)(z + 4)$. One root is z = −4 which has magnitude greater than
1 and the method is therefore not zero stable.
(d) Here $\alpha_2 = 1$, $\alpha_1 = -1$ and $\alpha_0 = 0$, therefore
$$ \rho(z) = z^2 - z = z(z - 1) $$
which has two zeros, z = 0 and z = 1. These both have magnitude less than or equal to 1
and there is no repeated zero with magnitude equal to 1, so the method is zero stable.
(e) $\rho(z) = z^2 - 2z + 1 = (z - 1)^2$. Here z = 1 is not a simple root: it is repeated and, since it
has magnitude equal to 1, the method is not zero stable.
(f) $\rho(z) = z^2 + 2z + 5$ and the roots of ρ(z) = 0 can be found from the quadratic formula. In this
case the roots are complex and are equal to −1 ± 2i. Zero stability requires that every root has
magnitude less than or equal to 1, but here both roots have magnitude √5 ≈ 2.236. Consequently
we conclude that the method is not zero stable.
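The two conditions in Key Point 10 can be tested mechanically. The sketch below is an illustration only, under stated assumptions: it handles just 1-step and 2-step schemes (so the roots come from the linear or quadratic formula), it takes the coefficients in the order α0, α1, (α2), and the names `rho_roots`, `is_zero_stable` and the tolerance `tol` are hypothetical.

```python
import cmath


def rho_roots(alpha):
    """Roots of rho(z) = alpha[0] + alpha[1]*z (+ alpha[2]*z**2), for k = 1 or 2."""
    if len(alpha) == 2:
        a0, a1 = alpha
        return [complex(-a0 / a1)]
    a0, a1, a2 = alpha
    disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)   # complex square root of the discriminant
    return [(-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)]


def is_zero_stable(alpha, tol=1e-9):
    """Check the two conditions of Key Point 10 (tol is an assumed tolerance)."""
    roots = rho_roots(alpha)
    for i, z in enumerate(roots):
        if abs(z) > 1 + tol:
            return False                       # a root larger than 1 in magnitude
        if abs(abs(z) - 1) <= tol:
            # a root of magnitude 1 must be simple (not repeated)
            if any(abs(z - w) <= tol for k, w in enumerate(roots) if k != i):
                return False
    return True


# Coefficients [alpha_0, alpha_1, (alpha_2)] for parts (a) to (f) of Example 7
examples = {
    "a": [-1, 1],
    "b": [-2, 1],
    "c": [-4, 3, 1],
    "d": [0, -1, 1],
    "e": [1, -2, 1],
    "f": [5, 2, 1],
}
for label, alpha in examples.items():
    print(label, rho_roots(alpha), is_zero_stable(alpha))
```

For schemes with more than two steps a general polynomial root finder would be needed in place of `rho_roots`; the stability test itself would be unchanged.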
Task
Find the roots of the first characteristic polynomial for the linear multistep scheme
yn+2 − 2yn+1 + yn = h (fn+2 + 2fn+1 + fn )
and hence determine whether or not the scheme is zero stable.
Your solution
Answer
The first characteristic polynomial is
$$ \rho(z) = \alpha_2 z^2 + \alpha_1 z + \alpha_0 = z^2 - 2z + 1 = (z - 1)^2 $$
and the roots of ρ(z) = 0 are both equal to 1. Zero stability requires any repeated root to have
magnitude strictly less than 1, whereas this repeated root has magnitude equal to 1. Consequently
we conclude that the method is not zero stable.
At this stage, the notion of zero stability is rather abstract, so let us try using a zero unstable
method and see what happens. We consider the simple test problem
$$ \frac{dy}{dt} = -y, \qquad y(0) = 1 $$
which we know to have analytic solution $y(t) = e^{-t}$, a quantity which decays with increasing t.
Implementing the zero unstable scheme
yn+1 − 2yn = hfn
on a spreadsheet package with h = 0.05 gives the following results
n t = nh yn ≈ y(nh)
0 0.00 1.00000
1 0.05 1.95000
2 0.10 3.80250
3 0.15 7.41488
4 0.20 14.45901
5 0.25 28.19506
6 0.30 54.98037
7 0.35 107.21172
8 0.40 209.06286
9 0.45 407.67258
10 0.50 794.96153
11 0.55 1550.17499
12 0.60 3022.84122
13 0.65 5894.54039
14 0.70 11494.35376
15 0.75 22413.98982
where 5 decimal places have been given for yn . The dramatic growth in the values of yn is due to
the zero instability of the method. (There are in fact other things than zero instability wrong with
the scheme yn+1 − 2yn = hfn , but it is the zero instability that is causing the large numbers.)
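The table above was produced on a spreadsheet, but the same numbers can be generated with a short loop. The following is a sketch only; the function name `unstable_scheme` is an assumption made for illustration.

```python
def unstable_scheme(h, steps):
    """Iterate y[n+1] = 2*y[n] + h*f[n] for dy/dt = -y, y(0) = 1."""
    y = [1.0]
    for _ in range(steps):
        f_n = -y[-1]                       # f(t, y) = -y for the test problem
        y.append(2.0 * y[-1] + h * f_n)    # the zero unstable scheme
    return y


# Columns: n, t = n*h, y_n (compare with the table above)
for n, y_n in enumerate(unstable_scheme(0.05, 15)):
    print(f"{n:2d}  {n * 0.05:4.2f}  {y_n:12.5f}")
```

Each step multiplies the previous value by 2 − h = 1.95, which is why the approximations grow geometrically while the exact solution decays.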
Consistency and order

A scheme that is zero stable will produce approximations that do not grow in size in a way that is
not present in the exact, analytic solution. Zero stability is a required property, but it is not enough
on its own. There remains the issue of whether the approximations are close to the exact values.
The truncation error of the general linear multistep method is a measure of how well the differential
equation and the numerical method agree with each other. It is defined by
$$ \tau_j = \frac{1}{\beta}\left[ \frac{c_0}{h}\, y(jh) + c_1 y'(jh) + c_2 h\, y''(jh) + c_3 h^2 y'''(jh) + \dots \right] = \frac{1}{\beta h} \sum_{p=0}^{\infty} c_p h^p y^{(p)}(jh) $$
where $\beta = \sum_j \beta_j$ is a normalising factor.
It is the first few terms in this expression that will matter most in what follows, and it helps us that
there are formulae for the coefficients which appear
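$$ c_0 = \sum_j \alpha_j, \quad c_1 = \sum_j (j\alpha_j - \beta_j), \quad c_2 = \sum_j \left( \frac{j^2}{2}\alpha_j - j\beta_j \right), \quad c_3 = \sum_j \left( \frac{j^3}{3!}\alpha_j - \frac{j^2}{2}\beta_j \right) $$
and so on; the general formula for p ≥ 2 is
$$ c_p = \sum_j \left( \frac{j^p}{p!}\alpha_j - \frac{j^{p-1}}{(p-1)!}\beta_j \right). $$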
Recall that the truncation error is intended to be a measure of how well the differential equation and
its approximation agree with each other. We say that the numerical method is consistent with the
differential equation if τj tends to zero as h → 0. The following Key Point says this in other words.
Key Point 11
The linear multistep scheme is said to be consistent if c0 = 0 and c1 = 0.
Example 8
Show that Euler’s method (yn+1 = yn + hfn ) is consistent.
Solution
In this case α1 = 1, α0 = −1, β1 = 0 and β0 = 1. It follows that
$$ c_0 = \sum_j \alpha_j = 1 - 1 = 0 \quad\text{and}\quad c_1 = \sum_j (j\alpha_j - \beta_j) = 1\alpha_1 - (\beta_0 + \beta_1) = 1 - (1 + 0) = 0 $$
and therefore Euler’s method is consistent.
Task
Show that the trapezium method ($y_{n+1} = y_n + \tfrac{h}{2}(f_{n+1} + f_n)$) is consistent.
Your solution
Answer
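In this case $\alpha_1 = 1$, $\alpha_0 = -1$, $\beta_1 = \tfrac{1}{2}$ and $\beta_0 = \tfrac{1}{2}$. It follows that
$$ c_0 = \sum_j \alpha_j = 1 - 1 = 0 \quad\text{and}\quad c_1 = \sum_j (j\alpha_j - \beta_j) = 1\alpha_1 - (\beta_0 + \beta_1) = 1 - (\tfrac{1}{2} + \tfrac{1}{2}) = 0 $$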
and therefore the trapezium method is consistent.
Task
Determine the consistency (or otherwise) of the following 2-step linear multistep
schemes
(a) yn+2 − 2yn+1 + yn = h(fn+2 − fn )
(b) yn+2 − yn+1 = h(fn+1 − 2fn )
(c) yn+2 − yn+1 = h(2fn+2 − fn+1 )
Your solution
Answer
(a) c0 = α2 + α1 + α0 = 1 − 2 + 1 = 0, c1 = 2α2 + 1 × α1 + 0 × α0 − (β2 + β1 + β0 ) =
2(1) + 1(−2) + 0 − (1 − 1) = 0. Therefore the method is consistent.
(b) c0 = 1 − 1 + 0 = 0, c1 = 2 − 1 − (1 − 2) = 2 so the method is inconsistent.
(c) This method is consistent, because c0 = 1 − 1 = 0 and c1 = 2 − 1 − (2 − 1) = 0.
(Notice also that the first characteristic polynomial ρ(z), defined on page 6 of this Section, evaluated
at z = 1 is equal to α0 + α1 + · · · + αk = c0 . It follows that a consistent scheme must always have
z = 1 as one of the roots of its ρ(z).)
Assuming that the method is consistent, the order of the scheme tells us how quickly the truncation
error tends to zero as h → 0. For example, if $c_0 = 0$, $c_1 = 0$, $c_2 = 0$ and $c_3 \neq 0$ then the first
non-zero term in $\tau_j$ will be the one involving $h^2$ and the linear multistep method is called second-order.
This means that if h is small then $\tau_j$ is dominated by the $h^2$ term (because the $h^3$ and subsequent
terms will be tiny in comparison) and halving h will cause $\tau_j$ to decrease by a factor of approximately
$\tfrac{1}{4}$. The decrease is only approximately known because the $h^3$ and other terms will have a small effect.
We summarise the general situation in the following Key Point.
Key Point 12
A linear multistep method is said to be of order p if
$$ c_0 = c_1 = c_2 = \dots = c_p = 0 \quad\text{and}\quad c_{p+1} \neq 0 $$
Combining the last two Key Points gives us another way of describing consistency: “A linear multistep
method is consistent if it is at least first order”.
Example 9
Find the order of
(a) Euler’s method
(b) The trapezium method.
Solution
(a) We have already found that $c_0 = c_1 = 0$ so the first quantity to calculate is
$$ c_2 = \sum_j \left( \frac{j^2}{2}\alpha_j - j\beta_j \right) = \tfrac{1}{2}\alpha_1 - \beta_1 = \tfrac{1}{2} $$
which is not zero and therefore Euler’s method is of order 1. (Or, in other words, Euler’s
method is first order.)
(b) We have already found that $c_0 = c_1 = 0$ so the first quantity to calculate is
$$ c_2 = \sum_j \left( \frac{j^2}{2}\alpha_j - j\beta_j \right) = \tfrac{1}{2}\alpha_1 - \beta_1 = \tfrac{1}{2} - \tfrac{1}{2} = 0 $$
This is equal to zero, so we must calculate the next coefficient
$$ c_3 = \sum_j \left( \frac{j^3}{3!}\alpha_j - \frac{j^2}{2}\beta_j \right) = \tfrac{1}{6}\alpha_1 - \tfrac{1}{2}\beta_1 = \tfrac{1}{6} - \tfrac{1}{4} = -\tfrac{1}{12} $$
which is not zero. Hence the trapezium method is of order 2 (that is, it is second order).
This finally explains some of the results we saw in the first Section of this Workbook. We saw that
the errors incurred by the Euler and trapezium methods, for a particular test problem, were roughly
proportional to h and $h^2$ respectively. This behaviour is dictated by the first non-zero term in the
truncation error which is the one involving $c_2 h$ for Euler and the one involving $c_3 h^2$ for trapezium.
We now apply the method to another linear multistep scheme.
Example 10
Find the order of the 4-step, explicit linear multistep scheme
$$ y_{n+4} - y_{n+3} = \frac{h}{24}\left( 55f_{n+3} - 59f_{n+2} + 37f_{n+1} - 9f_n \right) $$
Solution
In the established notation we have α4 = 1, α3 = −1, α2 = 0, α1 = 0 and α0 = 0. The β terms
similarly come from the coefficients on the right hand side (remembering the denominator of 24).
Now
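$$ c_0 = \sum_j \alpha_j = 0 \quad\text{and}\quad c_1 = \sum_j (j\alpha_j - \beta_j) = 0 $$
from which we conclude that the method is consistent.
We also find that
$$ c_2 = \sum_j \left( \tfrac{1}{2} j^2 \alpha_j - j\beta_j \right) = 0, \qquad c_3 = \sum_j \left( \tfrac{1}{6} j^3 \alpha_j - \tfrac{1}{2} j^2 \beta_j \right) = 0, $$
$$ c_4 = \sum_j \left( \tfrac{1}{24} j^4 \alpha_j - \tfrac{1}{6} j^3 \beta_j \right) = 0, \qquad c_5 = \sum_j \left( \tfrac{1}{120} j^5 \alpha_j - \tfrac{1}{24} j^4 \beta_j \right) = 0.348611 \text{ to 6 d.p.} $$
(The exact value of $c_5$ is $\tfrac{251}{720}$.)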
Because c5 is the first non-zero coefficient we conclude that the method is of order 4.
So the scheme in Example 10 has the property that the truncation error will tend to zero proportional
to h4 (approximately) as h → 0. This is a good thing, as it says that the error will decay to zero
very quickly, when h is decreased.
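The coefficient formulae lend themselves to exact arithmetic, so the orders found in Examples 9 and 10 can be confirmed by machine. The sketch below is illustrative only: the names `c_coeff` and `order`, the cut-off `p_max`, and the convention of listing coefficients as [α0, ..., αk] and [β0, ..., βk] are all assumptions. It uses the general formula for every p ≥ 1, which reproduces the stated c1 when 0⁰ is read as 1 (as Python does).

```python
from fractions import Fraction
from math import factorial


def c_coeff(p, alpha, beta):
    """Coefficient c_p for alpha = [a_0, ..., a_k] and beta = [b_0, ..., b_k]."""
    if p == 0:
        return sum(alpha, Fraction(0))
    return sum(
        Fraction(j**p, factorial(p)) * a
        - Fraction(j**(p - 1), factorial(p - 1)) * b
        for j, (a, b) in enumerate(zip(alpha, beta))
    )


def order(alpha, beta, p_max=20):
    """Return p with c_0 = ... = c_p = 0 and c_(p+1) != 0.

    A result below 1 means the scheme is not consistent.
    """
    for p in range(p_max + 1):
        if c_coeff(p, alpha, beta) != 0:
            return p - 1
    raise ValueError("no non-zero coefficient found up to p_max")


euler = ([-1, 1], [1, 0])
trapezium = ([-1, 1], [Fraction(1, 2), Fraction(1, 2)])
# 4-step scheme of Example 10: beta_0..beta_4 = (-9, 37, -59, 55, 0)/24
four_step = ([0, 0, 0, -1, 1], [Fraction(n, 24) for n in (-9, 37, -59, 55, 0)])

for name, (alpha, beta) in [("Euler", euler), ("trapezium", trapezium),
                            ("Example 10", four_step)]:
    print(name, "has order", order(alpha, beta))
print("c_5 for Example 10 =", c_coeff(5, *four_step))
```

Working with `Fraction` rather than floating-point numbers means that a coefficient which should vanish is exactly zero, so no tolerance is needed when deciding the order.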
Task
Find the order of the 2-step linear multistep scheme
$$ y_{n+2} - y_{n+1} = \frac{h}{12}\left( 5f_{n+2} + 8f_{n+1} - f_n \right) $$
Your solution
Answer
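In the established notation we have $\alpha_2 = 1$, $\alpha_1 = -1$ and $\alpha_0 = 0$. Also $\beta_2 = \tfrac{5}{12}$, $\beta_1 = \tfrac{2}{3}$ and
$\beta_0 = -\tfrac{1}{12}$. Now
$$ c_0 = \sum_j \alpha_j = 1 - 1 + 0 = 0 \quad\text{and}\quad c_1 = \sum_j (j\alpha_j - \beta_j) = 2\alpha_2 + \alpha_1 - (\beta_2 + \beta_1 + \beta_0) = 0 $$
We also find that
$$ c_2 = \sum_j \left( \tfrac{1}{2} j^2 \alpha_j - j\beta_j \right) = \tfrac{1}{2}(4\alpha_2 + \alpha_1) - (2\beta_2 + \beta_1) = 0 $$
$$ c_3 = \sum_j \left( \tfrac{1}{6} j^3 \alpha_j - \tfrac{1}{2} j^2 \beta_j \right) = \tfrac{1}{6}(8\alpha_2 + \alpha_1) - \tfrac{1}{2}(4\beta_2 + \beta_1) = 0 $$
$$ c_4 = \sum_j \left( \tfrac{1}{24} j^4 \alpha_j - \tfrac{1}{6} j^3 \beta_j \right) = \tfrac{1}{24}(16\alpha_2 + \alpha_1) - \tfrac{1}{6}(8\beta_2 + \beta_1) = -\tfrac{1}{24} $$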
so that the method is of order 3.
Convergence
The key result concerning linear multistep methods is given in the following Key Point.
Key Point 13
The numerical approximation to the initial value problem converges to the actual solution as h → 0
if
1. the scheme is zero stable
2. the scheme is consistent
The proof of this result lies beyond the scope of this Workbook. It is worth pointing out that this
is not the whole story. The convergence result is useful, but only deals with h as it tends to zero.
In practice we use a finite, non-zero value of h and there are ways of determining how big an h it
is possible to “get away with” for a particular linear multistep scheme applied to a particular initial
value problem.
If, when implementing the methods described above, it is found that the numerical approximations
behave in an unexpected way (for example, if the numbers are very large when they should not be, or
if decreasing h does not seem to lead to results that converge) then one topic to look for in further
reading is that of “absolute stability”.
2. An example of a Runge-Kutta method
A full discussion of the so-called Runge-Kutta methods is not required here, but we do need to touch
on them to resolve a remaining issue in the implementation of linear multistep schemes.
The problem with linear multistep methods is that a zero-stable, 1-step method can never be better
than second order (you need not worry about why this is true, it was proved in the latter half of the
last century by a man called Dahlquist). We have seen methods of higher order than 2, but they
were all at least 2-step methods. And the problem with 2-step methods is that we need 2 starting
values to implement them and we are only ever given 1 starting value: the initial condition y(0).
One way out of this “Catch 22” is to use a Runge-Kutta method to generate the extra starting
value(s) we need. Runge-Kutta methods are not linear multistep methods and do not suffer from
the problem mentioned above. There is no such thing as a free lunch, of course, and Runge-Kutta
methods are generally more expensive in effort to implement than linear multistep methods because
of the number of evaluations of f required at each time step.
The following Key Point gives a statement of what is, perhaps, the most popular Runge-Kutta
method (sometimes called “RK4”).
Key Point 14
Runge-Kutta method (RK4)
Consider the usual initial value problem
$$ \frac{dy}{dt} = f(t, y), \qquad y(0) = y_0. $$
Calculate $K_1 = f(nh, y_n)$
then $K_2 = f((n + \tfrac{1}{2})h, y_n + \tfrac{1}{2} h K_1)$
then $K_3 = f((n + \tfrac{1}{2})h, y_n + \tfrac{1}{2} h K_2)$
then $K_4 = f((n + 1)h, y_n + h K_3)$
finally $y_{n+1} = y_n + \tfrac{h}{6}(K_1 + 2K_2 + 2K_3 + K_4)$
Notice that each calculation is explicit: all of the right-hand sides in the formulae in the Key Point
above involve known quantities.
Example 11
Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = \cos(y), \qquad y(0) = 3 $$
Carry out one time step of the Runge-Kutta method RK4 with a step size of
h = 0.1 so as to obtain an approximation to y(0.1).
Solution
The iteration must be carried out in four stages. We start by calculating
$$ K_1 = f(0, y_0) = f(0, 3) = -0.989992 $$
a value we now use in finding
$$ K_2 = f(\tfrac{1}{2}h, y_0 + \tfrac{1}{2}hK_1) = f(0.05, 2.950500) = -0.981797 $$
This value $K_2$ is now used in our evaluation of
$$ K_3 = f(\tfrac{1}{2}h, y_0 + \tfrac{1}{2}hK_2) = f(0.05, 2.950910) = -0.981875 $$
which, in turn, is used in
$$ K_4 = f(h, y_0 + hK_3) = f(0.1, 2.901812) = -0.971390 $$
All four of these values are then used to complete the iteration
$$ y_1 = y_0 + \frac{h}{6}(K_1 + 2K_2 + 2K_3 + K_4) = 3 + \frac{0.1}{6}\left( -0.989992 + 2 \times (-0.981797) + 2 \times (-0.981875) - 0.971390 \right) = 2.901855 $$
to 6 decimal places.
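The four stages of Key Point 14 translate almost line by line into code. The following is a minimal sketch, with the function name `rk4_step` and its signature chosen here purely for illustration; it is applied to the problem of Example 11.

```python
import math


def rk4_step(f, t_n, y_n, h):
    """One time step of the RK4 method stated in Key Point 14."""
    k1 = f(t_n, y_n)
    k2 = f(t_n + 0.5 * h, y_n + 0.5 * h * k1)
    k3 = f(t_n + 0.5 * h, y_n + 0.5 * h * k2)
    k4 = f(t_n + h, y_n + h * k3)
    return y_n + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


# Example 11: dy/dt = cos(y), y(0) = 3, one step with h = 0.1
y1 = rk4_step(lambda t, y: math.cos(y), 0.0, 3.0, 0.1)
print(f"y(0.1) is approximately {y1:.6f}")
```

Because the code does not round the intermediate values K1 to K4, its result may differ from a hand calculation in the last decimal place shown. Each call makes four evaluations of f, which is the extra cost per step mentioned earlier.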
Task
$$ \frac{dy}{dt} = y(1 - y), \qquad y(0) = 0.7 $$
Carry out one time step of the Runge-Kutta method RK4 with a step size of
h = 0.1 so as to obtain an approximation to y(0.1).
Your solution
Answer
The time step must be carried out in four stages. We start by calculating
$$ K_1 = f(0, y_0) = f(0, 0.7) = 0.210000 $$
a value we now use in finding
$$ K_2 = f(\tfrac{1}{2}h, y_0 + \tfrac{1}{2}hK_1) = f(0.05, 0.710500) = 0.205690 $$
This value $K_2$ is now used in our evaluation of
$$ K_3 = f(\tfrac{1}{2}h, y_0 + \tfrac{1}{2}hK_2) = f(0.05, 0.710284) = 0.205780 $$
which, in turn, is used in
$$ K_4 = f(h, y_0 + hK_3) = f(0.1, 0.720578) = 0.201345 $$
All four of these values are then used to complete the time step
$$ y_1 = y_0 + \frac{h}{6}(K_1 + 2K_2 + 2K_3 + K_4) = 0.7 + \frac{0.1}{6}\left( 0.210000 + 2 \times 0.205690 + 2 \times 0.205780 + 0.201345 \right) = 0.720571 $$
to 6 d.p.
Exercises
1. Assuming the notation established earlier, write down the linear multistep scheme corresponding
to the choices k = 2, $\alpha_0 = 0$, $\alpha_1 = -1$, $\alpha_2 = 1$, $\beta_0 = -\tfrac{1}{12}$, $\beta_1 = \tfrac{2}{3}$, $\beta_2 = \tfrac{5}{12}$.
2. A numerical scheme has been used to approximate the solution of

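$$ \frac{dy}{dt} = t^2 - y^2, \qquad y(0) = 2 $$
and has given the following estimates, to 6 decimal places,
y(0.3) ≈ 1.471433, y(0.32) ≈ 1.447892
Now use the 2-step, explicit linear multistep scheme
$$ y_{n+2} - 1.6y_{n+1} + 0.6y_n = h\left( 5f_{n+1} - 4.6f_n \right) $$
to approximate y(0.34).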
3. Find the roots of the first characteristic polynomial for the linear multistep scheme
5yn+2 + 3yn+1 − 2yn = h (fn+2 + 2fn+1 + fn )
and hence determine whether or not the scheme is zero stable.
4. Find the order of the 2-step linear multistep scheme

$$ y_{n+2} + 2y_{n+1} - 3y_n = \frac{h}{10}\left( 7f_{n+2} + 16f_{n+1} + 17f_n \right) $$
(Would you recommend using this method?)
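5. Suppose that y = y(t) is the solution to the initial value problem
$$ \frac{dy}{dt} = \frac{1}{y^2}, \qquad y(0) = 2 $$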
Carry out one time step of the Runge-Kutta method RK4 with a step size of h = 0.4 so as to
obtain an approximation to y(0.4).
Answers
1. $y_{n+2} - y_{n+1} = \frac{h}{12}\left( 5f_{n+2} + 8f_{n+1} - f_n \right)$
2. Evidently the value h = 0.02 will serve our purposes and we seek y17 ≈ y(0.34). The
values we will need to use in our implementation of the 2-step scheme are y16 = 1.447892,
$y_{15} = 1.471433$ and $f_{16} = f(0.32, y_{16}) = -1.993991$, $f_{15} = f(0.3, y_{15}) = -2.075116$
since $f(t, y) = t^2 - y^2$. It follows that
y17 = 1.6y16 − 0.6y15 + 0.02 × (5f16 − 4.6f15 ) = 1.425279
And we conclude that y(0.34) ≈ 1.425279, to 6 decimal places.
3. The first characteristic polynomial is $\rho(z) = \alpha_2 z^2 + \alpha_1 z + \alpha_0 = 5z^2 + 3z - 2$ and the roots of
ρ(z) = 0 can be found from the quadratic formula. In this case the roots are real and distinct
and are equal to 0.4 and −1. Neither root is larger than 1 in magnitude, and the root z = −1,
which has magnitude equal to 1, is simple. Consequently we conclude
that the method is zero stable.
4. In the established notation we have α2 = 1, α1 = 2 and α0 = −3. The beta terms similarly
come from the coefficients on the right hand side (remembering the denominator of 10).
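Now
$$ c_0 = \sum_j \alpha_j = 0 \quad\text{and}\quad c_1 = \sum_j (j\alpha_j - \beta_j) = 0 $$
We also find that
$$ c_2 = \sum_j \left( \tfrac{1}{2} j^2 \alpha_j - j\beta_j \right) = 0, \qquad c_3 = \sum_j \left( \tfrac{1}{6} j^3 \alpha_j - \tfrac{1}{2} j^2 \beta_j \right) = -0.533333 $$
so that the method is of order 2. This method is not to be recommended however (check
the zero stability).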
5. Each time step must be carried out in four stages. We start by calculating
K1 = f (0, y0 ) = f (0, 2) = 0.250000
a value we now use in finding
$$ K_2 = f(\tfrac{1}{2}h, y_0 + \tfrac{1}{2}hK_1) = f(0.2, 2.050000) = 0.237954 $$
This value $K_2$ is now used in our evaluation of
$$ K_3 = f(\tfrac{1}{2}h, y_0 + \tfrac{1}{2}hK_2) = f(0.2, 2.047591) = 0.238514 $$
which, in turn, is used in
$$ K_4 = f(h, y_0 + hK_3) = f(0.4, 2.095406) = 0.227753 $$
All four of these values are then used to complete the time step
$$ y_1 = y_0 + \frac{h}{6}(K_1 + 2K_2 + 2K_3 + K_4) = 2 + \frac{0.4}{6}\left( 0.250000 + 2 \times 0.237954 + 2 \times 0.238514 + 0.227753 \right) = 2.095379 $$
Predictor-Corrector
Methods 32.3
Introduction
In this final Section on numerical approximations for initial value problems involving ordinary differ-
ential equations we consider predictor-corrector methods. These methods are a way of getting
around the difficulties inherent in implementing certain implicit numerical schemes.

Prerequisites
• review the preceding material in this Workbook

Learning Outcomes
• implement simple predictor-corrector methods

1. Predictor-corrector methods
We have seen that when using an implicit linear multistep method there is an additional difficulty
because we cannot, in general, solve simply for the newest approximate y-value yn+k . A general
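k-step implicit method involves, at each time step,
$$ \alpha_k y_{n+k} + \dots + \alpha_1 y_{n+1} + \alpha_0 y_n = h\left( \beta_k f_{n+k} + \dots + \beta_1 f_{n+1} + \beta_0 f_n \right) $$
in which $y_{n+k}$ is the unknown on the left-hand side and it occurs on the right-hand side too, through $f_{n+k}$.
If f depends on y in a complicated way then it is not obvious how to dig $y_{n+k}$ out of
$f_{n+k} = f((n + k)h, y_{n+k})$.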
One solution to this problem would be to only ever use explicit methods in which βk = 0. But this
is not a good solution, for implicit methods generally have better properties than the explicit ones
(for example, the implicit trapezium is second order while the explicit Euler is only first order).
Another solution involves a so-called predictor-corrector method. This involves
1. The predictor step. We use an explicit method to obtain an approximation $y^P_{n+k}$ to $y_{n+k}$.
2. The corrector step. We use an implicit method, but with the predicted value $y^P_{n+k}$ on the
right-hand side in the evaluation of $f_{n+k}$. We use $f^P_{n+k}$ to denote this approximate (predicted)
value of $f_{n+k}$.
3. We can then go on to correct again and again. At each step we put the latest approximation
to yn+k in the right-hand side of the scheme (via f ) to generate a new approximation from the
left-hand side.
(This is not unlike an implementation of Newton-Raphson. In that method we require an initial guess
(we “predict”) and then the Newton-Raphson approach tells us how to iterate (or “correct”) our
latest approximation. The main difference here is that we have a systematic way of obtaining the
initial prediction.)
It is sufficient for our purposes to illustrate the idea of a predictor-corrector method using the simplest
possible pair of methods. We use Euler’s method to predict and the trapezium method to correct.
Example 12
$$ \frac{dy}{dt} = t + y, \qquad y(0) = 3 $$
Use Euler’s method and the trapezium method as a predictor-corrector pair (with
one correction at each time step). Take the time step to be h = 0.05 so as to
obtain approximations to y(0.05) and y(0.1).
Solution
Euler’s method, yn+1 = yn + hfn , is the explicit method so we use that to predict. For the first
time step we require f0 = f (0, y0 ) = f (0, 3) = 3 and therefore
$$ y_1^P = y_0 + hf_0 = 3 + 0.05 \times 3 = 3.15 $$
We now use this predicted value of $y_1$ to obtain a “predicted” value for $f_1$ which we can use in the
implicit trapezium method. We find $f_1^P = f(h, y_1^P) = f(0.05, 3.15) = 3.2$. We now correct using
the trapezium method in the form
$$ y_1 = y_0 + \frac{h}{2}\left( f_0 + f_1^P \right) = 3 + \tfrac{1}{2}(0.05)(3 + 3.2) = 3.155 $$
This completes prediction and one correction for the first time step.
For the second time step we require $f_1 = f(0.05, 3.155) = 3.205$ and therefore
$$ y_2^P = y_1 + hf_1 = 3.155 + 0.05 \times 3.205 = 3.31525 $$
which is the predicted value for $y_2$. Using $f_2^P = f(0.1, 3.31525) = 3.41525$ we now correct it with
$$ y_2 = y_1 + \frac{h}{2}\left( f_1 + f_2^P \right) = 3.155 + \tfrac{1}{2}(0.05)(3.205 + 3.41525) = 3.320506 $$
We conclude that
y(0.05) ≈ 3.155
y(0.1) ≈ 3.320506
If correction is repeated until the corrected values settle down to a converged number then the
approximation inherits all the (nice) properties of the implicit scheme. So, in the example above we
would have second order accurate results obtained by a procedure which gets around the implicit
nature of the trapezium method. Of course in the hand-calculations done above we only corrected
once, rather than repeatedly to convergence.
The example above is such that the dependence of f (t, y) on y is very simple and we could use
the approach seen in Section 32.1 to implement the trapezium method. It turns out that the true
trapezium method approximations to y(0.05) and y(0.1) are y1 = 3.155128 and y2 = 3.320776
respectively, to 6 decimal places. The predictor-corrector method will produce these values if enough
corrections are taken.
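The effect of taking more corrections can be seen with a short program. This is a sketch under stated assumptions: the function name `euler_trapezium` and the parameter `corrections` are hypothetical, and "corrected to convergence" is imitated simply by taking a fixed, generous number of corrections (50).

```python
def euler_trapezium(f, t0, y0, h, steps, corrections=1):
    """Euler predictor with trapezium corrector.

    `corrections` is the number of corrector sweeps taken at each time step.
    Returns the list [y_0, y_1, ..., y_steps].
    """
    t, y = t0, y0
    ys = [y0]
    for _step in range(steps):
        f_n = f(t, y)
        y_new = y + h * f_n                              # predict (Euler)
        for _sweep in range(corrections):                # correct (trapezium)
            y_new = y + 0.5 * h * (f_n + f(t + h, y_new))
        t, y = t + h, y_new
        ys.append(y)
    return ys


def f(t, y):
    return t + y      # right-hand side used in Example 12


once = euler_trapezium(f, 0.0, 3.0, 0.05, 2)                     # one correction
many = euler_trapezium(f, 0.0, 3.0, 0.05, 2, corrections=50)     # many corrections
print([round(v, 6) for v in once])
print([round(v, 6) for v in many])
```

With one correction the values agree with the hand calculation in Example 12; with many corrections they settle on the trapezium values quoted in the paragraph above.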
As noted in the last paragraph, the example above was one in which it is possible to get around the
implicit nature of the trapezium method easily because of the simple way in which the right-hand
side of the differential equation depends on y. This is not true of the next example.
Example 13
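$$ \frac{dy}{dt} = -\tan(y), \qquad y(0) = 1 $$
Use Euler’s method and the trapezium method as a predictor-corrector pair (with
one correction at each time step). Take the time step to be h = 0.2 so as to
obtain approximations to y(0.2) and y(0.4).
Solution
Euler’s method, $y_{n+1} = y_n + hf_n$, is the explicit method so we use that to predict. For the first
time step we require $f_0 = f(0, y_0) = f(0, 1) = -1.55741$ and therefore
$$ y_1^P = y_0 + hf_0 = 1 + 0.2 \times (-1.55741) = 0.688518 $$
We now use this predicted value to obtain a “predicted” value for $f_1$ which we can use in the implicit
trapezium method. We find $f_1^P = f(h, y_1^P) = f(0.2, 0.688518) = -0.82285$. We now correct using
the trapezium method in the form
$$ y_1 = y_0 + \frac{h}{2}\left( f_0 + f_1^P \right) = 1 + \tfrac{1}{2}(0.2)(-1.55741 - 0.822848) = 0.761974 $$
This completes the prediction and one correction for the first time step.
For the second time step we require $f_1 = f(0.2, 0.761974) = -0.95422$ and therefore
$$ y_2^P = y_1 + hf_1 = 0.761974 + 0.2 \times (-0.95422) = 0.571131 $$
which is the predicted value for $y_2$. Using $f_2^P = f(0.4, 0.571131) = -0.64257$ we now correct it with
$$ y_2 = y_1 + \frac{h}{2}\left( f_1 + f_2^P \right) = 0.761974 + \tfrac{1}{2}(0.2)(-0.95422 - 0.64257) = 0.602296 $$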
We conclude that
y(0.2) ≈ 0.761974
y(0.4) ≈ 0.602296
Task
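$$ \frac{dy}{dt} = \cos(y), \qquad y(0) = 0 $$
Use Euler’s method and the trapezium method as a predictor-corrector pair (with
one correction at each time step). Take the time step to be h = 0.1 so as to
obtain approximations to y(0.1) and y(0.2).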
Your solution
Answer
Euler’s method, $y_{n+1} = y_n + hf_n$, is the explicit method so we use that to predict. For the first time
step we require $f_0 = f(0, y_0) = f(0, 0) = 1$ and therefore $y_1^P = y_0 + hf_0 = 0 + 0.1 \times 1 = 0.1$. We
now use this predicted value to obtain a “predicted” value for $f_1$ which we can use in the implicit
trapezium method. We find $f_1^P = f(h, y_1^P) = f(0.1, 0.1) = 0.995004$. We now correct using the
trapezium method in the form
$$ y_1 = y_0 + \frac{h}{2}\left( f_0 + f_1^P \right) = 0 + \tfrac{1}{2}(0.1)(1 + 0.995004) = 0.099750 $$
which completes the prediction and one correction for the first time step.
For the second time step $f_1 = f(0.1, 0.099750) = 0.995029$, so that
$$ y_2^P = y_1 + hf_1 = 0.099750 + 0.1 \times 0.995029 = 0.199253 $$
and, with $f_2^P = f(0.2, 0.199253) = 0.980215$,
$$ y_2 = y_1 + \frac{h}{2}\left( f_1 + f_2^P \right) = 0.099750 + \tfrac{1}{2}(0.1)(0.995029 + 0.980215) = 0.198512 $$
We conclude that y(0.1) ≈ 0.099750, y(0.2) ≈ 0.198512 to six decimal places.
Exercise
$$ \frac{dy}{dt} = \frac{1}{1 + y^2}, \qquad y(0) = 1 $$
Use Euler’s method and the trapezium method as a predictor-corrector pair (with one correction at
each time step). Take the time step to be h = 0.25 so as to obtain approximations to y(0.25) and
y(0.5).
Answer
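Euler’s method, $y_{n+1} = y_n + hf_n$, is the explicit method so we use that to predict. For the first
time step we require $f_0 = f(0, y_0) = f(0, 1) = 0.5$ and therefore
$$ y_1^P = y_0 + hf_0 = 1 + 0.25 \times 0.5 = 1.125 $$
We now use this predicted value to obtain a “predicted” value for $f_1$ which we can use in the implicit
trapezium method. We find $f_1^P = f(h, y_1^P) = f(0.25, 1.125) = 0.441379$. We now correct using
the trapezium method in the form
$$ y_1 = y_0 + \frac{h}{2}\left( f_0 + f_1^P \right) = 1 + \tfrac{1}{2}(0.25)(0.5 + 0.441379) = 1.117672 $$
For the second time step $f_1 = f(0.25, 1.117672) = 0.444604$, so that
$$ y_2^P = y_1 + hf_1 = 1.117672 + 0.25 \times 0.444604 = 1.228823 $$
and, with $f_2^P = f(0.5, 1.228823) = 0.398405$,
$$ y_2 = y_1 + \frac{h}{2}\left( f_1 + f_2^P \right) = 1.117672 + \tfrac{1}{2}(0.25)(0.444604 + 0.398405) = 1.223049 $$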
We conclude that y(0.25) ≈ 1.117672, y(0.5) ≈ 1.223049 to six decimal places.

Parabolic PDEs 32.4
Introduction
Second-order partial differential equations (PDEs) may be classified as parabolic, hyperbolic or elliptic.
Parabolic and hyperbolic PDEs often model time dependent processes involving initial data.
In this Section we consider numerical solutions of parabolic problems.

Prerequisites
• review difference methods for first and second derivatives (31.3)

Learning Outcomes
On completion you should be able to . . .
• implement simple methods to obtain approximate solutions of the heat diffusion equation

1. Definitions
We begin by giving some definitions.
Suppose that u = u(x, t) satisfies the second order partial differential equation
$$ A u_{xx} + B u_{xt} + C u_{tt} + D u_x + E u_t + F u = G $$
in which A, . . . , G are given functions. This equation is said to be
parabolic if $B^2 - 4AC = 0$
hyperbolic if $B^2 - 4AC > 0$
elliptic if $B^2 - 4AC < 0$
These may look like rather abstract definitions at this stage, but we will see that equations of different
types give rise to mathematical models of different physical situations. In this Section we will consider
equations only of the parabolic type. The hyperbolic type is dealt with later in this Workbook and
the elliptic type is discussed in 33.
2. Motivation
Consider an example of the type seen in the earlier material concerning separable solutions of the
heat conduction equation. Suppose that u = u(x, t) is the temperature of a metal bar a distance x
from one end and at time t. For the sake of argument let us suppose that the metal bar has length
equal to ℓ and that the ends are held at constant temperatures $u_L$ at the left and $u_R$ at the right.
[Figure 2: the metal bar lying along the x-axis from x = 0, with end temperatures $u_L$ (left) and $u_R$ (right)]
We also suppose that the temperature distribution at the initial time is known to be f(x), with
$f(0) = u_L$ and $f(\ell) = u_R$ so that the initial and boundary conditions do not give rise to a conflict
at the ends of the bar at the initial time.
This physical situation may be modelled by

$$ u_t = \alpha u_{xx} \quad (0 < x < \ell,\ t > 0) $$
$$ u(0, t) = u_L \quad (t > 0) $$
$$ u(\ell, t) = u_R \quad (t > 0) $$
$$ u(x, 0) = f(x) \quad (0 < x < \ell) $$

in which α > 0 is a constant called the thermal diffusivity or simply the diffusivity of the metal.
If the bar is made of aluminium then $\alpha = 0.86\ \mathrm{cm^2\,s^{-1}}$, and if made of copper then $\alpha = 1.14\ \mathrm{cm^2\,s^{-1}}$.
Using separation of variables and Fourier series (neither of which are required for the remainder of this
Section) it can be shown that the solution to the above problem (in the case where uL = uR = 0) is
$$ u(x, t) = \sum_{m=1}^{\infty} B_m e^{-m^2 \alpha \pi^2 t / \ell^2} \sin(m\pi x/\ell), \quad\text{where}\quad B_m = \frac{2}{\ell} \int_0^{\ell} f(s) \sin(m\pi s/\ell)\, ds. $$
Now, let us be realistic. Any evaluation of u for particular choices of x and t must involve
approximating the infinite series that defines u (that is, just taking the first few terms, and care is
required if we are to be sure that we have taken enough). Also, in each of the terms we retain in
the sum, we need to find $B_m$ by integration. It is not surprising that carrying out this procedure
by computer is a common approach. So if we (eventually) resort to computation in order to find u, why not start
with a computational approach?
(This is not to say that there is no value in the analytic solution involving the Bm . The solution above
is of great value, but we simply observe here that there are times when a computational approach is
all we may end up needing.)
So, the aim of this Section is to derive methods for obtaining numerical solutions to parabolic
problems of the type above. In fact, it is sufficient for our present purposes to restrict attention to
that particular problem.
3. Approximating partial derivatives

Earlier, in 31.3, we saw methods for approximating first and second derivatives of a function
of one variable. We review some of that material here. If y = y(x) then the forward and central
difference approximations to the first derivative are:
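$$ \frac{dy}{dx} \approx \frac{y(x + \delta x) - y(x)}{\delta x}, \qquad \frac{dy}{dx} \approx \frac{y(x + \delta x) - y(x - \delta x)}{2\,\delta x} $$
and the central difference approximation to the second derivative is:
$$ \frac{d^2 y}{dx^2} \approx \frac{y(x + \delta x) - 2y(x) + y(x - \delta x)}{(\delta x)^2} $$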
in which δx is a small x-increment. The quantity δx is what we previously referred to as h, but it is
now convenient to use a notation which is more closely related to the independent variable (in this
case x). (Examples implementing the difference approximations for derivatives can be found in
31.)
We now return to the subject of this Section, that of partial derivatives. The PDE $u_t = \alpha u_{xx}$ involves
the first derivative $\frac{\partial u}{\partial t}$ and the second derivative $\frac{\partial^2 u}{\partial x^2}$. We now adapt the ideas used for functions of
one variable to the present case involving u = u(x, t).
Let δt be a small increment of t, then the partial derivative $\frac{\partial u}{\partial t}$ may be approximated by:
$$ \frac{\partial u}{\partial t} \approx \frac{u(x, t + \delta t) - u(x, t)}{\delta t} $$
Let δx be a small increment of x, then the partial derivative $\frac{\partial^2 u}{\partial x^2}$ may be approximated by:
$$ \frac{\partial^2 u}{\partial x^2} \approx \frac{u(x + \delta x, t) - 2u(x, t) + u(x - \delta x, t)}{(\delta x)^2} $$
The two difference approximations above are the ones we will use later in this Section. Example 14
below refers to these and others.
Example 14
Consider the function u defined by
$$ u(x, t) = \sin(x^2 + 2t) $$
Using increments of δx = 0.004 and δt = 0.04, and working to 8 decimal places,
approximate
(a) ux (2, 3) with a one-sided forward difference
(b) uxx (2, 3) with a central difference
(c) ut (2, 3) with a one-sided forward difference
(d) ut (2, 3) with a central difference.
Enter your approximate derivatives to 3 decimal places.
Solution
The evaluations of u we will need are u(x, t) = −0.54402111, u(x + δx, t) = −0.55738933,
u(x − δx, t) = −0.53054047, u(x, t + δt) = −0.60933532, u(x, t − δt) = −0.47522703. It follows
that
(a) $u_x(2, 3) \approx \dfrac{-0.55738933 + 0.54402111}{0.004} = -3.342$
(b) $u_{xx}(2, 3) \approx \dfrac{-0.55738933 + 2 \times 0.54402111 - 0.53054047}{0.004^2} = 7.026$
(c) $u_t(2, 3) \approx \dfrac{-0.60933532 + 0.54402111}{0.04} = -1.633$
(d) $u_t(2, 3) \approx \dfrac{-0.60933532 + 0.47522703}{2 \times 0.04} = -1.676$
to 3 decimal places. (Workings shown to 8 decimal places.)
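The four approximations in Example 14 can be reproduced directly from the difference formulae. The sketch below is for illustration only; the variable names are assumptions and the function u is the one defined in the Example.

```python
import math


def u(x, t):
    """The function of Example 14."""
    return math.sin(x**2 + 2.0 * t)


x, t = 2.0, 3.0          # point at which the derivatives are wanted
dx, dt = 0.004, 0.04     # increments delta x and delta t

ux_forward = (u(x + dx, t) - u(x, t)) / dx                           # part (a)
uxx_central = (u(x + dx, t) - 2.0 * u(x, t) + u(x - dx, t)) / dx**2  # part (b)
ut_forward = (u(x, t + dt) - u(x, t)) / dt                           # part (c)
ut_central = (u(x, t + dt) - u(x, t - dt)) / (2.0 * dt)              # part (d)

for name, value in [("u_x forward", ux_forward), ("u_xx central", uxx_central),
                    ("u_t forward", ut_forward), ("u_t central", ut_central)]:
    print(f"{name}: {value:.3f}")
```

For comparison, differentiating by hand gives $u_t = 2\cos(x^2 + 2t)$, which is about −1.678 at (2, 3), so the central difference in part (d) is noticeably closer than the one-sided difference in part (c).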
4. An explicit numerical method for the heat equation

The approximations used above for approximating partial derivatives can now be applied in order to
derive a numerical method for solving the heat conduction problem
$$ u_t = \alpha u_{xx} \quad (0 < x < \ell,\ t > 0) $$
$$ u(0, t) = 0 \quad (t > 0) $$
$$ u(\ell, t) = 0 \quad (t > 0) $$
$$ u(x, 0) = f(x) \quad (0 < x < \ell). $$
In order to specify the numerical method we choose values for δt and δx and use these in approx-
imations of the two derivatives in the partial differential equation. It is convenient to divide the
interval 0 < x < ℓ into equally spaced subintervals so, in effect, we choose a whole number J so
that $\delta x = \ell / J$.
Key Point 15
In order to specify the numerical procedure for solving the heat conduction equation
$$ \frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2} $$
we need to choose
δt − the time step

δx − the space step
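[Figure 3: grid in the (x, t) plane, with vertical lines at x = δx, 2δx, . . . (labelled j = 1, 2, 3, . . . , J − 1, J) and horizontal lines at t = δt, 2δt, . . . (labelled n = 1, 2, 3, 4, . . . )]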
The diagram above shows the independent variables x and t at which we seek the function u. The
numerical solution we shall find is a sequence of numbers which approximate u at a sequence of (x, t)
points.
Key Point 16
The numerical approximations to u(x, t) that we will find will be approximations to u at (x, t) values
where the horizontal and vertical lines cross in the above diagram (Figure 3).
The notation we use is that

$$ u_j^n \approx u(j\,\delta x,\ n\,\delta t) $$
in which the left-hand side is the numerical approximation and the right-hand side is the exact
(i.e., unknown) solution evaluated at x = j × δx, t = n × δt.
The idea is that the subscript j counts how many “steps” to the right we have taken from the origin
and the superscript n counts how many time-steps (up, on the diagram) we have taken. To say this
another way
in $u_j^n$ the superscript n counts up the t values and the subscript j counts across the x values.
For example, consider the point on Figure 3 which is highlighted with a small square. This point is
two steps to the right of the origin (so that j = 2) and five steps up (so that n = 5). The exact
solution evaluated at this point is u(2δx, 5δt) and our numerical approximation to that value is $u_2^5$.
Combining this new notation with the familiar idea for approximating derivatives we obtain the
following approximation to the PDE
un+1
j − unj unj−1 − 2unj + unj+1
=α
δt (δx)2
Key Point 17
The exact solution u = u(x, t) satisfies the partial differential equation
ut = αuxx
The approximate (numerical) solution satisfies the difference equation
$$\frac{u_j^{n+1} - u_j^n}{\delta t} = \alpha\,\frac{u_{j-1}^n - 2u_j^n + u_{j+1}^n}{(\delta x)^2}$$
The difference between the unknown exact solution and the numerical solution will be governed by
how well the one-sided and central differences approximate the partial derivatives in the PDE.
To simplify (the appearance of) the numerical method we define a new quantity r = αδt/(δx)² so that our numerical procedure can be written
$$u_j^{n+1} = u_j^n + r(u_{j-1}^n - 2u_j^n + u_{j+1}^n) = r\,u_{j-1}^n + (1 - 2r)\,u_j^n + r\,u_{j+1}^n$$
This equation defines a numerical “stencil” which allows us to find one of the values at the n + 1 time level in terms of values at the previous level, n. In Figure 4 we envisage terms on the right-hand side of the above equation leading towards a result equal to the left-hand side, and the arrows therefore point towards the point at which $u_j^{n+1}$ approximates u.
[Figure 4: the explicit stencil. Arrows lead from the points j − 1, j, j + 1 at time level n to the point j at time level n + 1.]
At the stage of the process depicted above, the solid circles represent points in the (x, t) plane where we have already found our numerical approximation. The unfilled circle is the point for which the new approximation $u_j^{n+1}$ is being found.
Implementation
The initial condition gives u at t = 0, and this information can be used to find
$u_0^0,\; u_1^0,\; u_2^0,\; \ldots,\; u_J^0$
that is, the numerical solution at all the selected x values and at t = 0. In general
$$u_j^0 = f(j \times \delta x) = f_j$$
where $f_j$ is a shorthand notation for f(j × δx).
Then we use the boundary conditions and numerical method
$$u_j^{n+1} = u_j^n + r(u_{j-1}^n - 2u_j^n + u_{j+1}^n)$$
(with n = 0) to work out $u_j^1$ for j = 0, 1, 2, . . . , J. (This completes the first time-step.)
The time-stepping procedure is then used repeatedly to find $u_j^{n+1}$ in terms of the $u_j^n$, which are known either from the last time-step or (at the beginning) from the initial condition.
The time-stepping procedure is summarised in the following Key Point.
Key Point 18
Here the step-by-step process used to implement the numerical procedure is presented.
1. The initial condition implies that
$u_j^0 = f_j$ (j = 0, 1, 2, . . . , J)
(the boundary conditions could be used to find $u_0^0$ and $u_J^0$, but our supposition is that this is consistent with taking $f_0$ and $f_J$).
2. The first time-step
Here we find $u_j^1$ for j = 0, 1, . . . , J.
(a) The boundary condition at x = 0 is u(0, t) = $u_L$. It follows that $u_0^1 = u_L$.
(b) The boundary condition at x = ℓ is u(ℓ, t) = $u_R$. It follows that $u_J^1 = u_R$.
(c) Now we work from left to right finding $u_j^1$ at the interior points. This is achieved by repeatedly applying the general numerical scheme:
$u_1^1 = u_1^0 + r(u_0^0 - 2u_1^0 + u_2^0)$
$u_2^1 = u_2^0 + r(u_1^0 - 2u_2^0 + u_3^0)$
...
$u_{J-1}^1 = u_{J-1}^0 + r(u_{J-2}^0 - 2u_{J-1}^0 + u_J^0)$
This completes the first time-step. We have taken the initial data and used our approximation to the PDE to obtain an approximate solution at time t = δt.
3. The second time-step
Here we find $u_j^2$ for j = 0, 1, . . . , J.
(a) The boundary condition at x = 0 is u(0, t) = $u_L$. It follows that $u_0^2 = u_L$.
(b) The boundary condition at x = ℓ is u(ℓ, t) = $u_R$. It follows that $u_J^2 = u_R$.
(c) Now we work from left to right finding $u_j^2$ at the interior points. This is achieved by repeatedly applying the general numerical scheme:
$u_1^2 = u_1^1 + r(u_0^1 - 2u_1^1 + u_2^1)$
$u_2^2 = u_2^1 + r(u_1^1 - 2u_2^1 + u_3^1)$
...
$u_{J-1}^2 = u_{J-1}^1 + r(u_{J-2}^1 - 2u_{J-1}^1 + u_J^1)$
This completes the second time-step. We now have an approximation to u at time t = 2δt.
4. And so on ....
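The steps listed in this Key Point can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `explicit_heat_steps` and its arguments are assumptions made for this illustration, not notation from the Workbook. The data used (ℓ = 2, J = 4, r = 0.16, f(x) = x(ℓ − x), both ends held at zero) are those of Example 15.

```python
def explicit_heat_steps(f, ell, J, r, n_steps, u_left=0.0, u_right=0.0):
    """Apply the explicit scheme n_steps times.

    Returns a list of time levels; levels[n][j] approximates u(j*dx, n*dt).
    (Hypothetical helper: name and signature are assumptions.)
    """
    dx = ell / J
    u = [f(j * dx) for j in range(J + 1)]      # step 1: u_j^0 = f_j
    levels = [u]
    for _ in range(n_steps):
        new = [0.0] * (J + 1)
        new[0], new[J] = u_left, u_right        # steps (a), (b): boundary values
        for j in range(1, J):                   # step (c): interior points
            new[j] = u[j] + r * (u[j - 1] - 2 * u[j] + u[j + 1])
        u = new
        levels.append(u)
    return levels


ell = 2.0
levels = explicit_heat_steps(lambda x: x * (ell - x), ell, J=4, r=0.16, n_steps=2)
for n, level in enumerate(levels):
    print(n, [round(v, 3) for v in level])
```

Each pass of the outer loop uses only values from the previous time level, which is what makes the scheme explicit.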
The following is a concrete example of the time-stepping procedure.
Example 15
The temperature u(x, t) of a metal bar of length ℓ = 2 at a distance x from one
end and at time t is modelled by the partial differential equation
ut = αuxx (0 < x < ℓ, t > 0)
It is given that the metal has diffusivity α = 4, that the two ends of the bar are
kept at temperature u = 0 and that the initial temperature distribution is
u(x, 0) = f(x) = x(ℓ − x)
Use the explicit difference scheme with δx = 0.5 and δt = 0.01 to approximate u(x, t) at t = δt and t = 2δt.
Solution
In this case r = αδt/(δx)² = 0.16 so that the numerical method can be written
$$u_j^{n+1} = u_j^n + 0.16(u_{j-1}^n - 2u_j^n + u_{j+1}^n) = 0.68u_j^n + 0.16(u_{j-1}^n + u_{j+1}^n)$$
We now find $u_j^0$:
$u_0^0 = 0$ from the left-hand boundary condition
$u_1^0 = f(\delta x) = 0.75$ from the initial condition
$u_2^0 = f(2\delta x) = 1$ from the initial condition
$u_3^0 = f(3\delta x) = 0.75$ from the initial condition
$u_4^0 = 0$ from the boundary condition at the right-hand end
The first time-step will find $u_j^1$, but first we note that $u_0^1 = u_4^1 = 0$ from the two boundary conditions. Now
$u_1^1 = 0.68u_1^0 + 0.16(u_0^0 + u_2^0) = 0.68 \times 0.75 + 0.16(0 + 1) = 0.670$
$u_2^1 = 0.68u_2^0 + 0.16(u_1^0 + u_3^0) = 0.68 \times 1 + 0.16(0.75 + 0.75) = 0.920$
$u_3^1 = 0.68u_3^0 + 0.16(u_2^0 + u_4^0) = 0.68 \times 0.75 + 0.16(1 + 0) = 0.670$
The second time-step will find $u_j^2$, but first we note that $u_0^2 = u_4^2 = 0$ from the two boundary conditions. Now
$u_1^2 = 0.68u_1^1 + 0.16(u_0^1 + u_2^1) = 0.68 \times 0.67 + 0.16(0 + 0.92) = 0.603$
$u_2^2 = 0.68u_2^1 + 0.16(u_1^1 + u_3^1) = 0.68 \times 0.92 + 0.16(0.67 + 0.67) = 0.84$
$u_3^2 = 0.68u_3^1 + 0.16(u_2^1 + u_4^1) = 0.68 \times 0.67 + 0.16(0.92 + 0) = 0.603$
(Quantities have been rounded to three decimal places here.)
Figure 5 plots the numerical solutions found in the example above. The initial condition is shown as circles. Results of the first time-step appear as squares and the second time-step is shown as stars. The lines joining the values we found are not part of the numerical solution and are included only as an aid to clarity.
[Figure 5: u plotted against x for 0 ≤ x ≤ 2, showing $u_j^0$, $u_j^1$ and $u_j^2$.]
Notice how the numerical results are behaving as they should. The temperature decreases slightly at
each time-step.
Task
ut = αuxx (0 < x < ℓ, t > 0)
It is given that the metal has diffusivity α = 2.25, that the two ends of the bar
are kept at temperature u = 0 and that the initial temperature distribution is
u(x, 0) = f(x) = sin(πx/ℓ)
Your solution
Initial condition and first time-step:
Answer
In this case r = αδt/(δx)² = 0.45 so that the numerical scheme can be written
$$u_j^{n+1} = u_j^n + 0.45(u_{j-1}^n - 2u_j^n + u_{j+1}^n) = 0.1u_j^n + 0.45(u_{j-1}^n + u_{j+1}^n)$$
The first stage is to use the given data to find $u_j^0$:
$u_0^0 = 0$ from the boundary condition
$u_1^0 = f(\delta x) = f(0.5) = 0.707$ from the initial condition
$u_2^0 = f(2\delta x) = f(1) = 1$ from the initial condition
$u_3^0 = f(3\delta x) = f(1.5) = 0.707$ from the initial condition
The first time-step will find $u_j^1$. First we note that the boundary condition implies that $u_0^1 = u_4^1 = 0$.
$u_1^1 = 0.1u_1^0 + 0.45(u_0^0 + u_2^0) = 0.1 \times 0.71 + 0.45(0 + 1) = 0.521$
$u_2^1 = 0.1u_2^0 + 0.45(u_1^0 + u_3^0) = 0.1 \times 1 + 0.45(0.71 + 0.71) = 0.736$
$u_3^1 = 0.1u_3^0 + 0.45(u_2^0 + u_4^0) = 0.1 \times 0.71 + 0.45(1 + 0) = 0.521$
Your solution
Second time-step:
Answer
The second time-step will find $u_j^2$. First we note that the boundary condition implies that $u_0^2 = u_4^2 = 0$. Now
$u_1^2 = 0.1u_1^1 + 0.45(u_0^1 + u_2^1) = 0.1 \times 0.52 + 0.45(0 + 0.74) = 0.383$
$u_2^2 = 0.1u_2^1 + 0.45(u_1^1 + u_3^1) = 0.1 \times 0.74 + 0.45(0.52 + 0.52) = 0.542$
$u_3^2 = 0.1u_3^1 + 0.45(u_2^1 + u_4^1) = 0.1 \times 0.52 + 0.45(0.74 + 0) = 0.383$
5. Stability of the simple explicit scheme

The purpose of the time-stepping scheme is to approximate u(x, t) at later and later times t. It is
clear that the larger we take the time step δt, the fewer steps will be necessary to reach a particular
time t. One constraint on the size of δt is that we know from our earlier look at difference methods
that derivative approximations are most accurate when small increments are used. However, as we will
see in the next couple of pages, a far more telling constraint on the size of δt arises on consideration
of stability. We begin with an Example.
Example 16

ut = αuxx (0 < x < ℓ, t > 0)
u(x, 0) = f(x) = x(ℓ − x)
Solution
$$u_j^{n+1} = u_j^n + 1.2(u_{j-1}^n - 2u_j^n + u_{j+1}^n) = -1.4u_j^n + 1.2(u_{j-1}^n + u_{j+1}^n)$$
$u_1^1 = -1.4u_1^0 + 1.2(u_0^0 + u_2^0) = -1.4 \times 0.19 + 1.2(0 + 0.25) = 0.038$
$u_2^1 = -1.4u_2^0 + 1.2(u_1^0 + u_3^0) = -1.4 \times 0.25 + 1.2(0.188 + 0.188) = 0.1$
$u_3^1 = -1.4u_3^0 + 1.2(u_2^0 + u_4^0) = -1.4 \times 0.19 + 1.2(0.25 + 0) = 0.038$
$u_4^2 = 0$. Now
$u_1^2 = -1.4u_1^1 + 1.2(u_0^1 + u_2^1) = -1.4 \times 0.04 + 1.2(0 + 0.1) = 0.067$
$u_2^2 = -1.4u_2^1 + 1.2(u_1^1 + u_3^1) = -1.4 \times 0.1 + 1.2(0.038 + 0.038) = -0.05$
$u_3^2 = -1.4u_3^1 + 1.2(u_2^1 + u_4^1) = -1.4 \times 0.04 + 1.2(0.1 + 0) = 0.067$
Figure 6 shows the results found in Example 16.
[Figure 6: u plotted against x for 0 ≤ x ≤ 1, showing $u_j^0$, $u_j^1$ and $u_j^2$; the vertical axis runs from −0.05 to 0.25.]
Something has gone wrong here. And it only gets worse in subsequent time-steps. After 9 time-steps
the numerical solution approximating u(x, t) at t = 9δt is
u(0.25, 9δt) ≈ $u_1^9$ = −140.5531
u(0.50, 9δt) ≈ $u_2^9$ = 198.7722
u(0.75, 9δt) ≈ $u_3^9$ = −140.5531
(to 4 decimal places). This is an example of instability. A part of the numerical solution wants to
keep growing and growing in a way that is not a part of the engineering application being modelled.
There are many different definitions of (in)stability, and they often depend on the specific application
in mind. For the heat conduction problem under discussion here, the following definition is sufficient.
Key Point 19

The explicit difference scheme
$$u_j^{n+1} = r\,u_{j-1}^n + (1 - 2r)\,u_j^n + r\,u_{j+1}^n, \qquad r = \frac{\alpha\,\delta t}{(\delta x)^2}$$
$u_0^n = 0$ (n > 0)
$u_J^n = 0$ (n > 0)
$u_j^0 = f(j\,\delta x)$ (j = 1, 2, . . . , J − 1)
where Jδx = ℓ, approximating the heat conduction problem
ut = αuxx (0 < x < ℓ, t > 0)
u(0, t) = 0 (t > 0)
u(ℓ, t) = 0 (t > 0)
u(x, 0) = f(x) (0 < x < ℓ)
is said to be stable if the approximations $u_j^n$ do not grow in magnitude with n.
(Of course, there are applications where the principal quantity of interest does grow with time, and
in these cases other definitions of stability are appropriate.)
The main stability result for the explicit scheme is proved in many textbooks on the subject, but for
this Workbook it is sufficient to simply state it.
Key Point 20
The explicit scheme is stable if and only if
$$r \le \tfrac{1}{2}$$
Writing this another way we see that the restriction on the time-step is that
$$\delta t \le \frac{(\delta x)^2}{2\alpha}$$
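The effect of the bound r ≤ 1/2 can be observed numerically. The sketch below is illustrative only: the helper names `explicit_step` and `run` are assumptions, and the grid (ℓ = 1, δx = 0.25, f(x) = x(1 − x)) is inferred from the axis of Figure 6 and the printed starting values rather than stated explicitly in the text. It applies nine explicit time-steps once with r = 1.2 and once with r = 0.5.

```python
def explicit_step(u, r):
    """One explicit time-step with both ends held at zero (hypothetical helper)."""
    interior = [u[j] + r * (u[j - 1] - 2 * u[j] + u[j + 1])
                for j in range(1, len(u) - 1)]
    return [0.0] + interior + [0.0]


def run(r, n_steps, u0):
    """Apply explicit_step n_steps times starting from u0."""
    u = list(u0)
    for _ in range(n_steps):
        u = explicit_step(u, r)
    return u


dx = 0.25                                        # assumed grid: 0, 0.25, ..., 1
u0 = [j * dx * (1 - j * dx) for j in range(5)]   # assumed f(x) = x(1 - x)

unstable = run(1.2, 9, u0)   # r > 1/2: values grow and alternate in sign
stable = run(0.5, 9, u0)     # r = 1/2: values stay bounded and decay

print([round(v, 4) for v in unstable])
print([round(v, 4) for v in stable])
```

With r = 1.2 the interior values grow to the order of hundreds after nine steps, whereas with r = 0.5 they remain smaller than the initial data.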
Why is the stability constraint a problem?

In the above account it has been stated that the stability constraint is a severe restriction on the
time-step δt. Here we discuss why this is the case.
For sake of argument let us take an example where α = 1 and choose δx = 1/10. The stability requirement insists that we must choose
$$\delta t \le \tfrac{1}{2}(\delta x)^2 = \tfrac{1}{200},$$
which is much smaller than δx. If we require an even smoother approximation in the x direction we could halve δx, taking it to be equal to 1/20. It is now necessary that
$$\delta t \le \tfrac{1}{2}(\delta x)^2 = \tfrac{1}{800}.$$
Decreasing δx by a factor of 2 causes δt to decrease by a factor of 4. The problem is that the upper
bound on δt involves the square of δx, which is likely to be very small.
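The arithmetic above is easy to reproduce exactly. The following sketch uses Python's `fractions` module so that no rounding occurs; the function name `max_stable_time_step` is an assumption made for this illustration. It evaluates the bound δt ≤ (δx)²/(2α) for α = 1 and prints the corresponding minimum number of time-steps needed to reach t = 1.

```python
from fractions import Fraction


def max_stable_time_step(dx, alpha):
    """Largest time step allowed by r <= 1/2, i.e. dt <= dx^2 / (2 alpha)."""
    return dx * dx / (2 * alpha)


for dx in (Fraction(1, 10), Fraction(1, 20), Fraction(1, 40)):
    dt = max_stable_time_step(dx, alpha=1)
    # 1 / dt is the smallest number of steps of size dt needed to reach t = 1
    print(f"dx = {dx}, dt <= {dt}, steps to reach t = 1: {1 / dt}")
```

Each halving of δx multiplies the number of time-steps required by four.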
The following method overcomes the requirement of tiny time-steps.
6. The Crank-Nicolson method

In the notation established for the explicit method, the so-called Crank-Nicolson scheme can be
written

$$u_j^{n+1} = u_j^n + \tfrac{1}{2}\,r\Big(\underbrace{u_{j-1}^n - 2u_j^n + u_{j+1}^n}_{\dagger} + \underbrace{u_{j-1}^{n+1} - 2u_j^{n+1} + u_{j+1}^{n+1}}_{\ddagger}\Big)$$
which might, at first glance, look off-puttingly complicated. To aid clarity, certain groups of terms
have been gathered together in the above:
† these are the terms that appeared on the right hand side of the explicit method and are involved
with approximating uxx at time t = n δt
‡ these are very similar to the † terms, but all the superscripts are n + 1 instead of n, that is the
terms ‡ approximate uxx at time t = (n + 1) δt
(the factor of 1/2 outside the large bracket shows that we take the average of † and ‡)
Figure 7 shows another way of thinking of this numerical method. As in the earlier diagram of this
type, arrows point away from positions relating to terms on the right-hand side of the numerical
scheme.
[Figure 7: the Crank-Nicolson stencil, linking the points j − 1, j, j + 1 at time levels n and n + 1.]
The new terms in the Crank-Nicolson method, as compared with the explicit method, give rise to
two new unfilled circles on the diagram and the horizontal arrows.
The implementation of this method is similar to that used for the explicit method, but there is a key
difference. The Crank-Nicolson scheme is implicit, for consider its use in the first time-step when
finding $u_j^1$,
$$u_j^1 = \underbrace{u_j^0}_{X} + \tfrac{1}{2}\,r\Big(\underbrace{u_{j-1}^0}_{X} - 2\underbrace{u_j^0}_{X} + \underbrace{u_{j+1}^0}_{X} + \underbrace{u_{j-1}^1 - 2u_j^1 + u_{j+1}^1}_{?}\Big)$$
The terms labelled X are known from the initial condition. But there are other unknown terms on
the right-hand side. We cannot simply “read off” the values at the new time-step as we did using the
explicit scheme. Instead we have to store all of the equations given by the stencil at a particular time-
step and then solve them as a system of simultaneous equations. The following Example illustrates
this point.
Example 17
The temperature u(x, t) of a metal bar of length ℓ = 1.2 at a distance x from one
ut = αuxx (0 < x < ℓ, t > 0).
$$u(x, 0) = f(x) = x\sqrt{(\ell - x)^3}$$
Use the Crank-Nicolson difference scheme with δx = 0.4 and δt = 0.1 to approximate u(x, t) at t = δt and t = 2δt.
Solution
$$u_j^{n+1} = u_j^n + \frac{0.62500}{2}\big(u_{j-1}^n - 2u_j^n + u_{j+1}^n + u_{j-1}^{n+1} - 2u_j^{n+1} + u_{j+1}^{n+1}\big)$$
Moving the unknowns to the left of the equation we obtain
$$-0.31250\,u_{j-1}^{n+1} + 1.62500\,u_j^{n+1} - 0.31250\,u_{j+1}^{n+1} = 0.37500\,u_j^n + 0.31250\,(u_{j-1}^n + u_{j+1}^n)$$

Two uses of the stencil give
$-0.31250u_0^1 + 1.62500u_1^1 - 0.31250u_2^1 = 0.37500u_1^0 + 0.31250(u_0^0 + u_2^0) = 0.17058$
$-0.31250u_1^1 + 1.62500u_2^1 - 0.31250u_3^1 = 0.37500u_2^0 + 0.31250(u_1^0 + u_3^0) = 0.16534$
The implicit nature of this method means that we have to do some extra work to complete the
time-step. We must now solve the simultaneous equations
$$\begin{pmatrix} 1.62500 & -0.31250 \\ -0.31250 & 1.62500 \end{pmatrix} \begin{pmatrix} u_1^1 \\ u_2^1 \end{pmatrix} = \begin{pmatrix} 0.17058 \\ 0.16534 \end{pmatrix}$$
In this case there are only two unknowns and it is a simple matter to solve the pair of equations to give $u_1^1 = 0.12932$ and $u_2^1 = 0.12662$.
Solution (contd.)
$u_3^2 = 0$. Two uses of the stencil give
$-0.31250u_0^2 + 1.62500u_1^2 - 0.31250u_2^2 = 0.37500u_1^1 + 0.31250(u_0^1 + u_2^1) = 0.08806$
$-0.31250u_1^2 + 1.62500u_2^2 - 0.31250u_3^2 = 0.37500u_2^1 + 0.31250(u_1^1 + u_3^1) = 0.08789$
$$\begin{pmatrix} 1.62500 & -0.31250 \\ -0.31250 & 1.62500 \end{pmatrix} \begin{pmatrix} u_1^2 \\ u_2^2 \end{pmatrix} = \begin{pmatrix} 0.08806 \\ 0.08789 \end{pmatrix}$$
give $u_1^2 = 0.06707$ and $u_2^2 = 0.06699$.
Figure 8 depicts the numerical solutions found in Example 17 above. (Again, the dotted lines are
intended to aid clarity, they are not part of the numerical solution.)
[Figure 8: u plotted against x for 0 ≤ x ≤ 1.2, showing $u_j^0$, $u_j^1$ and $u_j^2$.]
Task
The temperature u(x, t) of a metal bar of length ℓ = 0.9 at a distance x from one
ut = αuxx (0 < x < ℓ, t > 0).
It is given that the metal has diffusivity α = 0.25, that the two ends of the bar
are kept at temperature u = 0 and that the initial temperature distribution is
u(x, 0) = f(x) = sin(πx/ℓ)
Use the Crank-Nicolson difference scheme with δx = 0.3 and δt = 0.2 to approximate u(x, t) at t = δt and t = 2δt.
Your solution
Initial condition and first time-step:
Answer
$$u_j^{n+1} = u_j^n + \frac{0.55556}{2}\big(u_{j-1}^n - 2u_j^n + u_{j+1}^n + u_{j-1}^{n+1} - 2u_j^{n+1} + u_{j+1}^{n+1}\big)$$
$$-0.27778\,u_{j-1}^{n+1} + 1.55556\,u_j^{n+1} - 0.27778\,u_{j+1}^{n+1} = 0.44444\,u_j^n + 0.27778\,(u_{j-1}^n + u_{j+1}^n)$$
Two uses of the stencil give
$-0.27778u_0^1 + 1.55556u_1^1 - 0.27778u_2^1 = 0.44444u_1^0 + 0.27778(u_0^0 + u_2^0) = 0.62546$
$-0.27778u_1^1 + 1.55556u_2^1 - 0.27778u_3^1 = 0.44444u_2^0 + 0.27778(u_1^0 + u_3^0) = 0.62546$
$$\begin{pmatrix} 1.55556 & -0.27778 \\ -0.27778 & 1.55556 \end{pmatrix} \begin{pmatrix} u_1^1 \\ u_2^1 \end{pmatrix} = \begin{pmatrix} 0.62546 \\ 0.62546 \end{pmatrix}$$
give $u_1^1 = 0.48949$ and $u_2^1 = 0.48949$.
Your solution
Second time-step:
Answer
$u_3^2 = 0$. Two uses of the stencil give
$-0.27778u_0^2 + 1.55556u_1^2 - 0.27778u_2^2 = 0.44444u_1^1 + 0.27778(u_0^1 + u_2^1) = 0.35352$
$-0.27778u_1^2 + 1.55556u_2^2 - 0.27778u_3^2 = 0.44444u_2^1 + 0.27778(u_1^1 + u_3^1) = 0.35352$
$$\begin{pmatrix} 1.55556 & -0.27778 \\ -0.27778 & 1.55556 \end{pmatrix} \begin{pmatrix} u_1^2 \\ u_2^2 \end{pmatrix} = \begin{pmatrix} 0.35352 \\ 0.35352 \end{pmatrix}$$
give $u_1^2 = 0.27667$ and $u_2^2 = 0.27667$.
In general
Having now seen some instances with a relatively large δx, we now look at the general case where
the space step may be much smaller. In this case there will be a larger system of equations to solve
at each time-step than was the case above.
In general, the procedure of moving the unknowns to the left hand side of the equation leads to
$$-\frac{r}{2}\,u_{j-1}^{n+1} + (1 + r)\,u_j^{n+1} - \frac{r}{2}\,u_{j+1}^{n+1} = \frac{r}{2}\,u_{j-1}^n + (1 - r)\,u_j^n + \frac{r}{2}\,u_{j+1}^n$$
which we apply all the way along the x-axis. That is, we put j = 1, 2, 3, . . . , J − 1 in the above
expression and hence derive a system of equations for all the u with superscript n + 1.
    
$$\begin{pmatrix} 1+r & -\frac{r}{2} & 0 & \cdots & 0 \\ -\frac{r}{2} & 1+r & -\frac{r}{2} & & \vdots \\ 0 & -\frac{r}{2} & 1+r & \ddots & 0 \\ \vdots & & \ddots & \ddots & -\frac{r}{2} \\ 0 & \cdots & 0 & -\frac{r}{2} & 1+r \end{pmatrix} \begin{pmatrix} u_1^{n+1} \\ u_2^{n+1} \\ u_3^{n+1} \\ \vdots \\ u_{J-1}^{n+1} \end{pmatrix} = \begin{pmatrix} \frac{r}{2}\,\underline{u_0^n} + (1-r)\,u_1^n + \frac{r}{2}\,u_2^n + \frac{r}{2}\,\underline{\underline{u_0^{n+1}}} \\ \frac{r}{2}\,u_1^n + (1-r)\,u_2^n + \frac{r}{2}\,u_3^n \\ \frac{r}{2}\,u_2^n + (1-r)\,u_3^n + \frac{r}{2}\,u_4^n \\ \vdots \\ \frac{r}{2}\,u_{J-2}^n + (1-r)\,u_{J-1}^n + \frac{r}{2}\,\underline{u_J^n} + \frac{r}{2}\,\underline{\underline{u_J^{n+1}}} \end{pmatrix}$$
The underlined terms on the right-hand side will be known from the boundary conditions. The doubly
underlined quantities are “new” at the current time-step and involve the only appearances of n + 1
on the right-hand side. All the other u approximations at time level n + 1 are unknown at this stage
and appear on the left.
The matrix on the left-hand side of the system has the following properties
• It is independent of n . In other words, the same matrix appears at each time-step. (We saw
this in the example and exercise above in which the same 2 × 2 matrix appeared at each of the
two time-steps carried out).
• It is tridiagonal. That is, the only non-zero entries are either on, or adjacent to, the diagonal. Furthermore, there are only two different values (−r/2 and 1 + r) which appear. This is good news as far as storage is concerned. Gaussian elimination (seen in Workbook 30, for example) works extremely well on tridiagonal matrices.
It is also true that the matrix is strictly diagonally dominant. (That is, the diagonal element on
each row is greater in size than the sum of the absolute values of the off-diagonal elements on that
row.) This means that methods such as Jacobi and Gauss-Seidel (see Workbook 30 for details) would work very well.
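As an illustration of how the tridiagonal structure can be exploited, the sketch below carries out Crank-Nicolson time-steps using Gaussian elimination specialised to a tridiagonal matrix (often called the Thomas algorithm). It is a minimal sketch under stated assumptions: the function names `solve_tridiagonal` and `crank_nicolson_step` are hypothetical, and the data (ℓ = 1.2, J = 3, r = 0.625, f(x) = x√((ℓ − x)³), zero boundary values) are taken from Example 17 as read from its printed numbers.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in row i;
    lower[0] and upper[-1] are not used.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(u, r, u_left=0.0, u_right=0.0):
    """Advance u (values at j = 0..J) by one Crank-Nicolson time-step."""
    J = len(u) - 1
    m = J - 1                                   # number of interior unknowns
    rhs = [0.5 * r * u[j - 1] + (1 - r) * u[j] + 0.5 * r * u[j + 1]
           for j in range(1, J)]
    rhs[0] += 0.5 * r * u_left                  # known boundary value at level n+1
    rhs[-1] += 0.5 * r * u_right
    interior = solve_tridiagonal([-0.5 * r] * m, [1 + r] * m, [-0.5 * r] * m, rhs)
    return [u_left] + interior + [u_right]


ell, J, r = 1.2, 3, 0.625
dx = ell / J
u0 = [0.0] + [j * dx * math.sqrt((ell - j * dx) ** 3) for j in range(1, J)] + [0.0]
u1 = crank_nicolson_step(u0, r)
u2 = crank_nicolson_step(u1, r)
print([round(v, 5) for v in u1])
print([round(v, 5) for v in u2])
```

Because the matrix does not depend on n, a production code could factorise it once and reuse the factors at every time-step; the sketch repeats the elimination each time purely for simplicity.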
Stability of the Crank-Nicolson scheme

This is the big pay-off when using the Crank-Nicolson method.
Key Point 21
The Crank-Nicolson method is stable for all values of r.
This is excellent news. It means that there is no hideously restrictive constraint on the size of δt.
7. Cost -v- benefit

At a first reading of this Section, it might be tempting to think that the extra effort involved in using
Crank-Nicolson (we have to store a set of simultaneous equations, we have to solve them and we
have to do this at every time-step) is enough to make the explicit method the winner in a cost-benefit
analysis. But this would be wrong.
In practical problems involving numerical approximations to parabolic problems the explicit method
is rarely good enough. The stability constraint (r ≤ 1/2) imposes such tiny time-steps that it takes a great deal of time for a computer to produce approximations corresponding to even fairly modest values of t. If efficiency is what matters, then Crank-Nicolson beats the explicit approach, and it is worth the extra initial effort formulating a solver (such as those we saw in Workbook 30) for the system of equations.
Exercises
1. Consider the function u defined by
u(x, t) = x³ cos(xt)
Using increments of δx = 0.005 and δt = 0.01, and working to 8 decimal places, approximate
(a) ux(1, 2) with a one-sided forward difference
(b) uxx(1, 2) with a central difference
(c) ut(1, 2) with a one-sided forward difference
(d) ut(1, 2) with a central difference.
State the approximate derivatives to 3 decimal places.
2. The temperature u(x, t) of a metal bar of length ℓ = 3 at a distance x from one end and at
time t is modelled by the partial differential equation
ut = αuxx (0 < x < ℓ, t > 0)
It is given that the metal has diffusivity α = 1.6, that the two ends of the bar are kept at
temperature u = 0 and that the initial temperature distribution is
u(x, 0) = f(x) = x(ℓ − x)
Use the explicit difference scheme with δx = 0.75 and δt = 0.08 to approximate u(x, t) at
t = δt and t = 2δt.
3. The temperature u(x, t) of a metal bar of length ℓ = 1.2 at a distance x from one end and at
time t is modelled by the partial differential equation
ut = αuxx (0 < x < ℓ, t > 0).
It is given that the metal has diffusivity α = 2.25, that the two ends of the bar are kept at
temperature u = 0 and that the initial temperature distribution is
u(x, 0) = f(x) = sin(πx/ℓ)
Use the Crank-Nicolson difference scheme with δx = 0.4 and δt = 0.06 to approximate u(x, t)
at t = δt and at t = 2δt.
Answers
1. The evaluations of u we will need are u(x, t) = −0.41614684, u(x + δx, t) = −0.43162908, u(x − δx, t) = −0.40095819, u(x, t + δt) = −0.42521885, u(x, t − δt) = −0.40703321. It follows that
(a) $u_x(1, 2) \approx \dfrac{-0.43162908 + 0.41614684}{0.005} = -3.096$
(b) $u_{xx}(1, 2) \approx \dfrac{-0.43162908 + 2 \times 0.41614684 - 0.40095819}{0.005^2} = -11.744$
(c) $u_t(1, 2) \approx \dfrac{-0.42521885 + 0.41614684}{0.01} = -0.907$
(d) $u_t(1, 2) \approx \dfrac{-0.42521885 + 0.40703321}{2 \times 0.01} = -0.909$
to 3 decimal places. (Workings shown to 8 decimal places.)
2. In this case r = αδt/(δx)² = 0.227556 so that the numerical scheme can be written
$$u_j^{n+1} = u_j^n + 0.227556(u_{j-1}^n - 2u_j^n + u_{j+1}^n) = 0.544889u_j^n + 0.227556(u_{j-1}^n + u_{j+1}^n)$$
The first time-step will find $u_j^1$. We note that the boundary condition implies that $u_0^1 = u_4^1 = 0$.
$u_1^1 = 0.544889u_1^0 + 0.227556(u_0^0 + u_2^0) = 0.544889 \times 1.6875 + 0.227556(0 + 2.25) = 1.4315$
$u_2^1 = 0.544889u_2^0 + 0.227556(u_1^0 + u_3^0) = 0.544889 \times 2.25 + 0.227556(1.688 + 1.688) = 1.994$
$u_3^1 = 0.544889u_3^0 + 0.227556(u_2^0 + u_4^0) = 0.544889 \times 1.6875 + 0.227556(2.25 + 0) = 1.4315$
The second time-step will find $u_j^2$. First we note that the boundary condition implies that $u_0^2 = u_4^2 = 0$.
$u_1^2 = 0.544889u_1^1 + 0.227556(u_0^1 + u_2^1) = 0.544889 \times 1.4315 + 0.227556(0 + 1.994) = 1.233754$
$u_2^2 = 0.544889u_2^1 + 0.227556(u_1^1 + u_3^1) = 0.544889 \times 1.994 + 0.227556(1.432 + 1.432) = 1.738$
$u_3^2 = 0.544889u_3^1 + 0.227556(u_2^1 + u_4^1) = 0.544889 \times 1.4315 + 0.227556(1.994 + 0) = 1.233754$
where some quantities have been rounded to 6 decimal places.
3. In this case r = αδt/(δx)² = 0.84375 so that the numerical scheme can be written
$$u_j^{n+1} = u_j^n + \frac{0.84375}{2}\big(u_{j-1}^n - 2u_j^n + u_{j+1}^n + u_{j-1}^{n+1} - 2u_j^{n+1} + u_{j+1}^{n+1}\big)$$
which rearranges to
$$-0.42188\,u_{j-1}^{n+1} + 1.84375\,u_j^{n+1} - 0.42188\,u_{j+1}^{n+1} = 0.15625\,u_j^n + 0.42188\,(u_{j-1}^n + u_{j+1}^n)$$
The first time-step will find $u_j^1$. First we note that the boundary condition implies that $u_0^1 = u_3^1 = 0$. Two uses of the stencil give
$-0.42188u_0^1 + 1.84375u_1^1 - 0.42188u_2^1 = 0.15625u_1^0 + 0.42188(u_0^0 + u_2^0) = 0.50067$
$-0.42188u_1^1 + 1.84375u_2^1 - 0.42188u_3^1 = 0.15625u_2^0 + 0.42188(u_1^0 + u_3^0) = 0.50067$
The implicit nature of this method means that we have to do some extra work to complete the time-step. We must now solve the simultaneous equations
$$\begin{pmatrix} 1.84375 & -0.42188 \\ -0.42188 & 1.84375 \end{pmatrix} \begin{pmatrix} u_1^1 \\ u_2^1 \end{pmatrix} = \begin{pmatrix} 0.50067 \\ 0.50067 \end{pmatrix}$$
In this case there are only two unknowns and it is a simple matter to solve the pair of equations to give $u_1^1 = 0.35212$ and $u_2^1 = 0.35212$.
The second time-step will find $u_j^2$. First we note that the boundary condition implies that $u_0^2 = u_3^2 = 0$. Two uses of the stencil give
$-0.42188u_0^2 + 1.84375u_1^2 - 0.42188u_2^2 = 0.15625u_1^1 + 0.42188(u_0^1 + u_2^1) = 0.20357$
$-0.42188u_1^2 + 1.84375u_2^2 - 0.42188u_3^2 = 0.15625u_2^1 + 0.42188(u_1^1 + u_3^1) = 0.20357$
The implicit nature of this method means that we have to do some extra work to complete the time-step. We must now solve the simultaneous equations
$$\begin{pmatrix} 1.84375 & -0.42188 \\ -0.42188 & 1.84375 \end{pmatrix} \begin{pmatrix} u_1^2 \\ u_2^2 \end{pmatrix} = \begin{pmatrix} 0.20357 \\ 0.20357 \end{pmatrix}$$
In this case there are only two unknowns and it is a simple matter to solve the pair of equations to give $u_1^2 = 0.14317$ and $u_2^2 = 0.14317$.

Hyperbolic PDEs 32.5

Introduction
In the preceding Section we looked at parabolic partial differential equations. Another class of PDE
modelling initial value problems are of the hyperbolic type.
In this Section we will concentrate on the wave equation, which was introduced in Workbook 25.
Prerequisites
• revise those aspects of Workbook 25 which deal with the wave equation
• familiarise yourself with difference methods for approximating first and second derivatives
• be familiar with the numerical methods used for parabolic equations
Learning Outcomes
• obtain simple numerical solutions of the wave equation
1. The (one-dimensional) wave equation
The wave equation is a PDE which (as its name suggests) models wave-like phenomena. It is a model
of waves on water, of sound waves, of waves of reactant in chemical reactions and so on. For the
purposes of most of the following examples we may think of the application in hand as that of being
a length of string tightly stretched between two points. Let u = u(x, t) be the displacement from
rest of the string at time t and distance x from one end. Oscillations in the string may be modelled
by the wave equation
utt = c²uxx (0 < x < ℓ, t > 0)
where ℓ is the length of the string, t = 0 is some initial time and c > 0 is a constant (the wave speed) dependent on the material properties of the string. (Further discussion of the constant c is given in Section 25.2.)
The wave equation is hyperbolic, as we can readily verify on recalling the definitions at the beginning
of Section 32.4. Extra information is needed to specify the initial value problem. The initial position
and initial velocity are given as

u(x, 0) = f(x), ut(x, 0) = g(x) (0 ≤ x ≤ ℓ)
Finally, we need boundary conditions specifying how the ends of the string are held. For example
u(0, t) = u(ℓ, t) = 0 (t > 0)
models the situation where the string is fixed at each end.
(We will suppose that f(0) = f(ℓ) = 0 so that there is no apparent conflict at the ends of the string at the initial time.)
2. Numerical solutions
The approach we will adopt is similar to that seen in Section 32.4 where we looked at parabolic
equations. We use the notation
$u_j^n$
to denote an approximation to u evaluated at x = j × δx, t = n × δt. Approximating the derivatives in the PDE
utt = c²uxx
by central differences we obtain the numerical difference equation
$$\frac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{(\delta t)^2} = c^2\,\frac{u_{j+1}^n - 2u_j^n + u_{j-1}^n}{(\delta x)^2}.$$
Multiplying through by (δt)² this can be rearranged to give
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\,(u_{j+1}^n - 2u_j^n + u_{j-1}^n)$$
in which µ = cδt/δx is called the Courant number.
The equation above gives $u_j^{n+1}$ in terms of u-approximations at earlier time-steps (that is, all the appearances of u on the right-hand side have a superscript smaller than n + 1).
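The stencil just derived translates directly into code. The sketch below is illustrative only: the function name `wave_step` is an assumption, the ends of the string are taken to be fixed at zero, and the two starting time levels and the Courant number µ = 0.5 are made-up data chosen so that the arithmetic is easy to check by hand.

```python
def wave_step(u_now, u_prev, mu):
    """Return the time level n+1 from levels n (u_now) and n-1 (u_prev).

    Both ends are held at zero (fixed string). Hypothetical helper.
    """
    J = len(u_now) - 1
    u_next = [0.0] * (J + 1)
    for j in range(1, J):
        u_next[j] = (2 * u_now[j] - u_prev[j]
                     + mu ** 2 * (u_now[j + 1] - 2 * u_now[j] + u_now[j - 1]))
    return u_next


# Made-up data for illustration: two identical consecutive time levels
u_prev = [0.0, 0.5, 1.0, 0.5, 0.0]
u_now = [0.0, 0.5, 1.0, 0.5, 0.0]
u_next = wave_step(u_now, u_prev, mu=0.5)
print(u_next)
```

Note that two earlier time levels are required, which is why the very first time-step needs separate treatment.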
[Figure 9: the stencil for the wave equation, involving the points j − 1, j, j + 1 and the time levels n − 1, n and n + 1.]
Thinking of the numerical stencil graphically we have the situation shown above. We may think of
the values on the right-hand side of the equation “pointing to” a new value on the left-hand side.
Key Point 22
Timesteps (other than the first one) are carried out by using the numerical stencil
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\,(u_{j+1}^n - 2u_j^n + u_{j-1}^n)$$
in which the left-hand side is the “new” approximation at the (n + 1)th time-step and the right-hand side involves “old” approximations at earlier time-steps.
(We will deal with how to carry out the first time-step shortly.)
The time-stepping process has much in common with the corresponding procedure for parabolic
problems. The following Example will help establish the general idea.
Example 18
Suppose that u = u(x, t) satisfies the wave equation utt = c²uxx in t > 0 and 0 < x < 1 with boundary conditions u(0, t) = u(1, t) = 0 (t > 0), and that the wave speed is c = 1.2.
The numerical method
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\,(u_{j+1}^n - 2u_j^n + u_{j-1}^n)$$
where µ = c δt/δx, is implemented using δx = 0.25 and δt = 0.1.
Suppose that, after 5 time-steps, the following data forms part of the numerical solution:
$u_0^4 = 0.0000$    $u_0^5 = 0.0000$
$u_1^4 = 0.9242$    $u_1^5 = 0.7110$
$u_2^4 = -0.0020$    $u_2^5 = -0.0059$
$u_3^4 = -0.9624$    $u_3^5 = -0.7409$
$u_4^4 = 0.0000$    $u_4^5 = 0.0000$
Carry out the next time-step so as to find an approximation to u at t = 6δt.
Solution
In this case µ = 1.2 × 0.1/0.25 = 0.48 and the required time-step is carried out as follows:
$u_1^6 = 2u_1^5 - u_1^4 + \mu^2(u_2^5 - 2u_1^5 + u_0^5) = 0.1689$
$u_2^6 = 2u_2^5 - u_2^4 + \mu^2(u_3^5 - 2u_2^5 + u_1^5) = -0.0140$
$u_3^6 = 2u_3^5 - u_3^4 + \mu^2(u_4^5 - 2u_3^5 + u_2^5) = -0.1794$
to 4 decimal places. Together with the boundary values $u_0^6 = u_4^6 = 0$, these are the approximations to u(0, 6δt), u(0.25, 6δt), u(0.5, 6δt), u(0.75, 6δt) and u(1, 6δt), respectively.
The diagram below shows the numerical results that appeared in the example above. It can be seen
that the example was a (rather coarse) model of a standing wave with two antinodes.
[Figure 10: u plotted against x for 0 ≤ x ≤ 1 at time levels n = 4, 5 and 6.]
Task
Suppose that u = u(x, t) satisfies the wave equation utt = c²uxx in t > 0 and 0 < x < 1. It is given that u satisfies boundary conditions u(0, t) = u(1, t) = 0 (t > 0) and initial conditions that need not be stated for the purposes of this question. The application is such that the wave speed c = 1.2.
The numerical method
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\,(u_{j+1}^n - 2u_j^n + u_{j-1}^n)$$
where µ = c δt/δx, is implemented using δx = 0.25 and δt = 0.2.
Suppose that, after 8 time-steps, the following data forms part of the numerical solution:
$u_0^7 = 0.0000$    $u_0^8 = 0.0000$
$u_1^7 = 0.6423$    $u_1^8 = 0.4640$
$u_2^7 = 0.8976$    $u_2^8 = 0.6792$
$u_3^7 = 0.6789$    $u_3^8 = 0.4668$
$u_4^7 = 0.0000$    $u_4^8 = 0.0000$
Your solution
Answer
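In this case µ = 1.2 × 0.2/0.25 = 0.96 and the required time-step is carried out as follows:
$u_1^9 = 2u_1^8 - u_1^7 + \mu^2(u_2^8 - 2u_1^8 + u_0^8) = 0.0564$
$u_2^9 = 2u_2^8 - u_2^7 + \mu^2(u_3^8 - 2u_2^8 + u_1^8) = 0.0667$
$u_3^9 = 2u_3^8 - u_3^7 + \mu^2(u_4^8 - 2u_3^8 + u_2^8) = 0.0202$
to 4 decimal places. Together with the boundary values $u_0^9 = u_4^9 = 0$, these are the approximations to u(0, 9δt), u(0.25, 9δt), u(0.5, 9δt), u(0.75, 9δt) and u(1, 9δt), respectively.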
The above Task concerns a stretched string oscillating in such a way that at the 9th time-step the
string is approximately flat. The motion continues with u taking negative values. Figure 11 below
uses data calculated above, and also data for the next two time-steps so as to show subsequent
progress of the solution.
[Figure 11: u plotted against x for 0 ≤ x ≤ 1 at time levels n = 7, 8, 9, 10 and 11.]
3. The first time-step

In the Example and Task above we have seen how time-steps can be carried out using the numerical
stencil
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\,(u_{j+1}^n - 2u_j^n + u_{j-1}^n),$$
but there remains one issue which, so far, we have neglected. How do we carry out the first time-step?
Initial conditions
The initial time-step must use information from the two initial conditions

u(x, 0) = f(x), ut(x, 0) = g(x) (0 ≤ x ≤ ℓ)
The first initial condition is easy enough to interpret. It gives $u_j^n$ in the case where n = 0. In fact
$$u_j^0 = f_j$$
where $f_j$ is simply shorthand for f(j × δx).
The second initial condition, the one involving g, gives information about ut = ∂u/∂t at t = 0. We can approximate the t-derivative of u at t = 0 and x = j × δx by a central difference to write
$$\frac{u_j^1 - u_j^{-1}}{2\,\delta t} = g_j$$
in which $g_j$ is shorthand for g(j × δx).
This last expression involves $u_j^{-1}$ which, if it has a meaning at all, refers to u at time t = −δt, that is, before the initial time t = 0. One way to think of $u_j^{-1}$ is simply as an artificial quantity which proves useful later on. The equation above, rearranged for $u_j^{-1}$, is
$$u_j^{-1} = u_j^1 - 2\,\delta t \times g_j$$
Key Point 23
A central difference used to approximate the first derivative in the condition defining initial speed gives rise to the following useful equation
$$u_j^{-1} = u_j^1 - 2\,\delta t \times g_j$$
The first time-step

To carry out the first time-step we put n = 0 in the numerical stencil

$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\,(u_{j+1}^n - 2u_j^n + u_{j-1}^n),$$
to give
$$u_j^1 = 2u_j^0 - u_j^{-1} + \mu^2\,(u_{j+1}^0 - 2u_j^0 + u_{j-1}^0).$$
Those terms on the right-hand side with a 0 superscript are known via the function f since we know that $u_j^0 = f_j$. Hence
$$u_j^1 = 2f_j - u_j^{-1} + \mu^2\,(f_{j+1} - 2f_j + f_{j-1}).$$
And the $u_j^{-1}$ term is dealt with using the Key Point above to give
$$u_j^1 = 2f_j - u_j^1 + 2\,\delta t \times g_j + \mu^2\,(f_{j+1} - 2f_j + f_{j-1})$$
and therefore, moving the latest appearance of $u_j^1$ over to the left-hand side and dividing by 2,
$$u_j^1 = f_j + \delta t \times g_j + \tfrac{1}{2}\mu^2\,(f_{j+1} - 2f_j + f_{j-1}) = \tfrac{1}{2}\mu^2\,(f_{j-1} + f_{j+1}) + (1 - \mu^2)\,f_j + \delta t \times g_j$$
Key Point 24
The first time-step is carried out by using the initial data and can be summarised as
$$u_j^1 = \tfrac{1}{2}\mu^2\left(f_{j-1} + f_{j+1}\right) + (1 - \mu^2)f_j + \delta t \times g_j$$
Example 19
Suppose that u = u(x, t) satisfies the wave equation $u_{tt} = c^2 u_{xx}$ in t > 0 and 0 < x < 1. It is given that u satisfies boundary conditions u(0, t) = u(1, t) = 0
(t > 0) and initial conditions that may be summarised as
f0 = 0.0000 g0 = 0.0000
f1 = 0.6000 g1 = 0.1000
f2 = 0.0000 g2 = 0.2000
f3 = −0.5000 g3 = 0.1000
f4 = 0.0000 g4 = 0.0000
The application is such that the wave speed c = 1.
Carry out the first two time-steps of the numerical method
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\left(u_{j+1}^n - 2u_j^n + u_{j-1}^n\right)$$
where µ = c δt/δx in which δx = 0.25 and δt = 0.2.
Solution
In this case µ = 1 × 0.2/0.25 = 0.8 and the first time-step is carried out as follows (to 4 d.p.):
$$u_1^1 = \tfrac{1}{2}\mu^2(f_0 + f_2) + (1 - \mu^2)f_1 + \delta t\, g_1 = 0.2360$$
$$u_2^1 = \tfrac{1}{2}\mu^2(f_1 + f_3) + (1 - \mu^2)f_2 + \delta t\, g_2 = 0.0720$$
$$u_3^1 = \tfrac{1}{2}\mu^2(f_2 + f_4) + (1 - \mu^2)f_3 + \delta t\, g_3 = -0.1600$$
The second time-step is as follows (to 4 d.p.):
$$u_1^2 = 2u_1^1 - u_1^0 + \mu^2(u_2^1 - 2u_1^1 + u_0^1) = -0.3840$$
$$u_2^2 = 2u_2^1 - u_2^0 + \mu^2(u_3^1 - 2u_2^1 + u_1^1) = 0.1005$$
$$u_3^2 = 2u_3^1 - u_3^0 + \mu^2(u_4^1 - 2u_3^1 + u_2^1) = 0.4309$$
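The two kinds of time-step used in this Example can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `first_step` and `next_step` are our own, and the code assumes (as in the Example) that u is held at zero at both ends of the interval.

```python
def first_step(f, g, mu, dt):
    """First time-step: u^1_j from the initial data f_j and g_j (ends held at zero)."""
    J = len(f) - 1
    u1 = [0.0] * (J + 1)
    for j in range(1, J):
        u1[j] = (0.5 * mu**2 * (f[j - 1] + f[j + 1])
                 + (1 - mu**2) * f[j] + dt * g[j])
    return u1


def next_step(u_now, u_old, mu):
    """General time-step: u^{n+1}_j from u^n and u^{n-1} (ends held at zero)."""
    J = len(u_now) - 1
    u_new = [0.0] * (J + 1)
    for j in range(1, J):
        u_new[j] = (2 * u_now[j] - u_old[j]
                    + mu**2 * (u_now[j + 1] - 2 * u_now[j] + u_now[j - 1]))
    return u_new


# Data of Example 19
c, dx, dt = 1.0, 0.25, 0.2
mu = c * dt / dx
f = [0.0, 0.6, 0.0, -0.5, 0.0]
g = [0.0, 0.1, 0.2, 0.1, 0.0]

u0 = f[:]                        # u^0_j = f_j
u1 = first_step(f, g, mu, dt)    # first time-step
u2 = next_step(u1, u0, mu)       # second time-step
print([round(v, 4) for v in u1])
print([round(v, 4) for v in u2])
```

Further time-steps follow by calling `next_step` repeatedly, each time passing in the two most recent sets of values.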
Task
Suppose that u = u(x, t) satisfies the wave equation $u_{tt} = c^2 u_{xx}$ in t > 0 and 0 < x < 0.8. It is given that u satisfies boundary conditions u(0, t) = u(0.8, t) = 0
(t > 0) and initial conditions that may be summarised as
f0 = 0.0000 g0 = 0.0000
f1 = 0.1703 g1 = 0.4227
f2 = 0.2364 g2 = 0.5417
f3 = 0.1703 g3 = 0.4227
f4 = 0.0000 g4 = 0.0000
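The application is such that the wave speed c = 1.
Carry out the first two time-steps of the numerical method
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\left(u_{j+1}^n - 2u_j^n + u_{j-1}^n\right)$$
where µ = c δt/δx in which δx = 0.2 and δt = 0.11.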
Your solution
Answer
In this case µ = 1 × 0.11/0.2 = 0.55 and the first time-step is carried out as follows:
$$u_1^1 = \tfrac{1}{2}\mu^2(f_0 + f_2) + (1 - \mu^2)f_1 + \delta t\, g_1 = 0.2010$$
$$u_2^1 = \tfrac{1}{2}\mu^2(f_1 + f_3) + (1 - \mu^2)f_2 + \delta t\, g_2 = 0.2760$$
$$u_3^1 = \tfrac{1}{2}\mu^2(f_2 + f_4) + (1 - \mu^2)f_3 + \delta t\, g_3 = 0.2010$$
The second time-step is as follows:
$$u_1^2 = 2u_1^1 - u_1^0 + \mu^2(u_2^1 - 2u_1^1 + u_0^1) = 0.1936$$
$$u_2^2 = 2u_2^1 - u_2^0 + \mu^2(u_3^1 - 2u_2^1 + u_1^1) = 0.2702$$
$$u_3^2 = 2u_3^1 - u_3^0 + \mu^2(u_4^1 - 2u_3^1 + u_2^1) = 0.1936$$
4. Stability
There is a stability constraint that is common to many methods for obtaining numerical solutions of
the wave equation. Issues relating to stability of numerical methods can be extremely complicated,
but the following Key Point is enough for our purposes.
Key Point 25
The numerical method seen in this Section requires that
$$\mu \le 1, \quad \text{that is,} \quad \frac{c\,\delta t}{\delta x} \le 1$$
for solutions not to grow unrealistically with n.
This is called the CFL condition (an acronym of the names of three mathematicians: Courant, Friedrichs and Lewy).
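Before running the method it is worth checking the condition in Key Point 25 for the chosen step sizes. The sketch below is illustrative only; the function names are our own.

```python
def courant_number(c, dt, dx):
    """The quantity mu = c * dt / dx appearing in the numerical stencil."""
    return c * dt / dx


def satisfies_cfl(c, dt, dx):
    """True when mu <= 1, the condition stated in Key Point 25."""
    return courant_number(c, dt, dx) <= 1.0


def largest_stable_dt(c, dx):
    """Largest time-step allowed by mu <= 1 for a given wave speed and mesh width."""
    return dx / c


print(courant_number(1.0, 0.2, 0.25), satisfies_cfl(1.0, 0.2, 0.25))
print(largest_stable_dt(1.4, 0.15))
```

For a fixed δx the condition limits how large δt may be taken, so refining the mesh in x also forces smaller time-steps.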
Exercises
1. Suppose that u = u(x, t) satisfies the wave equation $u_{tt} = c^2 u_{xx}$ in t > 0 and 0 < x < 0.6.
It is given that u satisfies boundary conditions u(0, t) = u(0.6, t) = 0 (t > 0) and initial
conditions that need not be stated for the purposes of this question. The application is such
that the wave speed c = 1.4.
The numerical method
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\left(u_{j+1}^n - 2u_j^n + u_{j-1}^n\right)$$
where µ = c δt/δx, is implemented using δx = 0.15 and δt = 0.1.

Suppose that, after 7 time-steps, the following data forms part of the numerical solution:
$u_0^6 = 0.0000$   $u_0^7 = 0.0000$
$u_1^6 = 0.1024$   $u_1^7 = 0.0997$
$u_2^6 = 0.1986$   $u_2^7 = 0.1730$
$u_3^6 = 0.2361$   $u_3^7 = 0.1169$
$u_4^6 = 0.0000$   $u_4^7 = 0.0000$
Carry out the next time-step so as to find approximations to u at t = 8δt.
2. Suppose that u = u(x, t) satisfies the wave equation $u_{tt} = c^2 u_{xx}$ in t > 0 and 0 < x < 1. It is
given that u satisfies boundary conditions u(0, t) = u(1, t) = 0 (t > 0). The initial elevation
may be summarised as
f0 = 0.0000 f1 = 0.7812 f2 = 0.2465
f3 = −0.1209 f4 = 0.0000
and the string is initially at rest (that is, g(x) = 0). The application is such that the wave
speed c = 1.
Carry out the first two time-steps of the numerical method
$$u_j^{n+1} = 2u_j^n - u_j^{n-1} + \mu^2\left(u_{j+1}^n - 2u_j^n + u_{j-1}^n\right)$$
where µ = c δt/δx in which δx = 0.25 and δt = 0.2.
Answers
1. In this case µ = 1.4 × 0.1/0.15 = 0.93333 and the required time-step is carried out as follows:
$$u_1^8 = 2u_1^7 - u_1^6 + \mu^2(u_2^7 - 2u_1^7 + u_0^7) = 0.0740$$
$$u_2^8 = 2u_2^7 - u_2^6 + \mu^2(u_3^7 - 2u_2^7 + u_1^7) = 0.0347$$
$$u_3^8 = 2u_3^7 - u_3^6 + \mu^2(u_4^7 - 2u_3^7 + u_2^7) = -0.0552$$
These are approximations to u(0.15, 8δt), u(0.3, 8δt) and u(0.45, 8δt), respectively.
2. In this case µ = 1 × 0.2/0.25 = 0.8 and the first time-step is carried out as follows:
$$u_1^1 = \tfrac{1}{2}\mu^2(f_0 + f_2) + (1 - \mu^2)f_1 + \delta t\, g_1 = 0.3601$$
$$u_2^1 = \tfrac{1}{2}\mu^2(f_1 + f_3) + (1 - \mu^2)f_2 + \delta t\, g_2 = 0.3000$$
$$u_3^1 = \tfrac{1}{2}\mu^2(f_2 + f_4) + (1 - \mu^2)f_3 + \delta t\, g_3 = 0.0354$$
The second time-step is as follows:
$$u_1^2 = 2u_1^1 - u_1^0 + \mu^2(u_2^1 - 2u_1^1 + u_0^1) = -0.3299$$
$$u_2^2 = 2u_2^1 - u_2^0 + \mu^2(u_3^1 - 2u_2^1 + u_1^1) = 0.2226$$
$$u_3^2 = 2u_3^1 - u_3^0 + \mu^2(u_4^1 - 2u_3^1 + u_2^1) = 0.3384$$
Contents 33
Numerical Boundary Value Problems
33.1 Two-point Boundary Value Problems
33.2 Elliptic PDEs
Learning outcomes
In this Workbook, which follows on from Workbook 32, you will see some more numerical
methods for approximating solutions of differential equations. The methods to be
presented in this Workbook are most usually applied in engineering examples involving
so-called boundary data, that is, some of the available information arises "around the
edges" of the problem area.
Two-point Boundary
Value Problems 33.1
Introduction
Boundary value problems arise in applications where some physical process involves knowledge of
information at the edges. For example, it may be possible to measure the electric potential around
the edge of a semi-conductor and then use this information to infer the potential distribution near
the middle.
In this Section we discuss numerical methods that can be used for certain boundary value problems
involving processes that may be modelled by an ordinary differential equation.

Prerequisites
Before starting this Section you should . . .
• revise central difference approximations (Workbook 31)
Learning Outcomes
On completion you should be able to . . .
• approximate certain boundary value problems using central differences
• obtain simple numerical approximations to the solutions to certain boundary value problems
1. Three point stencil

Let us consider the boundary value problem defined by Equations (1):
$$\begin{aligned} p(x)y''(x) + q(x)y'(x) + r(x)y(x) &= s(x) \qquad (0 < x < \ell) \\ a_0 y'(0) + b_0 y(0) &= c_0 \\ a_1 y'(\ell) + b_1 y(\ell) &= c_1 \end{aligned} \qquad (1)$$
The first line is the differential equation, and the second and third lines are the boundary conditions which can involve derivatives.
It is our aim to approximate the solution of this problem numerically, and we adopt an approach
similar to that seen in Workbook 32.
We divide the interval 0 < x < ` into a number, J say, of subintervals each of equal width h = `/J.
Our numerical solution will provide an approximation to y = y(x) at each value of x where two
subintervals meet (see Figure 1).
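[Figure 1: the interval 0 ≤ x ≤ ℓ marked at $x_1 = h$, $x_2 = 2h$, ..., $x_j = jh$, ..., $x_{J-1} = \ell - h$, $x_J = \ell$, with the approximations $y_0, y_1, y_2, \ldots, y_j, \ldots, y_{J-1}, y_J$ plotted above the corresponding x-values]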
Key Point 1
A numerical approximation to the boundary value problem
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x) \qquad (0 < x < \ell)$$
$$a_0 y'(0) + b_0 y(0) = c_0$$
$$a_1 y'(\ell) + b_1 y(\ell) = c_1$$
is a sequence of numbers $y_0, y_1, y_2, y_3, \ldots, y_J$.
Here $y_j$ is an approximation to $y(x)$ at $x = jh$, where $h = \ell/J$ and $j = 0, 1, 2, \ldots, J$.
It is useful to give a name to the x-values where we seek an approximation to y = y(x). Hence we
will sometimes write $x_j = jh$, $j = 0, 1, 2, \ldots, J$.
The functions p, q, r and s will frequently occur evaluated at the values $x_j$ so it is convenient to
set up the following abbreviations:
$$p_j = p(x_j), \quad q_j = q(x_j), \quad r_j = r(x_j), \quad s_j = s(x_j).$$
A numerical approximation to Equations (1) can be found by approximating the derivatives by finite
differences. Here we will approximate $y''(x)$ and $y'(x)$ by central differences to obtain
$$p(x_j)\,\frac{y(x_j + h) - 2y(x_j) + y(x_j - h)}{h^2} + q(x_j)\,\frac{y(x_j + h) - y(x_j - h)}{2h} + r(x_j)\,y(x_j) \approx s(x_j)$$
that is, on using the abbreviations established in Key Point 1,
$$p_j\,\frac{y(x_j + h) - 2y(x_j) + y(x_j - h)}{h^2} + q_j\,\frac{y(x_j + h) - y(x_j - h)}{2h} + r_j\,y(x_j) \approx s_j$$
which we use as the motivation for the numerical method
$$p_j\,\frac{y_{j+1} - 2y_j + y_{j-1}}{h^2} + q_j\,\frac{y_{j+1} - y_{j-1}}{2h} + r_j y_j = s_j.$$
This last equation can be rearranged, gathering together all the like y-terms. It neatens things further
to multiply through by h2 as well, and the result of these manipulations appears in the following Key
Point.
Key Point 2
A central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
This approximation to the differential equation can be thought of as a three-point stencil linking
three of the approximate y-values. The expression
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j$$
is centred around $x = x_j$ and involves $y_{j-1}$, $y_j$ and $y_{j+1}$. The general rule when dealing with a
numerical stencil like this is to centre the stencil at every point where y is unknown. (This
general rule will appear again, for example on page 13.)
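As a small illustration, the three coefficients and the right-hand side of the stencil can be evaluated at any mesh point by a short function. This is a sketch only: the name `stencil_coefficients` is our own, and the constant functions and the values x = 0.5, h = 0.5 used in the call are illustrative choices.

```python
def stencil_coefficients(p, q, r, s, x, h):
    """Coefficients of y_{j-1}, y_j, y_{j+1} and the right-hand side at x = x_j."""
    lower = p(x) - 0.5 * h * q(x)      # multiplies y_{j-1}
    centre = h**2 * r(x) - 2 * p(x)    # multiplies y_j
    upper = p(x) + 0.5 * h * q(x)      # multiplies y_{j+1}
    rhs = h**2 * s(x)
    return lower, centre, upper, rhs


def constant(value):
    """Return a function of x that always gives the same value."""
    return lambda x: value


# Illustrative call with p = 3, q = 4, r = 5, s = 7 and h = 0.5
coeffs = stencil_coefficients(constant(3.0), constant(4.0),
                              constant(5.0), constant(7.0), x=0.5, h=0.5)
print(coeffs)
```

Passing p, q, r and s as functions means the same code serves whether the coefficients are constant or vary with x.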
Key Point 3
Centre the stencil at every x-point where y is unknown. This will give a set of equations for the
unknown y-values, and we are guaranteed exactly as many equations as there are unknowns.
In the following Example, matters are simplified because the functions p, q, r and s are all constant.
Example 1
Let y = y(x) be a solution to the boundary value problem
$$3y''(x) + 4y'(x) + 5y(x) = 7, \qquad 0 < x < 1.5$$
$$y(0) = 2, \qquad y(1.5) = 2.$$
Using a mesh width of h = 0.5, obtain a central difference approximation to the
differential equation and hence find $y_j \approx y(jh)$, j = 1, 2.
Solution
In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
In this case the coefficients are
$$p_j - \frac{h}{2}q_j = 2, \quad h^2 r_j - 2p_j = -4.75, \quad p_j + \frac{h}{2}q_j = 4, \quad h^2 s_j = 1.75.$$
These values will be the same for all x because p, q, r and s are constants in this Example. Hence
the general stencil is
$$2y_{j-1} - 4.75y_j + 4y_{j+1} = 1.75$$
In this case h = 0.5, and our numerical solution consists of the values
$y_0 = y(0) = 2$ from the boundary condition
$y_1 \approx y(0.5)$
$y_2 \approx y(1)$
$y_3 = y(1.5) = 2$ from the boundary condition
So there are two unknowns, $y_1$ and $y_2$. We centre the stencil at each of the corresponding x values.
Putting j = 1 in the numerical stencil gives
$$2y_0 - 4.75y_1 + 4y_2 = 1.75$$
Moving the stencil one place to the right, we put j = 2 so that
$$2y_1 - 4.75y_2 + 4y_3 = 1.75$$
In these two equations $y_0$ and $y_3$ are known from the boundary conditions and we move terms
involving them to the right-hand side. This leads to the system of equations
$$\begin{pmatrix} -4.75 & 4 \\ 2 & -4.75 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} -2.25 \\ -6.25 \end{pmatrix}$$
Solving this pair of simultaneous equations we find that
$$y_1 = 2.45, \qquad y_2 = 2.35$$
This approximation is shown in Figure 2 in which the numerical approximations to point values of y
are shown as circles.
[Figure 2: the approximations $y_0$, $y_1$, $y_2$, $y_3$ plotted as circles against x for 0 ≤ x ≤ 1.5]
The question remains how close to the exact solution these approximations are. (Of course for
Example 1 above it is possible to find the analytic solution fairly easily, but this will not usually be
the case.)
A pragmatic way to deal with this question is to recompute the results with a smaller value of h. We
know from Workbook 31 that the central difference approximations get closer and closer to the derivatives
which they approximate as h decreases. In Figure 3 the results for h = 0.5 are given again as circles,
and a computer has been used to find more accurate approximations to y using h = 1.5/7 (shown as
squares) and yet more accurate results (shown as dots) from using h = 1.5/10. (This involves solving
larger systems of equations than are manageable by hand. The methods seen in Workbook 30 can be
used to deal with these larger systems.)
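A recomputation of this kind can be sketched in Python as follows. This is an illustrative sketch rather than a definitive implementation: the names `solve_linear` and `solve_bvp` are our own, the system is solved by straightforward Gaussian elimination with partial pivoting, and the code assumes that y itself (not a derivative) is given at both ends of the interval.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (dense; intended for small systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for col in range(k, n + 1):
                M[i][col] -= m * M[k][col]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back-substitution
        tail = sum(M[i][col] * x[col] for col in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def solve_bvp(p, q, r, s, length, J, y_left, y_right):
    """Return [y_0, ..., y_J] for p y'' + q y' + r y = s with y given at both ends."""
    h = length / J
    n = J - 1                                          # number of unknowns
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for j in range(1, J):                              # centre the stencil at x_j
        x = j * h
        lower = p(x) - 0.5 * h * q(x)
        upper = p(x) + 0.5 * h * q(x)
        row = j - 1
        A[row][row] = h**2 * r(x) - 2 * p(x)
        b[row] = h**2 * s(x)
        if j > 1:
            A[row][row - 1] = lower
        else:
            b[row] -= lower * y_left                   # known value moved to the right
        if j < J - 1:
            A[row][row + 1] = upper
        else:
            b[row] -= upper * y_right                  # known value moved to the right
    return [y_left] + solve_linear(A, b) + [y_right]


def constant(value):
    return lambda x: value


# Example 1 recomputed with 3, 7 and 10 subintervals
results = {}
for J in (3, 7, 10):
    results[J] = solve_bvp(constant(3.0), constant(4.0), constant(5.0),
                           constant(7.0), 1.5, J, 2.0, 2.0)
    print(J, [round(v, 4) for v in results[J]])
```

With J = 3 the two interior values agree with those found by hand in Example 1; increasing J simply produces a larger system of the same form.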
[Figure 3: approximations to y plotted against x for 0 ≤ x ≤ 1.5, computed using 3 subintervals (circles), 7 subintervals (squares) and 10 subintervals (dots)]
We can now have some confidence that the results we calculated using h = 0.5 tended to overestimate
the true values of y.
Example 2
Let y = y(x) be a solution to the boundary value problem
$$y''(x) + 2y'(x) - 2y(x) = -3, \qquad 0 < x < 2$$
$$y(0) = 1, \qquad y(2) = -2.$$
Using a mesh width of h = 0.5 obtain a central difference approximation to
the differential equation and hence find simultaneous equations satisfied by the
unknowns $y_1$, $y_2$ and $y_3$.
Solution
In this case the coefficients in the central difference approximation are
$$p_j - \frac{h}{2}q_j = 0.5, \quad h^2 r_j - 2p_j = -2.5, \quad p_j + \frac{h}{2}q_j = 1.5, \quad h^2 s_j = -0.75.$$
These values will be the same for all x because p, q, r and s are constants in this Example.
Hence the general stencil is
$$0.5y_{j-1} - 2.5y_j + 1.5y_{j+1} = -0.75$$
Here we have h = 0.5 and our numerical solution consists of the values
$y_0 = y(0) = 1$ from the boundary condition
$y_1 \approx y(0.5)$
$y_2 \approx y(1)$
$y_3 \approx y(1.5)$
$y_4 = y(2) = -2$ from the boundary condition
So there are three unknowns, $y_1$, $y_2$ and $y_3$. We centre the stencil at each of the corresponding x
values. Putting j = 1 in the numerical stencil gives
$$0.5y_0 - 2.5y_1 + 1.5y_2 = -0.75.$$
Moving the stencil one place to the right, we put j = 2 so that
$$0.5y_1 - 2.5y_2 + 1.5y_3 = -0.75$$
and finally we let j = 3 so that
$$0.5y_2 - 2.5y_3 + 1.5y_4 = -0.75$$
In these three equations $y_0$ and $y_4$ are known from the boundary conditions and we move terms
involving them to the right-hand side. This leads to the system of equations
$$\begin{pmatrix} -2.5 & 1.5 & 0 \\ 0.5 & -2.5 & 1.5 \\ 0 & 0.5 & -2.5 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} -1.25 \\ -0.75 \\ 2.25 \end{pmatrix}$$
We can find (using methods from Workbook 30, for example) that the solution to the system of equations
in Example 2 is
$y_1 = 0.39$, $y_2 = -0.18$, $y_3 = -0.94$ to 2 decimal places.
Task
Let y = y(x) be a solution to the boundary value problem
$$y''(x) + 4y'(x) = 4, \qquad 0 < x < 1$$
$$y(0) = -2, \qquad y(1) = 3.$$
Using a mesh width of h = 0.25 obtain a central difference approximation to the
differential equation and hence find a system of equations satisfied by $y_j \approx y(jh)$,
j = 1, 2, 3.
Your solution
Work the solution on a separate piece of paper. Record the main results and your conclusions here.
Answer
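In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
In this case the coefficients are
$$p_j - \frac{h}{2}q_j = 0.5, \quad h^2 r_j - 2p_j = -2, \quad p_j + \frac{h}{2}q_j = 1.5, \quad h^2 s_j = 0.25.$$
In this case h = 0.25 and our numerical solution consists of the values
$y_0 = y(0) = -2$ from the boundary condition
$y_1 \approx y(0.25)$
$y_2 \approx y(0.5)$
$y_3 \approx y(0.75)$
$y_4 = y(1) = 3$ from the boundary condition
Centring the stencil at j = 1, 2 and 3 in turn gives
$$0.5y_0 - 2y_1 + 1.5y_2 = 0.25$$
$$0.5y_1 - 2y_2 + 1.5y_3 = 0.25$$
$$0.5y_2 - 2y_3 + 1.5y_4 = 0.25$$
In these three equations $y_0$ and $y_4$ are known from the boundary conditions and we move terms
involving them to the right-hand side. This leads to the system of equations
$$\begin{pmatrix} -2 & 1.5 & 0 \\ 0.5 & -2 & 1.5 \\ 0 & 0.5 & -2 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1.25 \\ 0.25 \\ -4.25 \end{pmatrix}$$
(And you might like to check that the solution to this system of equations is $y_1 = 0.95$, $y_2 = 2.1$
and $y_3 = 2.65$.)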
Example 3
The temperature y of an electrically heated wire of length ℓ is affected by local air
currents. This situation may be modelled by
$$\frac{d^2y}{dx^2} = -a - b(Y - y), \qquad (0 < x < \ell).$$
Consider the case where ℓ = 3, a = 50, b = 0.1 and Y = 20°C and suppose
that the ends of the wire are known (to 1 decimal place) to be at temperatures
y(0) = 15.0°C and y(3) = 25.0°C.
Using a central difference to approximate the derivative and using 3 subintervals
obtain approximations to the temperature 1/3 and 2/3 of the length along the wire.
Solution
This Example falls into the general case given at the beginning of this Section if we choose p = 1,
q = 0, r = −0.1 and s = −52. In this case h = 1 and our numerical solution consists of the values
$y_0 = y(0) = 15$ from the boundary condition
$y_1 \approx y(1)$
$y_2 \approx y(2)$
$y_3 = y(3) = 25$ from the boundary condition
Centring the stencil at j = 1 and then at j = 2 gives
$$y_0 - 2.1y_1 + y_2 = -52$$
$$y_1 - 2.1y_2 + y_3 = -52$$
Moving the known values $y_0$ and $y_3$ to the right-hand side leads to the system of equations
$$\begin{pmatrix} -2.1 & 1 \\ 1 & -2.1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} -67 \\ -77 \end{pmatrix}$$
with solution
$$y_1 = 63.8, \qquad y_2 = 67.1$$
to 1 decimal place.
We conclude that the temperature 1/3 of the wire’s length from the cooler end is approximately 63.8°C
and the temperature the same distance from the hotter end is approximately 67.1°C, where we have
rounded these numbers to the same number of places as the given boundary conditions.
The Examples and Task above were such that p, q, r and s were each equal to a constant for all
values of x. More realistic engineering applications may involve coefficients that vary, and the next
Example is of this type.
Example 4
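Let y = y(x) be a solution to the boundary value problem
$$\ln(2 + x)\,y''(x) + x\,y'(x) + (x + 1)^2 y(x) = \cos(x), \qquad 0 < x < 1.2$$
$$y(0) = 0, \qquad y(1.2) = 2.$$
Using a mesh width of h = 0.4 obtain a central difference approximation to the
differential equation and hence find $y_j \approx y(jh)$, j = 1, 2.
Solution
In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
The coefficients will vary with j in this Example because the functions p, q, r and s are not all
constants. In this case h = 0.4 and our numerical solution consists of the values
$y_0 = y(0) = 0$ from the boundary condition
$y_1 \approx y(0.4)$
$y_2 \approx y(0.8)$
$y_3 = y(1.2) = 2$ from the boundary condition
Centring the stencil at j = 1 and then at j = 2 gives
$$0.795469y_0 - 1.437337y_1 + 0.955469y_2 = 0.147370$$
$$0.869619y_1 - 1.540839y_2 + 1.189619y_3 = 0.111473$$
In these two equations $y_0$ and $y_3$ are known from the boundary conditions and we move terms
involving them to the right-hand side. This gives the pair of equations
$$\begin{pmatrix} -1.437337 & 0.955469 \\ 0.869619 & -1.540839 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0.147370 \\ -2.267766 \end{pmatrix}$$
with solution
$$y_1 = 1.40, \qquad y_2 = 2.26$$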
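When the coefficients vary with x the arithmetic is tedious by hand, and a few lines of Python can be used to check it. The sketch below is illustrative only: the variable names are our own, and the pair of equations is solved by Cramer's rule simply because there are only two unknowns.

```python
import math


def p(x): return math.log(2 + x)
def q(x): return x
def r(x): return (x + 1) ** 2
def s(x): return math.cos(x)


h = 0.4
y0, y3 = 0.0, 2.0          # boundary values y(0) and y(1.2)

# Stencil coefficients (lower, centre, upper, right-hand side) at j = 1 and j = 2
rows = []
for j in (1, 2):
    x = j * h
    rows.append((p(x) - 0.5 * h * q(x),
                 h**2 * r(x) - 2 * p(x),
                 p(x) + 0.5 * h * q(x),
                 h**2 * s(x)))

(l1, c1, u1, s1), (l2, c2, u2, s2) = rows

# Move the known boundary values to the right-hand side
b1 = s1 - l1 * y0
b2 = s2 - u2 * y3

# Solve  c1*y1 + u1*y2 = b1,  l2*y1 + c2*y2 = b2  by Cramer's rule
det = c1 * c2 - u1 * l2
y1 = (b1 * c2 - u1 * b2) / det
y2 = (c1 * b2 - l2 * b1) / det
print(round(y1, 2), round(y2, 2))
```

For more than a handful of unknowns a general linear solver would replace the Cramer's rule step.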
Once again we can monitor the accuracy of the results obtained in the Example above by recomputing
for a smaller value of h. In Figure 4 the values calculated are shown as circles and a computer has
been used to obtain the more accurate results (shown as dots) obtained from choosing h = 1.2/20.
[Figure 4: approximations to y plotted against x for 0 ≤ x ≤ 1.2, computed using 3 subintervals (circles) and 20 subintervals (dots)]
In the next Example we see that the derivative $y'$ appears in the boundary condition at x = 0. This
means that y is not given at x = 0 and we use the general rule given earlier in Key Point 3:
centre the stencil at every x-value where y is unknown.
So this implies that we must centre the stencil at x = 0 and this will cause the value $y_{-1}$ to appear.
This is a fictitious value that plays no part in the solution we seek and we use the derivative boundary
condition to get $y_{-1}$ in terms of $y_1$. This is done with the central difference
$$y'(0) \approx \frac{y_1 - y_{-1}}{2h}.$$
The following Example implements this idea.
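It may help to see the elimination of the fictitious value carried out once in general. The working below is a sketch for the simplest derivative condition $y'(0) = \alpha$, where the symbol $\alpha$ is our own notation for the given boundary value (it corresponds to taking $a_0 = 1$, $b_0 = 0$ and $c_0 = \alpha$ in Equations (1)).

```latex
% Central difference applied to the boundary condition y'(0) = \alpha
\frac{y_1 - y_{-1}}{2h} = \alpha
\quad\Longrightarrow\quad
y_{-1} = y_1 - 2h\alpha

% Three-point stencil centred at x = 0 (that is, j = 0)
y_{-1}\left(p_0 - \tfrac{h}{2}q_0\right) + y_0\left(h^2 r_0 - 2p_0\right)
  + y_1\left(p_0 + \tfrac{h}{2}q_0\right) = h^2 s_0

% Substituting for y_{-1} and collecting the y_1 terms
y_0\left(h^2 r_0 - 2p_0\right) + 2p_0\,y_1
  = h^2 s_0 + 2h\alpha\left(p_0 - \tfrac{h}{2}q_0\right)
```

The last line involves only the genuine unknowns $y_0$ and $y_1$, so it can take its place as the first equation of the system.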
Example 5
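Let y = y(x) be a solution to the boundary value problem
$$\ln(2 + x)\,y''(x) + x\,y'(x) + 2y(x) = \cos(x), \qquad 0 < x < 1.2$$
$$y'(0) = -1, \qquad y(1.2) = 2$$
(Note the derivative boundary condition at x = 0.)
Using a mesh width of h = 0.4 obtain a central difference approximation to the
differential equation and hence find the system of equations satisfied by $y_j \approx y(jh)$,
j = 0, 1, 2.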
Solution
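In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
The coefficients vary with j in this Example because the functions p, q, r and s are not all constants.
In this case h = 0.4 and our numerical solution consists of the values
$y_0 \approx y(0)$ which is not given by the boundary condition
$y_1 \approx y(0.4)$
$y_2 \approx y(0.8)$
$y_3 = y(1.2) = 2$ from the boundary condition
So there are three unknowns, $y_0$, $y_1$ and $y_2$. We centre the stencil at each of the corresponding
x values. Putting j = 0 in the numerical stencil gives
$$0.693147y_{-1} - 1.066294y_0 + 0.693147y_1 = 0.16$$
which introduces the fictitious quantity $y_{-1}$. We attach a meaning to $y_{-1}$ on using the boundary
condition at x = 0. Approximating the derivative in the boundary condition by a central difference
gives
$$\frac{y_1 - y_{-1}}{2h} = -1 \quad\Rightarrow\quad y_{-1} = y_1 + 0.8$$
and we use this to remove $y_{-1}$ from the equation where it first appeared. Hence
$$-1.066294y_0 + 1.386294y_1 = 0.16 - 0.8 \times 0.693147 = -0.394518$$
The remaining steps are similar to previous Examples. Putting j = 1 in the stencil gives
$$0.795469y_0 - 1.430937y_1 + 0.955469y_2 = 0.147370.$$
Moving the stencil one place to the right, we put j = 2 so that
$$0.869619y_1 - 1.739239y_2 + 1.189619y_3 = 0.111473$$
In the last equation $y_3$ is known from the boundary conditions and we move the term involving it
to the right-hand side. This leaves us with the system of equations
$$\begin{pmatrix} -1.066294 & 1.386294 & 0 \\ 0.795469 & -1.430937 & 0.955469 \\ 0 & 0.869619 & -1.739239 \end{pmatrix} \begin{pmatrix} y_0 \\ y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} -0.394518 \\ 0.147370 \\ -2.267766 \end{pmatrix}$$
where the components are given to 6 decimal places.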
Exercises
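1. Let y = y(x) be a solution to the boundary value problem
$$y''(x) - 2y'(x) + 3y(x) = 6, \qquad 0 < x < 0.75$$
$$y(0) = 2, \qquad y(0.75) = 1$$
Using a mesh width of h = 0.25 obtain a central difference approximation to the differential
equation and hence find $y_1 \approx y(0.25)$ and $y_2 \approx y(0.5)$.
2. Let y = y(x) be a solution to the boundary value problem
$$2y''(x) + 3y'(x) = 5, \qquad 0 < x < 1.2$$
$$y(0) = -2, \qquad y(1.2) = 3$$
Using a mesh width of h = 0.3 obtain a central difference approximation to the differential
equation and hence find a system of equations satisfied by $y_1 \approx y(0.3)$, $y_2 \approx y(0.6)$ and
$y_3 \approx y(0.9)$.
3. Let y = y(x) be a solution to the boundary value problem
$$y''(x) + x^2 y'(x) + 3y(x) = x, \qquad 0 < x < 1.5$$
$$y'(0) = 2, \qquad y(1.5) = 1.$$
(Note the derivative boundary condition at x = 0.)
Using a mesh width of h = 0.5 obtain a central difference approximation to the differential
equation and hence find the system of equations satisfied by $y_j \approx y(jh)$, j = 0, 1, 2.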
Answers
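1. In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
In this case the coefficients are
$$p_j - \frac{h}{2}q_j = 1.25, \quad h^2 r_j - 2p_j = -1.8125, \quad p_j + \frac{h}{2}q_j = 0.75, \quad h^2 s_j = 0.375.$$
In this case h = 0.25 and our numerical solution consists of the values
$y_0 = y(0) = 2$ from the boundary condition
$y_1 \approx y(0.25)$
$y_2 \approx y(0.5)$
$y_3 = y(0.75) = 1$ from the boundary condition
So there are two unknowns, $y_1$ and $y_2$. We centre the stencil at each of the corresponding x
values. Putting j = 1 and then j = 2 in the numerical stencil gives
$$1.25y_0 - 1.8125y_1 + 0.75y_2 = 0.375$$
$$1.25y_1 - 1.8125y_2 + 0.75y_3 = 0.375$$
Moving the terms involving the known values $y_0$ and $y_3$ to the right-hand side leads to the system of equations
$$\begin{pmatrix} -1.8125 & 0.75 \\ 1.25 & -1.8125 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} -2.125 \\ -0.375 \end{pmatrix}$$
with solution
$$y_1 = 1.76, \qquad y_2 = 1.42$$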
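2. In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
In this case the coefficients are
$$p_j - \frac{h}{2}q_j = 1.55, \quad h^2 r_j - 2p_j = -4, \quad p_j + \frac{h}{2}q_j = 2.45, \quad h^2 s_j = 0.45.$$
In this case h = 0.3 and our numerical solution consists of the values
$y_0 = y(0) = -2$ from the boundary condition
$y_1 \approx y(0.3)$
$y_2 \approx y(0.6)$
$y_3 \approx y(0.9)$
$y_4 = y(1.2) = 3$ from the boundary condition
So there are three unknowns, $y_1$, $y_2$ and $y_3$. We centre the stencil at each of the corresponding
x values. Putting j = 1, 2 and 3 in turn in the numerical stencil gives
$$1.55y_0 - 4y_1 + 2.45y_2 = 0.45$$
$$1.55y_1 - 4y_2 + 2.45y_3 = 0.45$$
$$1.55y_2 - 4y_3 + 2.45y_4 = 0.45$$
In these three equations $y_0$ and $y_4$ are known from the boundary conditions and we move
terms involving them to the right-hand side. This leads to the system of equations
$$\begin{pmatrix} -4 & 2.45 & 0 \\ 1.55 & -4 & 2.45 \\ 0 & 1.55 & -4 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 3.55 \\ 0.45 \\ -6.9 \end{pmatrix}$$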
3. In general, the central difference approximation to
$$p(x)y''(x) + q(x)y'(x) + r(x)y(x) = s(x)$$
is
$$y_{j-1}\left(p_j - \frac{h}{2}q_j\right) + y_j\left(h^2 r_j - 2p_j\right) + y_{j+1}\left(p_j + \frac{h}{2}q_j\right) = h^2 s_j.$$
The coefficients vary with j in this exercise because the functions p, q, r and s are not all
constants. In this case h = 0.5 and our numerical solution consists of the values
$y_0 \approx y(0)$ which is not given by the boundary condition
$y_1 \approx y(0.5)$
$y_2 \approx y(1)$
$y_3 = y(1.5) = 1$ from the boundary condition
So there are three unknowns, $y_0$, $y_1$ and $y_2$. We centre the stencil at each of the corresponding
x values. Putting j = 0 in the numerical stencil gives
$$y_{-1} - 1.25y_0 + y_1 = 0$$
which introduces the fictitious quantity $y_{-1}$. We attach a meaning to $y_{-1}$ on using the
boundary condition at x = 0. Approximating the derivative in the boundary condition by a
central difference gives
$$\frac{y_1 - y_{-1}}{2h} = 2 \quad\Rightarrow\quad y_{-1} = y_1 - 2$$
and we use this to remove $y_{-1}$ from the equation where it first appeared. Hence
$$-1.25y_0 + 2y_1 = 2$$
The remaining steps are similar to previous exercises. Putting j = 1 in the stencil gives
$$0.9375y_0 - 1.25y_1 + 1.0625y_2 = 0.125$$
and putting j = 2 gives
$$0.75y_1 - 1.25y_2 + 1.25y_3 = 0.25$$
In the last equation $y_3$ is known from the boundary conditions and we move the term involving
it to the right-hand side. This leads to the system of equations
$$\begin{pmatrix} -1.25 & 2 & 0 \\ 0.9375 & -1.25 & 1.0625 \\ 0 & 0.75 & -1.25 \end{pmatrix} \begin{pmatrix} y_0 \\ y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 0.125 \\ -1 \end{pmatrix}$$

Elliptic PDEs 33.2
Introduction
In Sections 32.4 and 32.5, we saw methods of obtaining numerical solutions to Parabolic and Hyperbolic
partial differential equations (PDEs). Another class of PDEs are the Elliptic type, and these usually
model time-independent situations. In this Section we will concentrate on two particularly important
Elliptic type PDEs: Laplace’s equation and Poisson’s equation.
Prerequisites
Before starting this Section you should . . .
• familiarise yourself with difference methods for approximating second derivatives (Section 31.3)
• revise the Jacobi and Gauss-Seidel methods (Section 30.5)
Learning Outcomes
On completion you should be able to . . .
• obtain simple approximate solutions of certain elliptic equations
1. Elliptic equations
Consider a region R (for example, a rectangle) in the xy-plane. We might pose the following boundary
value problem
$$u_{xx} + u_{yy} = f(x, y) \quad \text{(a given function) in } R$$
$$u = g \quad \text{(a given function) on the boundary of } R$$
• if f = 0 everywhere, then the PDE is called Laplace’s equation
• if f is non-zero somewhere in R then the PDE is called Poisson’s equation
Laplace’s equation models a huge range of physical situations. It is used by coastal engineers to
approximate the motion of the sea; it is used to model electric potential; it can give an approximation
to heat distribution in certain steady state problems. The list goes on and on. The generalisation
to Poisson’s equation opens up further application areas, but for our purposes in this Section we will
concentrate on how to solve the equation, rather than on how it is applied.
2. A five point stencil

The approach we shall use is to approximate the two second derivatives using central differences.
First we need some notation for our numerical solution, and we shall re-use some of the ideas seen
in Sections 32.4 and 32.5. We divide the x-axis up into subintervals of width δx and the y-axis
into subintervals of width δy.
There is a simplification available to us now that was not possible in Workbook 32. Here, the two
independent variables (x and y) both measure distance (in Workbook 32 we had x measuring distance
and t measuring time) and there is no reason to suppose that one direction is more important than
another, so we may choose the subintervals δx and δy to be equal.
Key Point 4
In deriving numerical solutions to elliptic PDEs we use equal steps in the x and y directions. That
is, we take
δx = δy = h (say)
So the idea is to approximate the second derivatives in the familiar way:
$$u_{xx} \approx \frac{u(x + h, y) - 2u(x, y) + u(x - h, y)}{h^2}, \qquad u_{yy} \approx \frac{u(x, y + h) - 2u(x, y) + u(x, y - h)}{h^2}$$
We will write our numerical approximation as
$$u_{i,j} \approx u(ih, jh)$$
in which $u_{i,j}$ is the numerical approximation and $u(ih, jh)$ is the exact (i.e. unknown) solution evaluated at x = i × h, y = j × h.
Key Point 5
We use subscripts on u to relate to space variables. For Elliptic PDEs both of the independent
variables measure distance and so we have two subscripts.
Key Point 6
If there is no danger of ambiguity we may omit the comma from the subscript. That is,
ui,j may be written uij and fi,j may be written fij
Given all of this preamble we can now write down a difference equation which approximates the
partial differential equation:
$$\underbrace{\frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{h^2}}_{\approx\, u_{xx}} + \underbrace{\frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{h^2}}_{\approx\, u_{yy}} = f_{i,j}$$
where $f_{i,j}$ is notation for $f(ih, jh)$.
Rearranging this gives
$$-4u_{i,j} + u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} = h^2 f_{i,j}$$
This equation defines a five-point stencil approximating the PDE. The following diagram shows
the stencil.
  j+1               u_{i,j+1}
  j     u_{i−1,j}   u_{i,j}     u_{i+1,j}
  j−1               u_{i,j−1}
          i−1          i           i+1
The idea in an implementation of this stencil is to centre the cross-shape on each i, j node where we
want to find u. This guarantees that we will end up with the same number of equations as unknowns.
An example of this approach will follow shortly, but first we note other ways of writing down the
five-point stencil.
As the diagram above shows, the stencil involves a centre point and four additional points each
corresponding to one of the points of the compass. It is this observation which has led to a simplified
version of the mathematical expression and the diagram. The symbolic stencil can be written
$$-4u_0 + u_E + u_W + u_N + u_S = h^2 f_0,$$
where a subscript 0 corresponds to the centre of the stencil and other subscripts correspond to
compass points (North, South, East, West) in the obvious way. The diagram becomes
         u_N
  u_W    u_0    u_E
         u_S
and we reinterpret the local “0, N, S, E, W” positions each time we move the stencil on the global
grid.
Another way of writing the stencil is as follows:
         1
   1    −4     1
         1
This latest version has the advantage of showing the values of the coefficients used in approximating
$u_{xx} + u_{yy}$.
We summarise in Key Point 7 the main idea using the notation established above.
Key Point 7
The five-point stencil used to approximate the partial differential equation
$$u_{xx} + u_{yy} = f(x, y)$$
gives rise to the difference equation
$$-4u_0 + u_E + u_W + u_N + u_S = h^2 f_0$$
Example 6
Consider the boundary value problem
$$u_{xx} + u_{yy} = 0 \quad \text{in the square } 0 < x < 1,\ 0 < y < 1$$
$$u = x^2 y \quad \text{on the boundary.}$$
Use $h = \frac{1}{3}$ and formulate a system of simultaneous equations for the 4 unknowns.
Solution
In the diagram below we see a schematic of the square in the xy plane. The numbers correspond to boundary data where the numerical grid intersects that boundary. The (as yet unknown) numerical approximations are shown in the positions where they approximate u(x, y).
  y ↑
    0     1/9    4/9    1
    0     u12    u22    2/3
    0     u11    u21    1/3
    0     0      0      0     → x
The numerical stencil in this case is $-4u_0 + u_E + u_W + u_N + u_S = 0$ and we centre this at each
of the places where u is sought. There are four such places in this example (the terms are written in the order Centre, East, West, North, South):
bottom left: $-4u_{11} + u_{21} + 0 + u_{12} + 0 = 0$
bottom right: $-4u_{21} + \frac{1}{3} + u_{11} + u_{22} + 0 = 0$
top left: $-4u_{12} + u_{22} + 0 + \frac{1}{9} + u_{11} = 0$
top right: $-4u_{22} + \frac{2}{3} + u_{12} + \frac{4}{9} + u_{21} = 0$
This is a system of equations in the four unknowns which may be written
$$\begin{pmatrix} -4 & 1 & 1 & 0 \\ 1 & -4 & 0 & 1 \\ 1 & 0 & -4 & 1 \\ 0 & 1 & 1 & -4 \end{pmatrix} \begin{pmatrix} u_{11} \\ u_{21} \\ u_{12} \\ u_{22} \end{pmatrix} = -\begin{pmatrix} 0 \\ \frac{1}{3} \\ \frac{1}{9} \\ \frac{10}{9} \end{pmatrix}$$
It is now a (simple, in theory) matter of solving the system to obtain the numerical approximation
to u.
It turns out that the solution to the system of equations is $u_{11} = \frac{1}{12}$, $u_{21} = \frac{7}{36}$, $u_{12} = \frac{5}{36}$ and
$u_{22} = \frac{13}{36}$. These values are, to four decimal places, 0.0833, 0.1944, 0.1389 and 0.3611, respectively.
We will say more later about how to solve the system of equations, but first there is a Task to help
consolidate what we have covered so far.
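The fractions quoted for Example 6 can be confirmed with exact rational arithmetic. The sketch below is illustrative only: `solve_exact` is our own small Gauss-Jordan routine, written for tiny systems like this one rather than for general use.

```python
from fractions import Fraction


def solve_exact(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic (small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for k in range(n):
        piv = next(i for i in range(k, n) if M[i][k] != 0)
        M[k], M[piv] = M[piv], M[k]
        for i in range(n):
            if i != k:
                m = M[i][k] / M[k][k]
                M[i] = [a - m * c for a, c in zip(M[i], M[k])]
    return [M[i][n] / M[i][i] for i in range(n)]


# System of Example 6; unknowns in the order u11, u21, u12, u22
A = [[Fraction(v) for v in row] for row in
     [[-4, 1, 1, 0],
      [1, -4, 0, 1],
      [1, 0, -4, 1],
      [0, 1, 1, -4]]]
b = [Fraction(0), Fraction(-1, 3), Fraction(-1, 9), Fraction(-10, 9)]

u11, u21, u12, u22 = solve_exact(A, b)
print(u11, u21, u12, u22)
```

Using `Fraction` avoids rounding altogether, which is convenient when checking hand calculations but too slow for large systems.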
Task
Consider the boundary value problem
$$u_{xx} + u_{yy} = -2 \quad \text{in the square } 0 < x < 1,\ 0 < y < 1$$
$$u = xy \quad \text{on the boundary.}$$
Use $h = \frac{1}{3}$ and hence formulate a system of simultaneous equations for the four
unknowns.
Your solution
Answer
In the diagram below we see a schematic of the square in the xy plane. The numbers correspond to boundary data where the numerical grid intersects that boundary. The (as yet unknown) numerical approximations are shown in the positions where they approximate u(x, y).
  y ↑
    0     1/3    2/3    1
    0     u12    u22    2/3
    0     u11    u21    1/3
    0     0      0      0     → x
The numerical stencil in this case is
$$-4u_0 + u_E + u_W + u_N + u_S = h^2 f_0 = \left(\frac{1}{3}\right)^2 \times (-2) = -\frac{2}{9}$$
and we centre this at each of the places where u is sought. In this Example there are four such
places (the terms are written in the order Centre, East, West, North, South):
bottom left: $-4u_{11} + u_{21} + 0 + u_{12} + 0 = -\frac{2}{9}$
bottom right: $-4u_{21} + \frac{1}{3} + u_{11} + u_{22} + 0 = -\frac{2}{9}$
top left: $-4u_{12} + u_{22} + 0 + \frac{1}{3} + u_{11} = -\frac{2}{9}$
top right: $-4u_{22} + \frac{2}{3} + u_{12} + \frac{2}{3} + u_{21} = -\frac{2}{9}$
This is a system of equations in the four unknowns and it may be written
$$\begin{pmatrix} -4 & 1 & 1 & 0 \\ 1 & -4 & 0 & 1 \\ 1 & 0 & -4 & 1 \\ 0 & 1 & 1 & -4 \end{pmatrix} \begin{pmatrix} u_{11} \\ u_{21} \\ u_{12} \\ u_{22} \end{pmatrix} = -\begin{pmatrix} \frac{2}{9} \\ \frac{5}{9} \\ \frac{5}{9} \\ \frac{14}{9} \end{pmatrix}$$
3. Systems of equations
In order to obtain accurate results over a large number of interior points, we need to decrease h
compared to the values used in the Examples above.
The diagram below shows a case where 5 steps are used in each direction on a square domain. It
follows that there will be 4 × 4 = 16 unknowns. Positioning the stencil over each xy position where u
is unknown will give the right number of equations, and the order we take the 16 points is indicated
by the arrows on the diagram.
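[Diagram: a square grid with 5 steps in each direction; arrows indicate that the 16 interior points are taken row by row, from left to right, starting with the bottom row]
It follows that there will be a system of equations involving
$$\left(\begin{array}{rrrrrrrrrrrrrrrr}
-4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & -4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & -4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -4 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & -4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & -4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & -4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & -4 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -4 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & -4 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & -4 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & -4 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -4 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & -4 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & -4 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & -4
\end{array}\right)
\left(\begin{array}{c}
u_{11} \\ u_{21} \\ u_{31} \\ u_{41} \\ u_{12} \\ u_{22} \\ u_{32} \\ u_{42} \\ u_{13} \\ u_{23} \\ u_{33} \\ u_{43} \\ u_{14} \\ u_{24} \\ u_{34} \\ u_{44}
\end{array}\right) = \dots$$
with a right-hand side that depends on the function f and the boundary conditions. The occasional zeros that interrupt the lines of 1s next to the main diagonal occur where a node lies beside the left-hand or right-hand boundary, so that its westerly or easterly neighbour is boundary data rather than an unknown.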
There is a great deal of structure in this matrix. Most of the elements are zero. Apart from that there
are five non-zero diagonal bands (from top-left to bottom-right), each corresponding to a component
of the five-point stencil. The main diagonal is made up of repetitions of −4, the coefficient from the
centre of the 5-point stencil. Immediately above and below the main diagonal are terms that come
from the easterly and westerly extremes of the stencil, respectively. Separated from the tridiagonal
band are two outlying lines of 1s. The uppermost sequence of 1s is due to the northerly point on the
stencil and the lowermost is a consequence of the southerly point.
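The structure described above can be reproduced by assembling the matrix in code. The sketch below is illustrative only: the function name `five_point_matrix` is our own, and it assumes a square block of m × m unknowns taken row by row, from left to right, starting with the bottom row.

```python
def five_point_matrix(m):
    """Coefficient matrix for an m-by-m block of interior nodes, taken row by row."""
    n = m * m
    A = [[0] * n for _ in range(n)]
    for row in range(m):              # position in the y-direction (0-based)
        for col in range(m):          # position in the x-direction (0-based)
            k = row * m + col         # position of this node in the list of unknowns
            A[k][k] = -4              # centre of the stencil
            if col > 0:
                A[k][k - 1] = 1       # westerly neighbour is an unknown
            if col < m - 1:
                A[k][k + 1] = 1       # easterly neighbour is an unknown
            if row > 0:
                A[k][k - m] = 1       # southerly neighbour is an unknown
            if row < m - 1:
                A[k][k + m] = 1       # northerly neighbour is an unknown
    return A


A = five_point_matrix(4)
for line in A:
    print(" ".join(f"{v:2d}" for v in line))
```

Neighbours that lie on the boundary do not appear in the matrix at all; their known values contribute to the right-hand side instead. In particular the last node of one grid row and the first node of the next are not neighbours, so the corresponding entries beside the main diagonal are zero.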
It is worth noting that much of this structure failed to emerge in the numerical examples considered
earlier. This was because the mesh was so coarse (that is, h was so large) that the stencil was
always in touch with the boundary. It is more usual that most placings of the stencil will produce an
equation involving five unknowns.
In general, then, an implementation of the five-point stencil will ultimately involve having to solve a
potentially large number of simultaneous equations. We have seen in Workbook 30 methods for dealing
with systems of equations; for example, we saw the Jacobi and Gauss-Seidel iterative methods. It is
possible, in the present application, to implement these methods directly via the numerical stencil.
The next subsection describes how this may be achieved.
4. Iterative methods
An implementation of the five-point stencil
$$-4u_0 + u_E + u_W + u_S + u_N = h^2 f_0$$
leads to a system of simultaneous equations in the unknowns. This system of equations can be dealt
with using methods seen in Workbook 30, but here we show ways in which systematic iterative methods
can be derived directly from the numerical stencil.
The general approach is as follows:
1. Start with an initial guess for the unknowns. Call this initial guess $u^0_{i,j}$.
2. Use some means to improve the guess. Call the improvement $u^1_{i,j}$.
3. And so on. In general we derive a new set of approximations $u^{n+1}_{i,j}$ in terms of the previous
approximations $u^n_{i,j}$.
Jacobi iteration
The approach we adopt here is to update the approximation at the centre of the stencil using the
four old values around the edge of the stencil. That is
$$-4u_0^{n+1} + u_E^n + u_W^n + u_S^n + u_N^n = h^2 f_0.$$
Rearranging this gives
$$u_0^{n+1} = \frac{1}{4}\left(u_E^n + u_W^n + u_S^n + u_N^n - h^2 f_0\right).$$
The following Example uses the same data (rounded to four decimal places here) as in Example 6.
Example 7
Suppose that u = u(x, y) satisfies Laplace’s equation
$$u_{xx} + u_{yy} = 0$$
in the square region 0 < x, y < 1 with $u = x^2 y$ on the boundary. Assuming
a mesh size of $h = \frac{1}{3}$, use the Jacobi iteration, with starting values $u^0_{ij} = 0$, to
perform two iterations. The boundary data are as given in the schematic below.
y↑
0.0000 0.1111 0.4444 1.0000
0.0000 u12 u22 0.6667
0.0000 u11 u21 0.3333
0.0000 0.0000 0.0000 0.0000 → x
Solution
Putting in the initial guesses for the four unknowns u11 , u12 , u21 , u22 we obtain the situation depicted
below.
y↑
0.0000 0.1111 0.4444 1.0000
0.0000 0 0 0.6667
0.0000 0 0 0.3333
0.0000 0.0000 0.0000 0.0000 → x
The first iteration involves using
$$4u_0^1 = u_E^0 + u_W^0 + u_N^0 + u_S^0 - h^2 f_0$$
where, in this case, $h^2 f_0 = 0$. So the first iteration gives us
$$u^1_{11} = 0.0000, \quad u^1_{21} = 0.0833, \quad u^1_{12} = 0.0278, \quad u^1_{22} = 0.2778.$$
The second iteration begins by putting these new approximations to the interior values into the grid.
This gives
y↑
0.0000 0.1111 0.4444 1.0000
0.0000 0.0278 0.2778 0.6667
0.0000 0.0000 0.0833 0.3333
0.0000 0.0000 0.0000 0.0000 → x
We now apply $4u_0^2 = u_E^1 + u_W^1 + u_N^1 + u_S^1$ to obtain
$$u^2_{11} = 0.0278, \quad u^2_{21} = 0.1528, \quad u^2_{12} = 0.0972, \quad u^2_{22} = 0.3056.$$
In practice, using a computer to carry out the arithmetic, we would continue iterating until the results
settle down to a converged value. Using a computer spreadsheet, for example, we can see that a
total of 15 iterations is enough to achieve results converged to four decimal places. We noted earlier
that, to four decimal places, u11 = 0.0833, u21 = 0.1944, u12 = 0.1389 and u22 = 0.3611.
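The two Jacobi iterations of Example 7 can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the names `jacobi_sweep` and `example7_grid`, and the choice to store the grid as `u[i][j]` for the point $(x_i, y_j)$, are illustrative and do not come from the Workbook.

```python
def jacobi_sweep(u, h2f):
    """Return a new grid after one Jacobi sweep.

    u[i][j] holds the value at (x_i, y_j), boundary values included;
    h2f is the constant h^2 * f_0 (zero for Laplace's equation).
    """
    n = len(u) - 1
    new = [column[:] for column in u]      # boundary values are copied unchanged
    for j in range(1, n):
        for i in range(1, n):
            # every neighbour is read from the OLD grid u
            new[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j]
                                + u[i][j + 1] + u[i][j - 1] - h2f)
    return new


def example7_grid():
    """Boundary data of Example 7 (u = x^2 y, h = 1/3), interior guesses zero."""
    u = [[0.0] * 4 for _ in range(4)]
    u[1][3], u[2][3], u[3][3] = 0.1111, 0.4444, 1.0000   # top edge, y = 1
    u[3][1], u[3][2] = 0.3333, 0.6667                    # right edge, x = 1
    return u


u1 = jacobi_sweep(example7_grid(), 0.0)   # first iteration
u2 = jacobi_sweep(u1, 0.0)                # second iteration
for name, (i, j) in {"u11": (1, 1), "u21": (2, 1),
                     "u12": (1, 2), "u22": (2, 2)}.items():
    print(name, f"{u1[i][j]:.4f}", f"{u2[i][j]:.4f}")
```

Because the sweep writes into a separate copy of the grid, every update uses only values from the previous iteration, which is the defining feature of the Jacobi method.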
The following Task uses the same data as the preceding Task (pages 23-24), except that we have
rounded the boundary data to four decimal places instead of using the exact fractions.
Task
Suppose that u = u(x, y) satisfies Poisson’s equation
$$u_{xx} + u_{yy} = -2$$
in the square region 0 < x, y < 1 with u = xy on the boundary. Assuming a mesh
size of $h = \frac{1}{3}$, use the Jacobi iteration, with starting values $u^0_{ij} = 0$, to perform
two iterations. The boundary data are as given in the schematic below.
y↑
0.0000 0.3333 0.6667 1.0000
0.0000 u12 u22 0.6667
0.0000 u11 u21 0.3333
0.0000 0.0000 0.0000 0.0000 → x
Your solution
First iteration:
Answer
Putting in the initial guesses for the four unknowns we obtain the situation depicted below.
y↑
0.0000 0.3333 0.6667 1.0000
0.0000 0 0 0.6667
0.0000 0 0 0.3333
0.0000 0.0000 0.0000 0.0000 → x
$$4u_0^1 = u_E^0 + u_W^0 + u_N^0 + u_S^0 - h^2 f_0$$
where in this case $h^2 f_0 = -0.2222$. So the first iteration gives us
$$u^1_{11} = 0.0556, \quad u^1_{21} = 0.1389, \quad u^1_{12} = 0.1389, \quad u^1_{22} = 0.3889.$$
Your solution
Second iteration:
Answer
This gives
y↑
0.0000 0.3333 0.6667 1.0000
0.0000 0.1389 0.3889 0.6667
0.0000 0.0556 0.1389 0.3333
0.0000 0.0000 0.0000 0.0000 → x
We now perform the second iteration $4u_0^2 = u_E^1 + u_W^1 + u_N^1 + u_S^1 - h^2 f_0$ again, but with the new
values. We obtain
$u^2_{11} = 0.1250$
$u^2_{21} = 0.2500$
$u^2_{12} = 0.2500$
$u^2_{22} = 0.4583$
In the case above 17 iterations are required to achieve results that have converged to 4 decimal
places. We find that u11 = 0.2222, u12 = 0.3333, u21 = 0.3333 and u22 = 0.5556.
Gauss-Seidel iteration
In the implementation of the Jacobi method we used old values for the southerly and westerly points
when new values had already been calculated.
[Diagram: five-point stencil with the westerly and southerly points labelled "new values already found"]
The Gauss-Seidel method uses the new values as soon as they are available. Stating this formally we
have
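$$u_0^{n+1} = \frac{1}{4}\left(u_E^n + u_W^{n+1} + u_S^{n+1} + u_N^n - h^2 f_0\right),$$
where the new values enter through the westerly and southerly terms $u_W^{n+1}$ and $u_S^{n+1}$.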
Example 8 below uses the same data as Examples 6 and 7.
Example 8
Suppose that u = u(x, y) satisfies Laplace’s equation
$$u_{xx} + u_{yy} = 0$$
in the square region 0 < x, y < 1 with $u = x^2 y$ on the boundary. Assuming a
mesh size of $h = \frac{1}{3}$, use the Gauss-Seidel iteration, with starting values $u^0_{ij} = 0$, to
perform two iterations. The boundary data are as given in the schematic below.
y↑
0.0000 0.1111 0.4444 1.0000
0.0000 u12 u22 0.6667
0.0000 u11 u21 0.3333
0.0000 0.0000 0.0000 0.0000 → x
Solution
y↑
0.0000 0.1111 0.4444 1.0000
0.0000 0 0 0.6667
0.0000 0 0 0.3333
0.0000 0.0000 0.0000 0.0000 → x
$$4u_0^1 = u_E^0 + u_W^1 + u_N^0 + u_S^1 - h^2 f_0$$
where in this case $h^2 f_0 = 0$. So the first iteration gives us
$u^1_{11} = 0.0000$
$u^1_{21} = 0.0833$
$u^1_{12} = 0.0278$
$u^1_{22} = 0.3056$
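The second iteration begins by putting these new approximations to the interior values into the grid.
This gives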
y↑
0.0000 0.1111 0.4444 1.0000
0.0000 0.0278 0.3056 0.6667
0.0000 0.0000 0.0833 0.3333
0.0000 0.0000 0.0000 0.0000 → x
We now apply $4u_0^2 = u_E^1 + u_W^2 + u_N^1 + u_S^2 - h^2 f_0$ to obtain
$$u^2_{11} = 0.0278, \quad u^2_{21} = 0.1667, \quad u^2_{12} = 0.1111, \quad u^2_{22} = 0.3472.$$
(And, using a computer spreadsheet, for example, we can see that a total of 7 iterations is enough to
achieve results converged to four decimal places. This compares well with the 15 iterations required
by Jacobi in Example 7.)
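The Gauss-Seidel iterations of Example 8 differ from the Jacobi ones only in that each new value overwrites the old one immediately. The sketch below is illustrative: the name `gauss_seidel_sweep`, the storage `u[i][j]` for the point $(x_i, y_j)$, and the sweep order (west to east along each row of the mesh, rows taken from south to north) are assumptions chosen so that the westerly and southerly neighbours are the ones already updated.

```python
def gauss_seidel_sweep(u, h2f):
    """Update the interior of u in place and return it.

    Sweeping i (west to east) inside j (south to north) means that
    u[i-1][j] and u[i][j-1] already hold values from the current sweep.
    """
    n = len(u) - 1
    for j in range(1, n):
        for i in range(1, n):
            u[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j]
                              + u[i][j + 1] + u[i][j - 1] - h2f)
    return u


# Boundary data of Example 8 (u = x^2 y, h = 1/3), interior guesses zero
u = [[0.0] * 4 for _ in range(4)]
u[1][3], u[2][3], u[3][3] = 0.1111, 0.4444, 1.0000   # top edge, y = 1
u[3][1], u[3][2] = 0.3333, 0.6667                    # right edge, x = 1

first = [column[:] for column in gauss_seidel_sweep(u, 0.0)]    # snapshot after sweep 1
second = [column[:] for column in gauss_seidel_sweep(u, 0.0)]   # snapshot after sweep 2
print(f"{first[2][2]:.4f}", f"{second[2][2]:.4f}")              # u22 after each sweep
```

Updating in place is what distinguishes this from a Jacobi sweep: no second copy of the grid is needed, but the result depends on the order in which the points are visited.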
Task
Suppose that u = u(x, y) satisfies Poisson’s equation
$$u_{xx} + u_{yy} = -2$$
in the square region 0 < x, y < 1 with u = xy on the boundary. Assuming a
mesh size of $h = \frac{1}{3}$, use the Gauss-Seidel iteration, with starting values $u^0_{ij} = 0$, to
perform two iterations. The boundary data are as given in the schematic below.
y↑
0.0000 0.3333 0.6667 1.0000
0.0000 u12 u22 0.6667
0.0000 u11 u21 0.3333
0.0000 0.0000 0.0000 0.0000 → x
Your solution
First iteration:
Answer
y↑
0.0000 0.3333 0.6667 1.0000
0.0000 0 0 0.6667
0.0000 0 0 0.3333
0.0000 0.0000 0.0000 0.0000 → x
$$4u_0^1 = u_E^0 + u_W^1 + u_N^0 + u_S^1 - h^2 f_0$$
where in this case $h^2 f_0 = -0.2222$. We need to take care to use new values as soon as they
are available. So the first iteration gives us
$u^1_{11} = 0.0556$
$u^1_{21} = 0.1528$ using the new $u_{11}$ approximation
$u^1_{12} = 0.1528$ using the new $u_{11}$ approximation
$u^1_{22} = 0.4653$ using the new $u_{12}$ and $u_{21}$ approximations
(to 4 decimal places).
Your solution
Second iteration:
Answer
This gives
y↑
0.0000 0.3333 0.6667 1.0000
0.0000 0.1528 0.4653 0.6667
0.0000 0.0556 0.1528 0.3333
0.0000 0.0000 0.0000 0.0000 → x
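We now apply $4u_0^2 = u_E^1 + u_W^2 + u_N^1 + u_S^2 - h^2 f_0$ again, but with the new values. We obtain
$u^2_{11} = 0.1319$
$u^2_{21} = 0.2882$ using the new $u_{11}$ approximation
$u^2_{12} = 0.2882$ using the new $u_{11}$ approximation
$u^2_{22} = 0.5330$ using the new $u_{12}$ and $u_{21}$ approximations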
and we can write this information in the form

y↑
0.0000 0.3333 0.6667 1.0000
0.0000 0.2882 0.5330 0.6667
0.0000 0.1319 0.2882 0.3333
0.0000 0.0000 0.0000 0.0000 → x
Again, a computer can be used to continue iterating until convergence. This method applied to this
Task needs 8 iterations to achieve 4 decimal place convergence, a fact which compares very well with
the 17 required by the Jacobi method.
Convergence
We now summarise some important points:
1. For the problems discussed in these pages, the Jacobi and Gauss-Seidel methods will always
converge for any initial guesses $u^0_{ij}$. (Of course, very poor initial guesses will result in more
iterations being required.)
2. For a given problem and given starting guesses $u^0_{ij}$, the Gauss-Seidel method will, in general,
converge in fewer iterations than Jacobi. (That is, using the new, improved values as
soon as they are available speeds up the process.)
3. One possible advantage with the Jacobi approach is that it can be parallelised, that is, it is in
theory possible to do all the calculations for a given iteration simultaneously. In other words,
everything we will need to know to carry out an iteration is known before the iteration begins.
This is not the case with Gauss-Seidel, in which, during an iteration, most calculations use a
result from within the current iteration. This advantage with Jacobi only manifests itself when
using computers with a parallelisation option and for large problems.
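The second point can be checked numerically on the grid of Examples 7 and 8. This is a sketch only: the function name `iterations_to_converge`, the stopping rule (no interior value changes by more than `tol` in a sweep) and the value `tol = 5e-5` are assumptions, so the counts it reports need not coincide exactly with the 15 and 7 quoted earlier, which depend on how "converged to four decimal places" is judged.

```python
def iterations_to_converge(use_new_values, tol=5e-5, max_iter=1000):
    """Count sweeps until no interior value changes by more than tol.

    use_new_values=False gives Jacobi; True gives Gauss-Seidel.
    The grid is that of Example 7 (Laplace's equation, so h^2 f_0 = 0).
    """
    u = [[0.0] * 4 for _ in range(4)]
    u[1][3], u[2][3], u[3][3] = 0.1111, 0.4444, 1.0000   # top edge, y = 1
    u[3][1], u[3][2] = 0.3333, 0.6667                    # right edge, x = 1
    for count in range(1, max_iter + 1):
        old = [column[:] for column in u]
        src = u if use_new_values else old   # where neighbour values are read from
        change = 0.0
        for j in range(1, 3):
            for i in range(1, 3):
                u[i][j] = 0.25 * (src[i + 1][j] + src[i - 1][j]
                                  + src[i][j + 1] + src[i][j - 1])
                change = max(change, abs(u[i][j] - old[i][j]))
        if change < tol:
            return count, u
    raise RuntimeError("no convergence within max_iter sweeps")


jacobi_count, u_jacobi = iterations_to_converge(use_new_values=False)
seidel_count, u_seidel = iterations_to_converge(use_new_values=True)
print("Jacobi sweeps:", jacobi_count, " Gauss-Seidel sweeps:", seidel_count)
```

Both runs settle on the same interior values; only the number of sweeps needed differs.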
Exercises
1. Suppose that u = u(x, y) satisfies Laplace’s equation
$$u_{xx} + u_{yy} = 0$$
in the square region 0 < x, y < 1. Assuming a mesh size of $h = \frac{1}{3}$, use the Jacobi iteration,
with starting values $u^0_{ij} = 0$, to perform two iterations. The boundary data are as given in the
schematic below:
y↑
0.0000 0.2500 0.7500 1.0000
0.4000 u12 u22 0.8000
0.8000 u11 u21 0.4000
0.0000 0.7500 0.2500 0.0000 → x
2. Suppose that u = u(x, y) satisfies Laplace’s equation
$$u_{xx} + u_{yy} = 0$$
in the square region 0 < x, y < 1. Assuming a mesh size of $h = \frac{1}{3}$, use the Gauss-Seidel
iteration, with starting values $u^0_{ij} = 0$, to perform two iterations. The boundary data are as
given in the schematic below.
y↑
0.0000 0.2500 0.7500 1.0000
0.4000 u12 u22 0.8000
0.8000 u11 u21 0.4000
0.0000 0.7500 0.2500 0.0000 → x
Answers
1. Putting in the initial guesses for the four unknowns we obtain the situation depicted below.
y↑
0.0000 0.2500 0.7500 1.0000
0.4000 0 0 0.8000
0.8000 0 0 0.4000
0.0000 0.7500 0.2500 0.0000 → x
$$4u_0^1 = u_E^0 + u_W^0 + u_N^0 + u_S^0 - h^2 f_0$$
where in this case $h^2 f_0 = 0.0000$. So the first iteration gives us
$u^1_{11} = 0.3875$
$u^1_{21} = 0.1625$
$u^1_{12} = 0.1625$
$u^1_{22} = 0.3875$
The second iteration begins by putting these new approximations to the interior values into
the grid. This gives
y↑
0.0000 0.2500 0.7500 1.0000
0.4000 0.1625 0.3875 0.8000
0.8000 0.3875 0.1625 0.4000
0.0000 0.7500 0.2500 0.0000 → x
and the second iteration gives
$u^2_{11} = 0.4688$
$u^2_{21} = 0.3563$
$u^2_{12} = 0.3563$
$u^2_{22} = 0.4688$
Answers
2. Putting in the initial guesses for the four unknowns we obtain the situation depicted below.
y↑
0.0000 0.2500 0.7500 1.0000
0.4000 0 0 0.8000
0.8000 0 0 0.4000
0.0000 0.7500 0.2500 0.0000 → x
$$4u_0^1 = u_E^0 + u_W^1 + u_N^0 + u_S^1 - h^2 f_0$$
where in this case $h^2 f_0 = 0.0000$. So the first iteration gives us
$u^1_{11} = 0.3875$
$u^1_{21} = 0.2594$
$u^1_{12} = 0.2594$
$u^1_{22} = 0.5172$
The second iteration begins by putting these new approximations to the interior values into
the grid. This gives
y↑
0.0000 0.2500 0.7500 1.0000
0.4000 0.2594 0.5172 0.8000
0.8000 0.3875 0.2594 0.4000
0.0000 0.7500 0.2500 0.0000 → x
and the second iteration gives
$u^2_{11} = 0.5172$
$u^2_{21} = 0.4211$
$u^2_{12} = 0.4211$
$u^2_{22} = 0.5980$
Contents 34
Modelling Motion
34.1 Projectiles
34.2 Forces in More Than One Dimension
34.3 Resisted Motion
Learning outcomes
This Workbook follows on from Workbook 5 in describing ways in which mathematical
techniques are used in modelling. In this Workbook you will learn how use of vectors
provides shorthand descriptions of projectile motion in several contexts, motion in a circle
and on curved paths such as in fairground rides. Also you will learn how the complication
of velocity-dependent resistance to motion can be handled in certain cases.

Projectiles 34.1
Introduction
In this Section we study the motion of projectiles constrained only by gravity. Although historically
the mechanics of projectile motion were studied and developed mainly in military contexts, there
are many relevant non-military situations. For example botanists study the mechanics of dispersal
of seeds from exploding pods; hydraulic engineers are interested in the distribution and settling of
sediments and particles; many athletic activities and sports such as skiing and diving involve humans
acting as projectiles through leaping or hurdling or otherwise throwing themselves about. Other
sporting activities involve inanimate projectiles e.g. balls of various kinds, javelins. Precise models of
some possible situations, for example swerving or swinging or spinning balls, or ski-jumping involve
rather complicated kinds of motion and require considerations of resistive forces and aerodynamic
forces. First trips around the modelling cycle (see Workbook 5), sometimes second trips, are given here.
Prerequisites
• be able to use vectors and to carry out scalar and vector products
• be able to use Newton’s laws to describe and model the motion of particles
• be able to use coordinate geometry to study circles and parabolas
• be able to use calculus to differentiate and integrate polynomials
Learning Outcomes
On completion you should be able to . . .
• use vector notation to represent the position, velocity and acceleration of projectiles,
objects moving on inclined planes and objects moving on curved paths
• compute frictional forces on static and moving objects on inclined planes and on
objects moving at constant speed around bends
2 HELM (2006):
Workbook 34: Modelling Motion
1. Introduction
In this Section we study the motion of projectiles constrained only by gravity. We revise the model,
based on Newton’s laws, for the motion of an object falling vertically without air resistance and extend
this to two dimensions using vector functions to represent position, velocity and acceleration. It is
pointed out that an object falling under gravity or thrown vertically upwards before falling back under
gravity are simple examples of projectiles. More interesting projectiles involve horizontal as well as
vertical motion. The vector nature of the motion is explored. The influences of launch height and
launch angle are explored in various contexts. Also we consider the motion of objects constrained to
move on inclined planes (e.g. the balls in pinball machines).
2. Projectiles: an introduction
Vertical motion under gravity
Consider a marble which is thrown horizontally off the Clifton Suspension Bridge at a speed of
10 m s$^{-1}$ and falls into the River Avon. We wish to find the location at which it will splash into the
river. We assume that the only force acting on the marble is the force of gravity and that this force
is constant. The marble is regarded as a projectile i.e. a point object which has mass but does not
spin or rotate. Another assumption made is that the Earth is locally flat. Since the initial vertical
speed is zero, application of the distance-time equation ($s = ut + \frac{1}{2}at^2$) to the vertical motion gives
$$y(t) = \frac{1}{2} g t^2 \qquad (1.1)$$
where y is measured downwards from the bridge and g is the acceleration due to gravity.
An object falling freely in the vertical (j) direction with zero initial velocity and no air resistance
has a position vector r(t), a variable vector depending on the (scalar) variable t representing the
time, where
$$\mathbf{r}(t) = y(t)\,\mathbf{j} = \frac{1}{2} g t^2\,\mathbf{j},$$
as illustrated in the diagram below.
[Diagram: unit vector j and position vector r directed down the y-axis; y(t) = distance dropped in time t]
For motion in a straight line there is no particular reason for introducing vectors. However, time-
dependent vectors may be used to describe more complicated motion - for example that along curved
paths. By introducing the horizontal unit vector i in addition to the vertical unit vector j, a position
vector in two dimensions may be written
r(t) = x(t)i + y(t)j.
For an object falling vertically, x(t) = 0 because x does not change with time. Suppose, however,
that the object were to have been launched horizontally at speed u. Then, if air resistance is
ignored and there are no other forces acting in the horizontal direction, the horizontal acceleration is
zero and the horizontal speed of the object should remain constant. This means that the horizontal
coordinate is given by x(t) = ut and, using the earlier result for y(t), the vector function describing
the position at time t of an object thrown horizontally from some point, which is taken as the origin
of coordinates, is given by
$$\mathbf{r}(t) = ut\,\mathbf{i} + \frac{1}{2} g t^2\,\mathbf{j}. \qquad (1.2)$$
The coordinate system and the vectors corresponding to such a situation are shown in Figure 1.
[Diagram: bridge with origin O and unit vectors i and j, position vector r, axes x and y, and the river 77 m below]
Figure 1: Coordinate system and unit vectors for an object thrown from a bridge
The information in Equation (1.2) is sufficient to determine the object’s position graphically at any
time t, since it gives both x(t) and y(t) and hence it is possible to plot y(t) against x(t) for various
values of t. An example calculation of the path during the first 3.5 s of the descent of an object
thrown horizontally at 10 m s$^{-1}$ with g = 9.81 m s$^{-2}$ is shown in Figure 2. In this graph values
of y(t) increase downward so that the curve corresponds to the downward path of the object. The
technical name used to describe such a path is the trajectory.
Figure 2: Trajectory for first 3.5 s of an object thrown horizontally from a bridge
at 10 m s$^{-1}$ ignoring air resistance (horizontal axis: x(t) in m; vertical axis: y(t) in m, increasing downward)
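The points on such a trajectory follow directly from Equation (1.2). The sketch below tabulates them at half-second intervals; the function name `position` and the sampling interval are assumptions made for illustration.

```python
G = 9.81  # acceleration due to gravity in m s^-2


def position(t, u, g=G):
    """Components (x, y) of r(t) = u t i + (1/2) g t^2 j, with y measured downward."""
    return u * t, 0.5 * g * t ** 2


# Sample the first 3.5 s of the fall for a launch speed of 10 m s^-1
points = [position(0.5 * k, 10.0) for k in range(8)]   # t = 0, 0.5, ..., 3.5 s
for x, y in points:
    print(f"x = {x:5.1f} m, y = {y:6.2f} m")
```

Plotting y against x for these points, with y increasing downward, gives the curve described above.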
Given the vector components of the time-dependent velocity, it is possible to calculate its magnitude
and direction at any time. The magnitude is given by the square root of the sum of the squares of
the components. Hence, from the last example, the magnitude of the position vector is given by
$$|\mathbf{r}| = \left(u^2 t^2 + \frac{1}{4} g^2 t^4\right)^{1/2}.$$
The angle of the position vector measured clockwise from the x-direction is given by
$$\cos\phi = \frac{ut}{|\mathbf{r}|}, \qquad \sin\phi = \frac{\frac{1}{2} g t^2}{|\mathbf{r}|},$$
so
$$\tan\phi = \frac{gt}{2u}.$$
Note that the angle is zero when t is zero and increases with t (as might be expected). Figure 3
shows graphs of |r| and φ for the example and values of t considered in Figure 2.
[Graphs: |r(t)| in m against t in s, and φ in degrees against t in s]
Figure 3: Values of |r| and φ° (= φ(t) × (180/π)) for the object projected from a bridge
Note that φ is the angle that the position vector makes with the horizontal and does not denote
the direction of motion (i.e. the velocity) of the object. Note also that by introducing another unit
vector k at right-angles to both i and j it is possible to consider motion in three dimensions.
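The magnitude and angle of the position vector plotted in Figure 3 can be evaluated as follows. This is a sketch only; the function name `magnitude_and_angle` is an assumption, and the angle is returned in degrees measured from the x-direction towards the (downward) y-direction.

```python
import math


def magnitude_and_angle(t, u, g=9.81):
    """Return (|r|, phi in degrees) for r(t) = u t i + (1/2) g t^2 j."""
    x = u * t
    y = 0.5 * g * t ** 2
    r = math.hypot(x, y)                      # square root of x^2 + y^2
    phi = math.degrees(math.atan2(y, x))      # angle from the x-direction
    return r, phi


for t in (0.0, 1.0, 2.0, 3.0):
    r, phi = magnitude_and_angle(t, 10.0)
    print(f"t = {t:.1f} s: |r| = {r:6.2f} m, phi = {phi:5.1f} degrees")
```

The angle starts at zero and grows with t, in line with $\tan\phi = gt/(2u)$.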
Task
Write down the position vector for a particle moving so that its coordinates are
given by
$$x = 2\cos(\omega t), \qquad y = 2\sin(\omega t), \qquad z = 1.$$
What is the corresponding magnitude of this vector? How would you describe the
resulting motion?
Your solution
Answer
The position vector may be written
$$\mathbf{r}(t) = 2\cos(\omega t)\,\mathbf{i} + 2\sin(\omega t)\,\mathbf{j} + \mathbf{k}.$$
Hence
$$|\mathbf{r}(t)| = \sqrt{4\cos^2(\omega t) + 4\sin^2(\omega t) + 1} = \sqrt{5}.$$
Since this is constant, the particle stays at a constant distance from the origin during its motion.
When t = 0, r(0) = 2i + k .
[Diagram: axes i, j, k with a circle of radius 2 at height 1 above the i-j plane]
The object is moving in a circle of radius 2 in the z = 1 plane (see diagram).
Task
Show that the vector function
$$\mathbf{r}(t) = at\,\mathbf{i} + bt^2\,\mathbf{j},$$
where a and b are constant scalars, can be represented by a parabola.
By comparing Equation (1.2) with the equation in this Task demonstrate that the
trajectory shown in Figure 2 is part of a parabola.
Your solution
Answer
Given $\mathbf{r}(t) = at\,\mathbf{i} + bt^2\,\mathbf{j} = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ we can write $x(t) = at$ and $y(t) = bt^2$. Using the first of
these to obtain $t = x/a$ and substituting for t in the second, we obtain $y = b\,x^2/a^2 = cx^2$ where $c = b/a^2$ is a
constant. This is the equation of a parabola with its vertex at (0, 0).
Suppose that we wish to calculate the coordinates at which the marble will splash into the River
Avon, given that the water surface is 77 m below the point of launch. Since the horizontal component
of velocity is not changing during the fall, we concentrate on the vertical motion. The strategy is
to calculate the length of time it takes to drop through the vertical distance between the point of
launch and the water surface and then use this time to calculate the horizontal distance moved at
constant speed. We use Equation (1.1) to calculate the length of time needed to fall 77 m, i.e. the
value of t such that
$$\frac{1}{2} g t^2 = 77.$$
This gives $t = \sqrt{\dfrac{2 \times 77}{9.81}} = 3.96$ s. During this time the marble will have moved a horizontal distance
ut. So if u = 10 m s$^{-1}$, the horizontal distance moved is 39.6 m and the coordinates of the splash
down are (39.6, 77.0).
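The two-step strategy (time of fall first, then horizontal distance) is easy to script. The following is a minimal sketch; the function name `splash_down` is an assumption made for illustration.

```python
import math


def splash_down(u, drop, g=9.81):
    """Time of fall and horizontal distance for a horizontal launch at speed u."""
    t = math.sqrt(2.0 * drop / g)   # from (1/2) g t^2 = drop
    return t, u * t                 # horizontal speed stays constant at u


t, x = splash_down(10.0, 77.0)
print(f"t = {t:.2f} s, x = {x:.1f} m")  # t = 3.96 s, x = 39.6 m
```

Changing `drop` or `u` shows how the splash-down point moves for other bridge heights or launch speeds.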
The question arises of how to deal with more general problems of a similar nature but starting from
first principles. This question leads to a fuller consideration of vector representations of motion.
Velocity and acceleration vectors

The first derivative of time-dependent position vectors may be identified as the velocity vector and
the second derivative as the acceleration vector. So, for the example of a stone falling from rest
under gravity without air resistance, given that the velocity vector is the first derivative of the position
vector,
$$\mathbf{v}(t) = \frac{d}{dt}\mathbf{r}(t) = \frac{d}{dt}\left(\frac{1}{2} g t^2\,\mathbf{j}\right) = gt\,\mathbf{j} \quad \text{(since j does not vary with t).}$$
Similarly, the acceleration vector is the second derivative of the position vector, which will be the
same as the first derivative of the velocity vector, so
$$\mathbf{a}(t) = \frac{d}{dt}\mathbf{v}(t) = \frac{d^2}{dt^2}\mathbf{r}(t) = g\,\mathbf{j}.$$
Note that this is an expected result (the acceleration is that due to gravity).
In two dimensions
$$\mathbf{v}(t) = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j}$$
and
$$\mathbf{a}(t) = \frac{d^2x}{dt^2}\,\mathbf{i} + \frac{d^2y}{dt^2}\,\mathbf{j}.$$
For the marble thrown horizontally at velocity u from the bridge
v(t) = ui + gtj
and
a(t) = gj.
Note that the horizontal and vertical parts of the velocity (or acceleration) are called the horizontal
and vertical components respectively. For the marble thrown horizontally at speed u from the bridge
the horizontal component of velocity at any time t is u and the vertical component of velocity at any
time t is gt.
Since each component of the vector is differentiated separately, the integral of the acceleration vector
may be identified with a velocity vector and the integral of the velocity vector may be identified with
a position vector. These give the same expressions as those that we started with apart from arbitrary
constants. Note that when integrating vector expressions the arbitrary constant is a constant vector.
Example 1
(a) Use integration and the variables and vectors identified in Figure 1
to derive vector expressions for the velocity and position of an object
thrown horizontally from a bridge at speed u ignoring air resistance.
(b) Find the object’s coordinates after it has dropped a distance h.
Solution
(a) The acceleration has only a vertical component, i.e. the acceleration due to gravity, so a(t) = gj.
Integrating once gives v(t) = gtj + c where c is a constant vector.
The initial velocity has only a horizontal component, so v(0) = c = ui and v(t) = ui + gtj.
Integrating again gives $\mathbf{r}(t) = ut\,\mathbf{i} + \frac{1}{2} g t^2\,\mathbf{j} + \mathbf{d}$ where d is another constant vector.
Since r(0) = 0, then d = 0 (the zero vector), so
$$\mathbf{r}(t) = ut\,\mathbf{i} + \frac{1}{2} g t^2\,\mathbf{j}$$
which is the result obtained previously as Equation (1.2) by considering the horizontal and vertical
components of motion separately.

(b) The position coordinates at any time t are $\left(ut, \frac{1}{2} g t^2\right)$.
When y(t) = h, then $\frac{1}{2} g t^2 = h$, or
$$t = \sqrt{\frac{2h}{g}}. \qquad (1.3)$$
At this value of t, $x(t) = ut = u\sqrt{\dfrac{2h}{g}}$. So the coordinates when y(t) = h are $\left(u\sqrt{\dfrac{2h}{g}},\; h\right)$.
In this Workbook you will only meet straightforward examples of vector integration where the integral
of the vector is obtained by integrating its components. More complicated vector integrals called line
integrals are introduced in Workbook 29.
3. Projectiles
Horizontal launches
Let us reflect on what has been done in the last example because it illustrates both the features of
projectile motion in the absence of air resistance and a procedure for solving mechanics problems
involving projectiles. Instead of the vector method used in Example 1, the relevant projectile motion
could have been considered in terms of the separate equations of motion in the horizontal (x-) and
vertical (y-) directions; these may be written
$$\ddot{x} = 0, \qquad \ddot{y} = g.$$
These may be solved separately but the vector method is neater since it shows horizontal and vertical
component information at the same time.
The most important features of projectile motion in the absence of air resistance are the
constant vertical acceleration and the constant horizontal speed. In projectile problems, the
usual procedure is to find the time taken to reach the vertical coordinate position of interest and then
to use this time together with the horizontal component of velocity to get the horizontal distance.
Example 2
In an apparatus to demonstrate two-dimensional projectile motion, ball bearings are
released simultaneously to roll on two identical ramps that are separated vertically.
The ramps consist of sloped and horizontal portions of the same length. The ball
bearing on the upper ramp becomes a projectile when it reaches the end of the
upper ramp while the lower ball bearing rolls along a horizontal channel when it
reaches the end of its ramp. The situation is represented in Figure 4.
(a) What is the speed of each ball bearing at the end of its ramp (A or B in Figure 4)?
(b) How does the point at which the upper ball bearing hits the lower channel vary with the
height of the upper ramp?
(c) Where will be the location of the lower ball bearing at the time at which the projectile
ball bearing hits the lower channel?
(d) What assumptions have been made in answering (a), (b) and (c)?
[Diagram: upper ramp from C to A and lower ramp from D to B, each with vertical drop d; A lies a height h above the channel that starts at B]
Figure 4: Side view of the projectile demonstrator
Solution
(a) The concepts of kinetic and potential energy and conservation of total energy may be used. At
the top of its ramp, each ball bearing will have a potential energy with respect to the bottom of
mgd, where m is its mass, g is the acceleration due to gravity and d is the vertical drop from top to bottom of the ramp.
Also it will have zero kinetic energy since it is stationary. At the bottom of the ramp, the potential
energy will be zero (as long as the thickness of the ramp is ignored) and the kinetic energy will be
$\frac{1}{2} m u^2$ where u is the magnitude of the velocity at the bottom of each ramp. So, by conservation
of energy,
$$mgd = \frac{1}{2} m u^2, \quad \text{or} \quad u = \sqrt{2gd}.$$
This will be the component of velocity in the direction of the sloping part of the ramp and, in the
absence of any losses along the ramp or at the bend where there is a sudden change in momentum,
this becomes the horizontal component of velocity at the end of the ramp.
(b) Since the ramps are identical, both ball bearings will have the same horizontal component of
velocity at the ends of their ramps. Suppose that we use coordinates x (horizontal) and y (downward
vertical) with the origin at A. The answer to Example 1(b) may be employed without having to
start from scratch. This tells us that the coordinates of the projectile ball bearing when y = h are
$\left(u\sqrt{\dfrac{2h}{g}},\; h\right)$ or, since $u = \sqrt{2gd}$, the coordinates are $\left(2\sqrt{hd},\; h\right)$. Since d is constant, this means
that the location of the point at which the projectile ball bearing hits the lower channel varies with
the square root of the height of A above B (i.e. with $\sqrt{h}$).
(c) As remarked earlier, the lower ball bearing will have the same horizontal component of velocity
(u) at B as the projectile ball bearing has at A. Consequently it will travel the same horizontal
distance in the same time as the projectile ball bearing. This means that the projectile ball bearing
should hit the lower one.
(d) In (a) the thickness of the ramps has been ignored and the bends in the ramps have been
assumed not to introduce any energy losses. In (b) air resistance has been assumed to be negligible.
In (a) and (c) rolling friction along the sloping ramp and the horizontal channel has been assumed
to be negligible. In fact the effects due to the bends in the ramps will mean that the calculation
of horizontal velocity at the end of the ramp is not accurate. However, it can be assumed that
identical bends will affect identical ball bearings identically. So the conclusion that the ball from the
upper ramp will hit the lower one is still valid (provided rolling friction for the lower ball bearing is
comparable to air resistance for the upper ball bearing).
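The results of parts (a) and (b) can be checked numerically. In the sketch below the function name `landing_point` and the sample dimensions d = 0.1 m and h = 0.4 m are hypothetical values chosen for illustration; they do not come from the Example.

```python
import math


def landing_point(d, h, g=9.81):
    """Return (speed at end of ramp, horizontal landing distance measured from A).

    d is the vertical drop of each ramp and h the height of A above the channel.
    Losses along the ramp and air resistance are ignored, as in the Example.
    """
    u = math.sqrt(2.0 * g * d)      # from m g d = (1/2) m u^2
    t = math.sqrt(2.0 * h / g)      # time to fall through height h, Equation (1.3)
    return u, u * t


u, x = landing_point(0.1, 0.4)      # hypothetical ramp dimensions in metres
print(f"u = {u:.3f} m/s, landing distance = {x:.3f} m")
print(f"2*sqrt(h*d) = {2.0 * math.sqrt(0.4 * 0.1):.3f} m")
```

The landing distance agrees with $2\sqrt{hd}$ and does not depend on the value of g, since g cancels when the two square roots are multiplied.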
Task
A crashed car is found on the beach near an unfenced part of sea wall where the
top of the wall is 18 m above the beach and the beach is level. The investigating
police officer finds that the marks in the beach resulting from the car’s impact
with the beach begin at 8 m from the wall and that the vehicle appears to have
been travelling at right-angles to the wall. Estimate how fast the vehicle must
have been travelling when it went over the wall.
Your solution
Answer
Use y measured downwards as the vertical coordinate. The vector equation of motion is
a = gj.
Integrating once gives
v = gtj + c.
The car’s initial vertical velocity component may be assumed to be zero. If the initial horizontal
component is represented by ui, then c = ui and
v = ui + gtj.
Integrating again to get position as a function of time,
$$\mathbf{r} = ut\,\mathbf{i} + \frac{1}{2} g t^2\,\mathbf{j} + \mathbf{d}.$$
In accordance with the initial condition that the vertical position is measured from the top of the
sea wall, d = 0 and
$$\mathbf{r} = ut\,\mathbf{i} + \frac{1}{2} g t^2\,\mathbf{j}.$$
Now consider vertical motion only. When y = 18.0,
$$t = \sqrt{\frac{2 \times 18}{9.81}} = 1.916.$$
So the car is predicted to hit the beach after 1.916 s. Next consider horizontal motion. During 1.916
s, in the absence of air resistance, the car is predicted to move a horizontal distance of u × 1.916.
This distance is given as 8 m. So
8 = 1.916u
or
u = 4.175.
So the car is estimated to have left the sea wall at a speed of just over 4 m s$^{-1}$ (about 15 km/h).
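The same estimate can be obtained in code by working backwards from the impact distance. This is a sketch under the assumptions of the Task (horizontal launch, no air resistance); the function name `launch_speed` is illustrative. Because the time of fall is not rounded before dividing, the result differs from 4.175 in the third decimal place.

```python
import math


def launch_speed(distance, drop, g=9.81):
    """Horizontal launch speed needed to land `distance` away after falling `drop`."""
    t = math.sqrt(2.0 * drop / g)   # time to fall through the height of the wall
    return distance / t             # horizontal speed is constant during the fall


u = launch_speed(8.0, 18.0)
print(f"u = {u:.2f} m/s, i.e. about {3.6 * u:.0f} km/h")
```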
There are several complications that may arise when studying and modelling projectile motion. Launch
at some angle other than horizontal is the main consideration in the remainder of this Section. For
a given launch speed it is possible to find more than one trajectory that can pass through the same
target location. Another complication results from launch at a location that is not the origin of the
coordinate system used for modelling the motion.
Angled launches
Vector equations may be used to obtain the position and velocity of a projectile as a function of time
if the object is not launched horizontally but with some arbitrary velocity. We shall start by modelling
an angled launch from and to a horizontal ground plane. Again it is sensible to use the launch point
as the origin of the coordinate system employed. Ignoring air resistance, we shall find expressions for
the velocity and position vectors at time t of an object that is launched from ground level (r = 0)
at velocity u with direction θ above the horizontal. Subsequently we shall find an expression for the
time at which the object will hit the ground and the horizontal distance it will have travelled in this
time. Finally we will find the coordinates of the highest point on its trajectory (see Figure 5), the
angle that will give maximum range and an expression for maximum range in terms of the magnitude
and angle of the launch velocity.
[Diagram: parabolic path launched at angle θ to the x-axis, with the maximum height marked]
Figure 5: Path of a projectile after an angled launch from ground level

We use an upward-pointing vertical vector in all of the remaining projectile problems in this Workbook.
We start from first principles using Newton’s second law in vector form:
F = ma. (1.4)
This time, since the initial motion is upward, we choose to point the unit vector j upward and so the
weight of the projectile may be expressed as W = −mgj. Ignoring air resistance, the weight is the
only force on the projectile, so
ma = −mgj. (1.5)
Note the minus sign which is a result of the choice of direction for j. After dividing through by m,
we obtain the vector equation for the acceleration due to gravity: a = −gj.
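Recall that $\mathbf{a} = \dfrac{d\mathbf{v}}{dt}$ so $\dfrac{d\mathbf{v}}{dt} = -g\,\mathbf{j}$.
Integrating this gives
$$\mathbf{v}(t) = -gt\,\mathbf{j} + \mathbf{c}.$$
Integrating again gives
$$\mathbf{r}(t) = -\frac{1}{2} g t^2\,\mathbf{j} + t\mathbf{c} + \mathbf{d}. \qquad (1.6)$$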
Since r(0) = 0, d = 0.
Figure 6: Components of launch velocity (u cos θ along i, u sin θ along j)

The initial velocity may be expressed in vector form. Recall from Workbook 9 that the component of a
vector along a specific direction is given by the dot product of the vector with the unit vector in the
direction of interest. The dot product involves the cosine of the angle between the vectors.
v(0) = c = u cos θi + u sin θj (from Figure 6).
Hence
v(t) = u cos θi + (u sin θ − gt)j (1.7)
and

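$$\mathbf{r}(t) = -\tfrac{1}{2}gt^2\,\mathbf{j} + t\mathbf{c} + \mathbf{d} = ut\cos\theta\,\mathbf{i} + \left(ut\sin\theta - \tfrac{1}{2}gt^2\right)\mathbf{j}. \tag{1.8}$$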
The vertical component of the position will be zero at the launch and when the projectile hits the
ground. This will be true when
$$ut\sin\theta - \tfrac{1}{2}gt^2 = 0 \quad\text{so}\quad t\left(u\sin\theta - \tfrac{1}{2}gt\right) = 0.$$
This gives t = 0, as it should, or
$$t = \frac{2u\sin\theta}{g}.$$
At this value of t the horizontal position coordinate ut cos θ will give the horizontal range, R, so
$$R = u\cos\theta\left(\frac{2u\sin\theta}{g}\right) = \frac{2u^2\sin\theta\cos\theta}{g},$$
or, since sin(2θ) = 2 sin θ cos θ,
$$R = \frac{u^2\sin 2\theta}{g}. \tag{1.9}$$
From this result it is possible to deduce that the maximum range, $R_{\max}$, of a projectile measured
at the same vertical level as its launch occurs when sin(2θ) has its maximum value, which is 1,
corresponding to θ = 45°.
So, the maximum range is given by
$$R_{\max} = \frac{u^2}{g}. \tag{1.10}$$
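As a quick numerical check of (1.9) and (1.10), the following Python sketch scans a set of launch angles and compares the best range found with u²/g. The function name `horizontal_range`, the value g = 9.81 m s⁻² and the launch speed of 20 m s⁻¹ are illustrative assumptions, not values taken from the text.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def horizontal_range(u, theta_deg, g=G):
    """Range on level ground from (1.9): R = u^2 sin(2*theta) / g."""
    return u**2 * math.sin(2 * math.radians(theta_deg)) / g


u = 20.0  # hypothetical launch speed in m s^-1
# Scan launch angles from 5 to 85 degrees in 5 degree steps.
ranges = {angle: horizontal_range(u, angle) for angle in range(5, 90, 5)}
best_angle = max(ranges, key=ranges.get)
r_max = u**2 / G  # maximum range from (1.10)

print(f"best angle in the scan: {best_angle} degrees")
print(f"range at that angle: {ranges[best_angle]:.2f} m, u^2/g = {r_max:.2f} m")
```

The scan also shows that complementary angles (for example 30° and 60°) give the same range, since sin 2θ takes the same value for both.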
From (1.8), the height (y coordinate) at any time is given by
$$y(t) = ut\sin\theta - \tfrac{1}{2}gt^2.$$
To find the maximum height, we can find the value of t at which ẏ(t) = 0. Once we have this
value of t we can substitute in the expression for y to find the corresponding value of y. Note that
the condition ẏ(t) = 0 at maximum height is the same as asserting that the vertical component of
velocity must be zero at the maximum height. Hence, by differentiating y(t) above, or from (1.7),
it is required that
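$$u\sin\theta - gt = 0, \quad\text{which gives}\quad t = \frac{u\sin\theta}{g}.$$
The height for any t is given by $y(t) = ut\sin\theta - \tfrac{1}{2}gt^2$. After substituting $t = \dfrac{u\sin\theta}{g}$ this becomes
$$y = \frac{u^2\sin^2\theta}{g} - \frac{u^2\sin^2\theta}{2g} = \frac{u^2\sin^2\theta}{2g}.$$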
Note that the time at which the projectile reaches its maximum height is exactly half the total time of
flight ($t = \dfrac{2u\sin\theta}{g}$). In the absence of air resistance, the trajectory of the object will be a parabola
with maximum height at its vertex, which will occur halfway between the launch and the landing, i.e.
halfway through its flight. The horizontal coordinate of this point will be
$$\frac{R}{2} = \frac{u^2\sin 2\theta}{2g}.$$
So the coordinates of the maximum height are
$$\left(\frac{u^2\sin 2\theta}{2g},\ \frac{u^2\sin^2\theta}{2g}\right). \tag{1.11}$$
If the trajectory corresponds to maximum range, i.e. θ = 45° (which means that $\sin^2\theta = \tfrac{1}{2}$), then
the maximum height is $\dfrac{u^2}{4g}$ at a horizontal distance of $\dfrac{u^2}{2g}$ and the maximum range is $\dfrac{u^2}{g}$. Although
several of these results are useful, particularly (1.10) for maximum range, and are worth committing
to memory, it is more important to remember the method for deriving them from first principles.
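The results for the time of flight and for the coordinates of the highest point (1.11) can be checked against the position components in (1.8). The sketch below is one way to do this in Python; the function names and the sample values u = 15 m s⁻¹ and θ = 35° are assumptions chosen only for illustration.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def position(u, theta_deg, t, g=G):
    """(x, y) at time t from the i and j components of (1.8)."""
    theta = math.radians(theta_deg)
    return u * t * math.cos(theta), u * t * math.sin(theta) - 0.5 * g * t**2


def apex_and_flight_time(u, theta_deg, g=G):
    """Total time of flight and the apex coordinates given by (1.11)."""
    theta = math.radians(theta_deg)
    t_flight = 2 * u * math.sin(theta) / g
    x_apex = u**2 * math.sin(2 * theta) / (2 * g)
    y_apex = u**2 * math.sin(theta) ** 2 / (2 * g)
    return t_flight, x_apex, y_apex


u, theta_deg = 15.0, 35.0  # hypothetical launch speed and angle
t_flight, x_apex, y_apex = apex_and_flight_time(u, theta_deg)

# Halfway through the flight the projectile should be at the apex,
# and at the end of the flight it should be back at y = 0.
x_half, y_half = position(u, theta_deg, t_flight / 2)
x_land, y_land = position(u, theta_deg, t_flight)

print(f"apex from (1.11): ({x_apex:.3f}, {y_apex:.3f})")
print(f"position at half the flight time: ({x_half:.3f}, {y_half:.3f})")
print(f"landing point: ({x_land:.3f}, {y_land:.3f})")
```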
Example 3
During a particular downhill run a skier encounters a short but sharp rise that causes
the skier to leave the ground at 25 m s−1 at an angle of 30◦ to the horizontal.
The ground immediately beyond the rise is flat for 60 m. Beyond this the downhill
slope continues. See Figure 7. Ignoring air resistance, will the skier land on the
flat ground beyond the rise?
Figure 7: Coordinate system for skiing example (launch velocity u at 30° above the x-axis; flat ground extends 60 m beyond the rise)
Solution
In this Example, from (1.9), R = 625 sin(60°)/g = 55.2 m. So the skier will land on the flat part of
the slope. The horizontal range achieved will be reduced by air resistance but increased if the skier
is able to exploit aerodynamic lift from the skis during flight. In the absence of these effects, an
effort to leave the ground at a slightly faster speed would be rewarded with the possibility of landing
on the continuation of the downhill slope which may be an advantage in racing since it might reduce
the interruption caused by the rise while picking up speed. A speed of 26.5 m s−1 at the same angle
would mean that the skier lands beyond the flat ground at a range of 62 m from the rise.
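The ranges discussed in Example 3 can be recomputed directly from (1.9). The short Python sketch below does this for the two launch speeds mentioned; the function name and the value g = 9.81 m s⁻² are assumptions.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def horizontal_range(u, theta_deg, g=G):
    """Range on level ground from (1.9): R = u^2 sin(2*theta) / g."""
    return u**2 * math.sin(2 * math.radians(theta_deg)) / g


flat_length = 60.0  # length of flat ground beyond the rise, in m

for speed in (25.0, 26.5):
    r = horizontal_range(speed, 30.0)
    where = "on the flat ground" if r <= flat_length else "beyond the flat ground"
    print(f"u = {speed} m/s: range = {r:.1f} m, lands {where}")
```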
A similar method to that used when considering a projectile launched from ‘ground’ level may be used
if the projectile is launched at some height above the chosen origin of coordinates. If air resistance
is ignored, the governing vector acceleration is still
a(t) = −gj
or
$$\frac{d\mathbf{v}}{dt} = -g\mathbf{j}.$$
Integrating this
v(t) = −gtj + c.
Integrating again
$$\mathbf{r}(t) = -\tfrac{1}{2}gt^2\,\mathbf{j} + t\mathbf{c} + \mathbf{d}.$$
This time, instead of being launched from y = 0, the projectile is launched from y = H. So
r(0) = Hj, and hence d = Hj. As before, the initial velocity may be expressed in vector form.
v(0) = c = u cos θi + u sin θj.
Hence v(t) = u cos θi + (u sin θ − gt)j, which is the same as (1.7). But now
$$\mathbf{r}(t) = ut\cos\theta\,\mathbf{i} + \left(ut\sin\theta - \tfrac{1}{2}gt^2 + H\right)\mathbf{j} \tag{1.12}$$
which differs from (1.8) by the extra H in the j component. Note that this single vector equation
for r(t) may be expressed as two separate equations for x(t) and y(t):
$$x(t) = ut\cos\theta, \qquad y(t) = ut\sin\theta - \tfrac{1}{2}gt^2 + H.$$
Example 4
A stone is thrown upwards at 45◦ from a height of 1.5 m above flat ground and
lands on the ground at a distance of 30 m from the point of launch. Ignoring air
resistance, calculate the speed at launch.
Solution
For the particular case of interest θ = 45° and the stone lands at a horizontal distance of 30 m from
the point of launch. Using these values in the x(t) component of (1.12) gives $30 = \dfrac{ut}{\sqrt{2}}$. So the
time for which the stone is in the air is $30\sqrt{2}/u$. Substitution of this time, at which y = 0, into the
y(t) component of (1.12) gives
$$0 = 30 - g\left(\frac{30}{u}\right)^2 + 1.5 \quad\text{or}\quad u = 30\sqrt{\frac{g}{31.5}} = 16.739\ \text{m s}^{-1}.$$
The speed of release is about 17 m s⁻¹.
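The calculation in Example 4 generalises to any launch angle and launch height. The Python sketch below eliminates t from the components of (1.12) and solves for u; the function name `launch_speed` and the value g = 9.81 m s⁻² are assumptions, so the last decimal places may differ slightly from a calculation made with another value of g.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def launch_speed(x_land, launch_height, theta_deg, g=G):
    """Speed needed to land at x_land on the ground (y = 0) from height H.

    Putting t = x / (u cos(theta)) into the y component of (1.12) gives
    0 = x tan(theta) - g x^2 / (2 u^2 cos^2(theta)) + H, which is solved for u.
    """
    theta = math.radians(theta_deg)
    drop = x_land * math.tan(theta) + launch_height
    return (x_land / math.cos(theta)) * math.sqrt(g / (2 * drop))


# Values from Example 4: 45 degree launch from 1.5 m, landing 30 m away.
u = launch_speed(30.0, 1.5, 45.0)

# Check: at the landing time the height from (1.12) should be zero.
theta = math.radians(45.0)
t_land = 30.0 / (u * math.cos(theta))
y_land = u * t_land * math.sin(theta) - 0.5 * G * t_land**2 + 1.5

print(f"launch speed = {u:.2f} m/s, height at landing = {y_land:.2e} m")
```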
Choosing trajectories
So far most of the projectiles we have considered have been launched without any particular control
or target. However there are many instances in sport and recreation where there are clearly defined
targets for the projectile motion and the trajectory is controlled through the speed and angle of
launch. As we shall discover by considering several examples, it is possible to choose more than one
path to achieve a given target. First we model a case in which the choice of angle is important.
Consider a projectile launched at an angle θ to the horizontal. If we take the origin of coordinates at
the launch point, then, according to (1.8), the x- and y-coordinates of the projectile at time t are
$$x(t) = ut\cos\theta, \qquad y(t) = ut\sin\theta - \tfrac{1}{2}gt^2.$$
Although the path is characterised parametrically in terms of t by these expressions, if we are given
y or x or both instead of t, then it is useful to be able to express y in terms of x. We shall eliminate
t by substituting $t = \dfrac{x}{u\cos\theta}$ (which can be deduced from the first equation) in the second equation
to give
$$y = x\tan\theta - \tfrac{1}{2}g\left(\frac{x}{u\cos\theta}\right)^2.$$
By using $\dfrac{1}{\cos^2\theta} = \sec^2\theta = 1 + \tan^2\theta$, the equation for y may be written in the form:
$$y = x\tan\theta - \tfrac{1}{2}g\left(\frac{x}{u}\right)^2\left(1 + \tan^2\theta\right). \tag{1.13}$$
Figure 8: A projectile achieves a given height at two different ranges (x1 and x2)

For given values of u, θ and y, Equation (1.13) represents a quadratic in x. For a given speed and
angle of launch, a given height is achieved at two different values of x. This is a consequence of
the parabolic form of the trajectory (Figure 8). For given values of u, y and x, i.e. given a launch
velocity and a target location, Equation (1.13) becomes a quadratic in tan θ i.e.
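$$\frac{g}{2}\left(\frac{x}{u}\right)^2\tan^2\theta - x\tan\theta + \frac{g}{2}\left(\frac{x}{u}\right)^2 + y = 0. \tag{1.14}$$
Recall the condition $b^2 > 4ac$ for the quadratic $at^2 + bt + c = 0$ to have real roots. As long as
$$x^2 > 2g\left(\frac{x}{u}\right)^2\left[\frac{g}{2}\left(\frac{x}{u}\right)^2 + y\right] \tag{1.15}$$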
the quadratic will have two real roots.
Figure 9: For a given speed and target location two different angles (θ1 and θ2) will attain the target
This means that two different choices of angle of launch will cause the projectile to pass through
given coordinates (x, y); this is illustrated in Figure 9. If the projectile is launched from y = H in the
chosen coordinate system, then the x(t) part of Equation (1.12) leads to the substitution $t = \dfrac{x}{u\cos\theta}$
as before, but the equation for y becomes
$$y = x\tan\theta - \tfrac{1}{2}g\left(\frac{x}{u}\right)^2\left(1 + \tan^2\theta\right) + H \tag{1.16}$$
which differs from (1.13) by the addition of H to the right-hand side.
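The choice between the two launch angles can be explored numerically by solving the quadratic in tan θ that follows from (1.16) (which reduces to (1.14) when H = 0). The Python sketch below is a minimal illustration; the function names, the value of g, and the sample speed and target are assumptions, not data from the text.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def launch_angles(u, x, y, launch_height=0.0, g=G):
    """Launch angles in degrees (ascending) whose trajectory passes through (x, y).

    With T = tan(theta) and a = (g/2)(x/u)^2, Equation (1.16) rearranges to
    a T^2 - x T + (a + y - H) = 0. An empty list means the target is out of reach.
    Assumes x > 0.
    """
    a = 0.5 * g * (x / u) ** 2
    c = a + y - launch_height
    disc = x**2 - 4 * a * c  # the real-root condition, cf. (1.15)
    if disc < 0:
        return []
    roots = [(x - math.sqrt(disc)) / (2 * a), (x + math.sqrt(disc)) / (2 * a)]
    return [math.degrees(math.atan(t)) for t in roots]


def height_at(u, theta_deg, x, launch_height=0.0, g=G):
    """Height of the trajectory at horizontal position x, from (1.16)."""
    t = math.tan(math.radians(theta_deg))
    return x * t - 0.5 * g * (x / u) ** 2 * (1 + t**2) + launch_height


# Hypothetical case: launch at 10 m/s from the origin, target at (5, 1).
angles = launch_angles(10.0, 5.0, 1.0)
for angle in angles:
    print(f"theta = {angle:.2f} degrees, height at x = 5: {height_at(10.0, angle, 5.0):.4f} m")

# A launch that is far too slow has no solution.
print(launch_angles(1.0, 5.0, 1.0))
```

Returning an empty list when the discriminant is negative mirrors condition (1.15): if the condition fails, no launch angle reaches the target at that speed.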
Next we will look at two Examples of projectiles with chosen trajectories. In the first Example
the influence of initial speed on the trajectory is important; in the second the influence of angle is
important.
Example 5
(choosing speed)
As a result of many years of practice, a university teacher is skilled at throwing
screwed-up sheets of paper, containing unsatisfactory attempts at setting examination
questions, into a cylindrical waste paper bin. She throws at an angle of 20°
above the horizontal and from 1.5 m above the floor. More often than not, the
paper balls land in the bin which is 0.2 m high and has a radius of 0.15 m. The
bin is placed so that its nearest edge is 3.0 m away (in a horizontal direction) from
the point of launch. Model the paper ball as a projectile. Ignore air resistance and
calculate the speed of throw that will result in the paper ball entering the bin at
the centre of its open end.
Solution
We can choose the origin at the point of launch, with x- and y-axes as before (see Figure 10). In
this case, we need to use the position vector (Equation (1.8)) and find the condition on the speed
for the throw to be on target. In particular, we are given that θ = 20◦ and the location of the
bin and need to determine the speed of throw necessary for the paper projectile to pass through
the centre of the open end of the bin. The centre of the open end of the bin has the coordinates
(3.15, −1.3) with respect to the chosen origin. Note the negative value of the vertical coordinate
since the top of the bin is located 1.3 m below the chosen origin.
Figure 10: Path of screwed-up-paper projectiles (launch at 20° from 1.5 m above the floor; bin 0.2 m high and 0.3 m across, with its nearest edge 3.0 m from the launch point)
At the centre of the bin, using Equation (1.8) and the horizontal position coordinate, we have
$$ut\cos 20^\circ = 3.15, \quad\text{so}\quad t = \frac{3.15}{u\cos 20^\circ}.$$
Also, from (1.8) and the vertical coordinate,
$$ut\sin 20^\circ - \tfrac{1}{2}gt^2 = -1.3.$$
Using g = 9.81 m s⁻², and the expression for t, gives
$$3.15\tan 20^\circ - \tfrac{1}{2}(9.81)\frac{(3.15)^2}{u^2}\sec^2 20^\circ = -1.3,$$
which means that
$$u = 3.15\sec 20^\circ\sqrt{\frac{0.5(9.81)}{1.3 + 3.15\tan 20^\circ}}.$$
Hence u = 4.746 m s⁻¹, i.e. the academic throws the screwed-up paper at about 4.75 m s⁻¹.
Clearly the motion of screwed up pieces of paper will depend to a significant extent on air resistance.
We shall consider how to model resisted motion later (Section 34.3).
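The speed found in Example 5 can be reproduced by solving the trajectory equation (1.13) for u at a fixed angle. The Python sketch below is one possible form; the function name `speed_for_target` is an assumption, and the origin is taken at the launch point as in the Example.

```python
import math

G = 9.81  # value of g used in Example 5, in m s^-2


def speed_for_target(x, y, theta_deg, g=G):
    """Launch speed for which the trajectory from the origin passes through (x, y).

    Rearranging (1.13) gives u = x sec(theta) sqrt(g / (2 (x tan(theta) - y))),
    which requires the target to lie below the initial line of flight.
    """
    theta = math.radians(theta_deg)
    gap = x * math.tan(theta) - y
    if gap <= 0:
        raise ValueError("target is on or above the line of launch; no speed reaches it")
    return (x / math.cos(theta)) * math.sqrt(g / (2 * gap))


# Centre of the open end of the bin relative to the launch point.
x_bin, y_bin, theta_deg = 3.15, -1.3, 20.0
u = speed_for_target(x_bin, y_bin, theta_deg)

# Check: substitute back into (1.13).
t = math.tan(math.radians(theta_deg))
y_check = x_bin * t - 0.5 * G * (x_bin / u) ** 2 * (1 + t**2)

print(f"required speed = {u:.3f} m/s, height at x = 3.15: {y_check:.3f} m")
```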
Task
According to the model developed above, for what range of throwing speeds will
the academic be successful in getting the paper ball into the bin?
Your solution
Answer
Assume that the time of flight of the screwed-up paper balls and the angle of throw do not change.
The permitted variation in throwing speed is determined by the horizontal distance ut cos 20°. The
screwed-up paper ball will not find the bin if this product is less than 3 m or more than 3.3 m. Using
t = 3.15/(4.75 cos 20°) with ut cos 20° ranging from 3 to 3.3 gives 4.524 ≤ u ≤ 4.976.
Example 6
(choosing angle)
In the game of Tiddly-winks, small plastic discs or counters (‘tiddly-winks’ or
‘winks’) are caused to spring into the air by exerting sharp downward pressure
at their edges with another (usually larger) disc called a squidger. By changing
the pressure and overlap of the larger disc it is possible to control the velocity at
launch of each wink. The object of the game is to ‘pot’ all of the winks into a
cup or cylindrical receptacle before your opponent does. One important skill when
‘potting’ is to be able to clear the edge of the collecting cup with the winks. For a
given speed at launch, an experienced or successful player will know how the path
changes with angle. Suppose that air resistance can be ignored and that a wink
may be modelled as a point object and its spin may be ignored.
Given that the nearest edge of the cup is 0.05 m high,
(a) Calculate (i) the speed of launch such that the maximum height on the maximum range trajectory
is 0.05 m and (ii) the associated maximum range.
(b) Given a launch speed that is 0.1 m s−1 faster than that calculated in (a) find the angle of launch
that is likely to be successful for potting the wink when the centre of the cup is 0.1 m from the point
of launch.
Solution
(a) The expression for maximum height (Equation (1.11)) may be used to calculate a corresponding
speed of launch.
Hence $\dfrac{u^2}{4g} = 0.05$, or $u = \sqrt{0.2g} = 1.4$, so the required speed of launch is 1.4 m s⁻¹.
For this trajectory the maximum height is reached at a horizontal distance of $\dfrac{u^2}{2g} = 0.1$ m from
launch and the maximum range is 0.2 m (see Figure 11).
Figure 11: Maximum range trajectory of wink achieving a maximum height of 0.05 m (y(x) plotted against x for 0 ≤ x ≤ 0.2)
(b) Although the wink, travelling on the trajectory shown in Figure 11 would reach the edge of the
cup i.e. a height of 0.05 m at 0.1 m range, it would not necessarily enter the cup. The finite size
of the wink might mean that it hits the edge of the cup and falls back. The wink is more likely to
enter the cup if it is descending when it encounters the cup.
In this case the launch speed and target coordinates are specified so Equation (1.16) can be brought
into play. If we take x = 0.1 m, u = 1.5 m s⁻¹, and y (= H) = 0.05 m, then it turns out that
condition (1.15) is satisfied ($x^2 = 0.01$ and $2g\left(\dfrac{x}{u}\right)^2\left[\dfrac{g}{2}\left(\dfrac{x}{u}\right)^2 + y\right] = 0.006$) and there are two
values for θ, which are 41.5° and 71.8°.
The corresponding trajectories y1(x) and y2(x) are shown in Figure 12. The smaller angle results
in the shallower trajectory (solid line). The larger angle produces the required result (dotted line)
that the wink is descending at x = 0.1 m and hence is more likely to enter the cup. This assumes
that the cup is at least 2 cm wide.
Figure 12: Two trajectories, y1(x) and y2(x), corresponding to x = 0.1, y = 0.05, u = 1.5 m s⁻¹ that pass through (0.1, 0.05)
Example 7
(choosing the angle again)
An engineering student happens to be a fine shot-putter. At a tutorial on projectiles
he argues that, because he throws from a height of about 2 m, he needs to launch
the shot at an angle other than 45◦ to get the greatest range. He claims that
when launching at 45◦ the furthest he can put the shot is to a horizontal distance
of 17 m from the launch.
(a) Calculate the speed at which he releases the shot at 45◦ ignoring air
resistance.
(b) Write down an equation for the trajectory of the shot, assuming that
the shot is released always at the maximum speed calculated in (a).
Set the vertical coordinate to zero and substitute the constant L for
the maximum range at the height of launch to obtain a quadratic for
the horizontal range R.
(c) Hence, by differentiating the resulting equation with respect to θ,
calculate the optimum angle of launch and the maximum range.
Solution
(a) For the particular case of interest θ = 45° and the shot lands at a horizontal distance of
17 m from the point of launch. Using these values, (1.12) gives
$$17 = ut\cos 45^\circ = \frac{ut}{\sqrt{2}}.$$
So the time for which the shot is in the air, i.e. before it lands, is $17\sqrt{2}/u$. Substitution
of this time (at which y = 0) into the y part of (1.12) gives
$$0 = 17 - g\left(\frac{17}{u}\right)^2 + 2 \quad\text{or}\quad u = 17\sqrt{\frac{g}{19}} = 12.2\ \text{m s}^{-1}.$$
The speed of release is about 12 m s⁻¹. According to the shot-putter this is more or less
his maximum speed of release.
(b) The general equation for the trajectory of the shot is (1.16)
$$y = x\tan\theta - \tfrac{1}{2}g\left(\frac{x}{u}\right)^2\left(1 + \tan^2\theta\right) + H.$$
Given that the maximum speed of release and optimum angle of launch are employed,
the shot should land at the maximum range, R. From the general Equation (1.16), with
x = R and y = 0, we have
$$0 = R\tan\theta - \tfrac{1}{2}g\left(\frac{R}{u}\right)^2\left(1 + \tan^2\theta\right) + H.$$
Since $\dfrac{u^2}{g}$ does not depend on either R or θ, we replace it by a constant L, and rearrange the
equation into the usual form for a quadratic in R:
$$0 = -\frac{R^2}{2L}\left(1 + \tan^2\theta\right) + R\tan\theta + H.$$
(c) The optimum angle of launch is found by obtaining an expression for R and setting $\dfrac{dR}{d\theta}$ equal
to zero. As you can imagine, the expression for R resulting from solving this quadratic is rather
complicated and nasty to differentiate. An alternative approach is called implicit differentiation
(see Section 11.7). We work through the equation as it stands, differentiating term by term with
respect to θ and making use of the relationship
$$\frac{df(R)}{d\theta} = \frac{df(R)}{dR}\times\frac{dR}{d\theta}.$$
(For example, $\dfrac{d(R^2)}{d\theta} = 2R\dfrac{dR}{d\theta}$.) Hence implicit differentiation gives
$$0 = -\frac{1}{2L}\left(2R\frac{dR}{d\theta}\right)\left(1 + \tan^2\theta\right) - \frac{R^2}{2L}\left(2\tan\theta\sec^2\theta\right) + \frac{dR}{d\theta}\tan\theta + R\sec^2\theta.$$
At the maximum range, $\dfrac{dR}{d\theta} = 0$:
$$0 = -\frac{R^2}{2L}\left(2\tan\theta\sec^2\theta\right) + R\sec^2\theta \quad\text{so}\quad R\sec^2\theta\left(-\frac{R}{L}\tan\theta + 1\right) = 0.$$
Since sec2 θ cannot be zero and the option of R = 0 is not very interesting, it is possible to conclude
that the relationship between the optimum angle of launch and the maximum range is given by
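$$\tan\theta = \frac{L}{R}.$$
This result may be substituted back into the quadratic for R to give
$$0 = -\frac{R^2}{2L}\left(1 + \frac{L^2}{R^2}\right) + L + H \quad\text{or}\quad 0 = -\frac{R^2}{2L} - \frac{L}{2} + L + H.$$
Multiplying throughout by 2L gives
$$R^2 = L^2 + 2LH, \quad\text{i.e.}\quad R = \sqrt{L^2 + 2LH}.$$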
A consequence of this result is that R > L. Bearing in mind that L = u2 /g is the maximum range
at the height of the launch (or for a launch at ground level), this means that the maximum range
from the elevated launch to ground level is greater than the maximum range in the plane at the
height of the launch. Substituting the result for R in the result for tan θ gives
$$\tan\theta = \frac{L}{\sqrt{L^2 + 2HL}} = \frac{1}{\sqrt{1 + \dfrac{2H}{L}}}.$$
For H > 0, this implies that tan θ < 1, which in turn implies an optimum angle of launch < 45◦
and that the shot putter’s assertion is justified. When a projectile is launched at some angle from
some point above ground level to land on the ground then the optimum angle of launch is less
than 45◦ . Specifically, if u = 12.2 m s−1 and H = 2 m, then L = 15.2 m, R = 17.06 m and
tan θ = 0.9 (θ = 41.7◦ ).
The 45◦ launch trajectory and optimum angle launch trajectory are shown in Figure 13 together
with the maximum range trajectory for a ground level launch. A close up of the ends of the first
two trajectories is shown in Figure 14. The shot-putter can increase the length of his putt only by
a few centimetres if he putts at the optimum angle of launch rather than 45◦ . However these could
be a vital few centimetres in a tight competition!
Figure 13: 45° trajectory (solid line) and optimum angle trajectories for a shot-put (curves y0(x), y1(x) and y2(x); x from 0 to 20)
Figure 14: Close up of ends of trajectories in Figure 13 (curves y1(x) and y2(x); x from 15 to 17)
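The closed-form results of Example 7, R = √(L² + 2LH) and tan θ = L/R with L = u²/g, can be compared with a brute-force scan over launch angles. The Python sketch below is illustrative only; the function names, the scan step of 0.01° and the value g = 9.81 m s⁻² are assumptions.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def elevated_range(u, launch_height, theta_deg, g=G):
    """Horizontal distance to landing (y = 0) for a launch from height H.

    Uses the positive root of the y component of (1.12) set to zero.
    """
    theta = math.radians(theta_deg)
    vx, vy = u * math.cos(theta), u * math.sin(theta)
    t_land = (vy + math.sqrt(vy**2 + 2 * g * launch_height)) / g
    return vx * t_land


def optimum_launch(u, launch_height, g=G):
    """Optimum angle (degrees) and maximum range from the results of Example 7."""
    big_l = u**2 / g
    r_max = math.sqrt(big_l**2 + 2 * big_l * launch_height)
    return math.degrees(math.atan(big_l / r_max)), r_max


u, H = 12.2, 2.0  # values used in Example 7
theta_opt, r_opt = optimum_launch(u, H)

# Brute-force scan from 0.01 to 89.99 degrees in steps of 0.01 degrees.
scan_angle = max((k / 100 for k in range(1, 9000)), key=lambda a: elevated_range(u, H, a))

print(f"closed form: theta = {theta_opt:.2f} degrees, R = {r_opt:.2f} m")
print(f"scan:        theta = {scan_angle:.2f} degrees, R = {elevated_range(u, H, scan_angle):.2f} m")
print(f"range at 45 degrees: {elevated_range(u, H, 45.0):.2f} m")
```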
Task
A fairground stall known as a ‘coconut shy’ consists of an array of coconuts placed
on stands. The objective is to win a coconut by knocking it off its stand with a
wooden ball. A local youth has learned that if he throws a wooden ball as fast as
he can at 10◦ above the horizontal he is able to hit the nearest coconut more or
less dead centre and knock it down almost every time. The nearest coconut stand
is located 4 m from the throwing position with its top at the same height as the
balls are thrown. The coconuts are 0.1 m long.
(a) Calculate how fast the youth is able to throw if air resistance is ignored:
Your solution
Answer
Choose an origin of coordinates at the point of launch of the wooden balls, with the y-coordinate
in the vertical upwards direction and the x-coordinate along the horizontal towards the coconut.
Diagram: Possible trajectory of ball at fairground ‘coconut shy’ stall (origin O at the launch point; coconut 0.1 m long at x = 4 m)

If air resistance is ignored, the trajectory of the balls may be modelled by the Equation (1.13)
$$y = x\tan\theta - \tfrac{1}{2}g\left(\frac{x}{u}\right)^2\left(1 + \tan^2\theta\right).$$
If the balls hit the coconut dead centre, then their trajectory must pass through the coordinates (4,
0.05) (see diagram above). Hence
$$\left(\frac{4}{u}\right)^2 = \frac{2\left(4\tan 10^\circ - 0.05\right)}{g\left(1 + \tan^2 10^\circ\right)} \quad\text{or}\quad u = 4\sqrt{\frac{g\left(1 + \tan^2 10^\circ\right)}{2\left(4\tan 10^\circ - 0.05\right)}} = 11.1.$$
So the youth is able to throw at 11.1 m s⁻¹.
(b) Calculate how much further the operator of the fairground stall should move the coconuts from
the throwing line to prevent the youth hitting the coconut so easily:
Your solution
Answer
The youth will fail if the nearest coconut is moved sufficiently far away so that the trajectory
considered in part (a) passes beneath the bottom of the coconut on top of the stand. If x is the
horizontal range corresponding to a y-coordinate of 0, then, using the expression in Equation (1.9):
$$x = \frac{u^2\sin 2\theta}{g}. \quad\text{In this case,}\quad x = \frac{(11.1)^2\sin 20^\circ}{9.81} = 4.296.$$
So the nearest coconut stand should be moved another 0.296 m from the throwing line. This has
assumed that the youth will favour as ‘flat’ a trajectory as possible. The youth could choose to
throw at a greater angle to increase the range. For example throwing at an angle of 45◦ would
result in a range of 12.6 m. However the steeper the angle of launch, the greater will be the angle
to the horizontal at which the ball arrives at the coconut. A large angle would not be as efficient
as a small one for dislodging it.
Task
Basketball players are able to gain three points for long-range shooting. The shot
must be made from outside a certain radius from the basket. A skilled player
makes a jump shot rather than standing on the ground to shoot. He leaps so that
he is able to project the ball at a slower, i.e. more controllable, speed and from
the same height as the basket, which is 3 m above the ground. Assume that the
ball would be released from a height of 2 m when the player is standing on the
ground and that air resistance can be ignored.
(a) Calculate the speed of release during a jump shot made at a horizontal distance of 12 m from
the basket at maximum range for that speed of release:
Your solution
Answer
If the maximum range is 12 m, then, since the jump shot is made at the same height as the basket,
$u^2/g = 12$, i.e. u = 10.85, so the speed of release is 10.85 m s⁻¹.
(b) Calculate the preferred angle of launch that would hit the basket if the shot were to be made
when the player is standing at the same point and shoots at 12 m s−1 :
Your solution
Answer
Use may be made of Equation (1.16), in the form of a quadratic for tan θ, where θ is the launch
angle, with y = 3, x = 12, H = 2. The larger root corresponds to the preferred form of trajectory
for passing into the basket (see diagram below) and gives tan θ = 1.764 or θ = 60.5°.
Diagram: Trajectories for a jump-shot (solid line) and a shot-from-the-floor (dashed line) (height in m against distance in m, from 0 to 12 m)
4. Energy and projectile motion
In this Section we demonstrate that, in the absence of air resistance, energy is conserved during the
flight of a projectile. Consider first the launch of an object vertically with speed u. In the absence
of air resistance, the height reached by the object is given by the result obtained in Equation (1.11),
i.e. $\dfrac{u^2\sin^2\theta}{2g}$ with θ = 90°, so the height is $\dfrac{u^2}{2g}$. This result was obtained in Equation
(1.11) by considering time of flight. However it can be obtained also by considering the energy. If
the highest point reached by the object is h above the point of launch, then with respect to the
level of launch, the potential energy of the object at the highest point is mgh. Since the vertical
velocity is zero at this point then mgh represents the total energy also. At the launch, the potential
energy is zero and the total energy is given by the kinetic energy $\tfrac{1}{2}mu^2$. Hence, according to the
conservation of energy
$$\tfrac{1}{2}mu^2 = mgh \quad\text{or}\quad h = \frac{u^2}{2g}$$
as required. Now we will repeat this analysis for the more general case of an angled launch and
for any point (x, y) along the trajectory. Let us use the general results for the position vector and
velocity vector expressed in Equations (1.8) and (1.7). In the absence of air resistance,
the horizontal component of the projectile velocity (v) is constant. If the height of the launch is
taken as the reference level, then the potential energy at any time t and height y is given by

$$mgy = mg\left(ut\sin\theta - \tfrac{1}{2}gt^2\right).$$
The kinetic energy is given by
$$\tfrac{1}{2}m|\mathbf{v}|^2 = \tfrac{1}{2}m\left[(u\sin\theta - gt)^2 + u^2\cos^2\theta\right] = \tfrac{1}{2}m\left[u^2 - 2gtu\sin\theta + g^2t^2\right] = \tfrac{1}{2}mu^2 - mgy.$$
So
$$\tfrac{1}{2}m|\mathbf{v}|^2 + mgy = \tfrac{1}{2}mu^2.$$
But the initial kinetic energy, which is the initial total energy also, is given by $\tfrac{1}{2}mu^2$. Consequently
we have shown that energy is conserved along a projectile trajectory.
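The conservation result can also be checked numerically by evaluating the kinetic and potential energies at several times along a trajectory, using the velocity (1.7) and the height from (1.8). The Python sketch below does this; the mass, launch speed, angle and sample times are hypothetical values chosen only for illustration.

```python
import math

G = 9.81  # assumed value of g in m s^-2


def total_energy(m, u, theta_deg, t, g=G):
    """Kinetic plus potential energy at time t, relative to the launch level."""
    theta = math.radians(theta_deg)
    vx = u * math.cos(theta)            # horizontal velocity component, constant
    vy = u * math.sin(theta) - g * t    # vertical velocity component from (1.7)
    y = u * t * math.sin(theta) - 0.5 * g * t**2  # height from (1.8)
    return 0.5 * m * (vx**2 + vy**2) + m * g * y


m, u, theta_deg = 0.5, 12.0, 50.0  # hypothetical mass (kg), speed (m/s), angle (degrees)
initial = 0.5 * m * u**2

energies = [total_energy(m, u, theta_deg, 0.2 * k) for k in range(10)]
for k, energy in enumerate(energies):
    print(f"t = {0.2 * k:.1f} s: total energy = {energy:.6f} J (initial {initial:.6f} J)")
```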
5. Projectiles on inclined planes

The forces acting on an object resting on a sloping surface are its weight W , the normal reaction N ,
and the frictional force R. Since all of the forces act in a vertical plane then they can be described
with just two axes (indicated by the unit vectors j and k in Figure 15). If there is negligible friction
(R = 0) then, of course, the object will slide down the plane. If we apply Newton’s second law to
motion in the frictionless inclined plane, then the only remaining force to be considered is W , since
N is normal to the plane. Resolving in the j-direction gives the only force as
−|W | cos(90 − α)j = −|W | sin αj.
Figure 15: Mass on an inclined plane (axes y and z with unit vectors j and k; normal reaction N, weight W, slope angle α)

This will be the only in-plane force on an object projected across the inclined plane, moving so that it
is always in contact with the plane, and projected at some angle θ above the horizontal in the plane
(as in Figure 16 for Example 8). Newton’s second law for such an object may be written
ma = −mg sin αj or a = −g sin αj.
The resulting acceleration vector differs from that considered in 34.2 Subsection 2 only by the
constant factor sin α. In other words, the ball will move on the inclined plane as a projectile under
reduced gravity (since g sin α < g).
Suppose that the object has an initial velocity u. This is given in terms of the chosen coordinates by
u cos θi + u sin θj
which is the same as that considered earlier in Section 34.1. So it is possible to use the result for the
range obtained in Section 34.1 Equation (1.9), after remembering to replace g by g sin α. Equations
(1.9) to (1.11) may be applied as long as g is replaced by g sin α.
Example 8
In a game a small disc is projected from one corner across a smooth board inclined
at an angle α to the horizontal. The disc moves so that it is always in contact
with the board. The speed and angle of launch can be varied and the object of the
game is to collect the disc in a shallow cup situated in the plane at a horizontal
distance d from the point of launch. Calculate the speed of launch at an angle of
45◦ in the plane of the board that will ensure that the disc lands in the cup.
Solution
Figure 16: Sketches of inclined plane and the desired trajectory in the plane for Example 8 (side view of the plane inclined at α; plan view with launch velocity u at angle θ and the cup at distance d)
Consider a point along the disc’s path portrayed from the side and looking down on the plane of
the board in Figure 16. We choose coordinates and unit vectors as shown in these figures, so that
y is up the plane and x along it, while z is normal to the plane. Equation (1.9) will apply as long
as g is replaced by g sin α. So the range is given by $R = \dfrac{u^2\sin 2\theta}{g\sin\alpha}$.
When R = d and θ = 45°, this gives $d = \dfrac{u^2}{g\sin\alpha}$ or $u = \sqrt{gd\sin\alpha}$.
Note that d represents the maximum ‘horizontal’ , i.e. in-plane, range from the point of launch for
this launch speed.
Task
Suppose that, in the game featured in the last example, there is another cup in
the plane with the centre of its open end at coordinates (3d/5, d/5) with respect
to the point of launch, and that a successful ‘pot’ in the cup will gain more points.
What angle of launch will ensure that the disc will enter this second cup if the
magnitude of the launch velocity is $u = \sqrt{gd\sin\alpha}$?
Your solution
Answer
Equation (1.13) can be used with g replaced by g sin α, i.e.
$$y = x\tan\theta - \tfrac{1}{2}g\sin\alpha\left(\frac{x}{u}\right)^2\left(1 + \tan^2\theta\right).$$
With y = d/5, x = 3d/5, $u = \sqrt{gd\sin\alpha}$, it is possible to obtain a quadratic equation for tan θ:
$$9\tan^2\theta - 30\tan\theta + 19 = 0.$$
Hence the required launch angle is approximately 68◦ from the ‘horizontal’ in the plane.
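The idea of motion under reduced gravity g sin α can be tried out numerically for the inclined-board game. The Python sketch below checks the range result of Example 8 and solves the quadratic for the second cup; the function names, the board angle of 25° and the distance d = 1.2 m are assumptions made only for illustration (the quadratic in tan θ does not depend on them).

```python
import math

G = 9.81  # assumed value of g in m s^-2


def in_plane_range(u, theta_deg, alpha_deg, g=G):
    """Range in the plane: (1.9) with g replaced by g sin(alpha)."""
    g_eff = g * math.sin(math.radians(alpha_deg))
    return u**2 * math.sin(2 * math.radians(theta_deg)) / g_eff


def in_plane_launch_angles(u, x, y, alpha_deg, g=G):
    """In-plane launch angles (degrees, ascending) through (x, y).

    Solves the quadratic in tan(theta) from (1.13) with g replaced by g sin(alpha).
    Returns an empty list if the target cannot be reached. Assumes x > 0.
    """
    g_eff = g * math.sin(math.radians(alpha_deg))
    a = 0.5 * g_eff * (x / u) ** 2
    disc = x**2 - 4 * a * (a + y)
    if disc < 0:
        return []
    return [math.degrees(math.atan((x + s * math.sqrt(disc)) / (2 * a))) for s in (-1, 1)]


alpha_deg, d = 25.0, 1.2  # hypothetical board angle (degrees) and cup distance (m)
u = math.sqrt(G * d * math.sin(math.radians(alpha_deg)))  # speed found in Example 8

print(f"range at 45 degrees = {in_plane_range(u, 45.0, alpha_deg):.3f} m (d = {d} m)")

angles = in_plane_launch_angles(u, 3 * d / 5, d / 5, alpha_deg)
for angle in angles:
    t = math.tan(math.radians(angle))
    print(f"theta = {angle:.1f} degrees, 9T^2 - 30T + 19 = {9 * t**2 - 30 * t + 19:.2e}")
```

The quadratic has two roots; the steeper one is the angle of about 68° quoted in the answer.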
Task
Skateboarders have built jumps consisting of short ramps angled at about 30◦ from
the horizontal. Assume that the speed on leaving the ramp is 10 m s−1 and ignore
air resistance.
(a) Write down appropriate position, velocity and acceleration vectors:
Your solution
Answer
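Ignoring air resistance, the relevant vectors are
$$\mathbf{r}(t) = \begin{pmatrix} (u\cos\theta)t \\ (u\sin\theta)t - \tfrac{1}{2}gt^2 \end{pmatrix}, \quad \mathbf{v}(t) = \begin{pmatrix} u\cos\theta \\ u\sin\theta - gt \end{pmatrix}, \quad \mathbf{a}(t) = \begin{pmatrix} 0 \\ -g \end{pmatrix}.$$
In the skateboarders' case, u = 10 and θ = 30°, so the vectors may be written
$$\mathbf{r}(t) = \begin{pmatrix} 8.66t \\ 5t - \tfrac{1}{2}gt^2 \end{pmatrix}, \quad \mathbf{v}(t) = \begin{pmatrix} 8.66 \\ 5 - gt \end{pmatrix}, \quad \mathbf{a}(t) = \begin{pmatrix} 0 \\ -g \end{pmatrix}.$$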
(b) Predict the maximum length of jump possible at the level of the ramp exits:
Your solution
Answer
The horizontal range x m at the level of exit from the ramp is given by
$$x = \frac{u^2\sin 2\theta}{g} = \frac{100\sin 60^\circ}{9.81} = 8.828.$$
(c) Predict the maximum height of jump possible measured from the ramp exits:
Your solution
Answer
From Section 34.1 Equation (1.11), the maximum height $y_m$ m measured from the ramp exit is
given by
$$y_m = \frac{u_v^2}{2g} = \frac{u^2\sin^2\theta}{2g} = \frac{25}{2\times 9.81} = 1.274.$$
(d) Comment on the choice of slope for the ramp:
Your solution
Answer
For a fixed value of u, the maximum range $\dfrac{u^2\sin 2\theta}{g}$ is given when θ = 45°. On the other hand,
the maximum height $\dfrac{u^2\sin^2\theta}{2g}$ is given when θ = 90°. The latter would require vertical ramps and
these are not very practicable!
34.2 Forces in More than One Dimension
Introduction
This Section looks at forces on objects resting or moving on inclined planes and forces on objects
moving along curved paths. The previous ideas are exploited in example calculations related to
passenger sensations of forces on amusement rides.
Prerequisites
• be able to use vectors and to carry out scalar and vector products
• be able to use coordinate geometry to study circles and parabolas
• be able to use calculus to differentiate and integrate polynomials

Learning Outcomes
On completion you should be able to . . .
• compute frictional forces on static and moving objects on inclined planes and on objects moving at constant speed around bends
• calculate the forces experienced by passengers in vehicles moving along straight, curved and inclined tracks
1. Forces in two or three dimensions

Forces during circular motion
Consider a particle moving in a horizontal plane so that its position at any time t is given by
$$\mathbf{r} = r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j}$$
where r is a constant and i and j are unit vectors at right-angles. The angle, θ, made by r with the
horizontal is a function of time. We can consider four special values of θ and the associated values
of r. These are shown in the following table and in Figure 17.
θ       r
0       ri
π/2     rj
π       −ri
3π/2    −rj

Figure 17 (position vectors r(θ=0), r(θ=π/2), r(θ=π) and r(θ=3π/2) on a circle, with components r cos θ along i and r sin θ along j)
Note that |r| = r is a constant for all values of θ, so we must have motion in a circle of radius r. If
we assume a constant angular velocity ω rad s−1 so that θ = ωt, then the velocity is
$$\frac{d\mathbf{r}}{dt} = \omega r\left(-\sin\omega t\,\mathbf{i} + \cos\omega t\,\mathbf{j}\right). \tag{2.1}$$
Hence, taking the dot product,
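$$\mathbf{r}\cdot\frac{d\mathbf{r}}{dt} = \omega r^2\left(-\cos\omega t\sin\omega t + \sin\omega t\cos\omega t\right) = 0$$
which implies that $\dfrac{d\mathbf{r}}{dt}$ is always perpendicular to r. Since $\dfrac{d\mathbf{r}}{dt}$ is the velocity vector v, this means that
the velocity vector is always tangential to the circle (see Figure 18). Note also that |v| = v = ωr,
so $\omega = \dfrac{v}{r}$. Differentiating (2.1) again,
$$\frac{d^2\mathbf{r}}{dt^2} = \omega^2 r\left(-\cos\omega t\,\mathbf{i} - \sin\omega t\,\mathbf{j}\right) = -\omega^2\mathbf{r}. \tag{2.2}$$
Figure 18: The velocity vector is tangential to motion in a circle (showing r, dr/dt and r̈)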
Equation (2.2) means that the second derivative, $\dfrac{d^2\mathbf{r}}{dt^2}$, which represents the acceleration a, acts along
the radius towards the centre of the circle and is perpendicular to $\dfrac{d\mathbf{r}}{dt}$.
The magnitude of the velocity (the speed) is constant and the acceleration, a, is associated with
the changing direction of the velocity. The force must act towards the centre of the circle to achieve
this change in direction around the circle. Since $\mathbf{a}(t) = -\omega^2\mathbf{r}(t)$, where $\omega = \dfrac{v}{r}$, we see that the
acceleration acts towards the centre of the circle and has a magnitude given by $a = \dfrac{v^2}{r}$. This is a
special example of the fact that forces in the direction of motion cause changes in speed, while forces
at right-angles to the direction of motion cause changes in direction.
When a particle is moving at constant speed around a circle on the end of a rope, then the force
directed towards the centre is supplied by the tension in the rope. When a vehicle moves at constant
speed around a circular bend in a road, then the force directed towards the centre of the bend is
supplied by sideways friction of the tyres with the road. If the vehicle of mass m were to be pushed or
dragged sideways by a steady force then it would be necessary to overcome the frictional force. This
force depends on the normal reaction R, which is equal and opposite to the weight of the vehicle
(mg). The friction force is given by µmg where µ is the coefficient of friction and it must at least
equal the required force towards the centre of the bend to avoid skidding. So, we must have
$$\mu mg \ge \frac{mv^2}{r}. \tag{2.3}$$
Example 9
A car of mass 900 kg drives around a roundabout of radius 15 m at a constant
speed of 10 m s−1 .
(a) Draw a vector diagram showing the forces on the car in the vertical
and sideways directions.
(b) What is the magnitude of the force directed towards the centre of the
bend?
(c) What is the friction force between the car and the road?
(d) What does this imply about the minimum value of the coefficient of
friction?
Figure 19: Forces on a vehicle negotiating a circular bend at constant speed [diagram showing the normal reaction R, the weight mg, the friction force F and the force mv²/r]
Solution
(a) See Figure 19. (In addition to the sideways friction involved in cornering, there will be a net
force causing forward motion which is generated by the vehicle engine and exerted through friction
between the tyres and the road.) The forces are shown as if they act at the centre of the vehicle,
since the vehicle is being treated as a particle. Strictly speaking, the frictional forces on a road
vehicle should be considered to act at the tyre/road contact and there will be differences between
the forces at each wheel.
(b) The magnitude of the force is obtained by using
$$\frac{mv^2}{r} = \frac{900 \times 100}{15} = 6000.$$
So the magnitude of the force acting towards the centre of the roundabout is 6000 N.
(c) The sideways force provided by friction is $|F| = \mu|R|$. In this case $|R| = mg$.
(d) Consequently we must have $\mu mg \ge 6000$, that is
$$\mu \ge \frac{6000}{900 \times 9.81} = 0.68.$$
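The arithmetic in parts (b) and (d) can be checked with a short script. This is only a sketch: the function names are illustrative and not part of the Workbook, and g = 9.81 m s⁻² is taken from the text.

```python
G = 9.81  # acceleration due to gravity in m s^-2, as used in the text


def centripetal_force(mass, speed, radius):
    """Force towards the centre needed for circular motion, m v^2 / r (N)."""
    return mass * speed ** 2 / radius


def min_friction_coefficient(speed, radius, g=G):
    """Smallest mu satisfying mu*m*g >= m*v^2/r (the mass cancels)."""
    return speed ** 2 / (g * radius)


force = centripetal_force(900, 10, 15)
mu_min = min_friction_coefficient(10, 15)
print(f"force = {force:.0f} N, minimum mu = {mu_min:.2f}")
```

Because the mass cancels in inequality (2.3), the minimum coefficient of friction depends only on the speed and the radius of the bend.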
Task
Suppose that the coefficient of friction between the car and the ground in dry
conditions is 0.96.
(a) At what speed could the car drive around the roundabout without skidding?
Your solution
Answer
Equation (2.3) on page 36 states that for the vehicle to go round the bend just without skidding
$$\frac{mv^2}{r} = \mu mg, \quad\text{or}\quad \frac{v^2}{r} = \mu g.$$
In this case, the maximum speed is required, so the relationship is best rearranged into the form $v = \sqrt{\mu g r}$. The values to be substituted are $\mu = 0.96$, $g = 9.81$ and $r = 15$, so
$$v = \sqrt{0.96 \times 9.81 \times 15} = 11.9.$$
So the car will skid if it drives round the roundabout at more than 11.9 m s$^{-1}$ (nearly 27 mph).
(b) What would be the radius of roundabout that would enable a car to drive around it safely in dry
conditions at 30 m s−1 (nearly 70 mph)?
Your solution
Answer
For this part of the question, the speed around the roundabout is known and the safe radius is to
be found. Further rearrangement of the expression used in part (a) gives
$$r = \frac{v^2}{\mu g}.$$
Hence
$$r = \frac{30^2}{0.96 \times 9.81} = 95.6.$$
In reality drivers should not be exactly at the limits of the friction force while going round the
roundabout. There should be some safety margin. So the roundabout should have a radius of at
least 100 m to allow cars to drive round it at 30 m s−1 .
(c) What would be the safe radius of this roundabout when conditions are wet so that the coefficient
of friction between the car and the road is reduced by a factor of 2?
Your solution
Answer
The form of the equation used in part (b) indicates that if the coefficient of friction is halved (to
0.48) by wet conditions, then the safe radius should be doubled.
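The three parts of this Task all use the limiting case of inequality (2.3). A minimal sketch follows, with hypothetical function names and g = 9.81 m s⁻² as in the text; it evaluates the limiting speed and the limiting radius, and confirms that halving the coefficient of friction doubles the radius.

```python
import math

G = 9.81  # m s^-2


def max_speed(mu, radius, g=G):
    """Largest speed (m/s) for which mu*g >= v^2/r still holds."""
    return math.sqrt(mu * g * radius)


def safe_radius(mu, speed, g=G):
    """Smallest bend radius (m) for which mu*g >= v^2/r still holds."""
    return speed ** 2 / (mu * g)


print(f"(a) maximum speed: {max_speed(0.96, 15):.1f} m/s")
print(f"(b) limiting dry radius: {safe_radius(0.96, 30):.1f} m")
print(f"(c) limiting wet radius: {safe_radius(0.48, 30):.1f} m")
```

These are limiting values only; as the answer to part (b) notes, a practical design would add a safety margin.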
When a cyclist or motorcyclist negotiates a circular bend at constant speed, the forces experienced
at the points of contact between the tyres and the road are a frictional force towards the centre of
the bend and the upward reaction to the combined weight of the cyclist and the cycle (see Figure
20). These forces can be combined into a resultant that acts along an angle to the vertical. Suppose
that the combined mass is m and that the coefficient of friction between the tyres and road surface
is µ.
Figure 20: Forces on cycle tyres and the angle for cyclist comfort on a circular bend. The net driving force is ignored. [diagram showing the reaction and the friction force]
The total force vector may be written
F = mg(µi + j)
and the angle θ is given by
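$$\tan\theta = \frac{\mu mg}{mg} = \mu \quad\text{so that}\quad \theta = \tan^{-1}\mu.$$
Also, as argued previously (Equation (2.3)), we must have $\mu \ge \dfrac{v^2}{gr}$, which means that
$$\theta \ge \tan^{-1}\left(\frac{v^2}{gr}\right).$$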
To be comfortable while riding, the cyclist likes to feel that the total force is vertical. So when
negotiating the bend, the cyclist tilts towards the bend so that the resultant force acts along a ‘new
vertical’.
Example 10
(a) Calculate the angular velocity of the Earth in radians per second, assuming that the Earth rotates once about its axis in 24 hours.
(b) A synchronous communications satellite is launched into an orbit around the equator and appears to be stationary when viewed from the Earth. Calculate the radius of the satellite’s orbit, given that $g = 9.81$ m s$^{-2}$ and that the radius of the Earth is $6.378 \times 10^6$ metres.
Solution
(a) The angular velocity of the Earth is
$$\frac{2\pi}{24 \times 60 \times 60} = 7.272 \times 10^{-5} \text{ radians per second.}$$
(b) According to Newtonian theory of gravitation the attraction due to gravity at the Earth’s surface, for a mass m, should be $\dfrac{GMm}{R^2}$, which is set equal to mg in elementary calculations. Thus we must have $g = \dfrac{GM}{R^2}$, so that the product GM equals
$$gR^2 = 9.81 \times (6.378)^2 \times 10^{12} = 3.991 \times 10^{14} \text{ m}^3\,\text{s}^{-2}.$$
For a circular satellite orbit of radius r, the gravitational force must equal mass times inward acceleration. For a mass m travelling at orbital speed v and with orbital angular velocity ω, the theory of circular orbits gives the result that the inward acceleration is
$$\frac{v^2}{r} = \omega^2 r.$$
The equation ‘force equals mass times acceleration’ thus gives:
$$\frac{GMm}{r^2} = m\omega^2 r.$$
We wish to ensure that the value of ω for the satellite orbit equals the value of ω for the Earth’s rotation. The equation above gives the result:
$$r^3 = \frac{GM}{\omega^2} = \frac{gR^2}{\omega^2}$$
and the value of ω to be used is that which has already been calculated. This gives the result
$$r^3 = \frac{3.991 \times 10^{14}}{(7.272 \times 10^{-5})^2} = 7.547 \times 10^{22} \text{ m}^3$$
Taking the cube root gives $r = 4.23 \times 10^7$ metres. The radius of the satellite orbit is thus about 6.6 times the Earth’s radius.
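The chain of calculations in this Example can be reproduced numerically. The sketch below uses the same data as the text (g = 9.81 m s⁻², Earth radius 6.378 × 10⁶ m, a 24 hour rotation period); the function names are assumptions made for illustration.

```python
import math


def earth_angular_velocity(period_s=24 * 60 * 60):
    """Angular velocity (rad/s) for one revolution per period."""
    return 2 * math.pi / period_s


def synchronous_orbit_radius(g=9.81, earth_radius=6.378e6, omega=None):
    """Orbit radius from r^3 = g R^2 / omega^2, using GM = g R^2."""
    if omega is None:
        omega = earth_angular_velocity()
    gm = g * earth_radius ** 2
    return (gm / omega ** 2) ** (1 / 3)


omega = earth_angular_velocity()
r = synchronous_orbit_radius()
print(f"omega = {omega:.4e} rad/s")
print(f"r = {r:.3e} m, about {r / 6.378e6:.1f} Earth radii")
```

Using the surface value of g to estimate GM avoids needing separate values for G and the mass of the Earth, which is the approach the solution takes.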
Task
The pedals on a bicycle drive the chain ring, which moves the chain. The chain
passes around the sprocket (gear wheel) attached to the rear wheel and hence the
rear wheel and the bicycle are driven (see diagram). There are cog teeth of equal
width (d) cut into both the chain ring and the sprocket. If there are n1 teeth in the
chain ring and n2 teeth in the sprocket then n1 /n2 is the gear ratio. Suppose that
the radii of the chain ring, sprocket and rear wheel are r1 , r2 and r respectively
and that the angular velocities of the chain ring and rear wheel are ω1 and ω2
respectively.
[Diagram: bicycle drive system, showing the rear wheel (radius r), the sprocket (radius r2) and the chain ring (radius r1).]
(a) Write down an expression for the velocity (u) of the bicycle in terms of the angular velocity of
the rear wheel:
Your solution
Answer
u = rω2
(b) Write down the relationship between the velocity of the sprocket and the velocity of the chain
ring:
Your solution
Answer
r1 ω1 = r2 ω2
(c) Write down expressions for the circumference of the chain ring and that of the sprocket in terms
of the teeth width and number of teeth:
Your solution
Answer
2πr1 = n1 d, 2πr2 = n2 d
(d) Hence derive a relationship between the angular velocities and the gear ratio:
Your solution
Answer
From (b) and (c), $\dfrac{n_1 d}{2\pi}\omega_1 = \dfrac{n_2 d}{2\pi}\omega_2$ or $n_1\omega_1 = n_2\omega_2$, so the gear ratio is $\dfrac{n_1}{n_2} = \dfrac{\omega_2}{\omega_1}$.
(e) Calculate the speed of the bicycle if the cyclist is pedalling at one revolution per second, the
radius of the rear wheel is 0.34 m and the gear ratio is 4:
Your solution
Answer
From (a) and (d), u = r(n1 /n2 )ω1 = 0.34(4)(2π) = 8.55 m s−1 (about 19 mph).
2. Amusement rides
The design of amusement rides is intended to make the forces experienced by passengers as exciting
as possible. High speeds are not enough. The production of accelerations of up to four times
that due to gravity and occasional feelings of near-weightlessness are deliberate design goals. The
accelerations may not only be in the forward or backward directions from the passengers’ perspective
but also sideways. Sideways accelerations are more limited than backwards or forwards ones, since
they are less welcome to passengers and also pose particular problems for the associated structures.
Upward accelerations of greater than g are avoided because of the safety risk and the need for
reliable constraints to prevent passengers ‘floating’ out of the carriages. The rates at which the
forces, or accelerations, change are important in producing the overall sensation also. The rate
of change of acceleration is called jerk. Even the rate of change of jerk, called jounce, may be
of interest. On roller-coasters the height, the tightness and the twistiness of turns in the track
are the main parameters that influence the thrill of riding on them. Another factor that relates
to thrill or discomfort is the degree of mismatch between what is observed and what is felt. For
example it is odd to feel as though one’s weight is acting in some other direction than the perceived
vertical. In this regard the magnitude of the force that is experienced may not be as important as its
direction. Although we consider only two dimensions in this Workbook, the design of roller-coasters
requires calculations and considerations in 3D. We shall consider some particular examples of forces
experienced by passengers. We start by exploring the forces that contribute to what we feel when
riding in a vehicle. Then we shall look at the forces experienced by passengers on amusement rides
ranging from rotors to roller-coasters.
Figure 21: Forces on a seated passenger

Fearsome forces
The experiences we have on amusement rides include those of being subjected to linear and sideways
accelerations that are sensed by the balancing system close to our ears and can make us feel giddy.
Potentially more ‘enjoyable’ are sensations of unfamiliar compressive forces that act on our bodies
through the vehicle in which we are travelling.
Figure 22: Forces when seated and being accelerated horizontally

Imagine that you are sitting on a stool, which is sufficiently tall so that your feet are not touching the
ground. What do you feel? Your weight is acting vertically downward. However you feel an upward
force, which is the normal reaction of the stool to your weight. This reaction is pushing you upward
(see Figure 21). Of course the total force on you is zero. Consequently there is a difference in this
case between the total force on you, which includes gravity, and the force that you experience which
excludes gravity.
On the other hand if you are sitting facing forwards in a vehicle that is accelerating forwards on a flat
track then you will experience the same acceleration as the vehicle, through the seat which pushes
you forwards (see Figure 22). The forward force from the seat combines with the normal reaction to
give a resultant that is not vertical. This simple example suggests that the force experienced by a
passenger during an amusement ride can be calculated by adding up all the component forces except
for the passenger’s weight.
Example 11
(a) Calculate the horizontal and vertical components of the force F experienced by
a passenger of mass 100 kg seated in a rollercoaster carriage that starts from rest
and moves in a straight line on a flat horizontal track with a constant acceleration
such that it is moving at 40 m s−1 after 5 s.
(b) Deduce the magnitude and direction of the force experienced by the passenger.
Solution
(a) Consider first the acceleration of the passenger’s seat. The coordinate origin is chosen at
the start of motion. The x-axis is chosen along the direction of travel, with unit vector
i, and the y-axis is vertical with y positive in the upward direction, with unit vector j.
The acceleration a may be calculated from
$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = a\,\mathbf{i}$$
or v = ati + c. When t = 0, v = 0, so c = 0. When t = 5, v = 40i. Hence 40 = 5a or
a = 8. The acceleration in the direction of travel is 8 m s−2 , so the component of the
force F in the direction of travel is given by ma = 100 × 8 N = 800 N. The seat exerts a
force on the passenger that balances the force due to gravity i.e. the passenger’s weight.
So the vertical component of the force on the passenger is mg = 100 × 9.81 N = 981
N. Hence the total force experienced by the passenger may be expressed as
F = 800i + 981j.
This is the total force exerted by the vehicle on the passenger. The force exerted by the
passenger on the vehicle is the direct opposite of this i.e. −800i − 981j.
(b) The magnitude of the total force is $\sqrt{800^2 + 981^2}$ N ≈ 1300 N and the total force experienced by the passenger is at an angle with respect to the horizontal equal to $\tan^{-1}(981/800) \approx 51^\circ$.
Here we considered the sudden application of a constant acceleration of 8 m s$^{-2}$ which will cause quite a jerk for the passengers at the start. On some rides the acceleration may be applied more smoothly.
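The components, magnitude and direction found in this Example can be checked as follows. This is a sketch under the Example's assumptions (constant acceleration from rest, g = 9.81 m s⁻²); the function name and return layout are illustrative choices.

```python
import math


def seat_force(mass, final_speed, duration, g=9.81):
    """Force exerted by the seat on a passenger accelerating uniformly from rest.

    Returns (horizontal N, vertical N, magnitude N, angle above horizontal in degrees).
    """
    acceleration = final_speed / duration  # constant acceleration from rest
    fx = mass * acceleration               # component along the direction of travel
    fy = mass * g                          # reaction balancing the weight
    return fx, fy, math.hypot(fx, fy), math.degrees(math.atan2(fy, fx))


fx, fy, magnitude, angle = seat_force(100, 40, 5)
print(f"F = {fx:.0f}i + {fy:.0f}j N, |F| = {magnitude:.0f} N at {angle:.0f} degrees")
```

The unrounded magnitude is a little under 1270 N, which the solution quotes to two significant figures as 1300 N.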
Example 12
Calculate the total force experienced by the passenger as a function of time if the
horizontal component of acceleration of the vehicle is given by

$$\begin{cases} 8\sin\left(\dfrac{\pi t}{10}\right) & 0 \le t \le 5 \\[2mm] 8 & t > 5 \end{cases}$$

Solution
With this horizontal component of acceleration, the component of force experienced by the passenger
in the direction of motion is (see Figure 23)

$$\begin{cases} 800\sin\left(\dfrac{\pi t}{10}\right) & \text{for } 0 \le t \le 5 \\[2mm] 800 & \text{for } t > 5 \end{cases}$$
Figure 23: Horizontal component of force [plot of 800 sin(πt/10) against t for 0 ≤ t ≤ 5]
The vertical component remains constant at 981 N, as in Example 11. The total force may be
written

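$$\mathbf{F} = \begin{cases} 800\sin\left(\dfrac{\pi t}{10}\right)\mathbf{i} + 981\mathbf{j} & 0 \le t \le 5 \\[2mm] 800\mathbf{i} + 981\mathbf{j} & t > 5 \end{cases}$$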

Task
The idea of an amusement ride called the ‘Rotor’ is to whirl passengers around in
a cylindrical container at increasing speed. When the rotation is sufficiently fast
the floor is lowered but the passengers remain where they are supported by friction
against the wall. Given a rotor radius of 2.2 m and a coefficient of friction of 0.4
calculate the minimum rate of revolution when the floor may be lowered.
Your solution
Answer
The reaction of the wall on the passenger will have the same magnitude as the force causing motion in a circle i.e. $\dfrac{mv^2}{r} = R$. The vertical friction force between the passenger and the wall is $\mu R$. The passenger of mass m will remain against the wall when the floor is lowered if $\mu R \ge mg$. Hence it is required that $\dfrac{\mu mv^2}{r} \ge mg$ or $v \ge \sqrt{\dfrac{rg}{\mu}}$. Since $v = \omega r$, where ω is the angular velocity, the minimum required angular velocity is $\omega = \sqrt{\dfrac{g}{\mu r}}$ and the corresponding minimum rate of revolution is $n = \dfrac{\omega}{2\pi} = \dfrac{1}{2\pi}\sqrt{\dfrac{g}{\mu r}}$. Hence with r = 2.2 and µ = 0.4, the rate of revolution must be at least 0.53 revs per sec or at least 32 revs per min.
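The final formula for the minimum rate of revolution is easy to evaluate directly. A minimal sketch, assuming g = 9.81 m s⁻² and using a hypothetical function name:

```python
import math


def rotor_min_rev_rate(radius, mu, g=9.81):
    """Minimum revolutions per second so that wall friction can support the weight."""
    omega = math.sqrt(g / (mu * radius))  # minimum angular velocity in rad/s
    return omega / (2 * math.pi)


n = rotor_min_rev_rate(2.2, 0.4)
print(f"{n:.2f} rev/s, i.e. {60 * n:.0f} rev/min")
```

Note that the passenger's mass does not appear: the required rate depends only on the rotor radius and the coefficient of friction.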
Task
In an amusement ride called the ‘Yankee Flyer’, the passengers sit in a ‘boat’,
which stays horizontal while executing a series of rotations on an arm about a
fixed centre. Given that the period of rotation is 2.75 s, calculate the radius of
rotation that will give rise to a feeling of near weightlessness at the top of each
rotation.
Your solution
Answer
To achieve a feeling of ‘near-weightlessness’ near the top of the rotation, the force on the passenger towards the centre of rotation must be nearly equal and opposite to the reaction force of the seat on the passengers i.e. $\dfrac{mv^2}{r} = mg$. This means that $r = \dfrac{v^2}{g}$. Since $v = \omega r$, this requires that $r = \dfrac{g}{\omega^2}$. The period $T = \dfrac{2\pi}{\omega} = 2.75$ s, so $\omega = 2.285$ rad s$^{-1}$. Hence $r = 1.879$ m.
Example 13
An amusement ride carriage moves along a track at constant velocity in the horizontal (x-) direction. It encounters a bump of horizontal length L and maximum height h with a profile in the vertical plane given by
$$y(x) = \frac{h}{2}\left(1 - \cos\frac{2\pi x}{L}\right), \quad 0 \le x \le L.$$
Calculate the variation of the vertical component of force exerted on a passenger
by the seat of the carriage with horizontal distance (x) as it moves over the bump.
Solution
Figure 24: Profile of the bump [plot of y(x) against x for 0 ≤ x ≤ 100]

Figure 24 shows a graph of y(x) against x for h = 2 and L = 100. Since the component of velocity
of the vehicle in the x-direction is constant, then after time t, the horizontal distance moved, x, is
given by x = ut as long as x is measured from the location at t = 0. Consequently the y-coordinate
may be written in terms of t rather than x, giving

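$$y(t) = \frac{h}{2}\left(1 - \cos\frac{2\pi ut}{L}\right), \quad 0 \le t \le \frac{L}{u}.$$
The vertical component of velocity is given by differentiating this expression for y(t) with respect to t.
$$\dot{y}(t) = \frac{h}{2}\,\frac{2\pi u}{L}\sin\frac{2\pi ut}{L} = \frac{h\pi u}{L}\sin\frac{2\pi ut}{L}.$$
The vertical acceleration is given by differentiating this again.
$$\ddot{y}(t) = \frac{h\pi u}{L}\,\frac{2\pi u}{L}\cos\frac{2\pi ut}{L} = \frac{2h\pi^2u^2}{L^2}\cos\frac{2\pi ut}{L}.$$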
Two forces contribute to the vertical force exerted by the seat on the passenger: the constant
reaction to the passenger’s weight and the variable vertical reaction associated with motion over the
bump. The magnitude of the total vertical force R N exerted on the passenger by the seat is given
by
$$R = mg + m\ddot{y}(t) = mg + m\frac{2h\pi^2u^2}{L^2}\cos\frac{2\pi ut}{L} = mg\left(1 + \frac{2h\pi^2u^2}{gL^2}\cos\frac{2\pi ut}{L}\right).$$
Some horizontal force may be needed to keep the vehicle moving with a constant horizontal component of velocity and ensure that the net horizontal component of acceleration is zero. Since the horizontal component of velocity is constant, there is no horizontal component of acceleration and no net horizontal component of force. Consequently $\mathbf{R} = R\mathbf{j}$ represents the total force exerted on the passenger. Figure 25 shows a graph of R/mg for h = 2 m, L = 100 m and u = 20 m s$^{-1}$.
Figure 25: Vertical force acting on passenger [plot of R/mg against x for 0 ≤ x ≤ 100]
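The variation of R/mg over the bump can be tabulated from the final expression, using x = ut to express the result in terms of horizontal distance. The sketch below assumes g = 9.81 m s⁻² and the values h = 2 m, L = 100 m and u = 20 m s⁻¹ quoted for Figure 25; the function name is illustrative.

```python
import math


def seat_force_ratio(x, h, L, u, g=9.81):
    """R/(mg) at horizontal distance x over the bump (x = u*t, 0 <= x <= L)."""
    amplitude = 2 * h * math.pi ** 2 * u ** 2 / (g * L ** 2)
    return 1 + amplitude * math.cos(2 * math.pi * x / L)


for x in (0, 25, 50, 75, 100):
    print(f"x = {x:3d} m, R/mg = {seat_force_ratio(x, 2, 100, 20):.3f}")
```

With these values the passenger feels about 16% heavier at the start and end of the bump and about 16% lighter at its crest, which matches the range of the vertical axis in Figure 25.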
Banked tracks
Look back near the end of Section 34.2 subsection 1 which considered the forces on a cyclist travelling around a circular bend of radius r. We were concerned with the way in which cyclists and motorcyclists bank their vehicles to create a ‘new vertical’ along the direction of the resultant force. This counteracts the torque that would otherwise encourage the rider to fall over when cornering. Clearly, passengers in four wheeled vehicles, railway trains and amusement park rides are not able to bank or tilt their vehicles to any significant extent. However what happens if the road or track is banked instead? If the road or track is tilted or banked at angle $\theta = \tan^{-1}\left(\dfrac{v^2}{gr}\right)$ to the horizontal, then, at speed v around the circular bend, it is possible to obtain the same result as that achieved by tilting the cycle or motorcycle (see Figure 26).
Figure 26: Equivalence of tilted cyclists and banked roads [two diagrams: a tilted cyclist with the reaction, friction and resultant forces at angle θ, and a banked road or track at angle θ with the resultant force]
Example 14
Calculate the angle at which a track should be tilted so that passengers in a railway
carriage moving at a constant speed of 20 m s−1 around a bend of radius 100 m
feel the resultant ‘reaction’ force as though it were acting vertically through their
centre line.
Solution
The angle of the resultant force on passengers if the track were horizontal is given by
$$\theta = \tan^{-1}\left(\frac{v^2}{gr}\right),$$
where v = 20, r = 100 and g = 9.81.
Hence θ = 22.18◦ and the track should be tilted at 22.18◦ to the horizontal for the resultant force
to act at right-angles to the track.
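The banking angle in this Example can be evaluated with a few lines of code. This is a sketch, assuming g = 9.81 m s⁻² as in the text; the function name is hypothetical.

```python
import math


def bank_angle_deg(speed, radius, g=9.81):
    """Bank angle (degrees) at which the resultant force is normal to the track."""
    return math.degrees(math.atan(speed ** 2 / (g * radius)))


print(f"{bank_angle_deg(20, 100):.2f} degrees")
```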
Example 15
The first three seconds of an amusement ride are described by the position coordinates
$$x(t) = 10 - 10\cos(0.5t), \qquad y(t) = 20\sin(0.5t) \qquad (0 < t < 3)$$
in the horizontal plane where x and y are in m.
(a) Calculate the velocity and acceleration vectors.

(b) Hence deduce the initial magnitude and direction of the acceleration
and the magnitude and direction of the acceleration after three seconds.
Solution
Figure 27: Path of ride [plot of y(t) against x(t)]

In this Example, the path of the ride is not circular (see Figure 27). In fact it is part of an ellipse.
If we choose unit vectors i along the x-direction and j along the y-direction, and origin at t = 0,
the position vector may be written
r(t) = (10 − 10 cos(0.5t))i + 20 sin(0.5t)j.
The velocity vector is obtained by differentiating this with respect to time.
v(t) = 5 sin(0.5t)i + 10 cos(0.5t)j.
The acceleration vector is obtained by differentiating again.
a(t) = 2.5 cos(0.5t)i − 5 sin(0.5t)j
At t = 0, a(t) = 2.5i. So the initial acceleration is 2.5 m s$^{-2}$ in the x-direction.
At t = 3, a(t) = 2.5 cos(1.5)i − 5 sin(1.5)j = 0.177i − 4.987j.
So after three seconds the acceleration is 4.99 m s$^{-2}$, directed at about 88° to the x-direction and almost entirely in the negative y-direction. This means a sideways acceleration of about 0.5g towards the inside of the track and almost at right-angles to it.
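The acceleration components at t = 3 can be checked both from the analytic expression and by differencing the position numerically. The sketch below is illustrative only: the function names and the step size h = 10⁻³ s are assumptions, not part of the Workbook.

```python
import math


def position(t):
    """Position (x, y) in metres from the Example."""
    return 10 - 10 * math.cos(0.5 * t), 20 * math.sin(0.5 * t)


def acceleration(t):
    """Analytic acceleration (ax, ay) in m s^-2, from differentiating twice."""
    return 2.5 * math.cos(0.5 * t), -5 * math.sin(0.5 * t)


def numerical_acceleration(t, h=1e-3):
    """Central second difference of the position, as an independent check."""
    (x0, y0), (x1, y1), (x2, y2) = position(t - h), position(t), position(t + h)
    return (x0 - 2 * x1 + x2) / h ** 2, (y0 - 2 * y1 + y2) / h ** 2


ax, ay = acceleration(3)
print(f"a(3) = {ax:.3f}i + ({ay:.3f})j, magnitude {math.hypot(ax, ay):.2f} m/s^2")
```

The y-component at t = 3 is close to −5 m s⁻², which is why the magnitude is almost 5 m s⁻² even though the x-component is small.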
Car velocity on a bend

Problem in words
A road has a bend with radius of curvature 100 m. The road is banked at an angle of 10◦ . At what
speed should a car take the bend in order not to experience any (net) side thrust on the tyres?
Figure 28 below shows the forces on the car.
Figure 28: A vehicle rounding a banked bend in the road. [diagram showing the weight mg, the sideways force mv²/r and the 10° bank angle]

In the figure R is the reactive force of the ground acting on the vehicle. The vehicle provides a force of mg, the weight of the vehicle, operating vertically downwards. The vehicle needs a sideways force of $\dfrac{mv^2}{r}$ in order to maintain the locally circular motion.
We have used the following assumptions:
(a) The sideways force needed on the vehicle in order to maintain it in circular motion (called the centripetal force) is $\dfrac{mv^2}{r}$ where r is the radius of curvature of the bend, v is the velocity and m the mass of the vehicle.
(b) The only force with component acting sideways on the vehicle is the reactive force of the ground.
This acts in a direction normal to the ground. (That is, we assume no frictional force in a sideways
direction.)
(c) The force due to gravity of the vehicle is mg, where m is the mass of the vehicle and g is the
acceleration due to gravity (≈ 9.8 m s−2 ). This acts vertically downwards.
The problem we need to solve is ‘What value of v would be such that the component of the reactive force of the ground exactly balances the sideways force of $\dfrac{mv^2}{r}$?’ This will give us the maximum velocity at which the vehicle can take the bend.
We can split the reactive force of the ground into two components. One component is in the
horizontal direction and the other in a vertical direction as in the following figure:
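Figure 29: Reaction forces on the car [diagram resolving R, inclined at 10° to the vertical, into a horizontal component R sin 10° and a vertical component R cos 10°]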

The force of $\dfrac{mv^2}{r}$ must be provided by a component of the reactive force in the horizontal direction i.e.
$$R\sin(10^\circ) = \frac{mv^2}{r} \tag{1}$$
However the reactive force must balance the force due to gravity in the vertical direction therefore
$$R\cos(10^\circ) = mg \tag{2}$$
We need to find v from the above equations. Dividing Equation (1) by Equation (2) gives
$$\tan(10^\circ) = \frac{v^2}{rg} \;\Rightarrow\; v^2 = rg\tan(10^\circ)$$
We are given that the radius of curvature is 100 m and that $g \approx 9.8$ m s$^{-2}$. This gives
$$v^2 \approx 100 \times 9.8 \times 0.17633 \;\Rightarrow\; v^2 \approx 172.8 \;\Rightarrow\; v \approx 13.15 \text{ m s}^{-1} \text{ (assuming } v \text{ is positive)}$$
Interpretation
We have found that the maximum speed that the car can take the bend in order not to experience
any side thrust on the tyres is 13.15 m s−1 . This is 13.15 × 60 × 60/1000 kph = 47.34 kph. In
practice, the need for a margin of safety would suggest that the maximum speed round the bend
should be 13 m s−1 .
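The result of dividing Equation (1) by Equation (2) can be evaluated as below. This is a sketch under the stated assumptions (no sideways friction, g ≈ 9.8 m s⁻² as in this problem); the function name is hypothetical.

```python
import math


def no_side_thrust_speed(radius, bank_deg, g=9.8):
    """Speed (m/s) at which a banked bend needs no sideways friction: v^2 = r g tan(angle)."""
    return math.sqrt(radius * g * math.tan(math.radians(bank_deg)))


v = no_side_thrust_speed(100, 10)
print(f"v = {v:.2f} m/s = {v * 3.6:.1f} km/h")
```

The conversion factor 3.6 is the same as the 60 × 60/1000 used in the interpretation above.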
Exercises
1. A bend on a stretch of railway track has a radius of 200 m. The maximum sideways force on
the train on this bend must not exceed 0.1 of its weight.
(a) What is the maximum possible speed of the train on this bend?
(b) How far before this bend should a train travelling at 30 m s$^{-1}$ begin to decelerate given that the maximum braking force of the train is 0.2 of its weight?
(c) What modelling assumptions have you made? Comment on their validity.
2. The diagram shows a portion of track of a one-way fairground ride on which several trains are
to run. AB and CD are straight. BC is a circular arc with the dimension shown. Because
BC is also on a bridge, safety regulations require that the rear of one train must have passed
point C before the front of the next train passes point B. Trains are 30 m long.
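[Diagram: straight sections AB and CD joined by the circular arc BC, which has centre O and radius 144 m and subtends an angle of 60° at O.]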
If the maximum sideways force on a train can be no more than 0.1 of its weight, find the
shortest time it can take for a train to travel from B to C. Hence find the minimum time
between the front of one train passing point B and its rear end passing point C. Recommend
a minimum distance between trains.
Answers
1. (a) According to Equation (2.2) on page 35, during travel round the bend the sideways force on the train is given by
$$Mr\omega^2 = \frac{Mv^2}{r}$$
The weight of the train is $Mg$. Given that $\dfrac{Mv^2}{r} \le 0.1Mg$, the maximum possible speed, $v_{\max}$, is given by $v_{\max} = \sqrt{0.1rg}$.
Using r = 200 m and g = 9.81 m s$^{-2}$, this implies that $v_{\max} = 14.0$ m s$^{-1}$ to 3 s.f.
(b) Given initial speed is 30 m s$^{-1}$, and final speed is 14 m s$^{-1}$ and maximum braking force is 0.2Mg, implying acceleration is −0.2g. Then, using the formula ‘$v^2 = u^2 + 2as$’, where u is initial speed, v its final speed, a is acceleration and s is distance travelled, and noting that $v^2 = v_{\max}^2 = 0.1rg = 20g$, gives
$$20g = 900 - 0.4gs \quad\text{or}\quad s = 179.358 \text{ m}.$$
This suggests that braking should begin about 180 m before the start of the bend.
(c) Assumptions include constant maximum braking, negligible thinking time and no skid-
ding.
2. The shortest time on the circular bend will be taken when the train is moving at the maximum
possible speed.
This will occur when $\dfrac{Mv^2}{r} = 0.1Mg$. If r = 144 m and g = 9.81 m s$^{-2}$, this implies $v_{\max} = 11.885$ m s$^{-1}$.
The length of BC is $144 \times \dfrac{\pi}{3} = 150.796$ m.
The time taken for any point on the train to move from B to C is $\dfrac{150.796}{11.885} = 12.687$ s.
So, given that the length of each train is 30 m, to make sure that the rear of one train has passed C before the front of the next train arrives at B, a minimum time between the trains of $\left(12.687 + \dfrac{30}{11.885}\right)$ s = 15.212 s is required. After including a small safety margin, successive trains should be 16 s apart. Assuming that the trains are moving at a constant speed of 11.885 m s$^{-1}$, this implies that they should be at least 190 m apart.
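The numerical answers to both Exercises can be reproduced with the sketch below. The function names are illustrative, g = 9.81 m s⁻² as in the answers, and the same modelling assumptions apply (constant maximum braking, constant speed on the bend).

```python
import math

G = 9.81  # m s^-2


def max_bend_speed(radius, side_fraction, g=G):
    """Largest speed for which M v^2 / r <= side_fraction * M g."""
    return math.sqrt(side_fraction * g * radius)


def braking_distance(initial_speed, final_speed, brake_fraction, g=G):
    """Distance s from v^2 = u^2 + 2 a s with a = -brake_fraction * g."""
    return (initial_speed ** 2 - final_speed ** 2) / (2 * brake_fraction * g)


def min_time_gap(radius, arc_angle_rad, train_length, side_fraction, g=G):
    """Time for a whole train to clear the arc at the maximum bend speed."""
    v = max_bend_speed(radius, side_fraction, g)
    return (radius * arc_angle_rad + train_length) / v


v1 = max_bend_speed(200, 0.1)
print(f"1(a) v_max = {v1:.1f} m/s, 1(b) s = {braking_distance(30, v1, 0.2):.1f} m")
print(f"2    minimum gap = {min_time_gap(144, math.pi / 3, 30, 0.1):.2f} s")
```

Carrying the unrounded speed through the calculation gives the same results as the answers above to the precision quoted.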

Resisted Motion 34.3
Introduction
This Section returns to the simple models of projectiles considered in Section 34.1. It explores the
magnitude of air resistance effects and the effects of including simple models of air resistance on the
earlier analysis.
Prerequisites
Before starting this Section you should . . .
• be able to solve second order, constant coefficient ODEs

Learning Outcomes
On completion you should be able to . . .
• compute the effect of air resistance proportional to velocity on particles moving under gravity
• define terminal velocity for linear and quadratic dependence of resistance on velocity
HELM (2006): 55
Section 34.3: Resisted Motion
1. Resisted motion
Resistance proportional to velocity
In Section 34.1 we introduced methods of analysing the motion of projectiles on the assumption
that air resistance or drag can be neglected. In this Section we will consider the accuracy of this
assumption in some particular cases and take a look at the consequences which including air resistance
has for the vector analysis of forces and motion.
Consider the subsequent motion of an object that is thrown horizontally. Let us introduce coordinate
axes x (horizontal, unit vector i) and y (vertical upwards, unit vector j) and place the origin of
coordinates at the point of release. The forces on the object consist of the weight $-mg\mathbf{j}$ and a resisting force proportional to the velocity $\mathbf{v}$. This force may be written
$$-c\mathbf{v} = -c\dot{x}\mathbf{i} - c\dot{y}\mathbf{j},$$
where c is a constant of proportionality. Newton’s second law gives
$$m\mathbf{a} = m(\ddot{x}\mathbf{i} + \ddot{y}\mathbf{j}) = -c\dot{x}\mathbf{i} - c\dot{y}\mathbf{j} - mg\mathbf{j}.$$
This can be separated into two equations:
$$m\ddot{x} = -c\dot{x} \tag{3.1}$$
and
$$m\ddot{y} = -c\dot{y} - mg. \tag{3.2}$$
These equations each involve only one variable so they are uncoupled. They can be solved separately.
Consider Equation (3.1) for the horizontal motion, first in the form
$$m\ddot{x} + c\dot{x} = 0.$$
Dividing through by m and using a new constant $\kappa = c/m$,
$$\ddot{x} + \kappa\dot{x} = 0.$$
A solution to this equation (see Workbook 19) is
$$x = A + Be^{-\kappa t}$$
where A and B are constants. These constants may be evaluated by means of the initial conditions
$$x(0) = 0, \qquad \dot{x}(0) = v_0$$
where $v_0$ is the speed with which the object is thrown (recall that it is thrown horizontally). The first condition gives
$$0 = A + B$$
which means that $A = -B$. The second gives
$$v_0 = -B\kappa$$
which implies that $B = -\dfrac{v_0}{\kappa}$, so
$$x(t) = \frac{v_0}{\kappa}\left(1 - e^{-\kappa t}\right). \tag{3.3}$$
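The initial conditions for the vertical motion are
$$y(0) = 0, \qquad \dot{y}(0) = 0.$$
Equation (3.2), in the form
$$\ddot{y} + \kappa\dot{y} = -g$$
may be solved by multiplying through by $e^{\kappa t}$ (see Workbook 19) which enables us to write
$$\frac{d}{dt}\left(\dot{y}e^{\kappa t}\right) = -ge^{\kappa t}.$$
After integrating with respect to t twice,
$$y(t) = C + De^{-\kappa t} - \frac{gt}{\kappa}.$$
The initial conditions give
$$0 = C + D \quad\text{and}\quad 0 = -\kappa D - g/\kappa$$
which means that $D = -g/\kappa^2$, so $C = g/\kappa^2$ and
$$y(t) = \frac{g}{\kappa^2}\left(1 - e^{-\kappa t}\right) - \frac{gt}{\kappa}. \tag{3.4}$$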
Differentiating Equation (3.3), the horizontal component of velocity is
$$\dot{x}(t) = v_0e^{-\kappa t}. \tag{3.5}$$
The air resistance causes the horizontal component of velocity to decrease exponentially from its original value. Differentiating Equation (3.4), the upward vertical component of velocity is
$$\dot{y}(t) = \frac{g}{\kappa}\left(e^{-\kappa t} - 1\right). \tag{3.6}$$
For very large values of t, e−κt is near zero, so the vertical component of velocity is nearly constant
at −g/κ. The negative sign indicates that the object is moving downwards. g/κ represents the
terminal velocity for vertical motion under gravity for a particle subject to air resistance proportional
to velocity. Sketches of the variations of the components of velocity with time are shown in Figure
30.
Figure 30: Velocity components of an object launched horizontally and subject to resistance proportional to velocity [two sketches against time: the horizontal component of velocity decaying from its initial value, and the downward component of velocity approaching the terminal velocity]
By combining the components of velocity given in (3.5) and (3.6), it is possible to obtain the magnitude and direction of the velocity of an object projected horizontally at speed $v_0$ and subject to air resistance proportional to velocity. The magnitude is $\sqrt{(\dot{x}(t))^2 + (\dot{y}(t))^2}$ and the direction is $\tan^{-1}(\dot{y}(t)/\dot{x}(t))$.
Note that the expression for terminal velocity could be obtained directly from (3.2), by setting ÿ = 0.
Example 16
At the time that the parachute opens a parachutist of mass 100 kg is travelling
horizontally at 20 m s−1 and is 200 m above the ground. Calculate (a) the
parachutist’s height above the ground and (b) the magnitude and direction of the
parachutist’s velocity after 10 s assuming that air resistance during the first 100 m
of fall may be modelled as proportional to velocity with constant of proportionality
c = 100.
Solution
(a) Substituting m = 100, g = 9.81 and c = 100 in Equation (3.4) gives the distance
dropped during 10 s as 88.3 m. So the parachutist will be 111.7 m above the ground
after 10 s. The model is valid up to this distance.
(b) The vertical component of velocity after 10 s is given by Equation (3.6) i.e. 9.81 m s−1 .
The horizontal component of velocity is given by Equation (3.5) i.e. 9.08 × 10−4 m s−1 ,
which is practically negligible. So, after 10 s, the parachutist will be moving more or less
vertically downwards at 9.81 m s−1 .
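The arithmetic in Example 16 can be checked with a short script. This is only a sketch: Equations (3.3) to (3.6) are not reproduced here, so the code assumes they have the forms obtained by setting θ = 0 in (3.7) and (3.8) and differentiating with respect to time. The function name `linear_drag_horizontal_launch` is illustrative.

```python
import math

def linear_drag_horizontal_launch(t, v0, kappa, g=9.81):
    """Vertical displacement and velocity components at time t for a
    horizontal launch with resistance proportional to velocity.

    Assumed forms (theta = 0 in (3.7) and (3.8), then differentiated):
      y  = (g/kappa^2)(1 - e^{-kappa t}) - g t / kappa
      vx = v0 e^{-kappa t}
      vy = (g/kappa)(e^{-kappa t} - 1)
    """
    decay = math.exp(-kappa * t)
    y = (g / kappa**2) * (1.0 - decay) - g * t / kappa  # negative = below launch point
    vx = v0 * decay
    vy = (g / kappa) * (decay - 1.0)                    # negative = downwards
    return y, vx, vy

m, c, v0, t = 100.0, 100.0, 20.0, 10.0
y, vx, vy = linear_drag_horizontal_launch(t, v0, kappa=c / m)
height = 200.0 + y                         # height above the ground
speed = math.hypot(vx, vy)                 # magnitude of the velocity
angle_deg = math.degrees(math.atan2(vy, vx))  # direction relative to the horizontal
print(f"height {height:.1f} m, speed {speed:.2f} m/s, direction {angle_deg:.2f} deg")
```

The direction is measured from the horizontal, so a value close to −90° corresponds to motion that is almost vertically downwards.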
If the object is launched at some angle θ above the horizontal, then the initial conditions on velocity
are
ẋ(0) = v0 cos θ ẏ(0) = v0 sin θ
These lead to the following equations, replacing (3.3) and (3.4):
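$$x(t) = \frac{v_0\cos\theta}{\kappa}\left(1 - e^{-\kappa t}\right) \tag{3.7}$$
$$y(t) = \left(\frac{v_0\sin\theta}{\kappa} + \frac{g}{\kappa^2}\right)\left(1 - e^{-\kappa t}\right) - \frac{gt}{\kappa}. \tag{3.8}$$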
To obtain the trajectory of the object, (3.7) can be rearranged to give

$$1 - e^{-\kappa t} = \frac{\kappa x}{v_0\cos\theta} \quad\text{and}\quad t = -\frac{1}{\kappa}\ln\left(1 - \frac{\kappa x}{v_0\cos\theta}\right).$$
These can be substituted in (3.8) to give

$$y = x\left(\tan\theta + \frac{g}{\kappa v_0\cos\theta}\right) + \frac{g}{\kappa^2}\ln\left(1 - \frac{\kappa x}{v_0\cos\theta}\right). \tag{3.9}$$
Figure 31 compares predictions from this result with those predicted from the result obtained by
ignoring air resistance (Equations (3.1) and (3.2)). The effect of including air resistance is to change
the projectile trajectory from a parabola, symmetrical about the highest point, to an asymmetric
curve, resulting in reduced maximum range.
Figure 31: Predicted trajectories of an object projected at 45° with speed 40 m s−1 in the absence of air
resistance (solid line) and with air resistance proportional to velocity such that κ = 0.184 (broken line). Axes: height (m) against horizontal distance (m).
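The reduction in range can be estimated numerically from the trajectory equation (3.9). The following is a minimal sketch using the values quoted for Figure 31; the function names, the bisection bracket and the standard drag-free range formula v0² sin 2θ / g are assumptions introduced for illustration and are not taken from the text.

```python
import math

G = 9.81  # acceleration due to gravity, m s^-2

def height_with_drag(x, v0, theta, kappa):
    """Height from Equation (3.9) at horizontal distance x (linear resistance)."""
    vx0 = v0 * math.cos(theta)
    return (x * (math.tan(theta) + G / (kappa * vx0))
            + (G / kappa**2) * math.log(1.0 - kappa * x / vx0))

def range_with_drag(v0, theta, kappa, tol=1e-9):
    """Bisection for the landing point y(x) = 0 of Equation (3.9).

    Assumes y > 0 at x = 1 m and y < 0 just short of the limiting
    distance v0*cos(theta)/kappa, which holds for the Figure 31 values.
    """
    lo = 1.0
    hi = 0.999 * v0 * math.cos(theta) / kappa
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if height_with_drag(mid, v0, theta, kappa) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v0, theta, kappa = 40.0, math.radians(45.0), 0.184
range_drag = range_with_drag(v0, theta, kappa)
range_no_drag = v0**2 * math.sin(2.0 * theta) / G  # standard drag-free range
print(f"range with resistance:    {range_drag:.1f} m")
print(f"range without resistance: {range_no_drag:.1f} m")
```

Bisection is used here only because (3.9) cannot be solved for x in closed form; any root-finding method would serve.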
Quadratic resistance
Unfortunately, it is not often very accurate to model air resistance by a force that is simply proportional
to velocity. For a spherical object, a good approximation for the dependence of the air resistance
force vector R on the speed (v) and diameter (D) of the object is
$$R = (c_1 D + c_2 D^2 |v|)\,v \tag{3.10}$$
with $c_1 = 1.55 \times 10^{-4}$ and $c_2 = 0.22$ in SI units for air. As would be expected intuitively, the
bigger the sphere and the faster it is moving the greater the drag it will experience. If D and |v| are
very small then the second term in (3.10) can be neglected compared with the first and the linear
approximation is reasonable i.e.
$$R \simeq c_1 D v, \qquad D|v| \le 10^{-5}. \tag{3.11}$$
Note that c1 ≪ c2, so if D and |v| are not very small, for example a cricket ball (D = 0.07 m)
moving at 40 m s−1, the first term in (3.10) can be neglected compared with the second. This gives
rise to the quadratic approximation
$$R \simeq c_2 D^2 |v|\,v, \qquad 10^{-2} \le D|v| \le 1. \tag{3.12}$$
The ranges of validity of these approximations are shown graphically in Figure 32 for a sphere of
diameter 0.01 m. In general the linear approximation is accurate for small slow-moving objects and
the quadratic approximation is satisfactory for larger faster objects. The linear approximation is
similar to Stokes’ law (first stated in 1845):
|R| = 6πµr|v| (3.13)
where µ is the coefficient of viscosity of the fluid surrounding a sphere of radius r. According to
Stokes’ law, c1 = 3πµ. This gives c1 = 1.7 × 10−4 kg m−1 s−1 for air. Similarly, the quadratic
approximation is consistent with a relationship deduced by Prandtl (first stated in 1917) for a sphere:
$$|R| = 0.625\,\rho r^2 |v|^2 \tag{3.14}$$
where ρ is the density of the fluid. This implies that c2 = 0.625ρ/4 = 0.202 kg m−3 for air.
Figure 32: Resistive force (N), as a function of the product of diameter and speed (m2 s−1),
predicted by Equation (3.10) (solid line) and the approximations Equation (3.11) (broken line)
and Equation (3.12) (dash-dot line), for a sphere of diameter 0.01 m
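The comparison in Figure 32 can be reproduced numerically. The sketch below evaluates the magnitude of the force from (3.10) and its two approximations for a sphere of diameter 0.01 m; the function names are illustrative, and the chosen values of D|v| are simply sample points.

```python
C1 = 1.55e-4  # linear coefficient for air, SI units (value quoted in the text)
C2 = 0.22     # quadratic coefficient for air, SI units (value quoted in the text)

def drag_full(d, speed):
    """Magnitude of the resistive force from Equation (3.10)."""
    return (C1 * d + C2 * d**2 * speed) * speed

def drag_linear(d, speed):
    """Linear approximation, Equation (3.11)."""
    return C1 * d * speed

def drag_quadratic(d, speed):
    """Quadratic approximation, Equation (3.12)."""
    return C2 * d**2 * speed**2

d = 0.01  # sphere diameter in metres, as in Figure 32
for product in (1e-5, 1e-3, 1e-2, 1.0):
    speed = product / d
    full = drag_full(d, speed)
    err_lin = (full - drag_linear(d, speed)) / full
    err_quad = (full - drag_quadratic(d, speed)) / full
    print(f"D|v| = {product:g}: linear error {err_lin:.1%}, quadratic error {err_quad:.1%}")
```

Each approximation simply drops one of the two terms in (3.10), so its relative error is the dropped term divided by the full force.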
The mathematical complexity of the equations for projectile motion in 2D resulting from the quadratic
approximation is considerable. Consider an object with an initial horizontal velocity and the same
coordinate axes as before, but this time the resistive force is given by c|v|v (the quadratic approxi-
mation). For this case Newton’s second law gives
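$$m\mathbf{a} = m(\ddot{x}\,\mathbf{i} + \ddot{y}\,\mathbf{j}) = -c\dot{x}\sqrt{\dot{x}^2 + \dot{y}^2}\,\mathbf{i} - c\dot{y}\sqrt{\dot{x}^2 + \dot{y}^2}\,\mathbf{j} - mg\,\mathbf{j}.$$
The corresponding scalar differential equations are
$$m\ddot{x} = -c\dot{x}\sqrt{\dot{x}^2 + \dot{y}^2}$$
and
$$m\ddot{y} = -c\dot{y}\sqrt{\dot{x}^2 + \dot{y}^2} - mg.$$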
You should note that ẋ and ẏ appear in both equations and cannot be separated out. These differential
equations are coupled. (Ways of dealing with such coupled equations are introduced in Workbook 20.)
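Although the coupled equations cannot be separated analytically, they can be stepped forward in time numerically. The sketch below uses the classical fourth-order Runge-Kutta method, which is a swapped-in numerical technique and not a method described in this Section. The function names, the step size and the value c/m = 0.1 are assumptions chosen only for illustration.

```python
import math

G = 9.81  # acceleration due to gravity, m s^-2

def derivative(state, c_over_m):
    """Time derivative of (x, y, vx, vy) for quadratic resistance."""
    x, y, vx, vy = state
    speed = math.hypot(vx, vy)
    return (vx, vy, -c_over_m * vx * speed, -c_over_m * vy * speed - G)

def rk4_step(state, dt, c_over_m):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = derivative(state, c_over_m)
    k2 = derivative(shifted(k1, dt / 2), c_over_m)
    k3 = derivative(shifted(k2, dt / 2), c_over_m)
    k4 = derivative(shifted(k3, dt), c_over_m)
    return tuple(s + dt / 6 * (ka + 2 * kb + 2 * kc + kd)
                 for s, ka, kb, kc, kd in zip(state, k1, k2, k3, k4))

def simulate(vx0, vy0, c_over_m, t_end, dt=1e-3):
    """Integrate from the origin; returns the final (x, y, vx, vy)."""
    state = (0.0, 0.0, vx0, vy0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, c_over_m)
    return state

# Horizontal launch at 10 m/s with the hypothetical value c/m = 0.1 m^-1.
x, y, vx, vy = simulate(10.0, 0.0, c_over_m=0.1, t_end=20.0)
print(f"after 20 s: vx = {vx:.6f} m/s, vy = {vy:.4f} m/s")
```

After a long time the horizontal component has decayed and the vertical component settles at a constant downward speed, which is the value obtained by setting both accelerations to zero in the scalar equations.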
Task
Suppose that the academic in Example 1.5 screws up sheets of paper into spheres
of radius 0.03 m and mass 0.01 kg. Calculate the effect of linear air resistance on
the likelihood of the chosen trajectory entering the waste paper basket.
Your solution
Answer
Since D|v| = 4.75 × 0.06 = 0.285, the linear approximation for air resistance is not valid. If however
it is assumed that it is, then κ = c1 D/m = 1.55 × 10−4 × 0.06/0.01 = 9.3 × 10−4 . A plot of the
resulting trajectory according to Equation (3.9) is shown in the diagram below.
[Diagram: Predicted trajectory of paper balls with linear air resistance, plotted as height (m) against horizontal distance (m)]

With the stated assumptions, air resistance is predicted to have little or no effect on the trajectory
of the paper balls.
Vertical motion with quadratic resistance
Although it is not straightforward to model motion in 2D with resistance proportional to velocity
squared, it is possible to consider the motion of an object falling vertically under gravity experiencing
quadratic air resistance. In this case the equation of motion may be written in terms of the (vertical)
velocity (ẏ = v) as
$$m\frac{dv}{dt} = mg - cv^2.$$
This nonlinear differential equation can be solved by using separation of variables (Workbook 19). First
we rearrange the differential equation to give
$$\frac{dt}{dv} = \frac{-m/c}{-\dfrac{mg}{c} + v^2}.$$
Then we integrate both sides with respect to v, and write κ1 = c/m (note that this c is different
from the c used for linear air resistance) which yields
$$t + C = \frac{1}{2\sqrt{g\kappa_1}}\ln\left(\frac{a+v}{a-v}\right)$$
where $a = \sqrt{g/\kappa_1}$. If the object starts from rest C = 0, so
$$\frac{a+v}{a-v} = e^{2t\sqrt{g\kappa_1}}$$
and
$$v = a\,\frac{1 - e^{-2t\sqrt{g\kappa_1}}}{1 + e^{-2t\sqrt{g\kappa_1}}} = a\tanh\left(t\sqrt{g\kappa_1}\right). \tag{3.15}$$
Note that for t → ∞ this predicts that the terminal velocity $v_t = a = \sqrt{g/\kappa_1}$. This expression
for terminal velocity may be compared with that for linear air resistance (g/κ). So the quadratic
resistance model predicts a square root form for terminal velocity. Note that the expression for the
terminal velocity for vertical motion of a particle subject to resistance proportional to the square of
the velocity could be obtained from $m\,dv/dt = mg - cv^2$ by setting $dv/dt = 0$. If we write $\tau = v_t/g$ (note
that this has units of time), then Equations (3.6) and (3.15) may be written
$$v = v_t\left(1 - e^{-t/\tau}\right)$$
and
$$v = v_t\,\frac{1 - e^{-2t/\tau}}{1 + e^{-2t/\tau}}.$$
Using these expressions, it is possible to compare the variation of the ratio v/vt as a function of
time in units of τ as in Figure 33. The graph shows the intuitive result that a falling object subject
to quadratic resistance approaches its terminal velocity more rapidly than a falling object subject to
resistance proportional to velocity. For example, at t/τ = 5, v/vt is 0.993 with linear resistance and
0.9999 with quadratic resistance. Note however that the terminal velocities and the time steps used
in the graph are different.
Figure 33: Comparison of the variations in vertical velocities
for a falling object subject to linear and quadratic resistance (v/vt plotted against t/τ for the linear and quadratic cases)
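The two dimensionless expressions for v/vt are easy to tabulate. The following sketch evaluates both at a few values of t/τ; the function names are illustrative.

```python
import math

def ratio_linear(s):
    """v/vt for resistance proportional to velocity, at s = t/tau."""
    return 1.0 - math.exp(-s)

def ratio_quadratic(s):
    """v/vt for quadratic resistance, at s = t/tau; this equals tanh(s)."""
    e = math.exp(-2.0 * s)
    return (1.0 - e) / (1.0 + e)

for s in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"t/tau = {s:3.1f}: linear {ratio_linear(s):.4f}, quadratic {ratio_quadratic(s):.4f}")
```

For small t/τ both ratios are close to t/τ itself, which is why the two curves start out almost straight and close together.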
Task
Note that the curves in Figure 33 are very close to each other and almost straight
for small values of t/τ . Why should this be the case? As well as proposing an
intuitive explanation, consider the result of expanding the exponential term in (3.6)
in a Maclaurin power series.
Your solution
Answer
It is to be expected that, in the initial stages of motion when v and t are small, the gravitational
force will dominate over air resistance i.e. v ≈ −gt. A Maclaurin power series expansion of the
exponential term in (3.6) gives
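$$e^{-\kappa t} = 1 - \kappa t + \tfrac{1}{2}(\kappa t)^2 - \dots$$
So $e^{-\kappa t} - 1 = -\kappa t + \tfrac{1}{2}(\kappa t)^2 - \dots$ so $v \approx \dfrac{g}{\kappa} \times \left(-\kappa t + \tfrac{1}{2}(\kappa t)^2 - \dots\right)$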
If t is much smaller than 1/κ, then only the first term need be considered which gives v ≈ −gt.
Contents 35
Sets and Probability
35.1 Sets
35.2 Elementary Probability
35.3 Addition and Multiplication Laws of Probability
35.4 Total Probability and Bayes’ Theorem
Learning outcomes
In this Workbook you will learn about probability. In the first Section you will learn about
sets and how they may be combined together using the operations of union and
intersection. Then you will learn how to apply the notation of sets to the notion of
probability and learn about the fundamental laws of probability.

Sets 35.1
Introduction
If we can identify a property which is common to several objects, it is often useful to group them
together. Such a grouping is called a set. Engineers for example, may wish to study all components
of a production run which fail to meet some specified tolerance. Mathematicians may look at sets of
numbers with particular properties, for example, the set of all even numbers, or the set of all numbers
greater than zero. In this block we introduce some terminology that is commonly used to describe
sets, and practice using set notation. This notation will be particularly useful when we come to study
probability in Section 35.2.

Prerequisites
• have knowledge of basic algebra
Learning Outcomes
On completion you should be able to . . .
• state what is meant by a set
• use set notation
• explain the concepts of the intersection and union of two sets
• define what is meant by the complement of a set
• use Venn diagrams to illustrate sets
2 HELM (2006):
Workbook 35: Sets and Probability
1. Sets
A set is any collection of objects. Here, the word ‘object’ is used in its most general sense: an object
may be a diode, an aircraft, a number, or a letter, for example.
A set is often described by listing the collection of objects - these are the members or elements of
the set. We usually write this list of elements in curly brackets, and denote the full set by a capital
letter. For example,
C = {the resistors produced in a factory on a particular day}
D = {on, off}
E = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
The elements of set C, above, are the resistors produced in a factory on a particular day. These
could be individually labelled and listed but as the number is large it is not practical or
sensible to do this. Set D lists the two possible states of a simple switch, and the elements of set E
are the digits used in the decimal system.
Sometimes we can describe a set in words. For example,
‘A is the set of all odd numbers’.
Clearly all the elements of this set A cannot be listed.
Similarly,
‘B is the set of binary digits’ i.e. B = {0, 1}.
B has only two elements.
A set with a finite number of elements is called a finite set. B, C, D and E are finite sets. The set
A has an infinite number of elements and so is not a finite set. It is called an infinite set.
Two sets are equal if they contain exactly the same elements. For example, the sets {9, 10, 14} and
{10, 14, 9} are equal since the order in which elements are written is unimportant. Note also that
repeated elements are ignored. The set {2, 3, 3, 3, 5, 5} is equal to the set {2, 3, 5}.
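These properties can be tried out directly, since Python's built-in `set` type follows the same conventions. This is an illustrative sketch only; the variable names are not part of the text.

```python
first = {9, 10, 14}
second = {10, 14, 9}
with_repeats = {2, 3, 3, 3, 5, 5}   # repeated elements are stored once
binary_digits = {0, 1}

print(first == second)              # order of listing does not matter
print(with_repeats == {2, 3, 5})    # repeats are ignored
print(len(binary_digits))           # number of elements in a finite set
```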
Subsets
Sometimes one set is contained completely within another set. For example if X = {2, 3, 4, 5, 6} and Y =
{2, 3, 6} then all the elements of Y are also elements of X. We say that Y is a subset of X and
write Y ⊆ X.
Example 1
Given A = {0, 1, 2, 3}, B = {0, 1, 2, 3, 4, 5, 6} and C = {0, 1}, state which sets
are subsets of other sets.
Solution
A is a subset of B, that is A ⊆ B
C is a subset of B, that is C ⊆ B
C is a subset of A, that is C ⊆ A.
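The subset relations in Example 1 can be checked mechanically. In this sketch the Python operator `<=` between two sets plays the role of ⊆; the dictionary `sets` and the list `subset_pairs` are names introduced only for illustration.

```python
A = {0, 1, 2, 3}
B = {0, 1, 2, 3, 4, 5, 6}
C = {0, 1}

sets = {"A": A, "B": B, "C": C}
# Collect every pair (P, Q) of distinct sets for which P is a subset of Q.
subset_pairs = [(p, q) for p in sets for q in sets
                if p != q and sets[p] <= sets[q]]
for p, q in subset_pairs:
    print(f"{p} is a subset of {q}")
```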
Task
A factory produces cars over a five day period; Monday to Friday. Consider the
following sets,
(a) A = {cars produced from Monday to Friday}

(b) B = {cars produced from Monday to Thursday}
(c) C = {cars produced on Friday}
(d) D = {cars produced on Wednesday}
(e) E = {cars produced on Wednesday or Thursday}
State which sets are subsets of other sets.
Your solution
Answer
(a) B is a subset of A, that is, B ⊆ A.
(b) C is a subset of A, that is, C ⊆ A.
(c) D is a subset of A, that is, D ⊆ A.
(d) E is a subset of A, that is, E ⊆ A.
(e) D is a subset of B, that is, D ⊆ B.
(f) E is a subset of B, that is, E ⊆ B.
(g) D is a subset of E, that is, D ⊆ E.
The symbol ∈
To show that an element belongs to a particular set we use the symbol ∈. This symbol means ‘is a
member of’ or ‘belongs to’. The symbol ∉ means ‘is not a member of’ or ‘does not belong to’.
For example if X = {all even numbers} then we may write 4 ∈ X, 6 ∈ X, 7 ∉ X and 11 ∉ X.
The empty set and the universal set

Sometimes a set will contain no elements. For example, suppose we define the set K by
K = {all odd numbers which are divisible by 4}
Since there are no odd numbers which are divisible by 4, then K has no elements. The set with no
elements is called the empty set, and it is denoted by ∅.
On the other hand, the set containing all the objects of interest in a particular situation is called the
universal set, denoted by S. The precise universal set will depend upon the context. If, for example,
we are concerned only with whole numbers then S = {. . . , −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, . . . }.
If we are concerned only with the decimal digits then S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
The complement of a set

Given a set A and a universal set S we can define a new set, called the complement of A and
denoted by A′. The complement of A contains all the elements of the universal set that are not in
A.
Example 2
Given A = {2, 3, 7}, B = {0, 1, 2, 3, 4} and S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} state
(a) A′ (b) B′
Solution
(a) The elements of A′ are those which belong to S but not to A.
A′ = {0, 1, 4, 5, 6, 8, 9}
(b) B′ = {5, 6, 7, 8, 9}
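A complement is always taken relative to a universal set, so in code it can be sketched as a set difference. The names `A_complement` and `B_complement` are illustrative.

```python
S = set(range(10))        # universal set {0, 1, ..., 9}
A = {2, 3, 7}
B = {0, 1, 2, 3, 4}

A_complement = S - A      # elements of S that are not in A
B_complement = S - B      # elements of S that are not in B
print(sorted(A_complement))
print(sorted(B_complement))
```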
Sometimes a set is described in a mathematical way. Suppose the set Q contains all numbers which
are divisible by 4 and 7. We can write
Q = {x : x is divisible by 4 and x is divisible by 7}
The symbol : stands for ‘such that ’. We read the above as ‘Q is the set comprising all elements x,
such that x is divisible by 4 and by 7’.
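The ‘such that’ notation corresponds closely to a set comprehension. Since a computer cannot list an infinite set, the sketch below restricts x to the whole numbers from 1 to 100; that restriction is an assumption made purely for illustration.

```python
# Q restricted to 1..100: all x such that x is divisible by 4 and by 7.
Q = {x for x in range(1, 101) if x % 4 == 0 and x % 7 == 0}
print(sorted(Q))
```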
2. Venn diagrams
Sets are often represented pictorially by Venn diagrams (see Figure 1).
Figure 1 (Venn diagram showing the sets A, B, C and D)
Here A, B, C, D represent sets. The sets A, B have no items in common so are drawn as non-
intersecting regions whilst the sets C, D have some items in common so are drawn overlapping.
In a Venn diagram the universal set is represented by a rectangle and sets of interest by area regions
within this rectangle.
Example 3
Represent the sets A = {0, 1} and B = {0, 1, 2, 3, 4} using a Venn diagram.
Solution
The elements 0 and 1 are in set A, represented by the small circle in the diagram. The large circle
represents set B and so contains the elements 0,1,2,3 and 4. A suitable universal set in this case is
the set of all integers. The universal set is shown by the rectangle.
Note that A ⊆ B. This is shown in the Venn diagram by A being completely inside B.
Figure 2: The set A is contained completely within B
Task
Given A = {0, 1} and B = {2, 3, 4} draw Venn diagrams showing
(a) A and B (b) A′ (c) B′
Your solution
(a)
Answer
Note that A and B have no elements in common. This is represented pictorially in the Venn
diagram by circles which are totally separate from each other as shown in the diagram.
[Venn diagram: two separate circles, A containing 0 and 1, and B containing 2, 3 and 4]
Your solution
(b)
Answer
The complement of A is the set whose elements do not belong to A. The set A′ is shown shaded
in the diagram.
[Venn diagram: the region A′ outside the circle A is shaded]
The complement of A contains elements which are not in A.
Your solution
(c)
Answer
The set B′ is shown shaded in the diagram.
[Venn diagram: the region B′ outside the circle B is shaded]
3. The intersection and union of sets

Intersection
Given two sets, A and B, the intersection of A and B is a set which contains elements that are
common to both A and B. We write A ∩ B to denote the intersection of A and B. Mathematically
we write this as:
Key Point 1
Intersection of Sets
A ∩ B = {x : x ∈ A and x ∈ B}
This says that the intersection contains all the elements x such that x belongs to A and also x
belongs to B.
Note that A ∩ B and B ∩ A are identical. The intersection of two sets can be represented by a Venn
diagram as shown in Figure 3.
Figure 3: The overlapping area represents A ∩ B
Example 4
Given A = {3, 4, 5, 6}, B = {3, 5, 9, 10, 15} and C = {4, 6, 10} state
(a) A ∩ B, (b) B ∩ C and draw a Venn diagram representing these intersections.
Solution
(a) The elements common to both A and B are 3 and 5. Hence A ∩ B = {3, 5}
(b) The only element common to B and C is 10. Hence B ∩ C = {10}
Figure 4 (Venn diagram of the sets A, B and C)
Task
Given D = {a, b, c} and F = {the entire alphabet} state D ∩ F .
Your solution
Answer
The elements common to D and F are a, b and c, and so D ∩ F = {a, b, c}
Note that D is a subset of F and so D ∩ F = D.
The intersection of three or more sets is possible, and is the subject of the next Example.
Example 5
Given A = {0, 1, 2, 3}, B = {1, 2, 3, 4, 5} and C = {2, 3, 4, 7, 9} state
(a) A ∩ B (b) (A ∩ B) ∩ C (c) B ∩ C (d) A ∩ (B ∩ C)
Solution
(a) The elements common to A and B are 1, 2 and 3 so A ∩ B = {1, 2, 3}.

(b) We need to consider the sets (A ∩ B) and C. A ∩ B is given in (a). The elements
common to (A ∩ B) and C are 2 and 3. Hence (A ∩ B) ∩ C = {2, 3}.
(c) The elements common to B and C are 2, 3 and 4 so B ∩ C = {2, 3, 4}.
(d) We look at the sets A and (B ∩ C). The common elements are 2 and 3. Hence
A ∩ (B ∩ C) = {2, 3}.
Note from (b) and (d) that here (A ∩ B) ∩ C = A ∩ (B ∩ C).
The example illustrates a general rule. For any sets A, B and C it is true that
(A ∩ B) ∩ C = A ∩ (B ∩ C)
The position of the brackets is thus unimportant. They are usually omitted and we write A ∩ B ∩ C.
Suppose that sets A and B have no elements in common. Then their intersection contains no
elements and we say that A and B are disjoint sets. We express this as
A ∩ B = ∅
Recall that ∅ is the empty set. Disjoint sets are represented by separate area regions in the Venn
diagram.
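The results of Example 5 and the idea of disjoint sets can be checked with Python's `&` operator, which forms the intersection of two sets. This is a sketch; the sets X and Y are hypothetical examples of disjoint sets introduced for illustration.

```python
A = {0, 1, 2, 3}
B = {1, 2, 3, 4, 5}
C = {2, 3, 4, 7, 9}

left = (A & B) & C        # (A ∩ B) ∩ C
right = A & (B & C)       # A ∩ (B ∩ C)
print(sorted(left), sorted(right))

X = {0, 1}
Y = {2, 3, 4}
print(X & Y == set())     # an empty intersection means X and Y are disjoint
print(X.isdisjoint(Y))    # the same check using a built-in method
```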
Union
The union of two sets A and B is a set which contains all the elements of A together with all the
elements of B. We write A ∪ B to denote the union of A and B. We can describe the set A ∪ B
formally by:
Key Point 2
Union of Sets
A ∪ B = {x : x ∈ A or x ∈ B or both}
Thus the elements of the set A ∪ B are those quantities x such that x is a member of A or a member
of B or a member of both A and B. The deeply shaded areas of Figure 5 represent A ∪ B.
Figure 5: Venn diagrams (a) and (b), each with A ∪ B shaded
In Figure 5(a) the sets intersect, whereas in Figure 5(b) the sets have no region in common. We say
they are disjoint.
Example 6
Given A = {0, 1}, B = {1, 2, 3} and C = {2, 3, 4, 5} write down
(a) A ∪ B
(b) A ∪ C
(c) B ∪ C
Solution
(a) A ∪ B = {0, 1, 2, 3}
(b) A ∪ C = {0, 1, 2, 3, 4, 5}
(c) B ∪ C = {1, 2, 3, 4, 5}.
Recall that there is no need to repeat elements in a set. Clearly the order of the union is unimportant
so A ∪ B = B ∪ A.
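Example 6 can be reproduced with Python's `|` operator, which forms the union of two sets. The sketch below also confirms that the order of the union does not matter.

```python
A = {0, 1}
B = {1, 2, 3}
C = {2, 3, 4, 5}

print(sorted(A | B))      # A ∪ B
print(sorted(A | C))      # A ∪ C
print(sorted(B | C))      # B ∪ C
print(A | B == B | A)     # the union is unchanged if the order is swapped
```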
Task
Given A = {2, 3, 4, 5, 6}, B = {2, 4, 6, 8, 10} and C = {3, 5, 7, 9, 11} state
(a) A ∪ B (b) (A ∪ B) ∩ C (c) A ∩ B (d) (A ∩ B) ∪ C (e) A ∪ B ∪ C
Your solution
Answer
(a) A ∪ B = {2, 3, 4, 5, 6, 8, 10}
(b) We need to look at the sets (A ∪ B) and C. The elements common to both of these
sets are 3 and 5. Hence (A ∪ B) ∩ C = {3, 5}.
(c) A ∩ B = {2, 4, 6}
(d) We consider the sets (A ∩ B) and C. We form the union of these two sets to obtain
(A ∩ B) ∪ C = {2, 3, 4, 5, 6, 7, 9, 11}.
(e) The set formed by the union of all three sets will contain all the elements from all the
sets:
A ∪ B ∪ C = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
Exercises
1. Given a set A, its complement A′ and a universal set S, state which of the following expressions
are true and which are false.
(a) A ∪ A′ = S (b) A ∩ S = ∅ (c) A ∩ A′ = ∅
(d) A ∩ A′ = S (e) A ∪ ∅ = S (f) A ∪ ∅ = A
(g) A ∪ ∅ = ∅ (h) A ∩ ∅ = A (i) A ∩ ∅ = ∅
(j) A ∪ S = A (k) A ∪ S = ∅ (l) A ∪ S = S
2. Given A = {a, b, c, d, e, f }, B = {a, c, d, f, h} and C = {e, f, x, y} obtain the sets:
(a) A ∪ B (b) B ∩ C (c) A ∩ (B ∪ C)
(d) C ∩ (B ∪ A) (e) A ∩ B ∩ C (f) B ∪ (A ∩ C)
3. List the elements of the following sets:
(a) A = {x : x is odd and x is greater than 0 and less than 12}
(b) B = {x : x is even and x is greater than 19 and less than 31}
4. Given A = {5, 6, 7, 9}, B = {0, 2, 4, 6, 8} and S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} list the elements
of each of the following sets:
(a) A′ (b) B′ (c) A′ ∪ B′
(d) A′ ∩ B′ (e) A ∪ B (f) (A ∪ B)′
(g) (A ∩ B)′ (h) (A′ ∩ B)′ (i) (B′ ∪ A)′
What do you notice about your answers to (c),(g)?
What do you notice about your answers to (d),(f)?

5. Given that A and B are intersecting sets, i.e. are not disjoint, show on a Venn diagram the
following sets
(a) A′ (b) B′ (c) A ∪ B′ (d) A′ ∪ B′ (e) A′ ∩ B′
Answers
1.(a) T, (b) F, (c) T, (d) F, (e) F, (f) T, (g) F, (h) F, (i) T, (j) F, (k) F, (l) T.
2.(a) {a, b, c, d, e, f, h}, (b) {f }, (c) {a, c, d, e, f }, (d) {e, f }, (e) {f }, (f) {a, c, d, e, f, h}.
3.(a) {1, 3, 5, 7, 9, 11}, (b) {20, 22, 24, 26, 28, 30}.
4.(a) {0, 1, 2, 3, 4, 8}, (b) {1, 3, 5, 7, 9}, (c) {0, 1, 2, 3, 4, 5, 7, 8, 9}, (d) {1, 3},
(e) {0, 2, 4, 5, 6, 7, 8, 9}, (f) {1, 3}, (g) {0, 1, 2, 3, 4, 5, 7, 8, 9}, (h) {1, 3, 5, 6, 7, 9},
(i) {0, 2, 4, 8}.
5. [Five Venn diagrams, labelled (a) to (e), each showing intersecting sets A and B within the universal set S]
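The pattern in the answers to Exercise 4, parts (c), (g) and (d), (f), can be confirmed with a short script. The two identities it checks are usually known as De Morgan's laws, a name not used in this Section. The helper `complement` is introduced only for illustration.

```python
S = set(range(10))
A = {5, 6, 7, 9}
B = {0, 2, 4, 6, 8}

def complement(X, universal=S):
    """Elements of the universal set that are not in X."""
    return universal - X

# (c) and (g): the union of the complements equals the complement of the intersection.
print(complement(A) | complement(B) == complement(A & B))
# (d) and (f): the intersection of the complements equals the complement of the union.
print(complement(A) & complement(B) == complement(A | B))
```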
Elementary
Probability 35.2
Introduction
Probability is about the study of uncertainty. Engineers are expected to design and produce systems
which are both useful and reliable. Essentially we are dealing with situations where ‘chance’ is at
work and probability theory gives us the theoretical underpinning necessary for a full understanding
of any experimental results we observe in practice. Probability theory also gives us the tools to set up
mathematical models of systems and processes which are affected by random occurrences or ‘chance’.
In fact the study of probability enables engineers to discuss the reliability of the processes they use
and the systems they produce in terms that other engineers, scientists and designers can understand.
It is worth noting that ‘chance’ is taken to be responsible for variations in simple manufactured
products such as screws, bolts, and light bulbs as well as complex products such as cars, ships and
aircraft. In each of these products, small chance variations in raw materials and production processes
may have a substantial effect on a product.

Prerequisites
• understand the ideas of sets and subsets
Learning Outcomes
• explain the terms ‘random experiment’ and ‘event’
• calculate the probability of an event occurring
• calculate the probability that an event does not occur
1. Introductory probability
Probability as an informal idea is something you will have been familiar with for a long time. In
conversation with friends, you must have used sentences such as
• ‘It might start raining soon’
• ‘I might be lucky and pass all my examinations’
• ‘It is very unlikely that my team will not win the Premiership this year’
• ‘Getting a good degree will improve my chances of getting a good job’
Essentially, when you are talking about whether some event is likely to happen, you are using the
concept of probability. In reality, we need to agree on some terminology so that misunderstanding
may be avoided.
Terminology
To start with there are four terms − experiment, outcome, event and sample space − that need
formal definition. There will, of course, be others as you progress through this Workbook.
1. Experiment: - an activity with an observable result, or set of results, for example
(a) tossing a coin, the result being a Head or a Tail

(b) testing a component, the result being a defective or non-defective component
(c) maximum speed testing of standard production cars;
(d) testing to destruction armour plating intended for use on tanks.
Some of the experiments outlined above have a very limited set of results (tossing a coin)
while others (destruction testing) may give a widely variable set of results. Also it is worth
noting that destruction testing is not appropriate for all products. Companies manufacturing
say trucks or explosives could not possibly test to destruction on a large scale - they would
have little or nothing left to market!
2. Outcome - an outcome is simply an observable result of an experiment, for example
(a) tossing a coin, the possible outcomes are Heads or Tails

(b) testing a component, the outcome being a defective or non-defective component
(c) maximum speed testing of standard production cars, the outcomes being a set of numbers
representing the maximum speeds of a set of vehicles
(d) testing to destruction armour plating intended for use on tanks, the outcomes might be
(for example) the numbers of direct hits sustained before destruction.
3. Event - this is just an outcome or set of outcomes of an experiment that is of interest to the
experimenter.
4. Sample Space - a sample space is the set of all possible outcomes of an experiment.
For example, if we throw a die then the sample space is {1, 2, 3, 4, 5, 6} and two possible events are
(a) a score of 3 or more, represented by the set: {3, 4, 5, 6}
(b) a score which is even, represented by the set: {2, 4, 6}.
Everyday examples include games of chance.
Example 7
Obtain the sample space of the experiment throwing a single coin.
Solution
Consider the experiment of throwing a coin which can land Heads up (H) or Tails up (T ). We list
the outcomes as a set {H, T } − the order being unimportant. {H, T } is the sample space. On
any particular throw of a coin, Heads or Tails are equally likely to occur. We say that, for a fair
coin, H and T are equally likely outcomes.
If the sample space can be written in the form of a list (possibly infinite) then it is called a discrete
sample space (e.g. number of tosses of a fair coin before Heads occurs). If this is not possible then
it is called a continuous sample space (e.g. positions where shells land in a tank battle).
Task
List the equally likely outcomes to the experiments:
(a) throwing a fair die with six faces labelled 1 to 6
[Note: ‘die’ is the singular of ‘dice’, although most people use ‘dice’ instead.]
(b) throwing three fair coins.
Your solution
Answer
(a) {1, 2, 3, 4, 5, 6} (b) {TTT, TTH, THT, THH, HTT, HTH, HHT, HHH}
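Listing outcomes by hand becomes error-prone as the number of coins grows. As a sketch, the standard library function `itertools.product` can generate every ordered outcome; the variable names are illustrative.

```python
from itertools import product

die_outcomes = list(range(1, 7))
# Every ordered sequence of three results, each being T or H.
coin_outcomes = ["".join(throw) for throw in product("TH", repeat=3)]

print(die_outcomes)
print(coin_outcomes)
print(len(coin_outcomes))   # 2 * 2 * 2 outcomes
```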
Task
For the following list of experiments, list (if possible) a suitable sample space. If
you cannot write out a suitable sample space, describe one in words.
(a) Test a light switch

(b) Count the daily traffic accidents in Loughborough involving cyclists
(c) Measure the tensile strength of small gauge steel wire
(d) Test the maximum current carrying capacity of household mains cabling
(e) Test the number of on-off switchings that a new type of fluorescent
tube will cope with before failure
(f) Pressure test an underwater TV camera.
Your solution
Answer
Sample spaces might be
(a) {works, fails}

(b) {0,1, 2, 3, . . . }, hopefully a small upper limit!
(c) {Suitable continuous range (0 → ) depending on the wire}
(d) {Suitable continuous range (0 → ) depending on the type of cable}
(e) {0,1, 2, 3, . . . . }, hopefully a high upper limit!
(f) {Suitable continuous range (0 → ) }
Example 8
A car manufacturer offers certain options on its family cars. Customers may order:
(a) either automatic gearboxes or manual gearboxes

(b) either sunroof or air-conditioning
(c) either steel wheels or alloy wheels
(d) either solid colour paint or metallic paint
Find the number of outcomes in the sample space of options that it is possible to
order and represent them using a suitable diagram.
Solution
A suitable diagram is shown in Figure 6. The diagram makes it easy to find the number of outcomes
simply by counting. It also points the way to a formula for calculating the number of outcomes.
Figure 6: Tree diagram (levels: auto or manual; sunroof or air-conditioning; steel or alloy wheels; solid or metallic paint)

In this case there is a total of 16 outcomes in the sample space of options. Note that each of the
four options offers the customer two choices. This implies that there are
2 × 2 × 2 × 2 = 16
options in total.
Diagrams such as the one above are called tree diagrams. They are only suitable in simple situations.
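When a tree diagram would be too large to draw, the same sample space can be generated programmatically. The sketch below forms every combination of the four options in Example 8 using `itertools.product`; the option labels are shortened for illustration.

```python
from itertools import product

options = [
    ("automatic", "manual"),
    ("sunroof", "air-conditioning"),
    ("steel wheels", "alloy wheels"),
    ("solid paint", "metallic paint"),
]

# One outcome is one choice from each of the four options.
outcomes = list(product(*options))
print(len(outcomes))
print(outcomes[0])
```

The count is the product of the number of choices at each level of the tree, which is the formula the diagram points towards.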
Events
As we have already noted a collection of some or all of the outcomes of an experiment is called an
event. So an event is a subset of the sample space. For example, if we throw a die then the sample
space is {1, 2, 3, 4, 5, 6} and two possible events are
(a) a score of 3 or more, represented by the set: {3, 4, 5, 6}
(b) a score which is even, represented by the set: {2, 4, 6}.
Example 9
Two coins are thrown. List the ordered outcomes for the event when just one Tail
is obtained.
Solution
{H, T}, {T, H}
Note that here the order does matter unlike for sets in general.
Task
Three coins are thrown. List the ordered outcomes which belong to each of the
following events.
(a) two Tails are obtained
(b) at least two Tails are obtained
(c) at most two Tails are obtained
State the relationship between (a) and (b) and that between (a) and (c).
Your solution
Answer
(a) {TTH, THT, HTT}
(b) {TTT, TTH, THT, HTT}
(c) {TTH, THT, HTT, THH, HTH, HHT, HHH}
(a) is a subset of (b) and (a) is also a subset of (c).
Task
A new type of paint to be used in the manufacture of garden equipment is tested
for impact shock resistance to damage and scratch resistance to damage. The
results (50 samples) are as follows
                              Shock Resistance
                              Good    Poor
Scratch Resistance    Good     20      15
                      Poor     12       3
If A is the event {High Shock Resistance} and B is the event
{High Scratch Resistance}, describe the following events and determine the
number of samples in each event.
(a) A ∪ B (b) A ∩ B (c) A′ (d) B′
Your solution
Answer
(a) The event A ∪ B consists of those samples which have either good shock or good
scratch resistance (or both).
n(A ∪ B) = 47.
(b) The event A ∩ B consists of those samples which have both good shock and good scratch
resistance.
n(A ∩ B) = 20.
(c) The event A′ consists of those samples which do not have good shock resistance.
n(A′) = 18.
(d) The event B′ consists of those samples which do not have good scratch resistance.
n(B′) = 15.
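The counts in this answer can be obtained by summing the relevant cells of the table. In the sketch below each key of the dictionary `counts` is a pair (shock resistance, scratch resistance); this layout and the helper `n` are assumptions made for illustration.

```python
# (shock resistance, scratch resistance) -> number of samples
counts = {
    ("good", "good"): 20,
    ("poor", "good"): 15,
    ("good", "poor"): 12,
    ("poor", "poor"): 3,
}

def n(condition):
    """Number of samples whose (shock, scratch) pair satisfies the condition."""
    return sum(k for (shock, scratch), k in counts.items() if condition(shock, scratch))

n_union = n(lambda shock, scratch: shock == "good" or scratch == "good")          # A ∪ B
n_intersection = n(lambda shock, scratch: shock == "good" and scratch == "good")  # A ∩ B
n_not_A = n(lambda shock, scratch: shock != "good")                               # A′
n_not_B = n(lambda shock, scratch: scratch != "good")                             # B′
print(n_union, n_intersection, n_not_A, n_not_B)
```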
Complement
We have met the complement before (Section 35.1 page 5) in relation to sets. We consider it again
here in relation to sample spaces and events. The complement of an event is the set of outcomes
which are not members of the event.
For example, the experiment of throwing a 6-faced die has sample space S = {1, 2, 3, 4, 5, 6}.
The event “score of 3 or more is obtained” is the set {3, 4, 5, 6}.
The complement of this event is {1, 2} which can be described in words as “score of 3 or more is
not obtained” or “score of 1 or 2 is obtained”.
The event: “even score is obtained” is the set {2, 4, 6}.

The complement of this event is {1, 3, 5} or, in words “ even score is not obtained” or “odd score
is obtained”.
In the last but one Task concerning tossing three coins:

• the complement of event (a) is {TTT, THH, HTH, HHT, HHH},
• the complement of event (b) is {THH, HTH, HHT, HHH}
• the complement of event (c) is {TTT}.
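Because an event is a subset of the sample space, its complement can be found as a set difference. The following sketch rebuilds the three-coin events and their complements; the variable names are illustrative.

```python
from itertools import product

S = {"".join(throw) for throw in product("HT", repeat=3)}   # sample space

exactly_two_tails = {o for o in S if o.count("T") == 2}     # event (a)
at_least_two_tails = {o for o in S if o.count("T") >= 2}    # event (b)
at_most_two_tails = {o for o in S if o.count("T") <= 2}     # event (c)

for name, event in [("(a)", exactly_two_tails),
                    ("(b)", at_least_two_tails),
                    ("(c)", at_most_two_tails)]:
    print(name, sorted(S - event))   # complement of the event
```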
Task
State, in words, what are the complements of each of the following events in
relation to the experiment of throwing three coins (avoid using the word not):
(a) two Heads are obtained (b) at least two Heads are obtained (c) at most
two Heads are obtained.
Your solution
Answer
(a) no Heads, one Head or three Heads
(b) no Heads or one Head
(c) three Heads
Notation: It is customary to use a capital letter to denote an event. For example, A = {two Heads
are thrown}. The complementary event is denoted A′.
Hence, in the case where A = {at least two Heads are thrown}, A′ is the event {fewer than two
Heads are thrown}.
2. Definitions of probability
Relative frequency applied to probability

Consider the experiment of throwing a single coin many times.
Suppose we throw a coin 10 times and obtain six Heads and four Tails; does this suggest that the
coin is biased? Clearly not! What about the case when we obtain 9 Heads and 1 Tail?
We conducted an experiment in which a coin was thrown 100 times and the result recorded each
time as 1 if a Head appeared face up and 0 if a Tail appeared. In Figure 7 we have plotted the
average score r/n, where r is the number of Heads and n is the total number of throws, against n for
n = 10, 20, . . . , 100. The quantity r/n is called the relative frequency of Heads.
[Figure 7: the relative frequency r/n plotted against n for n = 10, 20, . . . , 100]
As n increases the relative frequency settles down near the value 1/2. This is an experimental estimate
of the probability of throwing a Head with this particular coin. Note that when n = 50 this estimate
was 0.49 and when n = 100 this estimate was 0.51. When we repeated the whole experiment again,
the value of r/n when n = 100 was 0.46. Hence the use of the word estimate. Normally, as the
number of trials is increased the estimate tends to settle down but this is not certain to occur.
Theoretically, the probability of obtaining a Head when a fair coin is thrown is 1/2. Experimentally,
we expect the relative frequency to approach 1/2 as n increases.
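The coin experiment can be imitated on a computer. The following is a minimal Python sketch and not the experiment reported above: the function name relative_frequencies, the seed value and the use of Python's random module to represent a fair coin are all assumptions made for illustration, so the printed values will not match the figures quoted in the text.

```python
import random

def relative_frequencies(n_throws=100, step=10, seed=0):
    """Simulate throws of a fair coin and return [(n, r/n)] for n = step, 2*step, ..."""
    rng = random.Random(seed)   # seeded so that the same run can be repeated
    heads = 0                   # r, the number of Heads so far
    results = []
    for n in range(1, n_throws + 1):
        heads += rng.randint(0, 1)   # score 1 for a Head, 0 for a Tail
        if n % step == 0:
            results.append((n, heads / n))
    return results

for n, freq in relative_frequencies():
    print(n, round(freq, 2))
```

Changing the seed plays the part of repeating the whole experiment: the relative frequency at n = 100 will usually differ from run to run, which is why it is only an estimate.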
Equi-probable spaces and the principle of equally likely outcomes

An equi-probable space is a sample space in which the chance that any one sample point occurs is
equal to the chance that any other sample point occurs. Whether a sample space is an equi-probable
space is usually determined by inspection or logic.
(a) Tossing a coin: the sample space S is, using an obvious notation:
S = {H, T }
H and T are the two simple outcomes and are equally likely to occur.
(b) Rolling a die: a sample space is:
S = {1, 2, 3, 4, 5, 6}
Each number from 1 to 6 is a simple outcome and is equally likely to occur.

(c) Tossing three coins: a sample space comprised of simple outcomes is:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Each of the eight outcomes stated is equally likely to occur.

(d) Counting the number of heads when three coins are tossed. Here S = {3 Heads, 2
Heads, 1 Head, 0 Heads}. This is not an equi-probable space and the outcomes are not
equally likely. (For example: the event {2 Heads} is the union of three simple events
{HHT, HTH, THH} so must occur more often than the event {HHH}.)
Of the sample spaces above all are equi-probable spaces except for (d).
Key Point 3
The Principle of Equally Likely Outcomes
This states that each simple outcome in an equi-probable space is equally likely to occur. This
principle enables us to deduce the probabilities that simple events (and hence more complicated
events which are combinations of simple events) occur.
Notation
If A is an event associated with a sample space S then the probability of A occurring is denoted by
P(A).
Referring to the examples above we may immediately deduce that
(a) P{H} = P{T} = 1/2
(b) P{1} = P{2} = P{3} = P{4} = P{5} = P{6} = 1/6
(c) P{HHH} = P{HHT} = P{HTH} = P{THH} = P{HTT} = P{THT} = P{TTH} = P{TTT} = 1/8
Definition
We can now define probability using the Principle of Equally Likely Outcomes as follows:
If a sample space S consists of n simple outcomes which are equally likely and an event A consists
of m of those simple outcomes, then
P(A) = m/n = (the number of simple outcomes in A) / (the number of simple outcomes in S)
It follows from this definition that 0 ≤ P(A) ≤ 1.
• If P(A) = 1 we say that the event A is certain because A is identical to S.
• If P(A) = 0 we say that the event A is impossible because A is empty.
The set with no outcomes in it is called the empty set and written ∅; therefore P(∅) = 0.
Task
For each of the following events A, B, C, list and count the number of outcomes
it contains and hence calculate the probability of A, B or C occurring.
(a) A = “throwing 3 or higher with one die”,

(b) B = “throwing exactly two Heads with three coins”,
(c) C = “throwing a total score of 14 with two dice”.
Your solution
Answer
(a) There are six possible equally likely outcomes of the experiment and four of them, {3, 4, 5, 6},
constitute the event A; hence P(A) = 4/6 = 2/3.
(b) There are eight equally likely outcomes of which three, {HHT, HTH, THH}, are elements
of B; hence P(B) = 3/8.
(c) It is impossible to throw a total higher than 12 so that C = ∅ and P(C) = 0.
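The counting definition of probability lends itself to a direct check by listing outcomes. The sketch below is an illustration only: the helper name probability and the way the sample spaces are built are assumptions, and the approach is valid only when every listed outcome is equally likely.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """Return P(A) = m/n for an equi-probable sample space.

    `event` is a function that returns True for the outcomes belonging to A.
    """
    outcomes = list(sample_space)
    m = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(m, len(outcomes))

one_die = range(1, 7)
three_coins = ["".join(c) for c in product("HT", repeat=3)]   # HHH, HHT, ..., TTT
two_dice = list(product(range(1, 7), repeat=2))               # 36 ordered pairs

p_a = probability(lambda score: score >= 3, one_die)
p_b = probability(lambda coins: coins.count("H") == 2, three_coins)
p_c = probability(lambda pair: sum(pair) == 14, two_dice)

print(p_a, p_b, p_c)  # 2/3 3/8 0
```

Exact fractions are used rather than decimals so that the results can be compared directly with the answers above.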
Not surprisingly, the probabilities of an event A and its complement are related. The probability of
the event A′ is easily found from the identity
(number of outcomes in A)/(total number of outcomes) + (number of outcomes not in A)/(total number of outcomes) ≡ 1
so that P(A) + P(A′) ≡ 1
Key Point 4
The Complement Rule
P(A′) = 1 − P(A)
In words:
The probability of the complement of A occurring
is equal to 1 minus the probability of A occurring.
Task
For the events in (a) and (b) of the previous Task find P(A′) and P(B′) and
describe in words what A′ and B′ are in this case.
Your solution
Answer
(a) P(A) = 2/3 so that P(A′) = 1/3.
A′ is the event of throwing a score of less than 3 on one die.
(b) P(B) = 3/8 so that P(B′) = 5/8.
B′ is the event of throwing no Heads, exactly one Head or exactly three Heads with three
coins.
The use of the event A′ can sometimes simplify the calculation of the probability P(A). For example,
suppose that two dice are thrown and we require the probability of the event
A: that we obtain a total score of at least four.
There are many combinations that produce a total score of at least four; however there are only 3
combinations that produce a total score of two or three which is the complementary event to the
one of interest. The event A′ = {(1, 1), (1, 2), (2, 1)} (where we use an obvious notation of stating
the score on the first die followed by the score on the second die) is the complement of A.
Now P(A′) = 3/36 since there are 6 × 6 possible combinations in throwing two dice. Thus
P(A) = 1 − P(A′) = 1 − 3/36 = 33/36 = 11/12.
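As a check on the two-dice calculation, the 36 outcomes can be listed and counted both directly and through the complement. This is a sketch for illustration; the names throws and prob are assumptions and not part of the workbook.

```python
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))   # (score on die one, score on die two)

def prob(event):
    """Probability of an event over the 36 equally likely throws."""
    return Fraction(sum(1 for t in throws if event(t)), len(throws))

p_direct = prob(lambda t: sum(t) >= 4)      # counts the 33 outcomes in A
p_complement = prob(lambda t: sum(t) < 4)   # counts the 3 outcomes in the complement of A

print(p_direct, 1 - p_complement)  # 11/12 11/12
```

Both routes give the same value, but the complement needs only three outcomes to be identified by hand.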
Task
Find the probability of obtaining a total score of at least five when three dice are
thrown. Hint: identify A and A′, then calculate P(A′), then P(A).
Your solution
Answer
There are 6 × 6 × 6 = 216 possible outcomes. If, for example, (1, 3, 6) denotes the scores of 1 on
die one, 3 on die two and 6 on die three and if A is the event ‘a total score of five or more’ then A′
is the event ‘a total score of less than 5’ i.e.
A′ = {(1, 1, 1), (2, 1, 1), (1, 2, 1), (1, 1, 2)}
There are four outcomes in A′ and hence P(A′) = 4/216 so that P(A) = 212/216.
Exercises
1. For each of the following experiments, state whether the variable is discrete or continuous. In
each case state the sample space.
(a) The number of defective items in a batch of twenty is noted.

(b) The weight, in kg, of lubricating oil drained from a machine is determined using a spring
balance.
(c) The natural logarithm of the weight, in kg, according to a spring balance, of lubricating
oil drained from a machine, is noted.
2. An experiment consists of throwing two four-faced dice (regular tetrahedra) with faces labelled
1, 2, 3, 4.
(a) Write down the sample space of this experiment.

(b) If A is the event ‘total score is at least 4’ list the outcomes belonging to A′.
(c) If each die is fair find the probability that the total score is at least 6 when the two dice
are thrown. What is the probability that the total score is less than 6?
(d) What is the probability that a double: i.e. (1, 1), (2, 2), (3, 3), (4, 4) will not be thrown?
(e) What is the probability that a double is not thrown and the score is less than 6?
3. A lot consists of 10 good articles, 4 articles with minor defects and 2 with major defects. One
article is chosen at random from the lot. Find the probability that:
(a) it has no defects,

(b) it has no major defects,
(c) it is either good or has major defects.
4. Propeller shafts for marine applications are inspected to ensure that they satisfy both diameter
requirements and surface finish requirements. The results of 400 inspections are as follows:
                        Diameter Requirements
                        Good    Poor
Surface Finish  Good    200     50
Surface Finish  Poor    80      70
(a) What is the probability that a shaft selected at random satisfies the surface finish require-
ments?
(b) What is the probability that a shaft selected at random satisfies both diameter and surface
finish requirements?
(c) What is the probability that a shaft selected at random satisfies either the diameter or
the surface finish requirements?
(d) What is the probability that a shaft selected at random satisfies neither the diameter nor
the surface finish requirements?
Answers
1. (a) The variable is discrete. The sample space is {0, 1, 2, . . . , 20}.

(b) The variable is continuous. The sample space is the set of real numbers x such that
0 ≤ x < ∞.
(c) The variable is continuous. The sample space is the set of real numbers x such that
−∞ < x < ∞.
2. (a) S = {(1, 1), (1, 2), (1, 3), (1, 4)
(2, 1), (2, 2), (2, 3), (2, 4)
(3, 1), (3, 2), (3, 3), (3, 4)
(4, 1), (4, 2), (4, 3), (4, 4)}
(b) A′ = {(1, 1), (1, 2), (2, 1)}

(c) The outcomes in the event are {(2, 4), (3, 3), (3, 4), (4, 2), (4, 3), (4, 4)} so the probability
of this event occurring is 6/16 = 3/8. The probability of the complement event is 1 − 3/8 = 5/8.
(d) The probability of a double occurring is 4/16 so the probability of the complement (i.e.
double not thrown) is 1 − 4/16 = 3/4.
(e) Here, consider the sample space in (a). If the doubles and those outcomes with a score
of 6 or more are removed we are left with the event:
{(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (3, 1), (3, 2), (4, 1)}.
Hence the probability of this event occurring is 8/16 = 1/2.
3. Let G be the event ‘article is good’, Mn be the event ‘article has minor defect’ and Mj be
the event ‘article has major defect’.
(a) Here we require P(G). Obviously P(G) = 10/16 = 5/8
(b) We require P(Mj′) = 1 − P(Mj) = 1 − 2/16 = 7/8
(c) The event we require is the complement of the event Mn.
Since P(Mn) = 4/16 = 1/4 we have P(Mn′) = P(G or Mj) = 1 − 1/4 = 3/4.
Equivalently P(G) + P(Mj) = 10/16 + 2/16 = 12/16 = 3/4
4. (a) 250/400 = 0.625 (b) 200/400 = 0.5 (c) 330/400 = 0.825 (d) 70/400 = 0.175
Addition and
Multiplication Laws
of Probability 35.3

Introduction
When we require the probability of two events occurring simultaneously or the probability of one or
the other or both of two events occurring then we need probability laws to carry out the calculations.
For example, if a traffic management engineer looking at accident rates wishes to know the probability
that cyclists and motorcyclists are injured during a particular period in a city, he or she must take
into account the fact that a cyclist and a motorcyclist might collide. (Both events can happen
simultaneously.)

Prerequisites
Before starting this Section you should . . .
• understand the ideas of sets and subsets
• understand the concepts of probability and events

Learning Outcomes
On completion you should be able to . . .
• state and use the addition law of probability
• define the term independent events
• state and use the multiplication law of probability
• understand and explain the concept of conditional probability
Section 35.3: Addition and Multiplication Laws of Probability
1. The addition law
As we have already noted, the sample space S is the set of all possible outcomes of a given experiment.
Certain events A and B are subsets of S. In the previous Section we defined what was meant by
P(A), P(B) and their complements in the particular case in which the experiment had equally likely
outcomes.
Events, like sets, can be combined to produce new events.
• A ∪ B denotes the event that event A or event B (or both) occur when the experiment is
performed.
• A ∩ B denotes the event that both A and B occur together.

In this Section we obtain expressions for determining the probabilities of these combined events,
which are written P(A ∪ B) and P(A ∩ B) respectively.
Types of events
There are two types of events you will need to be able to identify and work with: mutually exclusive
events and independent events. (We deal with independent events in subsection 3.)
Mutually exclusive events

Mutually exclusive events are events that by definition cannot happen together. For example, when
tossing a coin, the events ‘head’ and ‘tail’ are mutually exclusive; when testing a switch ‘operate’
and ‘fail’ are mutually exclusive; and when testing the tensile strength of a piece of wire, ‘hold’ and
‘snap’ are mutually exclusive. In such cases, the probability of both events occurring together must
be zero. Hence, using the usual set theory notation for events A and B, we may write:
P(A ∩ B) = 0, provided that A and B are mutually exclusive events
Task
Decide which of the following pairs of events (A and B) arising from the experi-
ments described are mutually exclusive.
(a) Two cards are drawn from a pack
A = {a red card is drawn}

B = {a picture card is drawn}
(b) The daily traffic accidents in Loughborough involving pedal cyclists and
motor cyclists are counted
A = {three motor cyclists are injured in collisions with cars}

B = {one pedal cyclist is injured when hit by a bus}
(c) A box contains 20 nuts. Some have a metric thread, some have a
British Standard Fine (BSF) thread and some have a British Standard
Whitworth (BSW) thread.
A = {first nut picked out of the box is BSF}

B = {second nut picked out of the box is metric }
Your solution
Answer
(a) A and B are not mutually exclusive.
(b) A and B are mutually exclusive.
(c) A and B are not mutually exclusive.
Key Point 5
The Addition Law of Probability - Simple Case
If two events A and B are mutually exclusive then
P(A ∪ B) = P(A) + P(B)
Key Point 6
The Addition Law of Probability - General Case
If two events are A and B then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
If A ∩ B = ∅, i.e. A and B are mutually exclusive, then P(A ∩ B) = P(∅) = 0, and this general
expression reduces to the simpler case.
This rule can be extended to three or more events, for example:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
Example 10
Consider a pack of 52 playing cards. A card is selected at random. What is the
probability that the card is either a diamond or a ten?
Solution
If A is the event {a diamond is selected} and B is the event {a ten is selected} then obviously
P(A) = 13/52 and P(B) = 4/52. The intersection event A ∩ B consists of only one member, the ten
of diamonds, which gets counted twice; hence P(A ∩ B) = 1/52.
Therefore P(A ∪ B) = 13/52 + 4/52 − 1/52 = 16/52.
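The general addition law in Example 10 can be confirmed by building the pack and counting. The sketch below is illustrative only; the rank and suit labels and the helper name P are assumptions chosen for readability.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))                    # 52 (rank, suit) pairs

A = {card for card in deck if card[1] == "diamonds"}  # a diamond is selected
B = {card for card in deck if card[0] == "10"}        # a ten is selected

def P(event):
    """Probability of an event when one card is selected at random."""
    return Fraction(len(event), len(deck))

lhs = P(A | B)                   # counts each card in the union once
rhs = P(A) + P(B) - P(A & B)     # subtracts the ten of diamonds counted twice

print(lhs, rhs)  # 4/13 4/13
```

The value 4/13 is 16/52 in lowest terms. Python's set operators | and & play the roles of ∪ and ∩.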
Task
A bag contains 20 balls, 3 are coloured red, 6 are coloured green, 4 are coloured
blue, 2 are coloured white and 5 are coloured yellow. One ball is selected at
random. Find the probabilities of the following events.
(a) the ball is either red or green

(b) the ball is not blue
(c) the ball is either red or white or blue. (Hint: consider the complementary event.)
Your solution
Answer
Note that a ball has only one colour, designated by the letters R, G, B, W, Y.
(a) P(R ∪ G) = P(R) + P(G) = 3/20 + 6/20 = 9/20.
(b) P(B′) = 1 − P(B) = 1 − 4/20 = 16/20 = 4/5.
(c) The complementary event is G ∪ Y, P(G ∪ Y) = 6/20 + 5/20 = 11/20.
Hence P(R ∪ W ∪ B) = 1 − 11/20 = 9/20
In the last Task part (c) we could alternatively have used an obvious extension of the law of addition
for mutually exclusive events:
P(R ∪ W ∪ B) = P(R) + P(W) + P(B) = 3/20 + 2/20 + 4/20 = 9/20.
Task
The diagram shows a simplified circuit in which two independent components a
and b are connected in parallel.
The circuit functions if either or both of the components are operational. It is known that if A is the
event ‘component a is operating’ and B is the event ‘component b is operating’ then P(A) = 0.99,
P(B) = 0.98 and P(A ∩ B) = 0.9702. Find the probability that the circuit is functioning.
Your solution
Answer
The probability that the circuit is functioning is P(A ∪ B). In words: either a or b or both must be
functioning if the circuit is to function. Using Key Point 6:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 0.99 + 0.98 − 0.9702 = 0.9998
Not surprisingly the probability that the circuit functions is greater than the probability that either
of the individual components functions.
Exercises
1. The following people are in a room: 5 men aged 21 and over, 4 men under 21, 6 women aged
21 and over, and 3 women under 21. One person is chosen at random. The following events are
defined: A = {the person is aged 21 and over}; B = {the person is under 21}; C = {the
person is male}; D = {the person is female}. Evaluate the following:
(a) P(B ∪ D)
(b) P(A′ ∩ C′)
Express the meaning of these events in words.
2. A card is drawn at random from a deck of 52 playing cards. What is the probability that it is
an ace or a face card (i.e. K, Q, J)?
3. In a single throw of two dice, what is the probability that neither a double nor a sum of 9 will
appear?
Answers
1. (a) P(B ∪ D) = P(B) + P(D) − P(B ∩ D)
P(B) = 7/18, P(D) = 9/18, P(B ∩ D) = 3/18
∴ P(B ∪ D) = 7/18 + 9/18 − 3/18 = 13/18
(b) P(A′ ∩ C′): A′ = {people under 21}, C′ = {people who are female}
∴ P(A′ ∩ C′) = 3/18 = 1/6
2. F = {face card}, A = {card is ace}, P(F) = 12/52, P(A) = 4/52
∴ P(F ∪ A) = P(F) + P(A) − P(F ∩ A) = 12/52 + 4/52 − 0 = 16/52
3. D = {double is thrown}, N = {sum is 9}
P(D) = 6/36 (36 possible outcomes in an experiment in which all the outcomes are equally
probable).
P(N) = P{(6 ∩ 3) ∪ (5 ∩ 4) ∪ (4 ∩ 5) ∪ (3 ∩ 6)} = 4/36
P(D ∪ N) = P(D) + P(N) − P(D ∩ N) = 6/36 + 4/36 − 0 = 10/36
P((D ∪ N)′) = 1 − P(D ∪ N) = 1 − 10/36 = 26/36
2. Conditional probability - dependent events

Suppose a bag contains 6 balls, 3 red and 3 white. Two balls are chosen (without replacement) at
random, one after the other. Consider the two events R, W :
R is event “first ball chosen is red”
W is event “second ball chosen is white”
We easily find P(R) = 3/6 = 1/2. However, determining the probability of W is not quite so
straightforward. If the first ball chosen is red then the bag subsequently contains 2 red balls and 3 white. In
this case P(W) = 3/5. However, if the first ball chosen is white then the bag subsequently contains 3
red balls and 2 white. In this case P(W) = 2/5. What this example shows is that the probability that
W occurs is clearly dependent upon whether or not the event R has occurred. The probability of
W occurring is conditional on the occurrence or otherwise of R.
The conditional probability of an event B occurring given that event A has occurred is written
P(B|A). In this particular example
P(W|R) = 3/5 and P(W|R′) = 2/5.
Consider, more generally, the performance of an experiment in which the outcome is a member of
an event A. We can therefore say that the event A has occurred. What is the probability that B
then occurs? That is, what is P(B|A)? In a sense we have a new sample space which is the event
A. For B to occur some of its members must also be members of event A. So, for example, in an
equi-probable space, P(B|A) must be the number of outcomes in A ∩ B divided by the number of
outcomes in A. That is
P(B|A) = (number of outcomes in A ∩ B) / (number of outcomes in A).
Now if we divide both the top and bottom of this fraction by the total number of outcomes of the
experiment we obtain an expression for the conditional probability of B occurring given that A has
occurred:
Key Point 7
Conditional Probability
P(B|A) = P(A ∩ B) / P(A) or, equivalently, P(A ∩ B) = P(B|A)P(A)
To illustrate the use of conditional probability concepts we return to the example of the bag containing
3 red and 3 white balls in which we consider two events:
• R is event “first ball is red” • W is event “second ball is white”
Let the red balls be numbered 1 to 3 and the white balls 4 to 6. If, for example, (3, 5) represents
the fact that the first ball is 3 (red) and the second ball is 5 (white) then we see that there are
6 × 5 = 30 possible outcomes to the experiment (no ball can be selected twice).
If the first ball is red then only the fifteen outcomes (1, x), (2, y), (3, z) are then possible (here x ≠ 1,
y ≠ 2 and z ≠ 3). Of these fifteen, the six outcomes {(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)}
are those in which both balls chosen are red, so the probability that the second ball is red given
that the first is red is
P(W′|R) = 6/15 = 2/5.
The remaining nine outcomes have a white second ball, so P(W|R) = 9/15 = 3/5, in agreement with the
value found above.
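The numbered-ball description translates directly into a short enumeration. This is a minimal sketch under the numbering used above (red balls 1 to 3, white balls 4 to 6); the names P and conditional are assumptions introduced for illustration.

```python
from fractions import Fraction
from itertools import permutations

RED = {1, 2, 3}
WHITE = {4, 5, 6}
outcomes = set(permutations(range(1, 7), 2))   # 30 ordered pairs, no ball chosen twice

R = {o for o in outcomes if o[0] in RED}       # first ball chosen is red
W = {o for o in outcomes if o[1] in WHITE}     # second ball chosen is white

def P(event):
    return Fraction(len(event), len(outcomes))

def conditional(b, a):
    """P(b | a) = P(a and b) / P(a), as in Key Point 7."""
    return P(a & b) / P(a)

p_w_given_r = conditional(W, R)
p_w_given_not_r = conditional(W, outcomes - R)

print(p_w_given_r, p_w_given_not_r)  # 3/5 2/5
```

Because the outcomes are equally likely, dividing P(A ∩ B) by P(A) gives the same result as counting outcomes in A ∩ B and dividing by the number in A.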
Example 11
A box contains six 10 Ω resistors and ten 30 Ω resistors. The resistors are all
unmarked and are of the same physical size.
(a) One resistor is picked at random from the box; find the probability that:
(i) It is a 10 Ω resistor.
(ii) It is a 30 Ω resistor.
(b) At the start, two resistors are selected from the box. Find the proba-
bility that:
(i) Both are 10 Ω resistors.
(ii) The first is a 10 Ω resistor and the second is a 30 Ω resistor.
(iii) Both are 30 Ω resistors.
Solution
(a) (i) As there are six 10 Ω resistors in the box that contains a total of 6 + 10 = 16
resistors, and there is an equally likely chance of any resistor being selected, then
P(10 Ω) = 6/16 = 3/8
(ii) As there are ten 30 Ω resistors in the box that contains a total of 6 + 10 = 16 resistors,
P(30 Ω) = 10/16 = 5/8
(b) (i) As there are six 10 Ω resistors in the box that contains a total of 6 + 10 = 16 resistors,
P(first selected is a 10 Ω resistor) = 6/16 = 3/8
If the first resistor selected was a 10 Ω one, then when the second resistor is selected,
there are only five 10 Ω resistors left in the box which now contains 5 + 10 = 15
resistors.
Hence, P(second selected is also a 10 Ω resistor) = 5/15 = 1/3
And, P(both are 10 Ω resistors) = 3/8 × 1/3 = 1/8
(b) (ii) As before, P(first selected is a 10 Ω resistor) = 6/16 = 3/8
If the first resistor selected was a 10 Ω one, then when the second resistor is selected, there are
still ten 30 Ω resistors left in the box which now contains 5 + 10 = 15 resistors. Hence,
P(second selected is a 30 Ω resistor) = 10/15 = 2/3
And, P(first was a 10 Ω resistor and second was a 30 Ω resistor) = 3/8 × 2/3 = 1/4
(b) (iii) As there are ten 30 Ω resistors in the box that contains a total of 6 + 10 = 16 resistors,
and there is an equally likely chance of any resistor being selected, then
P(first selected is a 30 Ω resistor) = 10/16 = 5/8
If the first resistor selected was a 30 Ω one, then when the second resistor is selected,
there are only nine 30 Ω resistors left in the box which now contains 6 + 9 = 15 resistors.
Hence, P(second selected is also a 30 Ω resistor) = 9/15 = 3/5
And, P(both are 30 Ω resistors) = 5/8 × 3/5 = 3/8
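Each part of Example 11(b) multiplies the probability of the first selection by the conditional probability of the second. The sketch below repeats that arithmetic with exact fractions. The variable names are assumptions, and the fourth case (a 30 Ω resistor followed by a 10 Ω resistor) is not part of the Example; it is included only so that the four ordered cases can be checked to sum to 1.

```python
from fractions import Fraction

N10, N30 = 6, 10          # numbers of 10 ohm and 30 ohm resistors in the box
TOTAL = N10 + N30         # 16 resistors before the first selection

# P(first and second) = P(first) * P(second | first), selecting without replacement
both_10 = Fraction(N10, TOTAL) * Fraction(N10 - 1, TOTAL - 1)
ten_then_30 = Fraction(N10, TOTAL) * Fraction(N30, TOTAL - 1)
both_30 = Fraction(N30, TOTAL) * Fraction(N30 - 1, TOTAL - 1)
thirty_then_10 = Fraction(N30, TOTAL) * Fraction(N10, TOTAL - 1)  # extra case, for the check

print(both_10, ten_then_30, both_30)  # 1/8 1/4 3/8
print(both_10 + ten_then_30 + both_30 + thirty_then_10)  # 1
```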
3. Independent events
If the occurrence of one event A does not affect, nor is affected by, the occurrence of another event
B then we say that A and B are independent events. Clearly, if A and B are independent then
P(B|A) = P(B) and P(A|B) = P(A)
Then, using the Key Point 7 formula P(A ∩ B) = P(B|A)P(A) we have, for independent events:
Key Point 8
The Multiplication Law
If A and B are independent events then
P(A ∩ B) = P(A) × P(B)
In words
‘The probability of independent events A and B occurring is the product of the probabilities of the
events occurring separately.’
In Figure 8 two components a and b are connected in series.
[Figure 8: components a and b connected in series]
Define two events
• A is the event ‘component a is operating’
• B is the event ‘component b is operating’
Previous testing has indicated that P(A) = 0.99, and P(B) = 0.98. The circuit functions only if a
and b are both operating simultaneously. The components are assumed to be independent.
Then the probability that the circuit is operating is given by
P(A ∩ B) = P(A)P(B) = 0.99 × 0.98 = 0.9702
Note that this probability is smaller than either P(A) or P(B).
Task
Decide which of the following pairs (A and B) of events arising from the experi-
ments described are independent.
(a) One card is drawn from each of two packs
A = {a red card is drawn from pack 1}

B = {a picture card is drawn from pack 2}
(b) The daily traffic accidents in Hull involving pedal cyclists and motor
cyclists are counted
A = {three motor cyclists are injured in separate collisions with cars}

B = {one pedal cyclist is injured when hit by a bus}
(c) Two boxes contain 20 nuts each. Some have a metric thread, some
have a British Standard Fine (BSF) thread and some have a British
Standard Whitworth (BSW) thread. A nut is picked out of each box.
A = {nut picked out of the first box is BSF}

B = {nut picked out of the second box is metric}
(d) A box contains 20 nuts. Some have a metric thread, some have a
British Standard Fine (BSF) thread and some have a British Standard
Whitworth (BSW) thread. Two nuts are picked out of the box.
A = {first nut picked out of the box is BSF}

B = {second nut picked out of the box is metric }
Your solution
Answer
(a), (b), (c): A and B are independent. (d) A and B are not independent.
Key Point 9
Laws of Elementary Probability
Let a sample space S consist of the n simple distinct events E1 , E2 . . . En and let A and B be
events contained in S.
Then:
(a) 0 ≤ P(A) ≤ 1. P(A) = 0 is interpreted as meaning that the event A cannot occur and
P(A) = 1 is interpreted as meaning that the event A is certain to occur.
(b) P(A) + P(A′) = 1 where the event A′ is the complement of the event A
(c) P(E1 ) + P(E2 ) + · · · + P(En ) = 1 where E1 , E2 , . . . En form the sample space
(d) If A and B are any two events then P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
(e) If A and B are two mutually exclusive events then P(A ∪ B) = P(A) + P(B)
(f) If A and B are two independent events then P(A ∩ B) = P(A) × P(B).
Example 12
A circuit has three independent switches A, B and C wired in parallel as shown in
the figure below.
[Figure 9: switches A, B and C wired in parallel]
Current can only flow through the bank of switches if at least one of them is closed.
The probability that any given switch is closed is 0.9. Calculate the probability
that current can flow through the bank of switches.
Solution
Assume that A is the event {switch A is closed}. Similarly for switches B and C. We require
P(A ∪ B ∪ C), the probability that at least one switch is closed. Using set theory,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − [P(A ∩ B) + P(B ∩ C) + P(C ∩ A)] + P(A ∩ B ∩ C)
Using the fact that the switches operate independently,
P(A ∪ B ∪ C) = 0.9 + 0.9 + 0.9 − [0.9 × 0.9 + 0.9 × 0.9 + 0.9 × 0.9] + 0.9 × 0.9 × 0.9
= 2.7 − 2.43 + 0.729 = 0.999
Note that the result implies that the system is more likely to allow current to flow than any single
switch in the system. This is why replication is built into systems requiring a high degree of reliability
such as aircraft control systems.
Task
A circuit has four independent switches A, B, C and D wired in parallel as shown
in the diagram below.
[Diagram: switches A, B, C and D wired in parallel]
Current can only flow through the bank of switches if at least one of them is
closed. The probabilities that switches A, B, C and D are closed are 0.9, 0.8, 0.7
and 0.6 respectively. Calculate the probability that current can flow through the
bank of switches.
Answer
Denoting the switches by A, B, C and D we have:
P(A ∪ B ∪ C ∪ D)
= P(A) + P(B) + P(C) + P(D)
−P(A ∩ B) − P(B ∩ C) − P(C ∩ D) − P(D ∩ A) − P(A ∩ C) − P(B ∩ D)
+P(A ∩ B ∩ C) + P(B ∩ C ∩ D) + P(C ∩ D ∩ A) + P(D ∩ A ∩ B)
−P(A ∩ B ∩ C ∩ D)
Using the fact that the switches operate independently and substituting gives:
P(A ∪ B ∪ C ∪ D) = 3 − 3.35 + 1.65 − 0.3024 = 0.9976
Hence, the probability that current can flow through the bank of switches is 0.9976.
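The expansions used for three and four switches follow one pattern: add the single probabilities, subtract the pairs, add the triples, and so on. The sketch below applies that pattern to any number of independent events and compares it with the shorter route through the complement (no switch closed). The function names are assumptions, and both functions are valid only if the events are independent.

```python
from itertools import combinations
from math import prod

def p_union_independent(probs):
    """P(at least one event occurs) by inclusion-exclusion, assuming independence."""
    total = 0.0
    for k in range(1, len(probs) + 1):
        sign = 1 if k % 2 == 1 else -1          # + singles, - pairs, + triples, ...
        total += sign * sum(prod(group) for group in combinations(probs, k))
    return total

def p_union_by_complement(probs):
    """The same probability found as 1 - P(no event occurs)."""
    return 1 - prod(1 - p for p in probs)

three_switches = [0.9, 0.9, 0.9]         # Example 12
four_switches = [0.9, 0.8, 0.7, 0.6]     # the Task above

print(round(p_union_independent(three_switches), 4))   # 0.999
print(round(p_union_independent(four_switches), 4))    # 0.9976
print(round(p_union_by_complement(four_switches), 4))  # 0.9976
```

The complement route needs one product however many switches there are, whereas the number of terms in the expansion roughly doubles with each extra switch.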
Exercises
1. A box contains 4 bad tubes and 6 good tubes. Two are drawn out together. One of them is
tested and found to be good. What is the probability that the other one is also good?
2. A man owns a house in town and a cottage in the country. In any one year the probability of
the town house being burgled is 0.01 and the probability of the country cottage being burgled
is 0.05. In any one year what is the probability that:
(a) both will be burgled? (b) one or the other (but not both) will be burgled ?
3. In a Baseball Series, teams A and B play until one team has won 4 games. If team A has
probability 2/3 of winning against B in a single game, what is the probability that the Series
will end only after 7 games are played?
4. The probability that a single aircraft engine will fail during flight is q. A multi-engine plane
makes a successful flight if at least half its engines run. Assuming that the engines operate
independently, find the values of q for which a two-engine plane is to be preferred to a four-
engine plane.
5. Current flows through a relay only if it is closed. The probability of any relay being closed is
0.95. Calculate the probability that a current will flow through a circuit composed of 3 relays
in parallel. What assumption must be made?
6. A central heating installation and maintenance engineer keeps a record of the causes of failure
of systems he is called out to repair. The causes of failure are classified as ‘electrical’, ‘gas’,
or in some cases ‘other’. A summary of the records kept of failures involving either gas or
electrical faults is as follows:
                Electrical
                Yes     No
Gas     Yes     53      11
Gas     No      23      13
(a) Find the probability that failure involves gas given that it involves electricity.
(b) Find the probability that failure involves electricity given that it involves gas.
Answers
1. Let Gi = {ith tube is good}, Bi = {ith tube is bad}
P(G2|G1) = 5/9 (only 5 good tubes left out of 9).
2. H = {house is burgled}, C = {cottage is burgled}
(a) P(H ∩ C) = P(H)P(C) = (0.01)(0.05) = 0.0005 since the events are independent
(b) P(one or the other (but not both)) = P((H ∩ C′) ∪ (H′ ∩ C)) = P(H ∩ C′) + P(H′ ∩ C)
= P(H)P(C′) + P(H′)P(C)
= (0.01)(0.95) + (0.99)(0.05) = 0.059.
3. Let Ai be event {A wins the ith game}
required event is (A1 ∩ A2 ∩ A3 ∩ A4′ ∩ A5′ ∩ A6′) ∪ (. . .)
where the union is taken over the number of ways of arranging 3 wins in 6 games, i.e. 6C3
P(required event) = 6C3 P(A1 ∩ A2 ∩ A3 ∩ A4′ ∩ A5′ ∩ A6′) = 6C3 [P(A1)]³ [P(A1′)]³ = 160/729
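The value 160/729 can be checked in two ways: from the formula, and by listing every possible win and loss pattern for the first six games and keeping those that leave the teams level at three wins each. This is an illustrative sketch; the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

p = Fraction(2, 3)   # probability that team A wins any single game

# Formula from the answer: 6C3 ways of placing A's three wins among six games.
p_formula = comb(6, 3) * p**3 * (1 - p)**3

# Enumeration: a seventh game is needed exactly when the first six are split 3-3.
p_enumerated = Fraction(0)
for games in product("AB", repeat=6):
    wins_a = games.count("A")
    if wins_a == 3:
        p_enumerated += p**wins_a * (1 - p)**(6 - wins_a)

print(p_formula, p_enumerated)  # 160/729 160/729
```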
4. Let Ei be event {ith engine success}
Two-engine plane: flight success if {(E1 ∩ E2) ∪ (E1′ ∩ E2) ∪ (E1 ∩ E2′)} occurs
P(required event) = P(E1)P(E2) + P(E1′)P(E2) + P(E1)P(E2′)
= (1 − q)² + 2q(1 − q) = 1 − q²
Four-engine plane: success if the following event occurs
{E1 ∩ E2 ∩ E3′ ∩ E4′} ∪ {E1 ∩ E2 ∩ E3 ∩ E4′} ∪ {E1 ∩ E2 ∩ E3 ∩ E4}
(the three bracketed patterns can be arranged in 4C2, 4C1 and 4C0 ways respectively)
required probability = 6(1 − q)²q² + 4(1 − q)³q + (1 − q)⁴ = 3q⁴ − 4q³ + 1
Two-engine plane is preferred if
1 − q² > 3q⁴ − 4q³ + 1 i.e. if 0 > q²(3q − 1)(q − 1)
Let y = (3q − 1)(q − 1). By drawing a graph of this quadratic you will quickly see that a
two-engine plane is preferred if 1/3 < q < 1.
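The comparison of the two planes can be explored numerically. The sketch below sums the binomial terms for "at least half the engines run" and evaluates both planes either side of q = 1/3. The function name p_success and the trial values of q are assumptions chosen for illustration.

```python
from math import comb

def p_success(n_engines, q):
    """P(at least half of n independent engines run) when each fails with probability q."""
    need = (n_engines + 1) // 2   # smallest whole number of engines that is at least half
    return sum(comb(n_engines, k) * (1 - q) ** k * q ** (n_engines - k)
               for k in range(need, n_engines + 1))

for q in (0.2, 0.5):
    two, four = p_success(2, q), p_success(4, q)
    better = "two-engine" if two > four else "four-engine"
    print(q, round(two, 4), round(four, 4), better)
# 0.2 0.96 0.9728 four-engine
# 0.5 0.75 0.6875 two-engine
```

For q = 0.2, which is below 1/3, the four-engine plane is the more reliable; for q = 0.5 the two-engine plane is, in line with the inequality above.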
5. Let A be event {relay A is closed}. Similarly for B, C.
required event is {A ∩ B ∩ C} ∪ {A′ ∩ B ∩ C} ∪ {A′ ∩ B′ ∩ C}
(the second and third patterns can be arranged in 3C1 and 3C2 ways respectively)
P(required event) = (0.95)³ + 3(0.95)²(0.05) + 3(0.95)(0.05)² = 0.999875
(or 1 − P(all relays open) = 1 − (0.05)³ = 0.999875.)
The assumption is that relays operate independently.
6 (a) A total of 76 failures involved electrical faults. Of the 76 some 53 involved gas. Hence
P{Gas Failure | Electrical Failure} = 53/76 = 0.697
6 (b) A total of 64 failures involved gas. Of the 64 some 53 involved electrical faults. Hence
P{Electrical Failure | Gas Failure} = 53/64 = 0.828
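The two conditional probabilities in answer 6 use the same numerator but different denominators, which is easy to confuse. The short sketch below reads the counts from the table in Exercise 6; the variable names are assumptions.

```python
from fractions import Fraction

both = 53              # failures involving gas and electricity
gas_only = 11          # gas but not electricity
electrical_only = 23   # electricity but not gas

# Condition on electricity: restrict to the 76 failures involving electricity.
p_gas_given_electrical = Fraction(both, both + electrical_only)
# Condition on gas: restrict to the 64 failures involving gas.
p_electrical_given_gas = Fraction(both, both + gas_only)

print(round(float(p_gas_given_electrical), 3))  # 0.697
print(round(float(p_electrical_given_gas), 3))  # 0.828
```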
Total Probability and
Bayes’ Theorem 35.4
Introduction
When the ideas of probability are applied to engineering (and many other areas) there are occasions
when we need to calculate conditional probabilities other than those already known. For example, if
production runs of ball bearings involve say, four machines, we might know the probability that any
given machine produces faulty ball bearings. If we are inspecting the total output prior to distribution
to users, we might need to know the probability that a faulty ball bearing came from a particular
machine. Even though we do not address the area of statistics known as Bayesian Statistics here, it
is worth noting that Bayes’ theorem is the basis of this branch of the subject.
Prerequisites
Before starting this Section you should . . .
• understand the ideas of sets and subsets
• understand the concepts of probability and events
• understand the addition and multiplication laws and the concept of conditional probability

Learning Outcomes
On completion you should be able to . . .
• understand the term ‘partition of a sample space’
• understand the special case of Bayes’ theorem arising when a sample space is partitioned by a set and its complement
• be able to apply Bayes’ theorem to solve basic engineering related problems
1. The theorem of total probability

To establish this result we start with the definition of a partition of a sample space.
A partition of a sample space

The collection of events A1 , A2 , . . . An is said to partition a sample space S if
(a) A1 ∪ A2 ∪ · · · ∪ An = S
(b) Ai ∩ Aj = ∅ for all i ≠ j
(c) Ai ≠ ∅ for all i
In essence, a partition is a collection of non-empty, non-overlapping subsets of a sample space whose
union is the sample space itself. The definition is illustrated by Figure 10.
[Diagram: a sample space S partitioned into six regions A1, A2, A3, A4, A5 and A6]
Figure 10
If B is any event within S then we can express B as the union of subsets:
B = (B ∩ A1 ) ∪ (B ∩ A2 ) ∪ · · · ∪ (B ∩ An )
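This decomposition is illustrated in Figure 11, in which an event B in S is represented by the shaded region.
[Diagram: the sample space S partitioned into the regions A1 to A6, with an event B drawn as a shaded region inside S]
Figure 11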
The bracketed events (B ∩ A1 ), (B ∩ A2 ) . . . (B ∩ An ) are mutually exclusive (if one occurs then none
of the others can occur) and so, using the addition law of probability for mutually exclusive events:
P(B) = P(B ∩ A1 ) + P(B ∩ A2 ) + · · · + P(B ∩ An )
Each of the probabilities on the right-hand side may be expressed in terms of conditional probabilities:
P(B ∩ Ai ) = P(B|Ai )P(Ai ) for all i
Using these in the expression for P(B), above, gives:
\[ P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n) = \sum_{i=1}^{n} P(B|A_i)P(A_i) \]
This is the theorem of Total Probability. A related theorem with many applications in statistics can
be deduced from this, known as Bayes’ theorem.
2. Bayes’ theorem
We again consider the conditional probability statement:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A \cap B)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)} \]
in which we have used the theorem of Total Probability to replace P(B). Now
P(A ∩ B) = P(B ∩ A) = P(B|A) × P(A)
Substituting this in the expression for P(A|B) we immediately obtain the result
\[ P(A|B) = \frac{P(B|A) \times P(A)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)} \]
This is true for any event A and so, replacing A by Ai, gives the result known as Bayes’ theorem:
\[ P(A_i|B) = \frac{P(B|A_i) \times P(A_i)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)} \]
3. Special cases
In the case where we consider A to be an event in a sample space S (the sample space is partitioned
by A and A′) we can state simplified versions of the theorem of Total Probability and Bayes’ theorem
as shown below.
The theorem of total probability: special case

This special case enables us to find the probability that an event B occurs taking into account the
fact that another event A may or may not have occurred.
The theorem becomes
P(B) = P(B|A) × P(A) + P(B|A′) × P(A′)
The result is easily seen by considering the general result already derived or it may be derived directly
as follows. Consider Figure 12:
[Venn diagram: two overlapping events A and B, with regions labelled A ∩ B′, A ∩ B and B ∩ A′]
Figure 12
It is easy to see that the event B consists of the union of the (disjoint) events A ∩ B and B ∩ A′, so
that we may write
B = (A ∩ B) ∪ (B ∩ A′)
Since the events A ∩ B and B ∩ A′ are disjoint (mutually exclusive), the addition law of probability gives
P(B) = P(A ∩ B) + P(B ∩ A′)
Using the conditional probability results we already have we may write
P(B) = P(A ∩ B) + P(B ∩ A′)
     = P(B ∩ A) + P(B ∩ A′)
     = P(B|A) × P(A) + P(B|A′) × P(A′)
The result we have derived is
P(B) = P(B|A) × P(A) + P(B|A′) × P(A′)
Bayes’ theorem: special case

This result is obtained by supposing that the sample space S is partitioned by event A and its
complement A′ to give:
\[ P(A|B) = \frac{P(B|A) \times P(A)}{P(B|A) \times P(A) + P(B|A') \times P(A')} \]
Example 13
At a certain university, 4% of men are over 6 feet tall and 1% of women are over
6 feet tall. The total student population is divided in the ratio 3:2 in favour of
women. If a student is selected at random from among all those over six feet tall,
what is the probability that the student is a woman?
Solution
Let M ={Student is Male}, F ={Student is Female}.
Note that M and F partition the sample space of students.
Let T ={Student is over 6 feet tall}.
We know that P(M ) = 2/5, P(F ) = 3/5, P(T |M ) = 4/100 and P(T |F ) = 1/100.
We require P(F |T ). Using Bayes’ theorem we have:
\[ P(F|T) = \frac{P(T|F)P(F)}{P(T|F)P(F) + P(T|M)P(M)} = \frac{\frac{1}{100} \times \frac{3}{5}}{\frac{1}{100} \times \frac{3}{5} + \frac{4}{100} \times \frac{2}{5}} = \frac{3}{11} \]
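The arithmetic of Example 13 can be reproduced exactly with rational numbers. The sketch below is only an illustration of the special case of Bayes’ theorem; the function name bayes_two_way is an assumption made for this example and is not notation used in the Workbook.

```python
from fractions import Fraction


def bayes_two_way(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) when the sample space is partitioned by A and its complement."""
    p_not_a = 1 - p_a
    # Theorem of total probability (special case) gives P(B)
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    return p_b_given_a * p_a / p_b


# Example 13: A = F (student is female), B = T (student is over 6 feet tall)
p_f_given_t = bayes_two_way(Fraction(3, 5), Fraction(1, 100), Fraction(4, 100))
print(p_f_given_t)  # 3/11
```

Working with Fraction rather than floating-point numbers keeps the result in the same form as the hand calculation.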
Example 14
A factory production line is manufacturing bolts using three machines, A, B and
C. Of the total output, machine A is responsible for 25%, machine B for 35% and
machine C for the rest. It is known from previous experience with the machines
that 5% of the output from machine A is defective, 4% from machine B and 2%
from machine C. A bolt is chosen at random from the production line and found
to be defective. What is the probability that it came from
(a) machine A (b) machine B (c) machine C?
Solution
Let
D={bolt is defective},
A={bolt is from machine A},
B={bolt is from machine B},
C={bolt is from machine C}.
We know that P(A) = 0.25, P(B) = 0.35 and P(C) = 0.4.
Also
P(D|A) = 0.05, P(D|B) = 0.04, P(D|C) = 0.02.
A statement of Bayes’ theorem for three events A, B and C is
\[ P(A|D) = \frac{P(D|A)P(A)}{P(D|A)P(A) + P(D|B)P(B) + P(D|C)P(C)} = \frac{0.05 \times 0.25}{0.05 \times 0.25 + 0.04 \times 0.35 + 0.02 \times 0.4} = 0.362 \]
Similarly
\[ P(B|D) = \frac{0.04 \times 0.35}{0.05 \times 0.25 + 0.04 \times 0.35 + 0.02 \times 0.4} = 0.406 \]
\[ P(C|D) = \frac{0.02 \times 0.4}{0.05 \times 0.25 + 0.04 \times 0.35 + 0.02 \times 0.4} = 0.232 \]
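When a sample space is partitioned by more than two events, the same calculation is repeated for each event with a common denominator. The following Python sketch shows one way to organise this for Example 14; the function name posteriors and the dictionary layout are illustrative assumptions.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem over a partition: return P(A_i|B) for every event A_i."""
    # Numerators P(B|A_i) P(A_i)
    joint = {name: likelihoods[name] * priors[name] for name in priors}
    # Denominator P(B) from the theorem of total probability
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


# Example 14: share of output and defect rate for each machine
priors = {"A": 0.25, "B": 0.35, "C": 0.40}
defect_rates = {"A": 0.05, "B": 0.04, "C": 0.02}

result = posteriors(priors, defect_rates)
for machine, p in result.items():
    print(machine, round(p, 3))
```

The three posterior probabilities sum to 1, which is a useful check on any hand calculation of this kind.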
Task
An engineering company advertises a job in three newspapers, A, B and C. It
is known that these papers attract undergraduate engineering readerships in the
proportions 2:3:1. The probabilities that an engineering undergraduate sees and
replies to the job advertisement in these papers are 0.002, 0.001 and 0.005 respectively.
Assume that the undergraduate sees only one job advertisement.
(a) If the engineering company receives only one reply to its advertisements,
calculate the probability that the applicant has seen the job advertised
in paper
(i) A, (ii) B, (iii) C.

(b) If the company receives two replies, what is the probability that both
applicants saw the job advertised in paper A?
Your solution
Answer
Let
A = {Person is a reader of paper A},
B = {Person is a reader of paper B},
C = {Person is a reader of paper C},
R = {Reader applies for the job}.
We have the probabilities
P(A) = 1/3    P(R|A) = 0.002
P(B) = 1/2    P(R|B) = 0.001
P(C) = 1/6    P(R|C) = 0.005
(a)
\[ P(A|R) = \frac{P(R|A)P(A)}{P(R|A)P(A) + P(R|B)P(B) + P(R|C)P(C)} = \frac{1}{3} \]
Similarly
\[ P(B|R) = \frac{1}{4} \quad \text{and} \quad P(C|R) = \frac{5}{12} \]
(b) Now, assuming that the replies and readerships are independent
\[ P(\text{both applicants read paper A}) = P(A|R) \times P(A|R) = \frac{1}{3} \times \frac{1}{3} = \frac{1}{9} \]
Exercises
1. Obtain the sample space of an experiment that consists of a fair coin being tossed four times.
Consider the following events:
A is the event ‘all four results are the same.’

B is the event ‘exactly one Head occurs.’
C is the event ‘at least two Heads occur.’
Show that P(A) + P(B) + P(C) = 17/16 and explain why P(A) + P(B) + P(C) > 1.
2. The table below shows the number of complete years a group of people have been working in
their current employment.
Years of Employment    Number of People
0 or 1 year            15
2 or 3 years           12
4 or 5 years           9
6 or 7 years           6
8 to 11 years          6
12 years and over      2
What is the probability that a person from the group, selected at random:
(a) is in the modal group
(b) has been working there for less than 4 years
(c) has been working there for at least 8 years.
3. It is a fact that if A and B are independent events then it is also true that A′ and B′ are
independent events. If A and B are independent events such that the probability that they
both occur simultaneously is 1/8 and the probability that neither of them will occur is 3/8, find:
(a) the probability that event A will occur
(b) the probability that event B will occur.
4. If A and B are two events associated with an experiment and P(A) = 0.4,
P(A ∪ B) = 0.7 and P(B) = p, find:
(a) the choice of p for which A and B are mutually exclusive
(b) the choice of p for which A and B are independent.
5. The probability that each relay closes in the circuit shown below is p. Assuming that each relay
functions independently of the others, find the probability that current can flow from L to R.
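[Circuit diagram: relays A and B in series on one branch and relays C and D in series on a parallel branch, connecting L to R]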
6. From a batch of 100 items of which 20 are defective, exactly two items are chosen, one at a
time, without replacement. Calculate the probabilities that:
(a) the first item chosen is defective
(b) both items chosen are defective
(c) the second item chosen is defective.
7. A garage mechanic keeps a box of good springs to use as replacements on customers’ cars. The
box contains 5 springs. A colleague, thinking that the springs are for scrap, tosses three faulty
springs into the box. The mechanic picks two springs out of the box while servicing a car. Find
the probability that:
(a) the first spring drawn is faulty (b) the second spring drawn is faulty.
8. Two coins are tossed. Find the conditional probability that two Heads will occur given that at
least one occurs.
9. Machines A and B produce 10% and 90% respectively of the production of a component
intended for the motor industry. From experience, it is known that the probability that machine
A produces a defective component is 0.01 while the probability that machine B produces a
defective component is 0.05. If a component is selected at random from a day’s production
and is found to be defective, find the probability that it was made by
(a) machine A (b) machine B.
Answers
1. P(A) = 2/16, P(B) = 4/16, P(C) = 11/16, P(A) + P(B) + P(C) = 17/16
A, B and C are not mutually exclusive since events A and C have outcomes in common. This
is the reason why P(A) + P(B) + P(C) = 17/16; we are adding the probabilities corresponding
to common outcomes more than once.
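The sample space for four tosses is small enough to list in full, so the probabilities in Answer 1 can be confirmed by direct counting. The sketch below is one way of doing this in Python; the helper name prob is an assumption made for the illustration. Note that Fraction reduces results to lowest terms, so 2/16 is displayed as 1/8.

```python
from fractions import Fraction
from itertools import product

# All 16 equally likely outcomes of four tosses, e.g. ('H', 'T', 'T', 'H')
outcomes = list(product("HT", repeat=4))


def prob(event):
    """Probability of an event given as a true/false test on a single outcome."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


p_a = prob(lambda o: len(set(o)) == 1)        # all four results the same
p_b = prob(lambda o: o.count("H") == 1)       # exactly one Head
p_c = prob(lambda o: o.count("H") >= 2)       # at least two Heads
p_a_and_c = prob(lambda o: len(set(o)) == 1 and o.count("H") >= 2)

print(p_a, p_b, p_c, p_a + p_b + p_c)
print(p_a_and_c)  # probability of the outcome counted twice
```

The outcome HHHH belongs to both A and C, which accounts for the excess of 1/16 over 1.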
2. (a) P(person falls in the modal group) = 15/50
(b) P(person has been working for less than 4 years) = 27/50
(c) P(person has been working for at least 8 years) = 8/50
3. P(A) × P(B) = 1/8 and (1 − P(A)) × (1 − P(B)) = 3/8
Treat these equations as xy = 1/8 and (1 − x)(1 − y) = 3/8 and solve to get:
P(A) = 1/2 (or 1/4) and P(B) = 1/4 (or 1/2)
4. (a) P(A ∪ B) = P(A) + P(B) so 0.7 = 0.4 + p implying p = 0.3
(b) P(A ∪ B) = P(A) + P(B) − P(A) × P(B) so 0.7 = 0.4 + p − 0.4 × p implying p = 0.5.
5. P((A ∩ B) ∪ (C ∩ D)) = P(A ∩ B) + P(C ∩ D) − P(A ∩ B ∩ C ∩ D)
= p² + p² − p⁴
= 2p² − p⁴
6. Let A={first item chosen is defective}, B ={second item chosen is defective}
(a) P(A) = 20/100 = 1/5
(b) P(A ∩ B) = P(B|A)P(A) = 19/99 × 20/100 = 19/495
(c) P(B) = P(B|A)P(A) + P(B|A′)P(A′) = 19/99 × 20/100 + 20/99 × 80/100 = 198/990 = 1/5
7. Let A = {first spring chosen is faulty}, B = {second spring chosen is faulty}
(a) P(A) = 3/8
(b) P(B) = P(B|A)P(A) + P(B|A′)P(A′) = 2/7 × 3/8 + 3/7 × 5/8 = 21/56 = 3/8
8. Let A = {at least one Head occurs}, B = {two Heads occur}. Since B can only occur when A occurs, A ∩ B = B, with P(B) = 1/2 × 1/2 = 1/4 and P(A) = 1/2 + 1/2 − 1/2 × 1/2 = 3/4. Hence
\[ P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{1/4}{3/4} = \frac{1}{3} \]
9. Let A = {item from machine A}, B = {item from machine B}, D = {item is defective}.
We know that: P(A) = 0.1, P(B) = 0.9, P(D|A) = 0.01, P(D|B) = 0.05.
(a)
\[ P(A|D) = \frac{P(D|A)P(A)}{P(D|A)P(A) + P(D|B)P(B)} = \frac{0.01 \times 0.1}{0.01 \times 0.1 + 0.05 \times 0.9} = 0.02 \]
(b) Similarly P(B|D) = 0.98
Contents 36
Descriptive Statistics
36.1 Describing Data
36.2 Exploring Data
Learning outcomes
In the first Section of this Workbook you will learn how to describe data sets and represent
them numerically using, for example, means and variances. In the second Section you will
learn how to explore data sets and arrive at conclusions, which will be essential if you are
to apply statistics meaningfully to real situations.

Describing Data 36.1
Introduction
Statistics is a scientific method of data analysis applied throughout business, engineering and all of
the social and physical sciences. Engineers have to experiment, analyse data and reach defensible
conclusions about the outcomes of their experiments to determine how products behave when tested
under real conditions. Work done on new products and processes may involve decisions that have to
be made which can have a major economic impact on companies and their employees. Throughout
industry, production and distribution processes must be organised and monitored to ensure maximum
efficiency and reliability. One important branch of applied statistics is quality control. Quality control
is an essential part of any production process which aims to ensure that high quality products are
made, surely a principal aim of any practical engineer.
This Workbook is intended to give you an introduction to the subject and to enable you to understand
in reasonable depth the meaning and interpretation of numerical and diagrammatic statements
involving data. This first Section concentrates on the basic tabular and diagrammatic techniques for
displaying data and the calculation of elementary statistics representing location and spread.


( 35.1)

Learning Outcomes
On completion you should be able to . . .
• explain why statistics is important for engineers
• explain what is meant by the term descriptive statistics
• calculate means, medians, modes and standard deviations
• draw a variety of statistical diagrams
1. Introduction to descriptive statistics
Many students taking degree courses involving the sciences and technology have to study statistics.
This Workbook will enable you to understand the meaning and interpretation of numerical and
diagrammatic statements involving data.
Consider the following ‘everyday’ statements, all of which contain numbers:
1. My son plays in his school cricket team; his batting average over the season was 28.9 runs.
2. Police estimate that 4,000 people took part in the protest march.
3. About 11,000,000 drivers will take to the roads during the coming Bank Holiday.
4. The average life of this type of tyre is between 20,000 and 25,000 miles.
The four statements are all of the type that you may meet in the course of your everyday life. In a
sense, there is nothing special about them and yet they all use numbers in different ways. Statement
1 implies that a numerical calculation has been performed on a data set, statement 2 implies that a
point estimate can represent a data set, statement 3 is making a prediction about an event which has
not yet happened and statement 4 is making a prediction about an event which depends on several
interrelated factors and is based on past experience.
All four statements are concerned with the collection, organisation and analysis of data.
Essentially, this last sentence summarises descriptive statistics. We start with the organisation of
data and look at techniques for examining data − these are called exploratory techniques and enable
us to understand and communicate to others meaning that may be hidden within a given data set.
2. Frequency tables
Data are often presented to statisticians in raw form; they need organising so that statisticians and
non-statisticians alike can view the information contained in the data. Simple columns of figures do
not mean a lot to most people! As a start, we usually organise the data into a frequency table. The
way in which this may be done is illustrated below.
The following data are the heights (to the nearest tenth of a centimetre) of 30 students studying
engineering statistics.
150.2 167.2 176.2
160.1 151.8 166.3
162.3 167.4 178.3
181.2 175.7 161.1
179.3 168.9 164.8
165.0 177.1 183.2
172.1 180.2 168.2
173.8 164.3 176.8
184.2 170.9 172.2
168.5 169.8 176.7
Notice first of all that all of the numbers lie in the range 150 cm. - 185 cm. This suggests that we
try to organize the data into classes as shown below. This first attempt has deliberately taken easy
class intervals which give a reasonable number of classes and span the numerical range covered by
the data.
Class Class Interval
1 150 - 155
2 155 - 160
3 160 - 165
4 165 - 170
5 170 - 175
6 175 - 180
7 180 - 185
Note that, at one extreme, we could argue that the original data are already represented by one class
with thirty members or, at the other, that we already have 30 classes with one member each!
Neither interpretation is helpful and we usually look to use about 5 to 8 classes. Note that this range
may be varied depending on the data under investigation.
When we attempt to allocate data to classes, difficulties can arise. For example, to which class should
the number 165 be allocated? Clearly we do not have a reason for choosing the class 160-165 in
preference to the class 165-170; either class would do equally well.
Rather than adopt an arbitrary convention such as always placing boundary values in the higher (or
lower) class we usually define the class boundaries in such a way that such difficulties do not occur.
This can always be done by using one more decimal place for the class boundaries than is used in the
data themselves although sometimes it is not necessary to use an extra decimal place. Two possible
alternatives for the data set above are shown below.
Class Class Interval 1 Class Interval 2
1 149.5 - 154.5 149.55 - 154.55
2 154.5 - 159.5 154.55 - 159.55
3 159.5 - 164.5 159.55 - 164.55
4 164.5 - 169.5 164.55 - 169.55
5 169.5 - 174.5 169.55 - 174.55
6 174.5 - 179.5 174.55 - 179.55
7 179.5 - 184.5 179.55 - 184.55
Notice that no member of the original data set can possibly lie on a boundary in the case of Class
Intervals 2 - this is the advantage of using an extra decimal place to define the boundaries. Notice
also that in this particular case the first alternative suffices since it happens that no member of the
original data set lies on a boundary defined by Class Intervals 1.
Since Class Intervals 1 is the simpler of the two alternatives, we shall use it to obtain a frequency
table of our data.
The data is organised into a frequency table using a tally count. To do a tally count you simply
lightly mark or cross off a data item with a pencil as you work through the data set to determine how
many members belong to each class. Light pencil marks enable you to check that you have allocated
all of the data to a class when you have finished. The number of tally marks must equal the number
of data items. This process gives the tally marks and the corresponding frequencies as shown below.
Class Interval (cm)    Tally       Frequency
149.5 − 154.5          ||          2
154.5 − 159.5                      0
159.5 − 164.5          ||||        4
164.5 − 169.5          ||||||||    8
169.5 − 174.5          |||||       5
174.5 − 179.5          |||||||     7
179.5 − 184.5          ||||        4
It is now easier to see some of the information contained in the original data set. For example, we
now know that there is no data in the class 154.5 - 159.5 and that the class 164.5 - 169.5 contains
the most entries.
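The tally count described above is easy to automate. The following is a minimal Python sketch, assuming the seven classes of width 5 cm starting at 149.5 cm (Class Intervals 1); the variable names are illustrative only.

```python
# Heights (cm) of the 30 students, as listed in the text
heights = [
    150.2, 167.2, 176.2, 160.1, 151.8, 166.3, 162.3, 167.4, 178.3, 181.2,
    175.7, 161.1, 179.3, 168.9, 164.8, 165.0, 177.1, 183.2, 172.1, 180.2,
    168.2, 173.8, 164.3, 176.8, 184.2, 170.9, 172.2, 168.5, 169.8, 176.7,
]

lower, width, n_classes = 149.5, 5.0, 7   # Class Intervals 1
frequencies = [0] * n_classes

for h in heights:
    k = int((h - lower) // width)         # index of the class containing h
    frequencies[k] += 1

for k, f in enumerate(frequencies):
    start = lower + k * width
    print(f"{start:.1f} - {start + width:.1f}  {'|' * f:<8}  {f}")
```

Because no data value lies on a class boundary, each height falls unambiguously into exactly one class, and the frequencies add up to 30.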
Understanding the information contained in the original table is now rather easier but, as in all
branches of mathematics, diagrams make the situation easier to visualise.
3. Diagrammatic representations
The histogram
Notice that the data we are dealing with is continuous: a measurement can take any value in a range.
Values are not restricted to, for example, whole number (integer) values. When we are using continuous data
we normally represent frequency distributions pictorially by means of a histogram.
The class intervals are plotted on the horizontal axis and the frequencies on the vertical axis. Strictly
speaking, the areas of the blocks forming the histogram represent the frequencies since this gives the
histogram the necessary flexibility to deal with frequency tables whose class intervals are not constant.
In our case, the class intervals are constant and the heights of the blocks are made proportional to
the frequencies.
Sometimes the approximate shape of the distribution of data is indicated by a frequency polygon
which is formed by joining the mid-points of the tops of the blocks forming the histogram with
straight lines. Not all histograms are presented along with frequency polygons.
The complete diagram is shown in Figure 1.
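[Histogram with frequency polygon: horizontal axis Height (cm) with class boundaries 149.5, 154.5, 159.5, 164.5, 169.5, 174.5, 179.5, 184.5; vertical axis Frequency from 0 to 8]
Figure 1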
Task
The following data are the heights (to the nearest tenth of a centimetre) of a
second sample of 30 students studying engineering statistics. Another way of
determining class intervals is as follows.
Class intervals may be taken as (for example)
145 -
150 -
155 -
160 -
165 -
170 -
175 -
180 -
185 -
The intervals are read as ‘145 cm. up to but not including 150 cm’, then ‘150 cm.
up to but not including 155 cm’ and so on. The class intervals are chosen in such
a way as to cover the data but still give a reasonable number of classes.
Organise the data into classes using the above method of defining class intervals
and draw up a frequency table of the data. Use your table to represent the data
diagrammatically using a histogram.
Hint:- All the data values lie in the range 145-190.
155.3 177.3 146.2 163.1 161.8 146.3 167.9 165.4 172.3 188.2
178.8 151.1 189.4 164.9 174.8 160.2 187.1 163.2 147.1 182.2
178.2 172.8 164.4 177.8 154.6 154.9 176.3 148.5 161.8 178.4
Your solution
Answer
Class Interval (cm)    Tally       Frequency
145 -                  ||||        4
150 -                  |||         3
155 -                  |           1
160 -                  ||||| ||    7
165 -                  ||          2
170 -                  |||         3
175 -                  ||||| |     6
180 -                  |           1
185 -                  |||         3
The histogram is shown below.

[Histogram: horizontal axis Height (cm) with class boundaries 145, 150, 155, 160, 165, 170, 175, 180, 185, 190; vertical axis Frequency]
The bar chart

The bar chart looks superficially like the histogram, indeed, many people confuse the two. However,
there are important differences that you should be aware of. Firstly, the bar chart is usually used to
represent discrete data or categorical data. Secondly, the length of a bar is directly proportional to
the frequency it represents. Remember that in the case of the histogram, the area of a bar is directly
proportional to the frequency it represents and that the histogram is normally used to represent
continuous data. To be clear, discrete data is data that can only take specific values. An example
would be the amount of money you have in your pocket. The amount can only take certain values;
you cannot, for example, have 34.229 pence in your pocket. Categorical data is, as you might expect,
data which is organised by category. Examples are favourite foods (pies, chips, pizzas, cakes and fruit, say)
or preferred colours for cars (red, blue, silver or black, say).
Absenteeism can be a problem for some engineering firms. The following discrete data represents the
number of days off taken by 50 employees of a small engineering company. Note that in the context
of this example, the term discrete means that the data can only take whole number values (number
of days off), nothing in between.
6 4 4 5 0 4 3 6 1 3
8 3 6 1 0 6 11 5 10 8
2 4 6 6 6 6 5 13 11 6
4 8 4 7 7 6 8 3 3 6
3 2 3 6 2 2 3 2 4 0
In order to construct a bar chart we follow a simple set of instructions akin to those to form a
frequency distribution.
1. Find the range of values covered by the data (0 - 13 in this case).
2. Tally the number of absentees corresponding to each number of days taken off work.
3. Draw a diagram with the range (0 - 13) on one axis and the number of employees corresponding to
each value (number of days off) on the other. The length of each bar is proportional to the
frequency (that is, proportional to the number of staff taking that number of days off).
The results appear as follows:
[Two bar charts titled ‘Absenteeism’ showing the same data: one with horizontal bars (Days Absent, 0 to 13, against Frequency of Absences, 0 to 12) and one with vertical bars (Frequency of Absences against Days Absent)]
Figure 2
It is perfectly possible and acceptable to draw the bar chart with the bars vertical
instead of horizontal.
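Steps 1 and 2 of the procedure (finding the range and tallying the frequencies) can be sketched in Python as follows, using the absenteeism data above. The text-based bars printed here are only a rough stand-in for a properly drawn bar chart, and the variable names are illustrative.

```python
from collections import Counter

# Days off taken by the 50 employees, as listed in the text
days_off = [
    6, 4, 4, 5, 0, 4, 3, 6, 1, 3,
    8, 3, 6, 1, 0, 6, 11, 5, 10, 8,
    2, 4, 6, 6, 6, 6, 5, 13, 11, 6,
    4, 8, 4, 7, 7, 6, 8, 3, 3, 6,
    3, 2, 3, 6, 2, 2, 3, 2, 4, 0,
]

counts = Counter(days_off)  # frequency of each number of days off

# One horizontal bar per value in the range; values that never occur have length 0
for d in range(min(days_off), max(days_off) + 1):
    print(f"{d:2d} {'#' * counts[d]:<12} {counts[d]}")
```

A Counter returns 0 for values that never occur (9 and 12 days here), so every value in the range appears on the axis even when its bar is empty.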
Task
The following data give the number of rejects in fifty batches of engine components
delivered to a motor manufacturer. Draw two bar charts representing the data, one
with the bars vertical and one with the bars horizontal. Draw one chart manually
and one using a suitable computer package.
2 3 5 6 8 1 2 0 3 4
8 3 6 1 0 6 11 5 10 8
3 5 9 12 3 8 5 11 15 3
4 8 4 7 7 6 8 3 3 6
3 2 3 6 2 2 3 2 4 0
Your solution
Answer
[Two bar charts of the data: one with vertical bars (Number of Batches against Number of Rejects) and one with horizontal bars (Number of Rejects against Number of Batches)]
The pie chart

One of the more common diagrams that you must have seen in magazines and newspapers is the
pie chart, examples of which can be found in virtually any text book on descriptive statistics. A pie
chart is simply a circular diagram whose sectors are proportional to the quantity represented. Put
more accurately, the angle subtended at the centre of the pie by a sector of the circle is proportional
to the size of the subset of the whole set represented by the sector. The whole set is, of course,
represented by the whole circle.
Pie charts demonstrate percentages and proportions well and are suitable for representing categorical
data. The following data represents the time spent weekly on a variety of activities by the full-time
employees of a local engineering company.
Hours spent on:                          Males    Females
Travel to and from work                  10.5     8.4
Paid activities in employment            47.0     37.0
Personal sport and leisure activities    8.2      3.6
Personal development                     5.6      6.4
Family activities                        8.4      18.2
Sleep                                    56.0     56.0
Other                                    32.3     38.4
To construct a pie chart showing how the male employees spend their time we proceed as follows.
Note that the total number of hours spent is 168 (7 × 24).
1. Express the time spent on any given activity as a proportion of the total time spent;
2. Multiply the number obtained by 360 thus converting the proportion to an angle;
3. Draw a chart consisting of (in this case) 7 sectors having the angles given in the table below
subtended at the centre of the circle.
Hours spent on:                          Males    Proportion of Time    Sector Angle (degrees)
Travel to and from work                  10.5     10.5/168              10.5/168 × 360 = 22.5
Paid activities in employment            47.0     47/168                47/168 × 360 = 100.7
Personal sport and leisure activities    8.2      8.2/168               8.2/168 × 360 = 17.6
Personal development                     5.6      5.6/168               5.6/168 × 360 = 12
Family activities                        8.4      8.4/168               8.4/168 × 360 = 18
Sleep                                    56.0     56/168                56/168 × 360 = 120
Other                                    32.3     32.3/168              32.3/168 × 360 = 69.2
The pie chart obtained is illustrated below.
[Pie chart for male employees with sectors: Travel to and from work, Paid activities in employment, Personal sport and leisure activities, Personal development, Family activities, Sleep, Other]
Figure 3
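The sector angles for the male employees can be computed directly from the hours. The following Python sketch follows steps 1 and 2 of the construction (proportion of the total, then multiplication by 360); the dictionary name hours_male is an assumption made for the illustration.

```python
# Weekly hours for male employees, from the table in the text
hours_male = {
    "Travel to and from work": 10.5,
    "Paid activities in employment": 47.0,
    "Personal sport and leisure activities": 8.2,
    "Personal development": 5.6,
    "Family activities": 8.4,
    "Sleep": 56.0,
    "Other": 32.3,
}

total = sum(hours_male.values())  # should be 7 x 24 = 168 hours

# Proportion of the week, converted to an angle in degrees
angles = {activity: h / total * 360 for activity, h in hours_male.items()}

for activity, angle in angles.items():
    print(f"{activity:<40} {angle:6.1f}")
```

Checking that the hours total 168 and that the angles total 360 degrees guards against a mistyped entry before the chart is drawn.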
Task
Construct a pie chart for the female employees of the company and use both it
and the pie chart in Figure 3 to comment on any differences between male and
female employees that are illustrated.
Your solution
Answer
Hours spent on:                          Females    Proportion of Time    Sector Angle (degrees)
Travel to and from work                  8.4        8.4/168               8.4/168 × 360 = 18
Paid activities in employment            37.0       37/168                37/168 × 360 = 79.3
Personal sport and leisure activities    3.6        3.6/168               3.6/168 × 360 = 7.7
Personal development                     6.4        6.4/168               6.4/168 × 360 = 13.7
Family activities                        18.2       18.2/168              18.2/168 × 360 = 39
Sleep                                    56.0       56/168                56/168 × 360 = 120
Other                                    38.4       38.4/168              38.4/168 × 360 = 82.3
[Pie chart for female employees with sectors: Travel to and from work, Paid activities in employment, Personal sport and leisure activities, Personal development, Family activities, Sleep, Other]
Comments: Proportionally less time spent travelling, more on family activities etc.
Quartiles and the ogive
Later in this Workbook we shall be looking at the statistics derived from data which are placed in
rank order. Ranking data simply means that the data are placed in order from the highest to the
lowest or from the lowest to the highest. Three important statistics can be derived from ranked
data, these are the Median, the Lower Quartile and the Upper Quartile. As you will see the Ogive or
Cumulative Frequency Curve enables us to find these statistics for large data sets. The definitions of
the three statistics referred to are given below.
Key Point 1
The Median: this is the central value of a distribution. It should be noted that if the data set
contains an even number of values, the median is defined as the average of the middle pair.
The Lower Quartile: this is the least number which has 25% of the distribution below it or equal
to it.
The Upper Quartile: this is the greatest number which has 25% of the distribution above it or
equal to it.
For the simple data set 1.2, 3.0, 2.5, 5.1, 3.5, 4.1, 3.1, 2.4 the process is illustrated by placing
the members of the data set in rank order:
5.1, 4.1, 3.5, 3.1, 3.0, 2.5, 2.4, 1.2
Here we have an even number of values and so the median is calculated as follows:
Median = the average of the two central values = (3.1 + 3.0)/2 = 3.05
The lower quartile and the upper quartile are easily read off using the definition given above:
Lower Quartile = 2.4 Upper Quartile = 4.1
It can be difficult to decide on realistic values when the distribution contains only a small number of
values.
Task
Find the median, lower quartile and upper quartile for the data set:
5.0, 4.1, 3.5, 3.1, 3.0, 2.5, 2.4, 1.2, 0.7
Your solution
Answer
Lower Quartile = (1.2 + 2.4)/2 = 1.8
Median = 3.0
Upper Quartile = (4.1 + 3.5)/2 = 3.8
Note: Check the answer carefully when you have completed the exercise; finding the median is
easy but deciding on the values of the upper and lower quartiles is more difficult.
In the case of larger distributions the quantities can be approximated by using a cumulative
frequency curve or ogive.
The cumulative frequency distribution for the distribution of the heights of the 30 students given
earlier is shown below. Notice that here, the class intervals are defined in such a way that the
frequencies accumulate (hence the term cumulative frequency) as the table is built up.
Height Cumulative Frequency
less than 149.5 0
less than 154.5 2
less than 159.5 2
less than 164.5 6
less than 169.5 14
less than 174.5 19
less than 179.5 26
less than 184.5 30
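The cumulative frequencies in this table are running totals of the class frequencies found earlier. A short Python sketch of the accumulation, with illustrative variable names, is:

```python
from itertools import accumulate

# Class boundaries (cm) and class frequencies from the frequency table
boundaries = [149.5, 154.5, 159.5, 164.5, 169.5, 174.5, 179.5, 184.5]
frequencies = [2, 0, 4, 8, 5, 7, 4]

# No data lie below the first boundary, so the running total starts at 0
cumulative = [0] + list(accumulate(frequencies))

for boundary, total in zip(boundaries, cumulative):
    print(f"less than {boundary}: {total}")
```

The final cumulative frequency must equal the number of data items, here 30.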
To plot the ogive or cumulative frequency curve, we plot the heights on the horizontal axis and the
cumulative frequencies on the vertical axis. The corresponding ogive is shown below.
[Ogive: horizontal axis Height (cm) with class boundaries 149.5 to 184.5; vertical axis Cumulative Frequency from 0 to 30; the positions LQ (Lower Quartile), M (Median) and UQ (Upper Quartile) are marked on the horizontal axis]
Figure 4
In general, ogives are ‘S’ - shaped curves. The three statistics defined above can be read off the
diagram as indicated. For the data set giving the heights of the 30 students, the three statistics are
defined as shown below.
1. The Median: this is the average of the 15th and 16th values (170.4) since we have an even
number of data;
2. The Lower Quartile: 25% of 30 = 7.5 and so we take the average of the 7th and 8th values
(164.9) read off from the bottom of the distribution to have 25% of the distribution less than
or equal to it;
3. The Upper Quartile: again 25% of 30 = 7.5 and so we take the average of the 7th and 8th
values (177) read off from the top of the distribution to have 25% of the distribution
greater than or equal to it.
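The three statistics can also be computed directly from the ranked data. The Python sketch below implements one reading of the convention used in this Section: count 25% of the way in from the bottom (lower quartile) or from the top (upper quartile) and, if that position is not a whole number, average the two neighbouring values. The function names are illustrative, the data set is assumed to contain at least four values, and other textbooks and statistical packages use slightly different quartile conventions, so results may differ a little elsewhere.

```python
import math


def value_at(ordered, position):
    """Value at a 1-based position; a fractional position averages its two neighbours."""
    low, high = math.floor(position), math.ceil(position)
    return (ordered[low - 1] + ordered[high - 1]) / 2


def quartile_summary(data):
    """Return (lower quartile, median, upper quartile) for at least four values."""
    ascending = sorted(data)
    descending = ascending[::-1]
    n = len(ascending)
    median = value_at(ascending, (n + 1) / 2)
    lower_q = value_at(ascending, n / 4)    # counted from the bottom
    upper_q = value_at(descending, n / 4)   # counted from the top
    return lower_q, median, upper_q


heights = [
    150.2, 167.2, 176.2, 160.1, 151.8, 166.3, 162.3, 167.4, 178.3, 181.2,
    175.7, 161.1, 179.3, 168.9, 164.8, 165.0, 177.1, 183.2, 172.1, 180.2,
    168.2, 173.8, 164.3, 176.8, 184.2, 170.9, 172.2, 168.5, 169.8, 176.7,
]

lq, med, uq = quartile_summary(heights)
print(round(lq, 2), round(med, 2), round(uq, 2))
```

For the 30 heights this gives values that round to the figures quoted above, and the same function reproduces the quartiles of the two small data sets considered earlier.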
4. Location and spread

Very often we can summarize a distribution by specifying two values which measure the location
or mean value of the distribution and dispersion or spread of the distribution about its mean. You
will see later (see subsection 3 below) that not all distributions can be adequately represented by
simply measuring location and spread - the shape of a distribution is also of fundamental importance.
Assume, for the purposes of this Section that the distribution is reasonably symmetrical and roughly
follows the bell-shaped distribution illustrated below.
Figure 5: A symmetrical, bell-shaped distribution (horizontal axis: Classes; vertical axis: Frequency)
In order to summarise a distribution as briefly as possible we shall now attempt to measure the centre
or location of the distribution and the spread or dispersion of the distribution about its centre.
Notation
The symbols µ and σ are used to represent the mean and standard deviation of a population and x̄
and s are used to represent the mean and standard deviation of a sample taken from a population.
This Section of the Workbook will show you how to calculate the mean.
Measures of location
There are three widely used measures of location, these are:
• The Mean, the arithmetic average of the data;
• The Median, the central value of the data;
• The Mode, the most frequently occurring value in the data set.
This Section of the booklet will show you how to calculate the mean.
Key Point 2
If we take a set of numbers $x_1, x_2, \dots, x_n$, its mean value is defined as:
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$
This is usually shortened to:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad\text{and written as:}\quad \bar{x} = \frac{1}{n}\sum x$$
In words, this formula says

sum the values of x and divide by the number of numbers you have summed.
Calculating mean values from raw data is accurate but very time-consuming and tedious. It is much
more usual to work from a frequency distribution which makes the calculation much easier but may
involve a slight loss of accuracy. In order to calculate the mean of a distribution from a frequency
table we make the major assumption that each class interval can be represented accurately by its
Mid-Interval Value (MIV). Essentially, this means that we are assuming that the class values are
evenly spread above and below the MIV for each class in the distribution so that the sum of
the values in each class is approximately equal to the MIV multiplied by the number of members in
the class.
The calculation resulting from this assumption is illustrated below for the data on heights of students
introduced on page 3 of this Section.
Class            MIV (x)   Frequency (f)   fx
149.5 − 154.5    152       2               304
154.5 − 159.5    157       0               0
159.5 − 164.5    162       4               648
164.5 − 169.5    167       8               1336
169.5 − 174.5    172       5               860
174.5 − 179.5    177       7               1239
179.5 − 184.5    182       4               728
                           Σf = 30         Σfx = 5115
The average value of the distribution is given by
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{5115}{30} = 170.5$$
The formula usually used to calculate the mean value is
$$\bar{x} = \frac{\sum fx}{\sum f}$$
There are techniques for simplifying the arithmetic but the wide-spread use of electronic calculators
(many of which will do the calculation almost at the push of a button) and computers has made a
working knowledge of such techniques redundant.
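For readers working on a computer, the frequency-table calculation can be sketched in a few lines of Python. The mid-interval values and frequencies are taken from the table above; the variable names are illustrative.

```python
# Mid-interval values (MIV) and class frequencies from the frequency table.
mid_values = [152, 157, 162, 167, 172, 177, 182]
frequencies = [2, 0, 4, 8, 5, 7, 4]

sum_f = sum(frequencies)                                       # total frequency
sum_fx = sum(f * x for f, x in zip(frequencies, mid_values))   # sum of f times x

mean = sum_fx / sum_f
print(sum_f, sum_fx, mean)
```

The result relies on the same assumption as the hand calculation: every member of a class is represented by the MIV of that class.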
Task
Use the following data set of heights of a sample of 30 students (met before in
the Task on page 6) to form a frequency distribution and calculate the mean of
the data.
155.3 177.3 146.2 163.1 161.8 146.3 167.9 165.4 172.3 188.2
178.8 151.1 189.4 164.9 174.8 160.2 187.1 163.2 147.1 182.2
178.2 172.8 164.4 177.8 154.6 154.9 176.3 148.5 161.8 178.4
Your solution
Answer
Class    MIV (x)   Frequency (f)   fx
145−     147.5     4               590
150−     152.5     3               457.5
155−     157.5     1               157.5
160−     162.5     7               1137.5
165−     167.5     2               335
170−     172.5     3               517.5
175−     177.5     6               1065
180−     182.5     1               182.5
185−     187.5     3               562.5
                   Sum = 30        Sum = 5005
Mean = 166.83
Measures of spread
The members of a distribution may be scattered about a mean in many different ways so that a
single value describing the central location of a distribution cannot be sufficient to completely define
the distribution.
The two data sets below have the same mean of 7 but clearly have different spreads about the mean.
Data set A: 5, 6, 7, 8, 9
Data set B: 1, 2, 7, 12, 13
There are several ways in which one can measure the spread of a distribution about a mean, for
example
• the range - the difference between the greatest and least values;
• the inter-quartile range - the difference between the upper and lower quartiles;
• the mean deviation - the average deviation of the members of the distribution
from the mean.
Each of these measures has advantages and problems associated with it.
Measure of Spread      Advantages                                            Disadvantages
Range                  Easy to calculate.                                    Depends on two extreme values and does not take into account any intermediate values.
Inter-Quartile Range   Is not susceptible to the influence of extreme values.   Measures only the central 50% of a distribution.
Mean Deviation         Takes into account every member of a distribution.    Always has the value zero for a symmetrical distribution.
By far the most common measure of the spread of a distribution is the standard deviation which
is obtained by using the procedure outlined below.
Consider the two data sets A and B given above. Before writing down the formula for calculating
the standard deviation we shall look at the tables below and discuss how a measure of spread might
evolve.
DATA SET A                        DATA SET B
x    x − x̄    (x − x̄)²            x     x − x̄    (x − x̄)²
5    −2       4                   1     −6       36
6    −1       1                   2     −5       25
7    0        0                   7     0        0
8    1        1                   12    5        25
9    2        4                   13    6        36
Σ(x − x̄)² = 10                    Σ(x − x̄)² = 122
Notice that the ranges of the data sets are 4 and 12 respectively and that the mean deviations are
both zero. Clearly the spreads of the two data sets are different and the zero value for the mean
deviations, while factually correct, has no meaning in practice.
To avoid problems inherent in the mean deviation (cancelling to give zero with a symmetrical
distribution for example) it is usual to square the deviations from the mean and then average the squares.
This gives a value in square units and it is usual to take the square root of this value so that the
spread is measured in the same units as the original values. The quantity obtained by following the
routine outlined above is called the standard deviation.
The symbol used to denote the standard deviation is s so that the standard deviations of the two
data sets are:
$$s_A = \sqrt{\frac{10}{5}} = 1.41 \quad\text{and}\quad s_B = \sqrt{\frac{122}{5}} = 4.94$$
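The routine just described (subtract the mean, square, average, take the square root) can be sketched in Python as follows. The function name is illustrative, and the divisor n is used, as in the calculation above.

```python
from math import sqrt

def standard_deviation(values):
    """Square root of the average squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

data_a = [5, 6, 7, 8, 9]
data_b = [1, 2, 7, 12, 13]

s_a = standard_deviation(data_a)
s_b = standard_deviation(data_b)
print(round(s_a, 2), round(s_b, 2))
```

Both data sets have mean 7, so any difference between the two printed values reflects spread alone.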
The two distributions and their spreads are illustrated by the diagrams below.
Figure 6: Frequency diagrams of Data Set A and Data Set B on a common x axis from 1 to 13, each with the standard deviation s marked either side of the mean
Task
Calculate the standard deviation of the data set: 3, 4, 5, 6, 6, 6, 7, 8, 9
Your solution
Answer
x    x − mean   (x − mean)²
3    −3         9
4    −2         4
5    −1         1
6    0          0
6    0          0
6    0          0
7    1          1
8    2          4
9    3          9
mean = 6    Σ(x − mean) = 0    Σ(x − mean)² = 28
standard deviation = 1.76383421
Summary
The procedure for calculating the standard deviation may be summarized as follows:
from every raw data value, subtract the mean, square the
results, average them and then take the square root.
In terms of a formula this procedure is given in Key Point 3:
Key Point 3
Formula for Standard Deviation
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
You will often need a quantity called the variance of a set of data. This is simply the square of the
standard deviation and is denoted by $s^2$. Calculating the variance is exactly like calculating the
standard deviation except that you do not take the square root at the end, as in Key Point 4:
Key Point 4
Formula for Variance
$$s^2 = \frac{\sum (x - \bar{x})^2}{n}$$
Very often our data represent a sample of size n from some population. If we could observe every
member of the population then we could work out the mean and standard deviation for the whole
population. Often we cannot do this and can only observe a sample. The population mean and
population variance are therefore unknown but we can regard the sample mean and sample variance
as estimates of them. To make the distinction clear, we usually use Greek letters for population
parameters. So, the population mean is µ and the population variance is $\sigma^2$. The population standard
deviation is, of course, σ.
When we are estimating µ and $\sigma^2$ using a sample of data we use a slightly different formula in the
case of the variance. This formula is given in Key Point 5. It is discussed further in Workbook 40.
The difference is simply that we divide by n − 1 instead of by n. In the rest of this Workbook we will
use the notation $s_n^2$ if we are dividing by n and $s_{n-1}^2$ if we are dividing by n − 1. We will use $s_n$ and
$s_{n-1}$ for the corresponding standard deviations which are simply the square roots of these variances.
Key Point 5
Formula for Estimating Variance
$$s_{n-1}^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
where $s_{n-1}^2$ is the estimate of the population variance $\sigma^2$ and x̄ is the mean of the data in the sample
of size n taken from a population.
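The two divisors can be compared with Python's standard `statistics` module, in which `pvariance` divides by n and `variance` divides by n − 1. This is only a sketch; the data are the nine values 3, 4, 5, 6, 6, 6, 7, 8, 9 used in the standard deviation Task, and the variable names are illustrative.

```python
import statistics

sample = [3, 4, 5, 6, 6, 6, 7, 8, 9]

var_n = statistics.pvariance(sample)          # divisor n
var_n_minus_1 = statistics.variance(sample)   # divisor n - 1

print(var_n, var_n_minus_1)
```

Because the sum of squared deviations is the same in both cases, the n − 1 version is always the larger of the two, and the gap narrows as n grows.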
For data represented by a frequency distribution, in which each quantity x appears with frequency
f, the formula in Key Point 4 becomes
$$s_n^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$$
This formula can be simplified as shown below to give a formula which lends itself to a calculation
based on a frequency distribution. The derivation of the variance formula is shown below.
$$\begin{aligned}
s_n^2 &= \frac{\sum f(x - \bar{x})^2}{\sum f} \\
&= \frac{\sum f(x^2 - 2\bar{x}x + \bar{x}^2)}{\sum f} \\
&= \frac{\sum fx^2 - 2\bar{x}\sum fx + \bar{x}^2\sum f}{\sum f} \\
&= \frac{\sum fx^2}{\sum f} - 2\bar{x}^2 + \bar{x}^2 \\
&= \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2
\end{aligned}$$
This formula is not as complicated as it looks at first sight. If you look back at the calculation for the
mean you will see that you only need one more quantity in order to calculate the standard deviation,
this quantity is $\sum fx^2$.
Calculation of the variance

The complete calculation of the mean and the variance for a frequency distribution (heights of 30
students, page 3) is shown below.
Class            MIV (x)   Frequency (f)   fx            fx²
149.5 − 154.5    152       2               304           46208
154.5 − 159.5    157       0               0             0
159.5 − 164.5    162       4               648           104976
164.5 − 169.5    167       8               1336          223112
169.5 − 174.5    172       5               860           147920
174.5 − 179.5    177       7               1239          219303
179.5 − 184.5    182       4               728           132496
                           Σf = 30         Σfx = 5115    Σfx² = 874015
Once the appropriate columns are summed, the calculation is completed by substituting the values
into the formulae for the mean and the standard deviation.
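The mean value is
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{5115}{30} = 170.5$$
The variance is
$$s_n^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{874015}{30} - \left(\frac{5115}{30}\right)^2 = 63.58$$
Taking the square root gives the standard deviation as $s_n = 7.97$.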
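The same calculation can be sketched in Python, which also makes it easy to see that only the three column totals are needed. The mid-interval values and frequencies are those of the table above; the variable names are illustrative.

```python
from math import sqrt

# Mid-interval values (MIV) and class frequencies from the frequency table.
mid_values = [152, 157, 162, 167, 172, 177, 182]
frequencies = [2, 0, 4, 8, 5, 7, 4]

sum_f = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, mid_values))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, mid_values))

mean = sum_fx / sum_f
variance = sum_fx2 / sum_f - mean ** 2   # shortcut form of the variance (divisor n)
std_dev = sqrt(variance)

print(mean, round(variance, 2), round(std_dev, 2))
```

The shortcut form avoids computing each deviation x − x̄ separately, which is why it suits a tabular layout.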

So far, you have only met the suggestion that a distribution can be represented by its mean and
its standard deviation. This is a reasonable assertion provided that the distribution is single-peaked
and symmetrical. Fortunately, many of the distributions met in practice are single-peaked and sym-
metrical. In particular, the so-called normal distribution which is bell-shaped and symmetrical about
its mean is usually summarized numerically by its mean and standard deviation or by its mean and
variance. A typical normal distribution is illustrated below.
Figure 7: A normal distribution curve (vertical axis: Frequency; horizontal axis: x, marked at µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ and µ + 3σ)
It is sometimes found that data cannot be assumed to be normally distributed and techniques have
been developed which enable such data to be explored, illustrated, analysed and represented using
statistics other than the mean and standard deviation.
Task
Use the following data set of student heights (taken from the Task on page 5)
to form a frequency distribution and calculate the mean, variance and standard
deviation of the data.
155.3 177.3 146.2 163.1 161.8 146.3 167.9 165.4 172.3

188.2 178.8 151.1 189.4 164.9 174.8 160.2 187.1 163.2
147.1 182.2 178.2 172.8 164.4 177.8 154.6 154.9 176.3
148.5 161.8 178.4
Your solution
Answer
Class    MIV (x)   Frequency (f)   fx             fx²
145−     147.5     4               590            87025
150−     152.5     3               457.5          69768.75
155−     157.5     1               157.5          24806.25
160−     162.5     7               1137.5         184843.75
165−     167.5     2               335            56112.5
170−     172.5     3               517.5          89268.75
175−     177.5     6               1065           189037.5
180−     182.5     1               182.5          33306.25
185−     187.5     3               562.5          105468.75
                   Σf = 30         Σfx = 5005     Σfx² = 839637.5
Mean = 166.83    Variance = 154.56    Standard Deviation = 12.43
Exercises
1. Find (a) the mean and standard deviation, (b) the median and inter-quartile range, of the
following data set:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Would you say that either summary set is preferable to the other?
If the number 10 is replaced by the number 100 so that the data set becomes
1, 2, 3, 4, 5, 6, 7, 8, 9, 100
calculate the same statistics again and comment on which set you would use to summarise the
data.
2. (a) The following data give the number of calls per day received by the service department
of a central heating firm during a period of 24 working days.
16, 12, 1, 6, 44, 28, 1, 19, 15, 11, 18, 35,

21, 3, 3, 14, 22, 5, 13, 15, 15, 25, 18, 16
Organise the data into a frequency table using the class intervals
1 − 10, 11 − 20, 21 − 30, 31 − 40, 41 − 50
Construct a histogram representing the data and calculate the mean and variance of the
data.
(b) Repeat question (a) using the data set given below:
11, 12, 1, 2, 41, 21, 1, 11, 12, 11, 11, 32,

21, 3, 3, 11, 21, 2, 11, 12, 11, 21, 12, 11
What do you notice about the histograms that you have produced? What do you notice
about the means and variances of the two distributions?
Do the results surprise you? If so, say why.
3. For each of the data sets in Question 2, calculate the mean and variance from the raw data
and compare the results with those obtained from the frequency tables. Comment on any
differences that you find and explain them.
4. A lecturer gives a science test to two classes and calculates the results as follows:
Class A - average mark 36% Class B - average mark 40%
The lecturer reports to her Head of Department that the average mark over the two classes
must be 38%. The Head of Department disagrees. Who is right?
Do you need any additional information to decide who is right? If so, what?
Answers
1. Mean = 5.50, standard deviation = 2.87, mid-spread = 6. Very little to choose between
the summary statistics, mean = median and inter-quartile range is approximately twice the
standard deviation.
For the second set of data mean = 14.50, standard deviation = 28.60 and the
inter-quartile range = 6. Here the median and inter-quartile range are preferable to the mean
and standard deviation: they represent the bulk of the data much more realistically.
2. (a) Calculations from the raw data: Mean = 15.67, standard deviation = 10.23
However, from the frequency table: Mean = 16.75, standard deviation = 9.71
[Question 2a - Histogram: Frequency (vertical axis, 0 to 14) against Classes 1 to 10, 11 to 20, 21 to 30, 31 to 40 and 41 to 50]
(b) Mean = 12.71, standard deviation = 9.50
However, from the frequency table: Mean = 16.75, standard deviation = 9.71
[Question 2b - Histogram: Frequency (vertical axis, 0 to 14) against Classes 1 to 10, 11 to 20, 21 to 30, 31 to 40 and 41 to 50]
The mean, standard deviation and histogram are all identical since the classes and frequencies
are. This may be surprising since the data sets are different!
3. The means and standard deviations calculated from the raw data are clearly the ones to use.
The data given in Question 2(a) has a reasonably uniform spread throughout the classes,
hence the reasonable agreement in the calculated means and standard deviations.
The data given in Question 2(b) is biased towards the bottom of the classes, hence the high
value of the calculated mean from the frequency distribution which assumes a reasonable
spread of data throughout the classes. The actual spread of the data is the same (hence the
same standard deviations) but the data in Question 2(b) is shifted down relative to that given
in Question 2(a).
4. The Head of Department is right. The lecturer is only correct if both classes have the same
number of students. Example: if class A has 20 students and class B has 60 students, the
average mark will be: (20 × 36 + 60 × 40)/(20 + 60) = 39%.

Exploring Data 36.2
Introduction
Techniques for exploring data to enable valid conclusions to be drawn are described in this Section.
The diagrammatic methods of stem-and-leaf and box-and-whisker are given prominence.
You will also learn how to summarize data using sets of statistics which have meaning in cases where
a data set is not symmetrical. You should note that statistics such as the mean and variance are
of limited use in such situations. Finally, you will encounter outliers. These are values which lie
outside the main body of the data set and can enable you to reach important conclusions about the
behaviour of the data.


( 35.1)

Learning Outcomes
On completion you should be able to . . .
• undertake Exploratory Data Analysis (EDA)
• construct stem-and-leaf diagrams and box-and-whisker plots
• explain the significance of outliers, skewness, gaps and multiple peaks
1. Exploratory data analysis
Introduction
The title ‘Exploratory Data Analysis’ (EDA) is usually taken to mean the activity by which data is
explored and organized in order that the information it contains is made clear. This branch of statistics
usually deals with summary statistics which are resistant to departures from normality. The techniques
used in EDA were first developed by the statistician John Tukey and for details of EDA which are
beyond this open learning booklet, you are referred to the text Exploratory Data Analysis, by J.W.
Tukey, Addison-Wesley, 1977. Tukey’s techniques have been used in innumerable papers and books
since that date.
The basics of EDA

The basic principles followed in EDA are:
• To measure the location and spread of a distribution we use statistics which are
resistant to departures from normality;
• To summarise shape, location and spread we use several statistics rather than just two;
• Visual displays as well as numerical displays are used to summarise information obtained
about shape, location and spread.
You can see these principles illustrated below.
Traditionally, the location and spread of a distribution are measured by calculating its mean and
standard deviation. The problem with these statistics is that they are sensitive to the influence of extreme
values. For example, the data set
1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6
has mean x̄ = 3.5 and standard deviation sn−1 = 1.45. These values are quite acceptable since the
distribution is symmetrical about its mean of 3.5. The symmetry is easily seen simply by inspecting
the data although the bar chart below might make the symmetry more obvious.
Figure 8: Bar chart of the data (horizontal axis: Classes 1 to 6; vertical axis: Frequency, 0 to 3)
The shape of the distribution may also be shown by the stem-and-leaf diagram below. Notice that
the stem consists of the numbers 1 to 6 and the leaves are just the members of each class.
1 1
2 2 2
3 3 3 3
4 4 4 4
5 5 5
6 6
Figure 9
You will study the stem-and-leaf diagram in more detail later in this Workbook.
The effects of changes in extreme values are easily illustrated by looking at what happens if we take
the last number to be 60 instead of 6. This destroys the symmetry of the distribution and gives mean
x̄ = 8 and standard deviation sn−1 = 16.42. Clearly, these values do not describe the distribution
very well at all: a mean which is higher than 92% of the members of the distribution can hardly be
described as representative!
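The effect is easy to reproduce with Python's standard `statistics` module. In this sketch `stdev` uses the divisor n − 1, and the median is printed alongside as an example of a statistic based on rank order; the variable names are illustrative.

```python
import statistics

original = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6]
altered = original[:-1] + [60]   # the last value, 6, replaced by 60

for name, data in (("original", original), ("altered", altered)):
    mean = statistics.mean(data)
    sd = statistics.stdev(data)      # divisor n - 1
    median = statistics.median(data)
    print(name, mean, round(sd, 2), median)
```

The mean and standard deviation change substantially between the two lines of output, while the median is the same for both data sets.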
The simplest and most common examples of resistant statistics are those based on the idea of rank
order - we simply order a distribution starting at the highest value and ending at the lowest value (or
lowest to highest).
Key Point 5
The five essential statistics based on rank order are illustrated in the diagram below:
Highest Value
25%
Upper Hinge
Distribution
25%
Median
25%
Lower Hinge
25%
Lowest Value
Key Point 6
Using the values in Key Point 5 other statistics which represent the shape or spread of the distribution
may be defined. These statistics are known as the Mid-Spread, High-Spread and Low-Spread and
their definition is indicated in the diagram below.
Highest Value
25%
Upper Hinge High-Spread
Distribution
25%
Median Mid-Spread
25%
Lower Hinge Low-Spread
25%
Lowest Value
Elementary EDA recommends the use of a five-number summary consisting of:
1. the lowest value;
2. the lower hinge;
3. the median;
4. the upper hinge;
5. the highest value.
to summarize a distribution. You will find that the five-number summary, especially when used in
conjunction with the three spreads shown in the diagram above gives an adequate representation of
a non-symmetrical distribution.
Notice that:
• the spreads shown in the diagram above are easily calculated once the
five-number summary is known;
• the median and the hinges are unaffected by changes in extreme values.
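Both observations can be checked with a short Python sketch. The function name and the hinge rule are assumptions made for illustration: each hinge is taken as the average of the two ranked values either side of the 25% and 75% positions, which is one of several conventions in use.

```python
import statistics

def five_number_summary(values):
    """Return (lowest, lower hinge, median, upper hinge, highest).

    Assumed hinge rule: average the ranked values at positions
    int(0.25 * n) and int(0.25 * n) + 1 (similarly for 0.75 * n).
    Intended for data sets with at least 4 values.
    """
    ranked = sorted(values)
    n = len(ranked)
    lo = int(0.25 * n)   # for n = 30 this is 7, so ranks 7 and 8 are averaged
    hi = int(0.75 * n)   # for n = 30 this is 22, so ranks 22 and 23 are averaged
    lower_hinge = (ranked[lo - 1] + ranked[lo]) / 2
    upper_hinge = (ranked[hi - 1] + ranked[hi]) / 2
    return ranked[0], lower_hinge, statistics.median(ranked), upper_hinge, ranked[-1]

original = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6]
altered = original[:-1] + [60]   # change only the extreme value

for data in (original, altered):
    lowest, lower_hinge, median, upper_hinge, highest = five_number_summary(data)
    low_spread = median - lowest
    mid_spread = upper_hinge - lower_hinge
    high_spread = highest - median
    print((lowest, lower_hinge, median, upper_hinge, highest),
          low_spread, mid_spread, high_spread)
```

Only the highest value and the high-spread differ between the two lines of output; the hinges, the median and the mid-spread are unchanged.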
Task
Find the five number summary and the mid-spread, high-spread and low-spread
for the distribution given below.
1 9 17 2 9 17 3 10 18 3 11 19 4 12 19
5 12 20 6 13 21 6 13 22 7 14 23 8 16 27
Your solution
Answer
Ranked data:
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12
12 13 13 14 16 17 17 18 19 19 20 21 22 23 27
Lowest Value = 1
Lower Hinge = 6
Median = 12
Upper Hinge = 17.5
Highest Value = 27
Low-Spread = 11
Mid-Spread = 11.5
High-Spread = 15
The stem-and-leaf diagram

You have already seen a basic stem-and-leaf diagram and you know that it shows the shape of a
distribution well. Here you will learn how to handle larger amounts of data to form stem-and-leaf
diagrams. As you will see, one set of data can give rise to more than one stem-and-leaf diagram and
highlight different aspects of the data. Look at the data set below:
11 9 6 27 17 2 19 12 8 17 3 10 23 6 18
13 11 22 13 19 4 12 23 34 19 15 7 40 16 20
Using the numbers to the left of the stem to represent 10s and the numbers to the right to represent
units we obtain the stem-and-leaf diagram shown below.
0 2 3 4 6 6 7 8 9
1 0 1 1 2 2 3 3 5 6 7 7 8 9 9 9
2 0 2 3 3 7
3 4
4 0
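A diagram of this kind can be generated with a short Python sketch. The function name is illustrative, and the code assumes non-negative whole numbers with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict

data = [11, 9, 6, 27, 17, 2, 19, 12, 8, 17, 3, 10, 23, 6, 18,
        13, 11, 22, 13, 19, 4, 12, 23, 34, 19, 15, 7, 40, 16, 20]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    # Keep empty stems so that any gaps in the data remain visible.
    return {stem: leaves.get(stem, []) for stem in range(min(leaves), max(leaves) + 1)}

diagram = stem_and_leaf(data)
for stem, row in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in row))
```

Sorting before grouping means each row of leaves is already in rank order, so the printed diagram doubles as a ranking of the data.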
Notice that the skewed nature of the data stands out immediately. What also stands out are the
following:
• the 10s class has the highest number of members;
• the modal (most frequently occurring) value is 19;
• the 30s and 40s tie for the least number of members (one each).
This is not new information: we could have written these facts down after properly inspecting the
original raw data. The advantage of the stem-and-leaf diagram is that it enables these facts to be
expressed in a clear and obvious way. As a further illustrative example, look at the data in the table
below which we will use to draw two stem-and-leaf diagrams.
9.5 11.9 20.0 33.4 40.1 50.0 12.7 21.0 33.6 40.6
50.0 15.5 26.4 35.4 41.1 50.0 17.7 37.9 41.3 50.0
41.9 50.4 43.0 43.3 43.6 43.7 43.8 44.7 44.9 45.0
45.1 45.2 45.3 45.5 46.1 46.5 46.6 47.1 48.0 48.2
48.5 48.4 48.6 48.7 48.8 48.9 49.4 49.5 49.6 49.8
Drawing a stem-and-leaf diagram

We can start by looking at the data as it is displayed by a stem-and-leaf diagram. Here we will use
two-digit leaves with the first digit representing units and the second digit representing tenths. The
tens are represented by the numbers to the left of the stem.
0 | 95
1 | 19, 27, 55, 77
2 | 00, 10, 64
3 | 34, 36, 54, 79
4 | 01, 06, 11, 13, 19, 30, 33, 36, 37, 38, 47, 49, 50, 51, 52, 53, 55, 61, 65, 66, 71, 80, 82, 84, 85, 86, 87, 88, 89, 94, 95, 96, 98
5 | 00, 00, 00, 00, 04
Notice that all we have really done is rank the data from the lowest value to the highest value reading
from top to bottom. This particular display has over half of its members crushed into one class - the
4-class.
It may be informative to split the classes and look more closely at the data.
This can be done by:
1. rounding the raw data to two figures;
2. splitting each class according to the rule
second digit 0 - 4 ........... *
second digit 5 - 9 ........... •
The rounded raw data now appear as follows

10 12 20 33 40 50 13 21 34 41
50 16 26 35 41 50 18 38 41 50
42 50 43 43 44 44 44 45 45 45
45 45 45 46 46 47 47 47 48 48
49 48 49 49 49 49 49 50 50 50
The stem and leaf diagram now becomes
0
0
1 0 2 3
1 6 8
2 0 1
2 6
3 3 4
3 5 8
4 0 1 1 1 2 3 3 4 4 4
4 5 5 5 5 5 5 6 6 7 7 7 8 8 8 9 9 9 9 9 9
5 0 0 0 0 0 0 0 0
Essentially, the classes have been split according to the usual rule for rounding decimals. This process
can make certain information contained in the data a little more obvious than the previous stem and
leaf diagram. For example:
• the values in the 3-class are evenly distributed between both halves of the class in the
sense that each half has two members;
• the 4-class is split in the ratio 2:1 in favour of the upper half of the class;
• the values in the 5-class are all in the lower half of the class.
You should have realised that:
• this is not new information - the new display has merely highlighted certain aspects
of the raw data;
• some of the conclusions may have been affected by the rounding process.
Looking at the original stem and leaf diagram of the Inter-party data it is easy to produce a five-
number summary of the data. The summary is:
1. The lowest value, this is 9.50;
2. The lower hinge, this is 39 (to find the lower hinge average the 12th and 13th values);
3. The median, this is 45.05 (the average of the 25th and 26th values);
4. The upper hinge, this is 48.55 (to find the upper hinge average the 37th and 38th values);
5. The highest value, this is 50.4.
The corresponding spreads are:
1. The low-spread, this is 45.05 - 9.50 = 35.55;
2. The mid-spread, this is 48.55 - 39.00 = 9.55;
3. The high-spread, this is 50.40 - 45.05 = 5.35.
Notice that the spreads indicate a considerable deviation from normality.
For an ideal normal distribution, we would expect:
• The distances between the median and hinges to be equal
• The high-spread and low-spread to be equal
• The distances between the hinges and the extremes to be equal
as shown in the following diagram.
Figure 10: The five-number summary set out along a line (Lowest Value, Lower Hinge, Median, Upper Hinge, Highest Value), showing the two hinge-to-extreme distances, the two median-to-hinge distances, the Low-Spread and the High-Spread
Task
Using the rounded data given on page 32 find the five number summary. Use your
summary to check the data for normality and comment on any deviations from
normality that you find.
Your solution
Answer
Ranked data:
10 12 13 16 18 20 21 26 33 34
35 38 40 41 41 41 42 43 43 44
44 44 45 45 45 45 45 45 46 46
47 47 47 48 48 48 49 49 49 49
49 49 50 50 50 50 50 50 50 50
Lowest Value = 10
Lower Hinge = 39
Median = 45
Upper Hinge = 49
Highest Value = 50
Comparing values as indicated by the diagram on page 24 gives the following results:
Low-Spread = 35 High-Spread = 5
Lower Hinge to Extreme = 29 Upper Hinge to Extreme = 1
Median to Lower Hinge = 6 Median to Upper Hinge = 4
While there are no hard-and-fast rules for comparing figures such as those obtained here, many
authors suggest that the figures should be within 10% of each other before normality can be assumed.
This is clearly not the case here. We conclude that the distribution of data being investigated is
not symmetrical. In fact the figures above suggest that the distribution is skewed to the left, a fact
supported by the stem-and-leaf diagram of the same data to be found above. [Note: skewness is
defined on page 41.]
Remember that the term ‘skewness’ refers to the location of the ‘tail’ of a distribution.
[Diagram: two distribution curves, one labelled Right Skew and one labelled Left Skew]
The box-and-whisker diagram

In order to visually summarise a data set we can use a box and whisker plot as well as a stem-and-
leaf diagram. A box-and-whisker diagram of the original (unrounded) Inter-Party Competition data
is shown below and the procedure necessary for drawing a plot is discussed.
You should note that there are several similar methods recommended by different authors for drawing
box-and-whisker plots and so the methods recommended in statistical texts may vary a little from
those given below.
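Figure 11: Box-and-whisker plot of the Inter-Party Competition data, with the whiskers ending at 33.4 and 50.4 and the box marked at 39, 45.05 and 48.55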
The diagram is constructed as follows:
1. The Box
(a) The left-hand vertical is placed at the lower hinge (39);
(b) The right-hand vertical is placed at the upper hinge (48.55);
(c) The vertical in the box is placed at the median (45.05).
2. The Whiskers
Notice that the mid-spread of the data (the difference between the hinges) is 9.55.
(a) Find the greatest value which is within one mid-spread (9.55) of the upper hinge (48.55).
Here 48.55 + 9.55 = 58.1 so the greatest value is 50.4.
(b) Find the least value which is within one mid-spread (9.55) of the lower hinge (39). Here
39 − 9.55 = 29.45 so the least value is 33.4.
Connect the greatest and least values to the box by means of dashed lines.
3. The Outlying Values
Mark as large dots any values which are more than 1.5 mid-spreads from the hinges. In this case
1.5 mid-spreads give a value of about 14.33 and so we mark dots which represent values which are
higher than 48.55 + 14.33 = 62.88 and values which are lower than 39 − 14.33 = 24.67. In this
example there are no values greater than 62.88, but there are 7 values which are less than 24.67.
Notice that half of the data values lie in the box and that the tails show up well in the diagram. The
diagram shows the left-skew (skewness refers to the tail) present in the data.
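The construction can be sketched in Python using the 50 unrounded data values. The hinges are computed as in the five-number summary (averages of the 12th and 13th, and of the 37th and 38th, ranked values); the variable names are illustrative, and other texts may use slightly different whisker rules.

```python
data = [9.5, 11.9, 20.0, 33.4, 40.1, 50.0, 12.7, 21.0, 33.6, 40.6,
        50.0, 15.5, 26.4, 35.4, 41.1, 50.0, 17.7, 37.9, 41.3, 50.0,
        41.9, 50.4, 43.0, 43.3, 43.6, 43.7, 43.8, 44.7, 44.9, 45.0,
        45.1, 45.2, 45.3, 45.5, 46.1, 46.5, 46.6, 47.1, 48.0, 48.2,
        48.5, 48.4, 48.6, 48.7, 48.8, 48.9, 49.4, 49.5, 49.6, 49.8]

ranked = sorted(data)
lower_hinge = (ranked[11] + ranked[12]) / 2   # 12th and 13th ranked values
upper_hinge = (ranked[36] + ranked[37]) / 2   # 37th and 38th ranked values
mid_spread = upper_hinge - lower_hinge

# Whiskers end at the most extreme values within one mid-spread of the hinges.
whisker_low = min(x for x in ranked if x >= lower_hinge - mid_spread)
whisker_high = max(x for x in ranked if x <= upper_hinge + mid_spread)

# Values more than 1.5 mid-spreads beyond a hinge are marked as outlying.
fence = 1.5 * mid_spread
outlying = [x for x in ranked if x < lower_hinge - fence or x > upper_hinge + fence]

print(whisker_low, whisker_high, outlying)
```

All of the outlying values found this way lie below the box, which is the numerical counterpart of the long left-hand tail in the diagram.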
2. Outliers
Outliers are values which are well outside the range covered by the vast bulk of a data set - a precise
definition is impossible although some simple criteria do exist which may be used to detect outliers
and accept or reject outliers. The seven values shown as large dots above illustrate the concept of
outliers. Outliers can be extremely important since they may be (for example) erroneous data or they
may point the way to further investigations of a data set.
For example, one statistic used to measure the state of the industrial development of a nation is
the number of miles of railway track built per square mile of land. The box-and-whisker plot below
summarises this variable for a total of 26 nations in the year 1972 according to one author.
Figure 12: Box-and-whisker plot of the railway data for the 26 nations, labelled Barbados (0.0), Jamaica (65.7) and Cuba (518.8)
The figure for Cuba literally means that the whole island is covered by tracks which are placed about
3m apart! Clearly, there is an error in the data. In fact the 1972 Statistical Abstract of Latin America
gives the figure for Cuba as 71.75 miles of railway per square mile of land. Note that the figure is
still an outlier but is much more believable.
Task
Place the items in the data set below in rank order and use your rank ordering to
find the five number summary of the data.
155.3 177.3 146.2 163.1 161.8 146.3 167.9 165.4 172.3 188.2
178.8 151.1 189.4 164.9 174.8 160.2 187.1 163.2 147.1 182.2
178.2 172.8 164.4 177.8 154.6 154.9 176.3 148.5 161.8 178.4
Construct a box-and-whisker diagram representing the data.
Does the box-and-whisker diagram tell you that the data set that you are working
with is symmetrical? Record the reasons for your comments.
Your solution
Work the solution on a separate piece of paper. Record the main stages in the calculation and your
conclusions here.
Answer
Ranked data:
146.2 146.3 147.1 148.5 151.1 154.6 154.9 155.3 160.2 161.8
161.8 163.1 163.2 164.4 164.9 165.4 167.9 172.3 172.8 174.8
176.3 177.3 177.8 178.2 178.4 178.8 182.2 187.1 188.2 189.4
Lowest Value = 146.2
Lower Hinge = 155.1
Median = 165.15
Upper Hinge = 177.55
Highest Value = 189.4
Low-Spread = 165.15 − 146.2 = 18.95
Mid-Spread = 177.55 − 155.1 = 22.45
High-Spread = 189.4 − 165.15 = 24.25
The Box-and-Whisker plot is:
[Box-and-whisker plot: whiskers running from the Lowest Value (146.2) to the Highest Value (189.4); box from the Lower Hinge (155.1) to the Upper Hinge (177.55), with the Median at 165.15]
The plot indicates that the distribution is not symmetrical, for example you would expect the median
value to appear midway between the hinges for a symmetrical distribution.
Criteria for rejecting outliers

As you already know, outliers may be taken to be observations which lie well outside the range of
most of a sample. They are important for several reasons:
1. they can have a misleading effect on statistics such as the mean and standard deviation;
2. their occurrence may be due to incorrect observation, measurement or recording. In this case
it is often possible to correct the data;
3. their presence can induce a false skewness in a data set;
4. they may actually be members of a population not under consideration. For example, a study
of urban families may involve recording the number of children in a family, say between 0 and
4 for the sake of discussion. An outlier might be caused by a rural family with, say, 10 children,
living in temporary urban accommodation. This family is part of a different population.
Simple criteria exist which facilitate the detection of outliers. These criteria should be used with some
caution and never automatically used simply to reject an outlier. You should always ask why such
a value occurred in the first place and work to answer such a question sensibly before considering
rejection. Two criteria for the detection of outliers are given below. Criterion 1 may be applied to
data sets that are known to be normal in shape. Criterion 2 uses the five-number summary discussed
above and may be applied to any data sets.
Criterion 1
Knowing that some 99.7% of a normal population lies within 3 standard deviations of the mean, we
could treat any value further than, say, 3.3 standard deviations from the mean as an outlier. This
choice essentially implies that a value has less than a 1 in 1000 chance of occurring naturally
outside the range defined by 3.3 standard deviations from the mean. Using standardized scores with
$x_0$ as the potential outlier we can state the criterion
\[ \text{Accept } x_0 \text{ if } \frac{|x_0 - \bar{x}|}{s_{n-1}} \le 3.3 \qquad \text{Investigate } x_0 \text{ if } \frac{|x_0 - \bar{x}|}{s_{n-1}} > 3.3 \]
Note that x̄ and sn−1 are sample estimates of the mean and standard deviation of the population.
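Criterion 1 can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are hypothetical (they do not come from the Workbook), and the sample is assumed to be roughly normal.

```python
from statistics import mean, stdev

def standardized_distance(x0, sample):
    """Return |x0 - x_bar| / s_(n-1) for the candidate value x0."""
    return abs(x0 - mean(sample)) / stdev(sample)  # stdev uses the n-1 divisor

def needs_investigation(x0, sample, threshold=3.3):
    """Criterion 1: flag x0 when it lies more than `threshold` deviations out."""
    return standardized_distance(x0, sample) > threshold

# Hypothetical sample: twenty readings of 10 and one suspicious reading of 100.
sample = [10] * 20 + [100]
print(needs_investigation(100, sample))  # the suspicious reading
print(needs_investigation(10, sample))   # a typical reading
```

A flagged value is a prompt to ask why it occurred, not an automatic rejection.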
Criterion 2
Using a five-number summary of a data set one can easily set up a criterion which may be used to
classify outliers as either ‘moderate’ or ‘extreme’.
The following diagram illustrates the situation where IQR is the Inter-Quartile Range.
[Figure 13: box plot showing the median, LQ and UQ, with the IQR and distances of 1.5 IQR and 3 IQR marked on either side]
While all values classified as outliers should be investigated, this is particularly true of those classified
as extreme outliers.
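A minimal sketch of Criterion 2 is given below. It assumes the usual convention, which Figure 13 appears to follow, that a value more than 1.5 IQR beyond the nearer quartile is a 'moderate' outlier and a value more than 3 IQR beyond it is 'extreme'. The function name and the quartile values used in the example are hypothetical.

```python
def classify_outlier(x, lq, uq):
    """Classify x using fences measured from the lower and upper quartiles."""
    iqr = uq - lq
    # Distance of x beyond the box; zero when x lies between LQ and UQ.
    distance = max(lq - x, x - uq, 0)
    if distance > 3 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "moderate"
    return "not an outlier"

# Hypothetical quartiles LQ = 10 and UQ = 20, so IQR = 10.
for value in (15, 40, 55, -30):
    print(value, classify_outlier(value, 10, 20))
```

The quartiles are passed in rather than computed so that any convention for finding them (hinges, for example) can be used.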
Task
Manufacturing processes generally result in a certain amount of wasted material.
For reasons of cost, companies need to keep such wastage to a minimum. The
following data were gathered over a two week period by a manufacturing company
whose production lines run seven days per week. The numbers given represent the
percentage wastage of the amount of material used in the manufacturing process.
Daily Losses (%) 6 8 10 12 12 13 14 14 18 18 19 20 22 26
(a) Find the mean and standard deviation of the percentage losses of material
over the two week period.
(b) Assuming that the losses are roughly normally distributed, apply an
appropriate criterion to decide whether any of the losses are smaller or
larger than might be expected by chance.
Your solution
Answer
(a) We will treat any value further than 3.3 standard deviations from the mean as an outlier
(Criterion 1). Using standardized scores with $x_0$ as the potential outlier we need to
calculate the quantity $\dfrac{|x_0 - \bar{x}|}{s_{n-1}}$ and then accept $x_0$ as a member of the distribution if
\[ \frac{|x_0 - \bar{x}|}{s_{n-1}} \le 3.3. \]
Otherwise we reject $x_0$ as an outlier.

Calculation gives:

x        x − x̄     (x − x̄)²    |x − x̄|/sn−1
6.00     −9.14     83.59       1.63
8.00     −7.14     51.02       1.28
10.00    −5.14     26.45       0.92
12.00    −3.14     9.88        0.56
12.00    −3.14     9.88        0.56
13.00    −2.14     4.59        0.38
14.00    −1.14     1.31        0.20
14.00    −1.14     1.31        0.20
18.00    2.86      8.16        0.51
18.00    2.86      8.16        0.51
19.00    3.86      14.88       0.69
20.00    4.86      23.59       0.87
22.00    6.86      47.02       1.22
26.00    10.86     117.88      1.94
x̄ = 15.14    sn−1 = 5.60

(b) The calculation shows that all values of $\dfrac{|x_0 - \bar{x}|}{s_{n-1}}$ are less than or equal to 3.3 and so we conclude that the
daily losses are within the range indicated by chance variation.
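The calculation in this answer can be checked with a short Python sketch. The variable names are our own; the data are the fourteen daily losses given in the Task.

```python
from statistics import mean, stdev

losses = [6, 8, 10, 12, 12, 13, 14, 14, 18, 18, 19, 20, 22, 26]

x_bar = mean(losses)   # sample mean
s = stdev(losses)      # sample standard deviation (n-1 divisor)

# Standardized distance of every observation from the mean.
scores = [abs(x - x_bar) / s for x in losses]
largest = max(scores)

print(round(x_bar, 2), round(s, 2), round(largest, 2))
print(all(score <= 3.3 for score in scores))
```

The largest standardized distance belongs to the 26% loss and is well below 3.3, in agreement with the table.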
3. Skewness, gaps and multiple peaks

When exploring a data set, four properties worth looking for are outliers, skewness, gaps and multiple
peaks. Outliers have been dealt with in some detail above so the comments given below briefly
address skewness, gaps and multiple peaks.
Skewness
If a skewed distribution is represented purely by two numbers, say the mean and standard deviation,
then the representation will be inadequate. Remember that the term ‘skewness’ refers to the location
of the ‘tail’ of a distribution.
[Figure: sketches of a distribution with right skew and a distribution with left skew]
As an example, the data set below gives the current required to burn out a component under test.
9.5 11.9 20.0 33.4 40.1 50.0 12.7 21.0 33.6 40.6
50.0 15.5 26.4 35.4 41.1 50.0 17.7 37.9 41.3 50.0
41.9 50.4 43.0 43.3 43.6 43.7 43.8 44.7 44.9 45.0
45.1 45.2 45.3 46.1 46.5 46.6 47.1 48.0 48.2 45.3
48.5 48.4 48.6 48.7 48.8 48.9 49.4 49.5 49.6 49.8
The data, obtained by measuring the current in mA applied to an electronic component under
conditions of destructive testing, give the following values for the mean, standard deviation, median
and mid-spread:
x̄ = 40.72 sn−1 = 11.49 median = 45.05 and mid-spread = 9.55
The values of x̄ and sn−1 suggest that the component is destroyed at a lower average current, and
with a greater spread, than the median and mid-spread suggest. The long tail of low burn-out
currents pulls the mean below the median. Clearly, further investigation is necessary to resolve this
situation.
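As a rough check, the mean and median of the burn-out currents can be recomputed in Python. The variable names are our own, and the mid-spread is not recomputed because it depends on the convention used for the hinges.

```python
from statistics import mean, median, stdev

currents = [
    9.5, 11.9, 20.0, 33.4, 40.1, 50.0, 12.7, 21.0, 33.6, 40.6,
    50.0, 15.5, 26.4, 35.4, 41.1, 50.0, 17.7, 37.9, 41.3, 50.0,
    41.9, 50.4, 43.0, 43.3, 43.6, 43.7, 43.8, 44.7, 44.9, 45.0,
    45.1, 45.2, 45.3, 46.1, 46.5, 46.6, 47.1, 48.0, 48.2, 45.3,
    48.5, 48.4, 48.6, 48.7, 48.8, 48.9, 49.4, 49.5, 49.6, 49.8,
]

x_bar = mean(currents)
mid = median(currents)      # average of the 25th and 26th ordered values
spread = stdev(currents)    # sample standard deviation (n-1 divisor)

print(round(x_bar, 2), round(mid, 2), round(spread, 2))
print("mean below median:", x_bar < mid)
```

A mean noticeably below the median is what we would expect when the tail of the distribution lies to the left.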
Gaps and multiple peaks
Distributions with gaps and multiple peaks can be very difficult to summarise easily. The stem-and-
leaf and box-and-whisker plots shown below summarise some 1972 data concerning adult literacy.
The leaves are single digit and the range of achievement reached in the field of literacy ranges from
2% to 100%.
0 2 3 3 3 5 5 5 5 8 8 8 8 8 8 8 8
1 0 0 0 0 0 0 0 3 3 5 5 5 8
2 0 0 0 0 0 3 3 3 5 8
3 0 0 0 2 3 5 5 5 5 5 5 8
4 1 1 3 3 5 5 6 7 9
5 0 0
6 0 0 0 1 1 2 5 5 5 5 7 8
7 0 1 1 2 2 2 3 5 5 5 6 7
8 0 0 0 1 2 4 4 5 5 6 6 7
9 0 0 1 2 5 6 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9
10 0 0 0 0 0 0
Figure 15
The virtual lack of data between 50 and 60 indicates a gap and suggests that we are in fact dealing
with two separate distributions which have the following properties:
1. 2% - 50% literacy having right skew
2. 60% - 100% literacy having left skew.
Notice that the term skewness refers to the tail of a distribution.
The usual summary statistics that you might be tempted to calculate are:
x̄ = 54 and sn−1 = 34 or median = 60 and mid-spread = 65
In this case, neither set of statistics is of much use since neither set indicates the gap or the skewness.
Without a visual representation, a single-peaked distribution tends to be assumed, which is, of course,
the opposite of the truth in this case.
The stem-and-leaf plot is more informative than the box-and-whisker plot since it shows the gap.
In practice we would work with the two constituent distributions and attempt to relate the results in
a practical way.
Final comments on data representations
1. You should not rely on summary statistics such as the mean and standard deviation or median
and mid-spread alone to represent a data set. Remember that if a distribution has outliers,
gaps, skewness or multiple peaks, then shape is probably more important than location and
spread.
2. The shape of a distribution is better shown visually than numerically. Remember that a stem-
and-leaf diagram retains the data and arranges the data in rank order and that a box-and-
whisker plot emphasises the detail contained in the tails of a distribution.
Exercises
1. The following data give the lifetimes in hours of 50 electric lamps.
1337 1437 1214 1300 1124 1065 1470 1488 1103 978
1177 1289 1045 947 969 1339 1594 812 1277 1032
1167 974 1131 974 1727 1378 1385 1330 1672 1604
1493 1521 1235 1682 1136 1229 803 1166 1494 1733
978 1110 1055 1438 1436 1424 766 1283 829 1652
(a) Represent the data using a stem-and-leaf diagram with two-digit leaves.
(b) Calculate the mean lifetime from these data.
(c) Does the mean lifetime give a good indication of the expected lifetime of a lamp?
2. During the winter of 1893/94 Lord Rayleigh conducted an investigation into the density of
nitrogen gas taken from various sources. He had previously found discrepancies between the
density of nitrogen obtained by chemical decomposition and nitrogen obtained by removing
oxygen from air. Lord Rayleigh’s investigations led to the discovery of argon. The raw data
obtained during his investigations are given below.
Date       Source   Weight     Date       Source    Weight
29/11/93   NO       2.30143    26/12/93   N2O       2.29889
05/12/93   NO       2.29816    28/12/93   N2O       2.29940
06/12/93   NO       2.30182    09/01/94   NH4NO2    2.29849
08/12/93   NO       2.29890    13/01/94   NH4NO2    2.29889
12/12/93   Air      2.31017    29/01/94   Air       2.31024
14/12/93   Air      2.30986    30/01/94   Air       2.31030
19/12/93   Air      2.31010    01/02/94   Air       2.31028
22/12/93   Air      2.31001
(a) Organise the data into a frequency table using the classes 2.29-2.30, 2.30-2.31, 2.31-2.32.
Draw the histogram representing the data and comment on any unusual features that you
may see.
(b) Classify the data according to the two sources ‘Air’ and ‘Other’ . Order each data set
and hence find the median, the hinges and the mid-spreads for each data set. Plot
box-and-whisker diagrams for the data on a diagram similar to the one shown below.
[Blank axes for the box-and-whisker diagrams: Weight (vertical scale from 2.295 to 2.320) against Source (Air, Other)]
Comment on any unusual features that you see. What do the box-and-whisker plots tell you
about the nitrogen obtained from the two sources?
3. Answer the following questions:
(a) Is the variance measured in the same units as the mean?
(b) Is the mean measured in the same units as the median?
(c) Is the standard deviation measured in the same units as the mode?
(d) Is the mode measured in the same units as the mid-spread?
(e) Is the high-spread measured in the same units as the low-spread?
(f) Is the mid-spread measured in the same units as the hinges?
Answers
1. (a) Stem and leaf diagram (2 digit leaves – tens and units).
7 66
8 03,12,29
9 47,69,74,74,78,78
10 32,45,55,65
11 03,10,24,31,36,66,67,77
12 14,29,35,77,83,89
13 00,30,37,39,78,85
14 24,36,37,38,70,88,93,94
15 21,94
16 04,52,72,82
17 27,33
(b) The sum of the lifetimes is $\sum x = 62802$. So the mean is
\[ \frac{62802}{50} = 1256.04. \]
(c) Yes. The mean lifetime gives a reasonable indication of what can be expected since the
distribution is fairly symmetrical. However it does not, of course, give any indication
of the spread.
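The stem-and-leaf diagram and the mean in Answer 1 can be reproduced with the following Python sketch. The variable names are our own; integer division by 100 gives the stem and the remainder gives the two-digit leaf.

```python
from collections import defaultdict

lifetimes = [
    1337, 1437, 1214, 1300, 1124, 1065, 1470, 1488, 1103, 978,
    1177, 1289, 1045, 947, 969, 1339, 1594, 812, 1277, 1032,
    1167, 974, 1131, 974, 1727, 1378, 1385, 1330, 1672, 1604,
    1493, 1521, 1235, 1682, 1136, 1229, 803, 1166, 1494, 1733,
    978, 1110, 1055, 1438, 1436, 1424, 766, 1283, 829, 1652,
]

# Build the diagram: hundreds form the stem, tens and units form the leaf.
stems = defaultdict(list)
for value in sorted(lifetimes):
    stems[value // 100].append(value % 100)

for stem in sorted(stems):
    print(stem, ",".join(f"{leaf:02d}" for leaf in stems[stem]))

total = sum(lifetimes)
mean_lifetime = total / len(lifetimes)
print(total, mean_lifetime)
```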
2. (a)
[Histogram titled ‘Lord Rayleigh’s Results’: frequency (scale 0 to 10) against the classes 2.29 - 2.30, 2.30 - 2.31 and 2.31 - 2.32]
The lowest class is obtained entirely from non-air sources, the highest class is obtained
entirely from air.
(b)
[Box-and-whisker diagrams of Weight (vertical scale from 2.295 to 2.320) for the two sources Air and Other]
Comment. The box-and-whisker plots tell us that some other element is present in Air which
is responsible for the additional weight. This additional element subsequently proved to
be the inert gas argon.
3. (a) No (b) Yes (c) Yes (d) Yes (e) Yes (f) Yes
Contents 37
Discrete Probability Distributions
37.1 Discrete Probability Distributions
37.2 The Binomial Distribution
37.3 The Poisson Distribution
37.4 The Hypergeometric Distribution
Learning outcomes
In this Workbook you will learn what a discrete random variable is. You will find how to
calculate the expectation and variance of a discrete random variable. You will then
examine two of the most important examples of discrete random variables: the binomial
distribution and Poisson distribution.
The Poisson distribution can be deduced from the binomial distribution and is often
used as a way of finding good approximations to the binomial probabilities. The binomial
is a finite discrete random variable whereas the Poisson distribution has an infinite
number of possibilities.
Finally you will learn about another important distribution - the hypergeometric.

Discrete Probability
Distributions 37.1
Introduction
It is often possible to model real systems by using the same or similar random experiments and
their associated random variables. Numerical random variables may be classified in two broad but
distinct categories called discrete random variables and continuous random variables. Often, discrete
random variables are associated with counting while continuous random variables are associated with
measuring. In Workbook 42 you will meet contingency tables and deal with non-numerical random
variables. Generally speaking, discrete random variables can take values which are separate and can
be listed. Strictly speaking, the real situation is a little more complex but it is sufficient for our
purposes to equate the word discrete with a finite list. In contrast, continuous random variables
can take values anywhere within a specified range. This Section will familiarize you with the idea
of a discrete random variable and the associated probability distributions. The Workbook makes
no attempt to cover the whole of this large and important branch of statistics but concentrates on
the discrete distributions most commonly met in engineering. These are the binomial, Poisson and
hypergeometric distributions.

Prerequisites
Before starting this Section you should . . .
• understand the concepts of probability
Learning Outcomes
On completion you should be able to . . .
• explain what is meant by the term discrete random variable
• explain what is meant by the term discrete probability distribution
• use some of the discrete probability distributions which are important to engineers
1. Discrete probability distributions

We shall look at discrete distributions in this Workbook and continuous distributions in Workbook 38.
In order to get a good understanding of discrete distributions it is advisable to familiarise yourself
with two related topics: permutations and combinations. Essentially we shall be using this area of
mathematics as a calculating device which will enable us to deal sensibly with situations where choice
leads to the use of very large numbers of possibilities. We shall use combinations to express and
manipulate these numbers in a compact and efficient way.
Permutations and Combinations

You may recall from Section 35.2, concerned with probability, that if we define the probability that an
event A occurs by using the definition:
\[ P(A) = \frac{\text{The number of equally likely experimental outcomes favourable to } A}{\text{The total number of equally likely outcomes forming the sample space}} = \frac{a}{n} \]
then we can only find P(A) provided that we can find both a and n. In practice, these numbers
can be very large and difficult if not impossible to find by a simple counting process. Permutations
and combinations help us to calculate probabilities in cases where counting is simply not a realistic
possibility.
Before discussing permutations, we will look briefly at the idea and notation of a factorial.
Factorials
The factorial of an integer n, commonly called ‘factorial n’ and written n!, is defined as follows:
n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1    n ≥ 1
Simple examples are:
3! = 3×2×1 = 6    5! = 5×4×3×2×1 = 120    8! = 8×7×6×5×4×3×2×1 = 40320
As you can see, factorial notation enables us to express large numbers in a very compact format. You
will see that this characteristic is very useful when we discuss the topic of permutations. A further
point is that the definition above falls down when n = 0 and we define
0! = 1
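The definition, including the special case 0! = 1, can be illustrated by a small Python function. The function name is our own; Python's standard library already provides `math.factorial`, which is used here only as a check.

```python
import math

def factorial(n):
    """Return n! for a non-negative integer n, with 0! defined to be 1."""
    result = 1
    for k in range(2, n + 1):  # the loop body never runs for n = 0 or n = 1
        result *= k
    return result

for n in (0, 3, 5, 8):
    print(n, factorial(n), factorial(n) == math.factorial(n))
```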
Permutations
A permutation of a set of distinct objects places the objects in order. For example the set of three
numbers {1, 2, 3} can be placed in the following orders:
1,2,3 1,3,2 2,1,3 2,3,1 3,2,1 3,1,2
Note that we can choose the first item in 3 ways, the second in 2 ways and the third in 1 way. This
gives us 3×2×1 = 3! = 6 distinct orders. We say that the set {1, 2, 3} has the distinct permutations
1,2,3 1,3,2 2,1,3 2,3,1 3,2,1 3,1,2
Example 1
Write out the possible permutations of the letters A, B, C and D.
Solution
The possible permutations are
ABCD ABDC ADBC ADCB ACBD ACDB
BADC BACD BCDA BCAD BDAC BDCA
CABD CADB CDBA CDAB CBAD CBDA
DABC DACB DCAB DCBA DBAC DBCA
There are 4! = 24 permutations of the four letters A, B, C and D.
In general we can order n distinct objects in n! ways.
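The list in Example 1 can be generated rather than written out by hand. The sketch below uses `itertools.permutations` from the Python standard library; the variable names are our own.

```python
from itertools import permutations
from math import factorial

letters = "ABCD"

# Each permutation is produced as a tuple of letters; join them into strings.
orderings = ["".join(p) for p in permutations(letters)]

print(orderings)
print(len(orderings), factorial(len(letters)))
```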

Suppose we have r different types of object. It follows that if we have $n_1$ objects of one kind, $n_2$ of
another kind and so on then the $n_1$ objects can be ordered in $n_1!$ ways, the $n_2$ objects in $n_2!$ ways
and so on. If $n_1 + n_2 + \cdots + n_r = n$ and if p is the number of permutations possible from n objects
we may write
\[ p \times (n_1! \times n_2! \times \cdots \times n_r!) = n! \]
and so p is given by the formula
\[ p = \frac{n!}{n_1! \times n_2! \times \cdots \times n_r!} \]
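As an illustration of this formula, consider the five letters A, A, B, B, B (our own example, not one from the Workbook). The formula gives 5!/(2! × 3!) = 10 distinguishable orderings, and a brute-force count agrees. The function name `arrangements` is hypothetical.

```python
from itertools import permutations
from math import factorial

def arrangements(*counts):
    """Return n! / (n1! x n2! x ... x nr!) where n = n1 + n2 + ... + nr."""
    total = factorial(sum(counts))
    for count in counts:
        total //= factorial(count)
    return total

# Brute force: generate all 5! orderings and keep only the distinct ones.
distinct = set(permutations("AABBB"))

print(arrangements(2, 3), len(distinct))
```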
Very often we will find it useful to be able to calculate the number of permutations of n objects
taken r at a time. Assuming that we do not allow repetitions, we may choose the first object in n
ways, the second in n − 1 ways, the third in n − 2 ways and so on so that the rth object may be
chosen in n − r + 1 ways.
Example 2
Find the number of permutations of the four letters A, B, C and D taken three
at a time.
Solution
We may choose the first letter in 4 ways, either A, B, C or D. Suppose, for the purposes of
illustration we choose A. We may choose the second letter in 3 ways, either B, C or D. Suppose,
for the purposes of illustration we choose B. We may choose the third letter in 2 ways, either C
or D. Suppose, for the purposes of illustration we choose C. The total number of choices made is
4 × 3 × 2 = 24.
In general the number of permutations of n objects taken r at a time is
\[ n(n-1)(n-2)\ldots(n-r+1) \quad \text{which is the same as} \quad \frac{n!}{(n-r)!} \]
This is usually denoted by ${}^nP_r$ so that
\[ {}^nP_r = \frac{n!}{(n-r)!} \]
If we allow repetitions the number of permutations becomes $n^r$ (can you see why?).
Example 3
Find the number of permutations of the four letters A, B, C and D taken two at
a time.
Solution
We may choose the first letter in 4 ways and the second letter in 3 ways giving us
\[ 4 \times 3 = \frac{4 \times 3 \times 2 \times 1}{1 \times 2} = \frac{4!}{2!} = 12 \text{ permutations} \]
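Examples 2 and 3 can be checked in Python. The sketch below assumes Python 3.8 or later, where `math.perm(n, r)` returns n!/(n − r)!; the variable names are our own.

```python
from itertools import permutations
from math import factorial, perm

three_at_a_time = perm(4, 3)   # Example 2
two_at_a_time = perm(4, 2)     # Example 3

# The same count obtained by listing the ordered pairs explicitly.
pairs = ["".join(p) for p in permutations("ABCD", 2)]

# With repetition allowed each of the r places can be filled in n ways.
with_repetition = 4 ** 2

print(three_at_a_time, two_at_a_time, len(pairs), with_repetition)
print(two_at_a_time == factorial(4) // factorial(2))
```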
Combinations
A combination of objects takes no account of order whereas a permutation does. The formula
${}^nP_r = \dfrac{n!}{(n-r)!}$ gives us the number of ordered sets of r objects chosen from n. Suppose the number
of sets of r objects (taken from n objects) in which order is not taken into account is C. It follows
that
\[ C \times r! = \frac{n!}{(n-r)!} \quad \text{and so C is given by the formula} \quad C = \frac{n!}{r!(n-r)!} \]
We normally denote the right-hand side of this expression by ${}^nC_r$ so that
\[ {}^nC_r = \frac{n!}{r!(n-r)!} \]
A common alternative notation for ${}^nC_r$ is $\dbinom{n}{r}$.
Example 4
How many car registrations are there beginning with N P 05 followed by three
letters? Note that, conventionally, I, O and Q may not be chosen.
Solution
We have to choose 3 letters from 23 allowing repetition. Hence the number of registrations beginning
with N P 05 must be $23^3 = 12167$.
Task
(a) How many different signals consisting of five symbols can be sent using the
dot and dash of Morse code?
(b) How many can be sent if five symbols or less can be sent?
Your solution
Answer
(a) Clearly, the order of the symbols is important. We can choose each symbol in two ways, either
a dot or a dash. The number of distinct signals is
2 × 2 × 2 × 2 × 2 = $2^5$ = 32
(b) If five or less symbols may be used, the total number of signals may be calculated as follows:
Using one symbol: 2 ways
Using two symbols: 2 × 2 = 4 ways
Using three symbols: 2 × 2 × 2 = 8 ways
Using four symbols: 2 × 2 × 2 × 2 = 16 ways
Using five symbols: 2 × 2 × 2 × 2 × 2 = 32 ways
The total number of signals which may be sent is 62.
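Both this Task and Example 4 count ordered selections in which repetition is allowed. A short Python sketch, with names of our own choosing, lists the Morse signals using `itertools.product` and confirms the totals.

```python
from itertools import product

def signals(length):
    """Return every dot/dash signal with the given number of symbols."""
    return ["".join(s) for s in product(".-", repeat=length)]

five_symbols = signals(5)
up_to_five = [s for length in range(1, 6) for s in signals(length)]

# Example 4: three letters chosen from 23, repetition allowed.
registrations = 23 ** 3

print(len(five_symbols), len(up_to_five), registrations)
```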
Task
A box contains 50 resistors of which 20 are deemed to be ‘very high quality’ , 20
‘high quality’ and 10 ‘standard’. In how many ways can a batch of 5 resistors be
chosen if it is to contain 2 ‘very high quality’, 2 ‘high quality’ and 1 ‘standard’
resistor?
Your solution
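Answer
The order in which the resistors are chosen does not matter so that the number of ways
in which the batch of 5 can be chosen is:
\[ {}^{20}C_2 \times {}^{20}C_2 \times {}^{10}C_1 = \frac{20!}{18! \times 2!} \times \frac{20!}{18! \times 2!} \times \frac{10!}{9! \times 1!} = \frac{20 \times 19}{1 \times 2} \times \frac{20 \times 19}{1 \times 2} \times \frac{10}{1} = 361000 \]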
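The resistor calculation can be checked with `math.comb` (available from Python 3.8), which returns n!/(r!(n − r)!). The variable names are our own.

```python
from math import comb

very_high = comb(20, 2)   # 2 chosen from the 20 'very high quality' resistors
high = comb(20, 2)        # 2 chosen from the 20 'high quality' resistors
standard = comb(10, 1)    # 1 chosen from the 10 'standard' resistors

# The three choices are made independently, so the counts multiply.
batches = very_high * high * standard
print(very_high, high, standard, batches)
```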
2. Random variables
A random variable X is a quantity whose value cannot be predicted with certainty. We assume that
for every real number a the probability P(X = a) in a trial is well-defined. In practice, engineers are
often concerned with two broad types of variables and their probability distributions: discrete random
variables and their distributions, and continuous random variables and their distributions. Discrete
distributions arise from experiments involving counting, for example, road deaths, car production
and aircraft sales, while continuous distributions arise from experiments involving measurement, for
example, voltage, corrosion and oil pressure.
Discrete random variables and probability distributions

A random variable X and its distribution are said to be discrete if the values of X can be presented
as an ordered list say x1 , x2 , x3 , . . . with probability values p1 , p2 , p3 , . . . . That is P(X = xi ) = pi .
For example, the number of times a particular machine fails during the course of one calendar year
is a discrete random variable.
More generally a discrete distribution f(x) may be defined by:
\[ f(x) = \begin{cases} p_i & \text{if } x = x_i, \quad i = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \]
The distribution function F(x) (sometimes called the cumulative distribution function) is obtained
by taking sums as defined by
\[ F(x) = \sum_{x_i \le x} f(x_i) = \sum_{x_i \le x} p_i \]
We sum the probabilities pi for which xi is less than or equal to x. This gives a step function with
jumps of size pi at each value xi of X. The step function is defined for all values, not just the values
xi of X.
Key Point 1
Probability Distribution of a Discrete Random Variable
Let X be a random variable associated with an experiment. Let the values of X be denoted by
x1 , x2 , . . . , xn and let P(X = xi ) be the probability that xi occurs. We have two necessary conditions
for a valid probability distribution:
• $P(X = x_i) \ge 0$ for all $x_i$
• $\sum_{i=1}^{n} P(X = x_i) = 1$
Note that n may be infinite (the list of values need not terminate).

(These two statements are sufficient to guarantee that $P(X = x_i) \le 1$ for all $x_i$.)
Example 5
Turbo Generators plc manufacture seven large turbines for a customer. Three of
these turbines do not meet the customer’s specification. Quality control inspectors
choose two turbines at random. Let the discrete random variable X be defined to
be the number of turbines inspected which meet the customer’s specification.
(a) Find the probabilities that X takes the values 0, 1 or 2.
(b) Find and graph the cumulative distribution function.
Solution
(a) The possible values of X are clearly 0, 1 or 2 and may occur as follows:
Sample Space Value of X
Turbine faulty, Turbine faulty 0
Turbine faulty, Turbine good 1
Turbine good, Turbine faulty 1
Turbine good, Turbine good 2
We can easily calculate the probability that X takes the values 0, 1 or 2 as follows:
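\[ P(X = 0) = \frac{3}{7} \times \frac{2}{6} = \frac{1}{7} \qquad P(X = 1) = \frac{4}{7} \times \frac{3}{6} + \frac{3}{7} \times \frac{4}{6} = \frac{4}{7} \qquad P(X = 2) = \frac{4}{7} \times \frac{3}{6} = \frac{2}{7} \]
The values of $F(x) = \sum_{x_i \le x} P(X = x_i)$ are clearly
\[ F(0) = \frac{1}{7}, \quad F(1) = \frac{5}{7} \quad \text{and} \quad F(2) = \frac{7}{7} = 1 \]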
(b) The graph of the step function F (x) is shown below.
[Figure 1: graph of the step function F(x) against x, with x marked at 0, 1 and 2 and F(x) marked at 1/7 and 5/7]
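Example 5 can also be checked by brute force. The sketch below labels the seven turbines, lists all 21 equally likely pairs that the inspectors could choose, and builds the probability distribution and the step function F(x) with exact fractions. All names are our own.

```python
from fractions import Fraction
from itertools import combinations

# Four turbines meet the specification and three do not.
turbines = ["good"] * 4 + ["faulty"] * 3
pairs = list(combinations(range(len(turbines)), 2))

pmf = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
for i, j in pairs:
    x = (turbines[i] == "good") + (turbines[j] == "good")  # number of good turbines
    pmf[x] += Fraction(1, len(pairs))

def F(x):
    """Cumulative distribution function: sum of P(X = xi) over xi <= x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

print(pmf)
print(F(0), F(1), F(1.5), F(2))
```

Note that F is defined for every real x, not only for x = 0, 1, 2, which is what makes its graph a step function.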
3. Mean and variance of a discrete probability distribution

If an experiment is performed N times in which the n possible outcomes X = x1 , x2 , x3 , . . . , xn are
observed with frequencies f1 , f2 , f3 , . . . , fn respectively, we know that the mean of the distribution
of outcomes is given by
\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \ldots + f_n x_n}{f_1 + f_2 + \ldots + f_n} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{1}{N} \sum_{i=1}^{n} f_i x_i = \sum_{i=1}^{n} \frac{f_i}{N} x_i \]
(Note that $\sum_{i=1}^{n} f_i = f_1 + f_2 + \cdots + f_n = N$.)
The quantity $\dfrac{f_i}{N}$ is called the relative frequency of the observation $x_i$. Relative frequencies may be
thought of as akin to probabilities; informally we would say that the chance of observing the outcome
$x_i$ is $\dfrac{f_i}{N}$. Formally, we consider what happens as the number of experiments becomes very large. In
order to give meaning to the quantity $\dfrac{f_i}{N}$ we consider the limit (if it exists) of the quantity $\dfrac{f_i}{N}$ as
$N \to \infty$. Essentially, we define the probability $p_i$ as
\[ p_i = \lim_{N \to \infty} \frac{f_i}{N} \]
Replacing $\dfrac{f_i}{N}$ with the probability $p_i$ leads to the following definition of the mean or expectation of
the discrete random variable X.
Key Point 2
The Expectation of a Discrete Random Variable

Let X be a random variable with values x1 , x2 , . . . , xn . Let the probability that X takes the value
xi (i.e. P(X = xi )) be denoted by pi . The mean or expected value or expectation of X, which
is written E(X) is defined as:
\[ E(X) = \sum_{i=1}^{n} x_i P(X = x_i) = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n \]
The symbol µ is sometimes used to denote E(X).
The expectation E(X) of X is the value of X which we expect on average. In a similar way we can
write down the expected value of the function g(X) as E[g(X)], the value of g(X) we expect on
average. We have
\[ E[g(X)] = \sum_{i=1}^{n} g(x_i) f(x_i) \]
In particular if $g(X) = X^2$, we obtain $E[X^2] = \sum_{i=1}^{n} x_i^2 f(x_i)$.
The variance is usually written as $\sigma^2$. For a frequency distribution it is:
\[ \sigma^2 = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \mu)^2 \quad \text{where } \mu \text{ is the mean value} \]
and can be expanded and ‘simplified’ to appear as:
\[ \sigma^2 = \frac{1}{N} \sum_{i=1}^{n} f_i x_i^2 - \mu^2 \]
This is often quoted in words:
The variance is equal to the mean of the squares minus the square of the mean.
We now extend the concept of variance to a random variable.
Key Point 3
The Variance of a Discrete Random Variable
Let X be a random variable with values x1 , x2 , . . . , xn . The variance of X, which is written V(X)
is defined by
\[ V(X) = \sum_{i=1}^{n} p_i (x_i - \mu)^2 \]
where $\mu \equiv E(X)$. We note that V(X) can be written in the alternative form
\[ V(X) = E(X^2) - [E(X)]^2 \]

The standard deviation $\sigma$ of a random variable is $\sqrt{V(X)}$.
Example 6
A traffic engineer is interested in the number of vehicles reaching a particular
crossroads during periods of relatively low traffic flow. The engineer finds that
the number of vehicles X reaching the crossroads per minute is governed by the
probability distribution:
x 0 1 2 3 4
P(X = x) 0.37 0.39 0.19 0.04 0.01
(a) Calculate the expected value, the variance and the standard deviation of the
random variable X.
(b) Graph the probability distribution P(X = x) and the corresponding cumulative
probability distribution $F(x) = \sum_{x_i \le x} P(X = x_i)$.
Solution
(a) The expectation, variance and standard deviation and cumulative probability values are calculated
as follows:
x x2 P(X = x) F (x)
0 0 0.37 0.37
1 1 0.39 0.76
2 4 0.19 0.95
3 9 0.04 0.99
4 16 0.01 1.00
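\begin{align*}
E(X) &= \sum_{x=0}^{4} x P(X = x) \\
     &= 0 \times 0.37 + 1 \times 0.39 + 2 \times 0.19 + 3 \times 0.04 + 4 \times 0.01 \\
     &= 0.93 \\
V(X) &= E(X^2) - [E(X)]^2 \\
     &= \sum_{x=0}^{4} x^2 P(X = x) - \left[ \sum_{x=0}^{4} x P(X = x) \right]^2 \\
     &= 0 \times 0.37 + 1 \times 0.39 + 4 \times 0.19 + 9 \times 0.04 + 16 \times 0.01 - (0.93)^2 \\
     &= 0.8051
\end{align*}
The standard deviation is given by $\sigma = \sqrt{V(X)} = 0.8973$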
(b)
[Figure 2: graphs of the cumulative distribution F(x) and of the probability distribution P(X = x), each plotted against x = 0, 1, 2, 3, 4 with a vertical scale from 0 to 1.0]
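The figures in Example 6 can be reproduced with a few lines of Python that apply Key Points 2 and 3 directly. The variable names are our own.

```python
from itertools import accumulate
from math import sqrt

# Probability distribution of the number of vehicles per minute.
pmf = {0: 0.37, 1: 0.39, 2: 0.19, 3: 0.04, 4: 0.01}

expectation = sum(x * p for x, p in pmf.items())        # E(X)
mean_square = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2)
variance = mean_square - expectation ** 2               # V(X) = E(X^2) - [E(X)]^2
sigma = sqrt(variance)

# Running totals of the probabilities give the cumulative distribution F(x).
cumulative = list(accumulate(pmf.values()))

print(round(expectation, 2), round(variance, 4), round(sigma, 4))
print([round(value, 2) for value in cumulative])
```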
Task
Find the expectation, variance and standard deviation of the number of Heads in
the three-coin toss experiment.
Your solution
Answer
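\[ E(X) = \frac{1}{8} \times 0 + \frac{3}{8} \times 1 + \frac{3}{8} \times 2 + \frac{1}{8} \times 3 = \frac{12}{8} \]
\[ \sum p_i x_i^2 = \frac{1}{8} \times 0^2 + \frac{3}{8} \times 1^2 + \frac{3}{8} \times 2^2 + \frac{1}{8} \times 3^2 = \frac{1}{8} \times 0 + \frac{3}{8} \times 1 + \frac{3}{8} \times 4 + \frac{1}{8} \times 9 = 3 \]
\[ V(X) = 3 - 2.25 = 0.75 = \frac{3}{4} \]
\[ \sigma = \frac{\sqrt{3}}{2} \]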
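The probabilities 1/8, 3/8, 3/8, 1/8 used in this answer can be obtained by listing the eight equally likely outcomes of three coin tosses. The Python sketch below does this with exact fractions; the names are our own, and the coins are assumed to be fair.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))   # HHH, HHT, ..., TTT

# Count heads in each outcome and accumulate the probability 1/8 per outcome.
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, Fraction(0)) + Fraction(1, len(outcomes))

expectation = sum(x * p for x, p in pmf.items())
variance = sum(x ** 2 * p for x, p in pmf.items()) - expectation ** 2

print(pmf)
print(expectation, variance)   # the standard deviation is the square root of the variance
```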
Exercises
1. A machine is operated by two workers. There are sixteen workers available. How many possible
teams of two workers are there?
2. A factory has 52 machines. Two of these have been given an experimental modification. In the
first week after this modification, problems are reported with thirteen of the machines. What
is the probability that both of the modified machines are among the thirteen with problems
assuming that all machines are equally likely to give problems?
3. A factory has 52 machines. Four of these have been given an experimental modification. In
the first week after this modification, problems are reported with thirteen of the machines.
What is the probability that exactly two of the modified machines are among the thirteen with
problems assuming that all machines are equally likely to give problems?
4. A random number generator produces sequences of independent digits, each of which is as

likely to be any digit from 0 to 9 as any other. If X denotes any single digit, find E(X).
5. A hand-held calculator has a clock cycle time of 100 nanoseconds; these are positions numbered
0, 1, . . . , 99. Assume a flag is set during a particular cycle at a random position. Thus, if X is
the position number at which the flag is set,
\[ P(X = k) = \frac{1}{100}, \qquad k = 0, 1, 2, \ldots, 99. \]
Evaluate the average position number E(X), and σ, the standard deviation.
(Hint: The sum of the first k integers is k(k + 1)/2 and the sum of their squares is
k(k + 1)(2k + 1)/6.)
6. Concentric circles of radii 1 cm and 3 cm are drawn on a circular target radius 5 cm. A darts
player receives 10, 5 or 3 points for hitting the target inside the smaller circle, middle annular
region and outer annular region respectively. The player has only a 50-50 chance of hitting the
target at all but if he does hit it he is just as likely to hit any one point on it as any other. If
X = ‘number of points scored on a single throw of a dart’ calculate the expected value of X.
Answers
1. The required number is
\[ \binom{16}{2} = \frac{16 \times 15}{2 \times 1} = 120. \]
2. There are $\dbinom{52}{13}$ possible different selections of 13 machines and all are equally likely. There is only
$\dbinom{2}{2} = 1$ way to pick two machines from those which were modified but there are $\dbinom{50}{11}$
different choices for the 11 other machines with problems so this is the number of possible
selections containing the 2 modified machines.
Hence the required probability is
\[ \frac{\binom{2}{2}\binom{50}{11}}{\binom{52}{13}} = \frac{\binom{50}{11}}{\binom{52}{13}} = \frac{50!/(11!\,39!)}{52!/(13!\,39!)} = \frac{50!\,13!}{52!\,11!} = \frac{13 \times 12}{52 \times 51} \approx 0.0588 \]
Alternatively, let S be the event “first modified machine is in the group of 13” and C be the
event “second modified machine is in the group of 13”. Then the required probability is
\[ P(S) \times P(C \mid S) = \frac{13}{52} \times \frac{12}{51}. \]

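3. There are $\dbinom{52}{13}$ different selections of 13, $\dbinom{4}{2}$ different choices of two modified machines
and $\dbinom{48}{11}$ different choices of 11 non-modified machines.
Thus the required probability is
\[ \frac{\binom{4}{2}\binom{48}{11}}{\binom{52}{13}} = \frac{(4!/2!\,2!)(48!/11!\,37!)}{(52!/13!\,39!)} = \frac{4!\,48!\,13!\,39!}{52!\,2!\,2!\,11!\,37!} = \frac{4 \times 3 \times 13 \times 12 \times 39 \times 38}{52 \times 51 \times 50 \times 49 \times 2} \approx 0.2135 \]
Alternatively, let I(i) be the event “modified machine i is in the group of 13” and O(i)
be the negation of this, for i = 1, 2, 3, 4. The number of choices of two modified machines is $\dbinom{4}{2}$
so the required probability is
\begin{align*}
& \binom{4}{2} P\{I(1)\} \times P\{I(2) \mid I(1)\} \times P\{O(3) \mid I(1), I(2)\} \times P\{O(4) \mid I(1), I(2), O(3)\} \\
&= \binom{4}{2} \times \frac{13}{52} \times \frac{12}{51} \times \frac{39}{50} \times \frac{38}{49} \\
&= \frac{4 \times 3 \times 13 \times 12 \times 39 \times 38}{52 \times 51 \times 50 \times 49 \times 2}
\end{align*}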
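The probabilities in Answers 2 and 3 can be confirmed with exact arithmetic in Python, using `math.comb` (Python 3.8 or later) for the binomial coefficients. The variable names are our own.

```python
from fractions import Fraction
from math import comb

selections = comb(52, 13)   # ways of choosing the 13 machines with problems

# Answer 2: both of the 2 modified machines are among the 13.
p_both = Fraction(comb(2, 2) * comb(50, 11), selections)

# Answer 3: exactly 2 of the 4 modified machines are among the 13.
p_exactly_two = Fraction(comb(4, 2) * comb(48, 11), selections)

print(p_both, round(float(p_both), 4))
print(p_exactly_two, round(float(p_exactly_two), 4))
```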
4.
x          0     1     2     3     4     5     6     7     8     9
P(X = x)   1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10
\[ E(X) = \frac{1}{10} \{0 + 1 + 2 + 3 + \ldots + 9\} = 4.5 \]
5. Same as Q.4 but with 100 positions
\[ E(X) = \frac{1}{100} \{0 + 1 + 2 + 3 + \ldots + 99\} = \frac{1}{100} \left[ \frac{99(99 + 1)}{2} \right] = 49.5 \]
$\sigma^2$ = mean of squares − square of mean
\[ \therefore \ \sigma^2 = \frac{1}{100} [1^2 + 2^2 + \ldots + 99^2] - (49.5)^2 = \frac{1}{100} \left[ \frac{99(100)(199)}{6} \right] - 49.5^2 = 833.25 \]
so the standard deviation is $\sigma = \sqrt{833.25} = 28.87$
6. X can take 4 values 0, 3, 5 or 10
P(X = 0) = 0.5 [only 50/50 chance of hitting target]
The probability that a particular points score is obtained is related to the areas of the annular
regions which are, from the centre: π, (9π − π) = 8π, (25π − 9π) = 16π
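\begin{align*}
P(X = 3) &= P[(3 \text{ is scored}) \cap (\text{target is hit})] \\
         &= P(3 \text{ is scored} \mid \text{target is hit}) \times P(\text{target is hit}) = \frac{16\pi}{25\pi} \cdot \frac{1}{2} = \frac{16}{50} \\
P(X = 5) &= P(5 \text{ is scored} \mid \text{target is hit}) \times P(\text{target is hit}) = \frac{8\pi}{25\pi} \cdot \frac{1}{2} = \frac{8}{50} \\
P(X = 10) &= P(10 \text{ is scored} \mid \text{target is hit}) \times P(\text{target is hit}) = \frac{\pi}{25\pi} \cdot \frac{1}{2} = \frac{1}{50}
\end{align*}
x          0      3      5      10
P(X = x)   25/50  16/50  8/50   1/50
\[ \therefore \ E(X) = \frac{48 + 40 + 10}{50} = 1.96. \]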
The Binomial
Distribution 37.2
Introduction
A situation in which an experiment (or trial) is repeated a fixed number of times can be modelled,
under certain assumptions, by the binomial distribution. Within each trial we focus attention on
a particular outcome. If the outcome occurs we label this as a success. The binomial distribution
allows us to calculate the probability of observing a certain number of successes in a given number
of trials.
You should note that the terms ‘success’ and (by implication) ‘failure’ are simply labels and as such
might be misleading. For example counting the number of defective items produced by a machine
might be thought of as counting successes if you are looking for defective items! Trials with two
possible outcomes are often used as the building blocks of random experiments and can be useful to
engineers. Two examples are:
1. A particular mobile phone link is known to transmit 6% of ‘bits’ of information in error. As an
engineer you might need to know the probability that two bits out of the next ten transmitted
are in error.
2. A machine is known to produce, on average, 2% defective components. As an engineer you
might need to know the probability that 3 items are defective in the next 20 produced.
The binomial distribution will help you to answer such questions.

Prerequisites
• understand the concepts of probability

Learning Outcomes
On completion you should be able to . . .
• recognise and use the formula for binomial probabilities
• state the assumptions on which the binomial model is based
1. The binomial model
We have introduced random variables from a general perspective and have seen that there are two
basic types: discrete and continuous. We examine four particular examples of distributions for
random variables which occur often in practice and have been given special names. They are the
binomial distribution, the Poisson distribution, the Hypergeometric distribution and the Normal
distribution. The first three are distributions for discrete random variables and the fourth is for a
continuous random variable. In this Section we focus attention on the binomial distribution.
The binomial distribution can be used in situations in which a given experiment (often referred to,
in this context, as a trial) is repeated a number of times. For the binomial model to be applied the
following four criteria must be satisfied:
• the trial is carried out a fixed number of times n
• the outcomes of each trial can be classified into two ‘types’ conventionally named ‘success’ or
‘failure’
• the probability p of success remains constant for each trial
• the individual trials are independent of each other.

For example, if we consider throwing a coin 7 times what is the probability that exactly 4 Heads
occur? This problem can be modelled by the binomial distribution since the four basic criteria are
assumed satisfied as we see.
• here the trial is ‘throwing a coin’ which is carried out 7 times
• the occurrence of Heads on any given trial (i.e. throw) may be called a ‘success’ and Tails
called a ‘failure’
• the probability of success is $p = \frac{1}{2}$ and remains constant for each trial
• each throw of the coin is independent of the others.

The reader will be able to complete the solution to this example once we have constructed the general
binomial model.
The following two scenarios are typical of those met by engineers. The reader should check that the
criteria stated above are met by each scenario.
1. An electronic product has a total of 30 integrated circuits built into it. The product is capable
of operating successfully only if at least 27 of the circuits operate properly. What is the
probability that the product operates successfully if the probability of any integrated circuit
failing to operate is 0.01?
2. Digital communication is achieved by transmitting information in “bits”. Errors do occur in
data transmissions. Suppose that the number of bits in error is represented by the random
variable X and that the probability of a communication error in a bit is 0.001. If at most 2
errors are present in a 1000 bit transmission, the transmission can be successfully decoded. If
a 1000 bit message is transmitted, find the probability that it can be successfully decoded.
Before developing the general binomial distribution we consider the following examples which, as you
will soon recognise, have the basic characteristics of a binomial distribution.
Example 7
In a box of floppy discs it is known that 95% will work. A sample of three of the
discs is selected at random.
Find the probability that (a) none (b) 1, (c) 2, (d) all 3 of the sample will work.
Solution
Let the event {the disc works} be W and the event {the disc fails} be F . The probability that a
disc will work is denoted by P(W ) and the probability that a disc will fail is denoted by P(F ). Then
P(W ) = 0.95 and P(F ) = 1 − P(W ) = 1 − 0.95 = 0.05.
(a) The probability that none of the discs works equals the probability that all 3 discs fail.
This is given by:
\[
\begin{aligned}
P(\text{none work}) = P(FFF) &= P(F) \times P(F) \times P(F) \quad \text{as the events are independent} \\
&= 0.05 \times 0.05 \times 0.05 = 0.05^3 = 0.000125
\end{aligned}
\]
(b) If only one disc works then you could select the three discs in the following orders
(FFW) or (FWF) or (WFF), hence
\[
\begin{aligned}
P(\text{one works}) &= P(FFW) + P(FWF) + P(WFF) \\
&= P(F) \times P(F) \times P(W) + P(F) \times P(W) \times P(F) + P(W) \times P(F) \times P(F) \\
&= (0.05 \times 0.05 \times 0.95) + (0.05 \times 0.95 \times 0.05) + (0.95 \times 0.05 \times 0.05) \\
&= 3 \times (0.05)^2 \times 0.95 = 0.007125
\end{aligned}
\]
(c) If 2 discs work you could select them in order
(FWW) or (WFW) or (WWF), hence
\[
\begin{aligned}
P(\text{two work}) &= P(FWW) + P(WFW) + P(WWF) \\
&= P(F) \times P(W) \times P(W) + P(W) \times P(F) \times P(W) + P(W) \times P(W) \times P(F) \\
&= (0.05 \times 0.95 \times 0.95) + (0.95 \times 0.05 \times 0.95) + (0.95 \times 0.95 \times 0.05) \\
&= 3 \times (0.05) \times (0.95)^2 = 0.135375
\end{aligned}
\]
(d) The probability that all 3 discs work is given by $P(WWW) = 0.95^3 = 0.857375$.
Notice that since the 4 outcomes we have dealt with are all possible outcomes
of selecting 3 discs, the probabilities should add up to 1. It is an easy check to verify
that they do.
One of the most important assumptions above is that of independence. The probability
of selecting a working disc remains unchanged no matter whether the previous selected
disc worked or not.
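The four results of Example 7 can be checked by listing every order in which working (W) and failing (F) discs can be selected. The following is a minimal sketch, not part of the original Workbook; the function name `working_disc_distribution` is an illustrative choice, and independence between selections is assumed exactly as in the Example.

```python
from itertools import product

P_WORK = 0.95            # P(W) from Example 7
P_FAIL = 1 - P_WORK      # P(F)

def working_disc_distribution(n_discs=3):
    """Add up the probability of every W/F order, grouped by the number of working discs."""
    dist = {k: 0.0 for k in range(n_discs + 1)}
    for order in product("WF", repeat=n_discs):
        prob = 1.0
        for disc in order:
            # Independence: multiply the individual probabilities.
            prob *= P_WORK if disc == "W" else P_FAIL
        dist[order.count("W")] += prob
    return dist

dist = working_disc_distribution()
for k, prob in dist.items():
    print(f"P({k} work) = {prob:.6f}")
print("sum of probabilities =", round(sum(dist.values()), 12))
```

The three orders containing one working disc, and the three containing two, are exactly the orders written out in parts (b) and (c).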
Example 8
A worn machine is known to produce 10% defective components. If the random
variable X is the number of defective components produced in a run of 3 components,
find the probabilities that X takes the values 0 to 3.
Solution
Assuming that the production of components is independent and that the probability p = 0.1 of
producing a defective component remains constant, the following table summarizes the production
run. We let G represent a good component and let D represent a defective component.
Note that since we are only dealing with two possible outcomes, we can say that the probability q of
the machine producing a good component is 1 − 0.1 = 0.9. More generally, we know that q+p = 1
if we are dealing with a binomial distribution.
Outcome   Value of X   Probability of Occurrence
GGG       0            (0.9)(0.9)(0.9) = $(0.9)^3$
GGD       1            (0.9)(0.9)(0.1) = $(0.9)^2(0.1)$
GDG       1            (0.9)(0.1)(0.9) = $(0.9)^2(0.1)$
DGG       1            (0.1)(0.9)(0.9) = $(0.9)^2(0.1)$
DDG       2            (0.1)(0.1)(0.9) = $(0.9)(0.1)^2$
DGD       2            (0.1)(0.9)(0.1) = $(0.9)(0.1)^2$
GDD       2            (0.9)(0.1)(0.1) = $(0.9)(0.1)^2$
DDD       3            (0.1)(0.1)(0.1) = $(0.1)^3$
From this table it is easy to see that
\[
\begin{aligned}
P(X = 0) &= (0.9)^3 \\
P(X = 1) &= 3 \times (0.9)^2(0.1) \\
P(X = 2) &= 3 \times (0.9)(0.1)^2 \\
P(X = 3) &= (0.1)^3
\end{aligned}
\]
Clearly, a pattern is developing. In fact you may have already realized that the probabilities we have
found are just the terms of the expansion of the expression $(0.9 + 0.1)^3$ since
\[ (0.9 + 0.1)^3 = (0.9)^3 + 3 \times (0.9)^2(0.1) + 3 \times (0.9)(0.1)^2 + (0.1)^3 \]
We now develop the binomial distribution from a more general perspective. If you find the theory
getting a bit heavy simply refer back to this example to help clarify the situation.
First we shall find it convenient to denote the probability of failure on a trial, which is 1 − p, by q,
that is:
q = 1 − p.
What we shall do is to calculate probabilities of the number of ‘successes’ occurring in n trials,
beginning with n = 1.
n=1 With only one trial we can observe either 1 success (with probability p) or 0 successes
(with probability q).
n=2 Here there are 3 possibilities: We can observe 2, 1 or 0 successes. Let S denote a success
and F denote a failure. So a failure followed by a success would be denoted by F S whilst two failures
followed by one success would be denoted by F F S and so on.
Then
\[ P(\text{2 successes in 2 trials}) = P(SS) = P(S)P(S) = p^2 \]
(where we have used the assumption of independence between trials and hence multiplied
probabilities). Now, using the usual rules of basic probability, we have:
\[
\begin{aligned}
P(\text{1 success in 2 trials}) &= P[(SF) \cup (FS)] = P(SF) + P(FS) = pq + qp = 2pq \\
P(\text{0 successes in 2 trials}) &= P(FF) = P(F)P(F) = q^2
\end{aligned}
\]
The three probabilities we have found, $q^2$, $2qp$ and $p^2$, are in fact the terms which arise in the
binomial expansion of $(q + p)^2 = q^2 + 2qp + p^2$. We also note that since q = 1 − p the probabilities
sum to 1 (as we should expect):
\[ q^2 + 2qp + p^2 = (q + p)^2 = ((1 - p) + p)^2 = 1 \]
Task
List the outcomes for the binomial model for the case n = 3, calculate their
probabilities and display the results in a table.
Your solution
Answer
{three successes, two successes, one success, no successes}
Three successes occur only as SSS with probability $p^3$.
Two successes can occur as SSF with probability $p^2 q$, as SFS with probability $pqp$ or as FSS
with probability $qp^2$.
These are mutually exclusive events so the combined probability is the sum $3p^2 q$.
Similarly, we can calculate the other probabilities and obtain the following table of results.
Number of successes   3       2          1          0
Probability           $p^3$   $3p^2 q$   $3pq^2$    $q^3$
Note that the probabilities you have obtained:
$q^3$, $3q^2 p$, $3qp^2$, $p^3$
are the terms which arise in the binomial expansion of $(q + p)^3 = q^3 + 3q^2 p + 3qp^2 + p^3$
Task
Repeat the previous Task for the binomial model for the case with n = 4.
Your solution
Answer
Number of successes   4       3          2             1          0
Probability           $p^4$   $4p^3 q$   $6p^2 q^2$    $4pq^3$    $q^4$
Again we explore the connection between the probabilities and the terms in the binomial expansion
of $(q + p)^4$. Consider this expansion
\[ (q + p)^4 = q^4 + 4q^3 p + 6q^2 p^2 + 4qp^3 + p^4 \]
Then, for example, the term $4p^3 q$ is the probability of 3 successes in the four trials. These successes
can occur anywhere in the four trials and there must be one failure, hence the $p^3$ and $q$ components
which are multiplied together. The remaining part of this term, 4, is the number of ways of selecting
three objects from 4.
Similarly there are ${}^{4}C_{2} = \dfrac{4!}{2!\,2!} = 6$ ways of selecting two objects from 4 so that the coefficient 6
combines with $p^2$ and $q^2$ to give the probability of two successes (and hence two failures) in four
trials.
The approach described here can be extended for any number n of trials.
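The claim that the coefficient of each term counts the ways of placing the successes among the trials can be checked by brute force. The sketch below is an illustration rather than part of the Workbook; `count_sequences` is a hypothetical helper that lists every sequence of S and F of length n and counts those with exactly r successes, and the result is compared with Python's built-in `math.comb`.

```python
from itertools import product
from math import comb

def count_sequences(n, r):
    """Count the S/F sequences of length n that contain exactly r successes."""
    return sum(1 for seq in product("SF", repeat=n) if seq.count("S") == r)

n = 4
counts = [count_sequences(n, r) for r in range(n + 1)]
print("counted by listing sequences:", counts)
print("from the formula n!/(r!(n-r)!):", [comb(n, r) for r in range(n + 1)])
```

For n = 4 both lines give the coefficients 1, 4, 6, 4, 1 of the expansion of $(q + p)^4$. Listing all $2^n$ sequences is only practical for small n, which is one reason the closed formula is preferred.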
Key Point 4
The Binomial Probabilities

Let X be a discrete random variable, being the number of successes occurring in n independent
trials of an experiment. If X is to be described by the binomial model, the probability of exactly r
successes in n trials is given by
\[ P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}. \]
Here there are r successes (each with probability p), n − r failures (each with probability q) and
${}^{n}C_{r} = \dfrac{n!}{r!\,(n - r)!}$ is the number of ways of placing the r successes among the n trials.
Notation
If a random variable X follows a binomial distribution in which an experiment is repeated n times
each with probability p of success then we write X ∼ B(n, p).
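The formula in Key Point 4 translates directly into a short function. This is a minimal sketch under the assumptions of the binomial model; the name `binomial_pmf` is illustrative and not taken from the Workbook. It is applied to the coin question posed earlier in this Section (exactly 4 Heads in 7 throws).

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p), i.e. nCr * p^r * q^(n-r) with q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Exactly 4 Heads in 7 throws of a fair coin: n = 7, p = 1/2, r = 4.
p_four_heads = binomial_pmf(4, 7, 0.5)
print("P(4 Heads in 7 throws) =", p_four_heads)

# The probabilities for r = 0, 1, ..., n should add up to 1.
print("sum over all r =", sum(binomial_pmf(r, 7, 0.5) for r in range(8)))
```

Here ${}^{7}C_{4} = 35$ and each of the $2^7 = 128$ sequences is equally likely, so the probability is 35/128, roughly 0.273.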
Example 9
A worn machine is known to produce 10% defective components. If the random
variable X is the number of defective components produced in a run of 4 components,
find the probabilities that X takes the values 0 to 4.
Solution
From Example 8, we know that the probabilities required are the terms of the expansion of the
expression:
$(0.9 + 0.1)^4$ so X ∼ B(4, 0.1)
Hence the required probabilities are (using the general formula with n = 4 and p = 0.1)
\[
\begin{aligned}
P(X = 0) &= (0.9)^4 = 0.6561 \\
P(X = 1) &= 4(0.9)^3(0.1) = 0.2916 \\
P(X = 2) &= \frac{4 \times 3}{1 \times 2}(0.9)^2(0.1)^2 = 0.0486 \\
P(X = 3) &= \frac{4 \times 3 \times 2}{1 \times 2 \times 3}(0.9)(0.1)^3 = 0.0036 \\
P(X = 4) &= (0.1)^4 = 0.0001
\end{aligned}
\]
Also, since we are using the expansion of $(0.9 + 0.1)^4$, the probabilities should sum to 1. This is a
useful check on your arithmetic when you are using a binomial distribution.
Example 10
In a box of switches it is known 10% of the switches are faulty. A technician is
wiring 30 circuits, each of which needs one switch. What is the probability that
(a) all 30 work, (b) at most 2 of the circuits do not work?
Solution
The answers involve binomial distributions because there are only two states for each circuit - it
either works or it doesn’t work.
A trial is the operation of testing each circuit.
A success is that it works. We are given P(success) = p = 0.9
Also we have the number of trials n = 30
Applying the binomial distribution $P(X = r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n-r}$.
(a) Probability that all 30 work is $P(X = 30) = {}^{30}C_{30}(0.9)^{30}(0.1)^{0} = 0.04239$
(b) The statement that “at most 2 circuits do not work” implies that 28, 29 or 30 work.
That is X ≥ 28
\[
\begin{aligned}
P(X \geq 28) &= P(X = 28) + P(X = 29) + P(X = 30) \\
P(X = 30) &= {}^{30}C_{30}(0.9)^{30}(0.1)^{0} = 0.04239 \\
P(X = 29) &= {}^{30}C_{29}(0.9)^{29}(0.1)^{1} = 0.14130 \\
P(X = 28) &= {}^{30}C_{28}(0.9)^{28}(0.1)^{2} = 0.22766
\end{aligned}
\]
Hence P(X ≥ 28) = 0.41135
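Part (b) can be approached from either side: by counting working circuits, as in the Solution, or by counting faulty ones. The sketch below, which is an illustration and not part of the Workbook, computes both and shows that they agree. The random variable Y (number of faulty circuits, so that Y ∼ B(30, 0.1)) and the helper `binomial_pmf` are names introduced here for the purpose.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n = 30

# Counting working circuits: X ~ B(30, 0.9) and we need P(X >= 28).
p_at_least_28_work = sum(binomial_pmf(r, n, 0.9) for r in (28, 29, 30))

# Counting faulty circuits instead: Y ~ B(30, 0.1) and we need P(Y <= 2).
p_at_most_2_faulty = sum(binomial_pmf(r, n, 0.1) for r in (0, 1, 2))

print(round(p_at_least_28_work, 5), round(p_at_most_2_faulty, 5))
```

Which outcome is labelled ‘success’ is a matter of convenience; the two descriptions give the same probability because ${}^{n}C_{r} = {}^{n}C_{n-r}$.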
Example 11
A University Engineering Department has introduced a new software package called
SOLVIT. To save money, the University’s Purchasing Department has negotiated
a bargain price for a 4-user licence that allows only four students to use SOLVIT
at any one time. It is estimated that this should allow 90% of students to use
the package when they need it. The Students’ Union has asked for more licences
to be bought since engineering students report having to queue excessively to use
SOLVIT. As a result the Computer Centre monitors the use of the software. Their
findings show that on average 20 students are logged on at peak times and 4 of
these want to use SOLVIT. Was the Purchasing Department’s estimate correct?
Solution
\[ P(\text{student wanted to use SOLVIT}) = \frac{4}{20} = 0.2 \]
Let X be the number of students wanting to use SOLVIT at any one time, then
\[
\begin{aligned}
P(X = 0) &= {}^{20}C_{0}(0.2)^{0}(0.8)^{20} = 0.0115 \\
P(X = 1) &= {}^{20}C_{1}(0.2)^{1}(0.8)^{19} = 0.0576 \\
P(X = 2) &= {}^{20}C_{2}(0.2)^{2}(0.8)^{18} = 0.1369 \\
P(X = 3) &= {}^{20}C_{3}(0.2)^{3}(0.8)^{17} = 0.2054 \\
P(X = 4) &= {}^{20}C_{4}(0.2)^{4}(0.8)^{16} = 0.2182
\end{aligned}
\]
Therefore
\[
\begin{aligned}
P(X \leq 4) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) \\
&= 0.0115 + 0.0576 + 0.1369 + 0.2054 + 0.2182 \\
&= 0.6296
\end{aligned}
\]
The probability that more than 4 students will want to use SOLVIT is
\[ P(X > 4) = 1 - P(X \leq 4) = 0.3704 \]
That is, 37% of the time there will be more than 4 students wanting to use the software. The
Purchasing Department has grossly overestimated the availability of the software on the basis of a
4-user licence.
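The same cumulative sum can be used to ask how many simultaneous users would be needed to reach the 90% figure quoted in the Example. The following sketch is illustrative only; `binomial_cdf` and `licences_needed` are names introduced here, and the model X ∼ B(20, 0.2) is taken from the Solution above.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

n, p = 20, 0.2      # 20 students logged on, each wanting SOLVIT with probability 0.2
target = 0.90       # the availability quoted in the Example

# Availability offered by the 4-user licence.
print("P(X <= 4) =", round(binomial_cdf(4, n, p), 4))

# Smallest number of simultaneous users k for which P(X <= k) reaches the target.
licences_needed = next(k for k in range(n + 1) if binomial_cdf(k, n, p) >= target)
print("smallest k with P(X <= k) >= 0.90:", licences_needed)
```

Under this model the cumulative probability first exceeds 0.90 at k = 6, so a 6-user licence would be the smallest that meets the stated target.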
Task
Using the binomial model, and assuming that a success occurs with probability $\frac{1}{5}$
in each trial, find the probability that in 6 trials there are
(a) 0 successes (b) 3 successes (c) 2 failures.
Let X be the number of successes in 6 independent trials.

Your solution
(a) P(X = 0) =
Answer
In each case $p = \frac{1}{5}$ and $q = 1 - p = \frac{4}{5}$.
Here r = 0 and
\[ P(X = 0) = q^6 = \left(\frac{4}{5}\right)^6 = \frac{4096}{15625} \approx 0.262 \]
Your solution
(b) P(X = 3) =
Answer
Here r = 3 and
\[
P(X = 3) = {}^{6}C_{3}\, p^3 q^3 = \frac{6 \times 5 \times 4}{1 \times 2 \times 3} \times \left(\frac{1}{5}\right)^3 \times \left(\frac{4}{5}\right)^3
= \frac{20 \times 64}{5^6} = \frac{1280}{15625} = 0.0819
\]
Your solution
(c) P(X = 4) =
Answer
Here r = 4 and
\[
P(X = 4) = {}^{6}C_{4}\, p^4 q^2 = \frac{6 \times 5}{1 \times 2} \times \left(\frac{1}{5}\right)^4 \times \left(\frac{4}{5}\right)^2
= \frac{15 \times 4^2}{5^6} = \frac{240}{15625} = 0.01536
\]
2. Expectation and variance of the binomial distribution

For a binomial distribution X ∼ B(n, p), the mean and variance, as we shall see, have a simple form.
While we will not prove the formulae in general terms - the algebra can be rather tedious - we will
illustrate the results for cases involving small values of n.
The case n = 2
Essentially, we have a random variable X which follows a binomial distribution X ∼ B(2, p) so that
the values taken by X (and $X^2$, needed to calculate the variance) are shown in the following table:
x    $x^2$   P(X = x)   xP(X = x)   $x^2$P(X = x)
0    0       $q^2$      0           0
1    1       $2qp$      $2qp$       $2qp$
2    4       $p^2$      $2p^2$      $4p^2$
We can now calculate the mean of this distribution:
\[ E(X) = \sum x P(X = x) = 0 + 2qp + 2p^2 = 2p(q + p) = 2p \quad \text{since } q + p = 1 \]
Similarly, the variance V(X) is given by
\[ V(X) = E(X^2) - [E(X)]^2 = 0 + 2qp + 4p^2 - (2p)^2 = 2qp \]
Task
Calculate the mean and variance of a random variable X which follows a binomial
distribution X ∼ B(3, p).
Your solution
Answer
The table of values appropriate to this case is:
x    $x^2$   P(X = x)   xP(X = x)   $x^2$P(X = x)
0    0       $q^3$      0           0
1    1       $3q^2 p$   $3q^2 p$    $3q^2 p$
2    4       $3qp^2$    $6qp^2$     $12qp^2$
3    9       $p^3$      $3p^3$      $9p^3$
Hence
\[ E(X) = \sum x P(X = x) = 0 + 3q^2 p + 6qp^2 + 3p^3 = 3p(q + p)^2 = 3p \quad \text{since } q + p = 1 \]
\[
\begin{aligned}
V(X) &= E(X^2) - [E(X)]^2 \\
&= 0 + 3q^2 p + 12qp^2 + 9p^3 - (3p)^2 \\
&= 3p(q^2 + 4qp + 3p^2 - 3p) \\
&= 3p((1 - p)^2 + 4(1 - p)p + 3p^2 - 3p) \\
&= 3p(1 - 2p + p^2 + 4p - 4p^2 + 3p^2 - 3p) = 3p(1 - p) = 3pq
\end{aligned}
\]
From the results given above, it is reasonable to assert the following result in Key Point 5.
Key Point 5
Expectation and Variance of the Binomial Distribution
If a random variable X which can assume the values 0, 1, 2, 3, . . . , n follows a binomial distribution
X ∼ B(n, p) so that
\[ P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r} = {}^{n}C_{r}\, p^{r} (1 - p)^{n-r} \]
then the expectation and variance of the distribution are given by the formulae
E(X) = np and V (X) = np(1 − p) = npq
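Key Point 5 is stated without a general proof, so a numerical check for a few values of n and p may be reassuring. The sketch below is an illustration only: it computes the mean and variance directly from the probabilities, exactly as in the tables for n = 2 and n = 3, and compares them with np and npq. The function name and the particular (n, p) pairs are arbitrary choices made here.

```python
from math import comb

def binomial_mean_and_variance(n, p):
    """E(X) and V(X) for X ~ B(n, p), computed by summing over r = 0, ..., n."""
    probs = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    mean_of_squares = sum(r * r * pr for r, pr in enumerate(probs))
    return mean, mean_of_squares - mean**2      # V(X) = E(X^2) - [E(X)]^2

for n, p in ((2, 0.3), (3, 0.3), (10, 0.1), (20, 0.05)):
    mean, var = binomial_mean_and_variance(n, p)
    print(f"n={n:>2} p={p}: mean={mean:.6f} (np={n * p:.6f}), "
          f"variance={var:.6f} (npq={n * p * (1 - p):.6f})")
```

Agreement in a handful of cases is not a proof, but it is consistent with the pattern E(X) = np and V(X) = npq seen for n = 2 and n = 3.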
Task
A die is thrown repeatedly 36 times in all. Find E(X) and V (X) where X is the
number of sixes obtained.
Your solution
Answer
Consider the occurrence of a six, with X being the number of sixes thrown in 36 trials.
The random variable X follows a binomial distribution. (Why? Refer to the four criteria for the
binomial model given earlier in this Section if necessary.) A trial is the operation of throwing a die.
A success is the occurrence of a 6 on a particular trial, so $p = \frac{1}{6}$. We have n = 36, $p = \frac{1}{6}$ so that
\[ E(X) = np = 36 \times \frac{1}{6} = 6 \quad \text{and} \quad V(X) = npq = 36 \times \frac{1}{6} \times \frac{5}{6} = 5. \]
Hence the standard deviation is $\sigma = \sqrt{5} \simeq 2.236$.
E(X) = 6 implies that in 36 throws of a fair die we would expect, on average, to see 6 sixes. This
makes perfect sense, of course.
Exercises
1. The probability that a mountain-bike rider travelling along a certain track will have a tyre burst
is 0.05. Find the probability that among 17 riders:
(a) exactly one has a burst tyre
(b) at most three have a burst tyre
(c) two or more have burst tyres.
2. (a) A transmission channel transmits zeros and ones in strings of length 8, called ‘words’.
Possible distortion may change a one to a zero or vice versa; assume this distortion occurs
with probability 0.01 for each digit, independently. An error-correcting code is employed
in the construction of the word such that the receiver can deduce the word correctly if at
most one digit is in error. What is the probability the word is decoded incorrectly?
(b) Assume that a word is a sequence of 10 zeros or ones and, as before, the probability of
incorrect transmission of a digit is 0.01. If the error-correcting code allows correct decoding
of the word if no more than two digits are incorrect, compute the probability that the
word is decoded correctly.
3. An examination consists of 10 multi-choice questions, in each of which a candidate has to
deduce which one of five suggested answers is correct. A completely unprepared student
guesses each answer completely randomly. What is the probability that this student gets 8 or
more questions correct? Draw the appropriate moral!
4. The probability that a machine will produce all bolts in a production run within specification
is 0.998. A sample of 8 machines is taken at random. Calculate the probability that
(a) all 8 machines, (b) 7 or 8 machines, (c) at least 6 machines
will produce all bolts within specification.
5. The probability that a machine develops a fault within the first 3 years of use is 0.003. If 40
machines are selected at random, calculate the probability that 38 or more will not develop any
faults within the first 3 years of use.
6. A computer installation has 10 terminals. Independently, the probability that any one terminal
will require attention during a week is 0.1. Find the probabilities that
(a) 0, (b) 1, (c) 2, (d) 3 or more terminals will require attention during the next week.
7. The quality of electronic chips is checked by examining samples of 5. The frequency distribution
of the number of defective chips per sample obtained when 100 samples have been examined
is:
No. of defectives 0 1 2 3 4 5
No. of samples 47 34 16 3 0 0
Calculate the proportion of defective chips in the 500 tested. Assuming that a binomial distri-
bution holds, use this value to calculate the expected frequencies corresponding to the observed
frequencies in the table.
8. In a large school, 80% of the pupils like mathematics. A visitor to the school asks each of 4
pupils, chosen at random, whether they like mathematics.
(a) Calculate the probabilities of obtaining an answer yes from 0, 1, 2, 3, 4 of the pupils
(b) Find the probability that the visitor obtains the answer yes from at least 2 pupils:
(i) when the number of pupils questioned remains at 4
(ii) when the number of pupils questioned is increased to 8.
9. A machine has two drive belts, one on the left and one on the right. From time to time the
drive belts break. When one breaks the machine is stopped and both belts are replaced. Details of n
consecutive breakages are recorded. Assume that the left and right belts are equally likely to break
first. Let X be the number of times the break is on the left.
(a) How many possible different sequences of “left” and “right” are there?
(b) How many of these sequences contain exactly j “lefts”?
(c) Find an expression, in terms of n and j, for the probability that X = j.
(d) Let n = 6. Find the probability distribution of X.
10. A machine is built to make mass-produced items. Each item made by the machine has a
probability p of being defective. Given the value of p, the items are independent of each other.
Because of the way in which the machines are made, p could take one of several values. In fact
p = X/100 where X has a discrete uniform distribution on the interval [0, 5]. The machine is tested
by counting the number of items made before a defective is produced. Find the conditional probability
distribution of X given that the first defective item is the thirteenth to be made.
11. Seven batches of articles are manufactured. Each batch contains ten articles. Each article has,
independently, a probability of 0.1 of being defective. Find the probability that there is at least one
defective article
(a) in exactly four of the batches,
(b) in four or more of the batches.
12. A service engineer can be called out for maintenance on the photocopiers in the offices of
four large companies, A, B, C and D. On any given day there is a probability of 0.1 that he will
be called to each of these companies. The event of being called to one company is independent of
whether or not he is called to any of the others.
(a) Find the probability that, on a particular day,
(i) he is called to all four companies,
(ii) he is called to at least three companies,
(iii) he is called to all four given that he is called to at least one,
(iv) he is called to all four given that he is called to Company A.
(b) Find the expected value and variance of the number of these companies which call the
engineer on a given day.
13. There are five machines in a factory. Of these machines, three are working properly and two
are defective. Machines which are working properly produce articles each of which has independently
a probability of 0.1 of being imperfect. For the defective machines this probability is 0.2. A machine
is chosen at random and five articles produced by the machine are examined. What is the probability
that the machine chosen is defective given that, of the five articles examined, two are imperfect and
three are perfect?
14. A company buys mass-produced articles from a supplier. Each article has a probability p of being
defective, independently of other articles. If the articles are manufactured correctly then p = 0.05.
However, a cheaper method of manufacture can be used and this results in p = 0.1.
(a) Find the probability of observing exactly three defectives in a sample of twenty articles
(i) given that p = 0.05
(ii) given that p = 0.1.
(b) The articles are made in large batches. Unfortunately batches made by both methods
are stored together and are indistinguishable until tested, although all of the articles
in any one batch will be made by the same method. Suppose that a batch delivered
to the company has a probability of 0.7 of being made by the correct method. Find the
conditional probability that such a batch is correctly manufactured given that, in a sample
of twenty articles from the batch, there are exactly three defectives.
(c) The company can either accept or reject a batch. Rejecting a batch leads to a loss for
the company of £150. Accepting a batch which was manufactured by the cheap method
will lead to a loss for the company of £400. Accepting a batch which was correctly
manufactured leads to a profit of £500. Determine a rule for what the company should
do if a sample of twenty articles contains exactly three defectives, in order to maximise
the expected value of the profit (where loss is negative profit). Should such a batch be
accepted or rejected?
(d) Repeat the calculation for four defectives in a sample of twenty and hence, or otherwise,
determine a rule for how the company should decide whether to accept or reject a batch
according to the number of defectives.
Answers
1. Binomial distribution $P(X = r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n-r}$ where p is the probability of a single ‘success’,
which is ‘tyre burst’.
(a) $P(X = 1) = {}^{17}C_{1}(0.05)^{1}(0.95)^{16} = 0.3741$
(b)
\[
\begin{aligned}
P(X \leq 3) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \\
&= (0.95)^{17} + 17(0.05)(0.95)^{16} + \frac{17 \times 16}{2 \times 1}(0.05)^2(0.95)^{15} \\
&\quad + \frac{17 \times 16 \times 15}{3 \times 2 \times 1}(0.05)^3(0.95)^{14} = 0.9912
\end{aligned}
\]
(c) $P(X \geq 2) = 1 - P[(X = 0) \cup (X = 1)] = 1 - (0.95)^{17} - 17(0.05)(0.95)^{16} = 0.2077$
2.
(a) P (distortion) = 0.01 for each digit. This is a binomial situation in which the probability
of ‘success’ is 0.01 = p and there are n = 8 trials.
A word is decoded incorrectly if there are two or more digits in error
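\[
\begin{aligned}
P(X \geq 2) &= 1 - P[(X = 0) \cup (X = 1)] \\
&= 1 - {}^{8}C_{0}(0.99)^{8} - {}^{8}C_{1}(0.01)(0.99)^{7} = 0.00269
\end{aligned}
\]
(b) Same as (a) with n = 10. Correct decoding if X ≤ 2
\[
\begin{aligned}
P(X \leq 2) &= P[(X = 0) \cup (X = 1) \cup (X = 2)] \\
&= (0.99)^{10} + 10(0.01)(0.99)^{9} + 45(0.01)^{2}(0.99)^{8} = 0.99989
\end{aligned}
\]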
3. Let X be a random variable ‘number of answers guessed correctly’ then for each question
(i.e. trial) the probability of a ‘success’ = $\frac{1}{5}$. It is clear that X follows a binomial distribution
with n = 10 and p = 0.2.
P(randomly choosing correct answer) = $\frac{1}{5}$, n = 10
\[
\begin{aligned}
P(\text{8 or more correct}) &= P[(X = 8) \cup (X = 9) \cup (X = 10)] \\
&= {}^{10}C_{8}(0.2)^{8}(0.8)^{2} + {}^{10}C_{9}(0.2)^{9}(0.8) + {}^{10}C_{10}(0.2)^{10} = 0.000078
\end{aligned}
\]
4. (a) 0.9841 (b) 0.9999 (c) 1.0000

5. P(X ≥ 38) = P(X = 38) + P(X = 39) + P(X = 40) = 0.00626 + 0.1067 + 0.88676 = 0.99975
6. (a) 0.3487 (b) 0.3874 (c) 0.1937 (d) 0.0702
7. 0.15 (total defectives = 0 + 34 + 32 + 9 + 0 out of 500 tested); 44, 39, 14, 2, 0, 0
8. (a) 0.0016, 0.0256, 0.1536, 0.4096, 0.4096; (b)(i) 0.9728 (b)(ii) 0.9988
9.
(a) There are $2^n$ possible sequences.
(b) The number containing exactly j “lefts” is $\binom{n}{j}$.
(c) $P(X = j) = \binom{n}{j} 2^{-n}$.
(d) With n = 6 the distribution of X is
j 0 1 2 3 4 5 6
P(X = j) 0.015625 0.09375 0.234375 0.3125 0.234375 0.09375 0.015625
10. Let Y be the number of the first defective item.
\[
P(X = j \mid Y = 13) = \frac{P(X = j) \times P(Y = 13 \mid X = j)}{\sum_{i=0}^{5} P(X = i) \times P(Y = 13 \mid X = i)}
= \frac{P(Y = 13 \mid X = j)}{\sum_{i=0}^{5} P(Y = 13 \mid X = i)}
\]
since P(X = j) = 1/6 for j = 0, . . . , 5.
\[ P(Y = 13 \mid X = j) = \left(1 - \frac{X}{100}\right)^{12} \frac{X}{100} \]
j       P(Y = 13 | X = j)   P(X = j | Y = 13)
0       0.00000             0.0000
1       0.00886             0.0707
2       0.01569             0.1251
3       0.02082             0.1660
4       0.02451             0.1954
5       0.02702             0.2154
6       0.02856             0.2277
Total   0.12546             1
11.
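The probability of at least one defective in a batch is $1 - 0.9^{10} = 0.6513$.
Let the probability of at least one defective in exactly j batches be $p_j$.
(a) $p_4 = \binom{7}{4}\left(1 - 0.9^{10}\right)^4 \left(0.9^{10}\right)^3 = 35 \times 0.6513^4 \times 0.3487^3 = 0.2670$.
(b)
\[
\begin{aligned}
p_5 &= \binom{7}{5}\left(1 - 0.9^{10}\right)^5 \left(0.9^{10}\right)^2 = 21 \times 0.6513^5 \times 0.3487^2 = 0.2993. \\
p_6 &= \binom{7}{6}\left(1 - 0.9^{10}\right)^6 \left(0.9^{10}\right)^1 = 7 \times 0.6513^6 \times 0.3487^1 = 0.1863. \\
p_7 &= \binom{7}{7}\left(1 - 0.9^{10}\right)^7 \left(0.9^{10}\right)^0 = 0.6513^7 = 0.0497.
\end{aligned}
\]
The probability of at least one defective in four or more of the batches is
\[ p_4 + p_5 + p_6 + p_7 = 0.8023. \]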
12.
(a) Let Y be the number of companies to which the engineer is called and let A denote the event
that the engineer is called to company A.
(i) $P(Y = 4) = 0.1^4 = 0.0001$.
(ii) $P(Y \geq 3) = \binom{4}{3} \times 0.1^3 \times 0.9^1 + 0.1^4 = 0.0037$.
(iii)
\[
P(Y = 4 \mid Y \geq 1) = \frac{P(Y = 4 \cap Y \geq 1)}{P(Y \geq 1)} = \frac{P(Y = 4)}{P(Y \geq 1)}
= \frac{0.0001}{1 - 0.9^4} = \frac{0.0001}{0.3439} = \frac{1}{3439} = 0.0003.
\]
(iv)
\[
P(Y = 4 \mid A) = \frac{P(Y = 4 \cap A)}{P(A)} = \frac{P(Y = 4)}{P(A)} = \frac{0.0001}{0.1} = 0.0010.
\]
(b) The mean is E(Y) = 4 × 0.1 = 0.4. The variance is V(Y) = 4 × 0.1 × 0.9 = 0.36.
13. Let D denote the event that the chosen machine is defective and D̄ denote the event
“not D”.
Let Y be the number of imperfect articles in the sample of five.
Then
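\[
\begin{aligned}
P(D \mid Y = 2) &= \frac{P(D) \times P(Y = 2 \mid D)}{P(D) \times P(Y = 2 \mid D) + P(\bar{D}) \times P(Y = 2 \mid \bar{D})} \\
&= \frac{\frac{2}{5} \times \binom{5}{2} \times 0.2^2 \times 0.8^3}{\frac{2}{5} \times \binom{5}{2} \times 0.2^2 \times 0.8^3 + \frac{3}{5} \times \binom{5}{2} \times 0.1^2 \times 0.9^3} \\
&= \frac{2 \times 0.2^2 \times 0.8^3}{2 \times 0.2^2 \times 0.8^3 + 3 \times 0.1^2 \times 0.9^3} \\
&= \frac{0.04096}{0.04096 + 0.02187} = 0.6519.
\end{aligned}
\]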
14.

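(a) (i) $p_3 = \binom{20}{3} 0.1^3 \times 0.9^{17} = \dfrac{20 \times 19 \times 18}{1 \times 2 \times 3} \times 0.1^3 \times 0.9^{17} = 0.190$.
(ii)
\[
\begin{aligned}
p_2 &= \binom{20}{2} 0.1^2 \times 0.9^{18} = \frac{3}{18} \times 9 \times p_3 = 0.28518 \\
p_1 &= \binom{20}{1} 0.1 \times 0.9^{19} = \frac{2}{19} \times 9 \times p_2 = 0.27017 \\
p_0 &= \binom{20}{0} 0.9^{20} = 0.12158.
\end{aligned}
\]
The total probability is 0.867.
(iii) The required probability is the probability of at most 2 out of 16.
\[
\begin{aligned}
p'_0 &= P(\text{0 out of 16}) = 0.9^{16} = 0.185302 \\
p'_1 &= P(\text{1 out of 16}) = \frac{16}{9} \times p'_0 = 0.3294258 \\
p'_2 &= P(\text{2 out of 16}) = \frac{15}{2} \times \frac{1}{9} \times p'_1 = 0.2745215
\end{aligned}
\]
(b)
\[
\frac{0.2 \times \binom{4}{1} 0.3^1 \times 0.7^3}{0.2 \times \binom{4}{1} 0.3^1 \times 0.7^3 + 0.8 \times \binom{4}{1} 0.1^1 \times 0.9^3}
= \frac{0.02058}{0.02058 + 0.05832} = 0.2608.
\]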
The Poisson
Distribution 37.3

Introduction
In this Section we introduce a probability model which can be used when the outcome of an experiment
is a random variable taking on non-negative integer values and where the only information available is a
measurement of its average value. This has widespread applications, for example in analysing traffic
flow, in fault prediction on electric cables and in the prediction of randomly occurring accidents. We
shall look at the Poisson distribution in two distinct ways. Firstly, as a distribution in its own right.
This will enable us to apply statistical methods to a set of problems which cannot be solved using
the binomial distribution. Secondly, as an approximation to the binomial distribution X ∼ B(n, p)
in the case where n is large and p is small. You will find that this approximation can often save the
need to do much tedious arithmetic.

Prerequisites
Before starting this Section you should . . .
• understand the concepts of probability
• understand the concepts and notation for the binomial distribution

Learning Outcomes
• recognise and use the formula for probabilities calculated from the Poisson model
• use the recurrence relation to generate a succession of probabilities
• use the Poisson model to obtain approximate values for binomial probabilities
1. The Poisson approximation to the binomial distribution
The probability of the outcome X = r of a set of Bernoulli trials can always be calculated by using
the formula
\[ P(X = r) = {}^{n}C_{r}\, q^{n-r} p^{r} \]
given above. Clearly, for very large values of n the calculation can be rather tedious; this is particularly
so when very small values of p are also present. In the situation when n is large and p is small and the
product np is constant we can take a different approach to the problem of calculating the probability
that X = r. In the table below the values of P(X = r) have been calculated for various combinations
of n and p under the constraint that np = 1. You should try some of the calculations for yourself
using the formula given above for some of the smaller values of n.
Probability of X successes
n       p        X=0    X=1    X=2    X=3    X=4    X=5    X=6
4       0.25     0.316  0.422  0.211  0.047  0.004
5       0.20     0.328  0.410  0.205  0.051  0.006  0.000
10      0.10     0.349  0.387  0.194  0.057  0.011  0.001  0.000
20      0.05     0.358  0.377  0.189  0.060  0.013  0.002  0.000
100     0.01     0.366  0.370  0.185  0.061  0.015  0.003  0.000
1000    0.001    0.368  0.368  0.184  0.061  0.015  0.003  0.001
10000   0.0001   0.368  0.368  0.184  0.061  0.015  0.003  0.001
Each of the binomial distributions given has a mean given by np = 1. Notice that the probabilities
that X = 0, 1, 2, 3, 4, . . . approach the values 0.368, 0.368, 0.184, . . . as n increases.
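The rows of this table can be recomputed with a few lines of Python. The sketch below is illustrative only: the helper names `binomial_pmf` and `poisson_pmf` are assumptions introduced here, not part of the Workbook. It evaluates the binomial formula for each n with p = 1/n (so that np = 1) and prints the Poisson values with λ = 1 for comparison.

```python
from math import comb, exp, factorial


def binomial_pmf(n: int, p: float, r: int) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * (1 - p) ** (n - r) * p ** r


def poisson_pmf(lam: float, r: int) -> float:
    """P(X = r) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** r / factorial(r)


for n in (4, 5, 10, 20, 100, 1000, 10000):
    p = 1 / n  # keeps the mean np equal to 1
    row = [binomial_pmf(n, p, r) for r in range(min(n, 6) + 1)]
    print(f"n={n:<6}", " ".join(f"{v:.3f}" for v in row))

# Limiting values given by the Poisson formula with lambda = 1
print("Poisson ", " ".join(f"{poisson_pmf(1.0, r):.3f}" for r in range(7)))
```

As n grows, each column settles towards the Poisson row, which is the behaviour described in the text.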
If we have to determine the probabilities of success when large values of n and small values of p are
involved it would be very convenient if we could do so without having to construct tables. In fact we
can do such calculations by using the Poisson distribution which, under certain constraints, may be
considered as an approximation to the binomial distribution.
By considering simplifications applied to the binomial distribution subject to the conditions

1. n is large
2. p is small
3. np = λ (λ a constant)
we can derive the formula
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$$
as an approximation to $P(X = r) = {}^{n}C_r \, q^{n-r} p^r$.
This is the Poisson distribution given previously. We now show how this is done. We know that the
binomial distribution is given by
$$(q + p)^n = q^n + n q^{n-1} p + \frac{n(n-1)}{2!} q^{n-2} p^2 + \cdots + \frac{n(n-1) \cdots (n-r+1)}{r!} q^{n-r} p^r + \cdots + p^n$$
Condition (2) tells us that since p is small, q = 1 − p is approximately equal to 1. Applying this to
the terms of the binomial expansion above we see that the right-hand side becomes
$$1 + np + \frac{n(n-1)}{2!} p^2 + \cdots + \frac{n(n-1) \cdots (n-r+1)}{r!} p^r + \cdots + p^n$$
Applying condition (1) allows us to approximate terms such as (n − 1), (n − 2), . . . to n (mathematically,
we are allowing n → ∞) and the right-hand side of our expansion becomes
$$1 + np + \frac{n^2}{2!} p^2 + \cdots + \frac{n^r}{r!} p^r + \cdots$$
Note that the term $p^n \to 0$ under these conditions and hence has been omitted.
We now have the series
$$1 + np + \frac{(np)^2}{2!} + \cdots + \frac{(np)^r}{r!} + \cdots$$
which, using condition (3) may be written as
$$1 + \lambda + \frac{\lambda^2}{2!} + \cdots + \frac{\lambda^r}{r!} + \cdots$$
You may recognise this as the expansion of $e^{\lambda}$.
If we are to be able to claim that the terms of this expansion represent probabilities, we must be sure
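that the sum of the terms is 1. We divide by $e^{\lambda}$ to satisfy this condition. This gives the result
$$\frac{e^{\lambda}}{e^{\lambda}} = 1 = \frac{1}{e^{\lambda}} \left( 1 + \lambda + \frac{\lambda^2}{2!} + \cdots + \frac{\lambda^r}{r!} + \cdots \right)$$
$$= e^{-\lambda} + e^{-\lambda} \lambda + e^{-\lambda} \frac{\lambda^2}{2!} + e^{-\lambda} \frac{\lambda^3}{3!} + \cdots + e^{-\lambda} \frac{\lambda^r}{r!} + \cdots$$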
The terms of this expansion are very good approximations to the corresponding binomial expansion
under the conditions
1. n is large
2. p is small
3. np = λ (λ constant)
The Poisson approximation to the binomial distribution is summarized below.
Key Point 6
Poisson Approximation to the Binomial Distribution
Assuming that n is large, p is small and that np is constant, the terms
$$P(X = r) = {}^{n}C_r \, (1 - p)^{n-r} p^r$$
of a binomial distribution may be closely approximated by the terms
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!} \quad \text{where } \lambda = np$$
of the Poisson distribution for corresponding values of r.
Example 12
We introduced the binomial distribution by considering the following scenario. A
worn machine is known to produce 10% defective components. If the random vari-
able X is the number of defective components produced in a run of 3 components,
find the probabilities that X takes the values 0 to 3.
Suppose now that a similar machine which is known to produce 1% defective
components is used for a production run of 40 components. We wish to calculate
the probability that two defective items are produced. Essentially we are assuming
that X ∼ B(40, 0.01) and are asking for P(X = 2). We use both the binomial
distribution and its Poisson approximation for comparison.
Solution
Using the binomial distribution we have the solution
$$P(X = 2) = {}^{40}C_2 \, (0.99)^{40-2} (0.01)^2 = \frac{40 \times 39}{1 \times 2} \times 0.99^{38} \times 0.01^2 = 0.0532$$
Note that the arithmetic involved is unwieldy. Using the Poisson approximation with λ = np = 40 × 0.01 = 0.4
we have the solution
$$P(X = 2) = e^{-0.4} \frac{0.4^2}{2!} = 0.0536$$
Note that the arithmetic involved is simpler and the approximation is reasonable.
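The two calculations in this Example can be checked numerically. The following is a minimal sketch, assuming a hypothetical helper `compare` (a name introduced here for illustration) that returns the exact binomial probability and its Poisson approximation with λ = np.

```python
from math import comb, exp, factorial


def compare(n: int, p: float, r: int) -> tuple[float, float]:
    """Return (exact binomial, Poisson approximation) for P(X = r)."""
    exact = comb(n, r) * (1 - p) ** (n - r) * p ** r
    lam = n * p  # Poisson mean matched to the binomial mean
    approx = exp(-lam) * lam ** r / factorial(r)
    return exact, approx


exact, approx = compare(40, 0.01, 2)
print(f"binomial: {exact:.4f}  Poisson: {approx:.4f}")  # binomial: 0.0532  Poisson: 0.0536
```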
Practical considerations
In practice, we can use the Poisson distribution to very closely approximate the binomial distribution
provided that the product np is constant with
n ≥ 100 and p ≤ 0.05
Note that this is not a hard-and-fast rule and we simply say that
‘the larger n is the better and the smaller p is the better provided that np is a sensible size.’
The approximation remains good provided that np < 5 for values of n as low as 20.
Task
Mass-produced needles are packed in boxes of 1000. It is believed that 1 needle
in 2000 on average is substandard. What is the probability that a box contains
2 or more defectives? The correct model is the binomial distribution with
n = 1000, p = 1/2000 (and q = 1999/2000).
(a) Using the binomial distribution calculate P(X = 0), P(X = 1) and hence P(X ≥ 2):
Your solution
Answer
$$P(X = 0) = \left( \frac{1999}{2000} \right)^{1000} = 0.60645$$
$$P(X = 1) = 1000 \times \left( \frac{1999}{2000} \right)^{999} \times \frac{1}{2000} = \frac{1}{2} \left( \frac{1999}{2000} \right)^{999} = 0.30338$$
∴ P(X = 0) + P(X = 1) = 0.60645 + 0.30338 = 0.90983 ≈ 0.9098 (4 d.p.)
Hence P(2 or more defectives) ≈ 1 − 0.9098 = 0.0902.
(b) Now choose a suitable value for λ in order to use a Poisson model to approximate the probabilities:
Your solution
λ=
Answer
$$\lambda = np = 1000 \times \frac{1}{2000} = \frac{1}{2}$$
Now recalculate the probability that there are 2 or more defectives using the Poisson distribution
with λ = 1/2:
Your solution
P(X = 0) =
P(X = 1) =
∴ P(2 or more defectives)=
Answer
$$P(X = 0) = e^{-1/2}, \qquad P(X = 1) = \tfrac{1}{2} e^{-1/2}$$
$$\therefore \; P(X = 0) + P(X = 1) = \tfrac{3}{2} e^{-1/2} = 0.9098 \text{ (4 d.p.)}$$
Hence P(2 or more defectives) ≈ 1 − 0.9098 = 0.0902.
In the above Task we have obtained the same answer to 4 d.p., as the exact binomial calculation,
essentially because p was so small. We shall not always be so lucky!
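To see how the agreement depends on p, the tail probability P(X ≥ 2) can be computed both ways. This is a sketch under stated assumptions: the function names are hypothetical, and the second test case (n = 20, p = 0.25) is an illustrative choice made here, not a case taken from the Workbook.

```python
from math import exp


def binomial_tail_2plus(n: int, p: float) -> float:
    """P(X >= 2) for X ~ B(n, p), computed as 1 - P(X = 0) - P(X = 1)."""
    q = 1 - p
    return 1 - q ** n - n * p * q ** (n - 1)


def poisson_tail_2plus(lam: float) -> float:
    """P(X >= 2) for a Poisson variable with mean lam."""
    return 1 - exp(-lam) * (1 + lam)


# First case: the needles Task. Second case: an assumed example with a much larger p.
for n, p in [(1000, 1 / 2000), (20, 0.25)]:
    exact = binomial_tail_2plus(n, p)
    approx = poisson_tail_2plus(n * p)
    print(f"n={n}, p={p}: binomial {exact:.4f}, Poisson {approx:.4f}")
```

For the needles data both values round to 0.0902; with the larger p the two results already differ in the second decimal place.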
Example 13
In the manufacture of glassware, bubbles can occur in the glass which reduces the
status of the glassware to that of a ‘second’. If, on average, one in every 1000
items produced has a bubble, calculate the probability that exactly six items in a
batch of three thousand are seconds.
Solution
Suppose that X = number of items with bubbles, then X ∼ B(3000, 0.001)
Since n = 3000 > 100 and p = 0.001 < 0.05 we can use the Poisson distribution with λ = np =
3000 × 0.001 = 3. The calculation is:
$$P(X = 6) = e^{-3} \frac{3^6}{6!} \approx 0.0498 \times 1.0125 \approx 0.05$$
The result means that we have about a 5% chance of finding exactly six seconds in a batch of three
thousand items of glassware.
Example 14
A manufacturer produces light-bulbs that are packed into boxes of 100. If quality
control studies indicate that 0.5% of the light-bulbs produced are defective, what
percentage of the boxes will contain:
(a) no defective? (b) 2 or more defectives?
Solution
As n is large and p, the probability that a bulb is defective, is small, use the Poisson approximation to the binomial
probability distribution. If X = number of defective bulbs in a box, then
X ∼ P(µ) where µ = n × p = 100 × 0.005 = 0.5
(a) $P(X = 0) = \frac{e^{-0.5} (0.5)^0}{0!} = \frac{e^{-0.5} (1)}{1} = 0.6065 \approx 61\%$
(b) P(X = 2 or more) = P(X = 2) + P(X = 3) + P(X = 4) + . . . but it is easier to consider:
P(X ≥ 2) = 1 − [P(X = 0) + P(X = 1)]
$P(X = 1) = \frac{e^{-0.5} (0.5)^1}{1!} = \frac{e^{-0.5} (0.5)}{1} = 0.3033$
i.e. P(X ≥ 2) = 1 − [0.6065 + 0.3033] = 0.0902 ≈ 9%
2. The Poisson distribution

The Poisson distribution is a probability model which can be used to find the probability of a single
event occurring a given number of times in an interval of (usually) time. The occurrence of these
events must be determined by chance alone which implies that information about the occurrence
of any one event cannot be used to predict the occurrence of any other event. It is worth noting
that only the occurrence of an event can be counted; the non-occurrence of an event cannot be
counted. This contrasts with Bernoulli trials where we know the number of trials, the number of
events occurring and therefore the number of events not occurring.
The Poisson distribution has widespread applications in areas such as analysing traffic flow, fault pre-
diction in electric cables, defects occurring in manufactured objects such as castings, email messages
arriving at a computer and in the prediction of randomly occurring events or accidents. One well
known series of accidental events concerns Prussian cavalry who were killed by horse kicks. Although
not discussed here (death by horse kick is hardly an engineering application of statistics!) you will
find accounts in many statistical texts. One example of the use of a Poisson distribution where the
events are not necessarily time related is in the prediction of fault occurrence along a long weld -
faults may occur anywhere along the length of the weld. A similar argument applies when scanning
castings for faults - we are looking for faults occurring in a volume of material, not over an interval
of time.
The following definition gives a theoretical underpinning to the Poisson distribution.
Definition of a Poisson process

Suppose that events occur at random throughout an interval. Suppose further that the interval can
be divided into subintervals which are so small that:
1. the probability of more than one event occurring in the subinterval is zero
2. the probability of one event occurring in a subinterval is proportional to the length of the
subinterval
3. an event occurring in any given subinterval is independent of any other subinterval
then the random experiment is known as a Poisson process.

The word ‘process’ is used to suggest that the experiment takes place over time, which is the usual
case. If the average number of events occurring in the interval (not subinterval) is λ (> 0) then the
random variable X representing the actual number of events occurring in the interval is said to have
a Poisson distribution and it can be shown (we omit the derivation) that
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!} \qquad r = 0, 1, 2, 3, \ldots$$
The following Key Point provides a summary.
Key Point 7
The Poisson Probabilities
If X is the random variable
‘number of occurrences in a given interval’
for which the average rate of occurrence is λ then, according to the Poisson model, the probability
of r occurrences in that interval is given by
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!} \qquad r = 0, 1, 2, 3, \ldots$$
Task
Using the Poisson distribution $P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$ write down the formulae for
P(X = 0), P(X = 1), P(X = 2) and P(X = 6), noting that 0! = 1.
Your solution
P(X = 0) =
P(X = 1) =
P(X = 2) =
P(X = 6) =
Answer
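$$P(X = 0) = e^{-\lambda} \times \frac{\lambda^0}{0!} = e^{-\lambda} \times \frac{1}{1} \equiv e^{-\lambda} \qquad P(X = 1) = e^{-\lambda} \times \frac{\lambda}{1!} = \lambda e^{-\lambda}$$
$$P(X = 2) = e^{-\lambda} \times \frac{\lambda^2}{2!} = \frac{\lambda^2}{2} e^{-\lambda} \qquad P(X = 6) = e^{-\lambda} \times \frac{\lambda^6}{6!} = \frac{\lambda^6}{720} e^{-\lambda}$$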
Task
Calculate P(X = 0) to P(X = 5) when λ = 2, accurate to 4 d.p.
Your solution
Answer
r 0 1 2 3 4 5
P(X = r) 0.1353 0.2707 0.2707 0.1804 0.0902 0.0361
Notice how the values for P(X = r) in the above answer increase, stay the same and then decrease
relatively rapidly (due to the significant increase in r! with increasing r). Here two of the probabilities
are equal and this will always be the case when λ is an integer.
In this last Task we only went up to P(X = 5) and calculated each entry separately. However, each
probability need not be calculated directly. We can use the following relations (which can be checked
from the formulae for P(X = r)) to get the next probability from the previous one:
$$P(X = 1) = \frac{\lambda}{1} P(X = 0), \quad P(X = 2) = \frac{\lambda}{2} P(X = 1), \quad P(X = 3) = \frac{\lambda}{3} P(X = 2), \quad \text{etc.}$$
Key Point 8
Recurrence Relation for Poisson Probabilities
In general, for ease of calculation the recurrence relation below can be used
$$P(X = r) = \frac{\lambda}{r} P(X = r - 1) \quad \text{for } r \geq 1.$$
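The recurrence relation translates directly into a loop. The sketch below assumes a hypothetical function name `poisson_table`; it starts from P(X = 0) = e^{−λ} and multiplies by λ/r at each step, so no factorials or powers are evaluated.

```python
from math import exp


def poisson_table(lam: float, r_max: int) -> list[float]:
    """Return [P(X = 0), ..., P(X = r_max)] using P(X = r) = (lam / r) * P(X = r - 1)."""
    probs = [exp(-lam)]  # P(X = 0) = e^(-lam)
    for r in range(1, r_max + 1):
        probs.append(probs[-1] * lam / r)
    return probs


print([round(v, 4) for v in poisson_table(2.0, 5)])
# [0.1353, 0.2707, 0.2707, 0.1804, 0.0902, 0.0361]
```

With λ = 2 this reproduces the values P(X = 0) to P(X = 5) obtained in the preceding Task.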
Example 15
Calculate the value for P(X = 6) to extend the Table in the previous Task using
the recurrence relation and the value for P(X = 5).
Solution
The recurrence relation gives the formula
$$P(X = 6) = \frac{2}{6} P(X = 5) = \frac{1}{3} \times 0.0361 = 0.0120$$
We now look further at the Poisson distribution by considering an example based on traffic flow.
Example 16
Suppose it has been observed that, on average, 180 cars per hour pass a specified
point on a particular road in the morning rush hour. Due to impending roadworks
it is estimated that congestion will occur closer to the city centre if more than
5 cars pass the point in any one minute. What is the probability of congestion
occurring?
Solution
We note that we cannot use the binomial model since we have no values of n and p. Essentially we
are saying that there is no fixed number (n) of cars passing the specified point and that we have
no way of estimating p. The only information available is the average rate at which cars pass the
specified point.
Let X be the random variable X = number of cars arriving in any minute. We need to calculate
the probability that more than 5 cars arrive in any one minute. Note that in order to do this we
need to convert the information given on the average rate (cars arriving per hour) into a value for
λ (cars arriving per minute). This gives the value λ = 180/60 = 3.
Using λ = 3 to calculate the required probabilities gives:
r          0        1        2        3        4        5        Sum
P(X = r)   0.04979  0.14936  0.22404  0.22404  0.16803  0.10082  0.91608
To calculate the required probability we note that

P(more than 5 cars arrive in one minute) = 1 − P(5 cars or less arrive in one minute)
Thus
P(X > 5) = 1 − P(X ≤ 5)

= 1 − P(X = 0) − P(X = 1) − P(X = 2) − P(X = 3) − P(X = 4) − P(X = 5)
Then P(more than 5) = 1 − 0.91608 = 0.08392 = 0.0839 (4 d.p.).
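The whole calculation, from rate conversion to the complement, can be sketched in a few lines. The function name `prob_more_than` is an assumption made for this illustration.

```python
from math import exp


def prob_more_than(lam: float, k: int) -> float:
    """P(X > k) for a Poisson variable with mean lam, as 1 - P(X <= k)."""
    term = exp(-lam)  # P(X = 0)
    cumulative = term
    for r in range(1, k + 1):
        term *= lam / r  # P(X = r) obtained from P(X = r - 1)
        cumulative += term
    return 1 - cumulative


cars_per_hour = 180
lam = cars_per_hour / 60  # average number of cars per minute
print(round(prob_more_than(lam, 5), 4))  # 0.0839
```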
Example 17
The mean number of bacteria per millilitre of a liquid is known to be 6. Find the
probability that in 1 ml of the liquid, there will be:
(a) 0, (b) 1, (c) 2, (d) 3, (e) less than 4, (f) 6 bacteria.
Solution
Here we have an average rate of occurrences but no estimate of the probability so it looks as though
we have a Poisson distribution with λ = 6. Using the formula in Key Point 7 we have:
(a) $P(X = 0) = e^{-6} \frac{6^0}{0!} = 0.00248$.
That is, the probability of having no bacteria in 1 ml of liquid is 0.00248.
(b) $P(X = 1) = \frac{\lambda}{1} \times P(X = 0) = 6 \times 0.00248 = 0.0149$.
That is, the probability of having 1 bacterium in 1 ml of liquid is 0.0149.
(c) $P(X = 2) = \frac{\lambda}{2} \times P(X = 1) = \frac{6}{2} \times 0.01487 = 0.0446$.
(d) $P(X = 3) = \frac{\lambda}{3} \times P(X = 2) = \frac{6}{3} \times 0.04462 = 0.0892$.
(e) P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.1512
(f) $P(X = 6) = e^{-6} \frac{6^6}{6!} = 0.1606$
Note that in working out the first 6 answers, which link together, all the digits were kept in the
calculator to ensure accuracy. Answers were rounded off only when written down.
Never copy down answers correct to, say, 4 decimal places and then use those rounded figures to
calculate the next figure as rounding-off errors will become greater at each stage. If you did so here
you would get answers 0.0025, 0.0150, 0.0450, 0.0900 and P(X < 4) = 0.1525. The difference is
not great but could be significant.
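The effect of premature rounding can be demonstrated directly. This sketch assumes a hypothetical function `poisson_chain` that applies the recurrence relation and, optionally, rounds after every step in the way the text warns against.

```python
from math import exp


def poisson_chain(lam, r_max, digits=None):
    """P(X = 0..r_max) by the recurrence; if digits is given, round after each step."""
    p = exp(-lam)
    if digits is not None:
        p = round(p, digits)
    probs = [p]
    for r in range(1, r_max + 1):
        p = p * lam / r
        if digits is not None:
            p = round(p, digits)
        probs.append(p)
    return probs


full = poisson_chain(6, 3)               # digits kept throughout
rounded = poisson_chain(6, 3, digits=4)  # rounded to 4 d.p. at every stage
print("full precision   :", round(sum(full), 4))     # 0.1512
print("rounded each step:", round(sum(rounded), 4))  # 0.1525
```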
Task
A Council is considering whether to base a recovery vehicle on a stretch of road to
help clear incidents as quickly as possible. The road concerned carries over 5000
vehicles during the peak rush hour period. Records show that, on average, the
number of incidents during the morning rush hour is 5. The Council won’t base a
vehicle on the road if the probability of having more than 5 incidents in any one
morning is less than 30%. Based on this information should the Council provide a
vehicle?
Your solution
(Do the calculation on separate paper and record the main results here.)
Answer
We need to calculate the probability that more than 5 incidents occur i.e. P(X > 5). To find this
we use the fact that P(X > 5) = 1 − P(X ≤ 5). Now, for this problem:
$$P(X = r) = e^{-5} \frac{5^r}{r!}$$
Writing answers to 5 d.p. gives:
$$P(X = 0) = e^{-5} \frac{5^0}{0!} = 0.00674$$
$$P(X = 1) = 5 \times P(X = 0) = 0.03369$$
$$P(X = 2) = \frac{5}{2} \times P(X = 1) = 0.08422$$
$$P(X = 3) = \frac{5}{3} \times P(X = 2) = 0.14037$$
$$P(X = 4) = \frac{5}{4} \times P(X = 3) = 0.17547$$
$$P(X = 5) = \frac{5}{5} \times P(X = 4) = 0.17547$$
P(X ≤ 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)
= 0.61596
The probability of more than 5 incidents is P(X > 5) = 1 − P(X ≤ 5) = 0.38404, which is 38.4%
(to 3 s.f.). Since this is not less than 30%, the Council should provide a vehicle.
3. Expectation and variance of the Poisson distribution

The expectation and variance of the Poisson distribution can be derived directly from the definitions
which apply to any discrete probability distribution. However, the algebra involved is a little lengthy.
Instead we derive them from the binomial distribution from which the Poisson distribution is derived.
Intuitive Explanation
One way of deriving the mean and variance of the Poisson distribution is to consider the behaviour
of the binomial distribution under the following conditions:
1. n is large 2. p is small 3. np = λ (a constant)
Recalling that the expectation and variance of the binomial distribution are given by the results
E(X) = np and V(X) = np(1 − p) = npq
it is reasonable to assert that condition (2) implies, since q = 1 − p, that q is approximately 1 and
so the expectation and variance are given by
E(X) = np = λ and V(X) = npq ≈ np = λ
The algebraic derivation of the expectation and variance of the Poisson distribution shows that
these results are in fact exact.
Note that the expectation and the variance are equal.
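For readers who want the direct route, the following is a sketch of the standard derivation from the definitions for a discrete distribution. It uses only the series for $e^{\lambda}$ and the identity $E(X^2) = E(X(X-1)) + E(X)$; the layout is an illustration added here rather than the Workbook's own working.

```latex
\begin{align*}
E(X) &= \sum_{r=0}^{\infty} r \, e^{-\lambda} \frac{\lambda^r}{r!}
      = \lambda e^{-\lambda} \sum_{r=1}^{\infty} \frac{\lambda^{r-1}}{(r-1)!}
      = \lambda e^{-\lambda} e^{\lambda} = \lambda, \\
E\bigl(X(X-1)\bigr) &= \sum_{r=2}^{\infty} r(r-1) \, e^{-\lambda} \frac{\lambda^r}{r!}
      = \lambda^2 e^{-\lambda} \sum_{r=2}^{\infty} \frac{\lambda^{r-2}}{(r-2)!}
      = \lambda^2, \\
V(X) &= E(X^2) - [E(X)]^2
      = E\bigl(X(X-1)\bigr) + E(X) - [E(X)]^2
      = \lambda^2 + \lambda - \lambda^2 = \lambda .
\end{align*}
```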
Key Point 9
The Poisson Distribution
If X is the random variable {number of occurrences in a given interval}
for which the average rate of occurrences is λ and X can assume the values 0, 1, 2, 3, . . . and the
probability of r occurrences in that interval is given by
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$$
then the expectation and variance of the distribution are given by the formulae
E(X) = λ and V(X) = λ
For a Poisson distribution the Expectation and Variance are equal.
Exercises
1. Large sheets of metal have faults in random positions but on average have 1 fault per 10 m².
What is the probability that a sheet 5 m × 8 m will have at most one fault?
2. If 250 litres of water are known to be polluted with 10⁶ bacteria what is the probability that a
sample of 1 cc of the water contains no bacteria?
3. Suppose vehicles arrive at a signalised road intersection at an average rate of 360 per hour and
the cycle of the traffic lights is set at 40 seconds. In what percentage of cycles will the number
of vehicles arriving be (a) exactly 5, (b) less than 5? If, after the lights change to green, there
is time to clear only 5 vehicles before the signal changes to red again, what is the probability
that waiting vehicles are not cleared in one cycle?
4. Previous results indicate that 1 in 1000 transistors are defective on average.
(a) Find the probability that there are 4 defective transistors in a batch of 2000.
(b) What is the largest number, N , of transistors that can be put in a box so that the
probability of no defectives is at least 1/2?
5. A manufacturer sells a certain article in batches of 5000. By agreement with a customer the
following method of inspection is adopted: A sample of 100 items is drawn at random from
each batch and inspected. If the sample contains 4 or fewer defective items, then the batch
is accepted by the customer. If more than 4 defectives are found, every item in the batch is
inspected. If inspection costs are 75 p per hundred articles, and the manufacturer normally
produces 2% of defective articles, find the average inspection costs per batch.
6. A book containing 150 pages has 100 misprints. Find the probability that a particular page
contains (a) no misprints, (b) 5 misprints, (c) at least 2 misprints, (d) more than 1 misprint.
7. For a particular machine, the probability that it will break down within a week is 0.009. The
manufacturer has installed 800 machines over a wide area. Calculate the probability that (a)
5, (b) 9, (c) less than 5, (d) more than 4 machines break down in a week.
8. At a given university, the probability that a member of staff is absent on any one day is 0.001.
If there are 800 members of staff, calculate the probabilities that the number absent on any
one day is (a) 6, (b) 4, (c) 2, (d) 0, (e) less than 3, (f) more than 1.
9. The number of failures occurring in a machine of a certain type in a year has a Poisson
distribution with mean 0.4. In a factory there are ten of these machines. What is
(a) the expected total number of failures in the factory in a year?

(b) the probability that there are fewer than two failures in the factory in a year?
10. A factory uses tools of a particular type. From time to time failures in these tools occur and
they need to be replaced. The number of such failures in a day has a Poisson distribution with
mean 1.25. At the beginning of a particular day there are five replacement tools in stock. A
new delivery of replacements will arrive after four days. If all five spares are used before the
new delivery arrives then further replacements cannot be made until the delivery arrives.
Find
(a) the probability that three replacements are required over the next four days.
(b) the expected number of replacements actually made over the next four days.
Answers
1. Poisson Process. In a sheet of size 40 m² we expect 4 faults
∴ λ = 4, $P(X = r) = \lambda^r e^{-\lambda} / r!$
$P(X \le 1) = P(X = 0) + P(X = 1) = e^{-4} + 4e^{-4} = 0.0916$

2. In 1 cc we expect 4 bacteria (= 10⁶/250000) ∴ λ = 4
$P(X = 0) = e^{-4} = 0.0183$
3. In 40 seconds we expect 4 vehicles ∴ λ = 4
(a) $P(\text{exactly 5}) = \lambda^5 e^{-\lambda} / 5! = 0.15629$ i.e. in 15.6% of cycles
(b) $P(\text{less than 5}) = e^{-\lambda} \left( 1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \frac{\lambda^4}{4!} \right) = e^{-4} \left( 1 + 4 + 8 + \frac{32}{3} + \frac{32}{3} \right) = 0.6288$
Vehicles will not be cleared if more than 5 are waiting.
P(greater than 5) = 1 − P(exactly 5) − P(less than 5)
= 1 − 0.15629 − 0.6288 = 0.2149
4. (a) Poisson approximation to binomial
$\lambda = np = 2000 \times \frac{1}{1000} = 2$
$P(X = 4) = \lambda^4 e^{-\lambda} / 4! = 16 e^{-2} / 24 = 0.09022$
(b) $\lambda = Np = N/1000$; $P(X = 0) = \frac{\lambda^0 e^{-\lambda}}{0!} = e^{-\lambda} = e^{-N/1000}$
$e^{-N/1000} = 0.5$ ∴ $\frac{-N}{1000} = \ln(0.5)$
∴ N = 693.147; choose N = 693 or less.
5. P(defective) = 0.02. Poisson approximation to binomial λ = np = 100(0.02) = 2
P(4 or fewer defectives in sample of 100)
= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
$= e^{-2} + 2e^{-2} + \frac{2^2}{2} e^{-2} + \frac{2^3}{3!} e^{-2} + \frac{2^4}{4!} e^{-2} = 0.947347$
Inspection costs
Cost c      75         75 × 50
P(X = c)    0.947347   0.052653
E(Cost) = 75(0.947347) + 75 × 50(0.052653) = 268.5 p
6. (a) 0.51342 (b) 0.00056, (c) 0.14430, (d) 0.14430

7. (a) 0.12038, (b) 0.10698, (c) 0.15552, (d) 0.84448
8. (a) 0.00016, (b) 0.00767, (c) 0.14379, (d) 0.44933, (e) 0.95258, (f) 0.19121
9. Let X be the total number of failures.
(a) E(X) = 10 × 0.4 = 4.

(b) $P(X < 2) = P(X = 0) + P(X = 1) = e^{-4} + 4e^{-4} = 5e^{-4} = 0.0916$.
10. Let the number required over 4 days be X. Then E(X) = 4 × 1.25 = 5 and X ∼ Poisson(5).
(a) $P(X = 3) = \frac{e^{-5} 5^3}{3!} = 0.1404$.
(b) Let R be the number of replacements made.
E(R) = 0 × P(X = 0) + · · · + 4 × P(X = 4) + 5 × P(X ≥ 5),
and
P(X ≥ 5) = 1 − [P(X = 0) + · · · + P(X = 4)]
so E(R) = 5 − 5 × P(X = 0) − · · · − 1 × P(X = 4)
$= 5 - e^{-5} \left( 5 \times \frac{5^0}{0!} + 4 \times \frac{5^1}{1!} + \cdots + 1 \times \frac{5^4}{4!} \right)$
= 5 − 0.8773
= 4.123.
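Answer 10 amounts to computing the mean of a Poisson count that is capped at the number of spares in stock. The sketch below checks both parts numerically; the names `poisson_pmf` and `expected_capped` are assumptions introduced for this illustration.

```python
from math import exp, factorial


def poisson_pmf(lam: float, r: int) -> float:
    """P(X = r) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** r / factorial(r)


def expected_capped(lam: float, cap: int) -> float:
    """E[min(X, cap)]: every demand of cap or more counts as exactly cap replacements."""
    below = [poisson_pmf(lam, r) for r in range(cap)]  # P(X = 0), ..., P(X = cap - 1)
    p_at_least_cap = 1 - sum(below)
    return sum(r * p for r, p in enumerate(below)) + cap * p_at_least_cap


lam = 4 * 1.25  # four days at a mean of 1.25 failures per day
print(round(poisson_pmf(lam, 3), 4))      # 0.1404
print(round(expected_capped(lam, 5), 3))  # 4.123
```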
The Hypergeometric
Distribution 37.4
Introduction
The hypergeometric distribution enables us to deal with situations arising when we sample from
batches with a known number of defective items. In essence, the number of defective items in a
batch is not a random variable - it is a known, fixed, number.

Prerequisites
Before starting this Section you should . . .
• understand the notation ${}^{n}C_r$ used in probability calculations

Learning Outcomes
On completion you should be able to . . .
• apply the hypergeometric distribution to simple examples

1. The Hypergeometric distribution
Suppose we are sampling without replacement from a batch of items containing a variable number of
defectives. We are essentially assuming that we know the probability p that a given item is defective
but not the actual number of defective items contained in the batch. The number of defective items
in the batch is a random variable in this case.
When we sample from the batch, we are left with:
1. a smaller batch;
2. a (possibly) smaller (but still variable) number of defective items. The number of defective
items is still a random variable.
While the probability of finding a given number of defectives in a sample drawn from the second
batch will (in general) be different from the probability of finding a given number of defectives in a
sample drawn from the first batch, sampling from both batches may be described by the binomial
distribution for which:
$$P(X = r) = {}^{n}C_r \, (1 - p)^{n-r} p^r$$
Sampling in this case varies the values of n and p in general but not the underlying distribution
describing the sampling process.
Example 18
A batch of 100 piston rings is known to contain 10 defective rings. If two piston
rings are drawn from the batch, write down the probabilities that:
(a) the first ring is defective;

(b) the second ring is defective given that the first one is defective.
Solution
(a) The probability that the first ring is defective is clearly $\frac{10}{100} = \frac{1}{10}$.
(b) Assuming that the first ring selected is defective and we do not replace it, the probability
that the second ring is defective is equally clearly $\frac{9}{99} = \frac{1}{11}$.
The hypergeometric distribution may be thought of as arising from sampling from a batch of items
where the number of defective items contained in the batch is known.
Essentially the number of defectives contained in the batch is not a random variable, it is fixed.
The calculations involved when using the hypergeometric distribution are usually more complex than
their binomial counterparts.
If we sample without replacement we may proceed in general as follows:
• we may select n items from a population of N items in ${}^{N}C_n$ ways;
• we may select r defective items from M defective items in ${}^{M}C_r$ ways;
• we may select n − r non-defective items from N − M non-defective items in ${}^{N-M}C_{n-r}$ ways;
• hence we may select n items containing r defectives in ${}^{M}C_r \times {}^{N-M}C_{n-r}$ ways;
• hence the probability that we select a sample of size n containing r defective items from a
population of N items known to contain M defective items is
$$\frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$$
Key Point 10
Hypergeometric Distribution
The distribution given by
$$P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$$
which describes the probability of obtaining a sample of size n containing r defective items from
a population of size N known to contain M defective items is known as the hypergeometric
distribution.
Example 19
A batch of 10 rocker cover gaskets contains 4 defective gaskets. If we draw samples
of size 3 without replacement, from the batch of 10, find the probability that a
sample contains 2 defective gaskets.
Solution
Using $P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$ we know that N = 10, M = 4, n = 3 and r = 2.
Hence $P(X = 2) = \frac{{}^{4}C_2 \times {}^{6}C_1}{{}^{10}C_3} = \frac{6 \times 6}{120} = 0.3$
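The hypergeometric formula maps directly onto Python's `math.comb`. The following is a minimal sketch; the function name `hypergeom_pmf` and its argument order (N, M, n, r) are choices made here for illustration.

```python
from math import comb


def hypergeom_pmf(N: int, M: int, n: int, r: int) -> float:
    """P(X = r): r defectives in a sample of n drawn without replacement
    from a population of N items of which M are defective."""
    if r < 0 or r > n or r > M or n - r > N - M:
        return 0.0  # impossible outcome
    return comb(M, r) * comb(N - M, n - r) / comb(N, n)


print(hypergeom_pmf(10, 4, 3, 2))  # 0.3

# The probabilities over all possible r should sum to 1 (up to rounding error).
print(sum(hypergeom_pmf(10, 4, 3, r) for r in range(4)))
```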
It is possible to derive formulae for the mean and variance of the hypergeometric distribution. However,
the calculations are more difficult than their binomial counterparts, so we will simply state the
results.
Key Point 11
Expectation and Variance of the Hypergeometric Distribution
The expectation (mean) and variance of the hypergeometric random variable
$$P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$$
are given by
$$E(X) = \mu = np \quad \text{and} \quad V(X) = np(1 - p) \frac{N - n}{N - 1} \quad \text{where } p = \frac{M}{N}$$
Example 20
For the previous Example, concerning rocker cover gaskets, find the expectation
and variance of the number of defective gaskets in a sample of size 3.
Solution
Using $P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$ we know that N = 10, M = 4 and n = 3, so p = M/N = 0.4.
Hence
$$E(X) = np = 3 \times \frac{4}{10} = 1.2$$
and
$$V(X) = np(1 - p) \frac{N - n}{N - 1} = 3 \times \frac{4}{10} \times \frac{6}{10} \times \frac{10 - 3}{10 - 1} = 0.56$$
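The mean and variance can be checked by summing directly over the distribution, which avoids relying on any closed form. In this sketch the function name `hypergeom_moments` is hypothetical; the final line evaluates the standard closed form np(1 − p)(N − n)/(N − 1), with p = M/N, for comparison.

```python
from math import comb


def hypergeom_moments(N: int, M: int, n: int) -> tuple[float, float]:
    """Mean and variance of the number of defectives in a sample of n,
    obtained by direct summation over the hypergeometric probabilities."""
    total = comb(N, n)
    lo, hi = max(0, n - (N - M)), min(n, M)  # feasible range of r
    pmf = {r: comb(M, r) * comb(N - M, n - r) / total for r in range(lo, hi + 1)}
    mean = sum(r * p for r, p in pmf.items())
    var = sum((r - mean) ** 2 * p for r, p in pmf.items())
    return mean, var


N, M, n = 10, 4, 3  # the rocker cover gasket data
mean, var = hypergeom_moments(N, M, n)
p = M / N
print(round(mean, 4), round(var, 4))                  # 1.2 0.56
print(round(n * p * (1 - p) * (N - n) / (N - 1), 4))  # 0.56
```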
Task
In the manufacture of car tyres, a particular production process is known to yield 10
tyres with defective walls in every batch of 100 tyres produced. From a production
batch of 100 tyres, a sample of 4 is selected for testing to destruction. Find:
(a) the probability that the sample contains 1 defective tyre

(b) the expectation of the number of defectives in samples of size 4
(c) the variance of the number of defectives in samples of size 4.
Your solution
Answer
Sampling is clearly without replacement and we use the hypergeometric distribution with
N = 100, M = 10, n = 4, r = 1 and p = 0.1. Hence:
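(a) $P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$ gives
$$P(X = 1) = \frac{{}^{10}C_1 \times {}^{100-10}C_{4-1}}{{}^{100}C_4} = \frac{10 \times 117480}{3921225} \approx 0.3$$
(b) The expectation is E(X) = np = 4 × 0.1 = 0.4
(c) The variance is $V(X) = np(1 - p) \frac{N - n}{N - 1} = 0.4 \times 0.9 \times \frac{96}{99} \approx 0.35$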
Task
A company (the producer) supplies microprocessors to a manufacturer (the con-
sumer) of electronic equipment. The microprocessors are supplied in batches of
50. The consumer regards a batch as acceptable provided that there are not
more than 5 defective microprocessors in the batch. Rather than test all of the
microprocessors in the batch, 10 are selected at random and tested.
(a) Find the probability that out of a sample of 10, d = 0, 1, 2, 3, 4, 5 are defec-
tive when there are actually 5 defective microprocessors in the batch.
(b) Suppose that the consumer will accept the batch provided that not more
than m defectives are found in the sample of 10.
(i) Find the probability that the batch is accepted when there are 5 defec-
tives in the batch.
(ii) Find the probability that the batch is rejected when there are 3 defec-
tives in the batch.
Your solution
Answer
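(a) Let X = the number of defectives in a sample. Then
$$P(X = d) = \frac{{}^{45}C_{10-d} \times {}^{5}C_d}{{}^{50}C_{10}}$$
Hence
$$P(X = 0) = \frac{{}^{45}C_{10} \times {}^{5}C_0}{{}^{50}C_{10}} = 0.311 \qquad P(X = 1) = \frac{{}^{45}C_9 \times {}^{5}C_1}{{}^{50}C_{10}} = 0.431$$
$$P(X = 2) = \frac{{}^{45}C_8 \times {}^{5}C_2}{{}^{50}C_{10}} = 0.210 \qquad P(X = 3) = \frac{{}^{45}C_7 \times {}^{5}C_3}{{}^{50}C_{10}} = 0.044$$
$$P(X = 4) = \frac{{}^{45}C_6 \times {}^{5}C_4}{{}^{50}C_{10}} = 0.004 \qquad P(X = 5) = \frac{{}^{45}C_5 \times {}^{5}C_5}{{}^{50}C_{10}} = 0.0001$$
(b) (i) Case D = 5
P(Accept batch with 5 defectives) is
$$\sum_{d=0}^{m} P(X = d) = \sum_{d=0}^{m} \frac{{}^{45}C_{10-d} \times {}^{5}C_d}{{}^{50}C_{10}}, \qquad m \le 5$$
(b) (ii) Case D = 3
P(Reject batch with 3 defectives) is
$$1 - \sum_{d=0}^{m} P(X = d) = 1 - \sum_{d=0}^{m} \frac{{}^{47}C_{10-d} \times {}^{3}C_d}{{}^{50}C_{10}}, \qquad m \le 3$$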
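Because the acceptance threshold m is left general in part (b), it can be helpful to tabulate the two sums for several values of m. The sketch below does this under stated assumptions: the helper names `hypergeom_pmf` and `prob_accept` are introduced here, and the range of m values printed is an arbitrary illustrative choice.

```python
from math import comb


def hypergeom_pmf(N: int, D: int, n: int, d: int) -> float:
    """P(d defectives in a sample of n from a batch of N containing D defectives)."""
    if d < 0 or d > D or n - d < 0 or n - d > N - D:
        return 0.0  # impossible outcome
    return comb(D, d) * comb(N - D, n - d) / comb(N, n)


def prob_accept(N: int, D: int, n: int, m: int) -> float:
    """The batch is accepted when the sample contains at most m defectives."""
    return sum(hypergeom_pmf(N, D, n, d) for d in range(m + 1))


for m in range(4):
    accept_5 = prob_accept(50, 5, 10, m)      # part (b)(i): 5 defectives in the batch
    reject_3 = 1 - prob_accept(50, 3, 10, m)  # part (b)(ii): 3 defectives in the batch
    print(f"m={m}: P(accept | D=5) = {accept_5:.3f}, P(reject | D=3) = {reject_3:.3f}")
```

A table of this kind shows the trade-off the consumer faces: raising m makes it more likely that a batch with 5 defectives is accepted and less likely that a batch with only 3 is rejected.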
Exercise
A company buys batches of n components. Before a batch is accepted, m of the components are
selected at random from the batch and tested. The batch is rejected if more than d components in
the sample are found to be below standard.
(a) Find the probability that a batch which actually contains six below-standard components
is rejected when n = 20, m = 5 and d = 1.
(b) Find the probability that a batch which actually contains nine below-standard components
is rejected when n = 30, m = 10 and d = 1.
Answer
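(a) Let the number of below-standard components in the sample be X. The probability of
acceptance is
$$P(X = 0) + P(X = 1) = \frac{\binom{14}{5} \binom{6}{0}}{\binom{20}{5}} + \frac{\binom{14}{4} \binom{6}{1}}{\binom{20}{5}}$$
$$= \frac{\frac{14}{5} \times \frac{13}{4} \times \frac{12}{3} \times \frac{11}{2} \times \frac{10}{1} + \frac{14}{4} \times \frac{13}{3} \times \frac{12}{2} \times \frac{11}{1} \times \frac{6}{1}}{\frac{20}{5} \times \frac{19}{4} \times \frac{18}{3} \times \frac{17}{2} \times \frac{16}{1}}$$
$$= \frac{2002 + 6006}{15504}$$
= 0.5165
Hence the probability of rejection is 1 − 0.5165 = 0.4835.

(b) Let the number of below-standard components in the sample be X. The probability of
acceptance is
$$P(X = 0) + P(X = 1) = \frac{\binom{21}{10} \binom{9}{0}}{\binom{30}{10}} + \frac{\binom{21}{9} \binom{9}{1}}{\binom{30}{10}}$$
Now
$$\binom{21}{10} \binom{9}{0} = \frac{21}{10} \times \frac{20}{9} \times \frac{19}{8} \times \frac{18}{7} \times \frac{17}{6} \times \frac{16}{5} \times \frac{15}{4} \times \frac{14}{3} \times \frac{13}{2} \times \frac{12}{1} = 352716$$
$$\binom{21}{9} \binom{9}{1} = \frac{21}{9} \times \frac{20}{8} \times \frac{19}{7} \times \frac{18}{6} \times \frac{17}{5} \times \frac{16}{4} \times \frac{15}{3} \times \frac{14}{2} \times \frac{13}{1} \times \frac{9}{1} = 2645370$$
$$\binom{30}{10} = \frac{30}{10} \times \frac{29}{9} \times \frac{28}{8} \times \frac{27}{7} \times \frac{26}{6} \times \frac{25}{5} \times \frac{24}{4} \times \frac{23}{3} \times \frac{22}{2} \times \frac{21}{1} = 30045015$$
So the probability of acceptance is
$$\frac{352716 + 2645370}{30045015} = 0.0998$$
Hence the probability of rejection is 1 − 0.0998 = 0.9002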
Contents 38
Continuous Probability
Distributions
38.1 Continuous Probability Distributions 2

38.2 The Uniform Distribution 18
38.3 The Exponential Distribution 23
Learning outcomes
In this Workbook you will learn what a continuous random variable is. You will find out how
to determine the expectation and variance of a continuous random variable which are
measures of the centre and spread of the distribution. You will learn about two distributions
important in engineering - uniform and exponential.
Continuous
Probability
Distributions 38.1
Introduction
It is often possible to model real systems by using the same or similar random experiments and
their associated random variables. Random variables may be classified in two distinct categories
called discrete random variables and continuous random variables. Discrete random variables can
take values which are discrete and which can be written in the form of a list. In contrast, continuous
random variables can take values anywhere within a specified range. This Section will familiarize
you with continuous random variables and their associated probability distributions. This Workbook
makes no attempt to cover the whole of this large and important branch of statistics. The most
commonly met continuous random variables in engineering are the Uniform, Exponential, Normal and
Weibull distributions. The Uniform and Exponential distributions are introduced in Sections 38.2 and
38.3, while the Normal distribution and the Weibull distribution are covered in Workbooks 39 and
46 respectively.

Prerequisites
Before starting this Section you should . . .
• be familiar with the concepts of expectation and variance

Learning Outcomes
• explain what is meant by the term continuous random variable
• explain what is meant by the term continuous
• use two continuous distributions which are important to engineers
1. Continuous probability distributions

In order to get a good understanding of continuous probability distributions it is advisable to start by
answering some fairly obvious questions such as: “What is a continuous random variable?” “Is there
any carry over from the work we have already done on discrete random variables and distributions?”
We shall start with some basic concepts and definitions.
Continuous random variables

In day-to-day situations met by practising engineers, experiments such as measuring current in a
piece of wire or measuring the dimensions of machined components play a part. However closely an
engineer tries to control an experiment, there will always be small variations in the results obtained
due to factors outside the control of the engineer. Such influences
include changes in ambient temperature, which may affect the accuracy of the measuring devices used, and
slight variations in the chemical composition of the materials used to produce the objects (wire and
machined components in this case) under investigation. In the case of machined components, many
of the small variations seen in measurements may be due to the influence of vibration, cutting tool
wear in the machine producing the component, changes in raw material used and the process used
to refine it and even the measurement process itself!
Such variations (in current and length, for example) can be represented by a random variable and it is
customary to define an interval, finite or infinite, within which variation can take place. Since such
a variable (X say) can assume any value within an interval we say that the variable is continuous
rather than discrete: its values form an entity we can think of as a continuum.
The following definition summarizes the situation.
Definition
A random variable X is said to be continuous if it can assume any value in a given interval. This
contrasts with the definition of a discrete random variable which can only assume discrete values.
Practical example
This example will help you to see how continuous random variables arise and will help you to distin-
guish between continuous and discrete random variables.
Consider a de-magnetised compass needle mounted at its centre so that it can spin freely. Its initial
position is shown in Figure 1(a). It is spun clockwise and when it comes to rest the angle θ, from
the vertical, is measured. See Figure 1(b).
[Figure 1: the compass needle (a) in its initial position and (b) at rest after a spin, at angle θ from the vertical]
Let X be the random variable
“angle θ measured after each spin”
Firstly, note that X is a random variable since it can take any value in the interval 0 to 2π and
we cannot be sure in advance which value it will take. However, after each spin and thinking in
probability terms, there are certainly two distinct questions we can ask.
• What is the probability that X lies between two values a and b, i.e. what is P(a < X < b)?
• What is the probability that X assumes a particular value, say c, i.e. what is P(X = c)?
The first question is easy to answer provided we assume that the probability of the needle coming to
rest in a given interval is given by the formula:
$$\text{Probability} = \frac{\text{Given interval in radians}}{\text{Total interval in radians}} = \frac{\text{Given interval in radians}}{2\pi}$$
The following results are easily obtained and they clearly coincide with what we intuitively feel is
correct:
(a) $P\left(0 < X < \dfrac{\pi}{2}\right) = \dfrac{1}{4}$ since the interval (0, π/2) covers one quarter of a full circle.
(b) $P\left(\dfrac{\pi}{2} < X < 2\pi\right) = \dfrac{3}{4}$ since the interval (π/2, 2π) covers three quarters of a full circle.
It is easy to see the generalization of this result for the interval (a, b), in which both a, b lie in the
interval (0, 2π) :
$$P(a < X < b) = \frac{b - a}{2\pi}$$
The second question immediately presents problems! Answering a question of this kind would
require a measuring device (e.g. a protractor) with infinite precision: no such device exists nor could
one ever be constructed. Hence it can never be verified that the needle, after spinning, takes any
particular value; all we can be reasonably sure of is that the needle lies between two particular
values.
We conclude that in experiments of this kind we never determine the probability that the random
variable assumes a particular value but only calculate the probability that it lies within a given range of
values. This kind of random variable is called a continuous random variable and it is characterised,
not by probabilities of the type P(X = c) (as was the case with a discrete random variable), but by
a function f (x) called the probability density function (pdf for short). In the case of the rotating
needle this function takes the simple form given with corresponding plot:
$$f(x) = \begin{cases} \dfrac{1}{2\pi}, & 0 \le x < 2\pi \\ 0, & \text{elsewhere} \end{cases}$$
[Figure 2: (a) definition of the pdf; (b) plot of the pdf, a rectangle of height 1/(2π) over 0 ≤ x < 2π with area 1]
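The probability P(a < X < b) is the area under the curve f(x) and so is given by the
integral
$$\int_a^b f(x)\,dx$$
Suppose we wanted to find $P\left(\frac{\pi}{6} < X < \frac{\pi}{4}\right)$. Then using the definition of the pdf for this case:
$$P\left(\frac{\pi}{6} < X < \frac{\pi}{4}\right) = \int_{\pi/6}^{\pi/4} \frac{1}{2\pi}\,dx = \left[\frac{x}{2\pi}\right]_{\pi/6}^{\pi/4} = \frac{1}{2\pi}\left[\frac{\pi}{4} - \frac{\pi}{6}\right] = \frac{1}{2\pi}\times\frac{\pi}{12} = \frac{1}{24}$$
This is reasonable since the interval $\left(\frac{\pi}{6}, \frac{\pi}{4}\right)$ is one twenty-fourth of the interval 0 to 2π.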
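The needle calculation can also be explored by simulation. The sketch below is illustrative only: it assumes the needle angle is uniformly distributed on 0 to 2π, as in the text, and the function names `needle_probability` and `simulate` are hypothetical.

```python
import math
import random

def needle_probability(a, b):
    """Exact P(a < X < b) for the needle, valid for 0 <= a <= b <= 2*pi."""
    return (b - a) / (2 * math.pi)

def simulate(a, b, spins=200_000, seed=0):
    """Estimate the same probability as the fraction of simulated spins
    that come to rest between a and b."""
    rng = random.Random(seed)
    hits = sum(a < rng.uniform(0, 2 * math.pi) < b for _ in range(spins))
    return hits / spins

exact = needle_probability(math.pi / 6, math.pi / 4)
print(exact)                                  # close to 1/24
print(simulate(math.pi / 6, math.pi / 4))     # should be near the exact value
```

The simulated value varies with the seed and the number of spins; it only approximates the exact area under the pdf.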
In general terms we have
$$P(a < X < b) = \int_a^b f(t)\,dt = F(b) - F(a) = \frac{b - a}{2\pi}$$
for the pdf under consideration here. Note also that
(a) $f(x) \ge 0$ for all real x
(b) $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{2\pi} \frac{1}{2\pi}\,dx = 1$, i.e. total probability is 1.
We are now in a position to give a formal definition of a continuous random variable in Key Point 1.
Key Point 1
X is said to be a continuous random variable if there exists a function f(x) associated with X
called the probability density function with the properties
• $f(x) \ge 0$ for all x
• $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$
• $\displaystyle P(a < X < b) = \int_a^b f(x)\,dx = F(b) - F(a)$
The first two bullet points in Key Point 1 are the analogues of the results $P(X = x_i) \ge 0$ and
$\sum_i P(X = x_i) = 1$ for discrete random variables.
Task
Which of the following are not probability density functions?
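(i) [Figure: graph of f(x) against x, with 1 marked on the f(x) axis and 1 on the x axis]
(ii) [Figure: graph of f(x) against x, with 1 marked on the f(x) axis and 2 on the x axis]
(iii) $f(x) = \begin{cases} x^2 - 4x + \dfrac{10}{3}, & 0 \le x \le 3 \\ 0, & \text{elsewhere} \end{cases}$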
Check whether the first two statements in Key Point 1 are satisfied for each pdf above:
Your solution
For (i)
Answer
(i) We can write $f(x) = \begin{cases} x, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$
$f(x) \ge 0$ for all x but $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2} \ne 1$.
Thus this function is not a valid probability density function because the integral’s value is not 1.
Your solution
For (ii)
Answer
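(ii) Note that $f(x) = \begin{cases} 1 - \dfrac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$ so $f(x) \ge 0$ for all x.
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^2 \left(1 - \frac{1}{2}x\right) dx = \left[x - \frac{x^2}{4}\right]_0^2 = 2 - 1 = 1$$
(Alternatively, the area of the triangle is $\frac{1}{2} \times 1 \times 2 = 1$.)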
This implies that f (x) is a valid probability density function.
Your solution
For (iii)
Answer
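(iii)
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^3 \left(x^2 - 4x + \frac{10}{3}\right) dx = \left[\frac{x^3}{3} - 2x^2 + \frac{10}{3}x\right]_0^3 = (9 - 18 + 10) = 1$$
but f(x) < 0 for part of the interval 0 ≤ x ≤ 3 (for example f(2) = −2/3; in fact f(x) < 0 for $2 - \sqrt{2/3} < x < 2 + \sqrt{2/3}$). Hence (iii) is not a pdf.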
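The two checks used in this Task (total area equal to 1, and no negative values) can be sketched numerically. This is an illustration under stated assumptions: the helper names `integrate` and `check_pdf` are hypothetical, the integral is approximated with a simple midpoint rule, and non-negativity is only sampled on a grid rather than proved.

```python
def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def check_pdf(f, lo, hi, steps=100_000):
    """Return (area, lowest sampled value) of f on [lo, hi].
    f is taken to be zero outside [lo, hi]."""
    h = (hi - lo) / steps
    area = integrate(f, lo, hi, steps)
    lowest = min(f(lo + i * h) for i in range(steps + 1))
    return area, lowest

candidates = {
    "(i)": (lambda x: x, 0, 1),
    "(ii)": (lambda x: 1 - x / 2, 0, 2),
    "(iii)": (lambda x: x * x - 4 * x + 10 / 3, 0, 3),
}

results = {name: check_pdf(f, lo, hi) for name, (f, lo, hi) in candidates.items()}
for name, (area, lowest) in results.items():
    # Small tolerances absorb floating-point rounding.
    valid = abs(area - 1) < 1e-6 and lowest >= -1e-12
    print(name, round(area, 4), round(lowest, 4), valid)
```

Only (ii) passes both checks: (i) fails on area and (iii) fails on sign even though its area is 1.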
Task
Find the probability that X takes a value between −1 and 1 when the pdf is given
by the following figure.
[Figure: triangular pdf f(x), rising linearly from 0 at x = −2 to a peak of height k at x = 0 and falling linearly to 0 at x = 2]
First find k:
Your solution
Answer
$$\int_{-\infty}^{\infty} f(x)\,dx = \text{area under curve} = \text{area of triangle} = \frac{1}{2} \times 4 \times k = 2k$$
Also $\displaystyle\int_{-\infty}^{\infty} f(x)\,dx = 1$, so 2k = 1 hence $k = \frac{1}{2}$.
State the formula for f (x):

Your solution
Answer
$$f(x) = \begin{cases} \dfrac{1}{2} - \dfrac{1}{4}x, & 0 \le x \le 2 \\[2mm] \dfrac{1}{2} + \dfrac{1}{4}x, & -2 \le x < 0 \\[2mm] 0, & \text{elsewhere.} \end{cases}$$
Write down an integral to represent P(−1 < X < 1). Use symmetry to evaluate the integral.
Your solution
Answer
$$\int_{-1}^{1} f(x)\,dx = 2\int_0^1 \left(\frac{1}{2} - \frac{1}{4}x\right) dx = 2\left[\frac{1}{2}x - \frac{1}{8}x^2\right]_0^1 = 2\left(\frac{1}{2} - \frac{1}{8}\right) = \frac{3}{4}$$
The cumulative distribution function

Analogous to the formula for the cumulative distribution function:
$$F(x) = \sum_{x_i \le x} P(X = x_i)$$
used in the case of a discrete random variable X with associated probabilities $P(X = x_i)$, we define
a cumulative probability distribution function F(x) by means of the integral (being a form of a
sum):
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
The cdf represents the probability of observing a value less than or equal to x.
Task
For the pdf in the diagram below
[Figure: triangular pdf f(x), zero outside −2 ≤ x ≤ 2, with peak height 1/2 at x = 0]
obtain the cdf and verify the result obtained in the previous Task for
P(−1 ≤ X ≤ 1).
Your solution
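Answer
$$F(x) = \begin{cases} 0, & x \le -2 \\[1mm] \dfrac{1}{2} + \dfrac{1}{2}x + \dfrac{1}{8}x^2, & -2 < x < 0 \\[2mm] \dfrac{1}{2} + \dfrac{1}{2}x - \dfrac{1}{8}x^2, & 0 \le x < 2 \\[2mm] 1, & x \ge 2 \end{cases}$$
$$P(-1 \le X \le 1) = F(1) - F(-1) = \left(\frac{1}{2} + \frac{1}{2} - \frac{1}{8}\right) - \left(\frac{1}{2} - \frac{1}{2} + \frac{1}{8}\right) = \frac{1}{2} + \frac{1}{2} - \frac{1}{8} - \frac{1}{8} = \frac{3}{4}.$$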
Example 1
Traditional electric light bulbs are known to have a mean lifetime to failure of 2000
hours. It is also known that the distribution function p(t) of the time to failure
takes the form
$$p(t) = 1 - e^{-t/\mu}$$
where µ is the mean time to failure. You will see if you study the topic of reliability
in more detail that this is a realistic distribution function. The reliability function
r(t) , giving the probability that the light bulb is still working at time t, is defined
as
$$r(t) = 1 - p(t) = e^{-t/\mu}$$
Find the proportion of light bulbs that you would expect to fail before 1500 hours
and the proportion you would expect to last longer than 2500 hours.
Solution
Let T be the random variable ‘time to failure’.
The proportion of bulbs expected to fail before 1500 hours is given as
$$P(T < 1500) = 1 - e^{-1500/2000} = 1 - e^{-3/4} = 1 - 0.4724 = 0.5276$$
The proportion of bulbs expected to last longer than 2500 hours is given as
$$P(T > 2500) = 1 - P(T \le 2500) = e^{-2500/2000} = e^{-5/4} = 0.2865.$$
Using r(t) = 1 − p(t) we have r(2500) = 0.2865.
Hence we expect just under 53% of light bulbs to fail before 1500 hours service and just under 29%
of light bulbs to give over 2500 hours service.
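The two proportions in Example 1 follow directly from p(t) and r(t). A minimal sketch, with function names chosen here for illustration:

```python
import math

def p_fail_before(t, mu):
    """Distribution function p(t) = 1 - exp(-t/mu)."""
    return 1 - math.exp(-t / mu)

def reliability(t, mu):
    """Reliability function r(t) = exp(-t/mu)."""
    return math.exp(-t / mu)

mu = 2000  # mean lifetime in hours, from Example 1
print(round(p_fail_before(1500, mu), 4))  # proportion failing before 1500 hours
print(round(reliability(2500, mu), 4))    # proportion lasting beyond 2500 hours
```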
Mean and variance of a continuous distribution

You will probably have realised by now that, essentially, the definitions of discrete and continuous
random variables are virtually the same provided we use the analogues given in the following table:
Quantity          Discrete Variable                 Continuous Variable
Probability       P(X = x)                          f(x) dx
Allowed Values    P(X = x) ≥ 0                      f(x) ≥ 0
Summation         Σ P(X = x)                        ∫ f(x) dx
Expectation       E(X) = Σ x P(X = x)               ?
Variance          V(X) = Σ (x − µ)² P(X = x)        ?
Completing the above table of analogues to write down the mean and variance of a continuous
variable leads to the obvious definitions given in Key Point 2:
Key Point 2
Let X be a continuous random variable with associated pdf f(x). Then its expectation and variance
denoted by E(X) (or µ) and V(X) (or σ²) respectively are given by:
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
and
$$\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$$
As with discrete random variables the variance V(X) can be written in an alternative form, more
amenable to calculation:
$$V(X) = E(X^2) - [E(X)]^2$$
where $\displaystyle E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.
Task
For the variable X with pdf
$$f(x) = \begin{cases} \dfrac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$$
find E(X) and then V(X).
First find E(X):

Your solution
Answer
$$E(X) = \int_0^2 \frac{1}{2}x \cdot x\,dx = \left[\frac{1}{6}x^3\right]_0^2 = \frac{8}{6} = \frac{4}{3}.$$
Now find $E(X^2)$:

Your solution
Answer
$$E(X^2) = \int_0^2 \frac{1}{2}x^2 \cdot x\,dx = \left[\frac{1}{8}x^4\right]_0^2 = 2.$$
Now find V(X):

Your solution
Answer
$$V(X) = E(X^2) - \{E(X)\}^2 = 2 - \frac{16}{9} = \frac{2}{9}.$$
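The definitions in Key Point 2 translate directly into numerical integration. The sketch below is illustrative only; it uses a simple midpoint rule (the helper `integrate` is hypothetical) and the pdf f(x) = x/2 on 0 ≤ x ≤ 2 from this Task.

```python
def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def pdf(x):
    """pdf of the Task: x/2 on [0, 2], zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

mean = integrate(lambda x: x * pdf(x), 0, 2)               # E(X)
mean_square = integrate(lambda x: x * x * pdf(x), 0, 2)    # E(X^2)
variance = mean_square - mean ** 2                         # V(X) = E(X^2) - [E(X)]^2

print(mean, mean_square, variance)  # close to 4/3, 2 and 2/9
```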
Task
The mileage (in 1000s of miles) for which a certain type of tyre will last is a random
variable with pdf
$$f(x) = \begin{cases} \dfrac{1}{20}e^{-x/20}, & \text{for all } x > 0 \\ 0, & \text{for all } x < 0 \end{cases}$$
Find the probability that the tyre will last
(a) at most 10,000 miles;

(b) between 16,000 and 24,000 miles;
(c) at least 30,000 miles.
Your solution
Answer
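(a) Using $\displaystyle P(a < X < b) = \int_a^b f(x)\,dx$:
$$P(X < 10) = \int_{-\infty}^{10} f(x)\,dx = \int_0^{10} \frac{1}{20}e^{-x/20}\,dx = \left[-e^{-x/20}\right]_0^{10} = 1 - e^{-0.5} = 0.393$$
(b) $$P(16 < X < 24) = \int_{16}^{24} \frac{1}{20}e^{-x/20}\,dx = \left[-e^{-x/20}\right]_{16}^{24} = -e^{-1.2} + e^{-0.8} = 0.148$$
(c) $$P(X > 30) = \int_{30}^{\infty} \frac{1}{20}e^{-x/20}\,dx = \left[-e^{-x/20}\right]_{30}^{\infty} = e^{-1.5} = 0.223$$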
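All three tyre probabilities are differences of the cdf F(x) = 1 − e^{−x/20}, which is obtained by integrating the pdf. A minimal sketch, with the function name `cdf` chosen here for illustration:

```python
import math

def cdf(x, mean=20.0):
    """cdf of the tyre-life pdf: 1 - exp(-x/mean) for x > 0, else 0.
    x is in thousands of miles."""
    return 1 - math.exp(-x / mean) if x > 0 else 0.0

p_a = cdf(10)             # at most 10,000 miles
p_b = cdf(24) - cdf(16)   # between 16,000 and 24,000 miles
p_c = 1 - cdf(30)         # at least 30,000 miles

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
```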
Important continuous distributions

There are a number of continuous distributions which have important applications in engineering and
science. The areas of application and a little of the history (where appropriate) of the more important
and useful distributions will be discussed in the later Sections and other Workbooks devoted to each
of the distributions. Among the most important continuous probability distributions are:
(a) the Uniform or Rectangular distribution, where the random variable X is restricted to a
finite interval [a, b] and f(x) has constant density, often defined by a function of the form:
$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
(Section 38.2)
(b) the Exponential distribution defined by a probability density function of the form:
$$f(t) = \lambda e^{-\lambda t}, \qquad \lambda \text{ is a given constant}$$
(Section 38.3)
(c) the Normal distribution (often called the Gaussian distribution) where the random variable
X is defined by a probability density function of the form:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2/2\sigma^2}, \qquad \mu, \sigma \text{ are given constants}$$
(Workbook 39)
(d) the Weibull distribution where the random variable X is defined by a probability density
function of the form:
$$f(x) = \alpha\beta(\alpha x)^{\beta - 1} e^{-(\alpha x)^{\beta}}, \qquad \alpha, \beta \text{ are given constants}$$
(Section 46.1)
Exercises
1. A target is made of three concentric circles of radii $1/\sqrt{3}$, 1 and $\sqrt{3}$ metres. Shots within
the inner circle count 4 points, within the middle band 3 points and within the outer band 2
points. (Shots outside the target count zero.) The distance of a shot from the centre of the
target is a random variable R with density function $f(r) = \dfrac{2}{\pi(1 + r^2)}$, r > 0. Calculate the
expected value of the score after five shots.
2. A continuous random variable T has the following probability density function.
$$f_T(u) = \begin{cases} 0 & (u < 0) \\ 3(1 - u/k) & (0 \le u \le k) \\ 0 & (u > k) \end{cases}$$
Find
(a) k.
(b) E(T ).
(c) $E(T^2)$.
(d) V(T ).
3. A continuous random variable X has the following probability density function
$$f_X(u) = \begin{cases} 0 & (u < 0) \\ ku & (0 \le u \le 1) \\ 0 & (u > 1) \end{cases}$$
(a) Find k.
(b) Find the distribution function FX (u).
(c) Find E(X).
(d) Find V(X).
(e) Find $E(e^X)$.
(f) Find $V(e^X)$.
(g) Find the distribution function of $e^X$. (Hint: For what values of X is $e^X < u$?)
(h) Find the probability density function of $e^X$.
(i) Sketch fX (u).
(j) Sketch FX (u).
Answers
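1.
$$P(\text{inner circle}) = P\left(0 < r < \frac{1}{\sqrt{3}}\right) = \int_0^{1/\sqrt{3}} \frac{2}{\pi(1 + r^2)}\,dr = \frac{2}{\pi}\left[\tan^{-1} r\right]_0^{1/\sqrt{3}} = \frac{2}{\pi}\tan^{-1}\frac{1}{\sqrt{3}} = \frac{2}{\pi}\cdot\frac{\pi}{6} = \frac{1}{3}$$
$$P(\text{middle band}) = P\left(\frac{1}{\sqrt{3}} < r < 1\right) = \int_{1/\sqrt{3}}^{1} \frac{2}{\pi(1 + r^2)}\,dr = \frac{2}{\pi}\left[\tan^{-1} r\right]_{1/\sqrt{3}}^{1} = \frac{2}{\pi}\tan^{-1} 1 - \frac{1}{3} = \frac{1}{6}$$
$$P(\text{outer band}) = P(1 < r < \sqrt{3}) = \frac{2}{\pi}\left[\tan^{-1} r\right]_{1}^{\sqrt{3}} = \frac{2}{\pi}\tan^{-1}\sqrt{3} - \frac{1}{2} = \frac{1}{6}$$
$$P(\text{miss target}) = 1 - \frac{1}{6} - \frac{1}{6} - \frac{1}{3} = \frac{1}{3}$$
Let S be the random variable equal to ‘score’.
s            0      2      3      4
P(S = s)     1/3    1/6    1/6    1/3
$$E(S) = 0 + \frac{2}{6} + \frac{3}{6} + \frac{4}{3} = \frac{13}{6}$$
The expected score after 5 shots is this value times 5, which is $5 \times \dfrac{13}{6} = 10.83$.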
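The expected score can be checked from the cdf of R, which is (2/π) tan⁻¹ r for the density given in the Exercise. A minimal sketch; the variable names are illustrative.

```python
import math

def F(r):
    """cdf of the distance R: integral of 2/(pi(1+r^2)) from 0 to r."""
    return (2 / math.pi) * math.atan(r)

radii = [0.0, 1 / math.sqrt(3), 1.0, math.sqrt(3)]  # band boundaries in metres
scores = [4, 3, 2]                                   # inner circle, middle band, outer band

# Expected score of one shot: sum of score x probability of landing in that band.
expected_one = sum(s * (F(hi) - F(lo)) for s, lo, hi in zip(scores, radii, radii[1:]))
expected_five = 5 * expected_one

print(expected_one, expected_five)  # close to 13/6 and 10.83
```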
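2.
(a) $\displaystyle 1 = \int_0^k 3(1 - u/k)\,du = 3\left[u - \frac{u^2}{2k}\right]_0^k = 3(k - k/2)$ so k = 2/3.
(b) $$E(T) = \int_0^{2/3} 3u(1 - 3u/2)\,du = 3\int_0^{2/3} \left(u - 3u^2/2\right) du = 3\left[\frac{u^2}{2} - \frac{u^3}{2}\right]_0^{2/3} = 3\left(\frac{2}{9} - \frac{4}{27}\right) = 3\left(\frac{6 - 4}{27}\right) = \frac{2}{9}.$$
(c) $$E(T^2) = \int_0^{2/3} 3u^2(1 - 3u/2)\,du = 3\int_0^{2/3} \left(u^2 - 3u^3/2\right) du = 3\left[\frac{u^3}{3} - \frac{3u^4}{8}\right]_0^{2/3} = 3\left(\frac{8}{81} - \frac{6}{81}\right) = 3\left(\frac{8 - 6}{81}\right) = \frac{2}{27}$$
(d) $\displaystyle V(T) = E(T^2) - \{E(T)\}^2 = \frac{2}{27} - \frac{4}{81} = \frac{2}{81}.$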
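3.
(a) $\displaystyle 1 = \int_0^1 ku\,du = \left[\frac{ku^2}{2}\right]_0^1 = \frac{k}{2}$, so k = 2.
(b) $$F_X(u) = \begin{cases} 0 & (u < 0) \\ u^2 & (0 \le u \le 1) \\ 1 & (1 < u) \end{cases}$$
(c) $\displaystyle E(X) = \int_0^1 2u^2\,du = \left[\frac{2u^3}{3}\right]_0^1 = \frac{2}{3}.$
(d) $\displaystyle E(X^2) = \int_0^1 2u^3\,du = \left[\frac{2u^4}{4}\right]_0^1 = \frac{1}{2}$, so $\displaystyle V(X) = E(X^2) - \{E(X)\}^2 = \frac{1}{2} - \frac{4}{9} = \frac{1}{18}.$
(e) $\displaystyle E(e^X) = \int_0^1 2ue^u\,du = \left[2ue^u\right]_0^1 - 2\int_0^1 e^u\,du = \left[2ue^u - 2e^u\right]_0^1 = 2e - 2e + 2 = 2$
(f) $\displaystyle E(e^{2X}) = \int_0^1 2ue^{2u}\,du = \left[ue^{2u}\right]_0^1 - \int_0^1 e^{2u}\,du = \left[ue^{2u} - e^{2u}/2\right]_0^1 = e^2 - e^2/2 + 1/2 = (e^2 + 1)/2$
so $V(e^X) = E(e^{2X}) - \{E(e^X)\}^2 = (e^2 + 1)/2 - 4.$
(g) $P(e^X < u) = P(X < \ln u) = (\ln u)^2$ for 0 < ln u < 1, i.e. 1 < u < e.
Hence the distribution function of $e^X$ is
$$F_{e^X}(u) = \begin{cases} 0 & (u < 1) \\ (\ln u)^2 & (1 \le u \le e) \\ 1 & (e < u) \end{cases}$$
(h) The pdf of $e^X$ is
$$f_{e^X}(u) = \begin{cases} 0 & (u < 1) \\ \dfrac{2\ln u}{u} & (1 \le u \le e) \\ 0 & (e < u) \end{cases}$$
(i) Sketch of pdf: [Figure: $f_X(u)$ against u, with 2 marked on the vertical axis and 1 on the horizontal axis]
(j) Sketch of distribution function: [Figure: $F_X(u)$ against u, with 1 marked on the vertical axis and 1 on the horizontal axis]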
The Uniform
Distribution 38.2
Introduction
This Section introduces the simplest type of continuous probability distribution which features a
continuous random variable X with probability density function f (x) which assumes a constant
value over a finite interval.
Prerequisites
Before starting this Section you should . . .
• be familiar with the concepts of expectation and variance
• be familiar with the concept of continuous

Learning Outcomes
On completion you should be able to . . .
• explain what is meant by the term uniform distribution
• calculate the mean and variance of a uniform distribution
1. The uniform distribution
The Uniform or Rectangular distribution has random variable X restricted to a finite interval [a, b]
and has f (x) a constant over the interval. An illustration is shown in Figure 3:
[Figure 3: the uniform pdf f(x), a rectangle of height 1/(b − a) over the interval a ≤ x ≤ b]
The function f(x) is defined by:
$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
Mean and variance of a uniform distribution

Using the definitions of expectation and variance leads to the following calculations. As you might
expect, for a uniform distribution, the calculations are not difficult.
Using the basic definition of expectation we may write:
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_a^b x\,\frac{1}{b - a}\,dx = \left[\frac{x^2}{2(b - a)}\right]_a^b = \frac{b^2 - a^2}{2(b - a)} = \frac{b + a}{2}$$
Using the formula for the variance, we may write:
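$$\begin{aligned} V(X) &= E(X^2) - [E(X)]^2 \\ &= \int_a^b x^2\,\frac{1}{b - a}\,dx - \left(\frac{b + a}{2}\right)^2 = \left[\frac{x^3}{3(b - a)}\right]_a^b - \left(\frac{b + a}{2}\right)^2 \\ &= \frac{b^3 - a^3}{3(b - a)} - \left(\frac{b + a}{2}\right)^2 \\ &= \frac{b^2 + ab + a^2}{3} - \frac{b^2 + 2ab + a^2}{4} \\ &= \frac{(b - a)^2}{12} \end{aligned}$$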
Key Point 3
The Uniform random variable X whose density function f(x) is defined by
$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
has expectation and variance given by the formulae
$$E(X) = \frac{b + a}{2} \qquad \text{and} \qquad V(X) = \frac{(b - a)^2}{12}$$
Example 2
The current (in mA) measured in a piece of copper wire is known to follow a uniform
distribution over the interval [0, 25]. Write down the formula for the probability
density function f (x) of the random variable X representing the current. Calculate
the mean and variance of the distribution and find the cumulative distribution
function F (x).
Solution
Over the interval [0, 25] the probability density function f(x) is given by the formula
$$f(x) = \begin{cases} \dfrac{1}{25 - 0} = 0.04, & 0 \le x \le 25 \\ 0, & \text{otherwise} \end{cases}$$
Using the formulae developed for the mean and variance gives
$$E(X) = \frac{25 + 0}{2} = 12.5 \text{ mA} \qquad \text{and} \qquad V(X) = \frac{(25 - 0)^2}{12} = 52.08 \text{ mA}^2$$
The cumulative distribution function is obtained by integrating the probability density function as
shown below.
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
Hence, choosing the three distinct regions x < 0, 0 ≤ x ≤ 25 and x > 25 in turn gives:
$$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{x}{25}, & 0 \le x \le 25 \\ 1, & x > 25 \end{cases}$$
Task
The thickness x of a protective coating applied to a conductor designed to work in
corrosive conditions follows a uniform distribution over the interval [20, 40] microns.
Find the mean, standard deviation and cumulative distribution function of the
thickness of the protective coating. Find also the probability that the coating is
less than 35 microns thick.
Your solution
Answer
Over the interval [20, 40] the probability density function f(x) is given by the formula
$$f(x) = \begin{cases} 0.05, & 20 \le x \le 40 \\ 0, & \text{otherwise} \end{cases}$$
Using the formulae developed for the mean and variance gives
$$E(X) = \frac{20 + 40}{2} = 30\ \mu\text{m} \qquad \text{and} \qquad \sigma = \sqrt{V(X)} = \frac{40 - 20}{\sqrt{12}} = \frac{20}{\sqrt{12}} = 5.77\ \mu\text{m}$$
The cumulative distribution function is given by
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
Hence, choosing appropriate ranges for x, the cumulative distribution function is obtained as:
$$F(x) = \begin{cases} 0, & x < 20 \\ \dfrac{x - 20}{20}, & 20 \le x \le 40 \\ 1, & x > 40 \end{cases}$$
Hence the probability that the coating is less than 35 microns thick is
$$P(X < 35) = F(35) = \frac{35 - 20}{20} = 0.75$$
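The formulae of Key Point 3 and the piecewise cdf are easy to wrap in small helper functions. The sketch below is illustrative; the function names are hypothetical and the two calls reproduce the wire-current example on [0, 25] and the coating-thickness Task on [20, 40].

```python
import math

def uniform_mean(a, b):
    """E(X) = (b + a) / 2 for a uniform distribution on [a, b]."""
    return (a + b) / 2

def uniform_variance(a, b):
    """V(X) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12

def uniform_cdf(x, a, b):
    """F(x): 0 below a, (x - a)/(b - a) on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# Current in mA on [0, 25]
print(uniform_mean(0, 25), round(uniform_variance(0, 25), 2))

# Coating thickness in microns on [20, 40]
print(uniform_mean(20, 40), round(math.sqrt(uniform_variance(20, 40)), 2),
      uniform_cdf(35, 20, 40))
```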
Exercises
1. In the manufacture of petroleum the distilling temperature (T °C) is crucial in determining the
quality of the final product. T can be considered as a random variable uniformly distributed over
150 °C to 300 °C. It costs £$C_1$ to produce 1 gallon of petroleum. If the oil distills at temperatures
less than 200 °C the product sells for £$C_2$ per gallon. If it distills at a temperature greater than
200 °C it sells for £$C_3$ per gallon. Find the expected net profit per gallon.
2. Packages have a nominal net weight of 1 kg. However their actual net weights have a uniform
distribution over the interval 980 g to 1030 g.
(a) Find the probability that the net weight of a package is less than 1 kg.
(b) Find the probability that the net weight of a package is less than w g, where 980 < w <
1030.
(c) If the net weights of packages are independent, find the probability that, in a sample of
five packages, all five net weights are less than w g and hence find the probability density
function of the weight of the heaviest of the packages. (Hint: all five packages weigh less
than w g if and only if the heaviest weighs less than w g).
Answers
1.
$$P(T < 200) = 50 \times \frac{1}{150} = \frac{1}{3} \qquad P(T > 200) = \frac{2}{3}$$
Let F be a random variable defining profit.
F can take two values £$(C_2 - C_1)$ or £$(C_3 - C_1)$
x            C2 − C1     C3 − C1
P(F = x)     1/3         2/3
$$E(F) = \frac{C_2 - C_1}{3} + \frac{2}{3}\left[C_3 - C_1\right] = \frac{C_2 - 3C_1 + 2C_3}{3}$$
2.
(a) The required probability is $\displaystyle P(W < 1000) = \frac{1000 - 980}{1030 - 980} = \frac{20}{50} = 0.4$
(b) The required probability is $\displaystyle P(W < w) = \frac{w - 980}{1030 - 980} = \frac{w - 980}{50}$
(c) The probability that all five weigh less than w g is $\left(\dfrac{w - 980}{50}\right)^5$ so the pdf of the
heaviest is
$$\frac{d}{dw}\left(\frac{w - 980}{50}\right)^5 = \frac{5}{50}\left(\frac{w - 980}{50}\right)^4 = 0.1\left(\frac{w - 980}{50}\right)^4 \quad \text{for } 980 < w < 1030.$$
The Exponential
Distribution 38.3
Introduction
If an engineer is responsible for the quality of, say, copper wire for use in domestic wiring systems,
he or she might be interested in knowing both the number of faults in a given length of wire and
also the distances between such faults. While the number of faults may be analysed by using the
Poisson distribution, the distances between faults along the wire may be shown to give rise to the
exponential distribution defined and used in this Section.
Prerequisites
Before starting this Section you should . . .
• be familiar with the concepts of expectation and variance
• be familiar with the concepts of continuous distributions, in particular the Poisson distribution.

Learning Outcomes
• understand what is meant by the term exponential distribution
• calculate the mean and variance of an exponential distribution
• use the exponential distribution to solve simple practical problems
1. The exponential distribution
The exponential distribution is defined by
$$f(t) = \lambda e^{-\lambda t}, \qquad t \ge 0, \qquad \lambda \text{ a constant}$$
or sometimes (see the Section on Reliability in Workbook 46) by
$$f(t) = \frac{1}{\mu} e^{-t/\mu}, \qquad t \ge 0, \qquad \mu \text{ a constant}$$
The advantage of this latter representation is that it may be shown that the mean of the distribution
is µ.
Example 3
The lifetime T (years) of an electronic component is a continuous random variable
with a probability density function given by
$$f(t) = e^{-t}, \qquad t \ge 0 \qquad \text{(i.e. } \lambda = 1 \text{ or } \mu = 1)$$
Find the lifetime L which a typical component is 60% certain to exceed. If five
components are sold to a manufacturer, find the probability that at least one of
them will have a lifetime less than L years.
Solution
We require P(T > L) = 0.6. We know that this probability is given by the relationship
$$P(T > L) = \int_L^{\infty} e^{-t}\,dt = \left[-e^{-t}\right]_L^{\infty} = e^{-L}$$
Solving $e^{-L} = 0.6$ gives L = −ln 0.6 = 0.51 years.
Assuming that the lifetimes of the components are independent we have
P(at least one component has a lifetime less than 0.51 years)
= 1 − P(no component has a lifetime less than 0.51 years)
= 1 − 0.6⁵
= 0.92
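The two steps of Example 3 can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative and the independence of the five lifetimes is the assumption stated in the Solution.

```python
import math

target = 0.6                      # required P(T > L)
L = -math.log(target)             # solves exp(-L) = 0.6 when lambda = 1

# Each component independently survives beyond L with probability 0.6,
# so all five survive with probability 0.6**5.
p_at_least_one_short = 1 - target ** 5

print(round(L, 2), round(p_at_least_one_short, 2))
```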
Task
Commonly, car cooling systems are controlled by electrically driven fans. Assuming
that the lifetime T in hours of a particular make of fan can be modelled by an
exponential distribution with λ = 0.0003 find the proportion of fans which will
give at least 10000 hours service. If the fan is redesigned so that its lifetime may
be modelled by an exponential distribution with λ = 0.00035, would you expect
more fans or fewer to give at least 10000 hours service?
Your solution
Answer
We know that $f(t) = 0.0003e^{-0.0003t}$ so that the probability that a fan will give at least 10000 hours
service is given by the expression
$$P(T > 10000) = \int_{10000}^{\infty} f(t)\,dt = \int_{10000}^{\infty} 0.0003e^{-0.0003t}\,dt = \left[-e^{-0.0003t}\right]_{10000}^{\infty} = e^{-3} \approx 0.0498$$
Hence about 5% of the fans may be expected to give at least 10000 hours service. After the redesign,
the calculation becomes
$$P(T > 10000) = \int_{10000}^{\infty} f(t)\,dt = \int_{10000}^{\infty} 0.00035e^{-0.00035t}\,dt = \left[-e^{-0.00035t}\right]_{10000}^{\infty} = e^{-3.5} \approx 0.0302$$
and so only about 3% of the fans may be expected to give at least 10000 hours service.
Hence, after the redesign we expect fewer fans to give 10000 hours service.
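The comparison of the two fan designs reduces to evaluating the survival probability P(T > t) = e^{−λt} for each value of λ. A minimal sketch, with the function name `survival` chosen for illustration:

```python
import math

def survival(t, lam):
    """P(T > t) for an exponential lifetime with rate lam."""
    return math.exp(-lam * t)

before = survival(10_000, 0.0003)    # original design
after = survival(10_000, 0.00035)    # redesigned fan

print(round(before, 4), round(after, 4), after < before)
```

A larger λ means a shorter mean lifetime 1/λ, so the survival probability at any fixed time falls.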
Exercises
1. The time intervals between successive barges passing a certain point on a busy waterway have
an exponential distribution with mean 8 minutes.
(a) Find the probability that the time interval between two successive barges is less than 5
minutes.
(b) Find a time interval t such that we can be 95% sure that the time interval between two
successive barges will be greater than t.
2. It is believed that the time X for a worker to complete a certain task has probability density
function $f_X(x)$ where
$$f_X(x) = \begin{cases} 0 & (x \le 0) \\ kx^2 e^{-\lambda x} & (x > 0) \end{cases}$$
where λ is a parameter, the value of which is unknown, and k is a constant which depends on
λ.
(a) Show that if $\displaystyle I_n = \int_0^{\infty} x^n e^{-\lambda x}\,dx$ then $I_n = \dfrac{n}{\lambda} I_{n-1}$, where n > 0 and λ > 0.
Evaluate $\displaystyle I_0 = \int_0^{\infty} e^{-\lambda x}\,dx$ and hence find a general expression for $I_n$.
This result can be used in the rest of this question.

(b) Find, in terms of λ, the value of k.
(c) Find, in terms of λ, the expected value of X.
(d) Find, in terms of λ, the variance of X.
(e) Write down the expected value and variance of the sample mean of a sample of n inde-
pendent observations on X.
(f) Find, in terms of λ, the expected value of $X^{-1}$.
Answers
1. We have µ = 8 so λ = 0.125.
(a) The probability is
$$P(T < 5) = \int_0^5 0.125e^{-0.125t}\,dt = 1 - e^{-0.125 \times 5} = 0.4647.$$
(b) We require
$$\int_t^{\infty} 0.125e^{-0.125x}\,dx = e^{-0.125t} = 0.95.$$
So −0.125t = ln 0.95 and
$$t = -\frac{\ln 0.95}{0.125} = 0.4103 \text{ minutes}.$$
That is, 24.6 s.
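2.
(a) $$I_n = \int_0^{\infty} x^n e^{-\lambda x}\,dx = \left[-\frac{1}{\lambda} x^n e^{-\lambda x}\right]_0^{\infty} + \frac{n}{\lambda}\int_0^{\infty} x^{n-1} e^{-\lambda x}\,dx = \frac{n}{\lambda} I_{n-1}$$
$$I_0 = \int_0^{\infty} e^{-\lambda x}\,dx = \left[-\frac{1}{\lambda} e^{-\lambda x}\right]_0^{\infty} = \frac{1}{\lambda} \quad \text{hence} \quad I_n = \frac{n!}{\lambda^{n+1}}.$$
(b) $\displaystyle \int_0^{\infty} kx^2 e^{-\lambda x}\,dx = 1 \Rightarrow kI_2 = 1 \Rightarrow k = \frac{1}{I_2} = \frac{\lambda^3}{2}$
(c) $\displaystyle E(X) = \int_0^{\infty} x f_X(x)\,dx = kI_3 = \frac{\lambda^3}{2}\cdot\frac{6}{\lambda^4} = \frac{3}{\lambda}$
(d) $\displaystyle E(X^2) = \int_0^{\infty} x^2 f_X(x)\,dx = kI_4 = \frac{\lambda^3}{2}\cdot\frac{24}{\lambda^5} = \frac{12}{\lambda^2}$
so $\displaystyle V(X) = E(X^2) - \{E(X)\}^2 = \frac{12}{\lambda^2} - \frac{9}{\lambda^2} = \frac{3}{\lambda^2}$
(e) $\displaystyle E(\bar{X}) = \frac{3}{\lambda} \qquad V(\bar{X}) = \frac{3}{n\lambda^2}$
(f) $\displaystyle E\left(\frac{1}{X}\right) = \int_0^{\infty} \frac{1}{x} f_X(x)\,dx = kI_1 = \frac{\lambda^3}{2}\cdot\frac{1}{\lambda^2} = \frac{\lambda}{2}$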
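The closed-form answers to Exercise 2 can be sanity-checked numerically for one particular value of λ. The sketch below is an illustration under stated assumptions: λ = 2 is an arbitrary choice (the Exercise leaves λ unknown), the infinite integral is truncated where the integrand is negligible, and the helper names `integrate` and `I` are hypothetical.

```python
import math

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation to the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

lam = 2.0            # assumed value, for illustration only
upper = 60.0 / lam   # exp(-60) is negligible, so truncate the integral here

def I(n):
    """Numerical approximation to I_n = integral of x^n exp(-lam x) over x > 0."""
    return integrate(lambda x: x ** n * math.exp(-lam * x), 0.0, upper)

k = 1 / I(2)                          # should be near lam^3 / 2
mean = k * I(3)                       # should be near 3 / lam
variance = k * I(4) - mean ** 2       # should be near 3 / lam^2
mean_inverse = k * I(1)               # should be near lam / 2

print(k, mean, variance, mean_inverse)
```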
Contents 39
The Normal Distribution
39.1 The Normal Distribution
39.2 The Normal Approximation to the Binomial Distribution
39.3 Sums and Differences of Random Variables
Learning outcomes
In a previous Workbook you learned what a continuous random variable was. Here you
will examine the most important example of a continuous random variable: the normal
distribution. The probabilities of the normal distribution have to be determined numerically.
Tables of such probabilities, which refer to a simplified normal distribution called the
standard normal distribution, which has mean 0 and variance 1, will be used to determine
probabilities of the general normal distribution. Finally you will learn how to deal with
combinations of random variables which is an important statistical tool applicable to many
engineering situations.
The Normal
Distribution 39.1
Introduction
Mass-produced items should conform to a specification. Usually, a mean is aimed for but due to
random errors in the production process a tolerance is set on deviations from the mean. For example
if we produce piston rings which have a target mean internal diameter of 45 mm then realistically
we expect the diameter to deviate only slightly from this value. The deviations from the mean value
are often modelled very well by the normal distribution. Suppose we decide that diameters in the
range 44.95 mm to 45.05 mm are acceptable, then what proportion of the output is satisfactory? In
this Section we shall see how to use the normal distribution to answer questions like this.

Prerequisites
Before starting this Section you should . . .
• be familiar with the basic properties of probability
• be familiar with continuous random variables

Learning Outcomes
On completion you should be able to . . .
• recognise the shape of the frequency curve for the normal distribution and the standard normal distribution
• calculate probabilities using the standard normal distribution
• recognise key areas under the frequency curve
1. The normal distribution

The normal distribution is the most widely used model for the distribution of a random variable. There
is a very good reason for this. Practical experiments involve measurements and measurements involve
errors. However you go about measuring a quantity, inaccuracies of all sorts can make themselves
felt. For example, if you are measuring a length using a device as crude as a ruler, you may find
errors arising due to:
• the calibration of the ruler itself;
• parallax errors due to the relative positions of the object being measured, the ruler and your eye;
• rounding errors;
• ‘guesstimation’ errors if a measurement is between two marked lengths on the ruler.
• mistakes.
If you use a meter with a digital readout, you will avoid some of the above errors but others, often
present in the design of the electronics controlling the meter, will be present. Errors are unavoidable
and are usually the sum of several factors. The behaviour of variables which are the sum of several
other variables is described by a very important and powerful result called the Central Limit Theorem
which we will study later in this Workbook. For now we will quote the result so that the importance
of the normal distribution will be appreciated.
The central limit theorem

Let X be the sum of n independent random variables $X_i$, i = 1, 2, . . . , n, each having a distribution
with mean $\mu_i$ and variance $\sigma_i^2$ ($\sigma_i^2 < \infty$) respectively. Then the distribution of X has expectation
and variance given by the expressions
$$E(X) = \sum_{i=1}^{n} \mu_i \qquad \text{and} \qquad V(X) = \sum_{i=1}^{n} \sigma_i^2$$
and becomes normal as n → ∞.

Essentially we are saying that a quantity which represents the combined effect of a number of variables
will be approximately normal no matter what the original distributions are provided that σ² < ∞.
This statement is true for the vast majority of distributions you are likely to meet in practice. This
is why the normal distribution is crucially important to engineers. A quotation attributed to Prof.
G. Lippmann (1845-1921, winner of the Nobel Prize for Physics in 1908) runs: ‘Everybody believes in
the law of errors, experimenters because they think it is a mathematical theorem and mathematicians
because they think it is an experimental fact.’
You may think that anything you measure follows an approximate normal distribution. Unfortunately
this is not the case. While the heights of human beings follow a normal distribution, weights do
not. Heights are the result of the interaction of many factors (outside one’s control) while weights
principally depend on lifestyle (including how much and what you eat and drink!). In practice, it is
found that the distribution of weights is skewed to the right but that the square root of human weights
is approximately normal.
Section 39.1: The Normal Distribution
The probability density function of a normal distribution with mean $\mu$ and variance $\sigma^2$ is given by
the formula
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \]
This curve is always bell-shaped with the centre of the bell located at the value of µ. See Figure
1. The height of the bell is controlled by the value of σ. As with all normal distribution curves it is
symmetrical about the centre and decays as x → ±∞. As with any probability density function the
area under the curve is equal to 1.
Figure 1: The normal curve $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$, centred at $x = \mu$
A normal distribution is completely defined by specifying its mean (the value of $\mu$) and its variance
(the value of $\sigma^2$). The normal distribution with mean $\mu$ and variance $\sigma^2$ is written $N(\mu, \sigma^2)$. Hence
the distribution $N(20, 25)$ has a mean of 20 and a standard deviation of 5; remember that the second
parameter is the variance which is the square of the standard deviation.
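As an illustration, the density formula can be evaluated directly for $N(20, 25)$. The sketch below is not from the Workbook: the helper names `normal_pdf` and `area_under_pdf` are hypothetical, and the area is only estimated, using the trapezium rule over eight standard deviations either side of the mean.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def area_under_pdf(mu, sigma, half_width_sds=8.0, steps=4000):
    """Trapezium-rule estimate of the area between mu - 8*sigma and mu + 8*sigma."""
    a = mu - half_width_sds * sigma
    b = mu + half_width_sds * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for k in range(1, steps):
        total += normal_pdf(a + k * h, mu, sigma)
    return total * h

mu, variance = 20.0, 25.0
sigma = math.sqrt(variance)          # the second parameter is the variance

peak = normal_pdf(mu, mu, sigma)     # height of the bell at its centre
area = area_under_pdf(mu, sigma)

print(f"standard deviation: {sigma}")
print(f"peak height: {peak:.4f}")
print(f"estimated area: {area:.6f}")
print(f"symmetry: f(17) = {normal_pdf(17, mu, sigma):.5f}, f(23) = {normal_pdf(23, mu, sigma):.5f}")
```

Halving `sigma` doubles the peak height, which is one way to see how $\sigma$ controls the height of the bell.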
Key Point 1
A normal distribution has mean $\mu$ and variance $\sigma^2$. The distribution is usually denoted by
$N(\mu, \sigma^2)$ and, for a random variable $X$ following it, we often write
\[ X \sim N(\mu, \sigma^2) \]
Clearly, since $\mu$ and $\sigma^2$ can both vary, there are infinitely many normal distributions and it is impossible
to give tabulated information concerning them all.
For example, if we produce piston rings which have a target mean internal diameter of 45 mm then
we may realistically expect the actual diameter to deviate from this value. Such deviations are well-
modelled by the normal distribution. Suppose we decide that diameters in the range 44.95 mm to
45.05 mm are acceptable, we may then ask the question ‘What proportion of our manufactured
output is satisfactory?’
Without tabulated data concerning the appropriate normal distribution we cannot easily answer this
question (because the integral used to calculate areas under the normal curve is intractable.)
Since tabulated data allow us to apply the distribution to a wide variety of statistical situations, and
we cannot tabulate all normal distributions, we tabulate only one - the standard normal distribution
- and convert all problems involving the normal distribution into problems involving the standard
normal distribution.
2. The standard normal distribution

At this stage we shall, for simplicity, consider what is known as a standard normal distribution which
is obtained by choosing particularly simple values for µ and σ.
Key Point 2
The standard normal distribution has a mean of zero and a variance of one.
In Figure 2 we show the graph of the standard normal distribution, which has probability density
function
\[ y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \]
Figure 2: The standard normal distribution curve
The result which makes the standard normal distribution so important is as follows:
Key Point 3
If the behaviour of a continuous random variable $X$ is described by the distribution $N(\mu, \sigma^2)$ then
the behaviour of the random variable
\[ Z = \frac{X - \mu}{\sigma} \]
is described by the standard normal distribution $N(0, 1)$.
We call Z the standardised normal variable and we write
Z ∼ N (0, 1)
Example 1
If the random variable X is described by the distribution N (45, 0.000625) then
what is the transformation required to obtain the standardised normal variable?
Solution
Here, $\mu = 45$ and $\sigma^2 = 0.000625$ so that $\sigma = 0.025$. Hence $Z = (X - 45)/0.025$ is the required
transformation.
Example 2
When the random variable X ∼ N (45, 0.000625) takes values between 44.95 and
45.05, between which values does the random variable Z lie?
Solution
When $X = 45.05$, $Z = \dfrac{45.05 - 45}{0.025} = 2$
When $X = 44.95$, $Z = \dfrac{44.95 - 45}{0.025} = -2$
Hence Z lies between −2 and 2.
Task
The random variable X follows a normal distribution with mean 1000 and variance
100. When X takes values between 1005 and 1010, between which values does
the standardised normal variable Z lie?
Your solution
Answer
The transformation is $Z = \dfrac{X - 1000}{10}$.
When $X = 1005$, $Z = \dfrac{5}{10} = 0.5$
When $X = 1010$, $Z = \dfrac{10}{10} = 1$.
Hence Z lies between 0.5 and 1.
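The standardising transformation used in these examples is easily expressed in code. The sketch below assumes a hypothetical helper `standardise` that takes the variance (the second parameter of $N(\mu, \sigma^2)$), not the standard deviation, and reproduces the values found above.

```python
import math

def standardise(x, mean, variance):
    """Return z = (x - mean) / sigma, where sigma is the square root of the variance."""
    return (x - mean) / math.sqrt(variance)

# Example 2: X ~ N(45, 0.000625), so sigma = 0.025
z_low = standardise(44.95, 45, 0.000625)
z_high = standardise(45.05, 45, 0.000625)

# Task: X ~ N(1000, 100), so sigma = 10
task_low = standardise(1005, 1000, 100)
task_high = standardise(1010, 1000, 100)

print(f"Example 2: Z lies between {z_low:.2f} and {z_high:.2f}")
print(f"Task: Z lies between {task_low:.2f} and {task_high:.2f}")
```

A common slip is to divide by the variance instead of the standard deviation; taking the square root inside the function guards against this.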
3. Probabilities and the standard normal distribution

Since the standard normal distribution is used so frequently a table of values has been produced to
help us calculate probabilities - located at the end of the Workbook. It is based upon the following
diagram:
Figure 3: The standard normal curve with the area between 0 and $z_1$ shaded
Since the total area under the curve is equal to 1 it follows from the symmetry in the curve that
the area under the curve in the region Z > 0 is equal to 0.5. In Figure 3 the shaded area is the
probability that Z takes values between 0 and z1 . When we ‘look-up’ a value in the table we obtain
the value of the shaded area.
Example 3
What is the probability that Z takes values between 0 and 1.9? (Refer to the table
of normal probabilities at the end of the Workbook.)
Solution
The row beginning ‘1.9’ and the column headed ‘0’ is the appropriate choice and its entry is 4713.
This is to be read as 0.4713 (we omitted the ‘0.’ in each entry for clarity.) The interpretation is
that the probability that Z takes values between 0 and 1.9 is 0.4713.
Example 4
What is the probability that Z takes values between 0 and 1.96?
Solution
This time we want the row beginning 1.9 and the column headed ‘6’.
The entry is 4750 so that the required probability is 0.4750.
Example 5
What is the probability that Z takes values between 0 and 1.965?
Solution
There is no entry corresponding to 1.965 so we take the average of the values for 1.96 and 1.97.
(This linear interpolation is not strictly correct but is acceptable.)
The two values are 4750 and 4756 with an average of 4753. Hence the required probability is 0.4753.
Task
What are the probabilities that Z takes values between
(a) 0 and 2 (b) 0 and 2.3 (c) 0 and 2.33 (d) 0 and 2.333?
Your solution
Answer
(a) The entry is 4772; the probability is 0.4772.
(b) The entry is 4893; the probability is 0.4893.
(c) The entry is 4901; the probability is 0.4901.
(d) The entry for 2.33 is 4901, that for 2.34 is 4904.
Linear interpolation gives a value of 4901 + 0.3(4904 − 4901) i.e. about 4902; the
probability is 0.4902.
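The table entries themselves can be checked without a table. The sketch below relies on the standard identity $P(0 < Z < z) = \frac{1}{2}\,\mathrm{erf}(z/\sqrt{2})$ for $z \ge 0$, where erf is the error function supplied by Python's `math` module; the helper names `area_0_to_z` and `table_entry` are illustrative only and are not part of the Workbook.

```python
import math

def area_0_to_z(z):
    """P(0 < Z < z) for a standard normal Z and z >= 0."""
    return 0.5 * math.erf(z / math.sqrt(2))

def table_entry(z):
    """The four-digit entry as printed in the table (leading '0.' omitted)."""
    return round(area_0_to_z(z) * 10000)

for z in (1.9, 1.96, 1.965, 2.0, 2.3, 2.33, 2.333):
    print(f"z = {z:<6} area = {area_0_to_z(z):.4f}  entry = {table_entry(z)}")
```

For values such as 1.965 and 2.333 the function is evaluated directly, so no interpolation is needed; the results agree with the interpolated answers to four decimal places.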
Note from Table 1 that as Z increases from 0 the entries increase, rapidly at first and then more
slowly, toward 5000 i.e. a probability of 0.5. This is consistent with the shape of the curve.
After Z = 3 the increase is quite slow so that we tabulate entries for values of Z rising by increments
of 0.1 instead of 0.01 as in the rest of Table 1.
4. Calculating other probabilities

In this Section we see how to calculate probabilities represented by areas other than those of the type
shown in Figure 3.
Case 1
Figure 4 illustrates what we do if both Z values are positive. By using the properties of the standard
normal distribution we can organise matters so that any required area is always of ‘standard form’.
Here the shaded region, between $z_1$ and $z_2$, can be represented by the difference between two shaded
areas: the area from 0 to $z_2$ and the area from 0 to $z_1$.
Figure 4
Example 6
Find the probability that Z takes values between 1 and 2.
Solution
Using Table 1:
P(0 < Z < z2), i.e. P(0 < Z < 2), is 0.4772
P(0 < Z < z1), i.e. P(0 < Z < 1), is 0.3413.
Hence P(1 < Z < 2) = 0.4772 − 0.3413 = 0.1359
Remember that with a continuous distribution the event Z = 1 has zero probability,
so that P(1 ≤ Z ≤ 2) is equal to P(1 < Z < 2).
Case 2
The following diagram illustrates the procedure to be followed when finding probabilities of the form
P(Z > z1 ).
This time the shaded area is the difference between the right-hand half of the total area (area 0.5)
and an area, from 0 to $z_1$, which can be read off from Table 1.
Figure 5
Example 7
What is the probability that Z > 2?
Solution
P(0 < Z < 2) = 0.4772 (from Table 1). Hence the probability is 0.5 − 0.4772 = 0.0228.
Case 3
Here we consider the procedure to be followed when calculating probabilities of the form P(Z < z1 ).
Here the shaded area is the sum of the left-hand half of the total area and a ‘standard’ area.
Figure 6: The area to the left of $z_1$ as the sum of the left-hand half (area 0.5) and the area from 0 to $z_1$
Example 8
What is the probability that Z < 2?
Solution
P(Z < 2) = 0.5 + 0.4772 = 0.9772.
Case 4
Here we consider what needs to be done when calculating probabilities of the form
P(−z1 < Z < 0) where z1 is positive. This time we make use of the symmetry in the standard
normal distribution curve.
By symmetry the shaded area between $-z_1$ and 0 is equal in value to the shaded area between 0 and $z_1$.
Figure 7
Example 9
What is the probability that −2 < Z < 0?
Solution
The area is equal to that corresponding to P(0 < Z < 2) = 0.4772.
Case 5
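Finally we consider probabilities of the form P(−z1 < Z < z2), where z1 and z2 are positive. Here we
use the sum property and the symmetry property.
Figure 8: The area between $-z_1$ and $z_2$ as the sum of the area from 0 to $z_1$ and the area from 0 to $z_2$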
Example 10
What is the probability that −1 < Z < 2?
Solution
P(−1 < Z < 0) = P(0 < Z < 1) = 0.3413
P(0 < Z < 2) = 0.4772
Hence the required probability P(−1 < Z < 2) is 0.3413 + 0.4772 = 0.8185.
Other cases can be handled by a combination of the ideas already used.
Task
Find the following probabilities.
(a) P(0 < Z < 1.5) (b) P(Z > 1.8)
(c) P(1.5 < Z < 1.8) (d) P(Z < 1.8)
(e) P(−1.5 < Z < 0) (f) P(Z < −1.5)
(g) P(−1.8 < Z < −1.5) (h) P(−1.5 < Z < 1.8)
(A simple sketch of the standard normal curve will help.)
Your solution
Answer
(a) 0.4332 (direct from Table 1)
(b) 0.5 − 0.4641 = 0.0359
(c) P(0 < Z < 1.8) − P(0 < Z < 1.5) = 0.4641 − 0.4332 = 0.0309
(d) 0.5 + 0.4641 = 0.9641
(e) P(−1.5 < Z < 0) = P(0 < Z < 1.5) = 0.4332
(f) P(Z < −1.5) = P(Z > 1.5) = 0.5 − 0.4332 = 0.0668
(g) P(−1.8 < Z < −1.5) = P(1.5 < Z < 1.8) = 0.0309
(h) P(0 < Z < 1.5) + P(0 < Z < 1.8) = 0.8973
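All five cases reduce to differences of the cumulative probability $P(Z < z)$, which can be computed from the error function. The following sketch is offered as a check on the answers above, not as the Workbook's method; the function names `phi`, `prob_between`, `prob_above` and `prob_below` are assumptions made for illustration.

```python
import math

def phi(z):
    """P(Z < z) for a standard normal Z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_between(a, b):
    """P(a < Z < b)."""
    return phi(b) - phi(a)

def prob_above(a):
    """P(Z > a)."""
    return 1 - phi(a)

def prob_below(b):
    """P(Z < b)."""
    return phi(b)

answers = {
    "a": prob_between(0, 1.5),
    "b": prob_above(1.8),
    "c": prob_between(1.5, 1.8),
    "d": prob_below(1.8),
    "e": prob_between(-1.5, 0),
    "f": prob_below(-1.5),
    "g": prob_between(-1.8, -1.5),
    "h": prob_between(-1.5, 1.8),
}

for part, value in answers.items():
    print(f"({part}) {value:.4f}")
```

Because `phi` handles negative arguments directly, the symmetry arguments of Cases 4 and 5 are not needed in code, although they remain essential when working from Table 1.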
5. The cumulative distribution function
We know that the normal probability density function $f(x)$ is given by the formula
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \]
and so the cumulative distribution function $F(x)$ is given by the formula
\[ F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-\mu)^2/2\sigma^2}\, du \]
In the case of the cumulative distribution for the standard normal curve, we use the special notation
$\Phi(z)$ and, substituting 0 and 1 for $\mu$ and $\sigma^2$, we obtain
\[ \Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\, du \]
The shape of the curve is essentially ‘S’-shaped as shown in Figure 9. Note that the curve runs
from −∞ to +∞. As you can see, the curve approaches the value 1 asymptotically.
Figure 9: The curve of Φ(z) plotted against z
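Comparing the integrals
\[ F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-\mu)^2/2\sigma^2}\, du \quad\text{and}\quad \Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-v^2/2}\, dv \]
shows that
\[ v = \frac{u-\mu}{\sigma} \quad\text{and so}\quad dv = \frac{du}{\sigma} \]
and $F(x)$ may be written as
\[ F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{(x-\mu)/\sigma} e^{-v^2/2}\, \sigma\, dv = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x-\mu)/\sigma} e^{-v^2/2}\, dv = \Phi\!\left(\frac{x-\mu}{\sigma}\right) \]
We already know, from the basic definition of a cumulative distribution function, that
\[ P(a < X < b) = F(b) - F(a) \]
so that we may write the probability statement above in terms of $\Phi(z)$ as
\[ P(a < X < b) = F(b) - F(a) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right). \]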
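The relationship $F(x) = \Phi((x-\mu)/\sigma)$ can be checked numerically. In the sketch below, `phi`, `normal_cdf` and `prob_between` are hypothetical helper names; the result is compared with `statistics.NormalDist` from the Python standard library, which is used here only as an independent reference. The distribution $N(20, 25)$ is taken as the example.

```python
import math
from statistics import NormalDist

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_cdf(x, mu, sigma):
    """F(x) for N(mu, sigma^2), written as Phi((x - mu) / sigma)."""
    return phi((x - mu) / sigma)

def prob_between(a, b, mu, sigma):
    """P(a < X < b) = F(b) - F(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

mu, sigma = 20.0, 5.0                      # X ~ N(20, 25)
p = prob_between(15, 25, mu, sigma)        # one standard deviation either side

reference_dist = NormalDist(mu, sigma)     # NormalDist takes the standard deviation
reference = reference_dist.cdf(25) - reference_dist.cdf(15)

print(f"via Phi: {p:.6f}")
print(f"via NormalDist: {reference:.6f}")
```

Note that `NormalDist` is parameterised by the standard deviation, whereas the notation $N(\mu, \sigma^2)$ quotes the variance.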
The value of Φ(z) is measured from z = −∞ to any ordinate z = z1 and represents the probability
P(Z < z1 ).
The values of Φ(z) start as shown below:
z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1  .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2  .5793  .5832  .5871  .5909  .5948  .5987  .6026  .6064  .6103  .6141
You should compare the values given here with the values given for the normal probability integral
(Table 1 at the end of the Workbook). Simply adding 0.5 to the values in the latter table gives the
values of Φ(z). You should also note that the diagrams shown at the top of each set of tabulated
values tell you whether you are looking at the values of Φ(z) or the values of the normal probability
integral.
Exercises
1. If a random variable X has a standard normal distribution find the probability that it assumes
a value:
(a) less than 2.00
(b) greater than 2.58
(c) between 0 and 1.00
(d) between −1.65 and −0.84
2. If X has a standard normal distribution find k in each of the following cases:

(a) P(X < k) = 0.4
(b) P(X < k) = 0.95
(c) P(0 < X < k) = 0.1

Answers
1 (a) 0.9772 (b) 0.0049 (c) 0.3413 (d) 0.1510
2 (a) −0.2533 (b) 1.6450 (c) 0.2533
6. Applications of the normal distribution
We have, in the previous subsection, noted that the probability density function of a normal
distribution $X$ is
\[ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \]
This curve is always ‘bell-shaped’ with the centre of the bell located at the value of µ. The height
of the bell is controlled by the value of σ. See Figure 10.
Figure 10: The normal curve $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$, centred at $x = \mu$
We now show, by example, how probabilities relating to a general normal distribution X are
determined. We will see that being able to calculate the probabilities of a standard normal distribution Z
is crucial in this respect.
Example 11
Given that the variate $X$ follows the normal distribution $X \sim N(151, 15^2)$, calculate:
(a) P(120 ≤ X ≤ 155); (b) P(X ≥ 185)
Solution
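The transformation used in this problem is
\[ Z = \frac{X-\mu}{\sigma} = \frac{X-151}{15} \]
(a)
\[ \begin{aligned} P(120 \le X \le 155) &= P\left(\frac{120-151}{15} \le Z \le \frac{155-151}{15}\right) \\ &= P(-2.07 \le Z \le 0.27) \\ &= 0.4808 + 0.1064 = 0.5872 \end{aligned} \]
(b)
\[ \begin{aligned} P(X \ge 185) &= P\left(Z \ge \frac{185-151}{15}\right) \\ &= P(Z \ge 2.27) \\ &= 0.5 - 0.4884 = 0.0116 \end{aligned} \]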
We note that, as for any continuous random variable, we can only calculate the probability that
• X lies between two given values;
• X is greater than a given value;
• X is less than a given value,
rather than for individual values.
Task
A worn, poorly set-up machine is observed to produce components whose length
X follows a normal distribution with mean 20 cm and variance 2.56 cm². Calculate:
(a) the probability that a component is at least 24 cm long;
(b) the probability that the length of a component lies between 19 and 21 cm.
Answer
The transformation used is $Z = \dfrac{X-20}{1.6}$, giving
\[ P(X \ge 24) = P\left(Z \ge \frac{24-20}{1.6}\right) = P(Z \ge 2.5) = 0.5 - 0.4938 = 0.0062 \]
and
\[ P(19 < X < 21) = P\left(\frac{19-20}{1.6} < Z < \frac{21-20}{1.6}\right) = P(-0.625 < Z < 0.625) = 0.4681 \]
Example 12
Piston rings are mass-produced. The target internal diameter is 45 mm
but records show that the diameters are normally distributed with mean 45
mm and standard deviation 0.05 mm. An acceptable diameter is one within
the range 44.95 mm to 45.05 mm. What proportion of the output is unacceptable?
Solution
There are many words in the statement of the problem; we must read them carefully to extract the
necessary information. If $X$ is the diameter of a piston ring then $X \sim N(45, (0.05)^2)$.
The transformation is
\[ Z = \frac{X-\mu}{\sigma} = \frac{X-45}{0.05}. \]
The upper limit of acceptability is x2 = 45.05 so that z2 = (45.05 − 45)/0.05 = 1.
The lower limit of acceptability is x1 = 44.95 so that z1 = (44.95 − 45)/0.05 = −1.
The range of ‘acceptable’ Z values is therefore −1 to 1. See Figure 11 below.
Figure 11: The standard normal curve with the region between $z = -1$ and $z = +1$ marked
Using the symmetry of the curve, P(−1 < Z < 1) = 2 × P(0 < Z < 1) = 2 × 0.3413 = 0.6826.
Thus the proportion of unacceptable items is 1 − 0.6826 = 0.3174, or 31.74%.
Example 13
If the standard deviation is halved by improved production practices what is now
the proportion of unacceptable items?
Solution
Now σ = 0.025 so that:
\[ z_2 = \frac{45.05-45}{0.025} = 2 \quad\text{and}\quad z_1 = -2 \]
Then P(−2 < Z < 2) = 2 × P(0 < Z < 2) = 2 × 0.4772 = 0.9544. Hence the proportion of
unacceptable items is reduced to 1 − 0.9544 = 0.0456 or 4.56%.
We observe that less of the area under the curve now lies outside the interval (44.95, 45.05).
Figure 12: The standard normal curve with the region between $z = -2$ and $z = 2$ marked
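The effect of halving the standard deviation in Examples 12 and 13 can be reproduced in a few lines. This is a sketch under the stated model only (diameters normally distributed about 45 mm); `proportion_outside` is a hypothetical helper built on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

def proportion_outside(lower, upper, mean, sd):
    """Proportion of a N(mean, sd^2) population lying outside [lower, upper]."""
    dist = NormalDist(mean, sd)
    return 1.0 - (dist.cdf(upper) - dist.cdf(lower))

before = proportion_outside(44.95, 45.05, 45.0, 0.05)    # Example 12
after = proportion_outside(44.95, 45.05, 45.0, 0.025)    # Example 13

print(f"sd 0.05 mm:  {before:.4f} unacceptable")
print(f"sd 0.025 mm: {after:.4f} unacceptable")
```

The computed values differ from the table-based answers only in the fourth decimal place, because Table 1 is itself rounded to four figures.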
Task
The resistance of a strain gauge is normally distributed with a mean of 100 ohms
and a standard deviation of 0.2 ohms. To meet the specification, the resistance
must be within the range 100 ± 0.5 ohms.
(a) What percentage of gauges are unacceptable?
First, state the upper and lower limits of acceptable resistance and find the Z-values which correspond:
Your solution
Answer
x1 = 99.5, x2 = 100.5. With variance $(0.2)^2 = 0.04$, $Z = \dfrac{X - 100}{0.2}$ so that $z_1 = -2.5$ and $z_2 = 2.5$
Now, using a suitable sketch, calculate the probability that z1 < Z < z2 :
Your solution
Answer
(Diagram: here the shaded region can be represented by the difference between two shaded areas.)
The shaded area (see diagram) is 0.4938 (from the table of values on page 15). Using symmetry,
P(−2.5 < Z < 2.5) = 2 × 0.4938

= 0.9876.
Hence the proportion of acceptable gauges is 98.76%.

Therefore the proportion of unacceptable gauges is 1.24%.
(b) To what value must the standard deviation be reduced if the proportion of unacceptable gauges
is to be no more than 0.2%?
First sketch the standard normal curve marking on it the lower and upper values z1 and z2 and
appropriate areas:
Your solution
Answer
(Diagram: this time the shaded area is the difference between the right-hand half of the total
area, 0.5, and an area from 0 to $z_1$ which can be read off from Table 1.)
Now use the Table to find z2 , and hence write down the value of z1 :
Your solution
Answer
z2 = 3.1 so that z1 = −3.1
Finally, rewrite $Z = \dfrac{X-\mu}{\sigma}$ to make $\sigma$ the subject. Put in values for $z_2$, $x_2$ and $\mu$ and hence evaluate $\sigma$:
Your solution
Answer
\[ \sigma = \frac{X-\mu}{Z} = \frac{100.5 - 100}{3.1} = 0.16 \text{ (2 d.p.)} \]
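Part (b) can also be solved without a table by inverting the standard normal distribution function. The sketch below uses `NormalDist().inv_cdf` from the Python standard library; the helper name `max_sd_for_tolerance` is hypothetical, and the calculation assumes, as in the Task, that the mean sits at the centre of the specification so that the unacceptable proportion is split equally between the two tails.

```python
from statistics import NormalDist

def max_sd_for_tolerance(half_width, max_fraction_outside):
    """Largest sd for which at most the given fraction lies outside mean +/- half_width."""
    tail = max_fraction_outside / 2            # equal share in each tail
    z = NormalDist().inv_cdf(1 - tail)         # z-value cutting off the upper tail
    return half_width / z

sigma = max_sd_for_tolerance(0.5, 0.002)       # 100 +/- 0.5 ohms, at most 0.2% outside

# Check: the proportion outside the limits with this sd.
check = 2 * (1 - NormalDist(100, sigma).cdf(100.5))

print(f"required standard deviation: {sigma:.4f} ohms")
print(f"proportion outside limits: {check:.4%}")
```

The value of z obtained this way is slightly below the tabulated 3.1, since Table 1 is only given in steps of 0.1 beyond Z = 3, but the standard deviation agrees to two decimal places.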
7. Probability intervals - standard normal distribution

We use probability models to make predictions in situations where there is not sufficient data available
to make a definite statement. Any statement based on these models carries with it a risk of being
proved incorrect by events. Notice that the normal probability curve extends to infinity in both
directions. Theoretically any value of the normal random variable is possible, although, of course,
values far from the mean position (zero) are very unlikely.
Consider the diagram in Figure 13:
Figure 13: The standard normal curve with the central 95% of the area, between −1.96 and 1.96, shaded
The shaded area is 95% of the total area. If we look at the entry in Table 1 (at the end of the
Workbook) corresponding to Z = 1.96 we see the value 4750. This means that the probability of
Z taking a value between 0 and 1.96 is 0.475. By symmetry, the probability that Z takes a value
between −1.96 and 0 is also 0.475. Combining these results we see that
P(−1.96 < Z < 1.96) = 0.95 or 95%
We say that the 95% probability interval for Z (about its mean of 0) is (−1.96, 1.96). It follows that
there is a 5% chance that Z lies outside this interval.
Task
Find the 99% probability interval for Z about its mean, i.e. the value of z1 in the
diagram:
(Diagram: the standard normal curve with the area between $-z_1$ and $z_1$ shaded.)
The shaded area is 99% of the total area
First, note that 99% corresponds to a probability of 0.99. Find $z_1$ such that
\[ P(0 < Z < z_1) = \tfrac{1}{2} \times 0.99 = 0.495: \]
Your solution
Answer
We look for a table value of 4950. The nearest we get is 4949 and 4951 corresponding to Z = 2.57
and Z = 2.58 respectively. We choose Z = 2.58.
Now quote the 99% probability interval:

Your solution
Answer
(−2.58, 2.58) or −2.58 < Z < 2.58.
Notice that the risk of Z lying outside this wider interval is reduced to 1%.
Task
Find the value of Z
(a) which is exceeded on 5% of occasions
(b) which is exceeded on 99% of occasions.
Your solution
Answer
(a) The value is z1, where P(Z > z1) = 0.05. Hence P(0 < Z < z1) = 0.5 − 0.05 = 0.45. This
corresponds to a table entry of 4500. The nearest values are 4495 (Z = 1.64) and 4505 (Z = 1.65).
Hence the required value is z1 = 1.65.
(b) Values less than z1 occur on 1% of occasions. By symmetry values greater than (−z1) occur
on 1% of occasions so that P(0 < Z < −z1) = 0.49. The nearest table entry to 4900 is
4901 (Z = 2.33).
Hence the required value is z1 = −2.33.
8. Probability intervals - general normal distribution

We saw in subsection 7 that 95% of the area under the standard normal curve lies between $z_1 = -1.96$
and $z_2 = 1.96$. Using the formula $Z = \dfrac{X-\mu}{\sigma}$ in the rearranged form $X = \mu + Z\sigma$, we can see that
95% of the area under the general normal curve lies between $x_1 = \mu - 1.96\sigma$ and $x_2 = \mu + 1.96\sigma$.
Figure 14: The general normal curve with the central 95% of the area, between $\mu - 1.96\sigma$ and $\mu + 1.96\sigma$, marked
Example 14
Suppose that the internal diameters of mass-produced pipes are normally distributed
with mean 50 mm and standard deviation 2 mm. What are the 95%
probability limits on the internal diameter of a single pipe?
Solution
Here µ = 50 and σ = 2 so that the 95% probability limits are
50 ± 1.96 × 2 = 50 ± 3.92 mm
i.e. 46.08 mm and 53.92 mm.
The probability interval is (46.08, 53.92).
Task
What is the 99% probability interval for the lifetime of a bulb when the lifetimes
of such bulbs are normally distributed with a mean of 2000 hours and standard
deviation of 40 hours?
First sketch the standard normal curve marking the values z1 , z2 between which 99% of the area
under the curve is located:
Your solution
Answer
(Diagram: by symmetry, the shaded area between $-z_1$ and 0 is equal in value to the shaded area between 0 and $z_1$.)
Now deduce the corresponding values x1 , x2 for the general normal distribution:
Your solution
Answer
x1 = µ − 2.58σ, x2 = µ + 2.58σ
Next, find the values for x1 and x2 for the given problem:
Your solution
Answer
x1 = 2000 − 2.58 × 40 = 1896.8 hours
x2 = 2000 + 2.58 × 40 = 2103.2 hours
Finally, write down the 99% probability interval for the lifetimes:
Your solution
Answer
(1896.8 hours, 2103.2 hours).
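Probability intervals of this kind follow a single pattern, $\mu \pm z\sigma$, where $z$ leaves half of the excluded probability in each tail. The sketch below is illustrative: `probability_interval` is a hypothetical helper, and `NormalDist().inv_cdf` from the Python standard library supplies $z$ in place of Table 1.

```python
from statistics import NormalDist

def probability_interval(mean, sd, level):
    """Symmetric interval about the mean containing the fraction `level` of N(mean, sd^2)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. level 0.95 gives the 0.975 point
    return mean - z * sd, mean + z * sd

z95 = probability_interval(0, 1, 0.95)          # standard normal, 95%
pipes = probability_interval(50, 2, 0.95)       # Example 14
bulbs = probability_interval(2000, 40, 0.99)    # bulb lifetimes, 99%

print(f"95% interval for Z: ({z95[0]:.2f}, {z95[1]:.2f})")
print(f"pipes, 95%: ({pipes[0]:.2f} mm, {pipes[1]:.2f} mm)")
print(f"bulbs, 99%: ({bulbs[0]:.1f} hours, {bulbs[1]:.1f} hours)")
```

The bulb interval comes out marginally narrower than the answer above because the computed 99% value of z is a little below the rounded table value 2.58.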
The Normal
Approximation to the
Binomial Distribution 39.2
Introduction
We have already seen that the Poisson distribution can be used to approximate the binomial distribution
for large values of n and small values of p provided that the correct conditions exist. The
approximation is only of practical use if just a few terms of the Poisson distribution need be calculated.
In cases where many - sometimes several hundred - terms need to be calculated the arithmetic
involved becomes very tedious indeed and we turn to the normal distribution for help. It is possible,
of course, to use high-speed computers to do the arithmetic but the normal approximation to the
binomial distribution negates the necessity of this in a fairly elegant way. In the problem situations
which follow this introduction the normal distribution is used to avoid very tedious arithmetic while
at the same time giving a very good approximate solution.
Prerequisites
Before starting this Section you should . . .
• be familiar with the normal distribution and the standard normal distribution
• be able to calculate probabilities using the standard normal distribution
Learning Outcomes
• recognise when it is appropriate to use the normal approximation to the binomial distribution
• solve problems using the normal approximation to the binomial distribution
• interpret the answer obtained using the normal approximation in terms of the original problem
1. The normal approximation to the binomial distribution
A typical problem
An engineering professional body estimates that 75% of the students taking undergraduate engineering
courses are in favour of studying statistics as part of their studies. If this estimate is correct,
what is the probability that more than 780 undergraduate engineers out of a random sample of 1000
will be in favour of studying statistics?
Discussion
The problem involves a binomial distribution with a large value of n and so very tedious arithmetic
may be expected. This can be avoided by using the normal distribution to approximate the binomial
distribution underpinning the problem.
If X represents the number of engineering students in favour of studying statistics, then
X ∼ B(1000, 0.75)
Essentially we are asked to find the probability that X is greater than 780, that is P(X > 780).
The calculation is represented by the following statement
P(X > 780) = P(X = 781) + P(X = 782) + P(X = 783) + · · · + P(X = 1000)
In order to complete this calculation we have to find all 220 terms on the right-hand side of the
expression. To get some idea of just how big a task this is when the binomial distribution is used,
imagine applying the formula
\[ P(X = r) = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r(r-1)(r-2)\cdots 3\cdot 2\cdot 1}\, p^r (1-p)^{n-r} \]
220 times! You would have to take n = 1000, p = 0.75 and vary r from 781 to 1000. Clearly, the
task is enormous.
Fortunately, we can approximate the answer very closely by using the normal distribution with the
same mean and standard deviation as X ∼ B(1000, 0.75). Applying the usual formulae for µ and σ
we obtain the values µ = 750 and σ = 13.7 from the binomial distribution.
We now have two distributions, $X \sim B(1000, 0.75)$ and (say) $Y \sim N(750, 13.7^2)$. Remember
that the second parameter represents the variance. By doing the appropriate calculations (this is
extremely tedious even for one term!) it can be shown that
\[ P(X = 781) \approx P(780.5 \le Y \le 781.5) \]
This statement means that the probability that X = 781 calculated from the binomial distribution
$X \sim B(1000, 0.75)$ can be very closely approximated by the area under the normal curve
$Y \sim N(750, 13.7^2)$ between 780.5 and 781.5. This relationship is then applied to all 220 terms involved
in the calculation.
The result is summarised below:
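\[ \begin{aligned} P(X = 781) &\approx P(780.5 \le Y \le 781.5) \\ P(X = 782) &\approx P(781.5 \le Y \le 782.5) \\ &\ \ \vdots \\ P(X = 999) &\approx P(998.5 \le Y \le 999.5) \\ P(X = 1000) &\approx P(999.5 \le Y \le 1000.5) \end{aligned} \]
By adding these probabilities together we get
\[ \begin{aligned} P(X > 780) &= P(X = 781) + P(X = 782) + \cdots + P(X = 1000) \\ &\approx P(780.5 \le Y \le 1000.5) \end{aligned} \]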
To complete the calculation we need only to find the area under the curve $Y \sim N(750, 13.7^2)$ between
the values 780.5 and 1000.5. This is far easier than completing the 220 calculations suggested by
the use of the binomial distribution.
This area is easily found by following the procedure used previously. The calculation, using the
tables on page 15 and working to three decimal places, is
\[ \begin{aligned} P(X > 780) &\approx P\left(\frac{780.5-750}{13.7} \le Z \le \frac{1000.5-750}{13.7}\right) \\ &= P(2.23 \le Z \le 18.28) \\ &= P(Z \ge 2.23) \\ &= 0.013 \end{aligned} \]
Notes:
1. Since values as high as 18.28 effectively tell us to find the area to the right of 2.23 (the area
to the right of 18.28 is so close to zero as to make no difference) we have
P(Z ≥ 2.23) = 0.0129 ≈ 0.013
2. The solution given assumes that the original binomial distribution can be approximated by a
normal distribution. This is not always the case and you must always check that the following
conditions are satisfied before you apply a normal approximation. The conditions are:
• np > 5
• n(1 − p) > 5
You can see that these conditions are satisfied here.
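With a computer the 220 binomial terms can in fact be summed directly, which gives a way of judging how good the approximation is. The sketch below is not part of the Workbook: the function names are hypothetical, the binomial probabilities are evaluated through logarithms (`math.lgamma`) purely to avoid overflow in the large factorials, and the normal tail is taken as the whole area to the right of 780.5, which differs negligibly from stopping at 1000.5.

```python
import math
from statistics import NormalDist

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p), computed via logarithms for numerical safety."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
               + r * math.log(p) + (n - r) * math.log(1 - p))
    return math.exp(log_pmf)

def exact_upper_tail(k, n, p):
    """P(X > k): the sum of the binomial terms for r = k+1, ..., n."""
    return sum(binomial_pmf(r, n, p) for r in range(k + 1, n + 1))

def normal_upper_tail(k, n, p):
    """Normal approximation to P(X > k) with the continuity correction k + 0.5."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return 1 - NormalDist(mu, sigma).cdf(k + 0.5)

n, p, k = 1000, 0.75, 780
conditions_ok = n * p > 5 and n * (1 - p) > 5   # the two conditions listed above

exact = exact_upper_tail(k, n, p)
approx = normal_upper_tail(k, n, p)

print(f"conditions satisfied: {conditions_ok}")
print(f"exact binomial sum:   {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

Comparing the two printed values shows the size of the error introduced by replacing the binomial distribution with the normal curve in this problem.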
Task
A particular production process used to manufacture ferrite magnets used to operate
reed switches in electronic meters is known to give 10% defective magnets
on average. If 200 magnets are randomly selected, what is the probability that the
number of defective magnets is between 24 and 30?
Your solution
Answer
If X is the number of defective magnets then X ∼ B(200, 0.1) and we require
P(24 < X < 30) = P(25 ≤ X ≤ 29)
Now,
\[ \mu = np = 200 \times 0.1 = 20 \quad\text{and}\quad \sigma = \sqrt{np(1-p)} = \sqrt{200 \times 0.1 \times 0.9} = 4.24 \]
Note that np > 5 and n(1 − p) > 5 so that approximating $X \sim B(200, 0.1)$ by the normal
distribution $Y \sim N(20, 4.24^2)$ is acceptable. We use the transformation
\[ Z = \frac{Y - 20}{4.24} \sim N(0, 1) \]
so that
\[ \begin{aligned} P(25 \le X \le 29) &\approx P(24.5 \le Y \le 29.5) \\ &= P\left(\frac{24.5-20}{4.24} \le Z \le \frac{29.5-20}{4.24}\right) \\ &= P(1.06 \le Z \le 2.24) \\ &= 0.4875 - 0.3554 = 0.1321 \end{aligned} \]
Example 15
Overbooking of passengers on intercontinental flights is a common practice among
airlines. Aircraft which are capable of carrying 300 passengers are booked to carry
320 passengers. If on average 10% of passengers who have a booking fail to turn
up for their flights, what is the probability that at least one passenger who has a
booking will end up without a seat on a particular flight?
Solution
Let p = P(a passenger with a booking, fails to turn up) = 0.10.
Then: q = P(a passenger with a booking, turns up) = 1 − p = 1 − 0.10 = 0.9
Let X = number of passengers with a booking who turn up.
As there are 320 bookings, we are dealing with the terms of the binomial expansion of
\[ (q + p)^{320} = q^{320} + 320\, q^{319} p + \frac{320 \times 319}{2!}\, q^{318} p^2 + \cdots + p^{320} \]
Finding the required probability by calculating these values term by term would take too long. It is
easier to switch to the corresponding normal distribution, i.e. that which has the same mean and
variance as the binomial distribution above.
Mean = µ = 320 × 0.9 = 288
Variance = 320 × 0.9 × 0.1 = 28.8 so $\sigma = \sqrt{28.8} = 5.37$
Hence, the corresponding normal distribution is given by Y ∼ N(288, 28.8)
So that
\[ P(X > 300) \approx P(Y \ge 300.5) = P\left(Z \ge \frac{300.5 - 288}{5.37}\right) = P(Z \ge 2.33) \]
From Z-tables P(Z ≥ 2.33) = 0.0099.
NB. Continuity correction is needed when changing from the binomial, a discrete distribution, to
the normal, a continuous distribution.
Exercises
1. The diameter of an electric cable is normally distributed with mean 0.8 cm and variance 0.0004 cm².
(a) What is the probability that the diameter will exceed 0.81 cm?
(b) The cable is considered defective if the diameter differs from the mean by more than
0.025 cm. What is the probability of obtaining a defective cable?
2. A machine packs sugar in what are nominally 2 kg bags. However there is a variation in
the actual weight which is described by the normal distribution.
(a) Previous records indicate that the standard deviation of the distribution is 0.02 kg
and the probability that the bag is underweight is 0.01. Find the mean value of the
distribution.
(b) It is hoped that an improvement to the machine will reduce the standard deviation while
allowing it to operate with the same mean value. What value standard deviation is needed
to ensure that the probability that a bag is underweight is 0.001?
3. Rods are made to a nominal length of 4 cm but in fact the length is a normally distributed
random variable with mean 4.01 cm and standard deviation 0.03 cm. Each rod costs 6p to make
and may be used immediately if its length lies between 3.98 cm and 4.02 cm. If its length is
less than 3.98 cm the rod cannot be used but has a scrap value of 1p. If the length exceeds
4.02 cm it can be shortened and used at a further cost of 2p. Find the average cost per usable
rod.
4. A supermarket chain sells its ‘own-brand’ label instant coffee in packets containing 200 gm of
coffee granules. The packets are filled by a machine which is set to dispense fills of 200 gm. If
fills are normally distributed, about a mean of 200 gm and with a standard deviation of 7 gm,
find the number of packets out of a consignment of 1,000 packets that:
(a) contain more than 215 gm
(b) contain less than 195 gm
(c) contain between 190 and 210 gm
The supermarket chain decides to withdraw all packets with less than a certain weight of coffee.
As a result, 40 packets which were in the consignment of 1,000 packets are withdrawn. What
is the weight at which the ‘line has been drawn’ ?
5. The time taken by a team to complete the assembly of an electrical component is found to
be normally distributed, about a mean of 110 minutes, and with a standard deviation of 10
minutes.
(a) Out of a group of 20 teams, how many will complete the assembly:
(i) within 95 minutes. (ii) in more than 2 hours.

(b) If the management decides to set a ‘cut off’ time such that 95% of the teams will have
completed the assembly on time, what time limit should be set?
Answers
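1. $X \sim N(0.8, 0.0004)$
(a)
\[ \begin{aligned} P(X > 0.81) &= P\left(Z > \frac{0.81 - 0.8}{0.02}\right) \\ &= P(Z > 0.5) = 0.5 - P(0 < Z < 0.5) = 0.5 - 0.1915 = 0.3085 \end{aligned} \]
(b)
\[ \begin{aligned} P[(X > 0.825) \cup (X < 0.775)] &= 2P(X > 0.825) \\ &= 2P\left(Z > \frac{0.025}{0.02}\right) = 2P(Z > 1.25) \\ &= 2[-P(0 < Z < 1.25) + 0.5] = 2[-0.3944 + 0.5] = 0.2112 \end{aligned} \]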

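2. (a) $\sigma = 0.02$ and $P(X < 2) = 0.01$. We need to find $\mu$ from $P\left(Z < \frac{2 - \mu}{0.02}\right) = 0.01$.
\[ \therefore\ 0.5 - P\left(0 < Z < \frac{\mu - 2}{0.02}\right) = 0.01 \qquad \therefore\ \frac{\mu - 2}{0.02} = 2.33 \qquad \therefore\ \mu = 2.0466 \]
(b) Now we require $\sigma$ such that $P(X < 2) = 0.001$ with $\mu = 2.0466$, i.e.
\[ 0.5 - P\left(0 < Z < \frac{0.0466}{\sigma}\right) = 0.001 \]
\[ \therefore\ P\left(0 < Z < \frac{0.0466}{\sigma}\right) = 0.499 \qquad \therefore\ \frac{0.0466}{\sigma} = 3.1 \qquad \therefore\ \sigma = 0.015 \]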
3. $L \sim N(4.01, (0.03)^2)$
Cost has 2 possible values per usable rod: 6p, 8p.
\[ P(C = 6) = P(3.98 < L < 4.02) = P\left(0 < Z < \frac{4.01 - 3.98}{0.03}\right) + P\left(0 < Z < \frac{4.02 - 4.01}{0.03}\right) \]
\[ = P(0 < Z < 1) + P(0 < Z < 0.333) = 0.3413 + 0.1305 = 0.4718 \]
\[ P(C = 8) = P(L > 4.02) = P(Z > 0.333) = 0.5 - P(0 < Z < 0.333) = 0.3695 \]
The remaining proportion, 1 − 0.4718 − 0.3695 = 0.1587, is scrap.
For every 100 rods produced:
                                                                     Total (p)
47.18 are immediately usable, costing 6p each                        283.08
36.95 are usable after shortening, costing 8p each                   295.60
15.87 are scrap, costing 5p each (6p to make less 1p scrap value)    79.35
\[ \text{Average cost per usable rod} = \frac{283.08 + 295.6 + 79.35}{84.13} = 7.82\text{p} \]
4. Let X = the amount of coffee in a fill; then $X \sim N(200, 7^2)$.
(a)
\[ P(X > 215) = P\left(Z > \frac{215.0 - 200.0}{7.0}\right) = P(Z > 2.14) = 0.016 \text{ from Z-tables.} \]
Hence, from a consignment of 1000 packets, the number containing more than
215 gm = 1000 × 0.016 = 16
(b)
\[ P(X < 195) = P\left(Z < \frac{195.0 - 200.0}{7.0}\right) = P(Z < -0.714) = 0.2389 \text{ from Z-tables.} \]
Hence, from a consignment of 1000 packets, the number containing less than
195 gm = 1000 × 0.2389 = 238.9, i.e. about 239
(c)
\[ P(190.0 < X < 210.0) = P\left(\frac{190.0 - 200.0}{7.0} < Z < \frac{210.0 - 200.0}{7.0}\right) = P(-1.43 < Z < 1.43) = 0.8472 \text{ from Z-tables.} \]
Hence, from a consignment of 1000 packets, the number containing between
190 gm and 210 gm = 1000 × 0.8472 = 847
If 40 out of the 1000 packets are withdrawn, then P(sub-standard packet) = 40/1000 = 0.04.
Let k be the limit below which packets are sub-standard, then P(X < k) = 0.04
From Z-tables, Z = −1.75 as we are dealing with ‘less than’ i.e. the ‘left-hand’ part of the
standard normal distribution curve.
\[ \text{Hence } \frac{k - 200.0}{7} = -1.75 \quad \text{i.e. } k = -1.75(7) + 200.0 = 187.75 \]
‘Line drawn’ at 188 gm; any packet below this value to be withdrawn.
5. Let X be the time taken to assemble the component; then $X \sim N(110, 10^2)$.
(a) (i)
\[ P(X < 95) = P\left(Z < \frac{95.0 - 110.0}{10.0}\right) = P(Z < -1.5) = 0.5 - 0.4332 = 0.0668 \text{ from Z-tables} \]
Hence, from a group of 20 teams, the number completing the assembly within
95 minutes = 20 × 0.0668 = 1.34 so the number of teams is 1.
(ii)
\[ P(X > 120) = P\left(Z > \frac{120.0 - 110.0}{10.0}\right) = P(Z > 1.0) = 0.1587 \text{ from Z-tables} \]
Hence, from a group of 20 teams, the number completing the assembly in more than
2 hours = 20 × 0.1587 = 3.174 so the number of teams is 3.
(b) If 95% of teams are to complete the assembly ‘on time’, then 5% take longer than the set time,
k, and P(X > k) = 0.05 hence, Z = 1.64
\[ \text{Therefore } \frac{k - 110.0}{10.0} = 1.64 \quad \text{or } k = 10(1.64) + 110.0 = 126.4 \text{ minutes.} \]
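The table look-ups in answer 5 can be cross-checked numerically. The following is a minimal sketch, assuming Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the variable names are illustrative only. Values computed this way can differ in the last digit or two from values read from four-figure tables.

```python
from statistics import NormalDist

# Assembly time X ~ N(110, 10^2), in minutes (mean and standard deviation from the exercise).
assembly_time = NormalDist(mu=110, sigma=10)
teams = 20

p_within_95 = assembly_time.cdf(95)           # P(X < 95)
p_over_2_hours = 1 - assembly_time.cdf(120)   # P(X > 120)
cut_off = assembly_time.inv_cdf(0.95)         # time k with P(X < k) = 0.95

print(f"P(X < 95)  = {p_within_95:.4f}, expected teams = {teams * p_within_95:.2f}")
print(f"P(X > 120) = {p_over_2_hours:.4f}, expected teams = {teams * p_over_2_hours:.2f}")
print(f"95% cut-off time = {cut_off:.1f} minutes")
```

The inverse look-up uses the exact 95th percentile of the standard normal distribution (about 1.645), so the cut-off comes out very slightly above the figure obtained with the rounded table value 1.64.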
Sums and
Differences of
Random Variables 39.3
Introduction
In some situations, it is possible to easily describe a problem in terms of sums and differences of
random variables. Consider a typical situation in which shafts are fitted to cylindrical sleeves. One
random variable is used to describe the variability of the diameter of the shaft, and one is used to
describe the variability of the sleeves. Clearly, we need to know how the total variability involved
affects the fitting of shafts and sleeves. In this Section, we will confine ourselves to cases where the
random variables are normally distributed and independent.

Prerequisites
Before starting this Section you should . . .
• be familiar with the results and concepts met in the study of probability
• be familiar with the normal distribution

Learning Outcomes
On completion you should be able to . . .
• describe a variety of problems in terms of sums and differences of normal random variables
• solve problems described in terms of sums and differences of normal random variables
1. Sums and differences of random variables
In some situations, we can specify a problem in terms of sums and differences of random variables.
Here we confine ourselves to cases where the random variables are normally distributed. Typical
situations may be understood by considering the following problems.
Problem 1
In a certain mass-produced assembly, a 3 cm shaft must slide into a cylindrical sleeve. Shafts are
manufactured whose diameter S follows a normal distribution $S \sim N(3, 0.004^2)$ and cylindrical sleeves
are manufactured whose internal diameter C follows a normal distribution $C \sim N(3.010, 0.003^2)$.
Assembly is performed by selecting a shaft and a cylindrical sleeve at random. In what proportion of
cases will it be impossible to fit the selected shaft and cylindrical sleeve together?
Discussion
Clearly, the shaft and cylindrical sleeve will fit together only if the diameter of the shaft is smaller than
the internal diameter of the cylindrical sleeve. We need the difference of the two random variables
C and S to be greater than zero. We can take the difference C − S and find its distribution. Once
we do this we can then ask the question “What is the probability that the inside diameter of the
cylindrical sleeve is greater than the outside diameter of the shaft, i.e. what is P(C − S > 0)?”
Essentially we are trying to ensure that the internal diameter of the cylindrical sleeve is larger than
the external diameter of the shaft.
Problem 2
A manufacturer produces boxes of woodscrews containing a variety of sizes for a local DIY store. The
weight W (in kilograms) of boxes of woodscrews manufactured is a normal random variable following
the distribution W ∼ N (1.01, 0.004). Note that 0.004 is the variance. Find the probability that a
customer who selects two boxes of screws at random finds that their combined weight is greater than
2.03 kilograms.
Discussion
In this problem we are looking at the effects of adding two random variables together. Since all
boxes are assumed to have weights W which follow the distribution W ∼ N (1.01, 0.004), we are
considering the effect of adding the random variable W to itself. In general, there is no reason why
we cannot combine variables $W_1 \sim N(\mu_1, \sigma_1^2)$ and $W_2 \sim N(\mu_2, \sigma_2^2)$. This might happen if the DIY
store bought in two similar products from two different manufacturers.
Before we can solve such problems, we need to obtain some results concerning the behaviour of
random variables.
Functions of several random variables

Note that we shall quote results only for the continuous case. The results for the discrete case
are similar with integration replaced by summation. We will omit the mathematics leading to these
results.
Key Point 4
• If $X_1, X_2, \ldots, X_n$ are n random variables then
\[ E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n) \]
• If $X_1, X_2, \ldots, X_n$ are n independent random variables then
\[ V(X_1 + X_2 + \cdots + X_n) = V(X_1) + V(X_2) + \cdots + V(X_n) \]
and more generally
\[ V(X_1 \pm X_2 \pm \cdots \pm X_n) = V(X_1) + V(X_2) + \cdots + V(X_n) \]
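The results in Key Point 4 can be checked empirically by simulation. The sketch below is only an illustration under stated assumptions: the two distributions N(5, 2²) and N(3, 1²), the seed and the number of trials are arbitrary choices, not values from the text. It draws independent values of X and Y and compares the mean and variance of X + Y and X − Y with the predicted values.

```python
import random
from statistics import mean, pvariance

rng = random.Random(39)   # fixed seed so that the run is repeatable
trials = 20_000

# Hypothetical independent normal variables: X ~ N(5, 2^2), Y ~ N(3, 1^2).
x = [rng.gauss(5.0, 2.0) for _ in range(trials)]
y = [rng.gauss(3.0, 1.0) for _ in range(trials)]

sums = [a + b for a, b in zip(x, y)]
differences = [a - b for a, b in zip(x, y)]

mean_sum, var_sum = mean(sums), pvariance(sums)
mean_diff, var_diff = mean(differences), pvariance(differences)

# Predicted: E(X + Y) = 8, E(X - Y) = 2, and V(X + Y) = V(X - Y) = 4 + 1 = 5.
print(f"sum:        mean {mean_sum:.3f}, variance {var_sum:.3f}")
print(f"difference: mean {mean_diff:.3f}, variance {var_diff:.3f}")
```

Note that the variance of the difference is close to 5, not 3: variances add even when the variables are subtracted.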
Example 16
Solve Problem 1 from the previous page. You may assume that the sum and
difference of two normal random variables are themselves normal.
Solution
Consider the random variable C − S. Using the results above we know that
\[ C - S \sim N(3.010 - 3.0,\ 0.004^2 + 0.003^2) \quad \text{i.e.} \quad C - S \sim N(0.01, 0.005^2) \]
\[ \text{Hence } P(C - S > 0) = P\left(Z > \frac{0 - 0.01}{0.005}\right) = P(Z > -2) = 0.9772 \]
This result implies that in 1 − 0.9772 = 0.0228, i.e. 2.28%, of cases it will be impossible to fit the shaft to the sleeve.
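The calculation in Example 16 can be sketched in code as follows. This is a minimal illustration, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `difference_of_normals` is hypothetical and is not part of any library.

```python
from math import sqrt
from statistics import NormalDist

def difference_of_normals(mean_a, var_a, mean_b, var_b):
    """Distribution of A - B for independent normal A and B (the variances add)."""
    return NormalDist(mean_a - mean_b, sqrt(var_a + var_b))

# Sleeve C ~ N(3.010, 0.003^2) and shaft S ~ N(3, 0.004^2), as in Problem 1.
clearance = difference_of_normals(3.010, 0.003**2, 3.0, 0.004**2)

p_fit = 1 - clearance.cdf(0.0)    # P(C - S > 0): the shaft fits the sleeve
p_no_fit = clearance.cdf(0.0)     # P(C - S <= 0): the shaft does not fit

print(f"standard deviation of C - S = {clearance.stdev:.4f}")
print(f"P(fit) = {p_fit:.4f}, P(no fit) = {p_no_fit:.4f}")
```

The same helper applies to a sum of two independent normal variables if the means are added instead of subtracted; the variances are added in both cases.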
Task
Solve Problem 2 from the previous page. You may assume that the sum and
difference of two normal random variables are themselves normal.
Your solution
Answer
If $W_{12}$ is the random variable representing the combined weight of the two boxes then
\[ W_{12} \sim N(2.02, 0.008) \]
Hence
\[ P(W_{12} > 2.03) = P\left(Z > \frac{2.03 - 2.02}{\sqrt{0.008}}\right) = P(Z > 0.1118) = 0.5 - 0.0445 = 0.4555 \]
The result implies that the customer has about a 46% chance of finding that the weight of the two
boxes combined is greater than 2.03 kilograms.
Exercises
1. Batteries of type A have mean voltage 6.0 (volts) and variance 0.0225 (volts²). Type B
batteries have mean voltage 12.0 and variance 0.04. If we form a series connection containing
one of each type what is the probability that the combined voltage exceeds 17.4?
2. Nuts and bolts are made separately and paired at random. The nuts’ diameters, in mm, are
independently N (10, 0.02) and the bolts’ diameters, in mm, are independently N (9.5, 0.02).
Find the probability that a bolt is too large for its nut.
3. Certain cutting tools have lifetimes, in hours, which are independent and normally distributed
with mean 300 and variance 10000.
(a) Find the probability that
(i) the total life of three tools is more than 1000 hours.
(ii) the total life of four tools is more than 1000 hours.
(b) In a factory each tool is replaced when it fails. Find the probability that exactly four tools
are needed to accumulate 1000 hours of use.
(c) Explain why the first sentence in this question can only be approximately, not exactly,
true.
4. A firm produces articles whose length, X, in cm, is normally distributed with nominal mean
µ = 4 and variance σ² = 0.1. From time to time a check is made to see whether the value
of µ has changed. A sample of ten articles is taken, the lengths are measured, the sample
mean length X̄ is calculated, and the process is adjusted if X̄ lies outside the range (3.9, 4.1).
Determine the probability, α, that the process is adjusted as a result of a sample taken when
µ = 4. Find the smallest sample size n which would make α ≤ 0.05.
Answers
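1. $X_A \sim N(6, 0.0225)$, $X_B \sim N(12, 0.04)$
Series: $X = X_A + X_B \sim N(18, 0.0625)$, since the variances of independent random variables add, so the standard deviation of X is 0.25.
\[ P(X > 17.4) = P\left(Z > \frac{-0.6}{0.25}\right) = 0.5 + P\left(0 < Z < \frac{0.6}{0.25}\right) = 0.5 + P(0 < Z < 2.4) = 0.5 + 0.4918 = 0.9918 \]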
2. Let the diameter of a nut be N. Let the diameter of a bolt be B. A bolt is too large for its
nut if N − B < 0.
E(N − B) = 10 − 9.5 = 0.5

V(N − B) = 0.02 + 0.02 = 0.04
N − B ∼ N (0.5, 0.04)

\[ P(N - B < 0) = P\left(\frac{N - B - 0.5}{0.2} < \frac{0 - 0.5}{0.2}\right) = P(Z < -2.5) \]
\[ = \Phi(-2.5) = 1 - \Phi(2.5) = 1 - 0.99379 = 0.00621. \]
The probability that a bolt is too large for its nut is 0.00621.
3. (a) Let the lifetime of tool i be Ti .
(i) E(T1 + T2 + T3 ) = 900

V(T1 + T2 + T3 ) = 30000
(T1 + T2 + T3 ) ∼ N (900, 30000)

\[ P(T_1 + T_2 + T_3 > 1000) = P\left(\frac{T_1 + T_2 + T_3 - 900}{\sqrt{30000}} > \frac{1000 - 900}{\sqrt{30000}}\right) = P(Z > 0.57735) \]
\[ = 1 - \Phi(0.57735) = 1 - 0.7181 = 0.2819 \]
(ii) E(T1 + T2 + T3 + T4 ) = 1200

V(T1 + T2 + T3 + T4 ) = 40000
(T1 + T2 + T3 + T4 ) ∼ N (1200, 40000)

\[ P(T_1 + T_2 + T_3 + T_4 > 1000) = P\left(\frac{T_1 + T_2 + T_3 + T_4 - 1200}{\sqrt{40000}} > \frac{1000 - 1200}{\sqrt{40000}}\right) = P(Z > -1) \]
\[ = 1 - \Phi(-1) = \Phi(1) = 0.8413 \]
(b) Let the number of tools needed be N.
P(N ≤ 3) = P(N = 1) + P(N = 2) + P(N = 3)

= P(T1 + T2 + T3 > 1000) = 0.2819
P(N ≤ 4) = P(N = 1) + P(N = 2) + P(N = 3) + P(N = 4)
= P(T1 + T2 + T3 + T4 > 1000) = 0.8413.
Hence P(N = 4) = P(N ≤ 4) − P(N ≤ 3) = 0.8413 − 0.2819 = 0.5594.

(c) Lifetimes can not be negative. The normal distribution assigns non-zero probability density
to negative values so it can only be an approximation in this case.
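4.
\[ X \sim N(4, 0.1) \]
\[ X_1 + \cdots + X_{10} \sim N(40, 1) \]
\[ \bar{X} = (X_1 + \cdots + X_{10})/10 \sim N(4, 0.01) \]
By symmetry $P(\bar{X} < 3.9) = P(\bar{X} > 4.1)$.
\[ P(\bar{X} < 3.9) = P\left(\frac{\bar{X} - 4}{0.1} < \frac{3.9 - 4}{0.1}\right) = P(Z < -1) = \Phi(-1) = 1 - \Phi(1) = 1 - 0.8413 = 0.1587 \]
Hence, for a sample of ten articles, α = 2 × 0.1587 = 0.3174.
More generally
\[ X_1 + \cdots + X_n \sim N(4n, 0.1n) \]
\[ \bar{X} = (X_1 + \cdots + X_n)/n \sim N(4, 0.1/n) \]
\[ P(\bar{X} < 3.9) = P\left(\frac{\bar{X} - 4}{\sqrt{0.1/n}} < \frac{3.9 - 4}{\sqrt{0.1/n}}\right) = P\left(Z < \frac{-0.1}{\sqrt{0.1/n}}\right) = P(Z < -\sqrt{0.1n}) \]
\[ = \Phi(-\sqrt{0.1n}) = 1 - \Phi(\sqrt{0.1n}) \]
\[ \alpha = 2[1 - \Phi(\sqrt{0.1n})] \]
We require α ≤ 0.05.
\begin{align*}
2[1 - \Phi(\sqrt{0.1n})] \le 0.05 &\Leftrightarrow 1 - \Phi(\sqrt{0.1n}) \le 0.025 \\
&\Leftrightarrow \Phi(\sqrt{0.1n}) \ge 0.975 \\
&\Leftrightarrow \sqrt{0.1n} \ge 1.96 \\
&\Leftrightarrow n \ge 10 \times 1.96^2 = 38.416
\end{align*}
The smallest sample size which satisfies this is n = 39.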
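The search for the smallest sample size can also be carried out directly. The following is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the function names and default arguments are illustrative, with the numerical values taken from the exercise.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def adjustment_probability(n, variance=0.1, half_width=0.1):
    """alpha = P(sample mean lies outside mu +/- half_width) when the true mean is mu."""
    standard_error = sqrt(variance / n)
    return 2 * (1 - Z.cdf(half_width / standard_error))

def smallest_sample_size(target=0.05):
    """Smallest n for which the adjustment probability does not exceed the target."""
    n = 1
    while adjustment_probability(n) > target:
        n += 1
    return n

alpha_10 = adjustment_probability(10)
n_min = smallest_sample_size()

print(f"alpha for n = 10: {alpha_10:.4f}")
print(f"smallest n with alpha <= 0.05: {n_min}")
```

A simple linear search is adequate here because α decreases steadily as n increases and the required n is small.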
Table 1: The Standard Normal Probability Integral

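Each entry gives the four decimal digits of P(0 < Z < z); for example, the entry 1915 for z = 0.50 means 0.1915.
$Z = \frac{x - \mu}{\sigma}$ 0 1 2 3 4 5 6 7 8 9
0 0000 0040 0080 0120 0160 0199 0239 0279 0319 0359
.1 0398 0438 0478 0517 0557 0596 0636 0675 0714 0753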
.2 0793 0832 0871 0909 0948 0987 1026 1064 1103 1141
.3 1179 1217 1255 1293 1331 1368 1406 1443 1480 1517
.4 1555 1591 1628 1664 1700 1736 1772 1808 1844 1879
.5 1915 1950 1985 2019 2054 2088 2123 2157 2190 2224
.6 2257 2291 2324 2357 2389 2422 2454 2486 2517 2549
.7 2580 2611 2642 2673 2703 2734 2764 2794 2822 2852
.8 2881 2910 2939 2967 2995 3023 3051 3078 3106 3133
.9 3159 3186 3212 3238 3264 3289 3315 3340 3365 3389
1.0 3413 3438 3461 3485 3508 3531 3554 3577 3599 3621
1.1 3643 3665 3686 3708 3729 3749 3770 3790 3810 3830
1.2 3849 3869 3888 3907 3925 3944 3962 3980 3997 4015
1.3 4032 4049 4066 4082 4099 4115 4131 4147 4162 4177
1.4 4192 4207 4222 4236 4251 4265 4279 4292 4306 4319
1.5 4332 4345 4357 4370 4382 4394 4406 4418 4429 4441
1.6 4452 4463 4474 4484 4495 4505 4515 4525 4535 4545
1.7 4554 4564 4573 4582 4591 4599 4608 4616 4625 4633
1.8 4641 4649 4656 4664 4671 4678 4686 4693 4699 4706
1.9 4713 4719 4726 4732 4738 4744 4750 4756 4761 4767
2.0 4772 4778 4783 4788 4793 4798 4803 4808 4812 4817
2.1 4821 4826 4830 4834 4838 4842 4846 4850 4854 4857
2.2 4861 4865 4868 4871 4875 4878 4881 4884 4887 4890
2.3 4893 4896 4898 4901 4904 4906 4909 4911 4913 4916
2.4 4918 4920 4922 4925 4927 4929 4931 4932 4934 4936
2.5 4938 4940 4941 4943 4946 4947 4948 4949 4951 4952
2.6 4953 4955 4956 4957 4959 4960 4961 4962 4963 4964
2.7 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974
2.8 4974 4975 4976 4977 4977 4978 4979 4979 4980 4981
2.9 4981 4982 4982 4983 4984 4984 4985 4985 4986 4986
3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9
4987 4990 4993 4995 4997 4998 4998 4999 4999 4999
Contents 40
Sampling Distributions
and Estimation
40.1 Sampling Distributions
40.2 Interval Estimation for the Variance
Learning outcomes
You will learn about the distributions which are created when a population is sampled. For
example, every sample will have a mean value; this gives rise to a distribution of mean
values. We shall look at the behaviour of this distribution. We shall also look at the
problem of estimating the true value of a population mean (for example) from a given
sample.
Sampling
Distributions 40.1
Introduction
When you are dealing with large populations, for example populations created by the manufacturing
processes, it is impossible, or very difficult indeed, to deal with the whole population and know
the parameters of that population. Items such as car components, electronic components, aircraft
components or ordinary everyday items such as light bulbs, cycle tyres and cutlery effectively form
infinite populations. Hence we have to deal with samples taken from a population and estimate
those population parameters that we need. This Workbook will show you how to calculate single
number estimates of parameters - called point estimates - and interval estimates of parameters -
called interval estimates or confidence intervals. In the latter case you will be able to calculate a
range of values and state the confidence that the true value of the parameter you are estimating lies
in the range you have found.
Prerequisites
Before starting this Section you should . . .
• understand and be able to calculate means and variances
• be familiar with the results and concepts met in the study of probability
• be familiar with the normal distribution

Learning Outcomes
• sample and sampling distribution
• explain the importance of sampling in the application of statistics
• explain the terms point estimate and the term interval estimate
• calculate point estimates of means and variances
• find interval estimates of population parameters for given levels of confidence
1. Sampling
Why sample?
Considering samples from a distribution enables us to obtain information about a population where
we cannot, for reasons of practicality, economy, or both, inspect the whole of the population. For
example, it is impossible to check the complete output of some manufacturing processes. Items such
as electric light bulbs, nuts, bolts, springs and light emitting diodes (LEDs) are produced in their
millions and the sheer cost of checking every item as well as the time implications of such a checking
process render it impossible. In addition, testing is sometimes destructive - one would not wish to
destroy the whole production of a given component!
Populations and samples

If we choose n items from a population, we say that the size of the sample is n. If we take many
samples, the means of these samples will themselves have a distribution which may be different from
the population from which the samples were chosen. Much of the practical application of sampling
theory is based on the relationship between the ‘parent’ population from which samples are drawn
and the summary statistics (mean and variance) of the ‘offspring’ population of sample means. Not
surprisingly, in the case of a normal ‘parent’ population, the distribution of the population and the
distribution of the sample means are closely related. What is surprising is that even in the case of a
non-normal parent population, the ‘offspring’ population of sample means is usually (but not always)
normally distributed provided that the samples taken are large enough. In practice the term ‘large’
is usually taken to mean about 30 or more. The behaviour of the distribution of sample means is
based on the following result from mathematical statistics.
The central limit theorem

In what follows, we shall assume that the members of a sample are chosen at random from a
population. This implies that the members of the sample are independent. We have already met the
Central Limit Theorem. Here we will consider it in more detail and illustrate some of the properties
resulting from it.
Much of the theory (and hence the practice) of sampling is based on the Central Limit Theorem.
While we will not be looking at the proof of the theorem (it will be illustrated where practical) it is
necessary that we understand what the theorem says and what it enables us to do. Essentially, the
Central Limit Theorem says that if we take large samples of size n with mean X̄ from a population
which has a mean µ and standard deviation σ then the distribution of sample means X̄ is approximately
normally distributed with mean µ and standard deviation $\sigma/\sqrt{n}$.
That is, the sampling distribution of the mean X̄ follows the distribution
\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]
where the second parameter shown here is the standard deviation, not the variance.
Strictly speaking we require $\sigma^2 < \infty$, and it is important to note that no claim is made about the
way in which the original distribution behaves, and it need not be normal. This is why the Central
Limit Theorem is so fundamental to statistical practice. One implication is that a random variable
which takes the form of a sum of many components which are random but not necessarily normal
will itself be normal provided that the sum is not dominated by a small number of components. This
explains why many biological variables, such as human heights, are normally distributed.
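In the case where the original distribution is normal, the relationship between the original distribution
$X \sim N(\mu, \sigma)$ and the distribution of sample means $\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ is shown below.
[Figure 1: the curve of $X \sim N(\mu, \sigma)$ and the narrower, taller curve of $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, both centred on µ]
Figure 1
The distributions of X and X̄ have the same mean µ but X̄ has the smaller standard deviation $\sigma/\sqrt{n}$.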
The theorem says that we must take large samples. If we take small samples, the theorem only
holds if the original population is normally distributed.
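The behaviour described by the Central Limit Theorem can be illustrated by simulation. The sketch below is an illustration only, under assumptions that are not in the text: the parent population is taken to be uniform on [0, 1) (clearly non-normal, with µ = 0.5 and σ = √(1/12)), and the seed, sample size and number of samples are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(40)     # fixed seed so that the run is repeatable
n = 30                      # sample size, "large" in the sense used above
number_of_samples = 5000

# Assumed parent population: uniform on [0, 1), which is not normal.
mu = 0.5
sigma = sqrt(1 / 12)

# Mean of each of the simulated samples.
sample_means = [sum(rng.random() for _ in range(n)) / n
                for _ in range(number_of_samples)]

observed_mean = mean(sample_means)
observed_sd = pstdev(sample_means)
predicted_sd = sigma / sqrt(n)

print(f"mean of sample means: {observed_mean:.4f} (population mean {mu})")
print(f"s.d. of sample means: {observed_sd:.4f} (predicted sigma/sqrt(n) = {predicted_sd:.4f})")
```

A histogram of `sample_means` would show the familiar bell shape even though the parent distribution is flat.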
Standard error of the mean

You will meet this term often if you read statistical texts. It is the name given to the standard
deviation of the population of sample means. The name stems from the fact that there is some
uncertainty in the process of predicting the original population mean from the mean of a sample or
samples.
Key Point 1
For a sample of n independent observations from a population with variance σ², the standard error
of the mean is $\sigma_n = \frac{\sigma}{\sqrt{n}}$.
Remember that this quantity is simply the standard deviation of the distribution of sample means.
Finite populations
When we sample without replacement from a population which is not infinitely large, the observations
are not independent. This means that we need to make an adjustment in the standard error of the
mean. In this case the standard error of the sample mean is given by the related but more complicated
formula
\[ \sigma_{n,N} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]
where $\sigma_{n,N}$ is the standard error of the sample mean, N is the population size and n is the sample
size.
Note that, in cases where the size of the population N is large in comparison to the sample size n,
the quantity
\[ \frac{N - n}{N - 1} \approx 1 \]
so that the standard error of the mean is approximately $\sigma/\sqrt{n}$.
Illustration - a distribution of sample means
It is possible to illustrate some of the above results by setting up a small population of numbers
and looking at the properties of small samples drawn from it. Notice that setting up a small
population, say of size 5, and taking samples of size 2 enables us to deal with the totality of samples:
there are $\binom{5}{2} = \frac{5!}{2!\,3!}$ = 10 distinct samples possible. If instead we take a population of 100 and
draw samples of size 10, there are $\binom{100}{10} = \frac{100!}{10!\,90!}$ = 17,310,309,456,440 possible distinct samples
and, from a practical point of view, we could not possibly list them all, let alone work with them!
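The counts of distinct samples are binomial coefficients and can be evaluated exactly. A minimal sketch, assuming Python 3.8 or later, where `math.comb` is available in the standard library:

```python
from math import comb

# Number of distinct samples of size n drawn without replacement from N items.
small_case = comb(5, 2)      # population of 5, samples of size 2
large_case = comb(100, 10)   # population of 100, samples of size 10

print(f"C(5, 2)    = {small_case}")
print(f"C(100, 10) = {large_case:,}")
```

Python integers have arbitrary precision, so the larger count is computed exactly rather than as a floating-point approximation.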
Suppose we take a population consisting of the five numbers 1, 2, 3, 4 and 5 and draw samples of
size 2 to work with. The complete set of possible samples is:
(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)
For the parent population, since we know that the mean µ = 3, then we can calculate the standard
deviation by
\[ \sigma = \sqrt{\frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5}} = \sqrt{\frac{10}{5}} = 1.4142 \]
For the population of sample means,
1.5, 2, 2.5, 3, 2.5, 3, 3.5, 3.5, 4, 4.5
their mean and standard deviation are given by the calculations:
\[ \frac{1.5 + 2 + 2.5 + 3 + 2.5 + 3 + 3.5 + 3.5 + 4 + 4.5}{10} = 3 \]
and
\[ \sqrt{\frac{(1.5-3)^2 + (2-3)^2 + \cdots + (4-3)^2 + (4.5-3)^2}{10}} = \sqrt{\frac{7.5}{10}} = 0.8660 \]
We can immediately conclude that the mean of the population of sample means is the same as the
population mean µ.
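Using the results given above the value of $\sigma_{n,N}$ should be given by the formula
\[ \sigma_{n,N} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]
with σ = 1.4142, N = 5 and n = 2. Using these numbers gives:
\[ \sigma_{2,5} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{1.4142}{\sqrt{2}} \sqrt{\frac{5 - 2}{5 - 1}} = \sqrt{\frac{3}{4}} = 0.8660 \text{ as predicted.} \]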
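The enumeration above is small enough to reproduce in full. The following sketch, using only the Python standard library, lists every sample of size 2 from the population 1, 2, 3, 4, 5 and compares the standard deviation of the sample means with the finite-population formula; the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5]
N = len(population)
n = 2

# Every distinct sample of size n, and the mean of each one.
samples = list(combinations(population, n))
sample_means = [sum(s) / n for s in samples]

sigma = pstdev(population)            # population standard deviation
mean_of_means = mean(sample_means)
observed_se = pstdev(sample_means)    # s.d. of the population of sample means
predicted_se = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(f"{len(samples)} samples, mean of sample means = {mean_of_means}")
print(f"sigma = {sigma:.4f}")
print(f"observed standard error  = {observed_se:.4f}")
print(f"predicted standard error = {predicted_se:.4f}")
```

Here `pstdev` (division by the number of items) is used throughout because both the parent population and the set of all sample means are treated as complete populations.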
Note that in this case the ‘correction factor’ $\sqrt{\frac{N - n}{N - 1}} \approx 0.8660$ and is significant. If we take samples
of size 10 from a population of 100, the factor becomes
\[ \sqrt{\frac{N - n}{N - 1}} \approx 0.9535 \]
and for samples of size 10 taken from a population of 1000, the factor becomes
\[ \sqrt{\frac{N - n}{N - 1}} \approx 0.9955. \]
Thus as $\sqrt{\frac{N - n}{N - 1}} \to 1$, its effect on the value of $\frac{\sigma}{\sqrt{n}}$ reduces to insignificance.
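The way the correction factor approaches 1 can be tabulated with a short function. This is a minimal sketch; the function name is hypothetical, and the last row (a population of one million) is an extra illustrative case that does not appear in the text.

```python
from math import sqrt

def finite_population_correction(N, n):
    """Factor sqrt((N - n) / (N - 1)) applied to sigma / sqrt(n)."""
    return sqrt((N - n) / (N - 1))

# (population size N, sample size n); the final pair is an added illustration.
cases = [(5, 2), (100, 10), (1000, 10), (1_000_000, 10)]
factors = {case: finite_population_correction(*case) for case in cases}

for (N, n), factor in factors.items():
    print(f"N = {N:>9,}, n = {n:>2}: factor = {factor:.4f}")
```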
Task
Two-centimetre number 10 woodscrews are manufactured in their millions but
packed in boxes of 200 to be sold to the public or trade. If the length of the
screws is known to be normally distributed with a mean of 2 cm and variance
0.05 cm², find the mean and standard deviation of the sample mean of 200 boxed
screws. What is the probability that the sample mean length of the screws in a
box of 200 is greater than 2.02 cm?
Your solution
Answer
Since the population is very large indeed, we are effectively sampling from an infinite population.
The mean and standard deviation are given by
\[ \mu = 2 \text{ cm} \quad \text{and} \quad \sigma_{200} = \frac{\sqrt{0.05}}{\sqrt{200}} = 0.016 \text{ cm} \]
Since the parent population is normally distributed the means of samples of 200 will be normally
distributed as well.
\[ \text{Hence } P(\text{sample mean length} > 2.02) = P\left(z > \frac{2.02 - 2}{0.016}\right) = P(z > 1.25) = 0.5 - 0.3944 = 0.1056 \]
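The same calculation can be sketched numerically, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative. Because the code keeps the standard error unrounded (about 0.0158 cm rather than 0.016 cm), the probability it gives is slightly smaller than the value obtained from the rounded figures and the tables.

```python
from math import sqrt
from statistics import NormalDist

mu = 2.0          # population mean length, cm
variance = 0.05   # population variance, cm^2
n = 200           # screws per box

standard_error = sqrt(variance / n)
sample_mean = NormalDist(mu, standard_error)   # distribution of the box mean

p_exceeds = 1 - sample_mean.cdf(2.02)          # P(sample mean > 2.02 cm)

print(f"standard error of the mean = {standard_error:.4f} cm")
print(f"P(sample mean > 2.02 cm)   = {p_exceeds:.4f}")
```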
2. Statistical estimation
When we are dealing with large populations (the production of items such as LEDs, light bulbs,
piston rings etc.) it is extremely unlikely that we will be able to calculate population parameters such
as the mean and variance directly from the full population.
We have to use processes which enable us to estimate these quantities. There are two basic methods
used called point estimation and interval estimation. The essential difference is that point estimation
gives single numbers which, in the sense defined below, are best estimates of population parameters,
while interval estimates give a range of values together with a figure called the confidence that the
true value of a parameter lies within the calculated range. Such ranges are usually called confidence
intervals.
Statistically, the word ‘estimate’ implies a defined procedure for finding population parameters. In
statistics, the word ‘estimate’ does not mean a guess, something which is rough-and-ready. What
the word does mean is that an agreed precise process has been (or will be) used to find required
values and that these values are ‘best values’ in some sense. Often this means that the procedure
used, which is called the ‘estimator’, is:
(a) consistent in the sense that the difference between the true value and the estimate
approaches zero as the sample size used to do the calculation increases;
(b) unbiased in the sense that the expected value of the estimator is equal to the true value;
(c) efficient in the sense that the variance of the estimator is small.
Expectation is covered in Workbooks 37 and 38. You should note that it is not always possible to
find a ‘best’ estimator. You might have to decide (for example) between one which is
consistent, biased and efficient
and one which is
consistent, unbiased and inefficient
when what you really want is one which is
consistent, unbiased and efficient.
Point estimation
We will look at the point estimation of the mean and variance of a population and use the following
notation.
Notation
            Population      Sample    Estimator
Size        N               n
Mean        µ or E(x)       x̄         µ̂ for µ
Variance    σ² or V(x)      s²        σ̂² for σ²
Estimating the mean
This is straightforward.
µ̂ = x̄
is a sensible estimate since the difference between the population mean and the sample mean
disappears with increasing sample size. We can show that this estimator is unbiased. Symbolically we
have:
\[ \hat{\mu} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
so that
\begin{align*}
E(\hat{\mu}) &= \frac{E(x_1) + E(x_2) + \cdots + E(x_n)}{n} \\
&= \frac{E(X) + E(X) + \cdots + E(X)}{n} \\
&= E(X) \\
&= \mu
\end{align*}
Note that the expected value of $x_1$ is E(X), i.e. $E(x_1) = E(X)$. Similarly for $x_2, \ldots, x_n$.
Estimating the variance
This is a little more difficult. The true variance of the population is $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ which suggests
the estimator, calculated from a sample, should be $\hat{\sigma}^2 = \frac{\sum (x - \mu)^2}{n}$.
However, we do not know the true value of µ, but we do have the estimator µ̂ = x̄.
Replacing µ by the estimator µ̂ = x̄ gives
\[ \hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n} \]
This can be written in the form
\[ \hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{\sum x^2}{n} - (\bar{x})^2 \]
Hence
\[ E(\hat{\sigma}^2) = \frac{E\left(\sum x^2\right)}{n} - E\{(\bar{X})^2\} = E(X^2) - E\{(\bar{X})^2\} \]
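We already have the important result
\[ E(x) = E(\bar{x}) \quad \text{and} \quad V(\bar{x}) = \frac{V(x)}{n} \]
Using the result E(x) = E(x̄) gives us
\begin{align*}
E(\hat{\sigma}^2) &= E(x^2) - E\{(\bar{x})^2\} \\
&= E(x^2) - \{E(x)\}^2 - E\{(\bar{x})^2\} + \{E(\bar{x})\}^2 \\
&= E(x^2) - \{E(x)\}^2 - \left(E\{(\bar{x})^2\} - \{E(\bar{x})\}^2\right) \\
&= V(x) - V(\bar{x}) \\
&= \sigma^2 - \frac{\sigma^2}{n} \\
&= \frac{n - 1}{n} \sigma^2
\end{align*}
This result is biased: for an unbiased estimator the result should be $\sigma^2$, not $\frac{n - 1}{n} \sigma^2$.
Fortunately, the remedy is simple: we just multiply by the so-called Bessel’s correction, namely $\frac{n}{n - 1}$,
and obtain the result
\[ \hat{\sigma}^2 = \frac{n}{n - 1} \cdot \frac{\sum (x - \bar{x})^2}{n} = \frac{\sum (x - \bar{x})^2}{n - 1} \]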
There are two points to note here. Firstly (and rather obviously) you should not take samples of
size 1 since the variance cannot be estimated from such samples. Secondly, you should check the
operation of any hand calculators (and spreadsheets!) that you use to find out exactly what you are
calculating when you press the button for standard deviation. You might find that you are calculating
either
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \quad \text{or} \quad \hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
It is just as well to know which, as the first formula assumes that you are calculating the variance of
a population while the second assumes that you are estimating the variance of a population from a
random sample of size n taken from that population.
From now on we will assume that we divide by n − 1 in the sample variance and we will simply write
$s^2$ for $s^2_{n-1}$.
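The same distinction appears in software. As one illustration, Python's standard `statistics` module provides both calculations: `pvariance` divides by the number of items and `variance` divides by n − 1. The data values below are hypothetical and serve only to show the difference.

```python
from statistics import pvariance, variance

# Hypothetical sample of five measurements (illustrative values only).
sample = [9.8, 10.1, 10.3, 9.9, 10.4]
n = len(sample)

divide_by_n = pvariance(sample)          # sum of (x - mean)^2 divided by n
divide_by_n_minus_1 = variance(sample)   # sum of (x - mean)^2 divided by n - 1

print(f"dividing by n:     {divide_by_n:.4f}")
print(f"dividing by n - 1: {divide_by_n_minus_1:.4f}")
print(f"ratio:             {divide_by_n_minus_1 / divide_by_n:.4f} (Bessel's correction n/(n - 1) = {n / (n - 1):.4f})")
```

Other tools use different defaults, so it is worth checking the documentation of whichever function or calculator key is used.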
Interval estimation
We will look at the process of finding an interval estimation of the mean and variance of a population
and use the notation used above.
Interval estimation for the mean
This interval is commonly called the Confidence Interval for the Mean.
Firstly, we know that the sample mean $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$ is a good estimator of the
population mean µ. We also know that the calculated mean x̄ of a sample of size n is unlikely to be
exactly equal to µ. We will now construct an interval around x̄ in such a way that we can quantify
the confidence that the interval actually contains the population mean µ.
Secondly, we know that for sufficiently large samples taken from a large population, x̄ follows a
normal distribution with mean µ and standard deviation $\frac{\sigma}{\sqrt{n}}$.
Thirdly, looking at the following extract from the normal probability tables,
$Z = \frac{X - \mu}{\sigma}$ 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.9 .4713 4719 4726 4732 4738 4744 4750 4756 4762 4767
we can see that 2 × 47.5% = 95% of the values in the standard normal distribution lie within 1.96
standard deviations either side of the mean.
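So before we see the data we may say that
\[ P\left(\mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95 \]
After we see the data we say with 95% confidence that
\[ \mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}} \]
which leads to
\[ \bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}} \]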
This interval is called a 95% confidence interval for the mean µ.
Note that while the 95% level is very commonly used, there is nothing sacrosanct about this level.
If we go through the same argument but demand that we need to be 99% certain that µ lies within
the confidence interval developed, we obtain the interval
\[ \bar{x} - 2.58 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 2.58 \frac{\sigma}{\sqrt{n}} \]
since an inspection of the standard normal tables reveals that 99% of the values in a standard normal
distribution lie within 2.58 standard deviations of the mean.
The above argument assumes that we know the population variance. In practice this is often not the
case and we have to estimate the population variance from a sample. From the work we have seen
above, we know that the best estimate of the population variance from a sample of size n is given
by the formula
\[ \hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
It follows that if we do not know the population variance, we must use the estimate σ̂ in place of σ.
Our 95% and 99% confidence intervals (for large samples) become
\[ \bar{x} - 1.96 \frac{\hat{\sigma}}{\sqrt{n}} \le \mu \le \bar{x} + 1.96 \frac{\hat{\sigma}}{\sqrt{n}} \quad \text{and} \quad \bar{x} - 2.58 \frac{\hat{\sigma}}{\sqrt{n}} \le \mu \le \bar{x} + 2.58 \frac{\hat{\sigma}}{\sqrt{n}} \]
where
\[ \hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
When we do not know the population variance, we need to estimate it. Hence we need to gauge the
confidence we can have in the estimate.
In small samples, when we need to estimate the variance, the values 1.96 and 2.58 need to be replaced
by values from the Student’s t-distribution. See Workbook 41.
Example 1
After 1000 hours of use the weight loss, in gm, due to wear in certain rollers in
machines, is normally distributed with mean µ and variance σ². Fifty independent
observations are taken. (This may be regarded as a “large” sample.) If observation
i is $y_i$, then $\sum_{i=1}^{50} y_i = 497.2$ and $\sum_{i=1}^{50} y_i^2 = 5473.58$.
Estimate µ and σ² and give a 95% confidence interval for µ.
Solution
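We estimate µ using the sample mean: $\bar{y} = \frac{\sum y_i}{n} = \frac{497.2}{50} = 9.944$ gm
We estimate σ² using the sample variance:
\[ s^2 = \frac{1}{n - 1} \sum (y_i - \bar{y})^2 = \frac{1}{n - 1} \left[ \sum y_i^2 - \frac{1}{n} \left( \sum y_i \right)^2 \right] \]
\[ = \frac{1}{49} \left[ 5473.58 - \frac{1}{50}\, 497.2^2 \right] = 10.8046 \text{ gm}^2 \]
The estimated standard error of the mean is $\sqrt{\frac{s^2}{n}} = \sqrt{\frac{10.8046}{50}} = 0.4649$ gm
The 95% confidence interval for µ is $\bar{y} \pm 1.96 \sqrt{\frac{s^2}{n}}$, i.e. 9.944 ± 1.96 × 0.4649 = 9.944 ± 0.911. That is 9.033 < µ < 10.855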
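A calculation of this kind, starting from n, Σy and Σy², can be sketched as a small function. This is an illustration under stated assumptions: the function name is hypothetical, Python 3.8 or later is assumed for `statistics.NormalDist`, and the multiplier is taken from the standard normal distribution, so the interval is a large-sample one. For 95% confidence the code uses the exact percentile (about 1.95996) rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(n, sum_y, sum_y2, confidence=0.95):
    """Large-sample confidence interval for the mean from n, sum(y) and sum(y^2)."""
    y_bar = sum_y / n
    s2 = (sum_y2 - sum_y**2 / n) / (n - 1)             # sample variance, divisor n - 1
    standard_error = sqrt(s2 / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # e.g. about 1.96 for 95%
    return y_bar, s2, standard_error, (y_bar - z * standard_error, y_bar + z * standard_error)

# Roller wear data from Example 1: n = 50, sum y = 497.2, sum y^2 = 5473.58.
y_bar, s2, standard_error, (lower, upper) = mean_confidence_interval(50, 497.2, 5473.58)

print(f"sample mean     = {y_bar:.3f} gm")
print(f"sample variance = {s2:.4f} gm^2")
print(f"standard error  = {standard_error:.4f} gm")
print(f"95% interval    = ({lower:.3f}, {upper:.3f})")
```

Passing `confidence=0.99` gives the corresponding 99% interval, with a multiplier of about 2.58.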
Exercises
1. The voltages of sixty nominally 10 volt cells are measured. Assuming these to be independent
observations from a normal distribution with mean µ and variance σ², estimate µ and σ².
Regarding this as a “large” sample, find a 99% confidence interval for µ. The data are:
10.3 10.5 9.6 9.7 10.6 9.9 10.1 10.1 9.9 10.5
10.1 10.1 9.9 9.8 10.6 10.0 9.9 10.0 10.3 10.1
10.1 10.3 10.5 9.7 10.1 9.7 9.8 10.3 10.2 10.2
10.1 10.5 10.0 10.0 10.6 10.9 10.1 10.1 9.8 10.7
10.3 10.4 10.4 10.3 10.4 9.9 9.9 10.5 10.0 10.7
10.1 10.6 10.0 10.7 9.8 10.4 10.3 10.0 10.5 10.1
2. The natural logarithms of the times in minutes taken to complete a certain task are normally
distributed with mean µ and variance σ 2 . Seventy-five independent observations are taken.
(This may be regarded as a “large” sample.) If the natural logarithm of the time for observation
i is $y_i$, then $\sum y_i = 147.75$ and $\sum y_i^2 = 292.8175$.
Estimate µ and σ 2 and give a 95% confidence interval for µ.
Use your confidence interval to find a 95% confidence interval for the median time to complete
the task.
Answers
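1. $\sum y_i = 611.0$, $\sum y_i^2 = 6227.34$ and n = 60. We estimate µ using the sample mean:
$$\bar{y} = \frac{\sum y_i}{n} = \frac{611.0}{60} = 10.1833 \text{ V}$$
We estimate σ 2 using the sample variance:
$$s^2 = \frac{1}{n-1}\sum (y_i - \bar{y})^2 = \frac{1}{n-1}\left[\sum y_i^2 - \frac{1}{n}\left(\sum y_i\right)^2\right] = \frac{1}{59}\left[6227.34 - \frac{1}{60}\,611.0^2\right] = 0.090226$$
The estimated standard error of the mean is
$$\sqrt{\frac{s^2}{n}} = \sqrt{\frac{0.090226}{60}} = 0.03878 \text{ V}$$
The 99% confidence interval for µ is $\bar{y} \pm 2.58\sqrt{s^2/n}$. That is
10.08 < µ < 10.28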
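The arithmetic in Answer 1 can be checked with a short Python sketch. It assumes only the totals quoted above (Σy = 611.0, Σy² = 6227.34, n = 60) and the large-sample multiplier 2.58; the function name `large_sample_ci` is our own and is not part of any library.

```python
import math

def large_sample_ci(sum_y, sum_y2, n, z):
    """Return (mean, s2, std_error, lower, upper) for a large-sample
    confidence interval for the mean, from the totals sum(y) and sum(y^2)."""
    mean = sum_y / n
    # Sample variance via s^2 = [sum(y^2) - (sum y)^2 / n] / (n - 1)
    s2 = (sum_y2 - sum_y ** 2 / n) / (n - 1)
    std_error = math.sqrt(s2 / n)
    return mean, s2, std_error, mean - z * std_error, mean + z * std_error

mean, s2, se, lower, upper = large_sample_ci(611.0, 6227.34, 60, 2.58)
print(f"mean = {mean:.4f}, s2 = {s2:.6f}, se = {se:.5f}")
print(f"99% CI: {lower:.2f} < mu < {upper:.2f}")
```

Replacing 2.58 by 1.96 gives the corresponding 95% interval.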
2. We estimate µ using the sample mean:

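$$\bar{y} = \frac{\sum y_i}{n} = \frac{147.75}{75} = 1.97$$
We estimate σ 2 using the sample variance:
$$s^2 = \frac{1}{n-1}\sum (y_i - \bar{y})^2 = \frac{1}{n-1}\left[\sum y_i^2 - \frac{1}{n}\left(\sum y_i\right)^2\right] = \frac{1}{74}\left[292.8175 - \frac{1}{75}\,147.75^2\right] = 0.02365$$
The estimated standard error of the mean is
$$\sqrt{\frac{s^2}{n}} = \sqrt{\frac{0.02365}{75}} = 0.01776$$
The 95% confidence interval for µ is $\bar{y} \pm 1.96\sqrt{s^2/n}$. That is
1.935 < µ < 2.005
The 95% confidence interval for the median time, in minutes, to complete the task is
$$e^{1.935} < M < e^{2.005}$$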
That is
6.93 < M < 7.42
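Because the exponential function is increasing, the end points of the interval for µ map directly onto end points for the median $e^{\mu}$. A minimal sketch of this step, assuming the unrounded limits are carried through the calculation (the variable names are ours):

```python
import math

n, sum_y, sum_y2 = 75, 147.75, 292.8175   # totals quoted in Exercise 2

mean = sum_y / n
s2 = (sum_y2 - sum_y ** 2 / n) / (n - 1)
se = math.sqrt(s2 / n)

# 95% large-sample interval for mu, the mean of the log-times
mu_lo, mu_hi = mean - 1.96 * se, mean + 1.96 * se

# exp() is monotonic increasing, so it preserves the ordering of the limits;
# the median time is M = exp(mu)
m_lo, m_hi = math.exp(mu_lo), math.exp(mu_hi)

print(f"{mu_lo:.3f} < mu < {mu_hi:.3f}")
print(f"{m_lo:.2f} < M < {m_hi:.2f}")
```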
Interval Estimation
for the Variance 40.2
Introduction
In Section 40.1 we have seen that the sampling distribution of the sample mean, when the data
come from a normal distribution (and even, in large samples, when they do not) is itself a normal
distribution. This allowed us to find a confidence interval for the population mean. It is also often
useful to find a confidence interval for the population variance. This is important, for example, in
quality control. However the distribution of the sample variance is not normal. To find a confidence
interval for the population variance we need to use another distribution called the “chi-squared”
distribution.
Prerequisites
• understand and be able to calculate means and variances
• understand the concepts of continuous probability distributions
• understand and be able to calculate a confidence interval for the mean of a normal distribution
Learning Outcomes
On completion you should be able to . . .
• find probabilities using a chi-squared distribution
• find a confidence interval for the variance of a normal distribution
1. Interval estimation for the variance
In Section 40.1 we saw how to find a confidence interval for the mean of a normal population. We
can also find a confidence interval for the variance. The corresponding confidence interval for the
standard deviation is found by taking square roots.
We know that if we take samples from a population, then each sample will have a mean and a
variance associated with it. We can calculate the values of these quantities from first principles, that
is we can use the basic definitions of the mean and the variance to find their values. Just as the
means form a distribution, so do the values of the variance and it is to this distribution that we turn
in order to find an interval estimate for the value of the variance of the population. Note that if
the original population is normal, samples taken from this population have means which are normally
distributed. When we consider the distribution of variances calculated from the samples we need the
chi-squared (usually written as χ2 ) distribution in order to calculate the confidence intervals. As you
might expect, the values of the chi-squared distribution are tabulated for ease of use. The calculation
of confidence intervals for the variance (and standard deviation) depends on the following result.
Key Point 2
If x1 , x2 , · · · , xn is a random sample taken from a normal population with mean µ and variance σ 2
then if the sample variance is denoted by S 2 , the random variable
$$X^2 = \frac{(n-1)S^2}{\sigma^2}$$
has a chi-squared ( χ2 ) distribution with n − 1 degrees of freedom.
Clearly, a little explanation is required to make this understandable! Key Point 2 refers to the
chi-squared distribution and the term ‘degrees of freedom.’ Both require some detailed explanation
before the Key Point can be properly understood. We shall start by looking in a little detail at the
chi-squared distribution and then consider the term ‘degrees of freedom.’ You are advised to read
these explanations very carefully and make sure that you fully understand them.
The chi-squared random variable

The probability density function of a χ2 random variable is somewhat complicated and involves the
gamma (Γ) function. The gamma function, for positive r, is defined as
$$\Gamma(r) = \int_0^{\infty} x^{r-1} e^{-x}\,dx$$
It is easily shown that Γ(r) = (r − 1)Γ(r − 1) and that, if r is a positive integer, then

Γ(r) = (r − 1)(r − 2)(r − 3) · · · (3)(2)(1) = (r − 1)!
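Both properties can be checked numerically. The sketch below uses `math.gamma` and `math.factorial` from the Python standard library; the test values of r are arbitrary choices of ours.

```python
import math

# Recurrence: Gamma(r) = (r - 1) * Gamma(r - 1), checked for non-integer r too
for r in (2.5, 4.0, 7.3):
    assert math.isclose(math.gamma(r), (r - 1) * math.gamma(r - 1))

# For a positive integer r, Gamma(r) = (r - 1)!
for r in range(1, 11):
    assert math.isclose(math.gamma(r), math.factorial(r - 1))

print(math.gamma(5))   # Gamma(5) = 4! = 24
```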
The probability density function is

$$f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{(k/2)-1} e^{-x/2}, \qquad x > 0.$$
The plots in Figure 2 show the probability density function for various convenient values of k. We
have deliberately taken even values of k so that the gamma function has a value easily calculated
from the above formula for a factorial. In these graphs the vertical scaling has been chosen to ensure
each graph has the same maximum value.
It is possible to discern two things from the diagrams.
Firstly, as k increases, the peak of each curve occurs at values closer to k. Secondly, as k increases,
the shape of the curve appears to become more and more symmetrical. In fact the mean of the χ2
distribution is k and in the limit as k → ∞ the χ2 distribution becomes normal. One further fact,
not obvious from the diagrams, is that the variance of the χ2 distribution is 2k.
Figure 2: plots of the χ2 probability density function for k = 4, k = 16, k = 64 and k = 256
A summary is given in the following Key Point.
Key Point 3
The χ2 distribution, defined by the probability density function
$$f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{(k/2)-1} e^{-x/2}, \qquad x > 0$$
has mean k and variance 2k and as k → ∞ the limiting form of the distribution is normal.
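To see Key Point 3 in action, the density can be coded directly and its mean and variance approximated by numerical integration. This is a rough sketch: the midpoint rule, the number of strips and the upper integration limit are our own assumptions, chosen so that the neglected right-hand tail is negligible for the values of k shown.

```python
import math

def chi2_pdf(x, k):
    """Chi-squared density with k degrees of freedom, valid for x > 0."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def moments(k, upper=None, steps=200_000):
    """Approximate (total area, mean, variance) with the midpoint rule."""
    if upper is None:
        upper = k + 30 * math.sqrt(2 * k)   # assumed cut-off, far into the tail
    h = upper / steps
    area = first = second = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                   # midpoint of the i-th strip
        fx = chi2_pdf(x, k) * h
        area += fx
        first += x * fx
        second += x * x * fx
    return area, first, second - first ** 2

for k in (4, 16, 64):
    area, mean, var = moments(k)
    print(k, round(area, 4), round(mean, 3), round(var, 3))
```

For each k the printed area should be close to 1, the mean close to k and the variance close to 2k. Very large k (for example k = 256) would need the density evaluated through logarithms to avoid floating-point overflow, which this sketch does not attempt.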
Degrees of freedom
A formal definition of the term ‘degrees of freedom’ is that it is the ‘number of independent
comparisons that can be made among the elements of a sample.’ Textbooks on statistics e.g. Applied
Statistics and Probability for Engineers by Montgomery and Runger (Wiley) often give this formal
definition. The number of degrees of freedom is usually represented by the Greek symbol ν pronounced
‘nu’. The following explanations of the concept should be helpful.
Explanation 1
If we have a sample of n values say x1 , x2 , x3 · · · , xn chosen from a population and we are trying to
calculate the mean of the sample, we know that the sum of the deviations about the mean must be
zero. Hence, the following constraint must apply to the observations.
$$\sum (x - \bar{x}) = 0$$
Once we calculate the values of $(x_1 - \bar{x}), (x_2 - \bar{x}), (x_3 - \bar{x}), \cdots, (x_{n-1} - \bar{x})$ we can calculate
the value of $(x_n - \bar{x})$ by using the constraint $\sum (x - \bar{x}) = 0$. We say that we have n − 1 degrees of
freedom. The term ‘degrees of freedom’ may be thought of as the number of independent variables
minus the number of constraints imposed.
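A small numerical illustration of Explanation 1, using a made-up sample of five values (the data are hypothetical, chosen only so that the mean is a whole number):

```python
sample = [4, 7, 9, 10, 15]            # hypothetical data, n = 5
n = len(sample)
mean = sum(sample) / n                 # 45 / 5 = 9.0

deviations = [x - mean for x in sample]
print(deviations)                      # the deviations about the mean
print(sum(deviations))                 # the constraint: they sum to zero

# Given any n - 1 of the deviations, the constraint fixes the last one
last = -sum(deviations[:-1])
print(last == deviations[-1])
```

Only four of the five deviations can be chosen freely, so this sample has n − 1 = 4 degrees of freedom.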
Explanation 2
A point in space which can move freely has three degrees of freedom since it can move independently
in the x, y and z directions. If we now restrict the point so that it can only move along the straight
line
$$\frac{x}{a} = \frac{y}{b} = \frac{z}{c}$$
then we have effectively imposed two constraints since the value of (say) x determines the values of
y and z. In this situation, we say that the number of degrees of freedom is reduced from 3 to 1.
That is, we have one degree of freedom.
A similar argument may be used to demonstrate that a point in three dimensional space which is
restricted to move in a plane leads to a situation with two degrees of freedom.
Key Point 4
The term ‘degrees of freedom’ may be thought of as the number of independent variables involved
minus the number of constraints imposed.
Figure 3 shows a typical χ2 distribution and Table 1 at the end of this Workbook shows the values
of $\chi^2_{\alpha,\nu}$ for a variety of values of the area α and the number of degrees of freedom ν. Notice that
Table 1 gives the area values corresponding to the right-hand tail of the distribution which is shown
shaded.
Figure 3: a typical χ2 density with the right-hand tail of area α, to the right of $\chi^2_{\alpha,\nu}$, shaded
The $\chi^2_{\alpha,\nu}$ values for (say) right-hand area values of 5% are given by the column headed 0.05 while
the $\chi^2_{\alpha,\nu}$ values for (say) left-hand area values of 5% are given by the column headed 0.95. Figure 4
shows the values of $\chi^2_{\alpha,\nu}$ for the two 5% tails when there are 5 degrees of freedom.
Figure 4: the χ2 density f (x) with 5 degrees of freedom, showing $\chi^2_{0.95,5} = 1.15$ and $\chi^2_{0.05,5} = 11.07$
Task
Use the percentage points of the χ2 distribution to find the appropriate values of
χ2α,ν in the following cases.
(a) Right-hand tail of 10% and 7 degrees of freedom.

(b) Left-hand tail of 2.5% and 9 degrees of freedom.
(c) Both tails of 5% and 10 degrees of freedom.
(d) Both tails of 2.5% and 20 degrees of freedom.
Your solution
Answer
Using Table 1 and reading off the values directly gives:
(a) 12.02 (b) 2.70 (c) 3.94 and 18.31 (d) 9.59 and 34.17
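The look-ups in this Task all follow one rule: Table 1 is indexed by the right-hand tail area, so a left-hand tail of area α is read from the column headed 1 − α. A minimal sketch, assuming a small dictionary `TABLE` that holds only the Table 1 entries needed here (the helper names are ours):

```python
# (right-hand tail area, degrees of freedom) -> percentage point, from Table 1
TABLE = {
    (0.100, 7): 12.02,
    (0.975, 9): 2.70,
    (0.950, 10): 3.94, (0.050, 10): 18.31,
    (0.975, 20): 9.59, (0.025, 20): 34.17,
}

def right_tail(alpha, nu):
    """Value with area alpha to its right."""
    return TABLE[(round(alpha, 3), nu)]

def left_tail(alpha, nu):
    """Value with area alpha to its left, i.e. area 1 - alpha to its right."""
    return TABLE[(round(1 - alpha, 3), nu)]

print(right_tail(0.10, 7))                          # (a)
print(left_tail(0.025, 9))                          # (b)
print(left_tail(0.05, 10), right_tail(0.05, 10))    # (c)
print(left_tail(0.025, 20), right_tail(0.025, 20))  # (d)
```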
Constructing a confidence interval for the variance
We know that if x1 , x2 , x3 , · · · , xn is a random sample taken from a normal population with mean
µ and variance σ 2 and if the sample variance is denoted by S 2 , the random variable
$$X^2 = \frac{(n-1)S^2}{\sigma^2}$$
has a chi-squared distribution with n − 1 degrees of freedom. This knowledge enables us to construct
a confidence interval as follows.
Firstly, we decide on a level of confidence, say, for the sake of illustration, 95%. This means that we
need two 2.5% tails.
Secondly, we know that we have n − 1 degrees of freedom so that, with probability 0.95, the value of $X^2$ will lie between
the left-tail value $\chi^2_{0.975,n-1}$ and the right-tail value $\chi^2_{0.025,n-1}$. If we know the value of n then
we can easily read off these values from the χ2 tables.
The confidence interval is developed as shown below.
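We have
$$\chi^2_{0.975,n-1} \le X^2 \le \chi^2_{0.025,n-1}$$
so that
$$\chi^2_{0.975,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{0.025,n-1}$$
hence, taking reciprocals (which reverses the inequalities),
$$\frac{1}{\chi^2_{0.025,n-1}} \le \frac{\sigma^2}{(n-1)S^2} \le \frac{1}{\chi^2_{0.975,n-1}}$$
so that
$$\frac{(n-1)S^2}{\chi^2_{0.025,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{0.975,n-1}}$$
Another way of stating the same result using probability directly is to say that
$$P\left(\frac{(n-1)S^2}{\chi^2_{0.025,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{0.975,n-1}}\right) = 0.95$$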
Noting that 95% = 100(1 − 0.05)% and that we are working with the right-hand tail values of the χ2
distribution, it is usual to generalize the above result as follows. Taking a general confidence level as
100(1 − α)% (a 95% interval gives α = 0.05), our confidence interval becomes
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}}$$
Note that the confidence interval for the standard deviation σ is obtained by taking the appropriate
square roots.
The following Key Point summarizes the development of this confidence interval.
Key Point 5
If x1 , x2 , x3 , · · · , xn is a random sample with variance S 2 taken from a normal population with
variance σ 2 then a 100(1 − α)% confidence interval for σ 2 is
$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}}$$
where $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$ are the appropriate right-hand and left-hand values respectively of a
chi-squared distribution with n − 1 degrees of freedom.
Example 2
A random sample of 20 steel ball bearings of nominal diameter 2 mm is
taken and the diameters are measured precisely. The measurements, in mm, are
as follows:
2.02 1.94 2.09 1.95 1.98 2.00 2.03 2.04 2.08 2.07
1.99 1.96 1.99 1.95 1.99 1.99 2.03 2.05 2.01 2.03
Assuming that the diameters are normally distributed with unknown mean, µ, and
unknown variance σ 2 ,
(a) find a two-sided 95% confidence interval for the variance, σ 2 ;

(b) find a two-sided confidence interval for the standard deviation, σ.
Solution
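From the data, we calculate $\sum x_i = 40.19$ and $\sum x_i^2 = 80.7977$. Hence
$$(n-1)S^2 = 80.7977 - \frac{40.19^2}{20} = 0.035895$$
There are 19 degrees of freedom and the critical values of the $\chi^2_{19}$-distribution are
$$\chi^2_{0.975,19} = 8.91 \quad\text{and}\quad \chi^2_{0.025,19} = 32.85$$
(a) the confidence interval for σ 2 is
$$\frac{0.035895}{32.85} < \sigma^2 < \frac{0.035895}{8.91}, \quad\text{that is}\quad 1.0927 \times 10^{-3} \text{ mm}^2 < \sigma^2 < 4.0286 \times 10^{-3} \text{ mm}^2$$
(b) the confidence interval for σ is
$$\sqrt{1.0927 \times 10^{-3}} < \sigma < \sqrt{4.0286 \times 10^{-3}}, \quad\text{that is}\quad 0.033 \text{ mm} < \sigma < 0.063 \text{ mm}$$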
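The calculation in Example 2 can be reproduced as follows. This is a sketch only: the critical values 8.91 and 32.85 are typed in from Table 1 rather than computed, and the variable names are ours.

```python
import math
import statistics

diameters = [2.02, 1.94, 2.09, 1.95, 1.98, 2.00, 2.03, 2.04, 2.08, 2.07,
             1.99, 1.96, 1.99, 1.95, 1.99, 1.99, 2.03, 2.05, 2.01, 2.03]
n = len(diameters)

# (n - 1) S^2, the sum of squared deviations about the sample mean
ss = (n - 1) * statistics.variance(diameters)

# chi^2_{0.975,19} and chi^2_{0.025,19}, read from Table 1
chi2_left, chi2_right = 8.91, 32.85

# Dividing by the larger critical value gives the lower limit
var_lo, var_hi = ss / chi2_right, ss / chi2_left
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"(n-1)S^2 = {ss:.6f}")
print(f"{var_lo:.4e} < sigma^2 < {var_hi:.4e}  (mm^2)")
print(f"{sd_lo:.3f} < sigma < {sd_hi:.3f}  (mm)")
```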
Task
In a typical car, bell housings are bolted to crankcase castings by means of a series
of 13 mm bolts. A random sample of 12 bolt-hole diameters is checked as part of
a quality control process and found to have a variance of 0.0013 mm2 .
(a) Construct the 95% confidence interval for the variance of the holes.
(b) Find the 95% confidence interval for the standard deviation of the holes.
State clearly any assumptions you make.
Your solution
Answer
Using the confidence interval formula developed, we know that the 95% confidence interval is
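$$\frac{11 \times 0.0013}{\chi^2_{0.025,11}} \le \sigma^2 \le \frac{11 \times 0.0013}{\chi^2_{0.975,11}} \quad\text{i.e.}\quad \frac{11 \times 0.0013}{21.92} \le \sigma^2 \le \frac{11 \times 0.0013}{3.82}$$
(a) The 95% confidence interval for the variance is 0.0007 ≤ σ 2 ≤ 0.0037 mm2 .
(b) Taking square roots of the unrounded limits, the 95% confidence interval for the standard deviation is 0.0255 ≤ σ ≤ 0.0612 mm.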
We have assumed that the hole diameters are normally distributed.
Exercises
1. Measurements are made on the lengths, in mm, of a sample of twenty wooden components for
self-assembly furniture. Assume that these may be regarded as twenty independent observations
from a normal distribution with unknown mean µ and unknown variance σ 2 . The data are as
follows.
581 580 581 577 580 581 577 579 579 578
581 583 577 578 582 581 582 580 582 579
Find a 95% confidence interval for the variance σ 2 and hence find a 95% confidence interval
for the standard deviation σ.
2. A machine fills packets with powder. At intervals a sample of ten packets is taken and the
packets are weighed. The ten weights may be regarded as a sample of ten independent
observations from a normal distribution with unknown mean. Find limits L, U such that the
probability that L < S 2 < U is 0.9 when the population variance is σ 2 = 3.0 and S 2 is the
sample variance.
Answers
1. From the data we calculate $\sum y_i = 11598$ and $\sum y_i^2 = 6725744$ and we have n = 20. Hence
$$(n-1)s^2 = \sum (y_i - \bar{y})^2 = 6725744 - \frac{11598^2}{20} = 63.8$$
The number of degrees of freedom is n − 1 = 19. We know that
$$\chi^2_{0.975,19} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{0.025,19}$$
with probability 0.95. So a 95% confidence interval for σ 2 is
$$\frac{(n-1)s^2}{\chi^2_{0.025,19}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{0.975,19}}$$
That is $\frac{63.8}{32.85} < \sigma^2 < \frac{63.8}{8.91}$ so 1.942 < σ 2 < 7.160
This gives a 95% confidence interval for σ: 1.394 < σ < 2.676
2. There are n − 1 = 9 degrees of freedom. Now
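$$\begin{aligned}
0.9 &= P\left(\chi^2_{0.95,9} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{0.05,9}\right) \\
&= P\left(\frac{\chi^2_{0.95,9}\,\sigma^2}{n-1} < S^2 < \frac{\chi^2_{0.05,9}\,\sigma^2}{n-1}\right) \\
&= P\left(\frac{3.33 \times 3.0}{9} < S^2 < \frac{16.92 \times 3.0}{9}\right) = P(1.11 < S^2 < 5.64)
\end{aligned}$$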
Hence L = 1.11 and U = 5.64.
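Here the chi-squared result is used in the opposite direction: σ² is known and limits are found for S². A short sketch, with the two Table 1 values for 9 degrees of freedom typed in by hand (the names are ours):

```python
n = 10
sigma2 = 3.0                    # known population variance
chi2_lower = 3.33               # Table 1, column 0.950, row 9
chi2_upper = 16.92              # Table 1, column 0.050, row 9

# P(chi2_lower < (n - 1) S^2 / sigma^2 < chi2_upper) = 0.9,
# so multiply through by sigma^2 / (n - 1)
L = chi2_lower * sigma2 / (n - 1)
U = chi2_upper * sigma2 / (n - 1)
print(f"L = {L:.2f}, U = {U:.2f}")
```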
Table 1: Percentage Points χ2α,ν of the χ2 distribution

ν \ α 0.995 0.990 0.975 0.950 0.900 0.500 0.100 0.050 0.025 0.010 0.005
1 0.00 0.00 0.00 0.00 0.02 0.45 2.71 3.84 5.02 6.63 7.88
2 0.01 0.02 0.05 0.10 0.21 1.39 4.61 5.99 7.38 9.21 10.60
3 0.07 0.11 0.22 0.35 0.58 2.37 6.25 7.81 9.35 11.34 12.84
4 0.21 0.30 0.48 0.71 1.06 3.36 7.78 9.49 11.14 13.28 14.86
5 0.41 0.55 0.83 1.15 1.61 4.35 9.24 11.07 12.83 15.09 16.75
6 0.68 0.87 1.24 1.64 2.20 5.35 10.65 12.59 14.45 16.81 18.55
7 0.99 1.24 1.69 2.17 2.83 6.35 12.02 14.07 16.01 18.48 20.28
8 1.34 1.65 2.18 2.73 3.49 7.34 13.36 15.51 17.53 20.09 21.96
9 1.73 2.09 2.70 3.33 4.17 8.34 14.68 16.92 19.02 21.67 23.59
10 2.16 2.56 3.25 3.94 4.87 9.34 15.99 18.31 20.48 23.21 25.19
11 2.60 3.05 3.82 4.57 5.58 10.34 17.28 19.68 21.92 24.72 26.76
12 3.07 3.57 4.40 5.23 6.30 11.34 18.55 21.03 23.34 26.22 28.30
13 3.57 4.11 5.01 5.89 7.04 12.34 19.81 22.36 24.74 27.69 29.82
14 4.07 4.66 5.63 6.57 7.79 13.34 21.06 23.68 26.12 29.14 31.32
15 4.60 5.23 6.27 7.26 8.55 14.34 22.31 25.00 27.49 30.58 32.80
16 5.14 5.81 6.91 7.96 9.31 15.34 23.54 26.30 28.85 32.00 34.27
17 5.70 6.41 7.56 8.67 10.09 16.34 24.77 27.59 30.19 33.41 35.72
18 6.26 7.01 8.23 9.39 10.87 17.34 25.99 28.87 31.53 34.81 37.16
19 6.84 7.63 8.91 10.12 11.65 18.34 27.20 30.14 32.85 36.19 38.58
20 7.43 8.26 9.59 10.85 12.44 19.34 28.41 31.41 34.17 37.57 40.00
21 8.03 8.90 10.28 11.59 13.24 20.34 29.62 32.67 35.48 38.93 41.40
22 8.64 9.54 10.98 12.34 14.04 21.34 30.81 33.92 36.78 40.29 42.80
23 9.26 10.20 11.69 13.09 14.85 22.34 32.01 35.17 38.08 41.64 44.18
24 9.89 10.86 12.40 13.85 15.66 23.34 33.20 36.42 39.36 42.98 45.56
25 10.52 11.52 13.12 14.61 16.47 24.34 34.38 37.65 40.65 44.31 46.93
26 11.16 12.20 13.84 15.38 17.29 25.34 35.56 38.89 41.92 45.64 48.29
27 11.81 12.88 14.57 16.15 18.11 26.34 36.74 40.11 43.19 46.96 49.65
28 12.46 13.57 15.31 16.93 18.94 27.34 37.92 41.34 44.46 48.28 50.99
29 13.12 14.26 16.05 17.71 19.77 28.34 39.09 42.56 45.72 49.59 52.34
30 13.79 14.95 16.79 18.49 20.60 29.34 40.26 43.77 46.98 50.89 53.67
40 20.71 22.16 24.43 26.51 29.05 39.34 51.81 55.76 59.34 63.69 66.77
50 27.99 29.71 32.36 34.76 37.69 49.33 63.17 67.50 71.42 76.15 79.49
60 35.53 37.48 40.48 43.19 46.46 59.33 74.40 79.08 83.30 88.38 91.95
70 43.28 45.44 48.76 51.74 55.33 69.33 85.53 90.53 95.02 100.42 104.22
80 51.17 53.54 57.15 60.39 64.28 79.33 96.58 101.88 106.63 112.33 116.32
90 59.20 61.75 65.65 69.13 73.29 89.33 107.57 113.14 118.14 124.12 128.30
100 67.33 70.06 74.22 77.93 82.36 99.33 118.50 124.34 129.56 135.81 140.17
Contents 41
Hypothesis Testing
41.1 Statistical Testing
41.2 Tests Concerning a Single Sample
41.3 Tests Concerning Two Samples
Learning outcomes
By studying this Workbook you will learn how to apply statistical techniques to test the
validity, on the basis of available evidence, of a given hypothesis. For example, a motor
engineer may be interested in testing the expected life of a given set of tyres ("the mean
life is 2,000 miles") against an alternative ("the mean life is less than 2,000 miles"). You
will learn about techniques which will enable you to answer such questions.
This Workbook will introduce you to the basic ideas of hypothesis testing in a
non-mathematical way by using a problem solving approach to highlight the concepts as
they are needed.
Once you have learned how to apply the basic ideas, you will be capable of applying
hypothesis testing to a very wide range of practical problems and learning about methods
of hypothesis testing which are not covered in this Workbook.

Statistical Testing 41.1
Introduction
If you are applying statistics to practical problems in industry, you may find that much of your work
is concerned with making decisions concerning populations and population parameters on the basis
of available evidence. For example you may be asked to decide whether one production process is
preferable to another or whether to repair or continue to use a machine that is producing a certain
proportion of defective components. In order to make such decisions, you will find that you have to
make certain assumptions which will determine the statistical tools that you may legitimately use.
Any assumptions made may or may not be true but you must always be sure of your grounds for
using a given statistical tool. Effectively you will find that you will be asked to decide which of two
statements, each called an hypothesis, is the more likely to be true. Note the choice of words. You
should be clear from the outset that the statistical tools you will study here will not allow you to prove
anything, but they will allow you to measure the strength of the evidence against the hypothesis.
Prerequisites
Before starting this Section you should . . .
• understand the term ‘sample’
• be able to differentiate between statements which are a matter of opinion and those which are of a numerical nature and as such can be challenged
Learning Outcomes
On completion you should be able to . . .
• understand what is meant by the terms hypothesis and hypothesis testing, and one-tailed test and two-tailed test
• understand what is meant by the terms type I error and type II error
• understand the term level of significance
• apply a variety of statistical tests to problems based in engineering
1. Types of statements
Almost every time we read a magazine or newspaper we see claims made by manufacturers about
their products. Such claims can take many forms; they may, for example, be subjective:
‘Luxcar, makers of the best luxury cars’
‘Burnol, the finest fuel you can buy’
‘ConstructAll, designers of beautiful buildings’
Such claims do not need to be backed up by facts and figures; they are a matter of opinion.
Many claims do contain information which is open to question and can be investigated statistically:
‘the expected life of these tyres is 20,000 miles’
‘on average, low energy light bulbs can be expected to last at least 8000 hours’
‘average bottle contents 330 ml.’
The validity of claims which contain information of a numerical nature can often be investigated by
taking random samples of the objects or quantities in question and investigating the likelihood that
a statement or hypothesis concerning them is true.
As stated in the introduction, it should be noted that hypothesis testing can never prove that a
statement is either true or false; it can only measure the strength of the evidence for or against a given
statement. Statements which are investigated statistically are normally called hypotheses and we
usually try to establish a pair of hypotheses, called a null hypothesis and an alternative hypothesis
and then investigate how the evidence that we have supports one hypothesis more than the other. For
example, a demolition engineer might be interested in the burn rate of fuses connected to explosive
devices and on the basis of experience hypothesize that the mean burn rate (say µ) is 600 mm/sec.
A colleague may disagree and claim that the mean burn rate is greater than 600 mm/sec.
We can describe this situation by setting up the null hypothesis:
H0 : µ = 600
and test this against the alternative hypothesis:
H1 : µ > 600
2. Types of errors
Since we cannot be 100% sure that a hypothesis is true or false it is possible that:
(a) a correct hypothesis will be rejected;

(b) a false hypothesis will be accepted.
Rejecting a correct hypothesis is called a Type I error and accepting a false hypothesis is called a
Type II error.
By working in a logical manner and developing a set of rules or guide-lines, it is possible to minimise
the occurrence of such errors.
This Workbook will introduce you to the basic ideas of hypothesis testing in a non-mathematical way by using
a problem solving approach to highlight the concepts as they are needed.
Once you have learned how to apply the basic ideas, you will be capable of applying hypothesis
testing to a very wide range of practical problems and learning about methods of hypothesis testing
which are not covered in this Workbook.
Tests Concerning
a Single Sample 41.2
Introduction
This Section introduces you to the basic ideas of hypothesis testing in a non-mathematical way by
using a problem solving approach to highlight the concepts as they are needed. We only consider
situations involving a single sample.
In Section 41.3 we will introduce you to situations involving two samples and while the basic ideas will
follow through, their practical application is a little more complex than that met in this Workbook.
However, once you have learned how to apply the basic ideas of hypothesis testing covered in this
Workbook, you should be capable of applying hypothesis testing to a very wide range of practical
problems and learning about methods of hypothesis testing which are not covered here.
Prerequisites
Before starting this Section you should . . .
• be familiar with the results and concepts met in the study of probability
• be familiar with a range of statistical distributions
• understand the term hypothesis
• understand the concepts of Type I error and Type II error
Learning Outcomes
On completion you should be able to . . .
• apply the ideas of hypothesis testing to a range of problems underpinned by elementary statistical distributions and involving only a single sample.

1. Tests of proportion
Problem 1
SwitchRight, a manufacturer of engine management systems, requires its supplier of control modules
to supply modules with at least 99% complying with their specification. The quality control operators
at SwitchRight check a random sample of 1000 control modules delivered to SwitchRight and find
that 985 match the specification. Does this result imply that less than 99% of the control modules
supplied match SwitchRight’s specification?
Analysis
Firstly, we set up two hypotheses concerning the control modules. The first hypothesis, called the
null hypothesis is denoted by
H0 : 99% of the control modules match SwitchRight’s specification.
The second hypothesis, called the alternative hypothesis, is denoted by
H1 : less than 99% of the control modules match SwitchRight’s specification.
The alternative hypothesis is essentially saying that, in this case, SwitchRight cannot rely on its
supplier of control modules to deliver batches of modules in which 99% match SwitchRight’s
specification.
Secondly, we describe the random sample from a statistical point of view, that is we find a statistical
distribution which describes the behaviour of the sample. Suppose that X is the number of control
modules in a random sample of 1000 matching SwitchRight’s specification.
We assume that the control modules are independent and that for each module the specification is
either matched or it isn’t. Under these conditions, X has a binomial distribution and the problem
can be summarised as follows:
X ∼ B(1000, p)
H0 : p = 0.99 H1 : p < 0.99
Thirdly, we set up a mechanism to enable us to make a decision between the two hypotheses. This
is done by assuming that H0 is correct until we can show otherwise.
Given that H0 is correct we can calculate the mean µ and the standard deviation σ of the distribution
as follows:
$$\mu = np = 1000 \times 0.99 = 990$$
$$\sigma = \sqrt{np(1-p)} = \sqrt{1000 \times 0.99 \times 0.01} = 3.15$$
Notice that
(a) np > 5 and (b) n(1 − p) > 5
so that we can use the normal approximation to the binomial distribution, that is
$$B(1000, 0.99) \approx N(990, 3.15^2)$$
The sample value obtained is 985 and we now assess how close 985 is to the expected result of 990
by defining a remote left tail (in this case) of the normal distribution and asking if the number 985
occurs in the left tail of the distribution or in the main body of the distribution.
In practice, we use the tail(s) of the standard normal distribution and convert a problem involving
the distribution N (µ, σ 2 ) into one involving the distribution N (0, 1). Diagrammatically the situation
can be represented as shown below:
Figure 1: the standard normal distribution Z ∼ N (0, 1) with a 5% left-hand tail below Z = −1.645
In general, the tails of a distribution can be defined to occupy any proportion of the distribution that
we wish; the proportions chosen are usually either 5% or 1%.
Given this information and a set of tables for the standard normal distribution we can assign values
to the limits defining the tails.
Throughout this Workbook we shall use the 5% proportion to
define the tail(s) of a distribution unless otherwise stated.
In the case we have here, the alternative hypothesis states that p is less than 0.99. Because of this
we use only one tail occupying a total of 5% of the distribution.
To discover where the number 985 lies within the distribution (tail or main body) we standardise
985 with respect to the normal distribution $N(990, 3.15^2)$ in the usual way (see Workbook 39). The
calculation is:
$$P(X \le 985) = P\left(Z \le \frac{985.5 - 990}{3.15}\right) = P(Z \le -1.43)$$
Notice that 985.5 is used and not 985. This is because we are using a continuous normal distribution
to approximate a discrete binomial distribution and so
P (X = 985) ≈ P (984.5 ≤ X ≤ 985.5)
the right-hand side being calculated from the normal distribution.
The number −1.43 is greater than (to the right of) −1.645 and so the number 985 occurs in the
main body of the distribution not in the left tail. This suggests that the evidence does not support the
claim that the proportion of control modules supplied meeting SwitchRight’s specification is less
than 99%. Essentially, we accept the null hypothesis since we do not have the evidence necessary to
reject it. Note that this result does not prove that the null hypothesis is true.
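The steps of the analysis above can be sketched in Python. The sketch assumes Python 3.8 or later for `statistics.NormalDist`; the function name and its arguments are our own, and the unrounded standard deviation is used rather than 3.15.

```python
import math
from statistics import NormalDist

def lower_tail_proportion_test(x, n, p0, alpha=0.05):
    """One-tailed test of H0: p = p0 against H1: p < p0 using the
    normal approximation to B(n, p0) with a continuity correction."""
    mu = n * p0
    sigma = math.sqrt(n * p0 * (1 - p0))
    # The approximation is only used when np > 5 and n(1 - p) > 5
    assert mu > 5 and n * (1 - p0) > 5
    z = (x + 0.5 - mu) / sigma               # P(X <= x) ~ P(Z <= z)
    z_crit = NormalDist().inv_cdf(alpha)     # about -1.645 when alpha = 0.05
    return z, z_crit, z < z_crit             # True means "reject H0"

z, z_crit, reject = lower_tail_proportion_test(985, 1000, 0.99)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

With the Problem 1 figures the standardised value lies to the right of the critical value, so the function reports that H0 is not rejected.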
Before looking at similar problems, we will look at the possible ways of defining the tails of the
standard normal distribution. As stated previously, we shall, in these notes, always use a total of 5%
for the tail or tails of a distribution.
We say that we are making a decision at the 5% level of significance.
The situation is represented by the following three figures:
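(1) Hypotheses: H0 : p = p0 , H1 : p ≠ p0 (two-tailed test, Z ∼ N (0, 1), 2.5% in each tail, critical values Z = ±1.96)
Figure 2
(2) Hypotheses: H0 : p = p0 , H1 : p > p0 (one-tailed test, Z ∼ N (0, 1), 5% in the right-hand tail, critical value Z = 1.645)
Figure 3
(3) Hypotheses: H0 : p = p0 , H1 : p < p0 (one-tailed test, Z ∼ N (0, 1), 5% in the left-hand tail, critical value Z = −1.645)
Figure 4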
The values ±1.96, +1.645 and −1.645 are easily obtained from the standard normal table (Table
1) given at the end of this Workbook. The appropriate lines from the table are reproduced on the
following page for ease of reference. Note that it is sometimes advisable to be 99% sure (rather than
95% sure) of either correctly accepting or rejecting a null hypothesis. In this case we say that we are
working at the 1% level of significance. The situation diagrammatically is exactly the same as the
one shown above except that the 5% tail areas become 1% and the 2.5% areas become 0.5%.
The corresponding values of Z are ±2.58, +2.33 and −2.33 depending on whether a one-tailed or a
two-tailed test is being performed.
Particular note must always be taken of the form of the hypotheses and the corresponding test,
one-tailed or two-tailed.
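The critical values quoted above can also be obtained without tables. A minimal sketch using `statistics.NormalDist` (Python 3.8 or later), whose `inv_cdf` method returns the value of Z with a given area to its left:

```python
from statistics import NormalDist

z = NormalDist()   # standard normal, mean 0 and standard deviation 1

for level in (0.05, 0.01):
    one_tailed = z.inv_cdf(1 - level)        # all of the area in one tail
    two_tailed = z.inv_cdf(1 - level / 2)    # area split between two tails
    print(f"{level:.0%} level: one-tailed ±{one_tailed:.3f}, "
          f"two-tailed ±{two_tailed:.3f}")
```

Rounded to two decimal places, the 1% values agree with the 2.33 and 2.58 used in the text.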
Extracts from the normal probability integral table
Case 1 - 5% level of significance
Z = (X − µ)/σ   0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.6   .4452 .4463 .4474 .4485 .4495 .4505 .4515 .4525 .4535 .4545
1.9   .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4762 .4767
Case 2 - 1% level of significance
Z = (X − µ)/σ   0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
2.3   .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.5   .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
We shall now look at a problem which is similar in type to Problem 1 and solve it using the ideas
discussed in the analysis of that problem.
Problem 2
The Head of Quality Control in a foundry claims that the castings produced in the foundry are
‘better than average.’ In support of this claim he points out that of a random sample of 60 castings
inspected, 59 passed. It is known that the industry average percentage of castings passing quality
control inspections is 90%. Do these results support the Head’s claim?
Analysis
Let X denote the number of castings passing the quality control inspection from the sample of 60.
Assuming that a casting either passes or fails the inspection process, we can assume that X follows
the binomial distribution
X ∼ B(60, p)
where p is the probability that a casting passes the inspection.
The null hypothesis H0 , is that the probability that a casting passes the inspection is the same as
the industry average. The alternative hypothesis H1 , is that the Head of Quality Control is correct
in his claim that castings produced in his foundry have a greater chance of passing the inspection.
The problem can be summarised as:
X ∼ B(60, p)
H0 : p = 0.90 H1 : p > 0.90
The form of the alternative hypothesis dictates that we do a one-tailed test.
If H0 is correct we can calculate the mean and standard deviation of the binomial distribution above
and, assuming that the appropriate conditions are met, use the normal distribution with the same
mean and standard deviation to solve the problem. The calculations are:
$$\mu = np = 60 \times 0.90 = 54$$
$$\sigma = \sqrt{np(1-p)} = \sqrt{60 \times 0.90 \times 0.10} = 2.32$$
Notice that
(a) np > 5 and (b) n(1 − p) > 5
so that we can use the normal approximation to the binomial distribution, that is
$$B(60, 0.90) \approx N(54, 2.32^2)$$
In order to make a decision, we need to know whether or not the value 59 is in the remote tails of
the distribution or in the main body. Recall that the hypotheses are:
H0 : p = 0.90 H1 : p > 0.90
so that we must do a one-tailed test with a critical value of Z = 1.645.
The calculation is:
$$P(X \ge 59) = P\left(Z \ge \frac{58.5 - 54}{2.32}\right) = P(Z \ge 1.94)$$
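The situation is represented by the following figure.
Figure 5: the distribution $X \sim N(54, 2.32^2)$ with the value 58.5 marked, and the standardised distribution $Z = (58.5 - 54)/2.32$, $Z \sim N(0, 1)$, with the value 1.94 marked.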
Since 1.94 > 1.645, the result is significant at the 5% level and so we reject the null hypothesis.
At the 5% level of significance, the evidence supports the alternative hypothesis, that is, the
Head of Quality Control is making a justified claim.
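The calculation in Problem 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binomial_upper_tail_z` is hypothetical (it does not come from the Workbook), and the critical value 1.645 is the one-tailed 5% value quoted above.

```python
from math import sqrt

def binomial_upper_tail_z(n, p0, x):
    """Z value for P(X >= x) under H0: p = p0, using the normal
    approximation to B(n, p0) with a continuity correction of 0.5."""
    mu = n * p0
    sigma = sqrt(n * p0 * (1 - p0))
    # np > 5 and n(1 - p) > 5 are the conditions quoted in the text.
    if not (mu > 5 and n * (1 - p0) > 5):
        raise ValueError("normal approximation conditions not met")
    return (x - 0.5 - mu) / sigma

# Problem 2: 59 of 60 castings pass, industry average 90%.
z = binomial_upper_tail_z(60, 0.90, 59)
print(round(z, 2), z > 1.645)
```

Because the code keeps σ unrounded, its Z values can differ from hand calculations in the second decimal place.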
Task
A firm manufactures heavy current switch units which depend for their correct
operation on a relay. The relays are provided by an outside supplier and out of a
random sample of 150 relays delivered, 140 are found to work correctly. Can the
relay manufacturer justifiably claim that at least 90% of the relays provided will
function correctly?
Your solution
Answer
Let X represent the number of relays working correctly. The required hypotheses are:
X ∼ B(150, p) H0 : p = 0.90 H1 : p > 0.90
We perform a one-tailed test with critical value Z = 1.645. The necessary calculations are:
$$\mu = np = 150 \times 0.90 = 135$$
$$\sigma = \sqrt{np(1-p)} = \sqrt{150 \times 0.90 \times 0.10} = 3.67$$
Since np > 5 and n(1 − p) > 5, we can use the normal approximation to the binomial distribution.
We approximate $B(150, 0.90) \approx N(135, 3.67^2)$. Hence:
$$P(X \ge 140) = P\left(Z \ge \frac{139.5 - 135}{3.67}\right) = P(Z \ge 1.23)$$
Since 1.23 < 1.645 we cannot reject the null hypothesis at the 5% level of significance.
There is insufficient evidence to support the manufacturer’s claim that at least 90% of the relays
provided will function correctly.
2. Tests for population means

Tests concerning a single mean
Introduction
In cases where tests involving measurements are performed, it is often possible to statistically
hypothesize about the results. Suppose that the boiling point of a particular coolant used in car engines
is claimed by a manufacturer to be 110 °C. Further suppose that a series of accurate measurements
made in a laboratory using 8 random samples of the coolant are recorded as:
110.2°, 110.3°, 110.1°, 109.8°, 109.9°, 110.0°, 110.4°, 110.1°
The mean of these results is 110.1 °C.
It is reasonable to ask whether, on the basis of the results obtained, we may claim that the boiling
point of the coolant is greater than the assumed true boiling point of 110 °C. We will return to this
problem later in this Workbook after looking at some general results.
General results
In general terms, we need to make predictions, based on calculation, about the parameters of the
population from which the random sample is drawn. As illustrated above we calculate the sample
mean x̄. The statistical tests used to answer the above question depend on whether the variance of
the population is known or not.
Case (i) - Population variance known
Firstly we form the null hypothesis that there is no difference between the true population mean µ
and the theoretical value µ0 . That is:
$$H_0 : \mu = \mu_0$$
Secondly we consider drawing samples of size n from the population. If n is large (say n ≥ 30) then,
because of the central limit theorem, we can often assume that the sample means approximately
follow a normal distribution with mean $\mu$ and standard deviation (standard error of the mean) $\sigma_n$
given by
$$\sigma_n = \frac{\sigma}{\sqrt{n}}$$
It follows that
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
has a standard normal distribution when the null hypothesis is true. That is, when µ = µ0 , Z ∼
N (0, 1).
We may now set up an alternative hypothesis which can take one of the three forms:
$H_1 : \mu \ne \mu_0$
$H_1 : \mu > \mu_0$
$H_1 : \mu < \mu_0$
depending on the form of deviation from the null hypothesis for which we wish to test. Then we will
reject the null hypothesis at the 5% level of significance if
|Z| > 1.96 for a two-tailed test
Z > 1.645 for a (right) one-tailed test
Z < −1.645 for a (left) one-tailed test
In each case we reject H0 in favour of the alternative hypothesis when Z lies in the remote tail of
the standard normal distribution.
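The critical values 1.96 and 1.645 (and the 1% values 2.58 and 2.33 mentioned earlier) are percentage points of the standard normal distribution. The sketch below shows one way to recover them with Python's standard library; the function name `z_critical` and the tail labels are assumptions made for this illustration.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical value of N(0, 1) for significance level alpha.
    tail is 'two', 'right' or 'left' (labels chosen for this sketch)."""
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        return std.inv_cdf(1 - alpha / 2)   # reject H0 if |Z| exceeds this
    if tail == "right":
        return std.inv_cdf(1 - alpha)       # reject H0 if Z exceeds this
    if tail == "left":
        return std.inv_cdf(alpha)           # reject H0 if Z is below this
    raise ValueError("tail must be 'two', 'right' or 'left'")

for alpha in (0.05, 0.01):
    print(alpha, [round(z_critical(alpha, t), 3) for t in ("two", "right", "left")])
```

The values agree with the tabulated ones once rounded to the number of decimal places used in the text.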
Example 1
Dishwasher powder is poured into the cartons in which it is sold by an automatic
dispensing machine which is set to dispense 3 kg of powder into each carton. In
order to check that the dispensing machine is working to an acceptable standard
(i.e. does not need adjustment), a production engineer takes a random sample
of 40 cartons and weighs them. It is found that the mean weight of the sample
is 3.005 kg. It is known that the dispensing machine operates with a variance of
$0.015^2$ kg$^2$ and that the manufacturer of the powder is willing to rely on a 5%
level of significance. Does the sample provide the engineer with sufficient evidence
that the true mean is not 3.00 kg and so the machine requires adjustment?
Solution
Given that the dispensing machine can over-fill or under-fill the containers, the null and alternative
hypotheses are:
$$H_0 : \mu = 3 \qquad H_1 : \mu \ne 3$$
Since the sample size is large (≥ 30) and we can regard the population as infinite but with a known
variance, we can calculate the relevant value of the test statistic Z by using the formula:
$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
Hence, in this case:
$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{3.005 - 3}{0.015/\sqrt{40}} = 2.108$$
and since we are performing a two-tailed test at the 5% level of significance and have found that
|Z| > 1.96, that is, Z is outside the range [−1.96, 1.96], we must reject the null hypothesis and
conclude that the machine is not operating acceptably and needs adjustment.
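As a check on Example 1, the test statistic can be evaluated numerically. The following is a minimal sketch; `one_sample_z` is a hypothetical helper name, and the tail area printed at the end is simply the area of the standard normal distribution lying beyond ±|Z|.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(xbar, mu0, sigma, n):
    """Z = (xbar - mu0) / (sigma / sqrt(n)) for a known population sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))

# Example 1: 40 cartons, sample mean 3.005 kg, sigma = 0.015 kg.
z = one_sample_z(xbar=3.005, mu0=3.0, sigma=0.015, n=40)
two_tail_area = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 3), abs(z) > 1.96, round(two_tail_area, 3))
```

A two-tail area below 0.05 corresponds to |Z| > 1.96, so both comparisons lead to the same decision.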
Case (ii) - Population variance unknown

We have exactly the same situation as that described in Case (i) but do not know the value of the
population variance $\sigma^2$. Therefore we estimate it using
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
and calculate the test statistic
$$T = \frac{\bar{x} - \mu_0}{\sqrt{s^2/n}}.$$
However, because we are now dividing by an estimate, which is itself random, this test statistic
does not have a standard normal distribution under the null hypothesis. Instead it has a distribution
called Student’s t-distribution on n − 1 degrees of freedom. The number of degrees of freedom
is the same as that which we have already seen when we looked at the $\chi^2$ distribution in connection
with sample variances in Workbook 40. So, for example, instead of comparing Z with ±1.96 for a
two-sided test at the 5% level, when $\sigma^2$ is known, we compare T with a value from the t-distribution
which depends on the sample size through the number of degrees of freedom. The t-distribution is
symmetric, centred at zero and, for all but very small numbers of degrees of freedom, has a shape
similar to that of a standard normal distribution but with a larger variance. A table which gives the
values which we need is provided at the back of this Workbook. For example, if we have a two-sided
test at the 5% level of significance and a sample size n = 15, then the number of degrees of freedom
is 14 and we compare |T| with the upper 2.5% point which is 2.145.
Looking at the table and comparing it with the values for a standard normal distribution we can
see that, as the number of degrees of freedom becomes large, the t-distribution gets closer to the
standard normal distribution so that, for large samples, it makes little difference which we use. It is
also true that, under most circumstances, even if we do not know that the distribution from which
data are drawn is normal, a t-test provides a good approximation when the sample size is reasonably
large. In other circumstances, for example when normality cannot be assumed and the sample is
small, we need to use other procedures, often non-parametric tests.
In summary we have the following.
Population    Variance   Sample size   Test
Normal        Known      Small         Normal (Z)
Normal        Known      Large         Normal (Z)
Normal        Unknown    Small         t
Normal        Unknown    Large         t but Z approximates
Not Normal    Either     Small         Non-parametric
Not Normal    Known      Large         Z approximates
Not Normal    Unknown    Large         Z and t approximate
Non-parametric testing is covered in Workbook 45.
Example 2
The average useful life of a random sample of 33 similar calculator batteries made
on a production line is found to be 99.5 hours continuous use. The sample variance
is 18.49 hours$^2$. Test the null hypothesis that the population mean lifetime is 100
hours against the alternative that it is less. Use the 5% level of significance.
Solution
The null and alternative hypotheses are:
$$H_0 : \mu = 100 \qquad H_1 : \mu < 100$$
Our test statistic is
$$T = \frac{\bar{x} - \mu_0}{\sqrt{s^2/n}}$$
In this case
$$T = \frac{99.5 - 100.0}{\sqrt{18.49/33}} = -0.668$$
and the number of degrees of freedom is n − 1 = 33 − 1 = 32. The table does not give values for
32 degrees of freedom but it does give values for 30 degrees of freedom and for 40 and the values
for 32 must be in between. The lower 5% points for 30 and 40 degrees of freedom are −1.697 and
−1.684 respectively. Clearly our observed value of −0.668 is not significant and we do not have
sufficient evidence to reject the null hypothesis that µ = 100.
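The arithmetic of Example 2 can be reproduced from the summary statistics alone. This sketch assumes a hypothetical helper `one_sample_t`; the critical values are the tabulated lower 5% points quoted above, since the Python standard library has no t-distribution.

```python
from math import sqrt

def one_sample_t(xbar, mu0, s_squared, n):
    """Return T = (xbar - mu0) / sqrt(s^2 / n) and its n - 1 degrees of freedom."""
    return (xbar - mu0) / sqrt(s_squared / n), n - 1

t, dof = one_sample_t(xbar=99.5, mu0=100.0, s_squared=18.49, n=33)

# Lower 5% points quoted in the text for 30 and 40 degrees of freedom.
t_crit_30, t_crit_40 = -1.697, -1.684
# The value for 32 d.o.f. lies between them; using the more extreme
# point is a conservative choice made for this sketch.
reject = t < min(t_crit_30, t_crit_40)
print(round(t, 3), dof, reject)
```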
Task
Solve the coolant boiling point problem given at the start of subsection 2. Note the sample
is small and you will have to estimate the population variance from the sample
variance. Use the tabulated values of the t-distribution given at the end of this
Workbook in conjunction with the appropriate number of degrees of freedom.
Your solution
Answer
$$H_0 : \mu = 110 \qquad H_1 : \mu > 110$$
The value of the sample variance is given by the formula
$$s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{0.28}{7} = 0.04$$
The test statistic t is given by
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{110.1 - 110}{\sqrt{0.04}/\sqrt{8}} = \frac{0.1 \times \sqrt{8}}{0.2} = 1.414$$
At the 5% level of significance and using 8 − 1 = 7 degrees of freedom, the value of $t_{\alpha,\nu}$ from tables
is 1.895. Since 1.414 < 1.895, we cannot reject the null hypothesis in favour of the alternative
hypothesis. On the basis of the evidence available, we are not able to conclude that the boiling
point of the coolant is greater than 110 °C.
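The same test can be run directly on the eight boiling point readings. This is a sketch using Python's `statistics` module, whose `variance` function uses the divisor n − 1; the critical value 1.895 is taken from the text rather than computed.

```python
from math import sqrt
from statistics import mean, variance

readings = [110.2, 110.3, 110.1, 109.8, 109.9, 110.0, 110.4, 110.1]
mu0 = 110.0                      # boiling point claimed by the manufacturer

n = len(readings)
xbar = mean(readings)
s2 = variance(readings)          # sample variance, divisor n - 1
t = (xbar - mu0) / sqrt(s2 / n)  # one-sample t statistic, n - 1 d.o.f.

t_crit = 1.895                   # upper 5% point for 7 d.o.f., quoted in the text
print(round(xbar, 1), round(s2, 2), round(t, 3), t > t_crit)
```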
General comments about tests concerning a population mean
(a) The sample mean x̄ is often used as a test statistic when testing a hypothesis concerning
a population mean µ.
(b) Even if the population distribution cannot be assumed to be normal, the distribution of
sample means can often be assumed to be normal. This depends on the sample size.
(c) The tests described above sometimes require us to assume that the population variance
is known. This is often unrealistic and we turn to the t-test to deal with cases where the
population standard deviation is unknown and must be estimated from the data available.
General comments on the t-test
(a) The test only applies when the underlying distribution can be assumed to be normal.
(b) The test is used when the standard deviation of the parent population has to be estimated.
(c) As the sample size n gets larger, the distribution approximates to the standard normal
distribution.
(d) The distribution depends on the number of degrees of freedom. For a single sample or
equal paired samples (see below), the number of degrees of freedom is always one less
than the sample size.
Tests concerning paired data

Sometimes experimental data may be directly compared using an appropriate test. The following
Example looks at experimental data concerning the throttle reaction times of two turbochargers fitted
to an internal combustion engine.
Example 3
In order to test the hypothesis that two standard turbochargers A and B have
the same throttle reaction times, a random sample of 7 cars were fitted with
the turbochargers and the throttle reaction times measured. The results were as
follows:
Car                                  1       2       3        4       5       6       7
Throttle Reaction time for A; R1     0.223   0.212   0.201    0.205   0.216   0.211   0.209
Throttle Reaction time for B; R2     0.208   0.207   0.203    0.204   0.205   0.202   0.206
D = R1 − R2                          0.015   0.005   −0.002   0.001   0.011   0.009   0.003
Solution
Let D be the difference between the throttle reaction times of the two turbochargers. We assume
that the distribution of D is normal. Our null hypothesis is that µD , the mean of the population of
differences, is zero. We must decide between the two hypotheses
$$H_0 : \mu_D = 0 \qquad H_1 : \mu_D \ne 0$$
The alternative hypothesis here indicates that we perform a two-tailed test.
Let $\bar{d}$ be the sample mean of the seven observed differences. Then
$$\bar{d} = \frac{\sum d}{7} = \frac{0.042}{7} = 0.006$$
The sample variance of the differences is
$$s_d^2 = \frac{\sum (d - \bar{d})^2}{n-1} = \frac{0.000214}{6} = 3.5667 \times 10^{-5}$$
The value of the test statistic is
$$|t| = \frac{|\bar{d} - 0|}{\sqrt{s_d^2/n}} = \frac{0.006}{\sqrt{3.5667 \times 10^{-5}/7}} = 2.658$$
The number of degrees of freedom is 7 − 1 = 6 and the critical value from the table is 2.447. Since
2.658 > 2.447 we reject H0 at the 5% level and conclude that the evidence suggests that there is
a difference in the throttle reaction times between the two turbochargers.
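A paired test reduces to a one-sample t-test on the differences, which the following sketch makes explicit. The helper name `paired_t` is an assumption of this illustration; the data are those of Example 3 and the critical value 2.447 is the tabulated one quoted above.

```python
from math import sqrt
from statistics import mean, variance

def paired_t(first, second):
    """t statistic and degrees of freedom for H0: mean difference = 0."""
    d = [a - b for a, b in zip(first, second)]   # paired differences
    n = len(d)
    return mean(d) / sqrt(variance(d) / n), n - 1

r1 = [0.223, 0.212, 0.201, 0.205, 0.216, 0.211, 0.209]  # turbocharger A
r2 = [0.208, 0.207, 0.203, 0.204, 0.205, 0.202, 0.206]  # turbocharger B
t, dof = paired_t(r1, r2)
print(round(t, 3), dof, abs(t) > 2.447)
```

The function returns a signed statistic, so for a two-tailed test its absolute value is compared with the critical value.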
Task
Two different methods of analysis were used to determine the levels of impurity
present in a particular aircraft quality aluminium alloy. Eight specimens were
analysed using both methods. Does the available evidence suggest that both
methods lead to the same results?
Alloy Specimen       1      2      3      4       5      6      7      8
Test 1               1.24   1.23   1.24   1.20    1.21   1.22   1.23   1.22
Test 2               1.23   1.20   1.20   1.21    1.20   1.20   1.21   1.25
D = Test1 − Test2    0.01   0.03   0.04   −0.01   0.01   0.02   0.02   −0.03
Your solution
Answer
Let D be the difference between the two methods of analysis. We assume that the distribution of
D is normal. Our null hypothesis is that µD , the mean of the population of differences, is zero. We
must decide between the two hypotheses
$$H_0 : \mu_D = 0 \qquad H_1 : \mu_D \ne 0$$
The alternative hypothesis here indicates that we perform a two-tailed test.
Let $\bar{d}$ be the sample mean of the eight observed differences. Then
$$\bar{d} = \frac{\sum d}{8} = \frac{0.09}{8} = 0.01125$$
The sample variance of the differences is
$$s_d^2 = \frac{\sum (d - \bar{d})^2}{n-1} = \frac{0.0034875}{7} = 0.0004982$$
The value of the test statistic is
$$|t| = \frac{|\bar{d} - 0|}{\sqrt{s_d^2/n}} = \frac{0.01125}{\sqrt{0.0004982/8}} = 1.426$$
The number of degrees of freedom is 8 − 1 = 7 and the critical value from the table is 2.365. Since
−2.365 < 1.426 < 2.365 we do not reject H0 at the 5% level and conclude that there is insufficient
evidence to show that there is a difference between the two methods.
Tests Concerning
Two Samples 41.3
Introduction
So far we have dealt with situations in which we either had a single sample drawn from a population,
or paired data whose differences were considered essentially as a single sample.
In this Section we shall look at the situations occurring when we have two random samples each
drawn from independent populations. While the basic ideas involved will essentially repeat those
already met, you will find that the calculations involved are more complex than those already covered.
However, you will find as before that calculations do follow particular routines. Note that in general
the samples will be of different sizes. Cases involving samples of the same size, while included, should
be regarded as special cases.

Prerequisites
Before starting this Section you should . . .
• be familiar with the normal distribution, t-distribution, F-distribution and chi-squared distribution

Learning Outcomes
On completion you should be able to . . .
• apply the ideas of hypothesis testing to a range of problems underpinned by a substantial range of statistical distributions and involving two samples of different sizes

1. Tests concerning two samples
Two independent populations each with a known variance
We assume that the populations are normally distributed. This may not always be true and you
should note this basic assumption while studying this Section of the Workbook.
A standard notation often used to describe the populations and samples is:
Population                         Sample
$X_1 \sim N(\mu_1, \sigma_1^2)$    $x_{11}, x_{12}, x_{13}, \cdots, x_{1n_1}$ with $n_1$ members.
$X_2 \sim N(\mu_2, \sigma_2^2)$    $x_{21}, x_{22}, x_{23}, \cdots, x_{2n_2}$ with $n_2$ members.
If you are not familiar with the double suffix notation used to represent the samples, simply remember
that a random sample of size $n_1$ is drawn from $X_1 \sim N(\mu_1, \sigma_1^2)$ and a random sample of size $n_2$ is
drawn from $X_2 \sim N(\mu_2, \sigma_2^2)$.
In diagrammatic form the populations may be represented as follows:
Figure 6: the normal populations $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$, centred at $\mu_1$ and $\mu_2$ respectively.
When we look at hypothesis testing using two means, we will be considering the difference µ1 − µ2
of the means and writing null hypotheses of the form
H0 : µ1 − µ2 = Value
As you might expect, Value will often be zero and we will be trying to detect whether there is any
statistically significant evidence of a difference between the means.
We know, from our previous work on continuous distributions (see Workbook 38) that:
$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
and that
$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
since X̄1 and X̄2 are independent. Given the assumptions made we can assert that the quantity Z
defined by
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
follows the standard normal distribution N (0, 1).
We are now ready to apply this formula to practical problems in which random samples of different
sizes are drawn from normal populations. The conditions for the rejection of H0 at the 5% and the
1% levels of significance are exactly the same as those previously used for single sample problems.
Example 4
A motor manufacturer wishes to replace steel suspension components by aluminium
components to save weight and thereby improve performance and fuel consump-
tion. Tensile strength tests are carried out on randomly chosen samples of two
possible components before a final choice is made. The results are:
Component   Sample   Mean Tensile Strength   Standard Deviation
Number      Size     (kg mm⁻²)               (kg mm⁻²)
1           15       90                      2.3
2           10       88                      2.2
Is there any difference between the measured tensile strengths at the 5% level of
significance?
Solution
$$H_0 : \mu_1 - \mu_2 = 0 \qquad H_1 : \mu_1 - \mu_2 \ne 0$$
The null hypothesis represents the statement ‘there is no difference in the tensile strengths of the
two components.’ The test statistic Z is calculated as:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(90 - 88) - (0)}{\sqrt{\dfrac{2.3^2}{15} + \dfrac{2.2^2}{10}}} = \frac{2}{\sqrt{0.3527 + 0.484}} = 2.186$$
Since 2.186 > 1.96 we conclude that, on the basis of the (limited) evidence available, there is a
difference in tensile strength between the components tested. The manufacturer should carry out
more comprehensive tests before making a final decision as to which component to use. The decision
is a serious one with safety implications as well as economic implications. As well as carrying out
more tests the manufacturer should consider the level of rejection of the null hypothesis, perhaps
using 1% instead of 5%. Component 1 appears to be stronger but this may not be the case after
more tests are carried out.
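The two-sample Z statistic of Example 4 can be sketched as a short function. The name `two_sample_z` and its `value` parameter (the hypothesised difference µ1 − µ2, zero by default) are assumptions made for this illustration.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, value=0.0):
    """Z for H0: mu1 - mu2 = value when both standard deviations are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - value) / standard_error

# Example 4: tensile strengths of the two suspension components.
z = two_sample_z(90, 88, 2.3, 2.2, 15, 10)
print(round(z, 3), abs(z) > 1.96)
```

Because no intermediate values are rounded, the third decimal place may differ slightly from a hand calculation.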
Task
A motor manufacturer is considering whether or not a new fuel formulation will
improve the maximum power output of a particular type of engine. Tests are
carried out on randomly chosen samples of the two fuels in order to inform a
decision. The results are:
Fuel   Sample   Mean Maximum         Standard Deviation
Type   Size     Power Output (bhp)   (bhp)
1      20       135                  10
2      16       131                  8
Is there any difference between the measured power outputs at the 5% level of
significance?
Your solution
Answer
$$H_0 : \mu_1 - \mu_2 = 0 \qquad H_1 : \mu_1 - \mu_2 \ne 0$$
The null hypothesis represents the statement ‘there is no difference in the measured maximum power
outputs’. The test statistic Z is calculated as:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(135 - 131) - (0)}{\sqrt{\dfrac{10^2}{20} + \dfrac{8^2}{16}}} = \frac{4}{\sqrt{5 + 4}} = 1.33$$
Since 1.33 < 1.96 we conclude that, on the basis of the (limited) data available, there is
insufficient evidence of a difference in the maximum power output of the
engines tested when run on the different types of fuel.
Two independent populations each with an unknown variance

Again we assume that the populations are normally distributed and use the same standard notation
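used previously to describe the populations and samples, namely:
Population                         Sample
$X_1 \sim N(\mu_1, \sigma_1^2)$    $x_{11}, x_{12}, x_{13}, \cdots, x_{1n_1}$ with $n_1$ members.
$X_2 \sim N(\mu_2, \sigma_2^2)$    $x_{21}, x_{22}, x_{23}, \cdots, x_{2n_2}$ with $n_2$ members.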
There are two distinct cases to consider. Firstly, we will assume that although the variances are
unknown, they are in fact equal. Secondly, we will assume that the unknown variances are not
necessarily equal.
Case (i) - Unknown but equal variances
Again, when we look at hypothesis testing using two means, we will be considering the difference
µ1 − µ2 of the means and writing null hypotheses of the form
$$H_0 : \mu_1 - \mu_2 = \text{Value}$$
and again Value will often be zero and we will be trying to detect whether there is any statistically
significant difference between the means.
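We will take $\sigma_1^2 = \sigma_2^2 = \sigma^2$ so that in diagrammatic form the populations are:
Figure 7: the normal populations $X_1 \sim N(\mu_1, \sigma^2)$ and $X_2 \sim N(\mu_2, \sigma^2)$, centred at $\mu_1$ and $\mu_2$ respectively.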
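The results from our work on continuous distributions (see Workbook 38) tell us that:
$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
as before, and that
$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$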
Given that we do not know the value of σ, we must estimate it. This is done by combining (or
pooling) the sample variances, say $S_1^2$ and $S_2^2$ for samples 1 and 2 respectively, according to the
formula:
$$S_c^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Notice that
$$S_c^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{(n_1 - 1)S_1^2}{n_1 + n_2 - 2} + \frac{(n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
so that you can see that $S_c^2$ is a weighted average of $S_1^2$ and $S_2^2$. In fact, each sample variance is
weighted according to the number of degrees of freedom available. Notice also that the first sample
contributes $n_1 - 1$ degrees of freedom and the second sample contributes $n_2 - 1$ degrees of freedom
so that $S_c^2$ has $n_1 + n_2 - 2$ degrees of freedom.
Since we are estimating unknown variances, the quantity T defined by
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_c\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
will follow Student’s t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
We are now ready to apply this formula to practical problems in which random samples of different
sizes with unknown but equal variances are drawn from independent normal populations. The con-
ditions for the rejection of H0 at the 5% and the 1% levels of significance are found from tables of
the t-distribution (Table 2), a copy of which is included at the end of this Workbook.
Example 5
A manufacturer of electronic equipment has developed a circuit to feed current
to a particular component in a computer display screen. While the new design is
cheaper to manufacture, it can only be adopted for mass production if it passes
the same average current to the component. In tests involving the two circuits,
the following results are obtained.
Test Number   Circuit 1 - Current (mA)   Circuit 2 - Current (mA)
1             80.1                       80.7
2             82.3                       81.3
3             84.1                       84.6
4             82.6                       81.7
5             85.3                       86.3
6             81.3                       84.3
7             83.2                       83.7
8             81.7                       84.7
9             82.2                       82.8
10            81.4                       84.4
11                                       85.2
12                                       84.9
On the assumption that the populations from which the samples are drawn have
equal variances, should the manufacturer replace the old circuit design by the
new one? Use the 5% level of significance.
Solution
If the average current flows are represented by µ1 and µ2 we form the hypotheses
$$H_0 : \mu_1 - \mu_2 = 0 \qquad H_1 : \mu_1 - \mu_2 \ne 0$$
The sample means are $\bar{X}_1 = 82.42$ and $\bar{X}_2 = 83.72$.
The sample variances are $S_1^2 = 2.00$ and $S_2^2 = 2.72$.
The pooled estimate of the variance is
$$S_c^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9 \times 2.00 + 11 \times 2.72}{20} = 2.396$$
The test statistic is
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_c\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{82.42 - 83.72}{\sqrt{2.396}\,\sqrt{\dfrac{1}{10} + \dfrac{1}{12}}} = -1.961$$
From t-tables, the critical values with 20 degrees of freedom and a two-tailed test are ±2.086. Since
−2.086 < −1.961 < 2.086 we conclude that we cannot reject the null hypothesis in favour of the
alternative. A 95% confidence interval for the difference between the mean currents is given by
$$\bar{x}_1 - \bar{x}_2 \pm 2.086 \times S_c\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
The confidence interval is $-2.683 < \mu_1 - \mu_2 < 0.083$.
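The pooled-variance calculation can be sketched as follows, starting from the summary statistics quoted in the Solution. The helper name `pooled_t` is hypothetical; it takes $S_c$ as the square root of the pooled variance $S_c^2$, and the critical value 2.086 is the tabulated one quoted above.

```python
from math import sqrt

def pooled_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2, value=0.0):
    """Pooled variance, T statistic, degrees of freedom and standard error
    for H0: mu1 - mu2 = value, assuming equal population variances."""
    dof = n1 + n2 - 2
    sc_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / dof   # pooled variance
    standard_error = sqrt(sc_sq) * sqrt(1 / n1 + 1 / n2)  # S_c * sqrt(1/n1 + 1/n2)
    t = ((xbar1 - xbar2) - value) / standard_error
    return sc_sq, t, dof, standard_error

sc_sq, t, dof, se = pooled_t(82.42, 83.72, 2.00, 2.72, 10, 12)

t_crit = 2.086  # two-tailed 5% point for 20 d.o.f., as quoted in the text
lower = (82.42 - 83.72) - t_crit * se
upper = (82.42 - 83.72) + t_crit * se
print(round(sc_sq, 3), round(t, 3), dof, round(lower, 3), round(upper, 3))
```

Returning the standard error alongside T lets the same quantity be reused for the confidence interval.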
Task
A manufacturer of steel cables used in the construction of suspension bridges has
experimented with a new type of steel which it is hoped will result in the cables
produced being stronger in the sense that they will accept greater tension loads
before failure. In order to test the performance of the new cables in comparison
with the old cables, samples are tested for failure under tension. The following
results were obtained, the failure tensions being given in kg × 10³.
Test Number   New Cable   Original Cable
1             92.7        90.2
2             91.6        92.4
3             94.7        94.7
4             93.7        92.1
5             96.5        95.9
6             94.3        91.1
7             93.7        93.2
8             96.8        91.5
9             98.9
10            99.9
The cable manufacturer, on looking at health and safety legislation, decides that a
1% level of significance should be used in any statistical testing procedure adopted
to distinguish between the cables. On the basis of the results given, should the
manufacturer replace the old cable by the new one? You may assume that the
populations from which the samples are drawn have equal variances.
Your solution
Answer
If the average tensions are represented by µ1 (new cable) and µ2 (old cable) we form the hypotheses
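$$H_0 : \mu_1 - \mu_2 = 0 \qquad H_1 : \mu_1 - \mu_2 > 0$$
in order to test the hypothesis that the new cable is stronger on average than the old cable.
The pooled estimate of the variance is
$$S_c^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9 \times 6.47 + 7 \times 3.14}{16} = 5.013$$
The test statistic is
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_c\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{95.28 - 92.64}{2.239\sqrt{\dfrac{1}{10} + \dfrac{1}{8}}} = \frac{2.64}{2.239 \times \sqrt{0.225}} = 2.486$$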
Using t-distribution tables with 16 degrees of freedom, we see that the critical value at the 1% level
of significance is 2.583. Since 2.486 < 2.583 we conclude that we cannot reject the null hypothesis
in favour of the alternative. However, the close result indicates that more tests should be carried
out before making a final decision. At this stage the cable manufacturer should not replace the old
cable by the new one on the basis of the evidence available.
Case (ii) - Unknown and unequal variances

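In this case we will take $\sigma_1^2 \ne \sigma_2^2$ so that in diagrammatic form the populations may be represented
as shown below.
Figure 8: the normal populations $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$, centred at $\mu_1$ and $\mu_2$ respectively.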
Again, when we look at hypothesis testing using two means, we will be considering the difference
µ1 − µ2 of the means and writing null hypotheses of the form
$$H_0 : \mu_1 - \mu_2 = \text{Value}$$
and again Value will often be zero and we will be trying to detect whether there is any statistically
significant difference between the means.
In the case where we assume unequal variances, there is no exact statistic which we can use to test
the validity or otherwise of the null hypothesis H0 : µ1 − µ2 = Value. However, the following
approximation in Key Point 1 allows us to overcome this problem.
Key Point 1
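Provided that the null hypothesis is true, the statistic
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
will approximately follow Student’s t-distribution with the number of degrees of freedom given by the
expression:
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 + 1} + \dfrac{(S_2^2/n_2)^2}{n_2 + 1}} - 2$$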
Essentially, this means that the actual test procedure is similar to that used previously but with T
and the number of degrees of freedom ν calculated using the above formulae.
We are now ready to apply these formulae to practical problems in which random samples of different
sizes with unknown and unequal variances are drawn from independent normal populations. We will
illustrate the test procedure by reworking an Example and Task done previously but we will assume
unequal rather than equal variances.
This next Example is a repeat of Example 5 but here assuming unequal variances.
Example 6
In tests involving the two circuits, the results obtained are:
Test Number   Circuit 1 - Current (mA)   Circuit 2 - Current (mA)
1             80.1                       80.7
2             82.3                       81.3
3             84.1                       84.6
4             82.6                       81.7
5             85.3                       86.3
6             81.3                       84.3
7             83.2                       83.7
8             81.7                       84.7
9             82.2                       82.8
10            81.4                       84.4
11                                       85.2
12                                       84.9
On the assumption that the populations from which the samples are drawn do not
have equal variances, should the manufacturer replace the old circuit design by
the new one? Use the 5% level of significance.
Solution
If the average current flows are represented by µ1 and µ2 we form the hypotheses
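$$H_0 : \mu_1 - \mu_2 = 0 \qquad H_1 : \mu_1 - \mu_2 \ne 0$$
The test statistic is
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{82.42 - 83.72}{\sqrt{\dfrac{2.00}{10} + \dfrac{2.72}{12}}} = -\frac{1.3}{\sqrt{0.427}} = -1.990$$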
The number of degrees of freedom is given by
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 + 1} + \dfrac{(S_2^2/n_2)^2}{n_2 + 1}} - 2 = \frac{\left(\dfrac{2.00}{10} + \dfrac{2.72}{12}\right)^2}{\dfrac{(2.00/10)^2}{11} + \dfrac{(2.72/12)^2}{13}} - 2 = \frac{0.182}{0.004 + 0.004} - 2 \approx 21$$
From t-tables, the critical values (two-tailed test, 5% level of significance) are ±2.080. Since
−2.080 < −1.990 < 2.080 we conclude that there is insufficient evidence to reject the null
hypothesis in favour of the alternative at the 5% level of significance.
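The statistic and the approximate number of degrees of freedom used in this Example can be evaluated together. The sketch below assumes a hypothetical helper `unequal_variance_t` and uses the unpooled standard error $\sqrt{S_1^2/n_1 + S_2^2/n_2}$ that appears in the Solution.

```python
from math import sqrt

def unequal_variance_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2, value=0.0):
    """T and approximate degrees of freedom nu for H0: mu1 - mu2 = value,
    without assuming equal population variances."""
    v1, v2 = s1_sq / n1, s2_sq / n2            # S1^2/n1 and S2^2/n2
    t = ((xbar1 - xbar2) - value) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 + 1) + v2**2 / (n2 + 1)) - 2
    return t, nu

t, nu = unequal_variance_t(82.42, 83.72, 2.00, 2.72, 10, 12)
print(round(t, 3), round(nu, 2))
```

Evaluated without intermediate rounding, ν comes out just below 22, which truncates to the 21 degrees of freedom used with the tables above.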
This next Task is a repeat of the earlier steel cable Task but assuming unequal variances.
Task
In order to test the performance of the new cables in comparison
with the old cables, samples are tested for failure under tension. The results
obtained are given below where the failure tensions are given in kg × 10³.
Test Number   New Cable   Original Cable
1             92.7        90.2
2             91.6        92.4
3             94.7        94.7
4             93.7        92.1
5             96.5        95.9
6             94.3        91.1
7             93.7        93.2
8             96.8        91.5
9             98.9
10            99.9
The cable manufacturer, on looking at health and safety legislation, decides that a
1% level of significance should be used in any statistical testing procedure adopted
to distinguish between the cables. On the basis of the results given and assuming
that the populations from which the samples are drawn do not have equal
variances, should the manufacturer replace the old cable by the new one?
Your solution
Answer
If the average tensions are represented by µ1 (new cable) and µ2 (old cable), we form the hypotheses
$$H_0 : \mu_1 - \mu_2 = 0 \qquad H_1 : \mu_1 - \mu_2 > 0$$
in order to test the hypothesis that the new cable is stronger on average than the old cable.
The test statistic is
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{95.28 - 92.64}{\sqrt{\dfrac{6.47}{10} + \dfrac{3.14}{8}}} = \frac{2.64}{\sqrt{1.040}} = 2.589$$
The number of degrees of freedom is given by
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 + 1} + \dfrac{(S_2^2/n_2)^2}{n_2 + 1}} - 2 = \frac{\left(\dfrac{6.47}{10} + \dfrac{3.14}{8}\right)^2}{\dfrac{(6.47/10)^2}{11} + \dfrac{(3.14/8)^2}{9}} - 2 = \frac{1.081}{0.038 + 0.017} - 2 \approx 18$$
Using t-distribution tables with 18 degrees of freedom, we see that the critical value at the 1%
level of significance is 2.552. Since 2.589 > 2.552 we conclude that we reject the null hypothesis
in favour of the alternative. Notice that the result could still be considered marginal. The cable
manufacturer should exercise caution if the old cable is replaced by the new one on the basis of the
evidence available.
The F -test
In the tests above, we distinguished between the cases of equal and unequal variances of samples
chosen from independent normal populations. As you have seen, the analysis changes according to
the assumptions made, and the conclusions reached and recommendations made (accepting or rejecting a
null hypothesis, for example) may also change. In view of this, we may wish to test whether the
variances $\sigma_1^2$ and $\sigma_2^2$ of the independent normal populations shown in Figure 9 may be
regarded as equal.
Figure 9: the normal populations $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$, centred at $\mu_1$ and $\mu_2$ respectively.
Essentially, we will test the null hypothesis
$$H_0 : \sigma_1^2 = \sigma_2^2$$
against one of the alternatives
$$H_1 : \sigma_1^2 \ne \sigma_2^2 \qquad H_1 : \sigma_1^2 > \sigma_2^2 \qquad H_1 : \sigma_1^2 < \sigma_2^2$$
In order to do this, we use the F-distribution. The hypothesis test for the equality of two variances
$\sigma_1^2$ and $\sigma_2^2$ is encapsulated in the following Key Point.
Key Point 2
Consider a random sample of size $n_1$ taken from a normal population with mean $\mu_1$ and variance $\sigma_1^2$
and a random sample of size $n_2$ taken from a second normal population with mean $\mu_2$ and variance
$\sigma_2^2$. Denote the respective sample variances by $S_1^2$ and $S_2^2$ and assume that the populations are
independent. The ratio
\[ F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \]
follows an F-distribution in which the numerator has $n_1 - 1$ degrees of freedom and the denominator
has $n_2 - 1$ degrees of freedom.
Note that if the null hypothesis $H_0 : \sigma_1^2 = \sigma_2^2$ is true, then the value of F reduces to the ratio of
the sample variances, so that in this case
\[ F = \frac{S_1^2}{S_2^2} \]
Note
Recall that if a random sample of size $n_1$ is taken from a normal population with mean $\mu_1$ and
variance $\sigma_1^2$ and if the sample variance is denoted by $S_1^2$, the random variable
\[ X_1^2 = \frac{(n_1 - 1)S_1^2}{\sigma_1^2} \]
has a $\chi^2$ distribution with $n_1 - 1$ degrees of freedom. Similarly, if a random sample of size $n_2$ is
taken from a normal population with mean $\mu_2$ and variance $\sigma_2^2$ and if the sample variance is denoted
by $S_2^2$, the random variable
\[ X_2^2 = \frac{(n_2 - 1)S_2^2}{\sigma_2^2} \]
has a $\chi^2$ distribution with $n_2 - 1$ degrees of freedom. This means that the ratio
\[ F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{X_1^2/(n_1 - 1)}{X_2^2/(n_2 - 1)} \]
is a ratio of $\chi^2$ random variables, each divided by its number of degrees of freedom, with $n_1 - 1$ degrees
of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator. Under the null hypothesis
\[ H_0 : \sigma_1^2 = \sigma_2^2 \]
we know that the expression for F reduces to
\[ F = \frac{S_1^2}{S_2^2} \]
and we say that F has an F-distribution with $n_1 - 1$ degrees of freedom in the numerator and $n_2 - 1$
degrees of freedom in the denominator. This distribution is denoted by
\[ F_{n_1 - 1,\, n_2 - 1} \]
and some tabulated values are given in Tables 3 and 4 at the end of this Workbook.
If you check Tables 3 and 4, you will find that only right-tail values are given. The left-tail values
are calculated by using the following formula:
\[ f_{1-\alpha,\, n_1 - 1,\, n_2 - 1} = \frac{1}{f_{\alpha,\, n_2 - 1,\, n_1 - 1}} \]
Note the reversal in the order in which the expressions for the number of degrees of freedom occur.
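The reciprocal rule can be sketched in a few lines of code. The example below assumes a small excerpt of Table 3 (5% right tail) stored in a dictionary keyed by (numerator, denominator) degrees of freedom. The names `F_05` and `left_tail` are illustrative only and are not part of the Workbook.

```python
# Excerpt of Table 3 (5% right tail); keys are (numerator df, denominator df).
F_05 = {
    (4, 3): 9.12, (3, 4): 6.59,
    (8, 2): 19.37, (2, 8): 4.46,
    (7, 8): 3.50, (8, 7): 3.73,
}

def left_tail(table, u, v):
    """Return f_{1-alpha,u,v} = 1 / f_{alpha,v,u}; note the swapped degrees of freedom."""
    return 1.0 / table[(v, u)]

left_values = {(u, v): round(left_tail(F_05, u, v), 3)
               for (u, v) in [(4, 3), (8, 2), (7, 8)]}
print(left_values)
```

Looking up the swapped pair (v, u) inside the function keeps the reversal of the degrees of freedom in one place, so the caller always writes the pair in its natural order.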
Example 7
The following is an extract from the F -distribution tables (5% tail) given at the
end of this Workbook.
[Figure: F-distribution curve with the 5% right-tail area beyond $f_{0.05,u,\nu}$ marked]
Degrees of Freedom for the Numerator (u)

ν 1 2 3 4 5 6 7 8 9 10 20 30 40 60 ∞
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 248.0 250.1 251.1 252.2 254.3
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.45 19.46 19.47 19.48 19.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.66 8.62 8.59 8.55 8.53
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.80 5.75 5.72 5.69 5.63
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.56 4.53 4.46 4.43 4.36
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 3.87 3.81 3.77 3.74 3.67
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.44 3.38 3.34 3.30 3.23
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.15 3.08 3.04 3.01 2.93
Figure 10
Write down or calculate as appropriate, the following values of F from the table:
Right-tail Values Left-tail Values

f0.05,4,3 f0.95,4,3
f0.05,8,2 f0.95,8,2
f0.05,7,8 f0.95,7,8
Solution
The right-tail values are read directly from the tables. The left-tail values are calculated using the
formula given above.
\[ \begin{array}{ll}
f_{0.05,4,3} = 9.12 & f_{0.95,4,3} = \dfrac{1}{f_{0.05,3,4}} = \dfrac{1}{6.59} = 0.152 \\[2ex]
f_{0.05,8,2} = 19.37 & f_{0.95,8,2} = \dfrac{1}{f_{0.05,2,8}} = \dfrac{1}{4.46} = 0.224 \\[2ex]
f_{0.05,7,8} = 3.50 & f_{0.95,7,8} = \dfrac{1}{f_{0.05,8,7}} = \dfrac{1}{3.73} = 0.268
\end{array} \]
Task
Write down or calculate as appropriate, the following values of F from the tables
given at the end of this Workbook.

f0.05,10,20 f0.95,10,20
f0.05,5,30 f0.95,5,30
f0.05,20,7 f0.95,20,7
f0.025,10,10 f0.975,10,10
f0.025,8,30 f0.975,8,30
f0.025,20,30 f0.975,20,30
Your solution

f0.05,10,20 = f0.95,10,20 =
f0.05,5,30 = f0.95,5,30 =
f0.05,20,7 = f0.95,20,7 =
f0.025,10,10 = f0.975,10,10 =
f0.025,8,30 = f0.975,8,30 =
f0.025,20,30 = f0.975,20,30 =
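Answer
\[ \begin{array}{ll}
f_{0.05,10,20} = 2.35 & f_{0.95,10,20} = \dfrac{1}{f_{0.05,20,10}} = \dfrac{1}{2.77} = 0.361 \\[2ex]
f_{0.05,5,30} = 2.53 & f_{0.95,5,30} = \dfrac{1}{f_{0.05,30,5}} = \dfrac{1}{4.53} = 0.221 \\[2ex]
f_{0.05,20,7} = 3.44 & f_{0.95,20,7} = \dfrac{1}{f_{0.05,7,20}} = \dfrac{1}{2.51} = 0.398 \\[2ex]
f_{0.025,10,10} = 3.72 & f_{0.975,10,10} = \dfrac{1}{f_{0.025,10,10}} = \dfrac{1}{3.72} = 0.269 \\[2ex]
f_{0.025,8,30} = 2.65 & f_{0.975,8,30} = \dfrac{1}{f_{0.025,30,8}} = \dfrac{1}{3.89} = 0.257 \\[2ex]
f_{0.025,20,30} = 2.20 & f_{0.975,20,30} = \dfrac{1}{f_{0.025,30,20}} = \dfrac{1}{2.35} = 0.426
\end{array} \]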
We are now in a position to use the F -test to solve engineering problems. The application of the
F -test will be illustrated by using the data given in a previous worked example in order to determine
whether the assumption of equal variability in the samples used is realistic.
This next Example was met as Example 5 (page 24). Here we test one of the underlying assumptions.
Example 8
the results obtained are
1 80.1 80.7
2 82.3 81.3
3 84.1 84.6
4 82.6 81.7
5 85.3 86.3
6 81.3 84.3
7 83.2 83.7
8 81.7 84.7
9 82.2 82.8
10 81.4 84.4
11 85.2
12 84.9
In Example 5 we worked on the assumption that the populations from which the
samples are drawn have equal variances. Is this assumption valid at the 5% level
of significance?
Note that the manufacturer may also be interested in knowing whether the variances
are equal as well as the means. We shall not address that problem here but
it can be argued that equality of variances will facilitate consistent performance
from the components.
Solution
We form the hypotheses
\[ H_0 : \sigma_1^2 = \sigma_2^2 \qquad H_1 : \sigma_1^2 \neq \sigma_2^2 \]
and perform a two-tailed test. The test statistic is
\[ F = \frac{S_1^2}{S_2^2} = \frac{2.00}{2.72} = 0.735 \]
which has an F-distribution with 9 degrees of freedom in the numerator and 11 degrees of freedom
in the denominator.
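We require two 2.5% tails, that is we require right-tail $f_{0.025,9,11} = 3.59$ and left-tail $f_{0.975,9,11}$. The
latter may be approximated by interpolating between the tabulated values for 10 and 20 degrees of freedom as follows:
\[ f_{0.975,9,11} = \frac{1}{f_{0.025,11,9}} \approx \frac{\left(\dfrac{1}{11} - \dfrac{1}{20}\right)\dfrac{1}{f_{0.025,10,9}} + \left(\dfrac{1}{10} - \dfrac{1}{11}\right)\dfrac{1}{f_{0.025,20,9}}}{\dfrac{1}{10} - \dfrac{1}{20}} \]
\[ \approx \frac{\dfrac{0.040909}{3.96} + \dfrac{0.009091}{3.67}}{0.05} \approx \frac{0.81818}{3.96} + \frac{0.18182}{3.67} = 0.256 \]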
Since 0.256 < 0.735 < 3.59 we conclude that we cannot reject the null hypothesis in favour of the
alternative at the 5% level of significance. The evidence is consistent with the assumption that the
populations have equal variances.
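The interpolation used for the left-tail value can be written as a short function. This is a sketch only: the function name is hypothetical, and it assumes the same weighting by reciprocals of the degrees of freedom as in the solution, applied to the tabulated values $f_{0.025,10,9} = 3.96$ and $f_{0.025,20,9} = 3.67$ from Table 4.

```python
def interpolate_reciprocal(u, u_lo, f_lo, u_hi, f_hi):
    """Estimate 1/f at numerator df u from table entries at u_lo < u < u_hi.

    The weights are linear in 1/df, mirroring the factors
    (1/u - 1/u_hi) and (1/u_lo - 1/u) used in the worked solution.
    """
    span = 1 / u_lo - 1 / u_hi
    w_lo = (1 / u - 1 / u_hi) / span     # weight on the u_lo table entry
    w_hi = (1 / u_lo - 1 / u) / span     # weight on the u_hi table entry
    return w_lo / f_lo + w_hi / f_hi

# Approximate f_{0.975,9,11} = 1 / f_{0.025,11,9}
left = interpolate_reciprocal(11, 10, 3.96, 20, 3.67)
f_ratio = 2.00 / 2.72                    # test statistic from Example 8
reject = not (left < f_ratio < 3.59)     # two-tailed test at the 5% level
print(round(left, 3), round(f_ratio, 3), reject)
```

The two weights always sum to one, so the estimate lies between the reciprocals of the two tabulated values.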
Note that we can adopt the rule (many statisticians do this) of always dividing the larger S 2 value
by the smaller S 2 value so that you only need to look up right tail values.
This next Task was first met on page 25. Here we test one of the underlying assumptions.
Task
with the old cables, samples are tested for failure under tension. The results
obtained are as follows, where the failure tensions are given in tonnes.
Test Number New steel cable tension Old steel cable tension
1 80.1 80.7
2 82.3 81.3
3 84.1 84.6
4 82.6 81.7
5 85.3 86.3
6 81.3 84.3
7 83.2 83.7
8 81.7 84.7
9 82.2 82.8
10 81.4 84.4
11 85.2
12 84.9
Last time we assumed that the populations from which the samples are drawn did
not have equal variances. Is this assumption valid at the 5% level of significance?
Your solution
Answer
We form the hypotheses
\[ H_0 : \sigma_1^2 = \sigma_2^2 \qquad H_1 : \sigma_1^2 \neq \sigma_2^2 \]
and perform a two-tailed test. The test statistic is
\[ F = \frac{S_1^2}{S_2^2} = \frac{6.47}{3.14} = 2.061 \]
which has an F-distribution with 9 degrees of freedom in the numerator and 7 degrees of freedom
in the denominator. We require two 2.5% tails. That is, we require right-tail $f_{0.025,9,7} = 4.82$ and
left-tail $f_{0.975,9,7}$ which may be calculated as
\[ f_{0.975,9,7} = \frac{1}{f_{0.025,7,9}} = \frac{1}{4.20} = 0.238 \]
Since 0.238 < 2.061 < 4.82 we conclude that we cannot reject the null hypothesis in favour of the
alternative at the 5% level of significance. The evidence does not support the conclusion that the
populations have unequal variances.
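The two-tailed decision can be sketched as a small helper. The function name and argument names are hypothetical. The two critical values passed in are the Table 4 entries used in the answer: $f_{0.025,9,7} = 4.82$ for the right tail and $f_{0.025,7,9} = 4.20$, whose reciprocal gives the left tail.

```python
def f_test_two_tailed(var1, var2, right_crit, right_crit_swapped):
    """Two-tailed F-test for equal variances using right-tail table values only.

    right_crit         : f_{alpha/2, n1-1, n2-1}
    right_crit_swapped : f_{alpha/2, n2-1, n1-1}, whose reciprocal is the left-tail value
    """
    f = var1 / var2
    left_crit = 1.0 / right_crit_swapped
    reject = f < left_crit or f > right_crit
    return f, left_crit, reject

f_value, left_crit, reject_h0 = f_test_two_tailed(6.47, 3.14, 4.82, 4.20)
print(round(f_value, 3), round(left_crit, 3), reject_h0)
```

Here the statistic lies between the two critical values, so the null hypothesis of equal variances is not rejected.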
Table 1: The Normal Probability Integral

The area is denoted by A and is measured from the mean z = 0 to any ordinate z = z1, where Z = (X − μ)/σ.
0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 .0000 0040 0080 0120 0159 0199 0239 0279 0319 0359
0.1 .0398 0438 0478 0517 0557 0596 0636 0675 0714 0753
0.2 .0793 0832 0871 0910 0948 0987 1026 1064 1103 1141
0.3 .1179 1217 1255 1293 1331 1368 1406 1443 1480 1517
0.4 .1554 1591 1628 1664 1700 1736 1772 1808 1844 1879
0.5 .1915 1950 1985 2019 2054 2088 2123 2157 2190 2224
0.6 .2257 2291 2324 2357 2389 2422 2454 2486 2518 2549
0.7 .2580 2611 2642 2673 2704 2734 2764 2794 2823 2852
0.8 .2881 2910 2939 2967 2995 3023 3051 3078 3106 3133
0.9 .3159 3186 3212 3238 3264 3289 3315 3340 3365 3389
1.0 .3413 3438 3461 3485 3508 3531 3554 3577 3599 3621
1.1 .3643 3665 3686 3708 3729 3749 3770 3790 3810 3830
1.2 .3849 3869 3888 3907 3925 3944 3962 3980 3997 4015
1.3 .4032 4049 4066 4082 4099 4115 4131 4147 4162 4177
1.4 .4192 4207 4222 4236 4251 4265 4279 4292 4306 4319
1.5 .4332 4345 4357 4370 4382 4394 4406 4418 4430 4441
1.6 .4452 4463 4474 4485 4495 4505 4515 4525 4535 4545
1.7 .4554 4564 4573 4582 4591 4599 4608 4616 4625 4633
1.8 .4641 4649 4656 4664 4671 4678 4686 4693 4699 4706
1.9 .4713 4719 4726 4732 4738 4744 4750 4756 4762 4767
2.0 .4772 4778 4783 4788 4793 4798 4803 4808 4812 4817
2.1 .4821 4826 4830 4835 4838 4842 4846 4850 4854 4857
2.2 .4861 4865 4868 4871 4875 4878 4881 4884 4887 4890
2.3 .4893 4896 4898 4901 4904 4906 4909 4911 4913 4916
2.4 .4918 4920 4922 4925 4927 4929 4931 4932 4934 4936
2.5 .4938 4940 4941 4943 4945 4946 4948 4949 4951 4952
2.6 .4953 4955 4956 4957 4959 4960 4961 4962 4963 4964
2.7 .4965 4966 4967 4968 4969 4970 4971 4972 4973 4974
2.8 .4974 4975 4976 4977 4977 4978 4979 4980 4980 4981
2.9 .4981 4982 4983 4983 4984 4984 4985 4985 4986 4986
3.0 .4986 4987 4987 4988 4988 4989 4989 4989 4990 4990
3.1 .4990 4991 4991 4991 4992 4992 4992 4992 4993 4993
3.2 .4993 4994 4994 4994 4994 4994 4994 4995 4995 4995
3.3 .4995 4995 4995 4996 4996 4996 4996 4996 4996 4997
3.4 .4997 4997 4997 4997 4997 4997 4997 4997 4997 4998
3.5 .4998 4998 4998 4998 4998 4998 4998 4998 4998 4998
3.6 .4998 4998 4999 4999 4999 4999 4999 4999 4999 4999
3.7 .4999 4999 4999 4999 4999 4999 4999 4999 4999 4999
3.8 .4999 4999 4999 4999 4999 4999 4999 4999 4999 4999
3.9 .4999 4999 4999 4999 4999 4999 4999 4999 4999 4999
Note that some text books give the final line entries as 0.5 rather than 0.4999.
In these workbooks we shall use 0.4999.
Table 2: Percentage Points of the Student's t-distribution
tα,ν
α .40 .25 .10 .05 .025 .01 .005 .0025 .001 .0005
ν
1 .325 1.000 3.078 6.314 12.706 31.821 63.657 127.32 318.31 636.62
2 .289 .816 1.886 2.920 4.303 6.965 9.925 14.089 22.326 31.598
3 .277 .765 1.638 2.353 3.182 4.541 5.841 7.453 10.213 12.924
4 .271 .741 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 .267 .727 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 .265 .718 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 .263 .711 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 .262 .706 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 .261 .703 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 .260 .700 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 .260 .697 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 .259 .695 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 .259 .694 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 .258 .692 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 .258 .691 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 .258 .690 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 .257 .689 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 .257 .688 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 .257 .688 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 .257 .687 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 .257 .686 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 .256 .686 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 .256 .685 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.767
24 .256 .685 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 .256 .684 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 .256 .684 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 .256 .684 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 .256 .683 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 .256 .683 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 .256 .683 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 .255 .681 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
60 .254 .679 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
120 .254 .677 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
∞ .253 .674 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291
Table 3: Percentage Points of the F -Distribution (5% tail)
5%
f0.05,u ,ν
ν 1 2 3 4 5 6 7 8 9 10 20 30 40 60 ∞
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 248.0 250.1 251.1 252.2 254.3
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.45 19.46 19.47 19.48 19.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.66 8.62 8.59 8.55 8.53
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.80 5.75 5.72 5.69 5.63
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.56 4.53 4.46 4.43 4.36
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 3.87 3.81 3.77 3.74 3.67
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.44 3.38 3.34 3.30 3.23
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.15 3.08 3.04 3.01 2.93
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 2.94 2.86 2.83 2.79 2.71
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.77 2.70 2.66 2.62 2.54
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.65 2.57 2.53 2.49 2.40
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.54 2.47 2.43 2.38 2.30
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.46 2.38 2.34 2.30 2.21
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.39 2.31 2.27 2.22 2.13
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.33 2.25 2.20 2.16 2.07
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.28 2.19 2.15 2.11 2.01
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 2.23 2.15 2.10 2.06 1.96
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.19 2.11 2.06 2.02 1.92
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 2.16 2.07 2.03 1.98 1.88
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.12 2.04 1.99 1.95 1.84
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.10 2.01 1.96 1.92 1.81
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 2.07 1.98 1.94 1.89 1.78
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 2.05 1.96 1.91 1.86 1.76
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.03 1.94 1.89 1.84 1.73
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 2.01 1.92 1.87 1.82 1.71
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 1.99 1.90 1.85 1.80 1.69
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20 1.97 1.88 1.84 1.79 1.67
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 1.96 1.87 1.82 1.77 1.65
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18 1.94 1.85 1.81 1.75 1.64
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 1.93 1.84 1.79 1.74 1.62
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 1.84 1.74 1.69 1.64 1.51
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 1.75 1.65 1.59 1.53 1.39
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 1.57 1.46 1.39 1.32 1.00
Table 4: Percentage Points of the F -Distribution (2.5% tail)
2.5%
f0.025,u,ν

ν 1 2 3 4 5 6 7 8 9 10 20 30 40 60 ∞
1 647.8 799.5 864.2 899.6 921.8 937.1 948.2 956.7 963.3 968.6 993.1 1001 1006 1010 1018
2 38.51 39.00 39.17 39.25 39.30 39.33 39.36 39.37 39.39 39.40 39.45 39.46 39.47 39.48 39.50
3 17.44 16.04 15.44 15.10 14.88 14.73 14.62 14.54 14.47 14.42 14.17 14.08 14.04 13.99 13.90
4 12.22 10.65 9.98 9.60 9.36 9.20 9.07 8.98 8.90 8.84 8.56 8.46 8.41 8.36 8.26
5 10.01 8.43 7.76 7.39 7.15 6.98 6.85 6.76 6.68 6.62 6.33 6.23 6.18 6.12 6.02
6 8.81 7.26 6.60 6.23 5.99 5.82 5.70 5.60 5.52 5.46 5.17 5.07 5.01 4.96 4.85
7 8.07 6.54 5.89 5.52 5.29 5.12 4.99 4.90 4.82 4.75 4.47 4.36 4.31 4.25 4.14
8 7.57 6.06 5.42 5.05 4.82 4.65 4.53 4.43 4.36 4.30 4.00 3.89 3.84 3.78 3.67
9 7.21 5.71 5.08 4.72 4.48 4.32 4.20 4.10 4.03 3.96 3.67 3.56 3.51 3.45 3.33
10 6.94 5.46 4.83 4.47 4.24 4.07 3.95 3.85 3.78 3.72 3.42 3.31 3.26 3.20 3.08
11 6.72 5.26 4.63 4.28 4.04 3.88 3.76 3.66 3.59 3.53 3.23 3.12 3.06 3.00 2.88
12 6.55 5.10 4.47 4.12 3.89 3.73 3.61 3.51 3.44 3.37 3.07 2.96 2.91 2.85 2.72
13 6.41 4.97 4.35 4.00 3.77 3.60 3.48 3.39 3.31 3.25 2.95 2.84 2.78 2.72 2.60
14 6.30 4.86 4.24 3.89 3.66 3.50 3.38 3.29 3.21 3.15 2.84 2.73 2.67 2.61 2.49
15 6.20 4.77 4.15 3.80 3.58 3.41 3.29 3.20 3.12 3.06 2.76 2.64 2.59 2.52 2.40
16 6.12 4.69 4.08 3.73 3.50 3.34 3.22 3.12 3.05 2.99 2.68 2.57 2.51 2.45 2.32
17 6.04 4.62 4.01 3.66 3.44 3.28 3.16 3.06 2.98 2.92 2.62 2.50 2.44 2.38 2.25
18 5.98 4.56 3.95 3.61 3.38 3.22 3.10 3.01 2.93 2.87 2.56 2.44 2.38 2.32 2.19
19 5.92 4.51 3.90 3.56 3.33 3.17 3.05 2.96 2.88 2.82 2.51 2.39 2.33 2.27 2.13
20 5.87 4.46 3.86 3.51 3.29 3.13 3.01 2.91 2.84 2.77 2.46 2.35 2.29 2.22 2.09
21 5.83 4.42 3.82 3.48 3.25 3.09 2.97 2.87 2.80 2.73 2.42 2.31 2.25 2.18 2.04
22 5.79 4.38 3.78 3.44 3.22 3.05 2.93 2.84 2.76 2.70 2.39 2.27 2.21 2.14 2.00
23 5.75 4.35 3.75 3.41 3.18 3.02 2.90 2.81 2.73 2.67 2.36 2.24 2.18 2.11 1.97
24 5.72 4.32 3.72 3.38 3.15 2.99 2.87 2.78 2.70 2.64 2.33 2.21 2.15 2.08 1.94
25 5.69 4.29 3.69 3.35 3.13 2.97 2.85 2.75 2.68 2.61 2.30 2.18 2.12 2.05 1.91
26 5.66 4.27 3.67 3.33 3.10 2.94 2.82 2.73 2.65 2.59 2.28 2.16 2.09 2.03 1.88
27 5.63 4.24 3.65 3.31 3.08 2.92 2.80 2.71 2.63 2.57 2.25 2.13 2.07 2.00 1.85
28 5.61 4.22 3.63 3.29 3.06 2.90 2.78 2.69 2.61 2.55 2.23 2.11 2.05 1.98 1.83
29 5.59 4.20 3.61 3.27 3.04 2.88 2.76 2.67 2.59 2.53 2.21 2.09 2.03 1.96 1.81
30 5.57 4.18 3.59 3.25 3.03 2.87 2.75 2.65 2.57 2.51 2.20 2.07 2.01 1.94 1.79
40 5.42 4.05 3.46 3.13 2.90 2.74 2.62 2.53 2.45 2.39 2.07 1.94 1.88 1.80 1.64
60 5.29 3.93 3.34 3.01 2.79 2.63 2.51 2.41 2.33 2.27 1.94 1.82 1.74 1.67 1.48
∞ 5.02 3.69 3.12 2.79 2.57 2.41 2.29 2.19 2.11 2.05 1.71 1.57 1.48 1.39 1.00
Contents 42
Goodness of Fit and Contingency Tables
42.1 Goodness of Fit
42.2 Contingency Tables
Learning outcomes
You will learn how to decide whether a set of data fits a particular distribution. You will
also learn about a situation in which hypothesis tests are applied to non-numeric data.

Goodness of Fit 42.1
Introduction
If you are applying statistics to practical problems in industry, you may find that much of your work is
concerned with making decisions concerning probability distributions. Sometimes it is advantageous
to be able to describe the approximate probability distribution followed by a data set obtained
experimentally. For example you may be asked to decide whether a data set is approximately normal.
In order to make such decisions, you will find that you may use the chi-squared test provided that
certain conditions are satisfied. On other occasions you may be given data concerning non-numeric
variables in the form of a contingency table. This is one of those occasions when hypothesis tests
can be applied to non-numeric variables.
Prerequisites
Before starting this Section you should . . .
• understand how to find probabilities for a chi-squared distribution (40)
• understand the principles of hypothesis testing (41)

Learning Outcomes
On completion you should be able to . . .
• explain the term goodness-of-fit
• perform hypothesis tests based on the chi-squared distribution
2 HELM (2006):
Workbook 42: Goodness of Fit and Contingency Tables
®
1. Goodness-of-fit tests
The aim of a goodness-of-fit test is to determine the underlying nature of the probability distribution
describing the population from which a random sample has been drawn. For example, we may wish
to determine whether the population from which a sample has been drawn has a normal, binomial or
Poisson distribution. While a variety of goodness-of-fit tests exist, the test described here depends
on the χ2 -distribution and is usually called the chi-squared test.
We assume that a random sample of size n has been drawn from a population with an unknown
probability distribution and that we wish to determine the nature of that distribution.
• Firstly, if the data are continuous we organize the data into k intervals (often equal but not
necessarily so) in order that we can write down the observed frequency, say Oi , of the ith
interval for 1 ≤ i ≤ k.
• Secondly, we form a hypothesis about the nature of the unknown distribution. That is, we
assume that it is normal, binomial, Poisson or some other appropriate probability distribution.
• Thirdly, we calculate, on the basis of the hypothesis outlined above, the expected frequency,
say Ei , of the ith interval for 1 ≤ i ≤ k. The values of Ei are calculated using the formula
Ei = nPi
where Pi is the probability associated with the interval i.

• Fourthly, we calculate the goodness-of-fit statistic as defined in Key Point 1.
Key Point 1
The goodness-of-fit statistic is given by
\[ W = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \]
It can be shown that, if the assumption made about the nature of the population (normal, binomial,
Poisson etc.) is true then W follows (approximately) a chi-squared distribution with k −p−1 degrees
of freedom. Note that p represents the number of parameters needed to describe the probability
distribution of the population which we have to estimate from the data. For example the normal
distribution has two parameters µ and σ, the binomial distribution has two parameters n and p but
we usually only need to estimate p, while the Poisson distribution has one parameter, µ.
• Fifthly, we reject the hypothesis concerning the nature of the underlying probability distribution
if the calculated value of W exceeds the value of $\chi^2_{\alpha,\,k-p-1}$ where α is the area in the tail of
the $\chi^2$-distribution, typically 5% or 1%.
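The fourth and fifth steps above can be sketched in code. The function names are hypothetical, and the observed and expected frequencies in the usage lines are made-up illustrative numbers, not data from this Workbook. The critical value must still be read from chi-squared tables for the chosen α and the computed degrees of freedom.

```python
def goodness_of_fit_statistic(observed, expected):
    """Return W = sum over classes of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def degrees_of_freedom(k, p):
    """k classes, p parameters estimated from the data: k - p - 1."""
    return k - p - 1

# Illustrative (hypothetical) frequencies for three classes
w_demo = goodness_of_fit_statistic([18, 22, 20], [20, 20, 20])
dof_demo = degrees_of_freedom(k=3, p=0)
print(w_demo, dof_demo)
```

The hypothesis is rejected when the computed W exceeds the tabulated value for these degrees of freedom.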
Notes
(a) The larger the sample, the more reliable the result since the assertion that W follows
(approximately) a chi-squared distribution improves with increasing sample size.
(b) The size of the expected frequencies should be monitored carefully. Various authors
recommend minimum expected frequencies of 3, 4 or 5. It is reasonably
safe to accept expected frequencies provided that they are greater than 5, and expected
frequencies of 10 or more are certainly acceptable.
(c) Some authors recommend that the k intervals into which the data are organized are
chosen so that the frequencies in each interval are roughly equal - remember that equal
intervals are not necessary for the test to be performed.
We will now look at two examples of goodness-of-fit tests. The first uses a (discrete) Poisson
distribution and the second uses a (continuous) normal distribution. Each worked Example is immediately
followed by a Task for you to do.
Example 1
A manufacturer produces high-quality sheet aluminium for use in highly stressed
aircraft wings. A random sample of 100 sheets is inspected and the number of
faults per sheet recorded. The results are given in the table below.
Number of Faults per Sheet Frequency of Occurrence
0 50
1 24
2 14
3 8
4 4
Suggest a possible probability distribution from which the sample may have been
drawn and perform a chi-squared test to determine the validity of your suggestion.
Solution
The data are already given in 5 classes with observed frequencies as shown. We will assume that
the underlying distribution is Poisson and calculate the expected frequencies accordingly using the
Poisson formula
\[ P(X = r) = \frac{e^{-\mu}\mu^r}{r!} \]
We need the value of the mean. This is calculated as
\[ \mu = \frac{50 \times 0 + 24 \times 1 + 14 \times 2 + 8 \times 3 + 4 \times 4}{100} = 0.92 \]
Hence the Poisson probabilities and the corresponding expected frequencies are:
\[ p_0 = P(X = 0) = \frac{e^{-\mu}\mu^0}{0!} = e^{-0.92} = 0.399, \qquad E_0 = 39.9 \]
\[ p_1 = P(X = 1) = \frac{e^{-\mu}\mu^1}{1!} = e^{-0.92} \times 0.92 = 0.367, \qquad E_1 = 36.7 \]
\[ p_2 = P(X = 2) = \frac{e^{-\mu}\mu^2}{2!} = \frac{e^{-0.92} \times 0.92^2}{2} = 0.169, \qquad E_2 = 16.9 \]
\[ p_3 = P(X = 3) = \frac{e^{-\mu}\mu^3}{3!} = \frac{e^{-0.92} \times 0.92^3}{6} = 0.052, \qquad E_3 = 5.2 \]
\[ p_4 = P(X \geq 4) = 1 - (0.399 + 0.367 + 0.169 + 0.052) = 0.013, \qquad E_4 = 1.3 \]
Note that in calculating $p_4$ we have ensured that our probabilities sum to unity.
Since the last frequency is very small we will combine the last two and use 4 classes so that $O_3 = 12$
and $E_3 = 6.5$.
\[ W = \sum_{i=0}^{3} \frac{(O_i - E_i)^2}{E_i} = \frac{(50 - 39.9)^2}{39.9} + \frac{(24 - 36.7)^2}{36.7} + \frac{(14 - 16.9)^2}{16.9} + \frac{(12 - 6.5)^2}{6.5} = 12.103 \]
and the number of degrees of freedom is k − p − 1 = 4 − 1 − 1 = 2 so that the critical value from
Table 1 (at the end of the Workbook) is $\chi^2_{0.05,2} = 5.99$. Clearly 12.103 > 5.99 and we must reject
the null hypothesis that the underlying distribution is Poisson.
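The arithmetic of this Example can be reproduced with a short script. This is a sketch under the same choices as the solution: probabilities are rounded to three decimal places, the final class is taken as "4 or more", and the last two classes are merged. The variable names are illustrative only.

```python
import math

observed = [50, 24, 14, 8, 4]                 # sheets with 0, 1, 2, 3, 4 faults
n = sum(observed)
mu = sum(r * o for r, o in enumerate(observed)) / n   # sample mean

# Poisson probabilities for r = 0..3, rounded to 3 d.p. as in the solution
probs = [round(math.exp(-mu) * mu ** r / math.factorial(r), 3) for r in range(4)]
probs.append(round(1 - sum(probs), 3))        # P(X >= 4), so the total is 1
expected = [n * p for p in probs]

# Merge the last two classes because E_4 is very small
obs_merged = observed[:3] + [observed[3] + observed[4]]
exp_merged = expected[:3] + [expected[3] + expected[4]]

w = sum((o - e) ** 2 / e for o, e in zip(obs_merged, exp_merged))
dof = len(obs_merged) - 1 - 1                 # k - p - 1 with one estimated parameter
print(mu, probs, round(w, 3), dof)
```

Working with unrounded probabilities would change W slightly, so the rounding step is kept to match the hand calculation.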
Task
A manufacturer produces electronic components for use in computer controlled
monitoring systems. A random sample of 100 components is inspected and the
number of faults per component recorded. The results are given in the table below.
Number of Faults per Component Frequency of Occurrence
0 45
1 35
2 16
3 4
Perform a chi-squared test to determine the validity of the assumption that the
occurrence of faults in the components is Poisson.
Your solution
Answer
The data are given in 4 classes with observed frequencies as shown. The expected frequencies using
the Poisson formula with a mean
\[ \mu = \frac{45 \times 0 + 35 \times 1 + 16 \times 2 + 4 \times 3}{100} = 0.79 \]
are
\[ p_0 = P(X = 0) = \frac{e^{-\mu}\mu^0}{0!} = e^{-0.79} = 0.454, \qquad E_0 = 45.4 \]
\[ p_1 = P(X = 1) = \frac{e^{-\mu}\mu^1}{1!} = e^{-0.79} \times 0.79 = 0.359, \qquad E_1 = 35.9 \]
\[ p_2 = P(X = 2) = \frac{e^{-\mu}\mu^2}{2!} = \frac{e^{-0.79} \times 0.79^2}{2} = 0.142, \qquad E_2 = 14.2 \]
\[ p_3 = P(X \geq 3) = 1 - (0.454 + 0.359 + 0.142) = 0.045, \qquad E_3 = 4.5 \]
The last frequency is small but since it is greater than 3 we will allow its use.
\[ W = \sum_{i=0}^{3} \frac{(O_i - E_i)^2}{E_i} = \frac{(45 - 45.4)^2}{45.4} + \frac{(35 - 35.9)^2}{35.9} + \frac{(16 - 14.2)^2}{14.2} + \frac{(4 - 4.5)^2}{4.5} = 0.310 \]
The number of degrees of freedom is k − p − 1 = 4 − 1 − 1 = 2 so that the critical value from
tables is $\chi^2_{0.05,2} = 5.99$. Clearly 0.310 < 5.99 and we accept the null hypothesis that the underlying
distribution is Poisson. Note that the decision to accept the value $E_3 = 4.5$ is fairly marginal and
that some personal judgement in such situations as to whether such values should be accepted or
combined with another class is unavoidable.
Task
Using the data of the previous Task but combining the expected frequencies of
the last two classes, perform a chi-squared test to determine the validity of the
assumption that the occurrence of faults in the components is Poisson.
Your solution
Answer
The data are given in 4 classes with observed frequencies as shown. The expected frequencies using
the Poisson formula with a mean
\[ \mu = \frac{45 \times 0 + 35 \times 1 + 16 \times 2 + 4 \times 3}{100} = 0.79 \]
are
\[ p_0 = P(X = 0) = \frac{e^{-\mu}\mu^0}{0!} = e^{-0.79} = 0.454, \qquad E_0 = 45.4 \]
\[ p_1 = P(X = 1) = \frac{e^{-\mu}\mu^1}{1!} = e^{-0.79} \times 0.79 = 0.359, \qquad E_1 = 35.9 \]
\[ p_2 = P(X = 2) = \frac{e^{-\mu}\mu^2}{2!} = \frac{e^{-0.79} \times 0.79^2}{2} = 0.142, \qquad E_2 = 14.2 \]
\[ p_3 = P(X \geq 3) = 1 - (0.454 + 0.359 + 0.142) = 0.045, \qquad E_3 = 4.5 \]
We will combine the observed and expected frequencies of the last two classes and use 3 classes in total with
observed frequencies of $O_0 = 45$, $O_1 = 35$, $O_2 = 20$ and expected frequencies of $E_0 = 45.4$, $E_1 = 35.9$, $E_2 = 18.7$.
\[ W = \sum_{i=0}^{2} \frac{(O_i - E_i)^2}{E_i} = \frac{(45 - 45.4)^2}{45.4} + \frac{(35 - 35.9)^2}{35.9} + \frac{(20 - 18.7)^2}{18.7} = 0.116 \]
With 3 classes the number of degrees of freedom is k − p − 1 = 3 − 1 − 1 = 1 so that the critical value from
Table 1 (at the end of the Workbook) is $\chi^2_{0.05,1} = 3.84$. Clearly 0.116 < 3.84 and we accept the
null hypothesis that the underlying distribution is Poisson. Note that the decision to combine the
last two classes has not, in this case, affected the acceptance of the null hypothesis.
Example 2
A quality control engineer is given the job of checking the voltage output char-
acteristics of a circuit component in a CD player. After checking 100 randomly
selected components and plotting a histogram of the results, the engineer con-
cludes that the mean output of the 100 checked components is x̄ = 6.12 volts,
that the standard deviation is s = 0.1 volts and that the voltage distribution is
probably normal. Choose a suitable test to decide whether the assumption of
normality is valid at the 5% level of significance.
Solution
The engineer decides to use a chi-squared test to test the assumption of normality and follows the
(common) practice of ensuring that the expected frequencies are equal. To do this, the data are
put into eight classes of equal probability and the class boundaries calculated as follows.
From the standard normal distribution the Z values corresponding to class boundaries giving each class a
probability of 0.125 (i.e. 1/8) may be read off from tables as 0, 0.32, 0.675, 1.15 and ∞ for positive
values and 0, −0.32, −0.675, −1.15 and −∞ for negative values. Using
\[ Z = \frac{x - \bar{x}}{s} \quad \Rightarrow \quad x = \bar{x} + Z s \]
the class boundaries are calculated to be: 6.005, 6.053, 6.088, 6.120, 6.152, 6.188, 6.235. This gives
the eight classes, the observed frequencies found by the engineer (you are given this information
here), and the expected frequencies as:
Classes Observed Frequencies Oi Expected Frequencies Ei
x < 6.005 8 12.5
6.005 ≤ x < 6.053 11 12.5
6.053 ≤ x < 6.088 16 12.5
6.088 ≤ x < 6.120 19 12.5
6.120 ≤ x < 6.152 18 12.5
6.152 ≤ x < 6.188 13 12.5
6.188 ≤ x < 6.235 9 12.5
6.235 ≤ x 6 12.5
The hypotheses are: H0 : distribution is normal, H1 : distribution is not normal
\[ W = \sum_{i=1}^{8} \frac{(O_i - E_i)^2}{E_i} \]
\[ = \frac{(8 - 12.5)^2}{12.5} + \frac{(11 - 12.5)^2}{12.5} + \frac{(16 - 12.5)^2}{12.5} + \frac{(19 - 12.5)^2}{12.5} + \frac{(18 - 12.5)^2}{12.5} + \frac{(13 - 12.5)^2}{12.5} + \frac{(9 - 12.5)^2}{12.5} + \frac{(6 - 12.5)^2}{12.5} \]
\[ = 1.62 + 0.18 + 0.98 + 3.38 + 2.42 + 0.02 + 0.98 + 3.38 = 12.96 \]
The number of degrees of freedom is k − p − 1 = 8 − 2 − 1 = 5 and the critical value from
Table 1 is $\chi^2_{0.05,5} = 11.07$.
Since 11.07 < 12.96 we have sufficient evidence to reject the null hypothesis and so the engineer
should conclude that the distribution of voltages is not normal.
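The class boundaries and the statistic W for this Example can be checked with a short script. This is a sketch that assumes Python's standard `statistics.NormalDist` in place of printed tables, so the Z values are slightly more precise than the tabulated 0.32, 0.675 and 1.15 and the boundaries may differ from those above in the third decimal place. The function name is hypothetical.

```python
from statistics import NormalDist

def equal_probability_boundaries(mean, sd, k):
    """Return the k - 1 boundaries that split N(mean, sd^2) into k equally likely classes."""
    std_normal = NormalDist()                      # standard normal, mean 0 and sd 1
    return [mean + std_normal.inv_cdf(i / k) * sd for i in range(1, k)]

bounds = equal_probability_boundaries(6.12, 0.1, 8)

observed = [8, 11, 16, 19, 18, 13, 9, 6]           # frequencies given in the Example
expected = [100 / 8] * 8                           # 12.5 in every class
w = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = 8 - 2 - 1                                    # two parameters estimated from the data

print([round(b, 3) for b in bounds], round(w, 2), dof)
```

Choosing classes of equal probability makes every expected frequency the same, which keeps all of them comfortably above the minimum values discussed in the Notes.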
Task
An electrical engineer working for a Health and Safety Executive measures the
radiation emitted through the closed doors of 100 used microwave ovens. The
measurements, in mW cm$^{-2}$, are given in the table below.
0.19 0.16 0.14 0.20 0.17 0.21 0.18 0.22 0.26 0.23
0.13 0.17 0.16 0.21 0.18 0.22 0.20 0.23 0.16 0.26
0.19 0.16 0.14 0.20 0.18 0.21 0.19 0.22 0.27 0.24
0.12 0.17 0.15 0.20 0.18 0.22 0.19 0.23 0.29 0.25
0.06 0.16 0.14 0.20 0.17 0.21 0.18 0.22 0.26 0.23
0.13 0.17 0.16 0.20 0.18 0.22 0.19 0.23 0.30 0.25
0.19 0.17 0.14 0.20 0.18 0.21 0.19 0.22 0.27 0.24
0.11 0.17 0.15 0.20 0.18 0.21 0.19 0.23 0.27 0.24
0.13 0.17 0.16 0.21 0.18 0.22 0.19 0.23 0.33 0.25
0.13 0.17 0.16 0.21 0.18 0.22 0.19 0.23 0.36 0.26
The mean radiation of the checked ovens is $\bar{x} = 0.20$ mW cm$^{-2}$, and the standard
deviation is $s = 0.05$ mW cm$^{-2}$. Verify that the table below giving the eight classes
corresponding to the observed and expected frequencies shown is correct.
Classes Observed Frequencies Oi Expected Frequencies Ei
x < 0.143 11 12.5
0.143 ≤ x < 0.166 10 12.5
0.166 ≤ x < 0.184 19 12.5
0.184 ≤ x < 0.200 10 12.5
0.200 ≤ x < 0.216 16 12.5
0.216 ≤ x < 0.234 17 12.5
0.234 ≤ x < 0.258 6 12.5
0.258 ≤ x 11 12.5
Use a chi-squared test to decide whether the radiation readings obtained from the
ovens are normally distributed at the 5% level of significance.
Your solution
Answer
Although the choice of class boundaries is arbitrary, for convenience we choose boundaries to make
eight classes with equal probabilities of 0.125.
From the standard normal distribution the Z values corresponding to class boundaries giving a
probability of 0.125 may be read off from tables as 0, 0.32, 0.675, 1.15 and ∞ for positive values
and 0, −0.32, −0.675, −1.15 and −∞ for negative values. Using
\[ Z = \frac{x - \bar{x}}{s} \quad \Rightarrow \quad x = \bar{x} + Z s \]
the class boundaries are calculated to be:
0.143, 0.166, 0.184, 0.200, 0.216, 0.234, 0.258
This gives the eight classes, the observed frequencies found by the engineer and the expected
frequencies as given in the table above.
The hypotheses are: H0 : distribution is normal, H1 : distribution is not normal.
\[ W = \sum_{i=1}^{8} \frac{(O_i - E_i)^2}{E_i} \]
\[ = \frac{(11 - 12.5)^2}{12.5} + \frac{(10 - 12.5)^2}{12.5} + \frac{(19 - 12.5)^2}{12.5} + \frac{(10 - 12.5)^2}{12.5} + \frac{(16 - 12.5)^2}{12.5} + \frac{(17 - 12.5)^2}{12.5} + \frac{(6 - 12.5)^2}{12.5} + \frac{(11 - 12.5)^2}{12.5} \]
\[ = 0.18 + 0.5 + 3.38 + 0.5 + 0.98 + 1.62 + 3.38 + 0.18 = 10.72 \]
The number of degrees of freedom is k − p − 1 = 8 − 2 − 1 = 5 and the critical value from
Table 1 is $\chi^2_{0.05,5} = 11.07$.
Since 10.72 < 11.07 we do not have sufficient evidence to reject the null hypothesis and so the
engineer may conclude that the microwave radiation readings taken from the ovens
are consistent with a normal distribution.
Exercises
1. A factory produces portable CD players. Every week a sample of ten players is selected and
subjected to 100 hours of continuous use. At the end of this time the players are tested and the
number not reaching a specified standard is recorded. The numbers recorded in 100 consecutive
weeks are given below. Test the hypothesis that the data come from a binomial distribution.
Use the 5% level of significance.
Number failing standard    0    1    2    3    4    5
Number of weeks           34   24   19   14    9    0
2. A highway engineer records the numbers of vehicles passing a point in a road in 120 consecutive
one-minute intervals, as follows. Test the hypothesis that the data come from a Poisson
distribution. Use the 5% level of significance.
Number of vehicles 0 1 2 3 4 5 6 7 8 9 10 11
Number of intervals 0 5 10 20 30 20 15 7 6 4 2 1
3. In a test of a device to generate electricity from wave power at sea, 60 observations are made
of the root mean square bending moment Y of a component (in newton metres). The data
are summarised as follows. The sample mean is 5.08 and the sample variance is 3.29. Test the
hypothesis that Y has a normal distribution. Use the 5% level of significance.
Class Frequency Class Frequency

Y ≤2 1 6<Y ≤7 5
2<Y ≤3 4 7<Y ≤8 4
3<Y ≤4 12 8<Y ≤9 2
4<Y ≤5 18 9 < Y ≤ 10 2
5<Y ≤6 11 10 < Y 1
4. Eighty aircraft components are tested until they fail. The failure times T in hours are sum-
marised as follows. The sample mean is 6434. Test the hypothesis that the distribution of T
is exponential. Use the 5% level of significance.
Class Frequency Class Frequency

0 < T ≤ 2000 11 10000 < T ≤ 12000 3
2000 < T ≤ 4000 21 12000 < T ≤ 14000 5
4000 < T ≤ 6000 19 14000 < T ≤ 16000 1
6000 < T ≤ 8000 9 16000 < T ≤ 18000 3
8000 < T ≤ 10000 4 18000 < T 4
Answers
1. Total number of failures: 0 × 34 + 1 × 24 + · · · + 4 × 9 = 140.
Mean number of failures per week: 140/100 = 1.4.
Estimate of p : 1.4/5 = 0.28.
Use binomial(5, 0.28) distribution.

$$P(X = j) = \binom{5}{j}\, 0.28^{j}\, 0.72^{5-j}$$
No. failing Probability Frequency

Expected Observed
0 0.1935 19.35 34
1 0.3762 37.62 24
2 0.2926 29.26 19
3 0.1138 11.38 14
4 0.0221 2.21 9
5 0.0017 0.17 0
Some expected frequencies are too small so we combine neighbouring classes.
No. failing Probability Frequency

Expected Observed
0 0.1935 19.35 34
1 0.3762 37.62 24
2 0.2926 29.26 19
3,4,5 0.1376 13.76 23
Test statistic:
$$W = \frac{(34-19.35)^2}{19.35} + \cdots + \frac{(23-13.76)^2}{13.76} = 25.825.$$
Degrees of freedom: 4 − 1 − 1 = 2 (4 classes, 1 estimated parameter).

Critical value: $\chi^2_2(5\%) = 5.991$.
The test statistic is significant at the 5% level. We reject the null hypothesis. We conclude that
the data do not come from a binomial distribution. There seems to be an excess of large and small
counts.
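The steps of this answer (estimate p, compute binomial probabilities, pool the small classes, form W) can be sketched in Python as follows. This is only an illustration under the same assumptions as the answer, namely a binomial distribution with 5 trials; the variable names are invented here. Because the probabilities are not rounded, the result agrees with 25.825 only to about two decimal places.

```python
from math import comb

n_trials = 5
weeks = [34, 24, 19, 14, 9, 0]  # observed weeks with 0, 1, ..., 5 failures

# Estimate p from the mean number of failures per week, as in the answer.
total_weeks = sum(weeks)
mean_failures = sum(j * f for j, f in enumerate(weeks)) / total_weeks
p_hat = mean_failures / n_trials

# Binomial probabilities and expected frequencies for j = 0, ..., 5.
probs = [comb(n_trials, j) * p_hat**j * (1 - p_hat) ** (n_trials - j)
         for j in range(n_trials + 1)]
expected = [total_weeks * p for p in probs]

# Pool the classes 3, 4 and 5, whose expected frequencies are small.
observed_pooled = weeks[:3] + [sum(weeks[3:])]
expected_pooled = expected[:3] + [sum(expected[3:])]

w = sum((o - e) ** 2 / e for o, e in zip(observed_pooled, expected_pooled))
print(f"p_hat = {p_hat:.2f}, W = {w:.2f}")
```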
Answers
2. Total number of vehicles: 0 × 0 + 1 × 5 + 2 × 10 + · · · + 11 × 1 = 559.
Mean number of vehicles per minute: 559/120 = 4.658.
Use Poisson(4.658) distribution.
$$P(X = j) = \frac{e^{-4.658}\, 4.658^{j}}{j!}.$$
No. vehicles Probability Frequency

Expected Observed
0 0.00949 1.14 0
1 0.04418 5.30 5
2 0.10290 12.35 10
3 0.15977 19.17 20
4 0.18606 22.33 30
5 0.17333 20.80 20
6 0.13456 16.15 15
7 0.08954 10.74 7
8 0.05214 6.26 6
9 0.02698 3.24 4
10 0.01257 1.51 2
≥ 11 0.00848 1.02 1
No. vehicles Probability Frequency

Expected Observed
0,1 0.05367 6.44 5
2 0.10290 12.35 10
3 0.15977 19.17 20
4 0.18606 22.33 30
5 0.17333 20.80 20
6 0.13456 16.15 15
7 0.08954 10.74 7
8 0.05214 6.26 6
≥9 0.04803 5.76 7
Test statistic:
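$$W = \frac{(5-6.44)^2}{6.44} + \cdots + \frac{(7-5.76)^2}{5.76} = 5.132.$$
Degrees of freedom: 9 − 1 − 1 = 7 (9 classes, 1 estimated parameter).
Critical value: $\chi^2_7(5\%) = 14.07$.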

The test statistic is not significant at the 5% level. We do not reject the null hypothesis. There is
insufficient evidence to conclude that the data do not come from a Poisson distribution.
Answers
3. Using a N (5.08, 3.29) distribution we can calculate the probabilities for the various class intervals.
For example,

$$P(5 < Y \le 6) = \Phi\left(\frac{6 - 5.08}{\sqrt{3.29}}\right) - \Phi\left(\frac{5 - 5.08}{\sqrt{3.29}}\right)$$
$$= \Phi(0.5072) - \Phi(-0.0441)$$
$$= \Phi(0.5072) + \Phi(0.0441) - 1$$
$$= 0.694 + 0.518 - 1 = 0.694 - 0.482 = 0.212.$$
Bending moment Y Probability Frequency

Expected Observed
Y ≤2 0.045 2.70 1
2<Y ≤3 0.081 4.86 4
3<Y ≤4 0.150 9.00 12
4<Y ≤5 0.206 12.36 18
5<Y ≤6 0.212 12.72 11
6<Y ≤7 0.161 9.66 5
7<Y ≤8 0.091 5.46 4
8<Y ≤9 0.039 2.34 2
9 < Y ≤ 10 0.012 0.72 2
10 < Y 0.003 0.18 1
Bending moment Y Probability Frequency
Expected Observed
Y ≤3 0.126 7.56 5
3<Y ≤4 0.150 9.00 12
4<Y ≤5 0.206 12.36 18
5<Y ≤6 0.212 12.72 11
6<Y ≤7 0.161 9.66 5
7<Y ≤8 0.091 5.46 4
8<Y 0.054 3.24 5
Test statistic:
$$W = \frac{(5-7.56)^2}{7.56} + \cdots + \frac{(5-3.24)^2}{3.24} = 8.267.$$
Degrees of freedom: 7 − 2 − 1 = 4 (7 classes, 2 estimated parameters).
Critical value: $\chi^2_4(5\%) = 9.49$.

The test statistic is not significant at the 5% level. We do not reject the null hypothesis. There is
insufficient evidence to conclude that the data do not come from a normal distribution.
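The class probabilities in this answer are read from normal tables. As a cross-check, the example probability P(5 < Y ≤ 6) can be computed with the error function from Python's standard library. The helper `normal_cdf` is a name introduced here for illustration; small differences from the tabulated figures are to be expected because no intermediate rounding takes place.

```python
from math import erf, sqrt


def normal_cdf(x, mean, variance):
    """Cumulative distribution function of N(mean, variance)."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * variance)))


mean, variance, n = 5.08, 3.29, 60

# P(5 < Y <= 6) and the corresponding expected frequency.
p = normal_cdf(6, mean, variance) - normal_cdf(5, mean, variance)
print(f"P(5 < Y <= 6) = {p:.3f}, expected frequency = {n * p:.2f}")
```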
Answers
4. The sample mean is 6434. We estimate λ using $1/6434 = 1.554 \times 10^{-4}$. We use an
exponential($1.554 \times 10^{-4}$) distribution. For example
$$P(2000 < T \le 4000) = \{1 - \exp(-4000 \times 1.554 \times 10^{-4})\} - \{1 - \exp(-2000 \times 1.554 \times 10^{-4})\}$$
$$= \exp(-2000/6434) - \exp(-4000/6434)$$
$$= \exp(-0.3108) - \exp(-0.6217)$$
$$= 0.733 - 0.537 = 0.196$$
Failure time T Probability Frequency

Expected Observed
0 < T ≤ 2000 0.267 21.36 11
2000 < T ≤ 4000 0.196 15.68 21
4000 < T ≤ 6000 0.143 11.44 19
6000 < T ≤ 8000 0.106 8.48 9
8000 < T ≤ 10000 0.077 6.16 4
10000 < T ≤ 12000 0.056 4.48 3
12000 < T ≤ 14000 0.041 3.28 5
14000 < T ≤ 16000 0.031 2.48 1
16000 < T ≤ 18000 0.022 1.76 3
18000 < T 0.061 4.88 4
Failure time T Probability Frequency
Expected Observed
0 < T ≤ 2000 0.267 21.36 11
2000 < T ≤ 4000 0.196 15.68 21
4000 < T ≤ 6000 0.143 11.44 19
6000 < T ≤ 8000 0.106 8.48 9
8000 < T ≤ 10000 0.077 6.16 4
10000 < T ≤ 12000 0.056 4.48 3
12000 < T ≤ 14000 0.041 3.28 5
14000 < T ≤ 18000 0.053 4.24 4
18000 < T 0.061 4.88 4
Test statistic:
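$$W = \frac{(11-21.36)^2}{21.36} + \cdots + \frac{(4-4.88)^2}{4.88} = 14.18.$$
Degrees of freedom: 9 − 1 − 1 = 7 (9 classes, 1 estimated parameter).
Critical value: $\chi^2_7(5\%) = 14.07$.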
The test statistic is significant at the 5% level. We reject the null hypothesis. We conclude that the
data do not come from an exponential distribution. The observed frequency in the first class seems
to be too small.
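A hedged Python sketch of this calculation is given below. It uses the pooled classes of the second table and the estimate λ = 1/6434; the helper name `survival` is an assumption made for this sketch. Since the probabilities are not rounded to three decimal places, the statistic comes out close to, but not exactly equal to, the 14.18 quoted above. Taking 9 − 1 − 1 = 7 degrees of freedom (following the pattern of the earlier answers), Table 1 gives a 5% critical value of 14.07, so the statistic is only just significant.

```python
from math import exp, inf

sample_mean = 6434.0
rate = 1.0 / sample_mean  # estimate of lambda

# Class boundaries after pooling 14000-16000 with 16000-18000.
boundaries = [0, 2000, 4000, 6000, 8000, 10000, 12000, 14000, 18000, inf]
observed = [11, 21, 19, 9, 4, 3, 5, 4, 4]
n = sum(observed)


def survival(t):
    """P(T > t) for the fitted exponential distribution."""
    return exp(-rate * t)


# Expected frequency of a class (lo, hi] is n * (P(T > lo) - P(T > hi)).
expected = [n * (survival(lo) - survival(hi))
            for lo, hi in zip(boundaries, boundaries[1:])]
w = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"W = {w:.2f}")
```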

Contingency Tables 42.2
Introduction
The practical application of statistics to engineering problems met in industry often concerns making
decisions concerning probability distributions. For example you may be asked to decide whether a
data set is approximately normal since much of the statistics you may apply makes this assumption.
On occasions you may have to make such decisions given data concerning non-numeric variables in
the form of a contingency table. Contingency tables are described in detail in this Workbook. This
is one of the relatively rare occasions when hypothesis tests can be applied to non-numeric variables.
Prerequisites
Before starting this Section you should . . .
• understand thoroughly what is meant by the term degrees of freedom
• have knowledge of the chi-squared distribution described in 40

Learning Outcomes
On completion you should be able to . . .
• explain the term contingency table
• perform hypothesis tests involving data given as a contingency table

1. Contingency tables
On occasions, it is possible that the members of a sample taken from a population can be classified
by two different methods. Examples of this are:
(a) articles produced by three machines running during two shifts on a production line;
(b) the failure of electronic components and the position in which they are mounted in a
machine;
(c) the failure under compression testing of steel-alloy components and the rate of cooling
applied during their production.
We can represent the information obtained by observation in such situations in a contingency table.
By using the observed data to estimate expected data on the assumption that the classification
methods are independent, we can use the chi-squared test to investigate the statistical independence
(or otherwise) of the classification methods.
Consider the following contingency table with r rows and c columns. Such a table is referred to as
an r × c contingency table.
                  1     2     3    . . .   c     Row Totals
1                O11   O12   O13   . . .   O1c   R1
2                O21   O22   O23   . . .   O2c   R2
3                O31   O32   O33   . . .   O3c   R3
.                 .     .     .             .    .
r                Or1   Or2   Or3   . . .   Orc   Rr
Column Totals    C1    C2    C3    . . .   Cc    N
Note that N is the total of the row totals and is the same as the total of the column totals, that is,
N is the number of members of the sample taken from a population.
On the basis of the observed data we can estimate the expected frequency, say Eij corresponding to
the observed frequency Oij . This is done as follows.
The probability that a randomly chosen element of the sample appears in row class i and column
class j is given by pij where
$$p_{ij} = \frac{R_i}{N} \times \frac{C_j}{N}$$
Hence the required expected frequency is given by Eij which is defined in Key Point 2.
Key Point 2
Expected Frequencies in Contingency Tables
$$E_{ij} = N \times p_{ij} = N \times \frac{R_i}{N} \times \frac{C_j}{N} = \frac{R_i \times C_j}{N}$$
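As an illustration of Key Point 2, the following Python sketch builds the whole table of expected frequencies from a table of observed counts. The function name `expected_frequencies` and the 2 × 2 table of counts are hypothetical and are used only to show the calculation.

```python
def expected_frequencies(observed):
    """Return the table of E_ij = R_i * C_j / N for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


# Hypothetical 2 x 2 table, used only to exercise the function.
table = [[10, 20],
         [30, 40]]
print(expected_frequencies(table))
```

Here the row totals are 30 and 70, the column totals are 40 and 60, and N = 100, so for example E11 = 30 × 40/100 = 12.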
Using this formula repeatedly, we can calculate the expected frequencies corresponding to the ob-
served frequencies and hence calculate a test statistic W where
$$W = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
This formula tells you to calculate $\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ for every cell in the contingency table and sum them.
It can be shown that, provided N is large, and none of the expected frequencies are too small, say
less than 3, then the quantity
$$W = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
follows approximately a chi-squared distribution with (r − 1) × (c − 1) degrees of freedom when the
null hypothesis is true. This number of degrees of freedom arises since, once the row and column
totals are fixed, each row has c − 1 independent entries and each column has r − 1 independent entries.
Notes
The above statements are correct provided that we can calculate the expected frequencies without
knowing the population parameters. If we have to estimate the population parameters, the number
of degrees of freedom becomes (r − 1) × (c − 1) − m where m is the number of population parameters
estimated. In the examples given here we shall not need to estimate the population parameters.
To complete the test procedure we note that the null hypothesis assumes class independence. For
example, referring back to example (b) given at the start of this Section, the null hypothesis would
assume that the failure of electronic components and the position in which they are mounted in a
machine are independent.
Should the test statistic exceed the critical value of $\chi^2$ read from Table 1 at (say) the 5% level of
significance, we would reject the null hypothesis and conclude that a relationship of some kind exists
between the classes.
It is worth noting that in some cases (such as the following Example 3) one classification is chosen
deliberately but the other is random while in other cases, both classifications are random. The same
test applies in both cases.
Example 3
In an experiment to determine the most advantageous position in a machine to
mount an electronic component which may be prone to failure due to excessive
heat build-up, 300 machines are tested with 100 randomly chosen examples of the
component in each of 3 positions. The results obtained were as follows.
Position 1 2 3 Row Totals
Failure 40 30 50 120
Non-failure 60 70 50 180
Column Totals 100 100 100 300
Use a $\chi^2$-test at the 5% level of significance to determine whether component
failure is related to mounting position.
Solution
The hypotheses are:
H0 : component failure is independent of position,
H1 : component failure is not independent of position
The expected frequencies are calculated as follows:
$$E_{11} = \frac{120 \times 100}{300} = 40, \quad E_{12} = \frac{120 \times 100}{300} = 40, \quad E_{13} = \frac{120 \times 100}{300} = 40$$
$$E_{21} = \frac{180 \times 100}{300} = 60, \quad E_{22} = \frac{180 \times 100}{300} = 60, \quad E_{23} = \frac{180 \times 100}{300} = 60$$
The test statistic is
$$W = \sum_{i=1}^{2} \sum_{j=1}^{3} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
$$= \frac{(40-40)^2}{40} + \frac{(30-40)^2}{40} + \frac{(50-40)^2}{40} + \frac{(60-60)^2}{60} + \frac{(70-60)^2}{60} + \frac{(50-60)^2}{60}$$
$$= 0 + 2.5 + 2.5 + 0 + 1.67 + 1.67 = 8.34$$
and the number of degrees of freedom is (r − 1) × (c − 1) = (2 − 1) × (3 − 1) = 2 so that the
critical value from tables is $\chi^2_{0.05,2} = 5.99$.
Since 5.99 < 8.34 we reject the null hypothesis and so we should conclude that there is a relationship
between component failure and mounting position. Position 2 seems to be the most favourable and
position 3 the least.
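The calculation in Example 3 can be reproduced with a few lines of Python. This is a sketch rather than a general-purpose routine: the variable names are chosen here for illustration and the critical value 5.99 is copied from the solution. Without intermediate rounding the statistic is 8.33 rather than 8.34, which does not affect the conclusion.

```python
observed = [[40, 30, 50],   # failures in positions 1, 2, 3
            [60, 70, 50]]   # non-failures in positions 1, 2, 3

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

w = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * column_totals[j] / n  # E_ij = R_i * C_j / N
        w += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.99  # chi-squared, 5% level, 2 degrees of freedom (Table 1)
print(f"W = {w:.2f}, degrees of freedom = {dof}")
print("reject H0" if w > critical_value else "do not reject H0")
```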
Task
Washing machines are made on three production lines in a factory. A record is kept
of faults reported, during the guarantee period, in machines produced by each of
the three lines. The faults are classified into three types A, B and C. The results
are given in the table below.
Fault type
Production line A B C Row Totals
1 40 28 34 102
2 27 39 32 98
3 45 26 29 100
Column Totals 112 93 95 300
Use a $\chi^2$-test at the 5% level of significance to determine whether fault type is
related to the production line on which the machine was produced.
Your solution
Answer
The hypotheses are:
H0 : fault type is independent of production line,
H1 : fault type is not independent of production line
The expected frequencies are calculated as follows:
$$E_{11} = \frac{102 \times 112}{300} = 38.08, \quad E_{12} = \frac{102 \times 93}{300} = 31.62, \quad E_{13} = \frac{102 \times 95}{300} = 32.30$$
$$E_{21} = \frac{98 \times 112}{300} = 36.59, \quad E_{22} = \frac{98 \times 93}{300} = 30.38, \quad E_{23} = \frac{98 \times 95}{300} = 31.03$$
$$E_{31} = \frac{100 \times 112}{300} = 37.33, \quad E_{32} = \frac{100 \times 93}{300} = 31.00, \quad E_{33} = \frac{100 \times 95}{300} = 31.67$$
The test statistic is
$$W = \sum_{i=1}^{3} \sum_{j=1}^{3} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
$$= \frac{(40-38.08)^2}{38.08} + \frac{(28-31.62)^2}{31.62} + \frac{(34-32.30)^2}{32.30} + \frac{(27-36.59)^2}{36.59} + \frac{(39-30.38)^2}{30.38} + \frac{(32-31.03)^2}{31.03} + \frac{(45-37.33)^2}{37.33} + \frac{(26-31.00)^2}{31.00} + \frac{(29-31.67)^2}{31.67}$$
$$= 0.097 + 0.414 + 0.089 + 2.512 + 2.446 + 0.030 + 1.576 + 0.806 + 0.225 = 8.195$$
and the number of degrees of freedom is (r − 1) × (c − 1) = (3 − 1) × (3 − 1) = 4 so that the
critical value from tables is $\chi^2_{0.05,4} = 9.49$.
Since 8.195 < 9.49 we do not have sufficient evidence to reject the null hypothesis and so we should
conclude that there is no evidence that the distribution of fault types differs between production
lines.
Exercises
1. A new compound for the drive belt of domestic vacuum cleaners is tested. Twenty cleaners are
fitted with belts made from the new material and twenty are fitted with standard belts. The
cleaners are run for a fixed period after which the belts are examined for signs of wear. The
numbers showing significant wear are counted. The data are as follows.
Wear No wear
Standard 12 8
New compound 6 14
Test the hypothesis that there is no difference between the standard belts and those made
with the new compound in terms of the probability of showing wear. Use the 5% level of
significance.
2. Electronic devices are made on three production lines. Records are kept of faults found on de-
vices made on each line. Faults are classified as “electronics”, “power supply” or “mechanical”.
The data are as follows.
Production Line
1 2 3
Electronic 13 33 15
Power supply 7 4 11
Mechanical 18 10 14
Test the hypothesis that there is no association between production line and type of fault. Use
the 5% level of significance.
Answers
1. Observed frequencies:
                Wear   No wear   Total
Standard         12       8       20
New compound      6      14       20
Total            18      22       40
Expected frequencies: 20 × 18/40 = 9, 20 × 22/40 = 11.
Wear No wear Total

Standard 9 11 20
New compound 9 11 20
Total 18 22 40
Test statistic
$$W = \sum \frac{(O - E)^2}{E} = \frac{(12-9)^2}{9} + \frac{(8-11)^2}{11} + \frac{(6-9)^2}{9} + \frac{(14-11)^2}{11}$$
$$= 1 + 0.82 + 1 + 0.82 = 3.636$$
Degrees of freedom: (2 − 1) × (2 − 1) = 1.
Critical value: $\chi^2_1(5\%) = 3.84$.
The result is not significant at the 5% level. There is insufficient evidence to conclude that there
is a difference between the wear rates.
Answers
2. Observed frequencies:
Production Line
1 2 3 Total
Electronic 13 33 15 61
Power supply 7 4 11 22
Mechanical 18 10 14 42
Total 38 47 40 125
Expected frequencies, e.g. 61 × 38/125 = 18.544.
Production Line
1 2 3 Total
Electronic 18.544 22.936 19.520 61
Power supply 6.688 8.272 7.040 22
Mechanical 12.768 15.792 13.440 42
Total 38.000 47.000 40.000 125
Test statistic:
$$W = \sum \frac{(O - E)^2}{E} = \frac{(13-18.544)^2}{18.544} + \cdots + \frac{(14-13.440)^2}{13.440} = 15.860.$$
Degrees of freedom: (3 − 1) × (3 − 1) = 4.
Critical value: $\chi^2_4(5\%) = 9.49$.
The test statistic is significant at the 5% level. We reject the null hypothesis and conclude that
there is an association between fault type and production line. In particular there seems to be an
excess of electronic faults on Line 2.
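Table 1: Percentage Points $\chi^2_{\alpha,\nu}$ of the $\chi^2$ distribution
[Figure: sketch of the density f with the tail area α beyond $\chi^2_{\alpha,\nu}$ marked.]
ν \ α 0.995 0.990 0.975 0.950 0.900 0.500 0.100 0.050 0.025 0.010 0.005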
1 0.00 0.00 0.00 0.00 0.02 0.45 2.71 3.84 5.02 6.63 7.88
2 0.01 0.02 0.05 0.10 0.21 1.39 4.61 5.99 7.38 9.21 10.60
3 0.07 0.11 0.22 0.35 0.58 2.37 6.25 7.81 9.35 11.34 12.84
4 0.21 0.30 0.48 0.71 1.06 3.36 7.78 9.49 11.14 13.28 14.86
5 0.41 0.55 0.83 1.15 1.61 4.35 9.24 11.07 12.83 15.09 16.75
6 0.68 0.87 1.24 1.64 2.20 5.35 10.65 12.59 14.45 16.81 18.55
7 0.99 1.24 1.69 2.17 2.83 6.35 12.02 14.07 16.01 18.48 20.28
8 1.34 1.65 2.18 2.73 3.49 7.34 13.36 15.51 17.53 20.09 21.96
9 1.73 2.09 2.70 3.33 4.17 8.34 14.68 16.92 19.02 21.67 23.59
10 2.16 2.56 3.25 3.94 4.87 9.34 15.99 18.31 20.48 23.21 25.19
11 2.60 3.05 3.82 4.57 5.58 10.34 17.28 19.68 21.92 24.72 26.76
12 3.07 3.57 4.40 5.23 6.30 11.34 18.55 21.03 23.34 26.22 28.30
13 3.57 4.11 5.01 5.89 7.04 12.34 19.81 22.36 24.74 27.69 29.82
14 4.07 4.66 5.63 6.57 7.79 13.34 21.06 23.68 26.12 29.14 31.32
15 4.60 5.23 6.27 7.26 8.55 14.34 22.31 25.00 27.49 30.58 32.80
16 5.14 5.81 6.91 7.96 9.31 15.34 23.54 26.30 28.85 32.00 34.27
17 5.70 6.41 7.56 8.67 10.09 16.34 24.77 27.59 30.19 33.41 35.72
18 6.26 7.01 8.23 9.39 10.87 17.34 25.99 28.87 31.53 34.81 37.16
19 6.84 7.63 8.91 10.12 11.65 18.34 27.20 30.14 32.85 36.19 38.58
20 7.43 8.26 9.59 10.85 12.44 19.34 28.41 31.41 34.17 37.57 40.00
21 8.03 8.90 10.28 11.59 13.24 20.34 29.62 32.67 35.48 38.93 41.40
22 8.64 9.54 10.98 12.34 14.04 21.34 30.81 33.92 36.78 40.29 42.80
23 9.26 10.20 11.69 13.09 14.85 22.34 32.01 35.17 38.08 41.64 44.18
24 9.89 10.86 12.40 13.85 15.66 23.34 33.20 36.42 39.36 42.98 45.56
25 10.52 11.52 13.12 14.61 16.47 24.34 34.38 37.65 40.65 44.31 46.93
26 11.16 12.20 13.84 15.38 17.29 25.34 35.56 38.89 41.92 45.64 48.29
27 11.81 12.88 14.57 16.15 18.11 26.34 36.74 40.11 43.19 46.96 49.65
28 12.46 13.57 15.31 16.93 18.94 27.34 37.92 41.34 44.46 48.28 50.99
29 13.12 14.26 16.05 17.71 19.77 28.34 39.09 42.56 45.72 49.59 52.34
30 13.79 14.95 16.79 18.49 20.60 29.34 40.26 43.77 46.98 50.89 53.67
40 20.71 22.16 24.43 26.51 29.05 39.34 51.81 55.76 59.34 63.69 66.77
50 27.99 29.71 32.36 34.76 37.69 49.33 63.17 67.50 71.42 76.15 79.49
60 35.53 37.48 40.48 43.19 46.46 59.33 74.40 79.08 83.30 88.38 91.95
70 43.28 45.44 48.76 51.74 55.33 69.33 85.53 90.53 95.02 100.42 104.22
80 51.17 53.54 57.15 60.39 64.28 79.33 96.58 101.88 106.63 112.33 116.32
90 59.20 61.75 65.65 69.13 73.29 89.33 107.57 113.14 118.14 124.12 128.30
100 67.33 70.06 74.22 77.93 82.36 99.33 118.50 124.34 129.56 135.81 140.17
Contents 43
Regression and
Correlation
43.1 Regression
43.2 Correlation
Learning outcomes
You will learn how to explore relationships between variables and how to measure the
strength of such relationships. You should note from the outset that simply establishing
a relationship is not enough. You may establish, for example, a relationship between the
number of hours a person works in a week and their hat size. Should you conclude that
working hard causes your head to enlarge? Clearly not; any relationship existing here is
not causal!

Regression 43.1
Introduction
Problems in engineering often involve the exploration of the relationship(s) between two or more
variables. The technique of regression analysis is very useful and well-used in this situation. This
Section will look at the basics of regression analysis and should enable you to apply regression
techniques to the study of relationships between variables. Just because a relationship exists between
two variables does not necessarily imply that the relationship is causal. You might find, for example
that there is a relationship between the hours a person spends watching TV and the incidence of
lung cancer. This does not necessarily imply that watching TV causes lung cancer.
Assuming that a causal relationship does exist, we can measure the strength of the relationship
by means of a correlation coefficient discussed in the next Section. As you might expect, tests of
significance exist which allow us to interpret the meaning of a calculated correlation coefficient.
Prerequisites
Before starting this Section you should . . .
• have knowledge of Descriptive Statistics (36)
• be able to find the expectation and variance of sums of variables (39.3)
• understand the terms independent and dependent variables
• understand the terms biased and unbiased estimators

Learning Outcomes
On completion you should be able to . . .
• define the terms regression analysis and regression line
• use the method of least squares for finding a line of best fit
Workbook 43: Regression and Correlation
1. Regression
As we have already noted, relationship(s) between variables are of interest to engineers who may
wish to determine the degree of association existing between independent and dependent variables.
Knowing this often helps engineers to make predictions and, on this basis, to forecast and plan.
Essentially, regression analysis provides a sound knowledge base from which accurate estimates of
the values of a dependent variable may be made once the values of related independent variables are
known.
It is worth noting that in practice the choice of independent variable(s) may be made by the engineer
on the basis of experience and/or prior knowledge since this may indicate to the engineer which
independent variables are likely to have a substantial influence on the dependent variable. In summary,
we may state that the principal objectives of regression analysis are:
(a) to enable accurate estimates of the values of a dependent variable to be made from known
values of a set of independent variables;
(b) to enable estimates of errors resulting from the use of a regression line as a basis of
prediction.
Note that if a regression line is represented as y = f (x) where x is the independent variable, then
the actual function used (linear, quadratic, higher degree polynomial etc.) may be obtained via the
use of a theoretical analysis or perhaps a scatter diagram (see below) of some real data. Note that
a regression line represented as y = f (x) is called a regression line of y on x .
Scatter diagrams
A useful first step in establishing the degree of association between two variables is the plotting of a
scatter diagram. Examples of pairs of measurements which an engineer might plot are:
(a) volume and pressure;

(b) acceleration and tyre wear;
(c) current and magnetic field;
(d) torsion strength of an alloy and purity.
If there exists a relationship between measured variables, it can take many forms. Even though an
outline introduction to non-linear regression is given at the end of this Workbook, we shall focus on
the linear relationship only.
In order to produce a good scatter diagram you should follow the steps given below:
1. Give the diagram a clear title and indicate exactly what information is being displayed;
2. Choose and clearly mark the axes;
3. Choose carefully and clearly mark the scales on the axes;
4. Indicate the source of the data.
Examples of scatter diagrams are shown below.
[Figure 1, Figure 2 and Figure 3: three example scatter diagrams.]
Figure 1 shows an association which follows a curve, possibly exponential, quadratic or cubic;
Figure 2 shows a reasonable degree of linear association where the points of the scatter diagram
lie in an area surrounding a straight line;
Figure 3 represents a randomly placed set of points and no linear association is present between
the variables.
Note that in Figure 2, the word ‘reasonable’ is not defined and that while points ‘close’ to the
indicated straight line may be explained by random variation, those ‘far away’ may be due to assignable
variation.
The rest of this Section will deal with linear association only although it is worth noting that techniques
do exist for transforming many non-linear relationships into linear ones. We shall investigate linear
association in two ways, firstly by using educated guess work to obtain a regression line ‘by eye’ and
secondly by using the well-known technique called the method of least squares.
Regression lines by eye

Note that at a very simple level, we may look at the data and, using an ‘educated guess’, draw
a line of regression ‘by eye’ through a set of points. However, finding a regression line by eye is
unsatisfactory as a general statistical method since it involves guess-work in drawing the line with
the associated errors in any results obtained. The guess-work can be removed by the method of least
squares in which the equation of a regression line is calculated using data. Essentially, we calculate
the equation of the regression line by minimising the sum of the squared vertical distances between
the data points and the line.
The method of least squares - an elementary view

We assume that an experiment has been performed which has resulted in n pairs of values, say
(x1 , y1 ), (x2 , y2 ), · · · , (xn , yn ) and that these results have been checked for approximate linearity on
the scatter diagram given below.
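[Figure 4: scatter diagram with axes x and y and origin O, showing the data points P1 (x1, y1), P2 (x2, y2), ..., Pn (xn, yn), the line y = a + bx, and the points Q1, Q2, ..., Qn.]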
The vertical distances of the points from the line y = a + bx are easily calculated as
$$y_1 - a - bx_1, \quad y_2 - a - bx_2, \quad y_3 - a - bx_3, \quad \cdots, \quad y_n - a - bx_n$$
These distances are squared to guarantee that they are non-negative and calculus is used to minimise the
sum of the squared distances. Effectively we are minimising a function of the two variables a and b
and need to use partial differentiation. If you wish to follow this up and look in more detail at
the technique, any good book (engineering or mathematics) containing sections on multi-variable
calculus should suffice. We will not look at the details of the calculations here but simply note that
the process results in two equations in the two unknowns a and b being formed. These equations
are:
$$\sum xy - a \sum x - b \sum x^2 = 0 \qquad \text{(i)}$$
and
$$\sum y - na - b \sum x = 0 \qquad \text{(ii)}$$
The second of these equations (ii) immediately gives a useful result. Rearranging the equation we
get
$$\frac{\sum y}{n} - a - b\,\frac{\sum x}{n} = 0 \quad \text{or, put more simply,} \quad \bar{y} = a + b\bar{x}$$
where (x̄, ȳ) is the mean of the array of data points (x1 , y1 ), (x2 , y2 ), · · · , (xn , yn ).
This shows that the mean of the array always lies on the regression line. Since the mean is easily
calculated, the result forms a useful check for a plotted regression line. Ensure that any regression
line you draw passes through the mean of the array of data points.
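For readers who want the omitted step, equations (i) and (ii) can be recovered as follows. This is a brief sketch of the standard argument; the symbol S for the sum of the squared vertical distances is introduced here for convenience.

```latex
S(a, b) = \sum_{i=1}^{n} (y_i - a - b x_i)^2

\frac{\partial S}{\partial a} = -2 \sum (y_i - a - b x_i) = 0
  \;\Longrightarrow\; \sum y - na - b \sum x = 0 \qquad \text{(ii)}

\frac{\partial S}{\partial b} = -2 \sum x_i (y_i - a - b x_i) = 0
  \;\Longrightarrow\; \sum xy - a \sum x - b \sum x^2 = 0 \qquad \text{(i)}
```

Setting both partial derivatives to zero locates the stationary point of S, which for this sum of squares is its minimum.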
Eliminating a from the equations gives a formula for the gradient b of the regression line. This is:
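$$b = \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n}\,\dfrac{\sum y}{n}}{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} \quad \text{often written as} \quad b = \frac{S_{xy}}{S_x^2}$$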
The quantity $S_x^2$ is, of course, the variance of the x-values. The quantity $S_{xy}$ is known as the
covariance (of x and y) and will appear again later in this Workbook when we measure the degree
of linear association between two variables.
Knowing the value of b enables us to obtain the value of a from the equation ȳ = a + bx̄
Key Point 1
Least Squares Regression - y on x
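The least squares regression line of y on x has the equation y = a + bx, where
$$b = \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n}\,\dfrac{\sum y}{n}}{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} \quad \text{and a is given by the equation} \quad a = \bar{y} - b\bar{x}$$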
It should be noted that the coefficients b and a obtained here will give us the regression line of y on
x. This line is used to predict y values given x values. If we need to predict the values of x from
given values of y we need the regression line of x on y. The two lines are not the same except in the
(very) special case where all of the points lie exactly on a straight line. It is worth noting however,
that the two lines cross at the point (x̄, ȳ). It can be shown that the regression line of x on y is
given by Key Point 2:
Key Point 2
Least Squares Regression - x on y
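The regression line of x on y is
$$x = a' + b'y$$
where
$$b' = \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n}\,\dfrac{\sum y}{n}}{\dfrac{\sum y^2}{n} - \left(\dfrac{\sum y}{n}\right)^2} \quad \text{and} \quad a' = \bar{x} - b'\bar{y}$$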
Example 1
A warehouse manager of a company dealing in large quantities of steel cable needs
to be able to estimate how much cable is left on his partially used drums. A
random sample of twelve partially used drums is taken and each drum is weighed
and the corresponding length of cable measured. The results are given in the table
below:
Weight of drum and cable (x) kg. Measured length of cable (y) m.
30 70
40 90
40 100
50 120
50 130
50 150
60 160
70 190
70 200
80 200
80 220
80 230
Find the least squares regression line in the form y = mx + c and use it to predict
the lengths of cable left on drums whose weights are:
(i) 35 kg (ii) 85 kg (iii) 100 kg
In the latter case state any assumptions which you make in order to find the length
of cable left on the drum.
Solution
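Excel calculations give $\sum x = 700$, $\sum x^2 = 44200$, $\sum y = 1860$, $\sum xy = 118600$ so that
the formulae
$$b = \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n}\,\dfrac{\sum y}{n}}{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$
give a = −20 and b = 3. Our regression line is y = −20 + 3x, so y = 3x − 20.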
Hence, the required predicted values are:
$$y_{35} = 3 \times 35 - 20 = 85 \qquad y_{85} = 3 \times 85 - 20 = 235 \qquad y_{100} = 3 \times 100 - 20 = 280$$
all results being in metres.
To obtain the last result we have assumed that the linearity of the relationship continues beyond
the range of values actually taken.
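The solution relies on Excel for the sums. As an alternative check, the formulae of Key Point 1 can be coded directly in Python. The function name `least_squares_line` is an illustrative choice and not part of the Workbook.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_x2 = sum(x * x for x in xs) / n
    b = (mean_xy - mean_x * mean_y) / (mean_x2 - mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b


weights = [30, 40, 40, 50, 50, 50, 60, 70, 70, 80, 80, 80]            # x, kg
lengths = [70, 90, 100, 120, 130, 150, 160, 190, 200, 200, 220, 230]  # y, m

a, b = least_squares_line(weights, lengths)
for w in (35, 85, 100):
    print(f"weight {w} kg -> predicted length {a + b * w:.0f} m")
```

The prediction for 100 kg lies outside the range of the data (30 kg to 80 kg), so it carries the extrapolation assumption stated in the solution.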
Task
An article in the Journal of Sound and Vibration 1991 (151) explored a possible
relationship between hypertension (defined as blood pressure rise in mm of mer-
cury) and exposure to noise levels (measured in decibels). Some data given is as
follows:
Noise Level (x) Blood pressure rise (y) Noise Level (x) Blood pressure rise (y)
60 1 85 5
63 0 89 4
65 1 90 6
70 2 90 8
70 5 90 4
70 1 90 5
80 4 94 7
90 6 100 9
80 2 100 7
80 3 100 6
(a) Draw a scatter diagram of the data.

(b) Comment on whether a linear model is appropriate for the data.
(c) Calculate a line of best fit of y on x for the data given.
(d) Use your regression line to predict the expected rise in blood pressure for
exposure to a noise level of 97 decibels.
Your solution
Answer
(a) Entering the data into Microsoft Excel and plotting gives
[Scatter plot titled "Blood Pressure increase versus recorded sound level". Vertical axis: Blood Pressure rise (mm Mercury), 0 to 9. Horizontal axis: Sound Level (dB), 50 to 100.]
(b) A linear model is appropriate.

(c) Excel calculations give $\sum x = 1656$, $\sum x^2 = 140176$, $\sum y = 86$, $\sum xy = 7654$
so that b = 0.1743 and a = −10.1315. Our regression line is y = 0.1743x − 10.1315.
(d) The predicted value is: $y_{97} = 0.1743 \times 97 - 10.1315 = 6.78$ mm mercury.
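The four sums quoted in part (c) can be verified without a spreadsheet. The Python sketch below recomputes them from the data table in the Task and then applies the formulae for b and a; the variable names are chosen here for illustration. Because the coefficients are not rounded before use, the prediction may differ from 6.78 in the last decimal place.

```python
noise = [60, 63, 65, 70, 70, 70, 80, 90, 80, 80,
         85, 89, 90, 90, 90, 90, 94, 100, 100, 100]   # x, decibels
rise = [1, 0, 1, 2, 5, 1, 4, 6, 2, 3,
        5, 4, 6, 8, 4, 5, 7, 9, 7, 6]                  # y, mm of mercury

n = len(noise)
sum_x = sum(noise)
sum_x2 = sum(x * x for x in noise)
sum_y = sum(rise)
sum_xy = sum(x * y for x, y in zip(noise, rise))

# Gradient and intercept of the regression line of y on x.
b = (sum_xy / n - (sum_x / n) * (sum_y / n)) / (sum_x2 / n - (sum_x / n) ** 2)
a = sum_y / n - b * sum_x / n

print(sum_x, sum_x2, sum_y, sum_xy)
print(f"b = {b:.4f}, a = {a:.4f}, prediction at 97 dB = {a + b * 97:.2f}")
```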
The method of least squares - a modelling view

We take the dependent variable Y to be a random variable whose value, for a fixed value of x depends
on the value of x and a random error component say e and we write
Y = α + βx + e
Adopting the notation of conditional probability, we are looking for the expected value of Y for a
given value of x. The expected value of Y for a given value of x is denoted by
E(Y |x) = E(α + βx + e) = E(α + βx) + E(e)
The variance of Y for a given value of x is given by the relationship
V(Y |x) = V(α + βx + e) = V(α + βx) + V(e), assuming independence.
If µY |x represents the true mean value of Y for a given value of x then
µY |x = α + βx, assuming a linear relationship holds,
is a straight line of mean values. If we now assume that the errors e are distributed with mean 0 and
variance $\sigma^2$ we may write
E(Y |x) = E(α + βx) + E(e) = α + βx since E(e) = 0.
and
V(Y |x) = V(α + βx) + V(e) = $\sigma^2$ since V(α + βx) = 0.
This implies that for each value of x, Y is distributed with mean α + βx and variance $\sigma^2$. Hence
when the variance is small the observed values of Y will be close to the regression line and when the
variance is large, at least some of the observed values of Y may not be close to the line. Note that
the assumption that the errors e are distributed with mean 0 may be made without loss of generality.
If the errors had any other mean, we could subtract it from the errors and add it to the value of α. The
ideas are illustrated in the following diagram.
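[Figure: the regression line E(y|x) = α + βx drawn through the distributions of y at x1, x2, x3, ..., xi, with the error ei marked for the observed value yi.]
Figure 5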
The regression line is shown passing through the means of the distributions for the individual values
of x. The value of y corresponding to the x-value xi can be represented by the equation
$$y_i = \alpha + \beta x_i + e_i$$
where $e_i$ is the error of the observed value of y, that is the difference from its expected value, namely
$$E(Y|x_i) = \mu_{Y|x_i} = \alpha + \beta x_i$$
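The statements E(Y|x) = α + βx and V(Y|x) = σ² can be checked informally by simulation. The sketch below is illustrative only: the parameter values are hypothetical, and the errors are assumed to be normally distributed, which is stronger than the assumption made in the text (mean 0 and variance σ²).

```python
import random
import statistics

def simulate_y(alpha, beta, sigma, x, n_draws, seed=0):
    """Draw n_draws values of Y = alpha + beta*x + e, with e ~ N(0, sigma^2) (assumed)."""
    rng = random.Random(seed)
    return [alpha + beta * x + rng.gauss(0.0, sigma) for _ in range(n_draws)]

# Hypothetical parameter values, chosen only for illustration.
alpha, beta, sigma = -20.0, 3.0, 2.0
ys = simulate_y(alpha, beta, sigma, x=50.0, n_draws=20000)

sample_mean = statistics.mean(ys)      # should be close to alpha + beta*x = 130
sample_var = statistics.variance(ys)   # should be close to sigma^2 = 4
print(round(sample_mean, 2), round(sample_var, 2))
```

Repeating the run with a larger value of sigma spreads the simulated values of Y further from the line, as described above.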
Now, if we estimate α and β with a and b, the residual, or estimated error, becomes
$$\hat{e}_i = y_i - a - b x_i$$
so that the sum of the squares of the residuals is given by
$$S = \sum \hat{e}_i^2 = \sum (y_i - a - b x_i)^2$$
and we may minimize the quantity S by using the method of least squares as before. The mathematical
details are again omitted; the equations obtained for b and a are unchanged, namely
$$b = \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n}\,\dfrac{\sum y}{n}}{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} \quad\text{and}\quad a = \bar{y} - b\bar{x}.$$
Note that since the error $e_i$ in the ith observation essentially describes the error in the fit of the
model to the ith observation, the sum of the squares of the errors $\sum e_i^2$ will now be used to allow us
to comment on the adequacy of fit of a linear model to a given data set.
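The formulae for b and a translate directly into a few lines of code. The following is a minimal sketch; the function name and the five data points are hypothetical and serve only to show the arithmetic.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_x2 = sum(x * x for x in xs) / n
    b = (mean_xy - mean_x * mean_y) / (mean_x2 - mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data, not taken from the Workbook.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)

# Residuals e_i = y_i - a - b*x_i and their sum of squares S.
residuals = [y - a - b * x for x, y in zip(xs, ys)]
S = sum(e ** 2 for e in residuals)
print(a, b, S)
```

Any other choice of a and b gives a larger value of S for the same data, which is the sense in which the line is the least squares fit.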
Adequacy of fit
We now know that the variance V(Y|x) = σ² is the key to describing the adequacy of fit of our
simple linear model. In general, the smaller the variance, the better the fit, although you should note
that it is wise to distinguish between ‘poor fit’ and a large error variance. Poor fit may suggest, for
example, that the relationship is not in fact linear and that a fundamental assumption made has been
violated. A large value of σ² does not necessarily mean that a linear model is a poor fit.
It can be shown that the sum of the squares of the errors, say SSE, can be used to give an unbiased
estimator σ̂² of σ² via the formula
$$\hat{\sigma}^2 = \frac{SS_E}{n-p}$$
where p is the number of parameters estimated in the regression equation. In the case of simple
linear regression p = 2, since we estimate just the intercept and the coefficient of x, and the estimator becomes:
$$\hat{\sigma}^2 = \frac{SS_E}{n-2}$$
The quantity SSE is usually used explicitly in formulae whose purpose is to determine the adequacy
of a linear model to explain the variability found in data. Two ways in which the adequacy of a
regression model may be judged are given by the so-called Coefficient of Determination and the
Adjusted Coefficient of Determination.
The coefficient of determination

Denoted by $R^2$, the Coefficient of Determination is defined by the formula
$$R^2 = 1 - \frac{SS_E}{SS_T}$$
where $SS_E$ is the sum of the squares of the errors and $SS_T$ is the total sum of squares
given by $\sum (y_i - \bar{y})^2 = \sum y_i^2 - n\bar{y}^2$. The value of $R^2$ is sometimes described as representing the
amount of variability explained or accounted for by a regression model. For example, if after a
particular calculation it was found that R² = 0.884, we could say that the model accounts for about
88% of the variability found in the data. However, deductions made on the basis of the value of
R² should be treated cautiously; the reasons for this are embedded in the following properties of the
statistic. It can be shown that:
(a) 0 ≤ R² ≤ 1
(b) a large value of R² does not necessarily imply that a model is a good fit;
(c) adding a regressor variable (simple regression becomes multiple regression) always
increases the value of R². This is one reason why a large value of R² does not necessarily
imply a good model;
(d) models giving large values of R² can be poor predictors of new values if the fitted model
does not apply at the appropriate x-value.
Finally, it is worth noting that to check the fit of a linear model properly, one should look at plots of
residual values. In some cases, tests of goodness-of-fit are available although this topic is not covered
in this Workbook.
The adjusted coefficient of determination
Denoted (often) by $R^2_{adj}$, the Adjusted Coefficient of Determination is defined as
$$R^2_{adj} = 1 - \frac{SS_E/(n-p)}{SS_T/(n-1)}$$
where p is the number of parameters in the regression equation. For the simple linear model, p = 2
since we have two unknown parameters in the regression equation, the intercept and the coefficient
of x. It can be shown that:
(a) $R^2_{adj}$ is a better indicator of the adequacy of predictive power than $R^2$ since it takes into
account the number of regressor variables used in the model;
(b) $R^2_{adj}$ does not necessarily increase when a new regressor variable is added.
Both coefficients claim to measure the adequacy of the predictive power of a regression model and
their values indicate the proportion of variability explained by the model. For example a value of
$R^2$ or $R^2_{adj} = 0.9751$
may be interpreted as indicating that a model explains 97.51% of the variability it describes. For
example, the drum and cable example considered previously gives the results outlined below with
$R^2 = 0.962$ and $R^2_{adj} = 0.958$
In general, $R^2_{adj}$ is (perhaps) more useful than $R^2$ for comparing alternative models. In the context
of a simple linear model, $R^2$ is easier to interpret. In the drum and cable example we would claim
that the linear model explains some 96.2% of the variation it describes.
Drum & Cable (x)   x²     Cable Length (y)   y²      xy      Predicted Values   Error Squares
30                 900    70                 4900    2100    70                 0.00
40                 1600   90                 8100    3600    100                100.00
40                 1600   100                10000   4000    100                0.00
50                 2500   120                14400   6000    130                100.00
50                 2500   130                16900   6500    130                0.00
50                 2500   150                22500   7500    130                400.00
60                 3600   160                25600   9600    160                0.00
70                 4900   190                36100   13300   190                0.00
70                 4900   200                40000   14000   190                100.00
80                 6400   200                40000   16000   220                400.00
80                 6400   220                48400   17600   220                0.00
80                 6400   230                52900   18400   220                100.00
Sum of x = 700, Sum of x² = 44200, Sum of y = 1860, Sum of y² = 319800, Sum of xy = 118600, SSE = 1200.00
b = 3, a = −20, SST = 31500, R² = 0.962, R²adj = 0.958
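The summary values in the table can be reproduced with a short script instead of a spreadsheet. This is a sketch only; the variable names are illustrative, and the fitted values a = −20 and b = 3 are taken from the table rather than recomputed.

```python
# Drum and cable data as listed in the table above.
xs = [30, 40, 40, 50, 50, 50, 60, 70, 70, 80, 80, 80]
ys = [70, 90, 100, 120, 130, 150, 160, 190, 200, 200, 220, 230]
a, b = -20, 3          # fitted intercept and slope quoted in the table
n, p = len(xs), 2      # p = 2 parameters for the simple linear model

mean_y = sum(ys) / n
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))   # sum of squared errors
sst = sum((y - mean_y) ** 2 for y in ys)                    # total sum of squares
r2 = 1 - sse / sst
r2_adj = 1 - (sse / (n - p)) / (sst / (n - 1))
sigma2_hat = sse / (n - p)                                  # estimate of the error variance

print(sse, sst, round(r2, 3), round(r2_adj, 3), sigma2_hat)
```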
Task
Use the drum and cable data given in Example 1 (page 7) and set up a spreadsheet
to verify the values of the Coefficient of Determination and the Adjusted Coefficient
of Determination calculated on page 12.
Your solution
Answer
As per the table on page 12 giving $R^2 = 0.962$ and $R^2_{adj} = 0.958$.
Significance testing for regression

Note that the results in this Section apply to the simple linear model only. Some additions are
necessary before the results can be generalized.
The discussions so far pre-suppose that a linear model adequately describes the relationship between
the variables. We can use a significance test involving the F-distribution to decide whether or not y is
linearly dependent on x. We set up the following hypotheses:
H0 : β = 0 and H1 : β ≠ 0
Key Point 3
Significance Test for Regression
$$F_{test} = \frac{SS_R}{SS_E/(n-2)}$$
where $SS_R = SS_T - SS_E$ and rejection at the 5% level of significance occurs if
$$F_{test} > f_{0.05,1,n-2}$$
Note that we have one degree of freedom since we are testing only one parameter (β) and that n
denotes the number of pairs of (x, y) values. A set of tables giving the 5% values of the F-distribution
is given at the end of this Workbook (Table 1).
Example 2
Test to determine whether a simple linear model is appropriate for the data previ-
ously given in the drum and cable example above.
Solution
We know that
$$SS_T = SS_R + SS_E$$
where $SS_T = \sum y^2 - \dfrac{(\sum y)^2}{n}$ is the total sum of squares (of y) so that (from the spreadsheet
above) we have:
$$SS_R = 31500 - 1200 = 30300$$
Hence
$$F_{test} = \frac{SS_R}{SS_E/(n-2)} = \frac{30300}{1200/(12-2)} = 252.5$$
From Table 1, the critical value is $f_{0.05,1,10} = 4.96$.
Hence, since $F_{test} > f_{0.05,1,10}$, we reject the null hypothesis and conclude that β ≠ 0.
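The arithmetic of this test is easily scripted. In the sketch below the function name is hypothetical, and the critical value 4.96 is simply read from Table 1 (column u = 1, row v = 10) rather than computed.

```python
def f_statistic(sst, sse, n):
    """F statistic for H0: beta = 0 in simple linear regression."""
    ssr = sst - sse                 # regression sum of squares
    return ssr / (sse / (n - 2))

f_test = f_statistic(sst=31500, sse=1200, n=12)
f_crit = 4.96                       # f_{0.05,1,10} read from Table 1
reject_h0 = f_test > f_crit
print(f_test, reject_h0)
```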
Regression curves
This Section should be regarded as introductory only. The reason for including non-linear regression
is to demonstrate how the method of least squares can be extended to deal with cases where the
relationship between variables is, for example, quadratic or exponential.
A regression curve is defined to be the curve passing through the expected value of Y for a set of
given values of x. The idea is illustrated by the following diagram.
[Figure: the regression curve of y on x, drawn through the distributions of y (density f(y)) for the given values x1, x2, x3, ..., xn.]
Figure 6
We will look at the quadratic and exponential cases in a little detail.
The quadratic case

We are looking for a functional relation of the form
$$y = \alpha + \beta x + \gamma x^2$$
and so, using the method of least squares, we require the values of a, b and c which minimize the
expression
$$f(a,b,c) = \sum_{r=1}^{n} (y_r - a - b x_r - c x_r^2)^2$$
Note here that the regression described by the form
$$y = \alpha + \beta x + \gamma x^2$$
is actually a linear regression since the expression is linear in α, β and γ.
Omitting the subscripts and using partial differentiation gives
$$\frac{\partial f}{\partial a} = -2\sum (y - a - bx - cx^2)$$
$$\frac{\partial f}{\partial b} = -2\sum x(y - a - bx - cx^2)$$
$$\frac{\partial f}{\partial c} = -2\sum x^2(y - a - bx - cx^2)$$
At a minimum we require
$$\frac{\partial f}{\partial a} = \frac{\partial f}{\partial b} = \frac{\partial f}{\partial c} = 0$$
which results in the three linear equations
$$\sum y - na - b\sum x - c\sum x^2 = 0$$
$$\sum xy - a\sum x - b\sum x^2 - c\sum x^3 = 0$$
$$\sum x^2 y - a\sum x^2 - b\sum x^3 - c\sum x^4 = 0$$
which can be solved to give the values of a, b and c.
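As an illustration, the three equations can be assembled from the power sums and solved numerically. The sketch below uses a small hand-written Gaussian elimination so that it needs no external library; the function names and the data (generated from y = 1 + 2x + 3x²) are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def fit_quadratic(xs, ys):
    """Return [a, b, c] solving the three normal equations for y = a + b*x + c*x^2."""
    sx = [sum(x ** k for x in xs) for k in range(5)]        # sx[k] = sum of x^k, sx[0] = n
    sxy = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    matrix = [[sx[0], sx[1], sx[2]],
              [sx[1], sx[2], sx[3]],
              [sx[2], sx[3], sx[4]]]
    return solve_linear_system(matrix, sxy)

# Hypothetical data lying exactly on y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
a, b, c = fit_quadratic(xs, ys)
print(a, b, c)
```

Because the data lie exactly on a quadratic, the fitted coefficients reproduce the generating values up to rounding error; with scattered data the same equations give the least squares estimates.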
The exponential case

We use the same technique to look for a functional relation of the form
$$y = \alpha e^{\beta x}$$
As before, using the method of least squares, we require the values of a and b which minimize the
expression
$$f(a,b) = \sum_{r=1}^{n} (y_r - a e^{b x_r})^2$$
Again omitting the subscripts and using partial differentiation gives
$$\frac{\partial f}{\partial a} = -2\sum e^{bx}(y - a e^{bx})$$
$$\frac{\partial f}{\partial b} = -2\sum a x e^{bx}(y - a e^{bx})$$
At a minimum we require
$$\frac{\partial f}{\partial a} = \frac{\partial f}{\partial b} = 0$$
which results in the two non-linear equations
$$\sum y e^{bx} - a\sum e^{2bx} = 0$$
$$\sum x y e^{bx} - a\sum x e^{2bx} = 0$$
which can be solved by iterative methods to give the values of a and b.
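One possible iterative scheme, sketched here as an illustration and not as the method intended by the text, is to use the first equation to express a in terms of b and then to solve the second equation for b by bisection. The function names, the bracketing interval and the data (generated from y = 2e^{0.5x}) are all hypothetical.

```python
import math

def profile_a(b, xs, ys):
    """Value of a satisfying the first equation for a given b."""
    num = sum(y * math.exp(b * x) for x, y in zip(xs, ys))
    den = sum(math.exp(2 * b * x) for x in xs)
    return num / den

def g(b, xs, ys):
    """Left-hand side of the second equation after substituting a = profile_a(b)."""
    a = profile_a(b, xs, ys)
    return (sum(x * y * math.exp(b * x) for x, y in zip(xs, ys))
            - a * sum(x * math.exp(2 * b * x) for x in xs))

def fit_exponential(xs, ys, b_low, b_high, tol=1e-12):
    """Bisection on g(b) = 0; the bracket [b_low, b_high] must contain a sign change."""
    g_low = g(b_low, xs, ys)
    if g_low * g(b_high, xs, ys) > 0:
        raise ValueError("bracket does not contain a sign change")
    while b_high - b_low > tol:
        b_mid = 0.5 * (b_low + b_high)
        g_mid = g(b_mid, xs, ys)
        if g_low * g_mid <= 0:
            b_high = b_mid
        else:
            b_low, g_low = b_mid, g_mid
    b = 0.5 * (b_low + b_high)
    return profile_a(b, xs, ys), b

# Hypothetical data lying exactly on y = 2*exp(0.5*x).
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys, b_low=0.0, b_high=1.0)
print(a, b)
```

Bisection is slow but robust; Newton's method applied to the same function g would normally converge in fewer steps given a good starting value.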
Note that it is possible to combine (for example) linear and exponential regression to obtain a
regression equation of the form
$$y = (\alpha + \beta x)e^{\gamma x}$$
The method of least squares may then be used to find estimates a, b, c of α, β, γ.

Correlation 43.2
Introduction
While medical researchers might be interested in knowing the answers to questions such as ‘Is age
related to blood pressure?’ engineers might be interested in knowing the answers to questions such
as ‘Is the shear strength of a weld related to its diameter?’ or ‘Is the rate of wear of a petrol engine
related to its operating temperature?’ As you already know (from reading the introduction to Section
43.1 concerning the topic of regression), statisticians measure the strength of a relationship between
two variables by using a quantity called the correlation coefficient. As you might expect, tests exist
which allow us to interpret the meaning of a calculated correlation coefficient.
Prerequisites
Before starting this Section you should . . .
• have knowledge of Descriptive Statistics as presented in Workbook 36
• have knowledge of Hypothesis Testing based on the t-distribution as presented in Workbook 41
• have knowledge of Regression as presented in Section 43.1

Learning Outcomes
On completion you should be able to . . .
• explain what is meant by the term correlation coefficient
• perform a statistical test in order to interpret the possible meaning of a correlation coefficient
1. Correlation
So far we have assumed that we have a random variable Y related to an independent variable x
which can be measured with some accuracy. In the equation below, the dependent variable Y is a
random variable whose value, for a fixed value of x depends on a random error component say e and
we have
Y = mx + c + e
In some situations, both X and Y are random variables and you should note that we can still use a
regression line of y on x if we are required to predict values of y from observations made on x. In this
case the variables x and y play different roles. In correlation, the two variables are interchangeable.
Examples involving two random variables often quoted are the shear strength (y) and diameter of
spot welds (x) (neither can be precisely controlled) and the bending moment (y) and shear (x) at
the fixed point of a beam as illustrated below
[Figure: a beam with the labels Shear, Moment, Weight of Beam and Load on Beam.]
Figure 6
Again, neither variable (shear or moment) can be precisely controlled, each is a random variable. In
cases such as these, we turn to the correlation coefficient (sometimes called Pearson’s coefficient of
correlation or simply Pearson’s r) defined as
$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
where σxy is the covariance between X and Y and σx and σy are the standard deviations of X and
Y . We need to express this formula in terms of quantities which facilitate the easy calculation of the
correlation coefficient.
Key Point 4
Pearson’s Coefficient of Correlation, r
In terms of corresponding sample values (x, y),
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Further, it can also be shown that −1 ≤ r ≤ 1 and that:
(a) r = −1 represents perfect negative correlation with all (x, y) lying on a straight line with
negative gradient;
(b) r = 1 represents perfect positive correlation with all (x, y) lying on a straight line with
positive gradient;
(c) r = 0 represents the situation where there is no linear relationship between the
variables; any relationship that does exist is non-linear.
The calculation of Pearson’s r

The worked example below shows the setting out of a table which will facilitate the easy calculation
of Pearson’s r.
Example 3
Find the value of Pearson’s r for the following set of data obtained by reading
seven torque values (x) from an electric motor using current (y).
Student 1 2 3 4 5 6 7
x-Value 16 14 12 10 8 6 4
y-Value 12 8 16 14 4 10 6
Solution
The calculation is done as follows:
x     y     x²     y²     xy
16    12    256    144    192
14    8     196    64     112
12    16    144    256    192
10    14    100    196    140
8     4     64     16     32
6     10    36     100    60
4     6     16     36     24
$\sum x = 70$, $\sum y = 70$, $\sum x^2 = 812$, $\sum y^2 = 812$, $\sum xy = 752$
Substituting in the formula we developed for r gives the result:
$$r = \frac{752 \times 7 - 70 \times 70}{\sqrt{(7 \times 812 - 70^2)(7 \times 812 - 70^2)}} = 0.46$$
In practice, one would set up a spreadsheet or use a specialist statistical software package to do the
calculations.
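As an alternative to a spreadsheet, the same calculation can be sketched in a few lines of code. The function name below is hypothetical; the data are those of Example 3.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from the sums used in Key Point 4."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

torque = [16, 14, 12, 10, 8, 6, 4]     # x-values from Example 3
current = [12, 8, 16, 14, 4, 10, 6]    # y-values from Example 3
r = pearson_r(torque, current)
print(round(r, 2))  # 0.46
```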
Comment
Any value of r calculated says something about the degree of correlation present between the two
random variables involved in the calculation. In order to give real meaning to the value
of the correlation coefficient we should test the significance of the value of r, in this case 0.46.
The significance of Pearson’s r
In order to test the significance of a calculated value of r we assume that both x and y are normally
distributed and set up the hypotheses:
H0 : ρ = 0    H1 : ρ ≠ 0
where ρ is the ‘true’ value of the population correlation. If the assumption of normality is false the
test must not be used. We know that −1 ≤ r ≤ 1 and we wish to know whether our
correlation coefficient is significantly different from zero.
Key Point 5
Significance of Pearson’s r
It can be shown that the test statistic
$$r_{test} = \frac{|r|\sqrt{n-2}}{\sqrt{1-r^2}}$$
calculated from a sample of n pairs of values, follows a t-distribution with n − 2 degrees of freedom.
Note that many authors simply miss out the modulus sign and ignore the sign of r should it be
negative. The test statistic is then written
$$r_{test} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
and critical values depending on the level of significance required are read off from t-tables in the
usual way. A copy of t-distribution tables is included at the end of this Workbook (Table 2).
Example 4
Test the significance of the value of r obtained from Example 3 concerning electric
motor torque values. Use the 5% level of significance.
Solution
The sample size is 7 so we have 5 degrees of freedom. The value of rtest is given by
$$r_{test} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.46 \times \sqrt{7-2}}{\sqrt{1-0.46^2}} = 1.158$$
From Table 2, the critical value for a two-sided test at the 5% level of significance is 2.571. In
this case, since 1.158 < 2.571 we cannot reject the null hypothesis at the 5% level of significance
and conclude that for the motor under investigation, there is no evidence of a relationship between
torque produced and current used.
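The test in Example 4 can be sketched in code as follows. The function name is hypothetical, and the critical value 2.571 is read from Table 2 (column .025, 5 degrees of freedom) rather than computed.

```python
import math

def r_test_statistic(r, n):
    """Test statistic |r| * sqrt(n - 2) / sqrt(1 - r^2) from Key Point 5."""
    return abs(r) * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t_stat = r_test_statistic(0.46, 7)
t_crit = 2.571                   # two-sided 5% critical value, 5 degrees of freedom
significant = t_stat > t_crit
print(round(t_stat, 3), significant)
```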
Task
Hooke’s law relates the extension of a spring to the load applied to it.
The following results were obtained experimentally.
Load (N ) 2 5 8 11 15
Extension (mm) 2 23 62 119 223
Calculate Pearson’s r and test its significance at the 5% level. What conclusion
can you draw?
Your solution
Answer
Setting up a spreadsheet to do the calculations gives:
Load (x)   Exten. (y)   xy     x²    y²
2          2            4      4     4
5          23           115    25    529
8          62           496    64    3844
11         119          1309   121   14161
15         223          3345   225   49729
Sum(x) = 41, Sum(y) = 429, Sum(xy) = 5269, Sum(x²) = 439, Sum(y²) = 68267
r = 0.97379629, rtest = 7.41645174
The critical value for a two-sided t-test at the 5% level (3 degrees of freedom), read off from tables, is 3.182.
Since 7.416 > 3.182 we can reject the null hypothesis at the 5% level and conclude that
the correlation coefficient is significantly different from zero.
Comments on interpretation
Some care should always be taken when interpreting results obtained from correlation coefficient
calculations.
(a) A high correlation does not necessarily imply that a causal relationship exists between
the variables considered. For example, it may be that a high degree of correlation exists
between the number of road accidents in a particular city and the number of late trains
arriving at a station in another city both over the same time period. In general one would
not expect to find a causal relation between the variables involved. Similar comments
apply to, for example, water hardness and average income for towns in the UK.
(b) When considering the behaviour of two variables, one should realize that it is possible
that both variables may change because of the influence of a third variable. An example
often quoted in this context is the Gas law
$$\frac{PV}{T} = \text{constant}$$
where say, pressure and volume may change because of a change in temperature.
(c) A low value of the correlation coefficient does not necessarily imply that no relationship
exists between the variables being considered. Remember that the correlation coefficient
is indicative of a linear relationship only and that a low or zero value of r may indicate
that a non-linear relationship exists. For example a set of points lying on the curve $y = x^2$
might (see the Tasks below) result in a zero value of r.
Task
Write down five (x, y) points (symmetrical about zero) lying on the parabola
$y = x^2$. Show that the correlation coefficient between x and y is zero.
Your solution
x y xy x2 y2
Answer
Let the five points be (for example) (−2, 4), (−1, 1), (0, 0), (1, 1), (2, 4)
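x     y     xy    x²    y²
−2    4     −8    4     16
−1    1     −1    1     1
0     0     0     0     0
1     1     1     1     1
2     4     8     4     16
Sums: 0     10    0     10    34
The value of r is given by
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{5 \times 0 - 0 \times 10}{\sqrt{(5 \times 10 - 0^2)(5 \times 34 - 10^2)}} = 0$$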
Task
Write down five (x, y) points (all involving positive values of x and y) lying on
the parabola $y = x^2$. Show that the correlation coefficient between x and y is
non-zero.
Your solution
x y xy x2 y2
Answer
Let the five points be (for example) (0, 0), (1, 1), (2, 4), (3, 9), (4, 16),
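x     y     xy    x²    y²
0     0     0     0     0
1     1     1     1     1
2     4     8     4     16
3     9     27    9     81
4     16    64    16    256
Sums: 10    30    100   30    354
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{5 \times 100 - 10 \times 30}{\sqrt{(5 \times 30 - 10^2)(5 \times 354 - 30^2)}} = 0.959$$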
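The contrast between the two Tasks can be reproduced numerically. This is a sketch only, with a hypothetical function name: the same parabola gives r = 0 when the x-values are symmetrical about zero and a value close to 1 when they all lie on one side.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from the sums used in Key Point 4."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

symmetric_x = [-2, -1, 0, 1, 2]
one_sided_x = [0, 1, 2, 3, 4]

r_symmetric = pearson_r(symmetric_x, [x ** 2 for x in symmetric_x])
r_one_sided = pearson_r(one_sided_x, [x ** 2 for x in one_sided_x])
print(r_symmetric, round(r_one_sided, 3))
```

In both cases y is determined exactly by x, so the value of r reflects only how well a straight line describes the points, not whether a relationship exists.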
Spearman’s coefficient of correlation

There are times when data cannot be expressed in terms of numbers directly. For example, an audio
engineer might be asked to give an opinion on the quality of sound produced by three sets of speakers.
The results will represent a judgement made by the engineer. The engineer could adopt a set of
criteria including, for example, the clarity of the treble, the power of the bass and the ability of the
speakers to distinguish between instruments. Suppose the results are as follows:
Test Item Rating Rank Order
Speaker Pair B 9/10 1
Speaker Pair A 8/10 2
Speaker Pair C 5/10 3
Note that the results are not numeric in an arithmetic sense so you cannot do meaningful arithmetic
using the results. In order to see this, just ask what a calculation based on the ranks such as
$$\frac{1 + 2^2}{3}$$
would actually mean. The answer is, of course, nothing!
During your career as an engineer you may be asked to rank data in a similar way to that outlined
above. You may be asked to assess the work of colleagues for promotion purposes or give an opinion
on the visual appeal of alternative designs of manufactured objects such as mobile telephones, food
containers or television sets.
Assigning numbers to data in order of size (often called ranking methods) can also be useful if one
does not wish to make assumptions about the nature of the distributions underlying the data. (For
example whenever at least one of the distributions describing the behaviour of the variables may not
be normal.) In order to check the level of correlation between results obtained by ranking data we
calculate Spearman’s coefficient of correlation.
Key Point 6
Spearman’s Coefficient of Correlation, R
$$R = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$
where $D = R_X - R_Y$ is the difference of the rank $R_X$ of an item according to variable X and rank
$R_Y$ of the item according to variable Y.
The formula indicates that the differences of each pair of ranked values are to be found, squared
and summed. It is worth noting that even though it is not obvious, Spearman’s coefficient is just
Pearson’s coefficient applied to ranks.
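The remark that Spearman’s coefficient is Pearson’s coefficient applied to ranks can be checked numerically when there are no tied ranks. The sketch below uses two hypothetical sets of ranks and hypothetical function names.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from the sums used in Key Point 4."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

def spearman_from_ranks(rx, ry):
    """Spearman's R from the formula in Key Point 6 (assumes no tied ranks)."""
    n = len(rx)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks of five items under two criteria.
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]

R = spearman_from_ranks(rank_x, rank_y)
r_on_ranks = pearson_r(rank_x, rank_y)
print(R, r_on_ranks)
```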
The calculation of Spearman’s R

The following worked example illustrates the procedure.
Example 5
A production engineer is asked to grade, on the basis of 12 criteria A to L, a
junior colleague who has applied for promotion. In order to try to ensure that he
treats the colleague fairly, the engineer repeats his gradings after a few days. On
the basis of the results below, can you conclude that the results are consistent?
The gradings are percentages.
Criterion   First Grading (X)   RX   Second Grading (Y)   RY
A 55 8 75 7
B 53 9 80 6
C 78 3 89 4
D 50 10 63 11
E 48 11 67 10
F 61 7 69 9
G 66 6 73 8
H 76 4 93 2
I 85 2 87 5
J 90 1 95 1
K 69 5 92 3
L 45 12 59 12
Solution
The calculation may be set out as follows:
Criterion RX RY D = RX − RY D2
A 8 7 1 1
B 9 6 3 9
C 3 4 −1 1
D 10 11 −1 1
E 11 10 1 1
F 7 9 −2 4
G 6 8 −2 4
H 4 2 2 4
I 2 5 −3 9
J 1 1 0 0
K 5 3 2 4
L 12 12 0 0
$\sum D^2 = 38$
Substituting in the formula for R gives the value
$$R = 1 - \frac{6 \times 38}{12 \times 143} = 0.87$$
Note that we have not made any attempt to interpret the meaning of this figure of 0.87. Methods
for doing this are discussed below.
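The ranking and the calculation of R in Example 5 can also be sketched in code. The function names are hypothetical, and the ranking function assumes that there are no tied gradings, which is the case for these data.

```python
def ranks_descending(values):
    """Rank 1 for the largest value; assumes no tied values."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_r(xs, ys):
    """Return (sum of D^2, Spearman's R) using the formula in Key Point 6."""
    rx, ry = ranks_descending(xs), ranks_descending(ys)
    n = len(xs)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rx, ry))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Gradings for criteria A to L from Example 5.
first = [55, 53, 78, 50, 48, 61, 66, 76, 85, 90, 69, 45]
second = [75, 80, 89, 63, 67, 69, 73, 93, 87, 95, 92, 59]

sum_d2, R = spearman_r(first, second)
print(sum_d2, round(R, 2))  # 38 0.87
```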
The significance of Spearman’s R

Like Pearson’s r, the value of R may be shown to lie in the range −1 ≤ R ≤ 1 and in order to test
the significance of a calculated value of R we set up the hypotheses
H0 : ρ = 0    H1 : ρ ≠ 0
Key Point 7
Significance of Spearman’s R
We wish to know whether our correlation coefficient is significantly different to zero. It can be
shown that for large samples, the test statistic
$$R_{test} = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}}$$
calculated from a sample of n pairs of values, follows a t-distribution with n − 2 degrees of freedom.
Critical values depending on the level of significance required are read from t-tables. When dealing
with Spearman’s coefficient of correlation, the size of the sample is important. Different authors
recommend different minimum sample sizes, a common recommendation being a minimum of n = 10.
Even though they are not used here, you should note that tables are available which allow us to read
critical values corresponding to small sample sizes.
Example 6
A production engineer is asked to grade, on the basis of 12 criteria (say) A to L
a junior colleague who has applied for promotion. He repeats his gradings after a
few days. The results (calculated in Example 5) gave a value of R = 0.87. Test
at the 5% level to determine whether the results are consistent.
Solution
The calculation is:
$$R_{test} = \frac{R\sqrt{n-2}}{\sqrt{1-R^2}} = \frac{0.87 \times \sqrt{12-2}}{\sqrt{1-0.87^2}} = 5.580$$
The 5% critical value for a two sided test read from tables is 2.228 and since 5.580 > 2.228 we
conclude that we must reject the null hypothesis that the correlation coefficient is zero.
Task
As a result of two tests given to 10 students studying laboratory safety, the students
were placed in the following class order.
Student Test 1 Test 2
A 2 3
B 4 5
C 3 7
D 5 9
E 1 10
F 6 2
G 8 6
H 7 8
I 9 4
J 10 1
Use Spearman’s R to discuss the consistency of their performances. Can you
make any meaningful comment regarding the two tests as a means of assessing
laboratory safety?
Your solution
Answer
Setting up the hypotheses
H0 : ρ = 0    H1 : ρ ≠ 0
and doing the appropriate calculations using a spreadsheet gives:
Test 1 Test 2 D D2
2 3 −1 1
4 5 −1 1
3 7 −4 16
5 9 −4 16
1 10 −9 81
6 2 4 16
8 6 2 4
7 8 −1 1
9 4 5 25
10 1 9 81
$\sum D^2 = 242$
R = −0.4666667, Rtest = 1.49240501
From t-tables it may be seen that the critical value (8 degrees of freedom) at the 5% level of
significance is 2.306. Since 1.492 < 2.306 we cannot reject the null hypothesis that there is no
correlation between the results. This implies that the performances of the students on the tests may
not be related and we should question at least one of the tests as a means of assessing laboratory
safety. One could, of course, question the usefulness of both tests!
Task
As part of an educational research project, twelve engineering students were given
an intelligence test (IQ score) at the start of their first year course. At the end of
the first year their results in engineering science (ES score) were noted down on
the expectation that they would correlate with the results of the intelligence test.
The results were as follows:
Student 1 2 3 4 5 6 7 8 9 10 11 12
IQ Score 135 120 125 135 125 140 135 140 135 140 120 135
ES Score 85 74 76 90 85 87 94 98 81 91 76 74
Calculate Pearson’s r for these data. Can you conclude that there is a linear
relationship between IQ scores and ES scores? You may assume that the IQ scores
and the ES scores are each normally distributed.
Your solution
Answer
Setting up the hypotheses
H0 : ρ = 0    H1 : ρ ≠ 0
and doing the appropriate calculations using a spreadsheet gives:
IQ(x) ES(y) xy x2 y2
135 85 11475 18225 7225
120 74 8880 14400 5476
125 76 9500 15625 5776
135 90 12150 18225 8100
125 85 10625 15625 7225
140 87 12180 19600 7569
135 94 12690 18225 8836
140 98 13720 19600 9604
135 81 10935 18225 6561
140 91 12740 19600 8281
120 76 9120 14400 5776
135 74 9990 18225 5476
$\sum x = 1585$, $\sum y = 1011$, $\sum xy = 134005$, $\sum x^2 = 209975$, $\sum y^2 = 85905$
r = 0.696, rtest = 3.065
From t-tables it may be seen that the critical value (10 degrees of freedom) for a two-sided test at the 5% level of
significance is 2.228. Since 3.065 > 2.228 we reject the null hypothesis that there is no linear
association between the results. This implies that the performances of the students on the ES tests
are linearly related to their IQ scores.
Table 1: Upper 5% points of the F distribution
Entries are the values f0.05,u,v which cut off an upper tail area of 5%; columns give u and rows give v.
v \ u 1 2 3 4 5 6 7 8 9 10 20 30 40 60 ∞
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5 241.9 248.0 250.1 251.1 252.2 254.3
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 19.45 19.46 19.47 19.48 19.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.66 8.62 8.59 8.55 8.53
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.80 5.75 5.72 5.69 5.63
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.56 4.50 4.46 4.43 4.36
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 3.87 3.81 3.77 3.74 3.67
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.44 3.38 3.34 3.30 3.23
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.15 3.08 3.04 3.01 2.93
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 2.94 2.86 2.83 2.79 2.71
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.77 2.70 2.66 2.62 2.54
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.65 2.57 2.53 2.49 2.40
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.54 2.47 2.43 2.38 2.30
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.46 2.38 2.34 2.30 2.21
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.39 2.31 2.27 2.22 2.13
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.33 2.25 2.20 2.16 2.07
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.28 2.19 2.15 2.11 2.01
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 2.23 2.15 2.10 2.06 1.96
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.19 2.11 2.06 2.02 1.92
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 2.16 2.07 2.03 1.98 1.88
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.12 2.04 1.99 1.95 1.84
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.10 2.01 1.96 1.92 1.81
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 2.07 1.98 1.94 1.89 1.78
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 2.05 1.96 1.91 1.86 1.76
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.03 1.94 1.89 1.84 1.73
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 2.01 1.92 1.87 1.82 1.71
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 1.99 1.90 1.85 1.80 1.69
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20 1.97 1.88 1.84 1.79 1.67
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 1.96 1.87 1.82 1.77 1.65
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18 1.94 1.85 1.81 1.75 1.64
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 1.93 1.84 1.79 1.74 1.62
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 1.84 1.74 1.69 1.64 1.51
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 1.75 1.65 1.59 1.53 1.39
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 1.57 1.46 1.39 1.32 1.00
Table 2: Critical points of Student’s t distribution
Entries are the values tα,ν; columns give α and rows give ν.
ν \ α .40 .25 .10 .05 .025 .01 .005 .0025 .001 .0005
1 .325 1.000 3.078 6.314 12.706 31.821 63.657 127.32 318.31 636.62
2 .289 .816 1.886 2.920 4.303 6.965 9.925 14.089 22.327 31.598
3 .277 .765 1.638 2.353 3.182 4.541 5.841 7.453 10.213 12.924
4 .271 .741 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 .267 .727 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 .265 .718 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 .263 .711 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 .262 .706 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 .261 .703 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 .260 .700 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 .260 .697 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 .259 .695 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 .259 .694 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 .258 .692 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 .258 .691 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 .258 .690 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 .257 .689 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 .257 .688 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 .257 .688 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 .257 .687 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 .257 .686 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 .256 .686 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 .256 .685 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.767
24 .256 .685 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 .256 .684 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 .256 .684 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 .256 .684 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 .256 .683 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 .256 .683 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 .256 .683 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 .255 .681 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
60 .254 .679 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
120 .254 .677 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
∞ .253 .674 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291
Contents 44
Analysis of Variance
44.1 One-Way Analysis of Variance 2
44.2 Two-Way Analysis of Variance 15
44.3 Experimental Design 40
Learning Outcomes
In this Workbook you will learn the basics of this very important branch of Statistics and
how to do the calculations which enable you to draw conclusions about variance found in
data sets. You will also be introduced to the design of experiments which has great
importance in science and engineering.
44.1 One-Way Analysis of Variance

Introduction
Problems in engineering often involve the exploration of the relationships between values taken by
a variable under different conditions. Workbook 41 introduced hypothesis testing which enables us to
compare two population means using hypotheses of the general form
H0 : µ1 = µ2
H1 : µ1 ≠ µ2
or, in the case of more than two populations,
H0 : µ1 = µ2 = µ3 = . . . = µk
H1 : H0 is not true
If we are comparing more than two population means, using the type of hypothesis testing referred
to above gets very clumsy and very time consuming. As you will see, the statistical technique called
Analysis of Variance (ANOVA) enables us to compare several populations simultaneously. We
might, for example, need to compare the shear strengths of five different adhesives or the surface
toughness of six samples of steel which have received different surface hardening treatments.

Prerequisites
Before starting this Section you should . . .
• be familiar with the general techniques of hypothesis testing
• be familiar with the F-distribution

Learning Outcomes
• describe what is meant by the term one-way ANOVA.
• perform one-way ANOVA calculations.
• interpret the results of one-way ANOVA calculations
1. One-way ANOVA
In this Workbook we deal with one-way analysis of variance (one-way ANOVA) and two-way analysis of
variance (two-way ANOVA). One-way ANOVA enables us to compare several means simultaneously
by using the F -test and enables us to draw conclusions about the variance present in the set of
samples we wish to compare.
Multiple (greater than two) samples may be investigated using the techniques of two-population
hypothesis testing. As an example, it is possible to do a comparison looking for variation in the
surface hardness present in (say) three samples of steel which have received different surface hardening
treatments by using hypothesis tests of the form
H0 : µ1 = µ2
H1 : µ1 ≠ µ2
We would have to compare all possible pairs of samples before reaching a conclusion. If we are
dealing with three samples we would need to perform a total of
\[ {}^{3}C_{2} = \frac{3!}{1!\,2!} = 3 \]
hypothesis tests. From a practical point of view this is not an efficient way of dealing with the
problem, especially since the number of tests required rises rapidly with the number of samples
involved. For example, an investigation involving ten samples would require
\[ {}^{10}C_{2} = \frac{10!}{8!\,2!} = 45 \]
separate hypothesis tests.
There is also another crucially important reason why techniques involving such batteries of tests are
unacceptable. In the case of 10 samples mentioned above, if the probability of correctly accepting a
given null hypothesis is 0.95, then the probability of correctly accepting the null hypothesis
H0 : µ1 = µ2 = . . . = µ10
is $(0.95)^{45} \approx 0.10$ and we have only a 10% chance of correctly accepting the null hypothesis for
all 45 tests. Clearly, such a low success rate is unacceptable. These problems may be avoided by
simultaneously testing the significance of the difference between a set of more than two population
means by using techniques known as the analysis of variance.
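The two counts and the 10% figure above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the probability treats the pairwise tests as independent, which is the simplification the calculation in the text relies on.

```python
from math import comb


def pairwise_test_count(k: int) -> int:
    """Number of two-sample tests needed to compare k samples pair by pair."""
    return comb(k, 2)


def prob_all_accepted(k: int, p_single: float = 0.95) -> float:
    """Chance that every pairwise test correctly accepts its null hypothesis,
    treating the tests as independent (an assumption)."""
    return p_single ** pairwise_test_count(k)


for k in (3, 10):
    print(k, pairwise_test_count(k), round(prob_all_accepted(k), 3))
```

Increasing k shows how quickly the number of tests grows and how quickly the overall probability falls.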
Essentially, we look at the variance between samples and the variance within samples and draw
conclusions from the results. Note that the variation between samples is due to assignable (or
controlled) causes, often referred to in general as treatments, while the variation within samples is due
to chance. In the example above concerning the surface hardness present in three samples of steel
which have received different surface hardening treatments, the following diagrams illustrate the
differences which may occur when between sample and within sample variation is considered.
Case 1
In this case the variation within samples is roughly on a par with that occurring between samples.
Figure 1: Sample 1, Sample 2 and Sample 3, with the sample means s̄1, s̄2 and s̄3 marked
Case 2
In this case the variation within samples is considerably less than that occurring between samples.
Figure 2: Sample 1, Sample 2 and Sample 3, with the sample means s̄1, s̄2 and s̄3 marked
We argue that the greater the variation present between samples in comparison with the variation
present within samples the more likely it is that there are ‘real’ differences between the population
means, say µ1 , µ2 and µ3 . If such ‘real’ differences are shown to exist at a sufficiently high level
of significance, we may conclude that there is sufficient evidence to enable us to reject the null
hypothesis H0 : µ1 = µ2 = µ3 .
Example of variance in data

This example looks at variance in data. Four machines are set up to produce alloy spacers for use in
the assembly of microlight aircraft. The spacers are supposed to be identical but the four machines
give rise to the following varied lengths in mm.
Machine A Machine B Machine C Machine D
46 56 55 49
54 55 51 53
48 56 50 57
46 60 51 60
56 53 53 51
Since the machines are set up to produce identical alloy spacers it is reasonable to ask if the evidence
we have suggests that the machine outputs are the same or different in some way. We are really
asking whether the sample means, say X̄A , X̄B , X̄C and X̄D , are different because of differences in
the respective population means, say µA , µB , µC and µD , or whether the differences in X̄A , X̄B , X̄C
and X̄D may be attributed to chance variation. Stated in terms of a hypothesis test, we would write
H 0 : µA = µB = µC = µD
H1 : At least one mean is different from the others
In order to decide between the hypotheses, we calculate the mean of each sample and the overall mean
(the mean of the means) and use these quantities to calculate the variation present between the
samples. We then calculate the variation present within samples. The following tables illustrate the
calculations.
Machine A Machine B Machine C Machine D
46 56 55 49
54 55 51 53
48 56 50 57
46 60 51 60
56 53 53 51
X̄A = 50 X̄B = 56 X̄C = 52 X̄D = 54

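The mean of the means is clearly
\[ \bar{\bar{X}} = \frac{50 + 56 + 52 + 54}{4} = 53 \]
so the variation present between samples may be calculated as
\[ S_{Tr}^2 = \frac{1}{n-1} \sum_{i=A}^{D} \left( \bar{X}_i - \bar{\bar{X}} \right)^2 \]
\[ = \frac{1}{4-1} \left[ (50-53)^2 + (56-53)^2 + (52-53)^2 + (54-53)^2 \right] \]
\[ = \frac{20}{3} = 6.67 \text{ to 2 d.p.} \]
In this formula n = 4 is the number of sample means being compared.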
Note that the notation $S_{Tr}^2$ reflects the general use of the word ‘treatment’ to describe assignable
causes of variation between samples. This notation is not universal but it is fairly common.
Variation within samples
We now calculate the variation due to chance errors present within the samples and use the results to
obtain a pooled estimate of the variance, say $S_E^2$, present within the samples. After this calculation
we will be able to compare the two variances and draw conclusions. The variance present within the
samples may be calculated as follows.
Sample A
\[ \sum (X - \bar{X}_A)^2 = (46-50)^2 + (54-50)^2 + (48-50)^2 + (46-50)^2 + (56-50)^2 = 88 \]
Sample B
\[ \sum (X - \bar{X}_B)^2 = (56-56)^2 + (55-56)^2 + (56-56)^2 + (60-56)^2 + (53-56)^2 = 26 \]
Sample C
\[ \sum (X - \bar{X}_C)^2 = (55-52)^2 + (51-52)^2 + (50-52)^2 + (51-52)^2 + (53-52)^2 = 16 \]
Sample D
\[ \sum (X - \bar{X}_D)^2 = (49-54)^2 + (53-54)^2 + (57-54)^2 + (60-54)^2 + (51-54)^2 = 80 \]
An obvious extension of the formula for a pooled variance gives
\[ S_E^2 = \frac{\sum (X - \bar{X}_A)^2 + \sum (X - \bar{X}_B)^2 + \sum (X - \bar{X}_C)^2 + \sum (X - \bar{X}_D)^2}{(n_A - 1) + (n_B - 1) + (n_C - 1) + (n_D - 1)} \]
where nA , nB , nC and nD represent the number of members (5 in each case here) in each sample.
Note that the quantities comprising the denominator nA − 1, · · · , nD − 1 are the number of degrees
of freedom present in each of the four samples. Hence our pooled estimate of the variance present
within the samples is given by
\[ S_E^2 = \frac{88 + 26 + 16 + 80}{4 + 4 + 4 + 4} = 13.13 \]
We are now in a position to ask whether the variation between samples $S_{Tr}^2$ is large in comparison
with the variation within samples $S_E^2$. The answer to this question enables us to decide whether the
difference in the calculated variations is sufficiently large to conclude that there is a difference in the
population means. That is, do we have sufficient evidence to reject H0 ?
Using the F -test

At first sight it seems reasonable to use the ratio
\[ F = \frac{S_{Tr}^2}{S_E^2} \]
but in fact the ratio
\[ F = \frac{n S_{Tr}^2}{S_E^2}, \]
where n is the sample size, is used since it can be shown that if H0 is true this ratio will have a value
of approximately unity while if H0 is not true the ratio will have a value greater than unity. This is
because the variance of a sample mean is $\sigma^2/n$.
The test procedure (three steps) for the data used here is as follows.
(a) Find the value of F ;
(b) Find the number of degrees of freedom for both the numerator and denominator of the
ratio;
(c) Accept or reject H0 depending on the value of F compared with the appropriate tabulated
value.
Step 1
The value of F is given by
\[ F = \frac{n S_{Tr}^2}{S_E^2} = \frac{5 \times 6.67}{13.13} = 2.54 \]
Step 2
The number of degrees of freedom for $S_{Tr}^2$ (the numerator) is
Number of samples − 1 = 3
The number of degrees of freedom for $S_E^2$ (the denominator) is
Number of samples × (sample size − 1) = 4 × (5 − 1) = 16
Step 3
The critical value (5% level of significance) from the F -tables (Table 1 at the end of this Workbook)
is F(3,16) = 3.24 and since 2.54 < 3.24 we see that we cannot reject H0 on the basis of the evidence
available and conclude that in this case the variation present is due to chance. Note that the test
used is one-tailed.
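The three steps can be checked numerically. The sketch below uses only the Python standard library; the function name `f_ratio_equal_sizes` is an illustrative choice, and the critical value 3.24 is simply the tabulated figure quoted above rather than something the code computes.

```python
from statistics import mean, variance

# Spacer lengths (mm) for the four machines in the example above.
samples = {
    "A": [46, 54, 48, 46, 56],
    "B": [56, 55, 56, 60, 53],
    "C": [55, 51, 50, 51, 53],
    "D": [49, 53, 57, 60, 51],
}


def f_ratio_equal_sizes(groups):
    """Return (F, numerator df, denominator df) for equal-sized samples."""
    groups = list(groups)
    a = len(groups)        # number of samples
    n = len(groups[0])     # observations per sample
    if any(len(g) != n for g in groups):
        raise ValueError("all samples must have the same size")
    # Between samples: variance of the sample means.
    s_tr2 = variance([mean(g) for g in groups])
    # Within samples: pooled variance.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    s_e2 = ss_within / (a * (n - 1))
    return n * s_tr2 / s_e2, a - 1, a * (n - 1)


F, df1, df2 = f_ratio_equal_sizes(samples.values())
CRITICAL_5_PERCENT = 3.24  # F(3,16) value quoted from the tables in the text

print(f"F = {F:.2f} on ({df1}, {df2}) degrees of freedom")
print("reject H0" if F > CRITICAL_5_PERCENT else "cannot reject H0")
```

The degrees of freedom returned by the function are the ones needed to look up the critical value in the F-tables.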
ANOVA tables
It is usual to summarize the calculations we have seen so far in the form of an ANOVA table.
Essentially, the table gives us a method of recording the calculations leading to both the numerator
and the denominator of the expression
\[ F = \frac{n S_{Tr}^2}{S_E^2} \]
In addition, and importantly, ANOVA tables provide us with a useful means of checking the accuracy
of our calculations. A general ANOVA table is presented below with explanatory notes.
Define a = number of treatments, n = number of observations per sample.
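Source of Variation                       Sum of Squares SS   Degrees of Freedom   Mean Square MS   Value of F Ratio
Between samples (due to treatments)       $SS_{Tr}$           a − 1                $MS_{Tr}$        $F = MS_{Tr}/MS_E$
Within samples (due to chance errors)     $SS_E$              a(n − 1)             $MS_E$
TOTALS                                    $SS_T$              an − 1

Between samples (differences between means $\bar{X}_i$ and $\bar{\bar{X}}$):
\[ SS_{Tr} = n \sum_{i=1}^{a} \left( \bar{X}_i - \bar{\bar{X}} \right)^2, \qquad MS_{Tr} = \frac{SS_{Tr}}{a-1} = n S_{\bar{X}}^2, \qquad F = \frac{MS_{Tr}}{MS_E} = \frac{n S_{Tr}^2}{S_E^2} \]
Within samples (differences between individual observations $X_{ij}$ and means $\bar{X}_i$):
\[ SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n} \left( X_{ij} - \bar{X}_i \right)^2, \qquad MS_E = \frac{SS_E}{a(n-1)} = S_E^2 \]
Totals:
\[ SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} \left( X_{ij} - \bar{\bar{X}} \right)^2 \]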
In order to demonstrate this table for the example above we need to calculate
\[ SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} \left( X_{ij} - \bar{\bar{X}} \right)^2 \]
a measure of the total variation present in the data. Such calculations are easily done using a
computer (Microsoft Excel was used here), the result being
\[ SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} \left( X_{ij} - \bar{\bar{X}} \right)^2 = 310 \]
The ANOVA table becomes

Source of Variation                       Sum of Squares SS   Degrees of Freedom   Mean Square MS                               Value of F Ratio
Between samples (due to treatments)       100                 3                    $MS_{Tr} = SS_{Tr}/(a-1) = 100/3 = 33.33$    $F = MS_{Tr}/MS_E = 2.54$
Within samples (due to chance errors)     210                 16                   $MS_E = SS_E/(a(n-1)) = 210/16 = 13.13$
TOTALS                                    310                 19

The between samples row measures differences between the means $\bar{X}_i$ and $\bar{\bar{X}}$; the within
samples row measures differences between individual observations $X_{ij}$ and the means $\bar{X}_i$.
It is possible to show theoretically that

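\[ SS_T = SS_{Tr} + SS_E \]
that is
\[ \sum_{i=1}^{a} \sum_{j=1}^{n} \left( X_{ij} - \bar{\bar{X}} \right)^2 = n \sum_{i=1}^{a} \left( \bar{X}_i - \bar{\bar{X}} \right)^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} \left( X_{ij} - \bar{X}_i \right)^2 \]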
As you can see from the table, $SS_{Tr}$ and $SS_E$ do indeed sum to give $SS_T$ even though we can
calculate them separately. The same is true of the degrees of freedom.
Note that calculating these quantities separately does offer a check on the arithmetic but that using
the relationship can speed up the calculations by obviating the need to calculate (say) $SS_T$. As
you might expect, it is recommended that you check your calculations! However, you should note
that it is usual to calculate $SS_T$ and $SS_{Tr}$ and then find $SS_E$ by subtraction. This saves a lot of
unnecessary calculation but does not offer a check on the arithmetic. This shorter method will be
used throughout much of this Workbook.
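The identity is stated above without proof. The outline below is the standard argument for equal sample sizes, given here as a sketch rather than as a derivation taken from this Workbook: each deviation from the overall mean is split into a within-sample part and a between-sample part, and the cross term vanishes because deviations from a sample mean sum to zero.

```latex
\begin{align*}
X_{ij} - \bar{\bar{X}} &= (X_{ij} - \bar{X}_i) + (\bar{X}_i - \bar{\bar{X}}) \\
\sum_{j=1}^{n} (X_{ij} - \bar{\bar{X}})^2
  &= \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2
   + n(\bar{X}_i - \bar{\bar{X}})^2
   + 2(\bar{X}_i - \bar{\bar{X}}) \sum_{j=1}^{n} (X_{ij} - \bar{X}_i) \\
  &= \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2 + n(\bar{X}_i - \bar{\bar{X}})^2
   \qquad \text{since } \sum_{j=1}^{n} (X_{ij} - \bar{X}_i) = 0 \\
\sum_{i=1}^{a}\sum_{j=1}^{n} (X_{ij} - \bar{\bar{X}})^2
  &= \underbrace{n \sum_{i=1}^{a} (\bar{X}_i - \bar{\bar{X}})^2}_{SS_{Tr}}
   + \underbrace{\sum_{i=1}^{a}\sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2}_{SS_E}
\end{align*}
```

For the spacer data this reads 310 = 100 + 210.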
Unequal sample sizes

So far we have assumed that the number of observations in each sample is the same. This is not a
necessary condition for the one-way ANOVA.
Key Point 1
Suppose that the number of samples is a and the numbers of observations are n1 , n2 , . . . , na . Then
the between-samples sum of squares can be calculated using
\[ SS_{Tr} = \sum_{i=1}^{a} \frac{T_i^2}{n_i} - \frac{G^2}{N} \]
where $T_i$ is the total for sample i, $G = \sum_{i=1}^{a} T_i$ is the overall total and $N = \sum_{i=1}^{a} n_i$.
It has a − 1 degrees of freedom.

The total sum of squares can be calculated as before, or using
\[ SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n_i} X_{ij}^2 - \frac{G^2}{N} \]
It has N − 1 degrees of freedom.

The within-samples sum of squares can be found by subtraction:
\[ SS_E = SS_T - SS_{Tr} \]
It has (N − 1) − (a − 1) = N − a degrees of freedom.
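As a check that these formulas agree with the earlier method, they can be applied to the spacer data from the four machines, where every sample has five observations. The totals below were worked out from that data for this illustration; they are not quoted in the Workbook.

```latex
\begin{align*}
T_A &= 250, \quad T_B = 280, \quad T_C = 260, \quad T_D = 270, \quad G = 1060, \quad N = 20 \\
SS_{Tr} &= \frac{250^2 + 280^2 + 260^2 + 270^2}{5} - \frac{1060^2}{20} = 56280 - 56180 = 100 \\
SS_T &= \sum_{i}\sum_{j} X_{ij}^2 - \frac{G^2}{N} = 56490 - 56180 = 310 \\
SS_E &= SS_T - SS_{Tr} = 310 - 100 = 210
\end{align*}
```

These are the same values as in the ANOVA table obtained from the deviations, with a − 1 = 3, N − a = 16 and N − 1 = 19 degrees of freedom.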
Task
Three fuel injection systems are tested for efficiency and the following coded data
are obtained.
System 1 System 2 System 3
48 60 57
56 56 55
46 53 52
45 60 50
50 51 51
Do the data support the hypothesis that the systems offer equivalent levels of efficiency?
Your solution
Answer
Appropriate hypotheses are
H0 : µ1 = µ2 = µ3
H1 : At least one mean is different to the others
Variation between samples
System 1 System 2 System 3
48 60 57
56 56 55
46 53 52
45 60 50
50 51 51
X̄1 = 49 X̄2 = 56 X̄3 = 53
The mean of the means is
\[ \bar{\bar{X}} = \frac{49 + 56 + 53}{3} = 52.67 \]
and the variation present between samples is
\[ S_{Tr}^2 = \frac{1}{n-1} \sum_{i=1}^{3} \left( \bar{X}_i - \bar{\bar{X}} \right)^2 = \frac{1}{3-1} \left[ (49-52.67)^2 + (56-52.67)^2 + (53-52.67)^2 \right] = 12.33 \]
Variation within samples
System 1
\[ \sum (X - \bar{X}_1)^2 = (48-49)^2 + (56-49)^2 + (46-49)^2 + (45-49)^2 + (50-49)^2 = 76 \]
System 2
\[ \sum (X - \bar{X}_2)^2 = (60-56)^2 + (56-56)^2 + (53-56)^2 + (60-56)^2 + (51-56)^2 = 66 \]
System 3
\[ \sum (X - \bar{X}_3)^2 = (57-53)^2 + (55-53)^2 + (52-53)^2 + (50-53)^2 + (51-53)^2 = 34 \]
Hence
\[ S_E^2 = \frac{\sum (X - \bar{X}_1)^2 + \sum (X - \bar{X}_2)^2 + \sum (X - \bar{X}_3)^2}{(n_1 - 1) + (n_2 - 1) + (n_3 - 1)} = \frac{76 + 66 + 34}{4 + 4 + 4} = 14.67 \]
The value of F is given by
\[ F = \frac{n S_{Tr}^2}{S_E^2} = \frac{5 \times 12.33}{14.67} = 4.20 \]
The number of degrees of freedom for $S_{Tr}^2$ is No. of samples − 1 = 2
The number of degrees of freedom for $S_E^2$ is No. of samples × (sample size − 1) = 12
The critical value (5% level of significance) from the F -tables (Table 1 at the end of this Workbook)
is F(2,12) = 3.89 and since 4.20 > 3.89 we conclude that we have sufficient evidence to reject H0
so that the injection systems are not of equivalent efficiency.
Exercises
1. The yield of a chemical process, expressed in percentage of the theoretical maximum, is measured
with each of two catalysts, A, B, and with no catalyst (Control: C). Five observations
are made under each condition. Making the usual assumptions for an analysis of variance, test
the hypothesis that there is no difference in mean yield between the three conditions. Use the
5% level of significance.
Catalyst A Catalyst B Control C

79.2 81.5 74.8
80.1 80.7 76.5
77.4 80.5 74.7
77.6 81.7 74.8
77.8 80.6 74.9
2. Four large trucks, A, B, C, D, are used to move stone in a quarry. On a number of days,
the amount of fuel, in litres, used per tonne of stone moved is calculated for each truck. On
some days a particular truck might not be used. The data are as follows. Making the usual
assumptions for an analysis of variance, test the hypothesis that the mean amount of fuel used
per tonne of stone moved is the same for each truck. Use the 5% level of significance.
Truck Observations
A 0.21 0.21 0.21 0.21 0.20 0.19 0.18 0.21 0.22 0.21
B 0.22 0.22 0.25 0.21 0.21 0.22 0.20 0.23
C 0.21 0.18 0.18 0.19 0.20 0.18 0.19 0.19 0.20 0.20 0.20
D 0.20 0.20 0.21 0.21 0.21 0.19 0.20 0.20 0.21
Answers
1. We calculate the treatment totals for A: 392.1, B: 405.0 and C: 375.7. The overall total is
1172.8 and $\sum\sum y^2 = 91792.68$.
The total sum of squares is
\[ 91792.68 - \frac{1172.8^2}{15} = 95.357 \]
on 15 − 1 = 14 degrees of freedom.
The between treatments sum of squares is
\[ \frac{1}{5} \left( 392.1^2 + 405.0^2 + 375.7^2 \right) - \frac{1172.8^2}{15} = 86.257 \]
5 15
By subtraction, the residual sum of squares is
95.357 − 86.257 = 9.100
The analysis of variance table is as follows:
Source of variation   Sum of squares   Degrees of freedom   Mean square   Variance ratio
Treatment 86.257 2 43.129 56.873
Residual 9.100 12 0.758
Total 95.357 14
The upper 5% point of the $F_{2,12}$ distribution is 3.89. The observed variance ratio is greater
than this so we conclude that the result is significant at the 5% level and we reject the null
hypothesis at this level. The evidence suggests that there are differences in the mean yields
between the three treatments.
Answer
2. We can summarise the data as follows.
Truck   $\sum y$   $\sum y^2$   n
A 2.05 0.4215 10
B 1.76 0.3888 8
C 2.12 0.4096 11
D 1.83 0.3725 9
Total 7.76 1.5924 38
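The total sum of squares is
\[ 1.5924 - \frac{7.76^2}{38} = 7.7263 \times 10^{-3} \]
The between trucks sum of squares is
\[ \frac{2.05^2}{10} + \frac{1.76^2}{8} + \frac{2.12^2}{11} + \frac{1.83^2}{9} - \frac{7.76^2}{38} = 3.4581 \times 10^{-3} \]
By subtraction, the residual sum of squares is
\[ 7.7263 \times 10^{-3} - 3.4581 \times 10^{-3} = 4.2682 \times 10^{-3} \]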
The analysis of variance table is as follows:

Source of variation   Sum of squares    Degrees of freedom   Mean square       Variance ratio
Trucks                3.4581 × 10⁻³     3                    1.1527 × 10⁻³     9.1824
Residual              4.2682 × 10⁻³     34                   0.1255 × 10⁻³
Total                 7.7263 × 10⁻³     37
The upper 5% point of the $F_{3,34}$ distribution is approximately 2.9. The observed variance
ratio is greater than this so we conclude that the result is significant at the 5% level and we
reject the null hypothesis at this level. The evidence suggests that there are differences in the
mean fuel consumption per tonne moved between the four trucks.
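The truck data have unequal sample sizes, so they make a convenient check of the totals-based formulas. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the Workbook.

```python
# Fuel used per tonne of stone moved (litres), from Exercise 2.
trucks = {
    "A": [0.21, 0.21, 0.21, 0.21, 0.20, 0.19, 0.18, 0.21, 0.22, 0.21],
    "B": [0.22, 0.22, 0.25, 0.21, 0.21, 0.22, 0.20, 0.23],
    "C": [0.21, 0.18, 0.18, 0.19, 0.20, 0.18, 0.19, 0.19, 0.20, 0.20, 0.20],
    "D": [0.20, 0.20, 0.21, 0.21, 0.21, 0.19, 0.20, 0.20, 0.21],
}


def one_way_anova(samples):
    """One-way ANOVA from sample totals; sample sizes may differ."""
    samples = list(samples)
    a = len(samples)                          # number of samples
    N = sum(len(s) for s in samples)          # total number of observations
    G = sum(sum(s) for s in samples)          # overall total
    correction = G ** 2 / N
    ss_tr = sum(sum(s) ** 2 / len(s) for s in samples) - correction
    ss_t = sum(x ** 2 for s in samples for x in s) - correction
    ss_e = ss_t - ss_tr                       # found by subtraction
    ms_tr = ss_tr / (a - 1)
    ms_e = ss_e / (N - a)
    return {"SSTr": ss_tr, "SSE": ss_e, "SST": ss_t,
            "df": (a - 1, N - a), "F": ms_tr / ms_e}


result = one_way_anova(trucks.values())
print(result["df"], f"F = {result['F']:.2f}")
```

The variance ratio still has to be compared with the tabulated 5% point, which the code does not look up.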
44.2 Two-Way Analysis of Variance
Introduction
In the one-way analysis of variance (Section 44.1) we consider the effect of one factor on the values
taken by a variable. Very often, in engineering investigations, the effects of two or more factors are
considered simultaneously.
The two-way ANOVA deals with the case where there are two factors. For example, we might
compare the fuel consumptions of four car engines under three types of driving conditions (e.g.
urban, rural, motorway). Sometimes we are interested in the effects of both factors. In other cases
one of the factors is a ‘nuisance factor’ which is not of particular interest in itself but, if we allow for
it in our analysis, we improve the power of our test for the other factor.
We can also allow for interaction effects between the two factors.
Prerequisites
• be familiar with the general techniques of hypothesis testing
• be familiar with the F-distribution
• be familiar with the one-way ANOVA calculations

Learning Outcomes
• state the concepts and terminology of two-way ANOVA
• perform two-way ANOVA
• interpret the results of two-way ANOVA calculations
1. Two-way ANOVA without interaction
The previous Section considered a one-way classification analysis of variance, that is we looked at the
variations induced by one set of values of a factor (or treatments as we called them) by partitioning
the variation in the data into components representing ‘between treatments’ and ‘within treatments.’
In this Section we will look at the analysis of variance involving two factors or, as we might say,
two sets of treatments. In general terms, if we have two factors say A and B, there is no absolute
reason to assume that there is no interaction between the factors. However, as an introduction to
the two-way analysis of variance, we will consider the case occurring when there is no interaction
between factors and an experiment is run only once. Note that some authors take the view that
interaction may occur and that the residual sum of squares contains the effects of this interaction
even though the analysis does not, at this stage, allow us to separate it out and check its possible
effects on the experiment.
The following example builds on the previous example where we looked at the one-way analysis of
variance.
Example of variance in data

In Section 44.1 we considered an example concerning four machines producing alloy spacers. This
time we introduce an extra factor by considering both the machines producing the spacers and the
performance of the operators working with the machines. In this experiment, the data appear as
follows (spacer lengths in mm). Each operator made one spacer with each machine.
Operator Machine 1 Machine 2 Machine 3 Machine 4
1 46 56 55 47
2 54 55 51 56
3 48 56 50 58
4 46 60 51 59
5 51 53 53 55
In a case such as this we are looking for discernible differences between the operators (‘operator
effects’) on the one hand and the machines (‘machine effects’) on the other.
We suppose that the observation for operator i and machine j is taken from a normal distribution
with mean
µij = µ + αi + βj
Here αi is an operator effect and βj is a machine effect. Our hypotheses may be stated as follows.

Operator Effects
H0 : µ1j = µ2j = µ3j = µ4j = µ5j = µ + βj
That is α1 = α2 = α3 = α4 = α5 = 0
H1 : At least one of the operator effects is different to the others

Machine Effects
H0 : µi1 = µi2 = µi3 = µi4 = µ + αi
That is β1 = β2 = β3 = β4 = 0
H1 : At least one of the machine effects is different to the others

Note that the five operators and four machines give rise to data which has only one observation per
‘cell.’ For example, operator 2 using machine 3 produces a spacer 51 mm long, while operator 1 using
machine 2 produces a spacer which is 56 mm long. Note also that in this example we have referred
to the machines by number and not by letter. This is not particularly important but it will simplify
some of the notation used when we come to write out a general two-way ANOVA table shortly. We
obtain one observation per cell and cannot measure variation within a cell. In this case we cannot
check for interaction between the operator and the machine - the two factors used in this example.
Running an experiment several times results in multiple observations per cell and in this case we
should assume that there may be interaction between the factors and check for this. In the case
considered here (no interaction between factors), the required sums of squares build easily on the
relationship used in the one-way analysis of variance
SST = SST r + SSE
to become
SST = SSA + SSB + SSE
where SSA and SSB represent the sums of squares corresponding to factors A and B. In order to calculate
the required sums of squares we lay out the table slightly more efficiently as follows.
Operator (j)    Machine (i)              Operator Means              $(\bar{X}_{\cdot j} - \bar{\bar{X}})$    Operator SS $(\bar{X}_{\cdot j} - \bar{\bar{X}})^2$
                1     2     3     4      $(\bar{X}_{\cdot j})$
1               46    56    55    47     51                          −2                                      4
2               54    55    51    56     54                          1                                       1
3               48    56    50    58     53                          0                                       0
4               46    60    51    59     54                          1                                       1
5               51    53    53    55     53                          0                                       0
Machine Means $(\bar{X}_{i\cdot})$                        49    56    52    55     $\bar{\bar{X}} = 53$    Sum = 0    6 × 4 = 24
$(\bar{X}_{i\cdot} - \bar{\bar{X}})$                      −4    3     −1    2      Sum = 0
Machine SS $(\bar{X}_{i\cdot} - \bar{\bar{X}})^2$         16    9     1     4      30 × 5 = 150
Note 1
The . notation means that summation takes place over that variable. For example, the five operator
means $\bar{X}_{\cdot j}$ are obtained as
\[ \bar{X}_{\cdot 1} = \frac{46 + 56 + 55 + 47}{4} = 51 \]
and so on, while the four machine means $\bar{X}_{i\cdot}$ are obtained as
\[ \bar{X}_{1\cdot} = \frac{46 + 54 + 48 + 46 + 51}{5} = 49 \]
and so on. Put more generally (and this is just an example)
\[ \bar{X}_{\cdot j} = \frac{\sum_{i=1}^{m} x_{ij}}{m} \]
Note 2
Multiplying factors were used in the calculation of the operator sum of squares (four in this case
since each operator mean is taken over four machines) and the machine sum of squares (five in this
case since each machine mean is taken over five operators).
Note 3
The two statements ‘Sum = 0’ are included purely as arithmetic checks.
We also know that SSO = 24 and SSM = 150.
Calculating the error sum of squares
Note that the total sum of squares is easy to obtain and that the error sum of squares is then obtained
by straightforward subtraction.
The total sum of squares is given by summing the quantities $(X_{ij} - \bar{\bar{X}})^2$ for the table of entries.
Subtracting $\bar{\bar{X}} = 53$ from each table member and squaring gives:
Operator (j) Machine (i)

1 2 3 4
1 49 9 4 36
2 1 4 4 9
3 25 9 9 25
4 49 49 4 36
5 4 0 0 4
The total sum of squares is SST = 330.
The error sum of squares is given by the result
SSE = SST − SSA − SSB

= 330 − 24 − 150
= 156
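The sums of squares in this example can be reproduced directly from the row and column means. The sketch below follows the layout of the table (rows are operators, columns are machines); the variable names are illustrative and only the Python standard library is used.

```python
from statistics import mean

# Rows are operators 1-5, columns are machines 1-4 (spacer lengths in mm).
data = [
    [46, 56, 55, 47],
    [54, 55, 51, 56],
    [48, 56, 50, 58],
    [46, 60, 51, 59],
    [51, 53, 53, 55],
]

operator_means = [mean(row) for row in data]
machine_means = [mean(col) for col in zip(*data)]
grand_mean = mean(x for row in data for x in row)

n_operators = len(operator_means)
n_machines = len(machine_means)

# Each operator mean averages over the machines, so it is weighted by n_machines,
# and each machine mean averages over the operators, so it is weighted by n_operators.
ss_operators = n_machines * sum((m - grand_mean) ** 2 for m in operator_means)
ss_machines = n_operators * sum((m - grand_mean) ** 2 for m in machine_means)
ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
ss_error = ss_total - ss_operators - ss_machines

print(ss_operators, ss_machines, ss_total, ss_error)
```

The weights applied to the two sums are the multiplying factors described in Note 2.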
At this stage we display the general two-way ANOVA table and then particularise the table for the
example we are engaged in and draw conclusions by using the test as we have previously done with
one-way ANOVA.
A General Two-Way ANOVA Table

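Source of Variation                    Sum of Squares SS   Degrees of Freedom   Mean Square MS   Value of F Ratio
Between samples (due to factor A)      $SS_A$              a − 1                $MS_A$           $F = MS_A/MS_E$
Between samples (due to factor B)      $SS_B$              b − 1                $MS_B$           $F = MS_B/MS_E$
Within samples (due to chance errors)  $SS_E$              (a − 1)(b − 1)       $MS_E$
Totals                                 $SS_T$              ab − 1

Factor A (differences between means $\bar{X}_{i\cdot}$ and $\bar{\bar{X}}$):
\[ SS_A = b \sum_{i=1}^{a} \left( \bar{X}_{i\cdot} - \bar{\bar{X}} \right)^2, \qquad MS_A = \frac{SS_A}{a-1} \]
Factor B (differences between means $\bar{X}_{\cdot j}$ and $\bar{\bar{X}}$):
\[ SS_B = a \sum_{j=1}^{b} \left( \bar{X}_{\cdot j} - \bar{\bar{X}} \right)^2, \qquad MS_B = \frac{SS_B}{b-1} \]
Within samples (differences between observations and fitted values):
\[ SS_E = \sum_{i=1}^{a} \sum_{j=1}^{b} \left( X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{\bar{X}} \right)^2, \qquad MS_E = \frac{SS_E}{(a-1)(b-1)} \]
Totals:
\[ SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} \left( X_{ij} - \bar{\bar{X}} \right)^2 \]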
Hence the two-way ANOVA table for the example under consideration is

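Source of Variation                    Sum of Squares SS   Degrees of Freedom   Mean Square MS   Value of F Ratio
Between samples (due to factor A)      24                  4                    24/4 = 6         F = 6/13 = 0.46
Between samples (due to factor B)      150                 3                    150/3 = 50       F = 50/13 = 3.85
Within samples                         156                 12                   156/12 = 13
TOTALS                                 330                 19

The factor A row measures differences between the means $\bar{X}_{i\cdot}$ and $\bar{\bar{X}}$, the factor B row
measures differences between the means $\bar{X}_{\cdot j}$ and $\bar{\bar{X}}$, and the within samples row measures
differences between observations and fitted values.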
From the F -tables (at the end of the Workbook) F4,12 = 3.26 and F3,12 = 3.49. Since 0.46 < 3.26
we conclude that we do not have sufficient evidence to reject the null hypothesis that there is no
difference between the operators. Since 3.85 > 3.49 we conclude that we do have sufficient evidence
at the 5% level of significance to reject the null hypothesis that there is no difference between the
machines.
Key Point 2
If we have two factors, A and B, with a levels of factor A and b levels of factor B, and one
observation per cell, we can calculate the sum of squares as follows.
The sum of squares for factor A is
\[ SS_A = \frac{1}{b} \sum_{i=1}^{a} A_i^2 - \frac{G^2}{N} \quad \text{with } a - 1 \text{ degrees of freedom} \]
and the sum of squares for factor B is
\[ SS_B = \frac{1}{a} \sum_{j=1}^{b} B_j^2 - \frac{G^2}{N} \quad \text{with } b - 1 \text{ degrees of freedom} \]
where
$A_i = \sum_{j=1}^{b} X_{ij}$ is the total for level i of factor A,
$B_j = \sum_{i=1}^{a} X_{ij}$ is the total for level j of factor B,
$G = \sum_{i=1}^{a} \sum_{j=1}^{b} X_{ij}$ is the overall total of the data, and
N = ab is the total number of observations.

The total sum of squares is
\[ SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} X_{ij}^2 - \frac{G^2}{N} \quad \text{with } N - 1 \text{ degrees of freedom} \]
The within-samples, or ‘error’, sum of squares can be found by subtraction. So

SSE = SST − SSA − SSB
with
(N − 1) − (a − 1) − (b − 1) = (ab − 1) − (a − 1) − (b − 1)
= (a − 1)(b − 1) degrees of freedom
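The formulas in this Key Point translate almost line for line into code. The sketch below is one possible arrangement, with `table[i][j]` holding the single observation at level i of factor A and level j of factor B; the function name and dictionary keys are illustrative. It is applied to the operator and machine data from the earlier example, with operators as factor A and machines as factor B.

```python
def two_way_anova_no_interaction(table):
    """Two-way ANOVA with one observation per cell, computed from totals."""
    a, b = len(table), len(table[0])
    N = a * b
    A_totals = [sum(row) for row in table]          # A_i
    B_totals = [sum(col) for col in zip(*table)]    # B_j
    G = sum(A_totals)                               # overall total
    correction = G ** 2 / N
    ss_a = sum(t ** 2 for t in A_totals) / b - correction
    ss_b = sum(t ** 2 for t in B_totals) / a - correction
    ss_t = sum(x ** 2 for row in table for x in row) - correction
    ss_e = ss_t - ss_a - ss_b                       # found by subtraction
    df_a, df_b, df_e = a - 1, b - 1, (a - 1) * (b - 1)
    ms_e = ss_e / df_e
    return {
        "SSA": ss_a, "SSB": ss_b, "SSE": ss_e, "SST": ss_t,
        "df": (df_a, df_b, df_e),
        "F_A": (ss_a / df_a) / ms_e,
        "F_B": (ss_b / df_b) / ms_e,
    }


# Rows are operators 1-5 (factor A), columns are machines 1-4 (factor B).
spacers = [
    [46, 56, 55, 47],
    [54, 55, 51, 56],
    [48, 56, 50, 58],
    [46, 60, 51, 59],
    [51, 53, 53, 55],
]
result = two_way_anova_no_interaction(spacers)
print(result["df"], round(result["F_A"], 2), round(result["F_B"], 2))
```

Working from totals avoids computing the means first, which is convenient for hand calculation as well.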
Task
A vehicle manufacturer wishes to test the ability of three types of steel-alloy panels
to resist corrosion when three different paint types are applied. Three panels with
differing steel-alloy composition are coated with three types of paint. The following
coded data represent the ability of the painted panels to resist weathering.
Paint Steel-Alloy Steel-Alloy Steel-Alloy
Type 1 2 3
1 40 51 56
2 54 55 50
3 47 56 50
Use a two-way ANOVA procedure to determine whether any difference in the ability
of the panels to resist corrosion may be assigned to either the type of paint or the
steel-alloy composition of the panels.
Your solution
Do your working on separate paper and enter the main conclusions here.
Answer
Our hypotheses may be stated as follows.

Paint type
H0 : µ1 = µ2 = µ3
H1 : At least one of the means is different from the others

Steel-Alloy
H0 : µ1 = µ2 = µ3
H1 : At least one of the means is different from the others
Following the methods of calculation outlined above we obtain:
Paint Type (j)    Steel-Alloy (i)      Paint Means               $(\bar{X}_{\cdot j} - \bar{\bar{X}})$    Paint SS $(\bar{X}_{\cdot j} - \bar{\bar{X}})^2$
                  1     2     3        $(\bar{X}_{\cdot j})$
1                 40    51    56       49                        −2                                      4
2                 54    55    50       53                        2                                       4
3                 47    56    50       51                        0                                       0
Steel-Alloy Means $(\bar{X}_{i\cdot})$                       47    54    52     $\bar{\bar{X}} = 51$    Sum = 0    8 × 3 = 24
$(\bar{X}_{i\cdot} - \bar{\bar{X}})$                         −4    3     1      Sum = 0
Steel-Alloy SS $(\bar{X}_{i\cdot} - \bar{\bar{X}})^2$        16    9     1      26 × 3 = 78
Hence $SS_{Pa} = 24$ and $SS_{St} = 78$. We now require $SS_E$. The calculations are as follows.
In the table below, the predicted outputs are given in parentheses.
Paint Type (j)    Steel-Alloy (i)                    Paint Means               $(\bar{X}_{\cdot j} - \bar{\bar{X}})$
                  1          2          3            $(\bar{X}_{\cdot j})$
1                 40 (45)    51 (52)    56 (50)      49                        −2
2                 54 (49)    55 (56)    50 (54)      53                        2
3                 47 (47)    56 (54)    50 (52)      51                        0
Steel-Alloy Means $(\bar{X}_{i\cdot})$                  47    54    52     $\bar{\bar{X}} = 51$    Sum = 0
$(\bar{X}_{i\cdot} - \bar{\bar{X}})$                    −4    3     1      Sum = 0
A table of squared residuals is easily obtained as
Paint Steel
(j) (i)
1 2 3
1 25 1 36
2 25 1 16
3 0 4 4
Hence the residual sum of squares is $SS_E = 112$. The total sum of squares is given by subtracting
$\bar{\bar{X}} = 51$ from each table member and squaring to obtain
Paint Steel
(j) (i)
1 2 3
1 121 0 25
2 9 16 1
3 16 25 1
The total sum of squares is $SS_T = 214$. We should now check to see that $SS_T = SS_{Pa} + SS_{St} + SS_E$.
Substitution gives 214 = 24 + 78 + 112 which is correct.
The values of F are calculated as shown in the ANOVA table below.

Source of Variation                                       Sum of Squares SS   Degrees of Freedom   Mean Square MS          Value of F Ratio
Between samples (due to treatment A, say, paint)          24                  2                    $MS_A$ = 24/2 = 12      F = 12/28 = 0.429
Between samples (due to treatment B, say, Steel-Alloy)    78                  2                    $MS_B$ = 78/2 = 39      F = 39/28 = 1.393
Within samples                                            112                 4                    $MS_E$ = 112/4 = 28
Totals                                                    214                 8
From the F -tables the critical value is F2,4 = 6.94 and since both of the calculated F values are
less than 6.94 we conclude that we do not have sufficient evidence to reject either null hypothesis.
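The predicted (fitted) values and squared residuals used in this Answer can be generated mechanically. The sketch below assumes the additive model, so that the fitted value for a cell is the row mean plus the column mean minus the overall mean; the variable names are illustrative and only the Python standard library is used.

```python
from statistics import mean

# Rows are paint types 1-3, columns are steel-alloys 1-3 (coded data from the Task).
data = [
    [40, 51, 56],
    [54, 55, 50],
    [47, 56, 50],
]

paint_means = [mean(row) for row in data]
alloy_means = [mean(col) for col in zip(*data)]
grand_mean = mean(x for row in data for x in row)

# Fitted value for a cell: paint mean + alloy mean - overall mean.
fitted = [[p + s - grand_mean for s in alloy_means] for p in paint_means]

squared_residuals = [
    [(x - f) ** 2 for x, f in zip(row, fit_row)]
    for row, fit_row in zip(data, fitted)
]
ss_error = sum(sum(row) for row in squared_residuals)

for row in fitted:
    print(row)
print("SSE =", ss_error)
```

Computing the error sum of squares from the residuals, rather than by subtraction, is what makes the check against the total sum of squares possible.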
2. Two-way ANOVA with interaction
The previous subsection looked at two-way ANOVA under the assumption that there was no inter-
action between the factors A and B. We will now look at the developments of two-way ANOVA
to take into account possible interaction between the factors under consideration. The following
analysis allows us to test to see whether we have sufficient evidence to reject the null hypothesis that
the amount of interaction is effectively zero.
To see how we might consider interaction between factors A and B taking place, look at the following
table which represents observations involving a two-factor experiment.
Factor B
Factor A 1 2 3 4 5
1 3 5 1 9 12
2 4 6 2 10 13
3 6 8 4 12 15
A brief inspection of the numbers in the five columns reveals that there is a constant difference
between any two rows as we move from column to column. Similarly there is a constant difference
between any two columns as we move from row to row. While the data are clearly contrived, they
do illustrate that in this case no interaction arises from variations in the differences between
either rows or columns. Real data do not exhibit such behaviour in general of course, and we expect
differences to occur and so we must check to see if the differences are large enough to provide
sufficient evidence to reject the null hypothesis that the amount of interaction is effectively zero.
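The constant-difference property of the table can be checked in a few lines. This is only a sketch of the idea: the function name `is_additive` is made up for this illustration, and the altered table at the end is hypothetical data constructed to show a case where the property fails. With real data an exact check like this would almost always fail, which is why a formal test is needed.

```python
# Rows are levels 1-3 of factor A, columns are levels 1-5 of factor B.
table = [
    [3, 5, 1, 9, 12],
    [4, 6, 2, 10, 13],
    [6, 8, 4, 12, 15],
]


def is_additive(rows):
    """True if every row differs from the first row by the same amount in each column."""
    first = rows[0]
    for row in rows[1:]:
        differences = {x - y for x, y in zip(row, first)}
        if len(differences) != 1:
            return False
    return True


print(is_additive(table))

# Hypothetical altered table: changing one cell breaks the constant differences.
altered = [row[:] for row in table]
altered[2][4] = 20
print(is_additive(altered))
```

If the rows differ by constants then so do the columns, so checking one direction is enough.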
Notation
Let a represent the number of ‘levels’ present for factor A, denoted i = 1, . . . , a.
Let b represent the number of ‘levels’ present for factor B, denoted j = 1, . . . , b.
Let n represent the number of observations per cell. We assume that it is the same for each cell.
In the table above, a = 3, b = 5, n = 1. In the examples we shall consider, n will be greater than 1
and we will be able to check for interaction between the factors.
We suppose that the observations at level i of factor A and level j of factor B are taken from a
normal distribution with mean µij . When we assumed that there was no interaction, we used the
additive model
µij = µ + αi + βj
So, for example, the difference µi1 − µi2 between the means at levels 1 and 2 of factor B is equal
to β1 − β2 and does not depend upon the level of factor A. When we allow interaction, this is not
necessarily true and we write
µij = µ + αi + βj + γij
Here γij is an interaction effect. Now µi1 − µi2 = β1 − β2 + γi1 − γi2 so the difference between
two levels of factor B depends on the level of factor A.
Fixed and random effects
Often the levels assigned to a factor will be chosen deliberately. In this case the factors are said to be
fixed and we have a fixed effects model. If the levels are chosen at random from a population of all
possible levels, the factors are said to be random and we have a random effects model. Sometimes
one factor may be fixed while the other is random. In this case we have a mixed effects model. In
effect, we are asking whether we are interested in certain particular levels of a factor (fixed effects) or
whether we just regard the levels as a sample and are interested in the population in general (random
effects).
Calculation method
The data you will be working with will be set out in a manner similar to that shown below.
The table assumes n observations per cell and is shown along with a variety of totals and means
which will be used in the calculations of the various test statistics to follow.
                                             Factor B
Factor A   Level 1           Level 2           ...   Level j           ...   Level b           Totals
Level 1    x111, ..., x11n   x121, ..., x12n   ...   x1j1, ..., x1jn   ...   x1b1, ..., x1bn   T1··
Level 2    x211, ..., x21n   x221, ..., x22n   ...   x2j1, ..., x2jn   ...   x2b1, ..., x2bn   T2··
...        ...               ...               ...   ...               ...   ...               ...
Level i    xi11, ..., xi1n   xi21, ..., xi2n   ...   xij1, ..., xijn   ...   xib1, ..., xibn   Ti··
...        ...               ...               ...   ...               ...   ...               ...
Level a    xa11, ..., xa1n   xa21, ..., xa2n   ...   xaj1, ..., xajn   ...   xab1, ..., xabn   Ta··
Totals     T·1·              T·2·              ...   T·j·              ...   T·b·              T···
The sum of the data in cell (i, j) is
$$T_{ij\cdot} = \sum_{k=1}^{n} x_{ijk}$$
Notes
(a) T··· represents the grand total of the data values so that
$$T_{\cdot\cdot\cdot} = \sum_{j=1}^{b} T_{\cdot j\cdot} = \sum_{i=1}^{a} T_{i\cdot\cdot} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} x_{ijk}$$
(b) Ti·· represents the total of the data in the ith row.
(c) T·j· represents the total of the data in the jth column.
(d) The total number of data entries is given by N = nab.
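The totals in the table and notes above can be computed mechanically. The following is a minimal sketch in plain Python, assuming the data are held as a nested list `x[i][j][k]` (level i of factor A, level j of factor B, observation k). The function name `totals` and the small data set are hypothetical.

```python
def totals(x):
    """Return cell totals T_ij., row totals T_i.., column totals T_.j.,
    the grand total T_... and the number of observations N = nab."""
    a, b, n = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[i][j]) for j in range(b)] for i in range(a)]   # T_ij.
    row = [sum(cell[i]) for i in range(a)]                        # T_i..
    col = [sum(cell[i][j] for i in range(a)) for j in range(b)]   # T_.j.
    grand = sum(row)                                              # T_...
    return cell, row, col, grand, n * a * b

# Hypothetical data with a = 2, b = 2, n = 2.
x = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
cell, row, col, grand, N = totals(x)
```

The grand total can be reached by summing the row totals, the column totals or the individual observations, as note (a) states.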
Partitioning the variation
We are now in a position to consider the partition of the total sum of the squared deviations from
the overall mean which we estimate as
$$\bar{x} = \frac{T_{\cdot\cdot\cdot}}{N}$$
The total sum of the squared deviations is
$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (x_{ijk} - \bar{x})^2$$
and it can be shown that this quantity can be written as

$$SS_T = SS_A + SS_B + SS_{AB} + SS_E$$
where SST is the total sum of squares given by
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} x_{ijk}^2 - \frac{T_{\cdot\cdot\cdot}^2}{N};$$
SSA is the sum of squares due to variations caused by factor A given by
$$SS_A = \sum_{i=1}^{a} \frac{T_{i\cdot\cdot}^2}{bn} - \frac{T_{\cdot\cdot\cdot}^2}{N}$$
SSB is the sum of squares due to variations caused by factor B given by
$$SS_B = \sum_{j=1}^{b} \frac{T_{\cdot j\cdot}^2}{an} - \frac{T_{\cdot\cdot\cdot}^2}{N}$$
Note that bn means b × n which is the number of observations at each level of A and an means
a × n which is the number of observations at each level of B.
SSAB is the sum of the squares due to variations caused by the interaction of factors A and B and
is given by
$$SS_{AB} = \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{T_{ij\cdot}^2}{n} - \frac{T_{\cdot\cdot\cdot}^2}{N} - SS_A - SS_B .$$
Note that the quantity $T_{ij\cdot} = \sum_{k=1}^{n} x_{ijk}$ is the sum of the data in the (i, j)th cell and that the quantity
$$\sum_{i=1}^{a}\sum_{j=1}^{b} \frac{T_{ij\cdot}^2}{n} - \frac{T_{\cdot\cdot\cdot}^2}{N}$$
is the sum of the squares between cells.
SSE is the sum of the squares due to chance or experimental error and is given by
SSE = SST − SSA − SSB − SSAB
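The partition can be sketched in plain Python as follows. The function name `two_way_ss` and the nested-list layout `x[i][j][k]` are assumptions made for this illustration. The data are the coded tyre and shock absorber observations from the first Task later in this Section, so the results can be compared with the Answer given there.

```python
def two_way_ss(x):
    """Sums of squares for a two-way layout with n observations per cell.

    x[i][j] is the list of observations at level i of A and level j of B.
    """
    a, b, n = len(x), len(x[0]), len(x[0][0])
    N = n * a * b
    cell = [[sum(x[i][j]) for j in range(b)] for i in range(a)]   # T_ij.
    row = [sum(r) for r in cell]                                  # T_i..
    col = [sum(cell[i][j] for i in range(a)) for j in range(b)]   # T_.j.
    correction = sum(row) ** 2 / N                                # T_...^2 / N
    sst = sum(v * v for r in x for c in r for v in c) - correction
    ssa = sum(t * t for t in row) / (b * n) - correction
    ssb = sum(t * t for t in col) / (a * n) - correction
    ssab = sum(t * t for r in cell for t in r) / n - correction - ssa - ssb
    sse = sst - ssa - ssb - ssab
    return {"SST": sst, "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse}

# Tyre type (A1, A2) by shock absorber setting (B1, B2, B3), three runs per cell.
tyres = [[[5, 6, 8], [8, 5, 3], [6, 9, 12]],
         [[9, 7, 7], [10, 9, 8], [12, 10, 9]]]
ss = two_way_ss(tyres)
for name, value in ss.items():
    print(name, round(value, 3))
```

The residual sum of squares is obtained by subtraction, exactly as in the formula above, rather than from the within-cell deviations.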
The number of degrees of freedom (N − 1) is partitioned as follows:
Sum of squares        SST     SSA     SSB     SSAB             SSE
Degrees of freedom    N − 1   a − 1   b − 1   (a − 1)(b − 1)   N − ab
Note that there are ab − 1 degrees of freedom between cells and that the number of degrees of
freedom for SSAB is given by
ab − 1 − (a − 1) − (b − 1) = (a − 1)(b − 1)
This gives rise to the following two-way ANOVA tables.
Two-Way ANOVA Table - Fixed-Effects Model
Source of variation   Sum of squares   Degrees of freedom   Mean Square                       Value of F
Factor A              SSA              (a − 1)              MSA = SSA/(a − 1)                 F = MSA/MSE
Factor B              SSB              (b − 1)              MSB = SSB/(b − 1)                 F = MSB/MSE
Interaction           SSAB             (a − 1) × (b − 1)    MSAB = SSAB/((a − 1)(b − 1))      F = MSAB/MSE
Residual Error        SSE              (N − ab)             MSE = SSE/(N − ab)
Totals                SST              (N − 1)
Two-Way ANOVA Table - Random-Effects Model

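Source of variation   Sum of squares   Degrees of freedom   Mean Square                       Value of F
Factor A              SSA              (a − 1)              MSA = SSA/(a − 1)                 F = MSA/MSAB
Factor B              SSB              (b − 1)              MSB = SSB/(b − 1)                 F = MSB/MSAB
Interaction           SSAB             (a − 1) × (b − 1)    MSAB = SSAB/((a − 1)(b − 1))      F = MSAB/MSE
Residual Error        SSE              (N − ab)             MSE = SSE/(N − ab)
Totals                SST              (N − 1)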
Two-Way ANOVA Table - Mixed-Effects Model
Case (i) A fixed and B random.

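Source of variation   Sum of squares   Degrees of freedom   Mean Square                       Value of F
Factor A              SSA              (a − 1)              MSA = SSA/(a − 1)                 F = MSA/MSAB
Factor B              SSB              (b − 1)              MSB = SSB/(b − 1)                 F = MSB/MSE
Interaction           SSAB             (a − 1) × (b − 1)    MSAB = SSAB/((a − 1)(b − 1))      F = MSAB/MSE
Residual Error        SSE              (N − ab)             MSE = SSE/(N − ab)
Totals                SST              (N − 1)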
Case (ii) A random and B fixed.

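Source of variation   Sum of squares   Degrees of freedom   Mean Square                       Value of F
Factor A              SSA              (a − 1)              MSA = SSA/(a − 1)                 F = MSA/MSE
Factor B              SSB              (b − 1)              MSB = SSB/(b − 1)                 F = MSB/MSAB
Interaction           SSAB             (a − 1) × (b − 1)    MSAB = SSAB/((a − 1)(b − 1))      F = MSAB/MSE
Residual Error        SSE              (N − ab)             MSE = SSE/(N − ab)
Totals                SST              (N − 1)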
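The only thing that changes between the four tables is the denominator of each main-effect F ratio. The sketch below captures that rule: a main effect is tested against MSAB when the other factor is random, and against MSE otherwise, while the interaction is always tested against MSE. The function name `f_ratios` and its arguments are hypothetical. The mean squares are those of the tyre and shock absorber Task later in this Section, where the fixed-effects model is the one actually used; the other three calls are shown only for contrast.

```python
def f_ratios(ms_a, ms_b, ms_ab, ms_e, a_random, b_random):
    """Variance ratios for a two-way layout with replication.

    a_random / b_random state whether the levels of A / B were chosen at random.
    """
    f_a = ms_a / (ms_ab if b_random else ms_e)   # A is tested against MSAB if B is random
    f_b = ms_b / (ms_ab if a_random else ms_e)   # B is tested against MSAB if A is random
    f_ab = ms_ab / ms_e                          # interaction is always tested against MSE
    return {"A": f_a, "B": f_b, "AB": f_ab}

ms_a, ms_b, ms_ab, ms_e = 20.056, 13.389, 2.722, 3.722

fixed = f_ratios(ms_a, ms_b, ms_ab, ms_e, a_random=False, b_random=False)
random_effects = f_ratios(ms_a, ms_b, ms_ab, ms_e, a_random=True, b_random=True)
mixed_i = f_ratios(ms_a, ms_b, ms_ab, ms_e, a_random=False, b_random=True)   # Case (i)
mixed_ii = f_ratios(ms_a, ms_b, ms_ab, ms_e, a_random=True, b_random=False)  # Case (ii)
```

The degrees of freedom of the reference F distribution change with the denominator: a ratio against MSAB is compared with F on (a − 1)(b − 1) denominator degrees of freedom rather than N − ab.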
Example 1
In an experiment to compare the effects of weathering on paint of three different
types, two identical surfaces coated with each type of paint were exposed in each
of four environments. Measurements of the degree of deterioration were made as
follows.
           Environment 1    Environment 2    Environment 3    Environment 4
Paint A    10.89   10.74    9.94    11.25    9.88    10.13    14.11   12.84
Paint B    12.28   13.11    14.45   11.17    11.29   11.10    13.44   11.37
Paint C    10.68   10.30    10.89   10.97    10.61   11.00    12.22   11.32
Making the assumptions of normality, independence and equal variance, derive the
appropriate ANOVA tables and state the conclusions which may be drawn at the
5% level of significance in the following cases.
(a) The types of paint and the environments are chosen deliberately be-
cause the interest is in these paints and these environments.
(b) The types of paint are chosen deliberately because the interest is in
these paints but the environments are regarded as a sample of possible
environments.
(c) The types of paint are regarded as a random sample of possible paints
and the environments are regarded as a sample of possible environ-
ments.
Solution
We know that case (a) is described as a fixed-effects model, case (b) is described as a mixed-effects
model (paint type fixed) and case (c) is described as a random-effects model. In all three cases the
calculations necessary to find MSP (paints), MSN (environments), MSPN (interaction) and MSE (residual) are identical.
Only the calculation and interpretation of the test statistics will be different. The calculations are
shown below.
Subtracting 10 from each observation, the data become:
           Environment 1    Environment 2    Environment 3    Environment 4    Total
Paint A    0.89    0.74     −0.06   1.25     −0.12   0.13     4.11    2.84     9.78
           (total 1.63)     (total 1.19)     (total 0.01)     (total 6.95)
Paint B    2.28    3.11     4.45    1.17     1.29    1.10     3.44    1.37     18.21
Paint C    0.68    0.30     0.89    0.97     0.61    1.00     2.22    1.32     7.99
Total      8.00             8.67             4.01             15.30            35.98

$$SS_T = 0.89^2 + 0.74^2 + \dots + 1.32^2 - \frac{35.98^2}{24} = 36.910$$
We can simplify the calculation by finding the between samples sum of squares
$$SS_S = \frac{1}{2}(1.63^2 + 5.39^2 + \dots + 3.54^2) - \frac{35.98^2}{24} = 26.762$$
Sum of squares for paints is
$$SS_P = \frac{1}{8}(9.78^2 + 18.21^2 + 7.99^2) - \frac{35.98^2}{24} = 7.447$$
Sum of squares for environments is
$$SS_N = \frac{1}{6}(8.00^2 + 8.67^2 + 4.01^2 + 15.30^2) - \frac{35.98^2}{24} = 10.950$$
So the interaction sum of squares is SSPN = SSS − SSP − SSN = 8.365 and
the residual sum of squares is SSE = SST − SSS = 10.148. The results are combined in the following
ANOVA table.
                 Deg. of    Sum of     Mean      Variance Ratio   Variance Ratio   Variance Ratio
                 Freedom    Squares    Square    (fixed)          (mixed)          (random)
Paints           2          7.447      3.724     4.40             2.67             2.67
                                                 F2,12 = 3.89     F2,6 = 5.14      F2,6 = 5.14
Environments     3          10.950     3.650     4.31             4.31             2.61
                                                 F3,12 = 3.49     F3,12 = 3.49     F3,6 = 4.76
Interaction      6          8.365      1.394     1.65             1.65             1.65
                                                 F6,12 = 3.00     F6,12 = 3.00     F6,12 = 3.00
Treatment        11         26.762     2.433
combinations
Residual         12         10.148     0.846
Total            23         36.910
The following conclusions may be drawn. There is insufficient evidence to support the interaction
hypothesis in any case. Therefore we can look at the tests for the main effects.
Case (a) Since 4.40 > 3.89 we have sufficient evidence to conclude that paint type affects the
degree of deterioration. Since 4.31 > 3.49 we have sufficient evidence to conclude that environment
affects the degree of deterioration.
Case (b) Since 2.67 < 5.14 we do not have sufficient evidence to reject the hypothesis that paint
type has no effect on the degree of deterioration. Since 4.31 > 3.49 we have sufficient evidence to
conclude that environment affects the degree of deterioration.
Case (c) Since 2.67 < 5.14 we do not have sufficient evidence to reject the hypothesis that paint
type has no effect on the degree of deterioration. Since 2.61 < 4.76 we do not have sufficient
evidence to reject the hypothesis that environment has no effect on the degree of deterioration.
If the test for interaction had given a significant result then we would have concluded that there
was an interaction effect. Therefore the differences between the average degree of deterioration for
different paint types would have depended on the environment and there might have been no overall
‘best paint type’. We would have needed to compare combinations of paint types and environments.
However the relative sizes of the mean squares would have helped to indicate which effects were
most important.
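The Solution subtracts 10 from every observation before computing the sums of squares. A quick numerical check, sketched below, suggests why this is harmless: sums of squared deviations are unchanged when a constant is subtracted from all the data. The function name `total_and_between_cells` is hypothetical. It computes only SST, the between-samples sum of squares SSS and the residual SSE = SST − SSS, following the shortcut used in the Solution.

```python
def total_and_between_cells(x):
    """x[i][j] lists the replicate observations for paint i in environment j."""
    values = [v for row in x for cell in row for v in cell]
    n = len(x[0][0])                                  # observations per cell
    correction = sum(values) ** 2 / len(values)       # (grand total)^2 / N
    ss_total = sum(v * v for v in values) - correction
    ss_cells = sum(sum(cell) ** 2 for row in x for cell in row) / n - correction
    return ss_total, ss_cells, ss_total - ss_cells

raw = [
    [[10.89, 10.74], [9.94, 11.25], [9.88, 10.13], [14.11, 12.84]],    # Paint A
    [[12.28, 13.11], [14.45, 11.17], [11.29, 11.10], [13.44, 11.37]],  # Paint B
    [[10.68, 10.30], [10.89, 10.97], [10.61, 11.00], [12.22, 11.32]],  # Paint C
]
coded = [[[v - 10 for v in cell] for cell in row] for row in raw]

sst_raw, sss_raw, sse_raw = total_and_between_cells(raw)
sst_coded, sss_coded, sse_coded = total_and_between_cells(coded)
print(round(sst_coded, 3), round(sss_coded, 3), round(sse_coded, 3))
```

Both calls give the same three values up to floating-point rounding, so the coding affects only the size of the numbers being squared, not the analysis.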
Task
A motor company wishes to check the influences of tyre type and shock absorber
settings on the roadholding of one of its cars. Two types of tyre are selected
from the tyre manufacturer who normally provides tyres for the company’s new
vehicles. A shock absorber with three possible settings is chosen from a range of
shock absorbers deemed to be suitable for the car. Roadholding tests are carried out using
each combination of tyre type and shock absorber setting.
The (coded) data resulting from the experiment are given below.
Factor      Shock Absorber Setting
Tyre        B1 = Comfort   B2 = Normal   B3 = Sport
Type A1     5              8             6
            6              5             9
            8              3             12
Type A2     9              10            12
            7              9             10
            7              8             9
Decide whether an appropriate model has random-effects, mixed-effects or fixed-
effects and derive the appropriate ANOVA table. State clearly any conclusions
that may be drawn at the 5% level of significance.
Your solution
Do the calculations on separate paper and use the space here and on the following page for your
summary and conclusions.
Answer
We know that neither the tyres nor the shock absorber settings are chosen at random from populations
consisting of all possible tyre types and shock absorber types, so that their influence is described by
a fixed-effects model. The calculations necessary to find MSA, MSB, MSAB and MSE are shown
below.
          B1          B2          B3          Totals
A1        5           8           6
          6           5           9
          8           3           12
          T11 = 19    T12 = 16    T13 = 27    T1·· = 62
A2        9           10          12
          7           9           10
          7           8           9
          T21 = 23    T22 = 27    T23 = 31    T2·· = 81
Totals    T·1· = 42   T·2· = 43   T·3· = 58   T··· = 143
The sums of squares calculations are:
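$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{3}\sum_{k=1}^{3} x_{ijk}^2 - \frac{T_{\cdot\cdot\cdot}^2}{N} = 5^2 + 6^2 + \dots + 10^2 + 9^2 - \frac{143^2}{18} = 1233 - \frac{143^2}{18} = 96.944$$
$$SS_A = \sum_{i=1}^{2} \frac{T_{i\cdot\cdot}^2}{bn} - \frac{T_{\cdot\cdot\cdot}^2}{N} = \frac{62^2 + 81^2}{3 \times 3} - \frac{143^2}{18} = \frac{10405}{9} - \frac{143^2}{18} = 20.056$$
$$SS_B = \sum_{j=1}^{3} \frac{T_{\cdot j\cdot}^2}{an} - \frac{T_{\cdot\cdot\cdot}^2}{N} = \frac{42^2 + 43^2 + 58^2}{2 \times 3} - \frac{143^2}{18} = \frac{6977}{6} - \frac{143^2}{18} = 26.778$$
$$SS_{AB} = \sum_{i=1}^{2}\sum_{j=1}^{3} \frac{T_{ij\cdot}^2}{n} - \frac{T_{\cdot\cdot\cdot}^2}{N} - SS_A - SS_B = \frac{19^2 + \dots + 31^2}{3} - \frac{143^2}{18} - 20.056 - 26.778$$
$$= \frac{3565}{3} - \frac{143^2}{18} - 20.056 - 26.778 = 5.444$$
$$SS_E = SS_T - SS_A - SS_B - SS_{AB} = 96.944 - 20.056 - 26.778 - 5.444 = 44.666$$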
The results are combined in the following ANOVA table.
Source            SS       DoF   MS       F (Fixed)          Critical value
Factor A          20.056   1     20.056   MSA/MSE = 5.39     F1,12 = 4.75
Factor B          26.778   2     13.389   MSB/MSE = 3.60     F2,12 = 3.89
Interaction AB    5.444    2     2.722    MSAB/MSE = 0.731   F2,12 = 3.89
Residual E        44.666   12    3.722
Totals            96.944   17
The following conclusions may be drawn:
Interaction: There is insufficient evidence to support the hypothesis that interaction takes place
between the factors.
Factor A: Since 5.39 > 4.75 we have sufficient evidence to reject the hypothesis that tyre type does
not affect the roadholding of the car.
Factor B: Since 3.60 < 3.89 we do not have sufficient evidence to reject the hypothesis that shock
absorber settings do not affect the roadholding of the car.
Task
The variability of a measured characteristic of an electronic assembly is a source
of trouble for a manufacturer with global manufacturing and sales facilities. To
investigate the possible influences of assembly machines and testing stations on
the characteristic, an engineer chooses three testing stations and three assembly
machines from the large number of stations and machines in the possession of
the company. For each testing station - assembly machine combination, three
observations of the characteristic are made.
The (coded) data resulting from the experiment are given below.
Factor              Testing Station
Assembly Machine    B1     B2     B3
A1                  2.3    3.7    3.1
                    3.4    2.8    3.2
                    3.5    3.7    3.5
A2                  3.5    3.9    3.3
                    2.6    3.9    3.4
                    3.6    3.4    3.5
A3                  2.4    3.5    2.6
                    2.7    3.2    2.6
                    2.8    3.5    2.5
Decide whether an appropriate model has random-effects, mixed-effects or fixed-
effects and derive the appropriate ANOVA table.
State clearly any conclusions that may be drawn at the 5% level of significance.
Your solution
Do the calculations on separate paper and use the space here and on the following page for your
summary and conclusions.
Answer
Both the machines and the testing stations are effectively chosen at random from populations
consisting of all possible types so that their influence is described by a random-effects model. The
calculations necessary to find M SA , M SB , M SAB and M SE are shown below.
          B1            B2            B3            Totals
A1        2.3           3.7           3.1
          3.4           2.8           3.2
          3.5           3.7           3.5
          T11 = 9.2     T12 = 10.2    T13 = 9.8     T1·· = 29.2
A2        3.5           3.9           3.3
          2.6           3.9           3.4
          3.6           3.4           3.5
          T21 = 9.7     T22 = 11.2    T23 = 10.2    T2·· = 31.1
A3        2.4           3.5           2.6
          2.7           3.2           2.6
          2.8           3.5           2.5
          T31 = 7.9     T32 = 10.2    T33 = 7.7     T3·· = 25.8
Totals    T·1· = 26.8   T·2· = 31.6   T·3· = 27.7   T··· = 86.1
a = 3, b = 3, n = 3, N = 27 and the sums of squares calculations are:
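$$SS_T = \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3} x_{ijk}^2 - \frac{T_{\cdot\cdot\cdot}^2}{N} = 2.3^2 + 3.4^2 + \dots + 2.6^2 + 2.5^2 - \frac{86.1^2}{27} = 5.907$$
$$SS_A = \sum_{i=1}^{3} \frac{T_{i\cdot\cdot}^2}{bn} - \frac{T_{\cdot\cdot\cdot}^2}{N} = \frac{29.2^2 + 31.1^2 + 25.8^2}{3 \times 3} - \frac{86.1^2}{27} = 1.602$$
$$SS_B = \sum_{j=1}^{3} \frac{T_{\cdot j\cdot}^2}{an} - \frac{T_{\cdot\cdot\cdot}^2}{N} = \frac{26.8^2 + 31.6^2 + 27.7^2}{3 \times 3} - \frac{86.1^2}{27} = 1.447$$
$$SS_{AB} = \sum_{i=1}^{3}\sum_{j=1}^{3} \frac{T_{ij\cdot}^2}{n} - \frac{T_{\cdot\cdot\cdot}^2}{N} - SS_A - SS_B$$
$$= \frac{9.2^2 + 10.2^2 + \dots + 10.2^2 + 7.7^2}{3} - \frac{86.1^2}{27} - 1.602 - 1.447 = 0.398$$
$$SS_E = SS_T - SS_A - SS_B - SS_{AB} = 5.907 - 1.602 - 1.447 - 0.398 = 2.46$$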
The results are combined in the following ANOVA table
Source                 SS      DoF   MS         F (Random)          Critical value
Factor A (Machines)    1.602   2     0.801      MSA/MSAB = 8.05     F2,4 = 6.94
Factor B (Stations)    1.447   2     0.724      MSB/MSAB = 7.28     F2,4 = 6.94
Interaction AB         0.398   4     0.099(5)   MSAB/MSE = 0.728    F4,18 = 2.93
Residual E             2.460   18    0.137
Totals                 5.907   26
The following conclusions may be drawn.
Interaction: There is insufficient evidence to support the hypothesis that interaction takes place
between the factors.
Factor A: Since 8.05 > 6.94 we have sufficient evidence to reject the hypothesis that the assembly
machines do not affect the assembly characteristic.
Factor B: Since 7.28 > 6.94 we have sufficient evidence to reject the hypothesis that the choice of
testing station does not affect the assembly characteristic.
3. Two-way ANOVA versus one-way ANOVA

You should note that a two-way ANOVA design is rather more efficient than a one-way design. In
the last example, we could fix the testing station and look at the electronic assemblies produced by a
variety of machines. We would have to replicate such an experiment for every testing station. It would
be very difficult (impossible!) to exactly duplicate the same conditions for all of the experiments.
This implies that the consequent experimental error could be very large. Remember also that in a
one-way design we cannot check for interaction between the factors involved in the experiment. The
three main advantages of a two-way ANOVA may be stated as follows:
(a) It is possible to simultaneously test the effects of two factors. This saves both time and
money.
(b) It is possible to determine the level of interaction present between the factors involved.
(c) The effect of one factor can be investigated over a variety of levels of another and so
any conclusions reached may be applicable over a range of situations rather than a single
situation.
Exercises
1. The temperatures, in Celsius, at three locations in the engine of a vehicle are measured after
each of five test runs. The data are as follows. Making the usual assumptions for a two-
way analysis of variance without replication, test the hypothesis that there is no systematic
difference in temperatures between the three locations. Use the 5% level of significance.
Location   Run 1   Run 2   Run 3   Run 4   Run 5
A          72.8    77.3    82.9    69.4    74.6
B          71.5    72.4    80.7    67.0    74.0
C          70.8    74.0    79.1    69.0    75.4
2. Waste cooling water from a large engineering works is filtered before being released into the
environment. Three separate discharge pipes are used, each with its own filter. Five samples
of water are taken on each of four days from each of the three discharge pipes and the
concentrations of a pollutant, in parts per million, are measured. The data are given below.
Analyse the data to test for differences between the discharge pipes. Allow for effects due to
pipes and days and for an interaction effect. Treat the pipe effects as fixed and the day effects
as random. Use the 5% level of significance.
Day Pipe A
1 160 181 163 173 178
2 175 170 219 166 171
3 169 186 179 178 183
4 230 206 216 195 250
Day Pipe B
1 172 164 186 185 172
2 177 170 156 140 155
3 193 194 189 156 181
4 212 235 195 206 209
Day Pipe C
1 214 196 207 219 200
2 186 184 181 189 179
3 209 220 199 185 228
4 254 293 283 262 259
Answers
1. We calculate totals as follows.
Run Total Location Total
1 215.1 A 377.0
2 223.7 B 365.6
3 242.7 C 368.3
4 205.4 Total 1110.9
5 224.0
Total 1110.9
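The sum of the squares of the observations is
$$\sum\sum y_{ij}^2 = 82552.17$$
The total sum of squares is
$$82552.17 - \frac{1110.9^2}{15} = 278.916$$
on 15 − 1 = 14 degrees of freedom.
The between-runs sum of squares is
$$\frac{1}{3}(215.1^2 + 223.7^2 + 242.7^2 + 205.4^2 + 224.0^2) - \frac{1110.9^2}{15} = 252.796$$
on 5 − 1 = 4 degrees of freedom.
The between-locations sum of squares is
$$\frac{1}{5}(377.0^2 + 365.6^2 + 368.3^2) - \frac{1110.9^2}{15} = 14.196$$
on 3 − 1 = 2 degrees of freedom.
By subtraction the residual sum of squares is
278.916 − 252.796 − 14.196 = 11.924 on 14 − 4 − 2 = 8 degrees of freedom.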
The analysis of variance table is as follows.

Source of     Sum of     Degrees of   Mean      Variance
variation     squares    freedom      square    ratio
Runs          252.796    4            63.199
Locations     14.196     2            7.098     4.762
Residual      11.924     8            1.491
Total         278.916    14
The upper 5% point of the F2,8 distribution is 4.46. The observed variance ratio is greater than this
so we conclude that the result is significant at the 5% level and reject the null hypothesis at this
level. The evidence suggests that there are systematic differences between the temperatures at the
three locations. Note that the Runs mean square is large compared to the Residual mean square
showing that it was useful to allow for differences between runs.
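The arithmetic of this answer can be checked with a few lines of plain Python. This is only a sketch of the two-way analysis without replication, with locations as rows and runs as columns. The variable names are hypothetical, and the critical value 4.46 is taken from the answer above rather than computed.

```python
# Temperatures: one row per location (A, B, C), one column per run (1 to 5).
y = [
    [72.8, 77.3, 82.9, 69.4, 74.6],
    [71.5, 72.4, 80.7, 67.0, 74.0],
    [70.8, 74.0, 79.1, 69.0, 75.4],
]
r, c = len(y), len(y[0])                     # 3 locations, 5 runs
N = r * c
correction = sum(map(sum, y)) ** 2 / N       # (grand total)^2 / N

ss_total = sum(v * v for row in y for v in row) - correction
ss_locations = sum(sum(row) ** 2 for row in y) / c - correction
ss_runs = sum(sum(y[i][j] for i in range(r)) ** 2 for j in range(c)) / r - correction
ss_residual = ss_total - ss_locations - ss_runs   # no replication: residual by subtraction

df_residual = (r - 1) * (c - 1)              # 14 - 4 - 2 = 8
ms_locations = ss_locations / (r - 1)
ms_residual = ss_residual / df_residual
f_locations = ms_locations / ms_residual

print(round(ss_runs, 3), round(ss_locations, 3), round(ss_residual, 3))
print(round(f_locations, 3), f_locations > 4.46)
```

With a single observation per cell there is no separate estimate of interaction, so the residual mean square serves as the denominator for the locations variance ratio.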
2. We calculate totals as follows.
Day 1 Day 2 Day 3 Day 4 Total
Pipe A 855 901 895 1097 3748
Pipe B 879 798 913 1057 3647
Pipe C 1036 919 1041 1351 4347
Total 2770 2618 2849 3505 11742
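The sum of the squares of the observations is
$$\sum\sum\sum y_{ijk}^2 = 2356870$$
The total number of observations is N = 60.
The total sum of squares is
$$2356870 - \frac{11742^2}{60} = 58960.6$$
on 60 − 1 = 59 degrees of freedom.
The between-cells sum of squares is
$$\frac{1}{5}(855^2 + \dots + 1351^2) - \frac{11742^2}{60} = 48943.0$$
on 12 − 1 = 11 degrees of freedom, where by “cell” we mean the combination of a pipe and a day.
By subtraction the residual sum of squares is
58960.6 − 48943.0 = 10017.6
on 59 − 11 = 48 degrees of freedom.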
The between-days sum of squares is
$$\frac{1}{15}(2770^2 + 2618^2 + 2849^2 + 3505^2) - \frac{11742^2}{60} = 30667.3$$
on 4 − 1 = 3 degrees of freedom.
The between-pipes sum of squares is
$$\frac{1}{20}(3748^2 + 3647^2 + 4347^2) - \frac{11742^2}{60} = 14316.7$$
on 3 − 1 = 2 degrees of freedom.
By subtraction the interaction sum of squares is
48943.0 − 30667.3 − 14316.7 = 3959.0
on 11 − 3 − 2 = 6 degrees of freedom.
The analysis of variance table is as follows.

Source of      Sum of     Degrees of   Mean       Variance
variation      squares    freedom      square     ratio
Pipes          14316.7    2            7158.4     10.85
Days           30667.3    3            10222.4    48.98
Interaction    3959.0     6            659.8      3.16
Cells          48943.0    11           4449.4     21.32
Residual       10017.6    48           208.7
Total          58960.6    59
Notice that, because Days are treated as a random effect, we divide the Pipes mean square by the
Interaction mean square rather than by the Residual mean square.
The upper 5% point of the F6,48 distribution is approximately 2.3. Thus the Interaction variance
ratio is significant at the 5% level and we reject the null hypothesis of no interaction. We must
therefore conclude that there are differences between the means for pipes and for days and that
the difference between one pipe and another varies from day to day. Looking at the mean squares,
however, we see that both the Pipes and Days mean squares are much bigger than the Interaction
mean square. Therefore it seems that the interaction effect is relatively small compared to the
differences between days and between pipes.
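The table can be reproduced from the cell totals alone, together with the sum of squared observations quoted at the start of this answer. The sketch below, with hypothetical variable names, follows the mixed-effects rule used above: Pipes (fixed) is tested against the Interaction mean square because Days is random, while Days and Interaction are tested against the Residual mean square.

```python
cell_totals = [
    [855, 901, 895, 1097],     # Pipe A, days 1 to 4
    [879, 798, 913, 1057],     # Pipe B
    [1036, 919, 1041, 1351],   # Pipe C
]
n = 5                          # samples per pipe per day
sum_of_squares = 2356870       # sum of squared observations, as quoted in the answer

p, d = len(cell_totals), len(cell_totals[0])
N = p * d * n
correction = sum(map(sum, cell_totals)) ** 2 / N

ss_total = sum_of_squares - correction
ss_cells = sum(t * t for row in cell_totals for t in row) / n - correction
ss_pipes = sum(sum(row) ** 2 for row in cell_totals) / (d * n) - correction
ss_days = sum(sum(row[j] for row in cell_totals) ** 2 for j in range(d)) / (p * n) - correction
ss_interaction = ss_cells - ss_pipes - ss_days
ss_residual = ss_total - ss_cells

ms_pipes = ss_pipes / (p - 1)
ms_days = ss_days / (d - 1)
ms_interaction = ss_interaction / ((p - 1) * (d - 1))
ms_residual = ss_residual / (N - p * d)

f_pipes = ms_pipes / ms_interaction        # Days random, so Pipes is tested against Interaction
f_days = ms_days / ms_residual             # Pipes fixed, so Days is tested against Residual
f_interaction = ms_interaction / ms_residual

print(round(f_pipes, 2), round(f_days, 2), round(f_interaction, 2))
```

Because the Pipes ratio uses the Interaction mean square as its denominator, it would be referred to an F distribution with 2 and 6 degrees of freedom rather than 2 and 48.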

Experimental Design 44.3
Introduction
In Sections 44.1 and 44.2 we have considered how to analyse data from experiments of certain
types. Of course, before we can analyse any data we must conduct the experiment and before we
can conduct an experiment we must design it. The work of applying statistical ideas to engineering
experiments does not begin with the analysis of data. It begins with the design of the experiment. It
is important to give proper consideration to experimental design to make sure that our experiment
is efficient and that it will, in fact, give us the information we require. A badly designed experiment
may give poor or misleading results or may turn out to be an expensive waste of time and money.
Prerequisites
Before starting this Section you should . . .
• explain the concepts and terminology of the one-way and two-way ANOVA
• be familiar with the F-distribution
• understand the general techniques of hypothesis testing
Learning Outcomes
On completion you should be able to . . .
• explain the basic concepts of experimental design
• apply randomized blocks and Latin Square designs
• analyse the results from randomized blocks and Latin Square designs
1. Experimental design
So far in this Workbook we have looked at some of the statistical methods used in the analysis and
interpretation of experimental results. There are occasions when the planning of an experiment is
not in the control of the statistician responsible for analysing the results. It is always preferable to
have some idea of the likely variability of the data so that any experimental design can take this into
account. For this reason, the design of experiments is of crucial importance if weight is to be
given to the results obtained. Usually, the experimenter will have to take into account:
(a) The definition of the problem to be investigated. This would usually include the selection
of the response variable to be measured and the factors or treatments influencing the
response. Remember that the factors may be quantitative (such as temperature, pressure
or force) or qualitative (such as days of the week, machine operators or machines themselves),
and decisions must be taken as to whether these factors are fixed or random and at what
