
TOPIC 2:

NON-LINEAR EQUATIONS

INTRODUCTION
In this topic you will learn several methods of solving
numerically the non-linear equation f ( x) = 0 . This root-finding
problem is one of the most basic problems of numerical
approximation. The solution to f ( x) = 0 is a root of f as
defined below:

Definition (Root of a Function)

Suppose that f(x) is a continuous function. A number r such that f(r) = 0 is called a root of the equation f(x) = 0. We may also say that r is a zero of the function f(x).

Although quadratic equations of one variable can be solved


analytically, numerical approximation of the zeros may be
desired. For many other types of equations, it is either difficult
or impossible to find an exact solution. In this chapter we
investigate several techniques for finding roots of nonlinear
equations of a single variable. The first method is called fixed-point iteration. The other methods are the bisection method, the method of false position (or regula falsi method), the Newton-Raphson method (or simply Newton’s method), and the secant method.

2.1 FIXED-POINT ITERATION
In general, we can write a non-linear equation y = f ( x) = 0 in
the form

x = g ( x) . (2.1)

For example, consider the problem of solving

f ( x) = 5 x3 − 10 x + 3 = 0 .

By writing this equation as

10 x = 5 x 3 + 3 ,

we can reformulate the problem as finding a fixed point of the


equation

x = g ( x) = 0.5 x3 + 0.3 .

There are other ways of converting f ( x) = 0 into the form


x = g ( x) . However, not all such formulations will converge.

Let’s consider the following definitions:

Definition 2.1 (Fixed Point)

A fixed point of a function g ( x) is a real number P such that


P = g ( P) .

Definition 2.2 (Fixed Point Iteration)

The iteration pk+1 = g(pk) for k = 0, 1, … is called fixed-point iteration.

One of the characteristics of an iteration method is that it


requires an initial value to start with. Therefore, a rule or
function g ( x) for computing the subsequent terms is
necessary together with an initial value p0 . Then we can
generate a sequence of values { pk } using the iteration rule

pk +1 = g ( pk ) (2.2)

Hence, we can write (2.2) in detail as follows. Letting the initial value be p0,

p1 = g(p0)
p2 = g(p1)
⋮
pk = g(pk−1)                    (2.3)
pk+1 = g(pk)
⋮

Notice that if the sequence (2.3) converges to a particular number, then our attempt to determine the solution is successful; otherwise we fail to obtain a solution. In practice, the iterative process is stopped whenever the convergence criterion

| pk+1 − pk | ≤ ε               (2.4)

is satisfied, where ε is a small number on the order of 10⁻³ to 10⁻⁶.
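The iteration rule (2.2) and stopping test (2.4) fit together as in the following minimal Python sketch; the function name, tolerance, and iteration cap are illustrative choices, and the closing example applies the rule to the reformulation g(x) = 0.5x³ + 0.3 derived above:

```python
def fixed_point(g, p0, eps=1e-6, max_iter=100):
    """Iterate p_{k+1} = g(p_k) until |p_{k+1} - p_k| <= eps, as in (2.4)."""
    p = p0
    for _ in range(max_iter):
        p_next = g(p)
        if abs(p_next - p) <= eps:
            return p_next
        p = p_next
    raise RuntimeError("iteration did not converge within max_iter steps")

# Reformulation x = g(x) = 0.5*x**3 + 0.3 of 5x^3 - 10x + 3 = 0
g = lambda x: 0.5 * x**3 + 0.3
root = fixed_point(g, 0.1)
```

Starting from p0 = 0.1, the iterates approach a fixed point near 0.316, which is also a root of 5x³ − 10x + 3 = 0.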

Example 2.1

Discuss the following iteration:

p0 = 1 and pk+1 = 1.001pk for k = 0, 1, …

Solution:

Consider the first 100 terms obtained using the iteration rule:

p1 = 1.001p0 = 1.001(1.000000) = 1.001000,
p2 = 1.001p1 = 1.001(1.001000) = 1.002001,
p3 = 1.001p2 = 1.001(1.002001) = 1.003003,
⋮
p100 = 1.001p99 = 1.001(1.104012) = 1.105116.

You may continue the process, and it is easy to conclude that lim_{n→∞} pn = +∞. Indeed, the sequence {pk} is a numerical solution of the differential equation y′ = 0.001y, whose analytical solution is y(x) = exp(0.001x). Note that the 100th term p100 and the solution of the differential equation at x = 100, i.e. y(100), are approximately equal, that is,

p100 = 1.105116 ≈ 1.105171 = exp(0.1) = y(100).

We shall next concentrate on the conditions under which the function g(x) yields a convergent sequence {pk}.

Theorem 2.1 (Fixed Point)

Suppose that g(x) is a continuous function and {pk}_{k=0}^∞ is a sequence generated by the fixed-point iteration. If lim_{k→∞} pk = P, then P is a fixed point of g(x).

Theorem 2.2

Suppose that g ( x) ∈ C[a, b] .

(a) If the range of the mapping y = g(x) satisfies a ≤ y ≤ b for all a ≤ x ≤ b, then g(x) has a fixed point in [a, b].

(b) If g′(x) is defined over (a, b) and there exists a positive constant K < 1 such that |g′(x)| ≤ K < 1 for all x ∈ (a, b), then g(x) has a unique fixed point P in [a, b].

The following theorem can be used to determine whether the


iteration (2.3) above will yield a convergent or divergent
sequence.

Theorem 2.3 (Fixed Point Theorem)

Suppose that g(x) and g′(x) are continuous on an interval (a, b) = (P − δ, P + δ) that contains the unique fixed point P, and that the starting value p0 is chosen in this interval.

(a) If |g′(x)| ≤ K < 1 for all x ∈ (a, b), then the iteration pk = g(pk−1) will converge to P.

(b) If |g′(x)| > 1 for all x ∈ (a, b), then the iteration pk = g(pk−1) will not converge to P.

The following example illustrates the theorem above.

Example 2.2

Investigate the nature of the iteration (2.3) when the function g(x) = 1 + x − x²/4 is used.

Solution:

The fixed points can be determined by solving

P = 1 + P − P²/4.

We obtain two solutions, namely P = −2 and P = 2. Next, the derivative of the function is g′(x) = 1 − x/2. Therefore, we need to consider two cases as follows:

Case 1: P = −2

Starting with p0 = −2.05 , we obtain

p1 = 1 + (−2.05) − (−2.05)²/4 = −2.100625
p2 = −2.20378135
p3 = −2.41794441
⋮
lim_{k→∞} pk = −∞

Since |g′(x)| > 3/2 > 1 on [−3, −1], it follows from Theorem 2.3 that the sequence does not converge to P = −2.

Case 2: P = 2

Starting with p0 = 1.6 , we obtain

p1 = 1 + 1.6 − (1.6)²/4 = 1.96
p2 = 1.9996
p3 = 1.99999996
⋮
lim_{k→∞} pk = 2

Since |g′(x)| < 1/2 < 1 on [1, 3], it follows from Theorem 2.3 that the sequence converges to P = 2.
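Both cases of Example 2.2 can be checked numerically with a few lines of code; this is a minimal sketch, and the iterate count of 10 is an arbitrary choice:

```python
g = lambda x: 1 + x - x**2 / 4   # iteration function of Example 2.2

p = 1.6      # Case 2: |g'(x)| < 1 near P = 2, iterates converge
q = -2.05    # Case 1: |g'(x)| > 1 near P = -2, iterates drift away
for _ in range(10):
    p, q = g(p), g(q)
```

After ten iterations, p agrees with P = 2 to machine precision, while q has moved far away from P = −2.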

Notice that Theorem 2.3 says nothing about the case |g′(P)| = 1. Now, consider the following example.

Example 2.3

Investigate the nature of the iteration (2.3) for the case g(x) = 2(x − 1)^{1/2} for x ≥ 1.

Solution:

We determine the fixed points by finding the solutions of P = 2(P − 1)^{1/2}. This yields P = 2. Next, we obtain the derivative, g′(x) = 1/(x − 1)^{1/2}, giving g′(2) = 1. Hence Theorem 2.3 cannot be used. There are two cases to consider, depending on whether the starting value lies to the left or to the right of P = 2.

Case 1: Start with p0 = 1.5 , that is to the left of 2.

Therefore, we obtain

p1 = 1.41421356
p2 = 1.28718851
p3 = 1.07179943
p4 = 0.53590832
p5 = 2(−0.46409168)^{1/2}  (not defined)

Notice that p4 lies outside the domain of g; therefore the term p5 cannot be computed, as is clear from the values above.

Case 2: Start with p0 = 2.5 , that is to the right of 2.

Therefore, we obtain

p1 = 2.44948974
p2 = 2.40789513
p3 = 2.37309514
p4 = 2.34358284
⋮
lim_{k→∞} pk = 2

This sequence converges slowly to the value of P = 2 .

2.2 THE BISECTION METHOD


We shall construct a bracketing method to determine a zero of a continuous function. Suppose that we start with an initial interval [a, b] on which f(a) and f(b) have opposite signs. Since f(x) ∈ C[a, b], the function must cross the x-axis at some point x = r within the interval [a, b]. The bisection method systematically moves the endpoints of the interval closer together until we obtain an arbitrarily small interval that brackets the root of the equation.

The steps of bisecting the interval are as follows. First, choose the midpoint

c = (a + b)/2

and then analyze the three possibilities that might arise:

If f (a ) and f (c) are of opposite signs, a root exists in the interval [a, c].
(2.5)

If f (c) and f (b) are of opposite signs, a root exists in the interval [c, b].
(2.6)

If f (c) = 0, a root is found at x = c. (2.7)

In cases (2.5) and (2.6), we have found an interval half as wide as the original that contains the root; that is, we have bracketed the root in a smaller interval. To continue the process, we relabel the new interval that contains the root and repeat until the interval width is as small as desired. Thus we generate a sequence of nested intervals and their midpoints. The details of the process can be described as follows:

Set the starting interval as [a0, b0] and let c0 be its midpoint,

c0 = (a0 + b0)/2.

The second interval that brackets the root r is [a1, b1], with midpoint c1; the interval [a1, b1] is half as wide as the interval [a0, b0].

When we reach the n-th interval [an, bn], which brackets r and has midpoint cn, we construct the interval [an+1, bn+1] that also brackets r and is half as wide as [an, bn].

It is easy to show that the sequence of the left endpoints is


increasing and the sequence of the right endpoints is
decreasing; that is,

a0 ≤ a1 ≤ … ≤ an ≤ an+1 ≤ … ≤ r ≤ … ≤ bn+1 ≤ bn ≤ … ≤ b1 ≤ b0     (2.8)

where cn = (an + bn)/2 and either

[an+1, bn+1] = [an, cn] or [an+1, bn+1] = [cn, bn] for all n.     (2.9)

Next, we state the following theorem that can be used to


determine the location of a root of a function in a given
interval.

Theorem 2.4 (Bisection Theorem)

Assume that f(x) ∈ C[a, b] and that there exists a number r ∈ [a, b] such that f(r) = 0. If f(a) and f(b) are of opposite signs, and {cn}_{n=0}^∞ represents the sequence of midpoints generated by the bisection process of (2.8) and (2.9), then

|r − cn| ≤ (b − a)/2^{n+1} for n = 0, 1, …     (2.10)

and therefore the sequence {cn}_{n=0}^∞ converges to the root x = r, that is,

lim_{n→∞} cn = r.     (2.11)

Example 2.4

Given that h(x) = x sin(x), find the value of x in the interval [0, 2] satisfying h(x) = 1.

Solution:

We write h( x) = x sin( x) = 1. This results in

x sin( x) − 1 = 0 .

We are interested in finding the value of x that satisfies h(x) = 1. Therefore we have to obtain the solution of x sin(x) − 1 = 0, that is, the root of f(x) = x sin(x) − 1 in the interval [0, 2], using the bisection method. Starting with a0 = 0 and b0 = 2, we compute

f (0) = 0 × sin 0 − 1 = −1.000000 and f (2) = 0.818595 .

These two values have opposite signs; therefore a root exists in the interval [0, 2]. Next, we obtain the midpoint, c0 = 1. Note that f(1) = −0.158529. Therefore, the function changes sign in the interval [c0, b0] = [1, 2]. Continuing the process of halving the interval, we obtain the table below. Finally, we observe that the sequence {ck}_{k=0}^∞ converges to the value r ≈ 1.114157141.

 k   Left Endpoint a_k   Midpoint c_k   Right Endpoint b_k   Function Value f(c_k)
 0   0                   1.             2.                   -0.158529
 1   1.0                 1.5            2.0                   0.496242
 2   1.00                1.25           1.50                  0.186231
 3   1.000               1.125          1.250                 0.015051
 4   1.0000              1.0625         1.1250               -0.071827
 5   1.06250             1.09375        1.12500              -0.028362
 6   1.093750            1.109375       1.125000             -0.006643
 7   1.1093750           1.1171875      1.1250000             0.004208
 8   1.10937500          1.11328125     1.11718750           -0.001216
 ⋮

The number N of repeated bisections required to guarantee that the N-th midpoint cN approximates the root with error less than a predetermined value Δ can be shown to be

N = int( (ln(b − a) − ln Δ) / ln 2 ).     (2.12)
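The bisection process described above can be sketched as a short routine; the function and parameter names are illustrative choices, and the closing example reuses f(x) = x sin(x) − 1 from Example 2.4:

```python
import math

def bisect(f, a, b, delta=1e-9):
    """Bisection following cases (2.5)-(2.7); f(a), f(b) must differ in sign."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while (b - a) / 2 > delta:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:
            return c               # case (2.7): exact root found
        if fa * fc < 0:            # case (2.5): root lies in [a, c]
            b, fb = c, fc
        else:                      # case (2.6): root lies in [c, b]
            a, fa = c, fc
    return (a + b) / 2

f = lambda x: x * math.sin(x) - 1
r = bisect(f, 0, 2)                # Example 2.4: r is near 1.114157141
```

The loop condition mirrors bound (2.10): each pass halves the bracketing interval, so the midpoint error drops below delta after roughly the N bisections predicted by (2.12).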

2.3 THE METHOD OF FALSE POSITION


The bisection method converges at a fairly slow rate. The next algorithm, developed from a similar bracketing concept, has a faster rate of convergence. It is called the method of false position, or the regula falsi method. As before, we assume that f(a) and f(b) have opposite signs (why?). Whereas the bisection method uses the midpoint of the interval [a, b] as the next iterate, the method of false position chooses a better-placed point, as follows.

Join the points (a, f (a )) and (b, f (b)) with a straight line L that
cuts the x -axis at the point (c, 0) . To determine the value of c
we write two versions of the slope m of the straight line L .
They are

By using the points (a, f(a)) and (b, f(b)):

m = (f(b) − f(a)) / (b − a)     (2.13)

By using the points (c, 0) and (b, f(b)):

m = (f(b) − 0) / (b − c)     (2.14)

Equations (2.13) and (2.14) express the same slope, therefore

(f(b) − f(a)) / (b − a) = (f(b) − 0) / (b − c).

Thus, we obtain

c = b − f(b)(b − a) / (f(b) − f(a)).     (2.15)

As in the case of the bisection method, we also have three


possibilities in the case of the method of false position as
follows:

If f (a ) and f (c) are of opposite signs, a root exists in the interval [a, c].
(2.16)

If f (c) and f (b) are of opposite signs, a root exists in the interval [c, b].
(2.17)

If f (c) = 0, a root is found at x = c. (2.18)

The decision process implied by equations (2.16) and (2.17), together with equation (2.15), is used to construct a sequence of intervals {[an, bn]}, each of which brackets the root. At each step the approximation of the root r is given by

cn = bn − f(bn)(bn − an) / (f(bn) − f(an))     (2.19)

and it can be easily shown that the sequence {cn } converges


to r .

Example 2.5

Use the method of false position to approximate the root of


x sin( x) − 1 = 0 in the interval [0, 2] .

Solution:

Starting with a0 = 0 and b0 = 2, we compute

f(0) = −1.00000000 and f(2) = 0.81859485,

which have opposite signs. Therefore, a root exists in the interval [0, 2]. Using (2.19), we obtain

c0 = 2 − 0.81859485(2 − 0) / (0.81859485 − (−1)) = 1.09975017

and f(c0) = −0.02001921.

The function changes sign in the interval

[c0 , b0 ] = [1.09975017, 2] .
Therefore we narrow the interval from the left and set

a1 = c0 and b1 = b0 .

From (2.19), we obtain the approximation

c1 = 2 − 0.81859485(2 − 1.09975017) / (0.81859485 − (−0.02001921)) = 1.12124074

and f(c1) = 0.00983461.

Now f ( x) changes sign in the interval

[ a1 , c1 ] = [1.09975017, 1.12124074] .

Therefore, the next decision is to narrow the interval from the
right. Thus, we set

a2 = a1 and b2 = c1 .

The Table below gives the summary of the computation


process.

 k   Left Endpoint a_k   Point of Intersection c_k   Right Endpoint b_k   Function Value f(c_k)
 0   0.00000000          1.09975017                  2.00000000           -0.02001921
 1   1.09975017          1.12124074                  2.00000000            0.00983461
 2   1.09975017          1.11416120                  1.12124074            0.00000563
 3   1.09975017          1.11415714                  1.11416120            0.00000000

In the bisection method, the closeness of consecutive iterates is a reliable termination criterion, since the interval width is halved at every step. For the false position method this is not the case: one endpoint of the bracketing interval can remain fixed, so the interval width need not shrink to zero, and a width-based test could loop forever. Therefore, the closeness of consecutive iterates and the size of |f(cn)| are both used in the termination criterion of the false position method.
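The regula falsi iteration with the combined termination test just described can be sketched as follows; the names and tolerances are illustrative choices, and the example reuses f(x) = x sin(x) − 1 from Example 2.5:

```python
import math

def false_position(f, a, b, eps=1e-8, max_iter=100):
    """Regula falsi: stop when consecutive iterates are close AND |f(c)| is small."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c_old = a
    for _ in range(max_iter):
        c = b - fb * (b - a) / (fb - fa)          # equation (2.15)
        fc = f(c)
        if abs(c - c_old) <= eps and abs(fc) <= eps:
            return c
        if fa * fc < 0:                           # case (2.16): root in [a, c]
            b, fb = c, fc
        else:                                     # case (2.17): root in [c, b]
            a, fa = c, fc
        c_old = c
    return c

f = lambda x: x * math.sin(x) - 1
r = false_position(f, 0, 2)        # Example 2.5: r is near 1.11415714
```

Only the interval-update decision differs from the bisection sketch; the point c itself comes from the secant-line formula (2.15) rather than the midpoint.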

2.4 THE NEWTON-RAPHSON METHOD


Perhaps the most widely used of all root-locating techniques is the Newton-Raphson method (or simply Newton’s method).
Newton's method was discovered by Isaac Newton (1642-
1727) and published in Method of Fluxions in 1736. Although
the method was described by Joseph Raphson (1648-1715) in
Analysis Aequationum in 1690, the relevant sections of
Method of Fluxions were written earlier, in 1671.

If f ( x) , f ′( x) and f ′′( x) are continuous near a root p , then
this additional information can be used to develop algorithms
that will generate sequences { pk } that converge to p faster
than either the bisection or the false position methods. The
Newton-Raphson method is an algorithm that depends on the
continuity of f ′( x) and f ′′( x) . We shall introduce the method
graphically.

Suppose that the starting approximation p0 is near the root p


as depicted in the Figure below:

The graph y = f ( x) will intersect the x -axis at the point ( p, 0)


and the point ( p0 , f ( p0 )) lies on the curve near the point ( p, 0) .
Define p1 as the point of intersection of the tangent line to the curve at the point (p0, f(p0)) with the x-axis. The point p1 is closer to p than p0 is. We can now obtain the equation that relates p1 and p0. The gradient m of the tangent line L can be written in two versions as follows:

The gradient of the tangent line through the points (p1, 0) and (p0, f(p0)):

m = (0 − f(p0)) / (p1 − p0)     (2.20)

The gradient of the tangent at the point ( p0 , f ( p0 ) ) :

m = f ′( p0 ) (2.21)

Equating the two expressions (2.20) and (2.21) and solving for p1, we obtain

p1 = p0 − f(p0) / f′(p0).     (2.22)

By repeating the process, we finally obtain a sequence { pk }


that converges to p . If you want a ‘visual’ derivation of the
Newton’s formula please visit the following website:

http://archives.math.utk.edu/visual.calculus/3/newton.5/

We summarize the results in the following theorem.

Theorem 2.5 (Newton-Raphson Theorem)

Suppose that f ∈ C²[a, b] and there exists a number p ∈ [a, b] with f(p) = 0. If f′(p) ≠ 0, then there exists a number δ > 0 such that the sequence {pk}_{k=0}^∞ defined by the iteration

pk = pk−1 − f(pk−1) / f′(pk−1) for k = 1, 2, …     (2.23)

converges to p for any starting approximation p0 ∈ [p − δ, p + δ].

Remark: The function g(x) defined by

g(x) = x − f(x) / f′(x)     (2.24)

is called the Newton-Raphson iteration function. Since f(p) = 0, it is clear that g(p) = p. Thus the Newton-Raphson iteration for finding a root of f(x) = 0 can be implemented by finding a fixed point of the equation g(x) = x.
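The iteration (2.23) can be sketched directly in code; the guard against a zero derivative and the tolerance are illustrative choices, and the closing example is Exercise 4 at the end of this chapter:

```python
def newton(f, df, p0, eps=1e-10, max_iter=50):
    """Newton-Raphson iteration: p_k = p_{k-1} - f(p_{k-1}) / f'(p_{k-1})."""
    p = p0
    for _ in range(max_iter):
        dfp = df(p)
        if dfp == 0:
            raise ZeroDivisionError("f'(p) = 0; choose a different start")
        p_next = p - f(p) / dfp
        if abs(p_next - p) <= eps:
            return p_next
        p = p_next
    return p

# Root of f(x) = x^3 + 4x^2 - 10 starting from p0 = 1 (Exercise 4)
root = newton(lambda x: x**3 + 4 * x**2 - 10,
              lambda x: 3 * x**2 + 8 * x, 1.0)
```

The derivative is passed explicitly because, unlike the bracketing methods, Newton's method needs f′ as well as f at every step.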

Corollary 2.1 (Newton Iteration for Finding Square Roots)

Suppose that A > 0 is a real number and let p0 > 0 be an initial approximation to √A. Define the sequence {pk}_{k=0}^∞ using the recursive rule

pk = (pk−1 + A/pk−1) / 2 for k = 1, 2, …     (2.25)

Then the sequence {pk}_{k=0}^∞ converges to √A, that is, lim_{k→∞} pk = √A.

Example 2.6

Use Newton’s square-root algorithm to determine √5.

Solution:

In solving the problem, we start with the function f(x) = x² − A. Observe that the roots of the equation x² − A = 0 are ±√A. Next, use f(x) and its derivative f′(x) in equation (2.23) and write the Newton-Raphson iteration function

g(x) = x − f(x)/f′(x) = x − (x² − A)/(2x).     (2.26)

After simplifying equation (2.26), we obtain

g(x) = (x + A/x) / 2.     (2.27)

Now we can use equation (2.27) to solve our problem of determining √5. Start with p0 = 2. Using equation (2.27), we have

p1 = (2 + 5/2)/2 = 2.25
p2 = (2.25 + 5/2.25)/2 = 2.236111111
p3 = (2.236111111 + 5/2.236111111)/2 = 2.236067978
p4 = (2.236067978 + 5/2.236067978)/2 = 2.236067978

If we continue for k > 4, we will obtain pk = 2.236067978. Hence, we conclude that convergence accurate to nine significant digits has been obtained.
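The square-root rule (2.25) is short enough to state directly in code; this sketch (names and tolerance are illustrative) reproduces the √5 computation of Example 2.6:

```python
def newton_sqrt(A, p0, eps=1e-12, max_iter=100):
    """Iteration (2.25): p_k = (p_{k-1} + A / p_{k-1}) / 2, converging to sqrt(A)."""
    p = p0
    for _ in range(max_iter):
        p_next = (p + A / p) / 2
        if abs(p_next - p) <= eps:
            return p_next
        p = p_next
    return p

r5 = newton_sqrt(5, 2.0)   # Example 2.6: converges to 2.2360679...
```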

Let us consider equation (2.24). What happens if there is division by zero? It occurs if f′(pk−1) = 0. There is also the possibility that f′(pk−1) is extremely small while pk−1 is already an acceptable approximation to the root. This is our next point of discussion, where we investigate the rate of convergence of an iteration.

We shall first consider the following definition.

Definition 2.3 (Order of a Root)

Suppose that f(x) and its derivatives f′(x), f″(x), …, f^{(M)}(x) are defined and continuous on an interval around the point x = p. We say that f(x) = 0 has a root of order M at x = p if and only if

f(p) = 0, f′(p) = 0, f″(p) = 0, …, f^{(M−1)}(p) = 0 and f^{(M)}(p) ≠ 0.     (2.28)

A root of order M = 1 is called a simple root; if M > 1, it is called a multiple root. A root of order M = 2 is sometimes called a double root.

Lemma 2.1

If the equation f ( x) = 0 has a root of order M at x = p , then


there exists a continuous function h( x) so that f ( x) can be
expressed as the product

f(x) = (x − p)^M h(x), where h(p) ≠ 0.     (2.29)

Example 2.7

Consider the function f(x) = x³ − 3x + 2. Determine the order of its roots.

Solution:

The derivatives of the function can be written as

f ′( x) = 3x 2 − 3 and f ′′( x) = 6 x

The function can be written in factorized form as

f ( x) = ( x + 2)( x − 1) 2

Next, its roots can be determined from f ( p) = 0 , so that

f ( p) = ( p + 2)( p − 1) 2 = 0

Hence, p = −2 and p = 1 are the roots.

At p = −2, we find that f(−2) = 0 and f′(−2) = 9 ≠ 0; hence M = 1 by Definition 2.3. Therefore, we say that p = −2 is a simple root.

Next, at p = 1 we find that f(1) = 0, f′(1) = 0 and f″(1) = 6 ≠ 0. Hence, from Definition 2.3, M = 2. Therefore, we say that p = 1 is a double root.

Next, we consider the convergence characteristics of these root-finding methods. If p is a simple root of f(x) = 0, the Newton-Raphson method converges rapidly, and the number of accurate decimal places approximately doubles with each iteration. On the other hand, if p is a multiple root, the error in each successive approximation is only a fraction of the previous error. We can therefore use the order of convergence of a sequence as a measure of how rapidly it converges.

Definition 2.4 (Order of Convergence)

Suppose that {pn}_{n=0}^∞ converges to p, and set en = p − pn for n ≥ 0. If there exist two positive constants A and R such that

lim_{n→∞} |p − pn+1| / |p − pn|^R = lim_{n→∞} |en+1| / |en|^R = A     (2.30)

then the sequence is said to converge to p with order of convergence R. The number A is called the asymptotic error constant. The special cases R = 1, 2 are named as follows:

If R = 1, the convergence of the sequence {pn}_{n=0}^∞ is called linear.     (2.31)

If R = 2, the convergence of the sequence {pn}_{n=0}^∞ is called quadratic.     (2.32)

If R is large, the sequence {pk} converges rapidly to p; that is, relation (2.30) implies that for large values of n the error satisfies |en+1| ≈ A |en|^R.

Example 2.8 (Quadratic Convergence at a Simple Root)

Use the Newton-Raphson iteration to obtain the simple root p = −2 of the polynomial f(x) = x³ − 3x + 2. Start with an initial value of p0 = −2.4.

Solution:

Use the iteration formula

pk = g(pk−1) = pk−1 − f(pk−1) / f′(pk−1).

After substituting the function and its derivative, we obtain the iteration formula for computing the sequence {pk} as

pk = g(pk−1) = pk−1 − (pk−1³ − 3pk−1 + 2) / (3pk−1² − 3)

or

pk = (2pk−1³ − 2) / (3pk−1² − 3).     (2.33)

Next, we obtain the data in the Table below which was


calculated in MS Excel:

 k   p_k            |p_{k+1} − p_k|   |e_k| = |p − p_k|   |e_{k+1}| / |e_k|²
 0   -2.400000000   0.323809524       0.400000000         0.476190476
 1   -2.076190476   0.072594466       0.076190476         0.619469027
 2   -2.003596011   0.003587421       0.003596011         0.664277916
 3   -2.000008590   0.000008590       0.000008590         0.666660828
 4   -2.000000000

If we investigate the rate of convergence in the example above, we notice that the error in each successive iteration is proportional to the square of the error in the previous iteration. That is,

|p − pk+1| ≈ A |p − pk|²

where A = 2/3 ≈ 0.66666667.
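The ratio column of the table can be recomputed in a few lines; this sketch (loop length is an arbitrary choice) checks that |e_{k+1}|/|e_k|² approaches A = 2/3 for Example 2.8:

```python
f = lambda x: x**3 - 3 * x + 2
df = lambda x: 3 * x**2 - 3

# Newton iterates toward the simple root p = -2, starting from p0 = -2.4
p, errs = -2.4, []
for _ in range(4):
    errs.append(abs(-2 - p))
    p = p - f(p) / df(p)

# Successive ratios |e_{k+1}| / |e_k|^2 should approach A = 2/3
ratios = [errs[k + 1] / errs[k] ** 2 for k in range(len(errs) - 1)]
```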

Example 2.9 (Linear Convergence at a Double Root)

Use the Newton-Raphson method to obtain the double root


p = 1 of the polynomial f ( x) = x3 − 3x + 2 . Start with an initial
value of p0 = 1.2 .

Solution:

Using (2.33), we obtain the results as tabulated below:

 k   p_k           p_{k+1} − p_k   e_k = p − p_k   |e_{k+1}| / |e_k|
 0   1.200000000   -0.096969697    -0.200000000    0.515151515
 1   1.103030303   -0.050673883    -0.103030303    0.508165253
 2   1.052356420   -0.025955609    -0.052356420    0.496751115
 3   1.026400811   -0.013143081    -0.026400811    0.509753688
 4   1.013257730   -0.006614311    -0.013257730    0.501097775
 5   1.006643419   -0.003318055    -0.006643419    0.500550093
 ⋮

Notice that the Newton-Raphson method converges to the double root, but at a slow rate. The values of f(pk) in the example above approach zero faster than those of f′(pk). Hence, the quotient f(pk)/f′(pk) in equation (2.24) is defined whenever pk ≠ p. The sequence converges linearly, and its error decreases by a factor of approximately 1/2 with each successive iteration.

Theorem 2.6 (Convergence Rate for Newton-Raphson Iteration)

Suppose that the Newton-Raphson method produces a sequence {pk}_{k=0}^∞ that converges to a root p of a function f(x).

If p is a simple root, convergence is quadratic and

|en+1| ≈ (1/2) |f″(p)/f′(p)| |en|² for n sufficiently large.     (2.34)

If p is a multiple root of order M, convergence is linear and

|en+1| ≈ ((M − 1)/M) |en| for n sufficiently large.     (2.35)

The theorem above can be used to determine the asymptotic error constant A. As an illustration, consider Example 2.8; we can obtain A from the computation

A = (1/2) |f″(−2)/f′(−2)| = (1/2)(12/9) = 2/3.

In the following theorem, we state a modification of the Newton-Raphson iteration that yields quadratic convergence at a multiple root.

Theorem 2.7 (Acceleration of Newton-Raphson Iteration)

Suppose that the Newton-Raphson algorithm yields a sequence that converges linearly to a root x = p of order M > 1. Then the modified Newton-Raphson formula given by

pk = pk−1 − M f(pk−1) / f′(pk−1), k = 1, 2, …     (2.36)

will generate a sequence {pk}_{k=0}^∞ that converges quadratically to p.

Example 2.10 (Accelerated Convergence at a Multiple Root)

Use the accelerated Newton-Raphson iteration (2.36) to


determine the double root p = 1.0 of f ( x) = x3 − 3x + 2 . Start
with an initial value of p0 = 1.2 .

Solution:

Assume M = 2 and use the formula given in equation (2.36); we obtain

pk = pk−1 − 2 f(pk−1)/f′(pk−1) = (pk−1³ + 3pk−1 − 4) / (3pk−1² − 3).

The results are given below:

 k   p_k           p_{k+1} − p_k   e_k = p − p_k   |e_{k+1}| / |e_k|²
 0   1.200000000   -0.193939394    -0.200000000    0.151515150
 1   1.006060606   -0.006054519    -0.006060606    0.165718578
 2   1.000006087   -0.000006087    -0.000006087
 3   1.000000000   -0.000000000    -0.000000000
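A sketch of the accelerated iteration (2.36); the residual-based stopping test (on |f(p)| rather than on consecutive iterates) is an illustrative choice here, made because both f and f′ vanish at a multiple root and a late division by a nearly zero derivative is best avoided:

```python
def newton_accel(f, df, p0, M, tol=1e-9, max_iter=50):
    """Modified Newton iteration (2.36) for a root of known order M."""
    p = p0
    for _ in range(max_iter):
        fp = f(p)
        if abs(fp) <= tol:       # stop once the residual is negligible
            return p
        p = p - M * fp / df(p)
    return p

# Double root p = 1 of f(x) = x^3 - 3x + 2 with M = 2 (Example 2.10)
root = newton_accel(lambda x: x**3 - 3 * x + 2,
                    lambda x: 3 * x**2 - 3, 1.2, M=2)
```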

Next, we consider several pitfalls that may be encountered when the Newton-Raphson method is used to determine the root of a function. The first is the division-by-zero error. This can be handled easily by designing an algorithm that checks for a vanishing derivative before dividing; by avoiding the division by zero, the computation can be continued.

Another problem occurs when the initial approximation p0 is far from the desired root and the sequence {pk} converges to some other root. This usually happens when the slope f′(x) is small and the tangent line to the curve y = f(x) is almost horizontal. As an example, take the function y = cos x and try to find the root p = π/2. If we take the initial approximation p0 = 3, then (2.23) gives

p1 = −4.01525255, p2 = −4.85265757, …

which produces a sequence {pk} converging to −3π/2 ≈ −4.71238898, i.e. another root of y = cos x.

Next, consider a function that is positive and monotonically decreasing on an unbounded interval [a, ∞), with p0 > a; then the sequence {pk} might diverge to +∞. For example, using the starting value p0 = 2 in (2.23) for the function y = x e^{−x}, we obtain

p1 = 4.0, p2 ≈ 5.33333, …, p15 ≈ 19.723549, …

which slowly diverges to +∞.

Another case occurs when the terms in the sequence {pk} tend to repeat. For example, using the initial approximation p0 = 0 (which is far from the actual root p = 1.671699881) in (2.23) for the function f(x) = x³ − x − 3, we obtain the sequence

p1 = −3.000000, p2 = −1.961538, p3 = −1.147176, p4 = 0.006579,
p5 = −3.000389, p6 = −1.961818, p7 = −1.147430, …

which tends to cycle, with pk+4 ≈ pk for k = 1, 2, 3, …. If instead we take the initial approximation p0 near the actual root, then the sequence {pk} converges.

Another problem may arise when |g′(x)| > 1 on an interval containing the root p; there is then the possibility of an oscillating divergent sequence {pk}. Take for example the function f(x) = arctan(x). Using p0 = 1.45 in (2.23) yields

p0 = 1.45, p1 = −1.550263297, p2 = 1.845931751, p3 = −2.889109054, …

which diverges in an oscillatory manner.
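The oscillating divergence for f(x) = arctan(x) is easy to reproduce; this is a minimal sketch (the three-step loop and the list name `history` are arbitrary choices):

```python
import math

# Newton step (2.23) for f(x) = arctan(x), with f'(x) = 1 / (1 + x^2)
p = 1.45
history = [p]
for _ in range(3):
    p = p - math.atan(p) * (1 + p**2)
    history.append(p)
```

The iterates alternate in sign with growing magnitude, matching the sequence listed above.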

To overcome such difficulties, we can choose an initial approximation very close to the desired root, which guarantees a convergent sequence. Thus we need prior knowledge of the location of the desired root in order to set its initial approximation. This is not very difficult, since we can use a bracketing method to estimate or determine the location of the root within a predetermined interval. If knowledge of the behavior of the function, or a graph, is available, then an initial approximate root is easy to guess.

2.5 SECANT METHOD


Notice that the Newton-Raphson algorithm requires two function evaluations, f(pk−1) and f′(pk−1), in each iteration. This is not very encouraging if the function involved is intricate and its derivative is difficult to obtain. Hence, we would like a simpler algorithm that does not require the derivative.

We shall now discuss the secant method, which meets this criterion. It needs only one new function evaluation per iteration, and at a simple root its order of convergence is R ≈ 1.618033989. This is comparable with the Newton-Raphson method, whose order is 2.

The formula involved in the secant method is the same as the


one used in the method of false position, except that the
logical decisions regarding the definition of each succeeding
term are different.

In the secant method, we need two initial points ( p0 , f ( p0 ))


and ( p1 , f ( p1 )) that are close to the point ( p, 0) . See the
Figure below:

Define p2 as the intersection point of the x-axis and the straight line passing through these two points. Then p2 is nearer to p than either p1 or p0 is. The equation relating p2, p1 and p0 can be derived by considering the slope

m = (f(p1) − f(p0)) / (p1 − p0) = (0 − f(p1)) / (p2 − p1).     (2.37)

The two expressions for m in equation (2.37) are the slope of the secant line through the first two approximations and the slope of the line through the points (p1, f(p1)) and (p2, 0), respectively. Solving for p2 = g(p1, p0), we obtain

p2 = g(p1, p0) = p1 − f(p1)(p1 − p0) / (f(p1) − f(p0)).     (2.38)

We can generalize equation (2.38) to obtain

pk+1 = g(pk, pk−1) = pk − f(pk)(pk − pk−1) / (f(pk) − f(pk−1)) for k = 1, 2, …     (2.39)

Equation (2.39) gives the two-point iteration formula for the


secant method.
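The two-point formula (2.39) translates directly into code; the names and tolerance are illustrative choices, and the closing example reuses Example 2.11:

```python
def secant(f, p0, p1, eps=1e-10, max_iter=50):
    """Secant iteration (2.39): two-point formula, no derivative needed."""
    f0, f1 = f(p0), f(p1)
    for _ in range(max_iter):
        p2 = p1 - f1 * (p1 - p0) / (f1 - f0)
        if abs(p2 - p1) <= eps:
            return p2
        p0, f0 = p1, f1
        p1, f1 = p2, f(p2)
    return p1

# Simple root p = -2 of f(x) = x^3 - 3x + 2 (Example 2.11)
root = secant(lambda x: x**3 - 3 * x + 2, -2.6, -2.4)
```

Note how each pass reuses the previous function value f1 as the new f0, so only one fresh evaluation of f is needed per iteration.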

Example 2.11 (Secant Method at a Simple Root)

Use the Secant method (2.39) to obtain the root p = −2 of


f ( x) = x3 − 3x + 2 . Start with initial values p0 = −2.6 and
p1 = −2.4 .

Solution:

Using the secant iteration formula (2.39) above, we have

pk+1 = g(pk, pk−1) = pk − (pk³ − 3pk + 2)(pk − pk−1) / (pk³ − pk−1³ − 3pk + 3pk−1).     (2.40)

After some algebraic manipulation of equation (2.40), we obtain

pk+1 = g(pk, pk−1) = (pk² pk−1 + pk pk−1² − 2) / (pk² + pk pk−1 + pk−1² − 3).     (2.41)

Hence, the sequence of iterates is tabulated below:

 k   p_k            |p_{k+1} − p_k|   |e_k| = |p − p_k|   |e_{k+1}| / |e_k|^{1.618}
 0   -2.600000000   0.200000000       0.600000000         0.914152831
 1   -2.400000000   0.293401015       0.400000000         0.469497765
 2   -2.106598985   0.083957573       0.106598985         0.847290012
 3   -2.022641412   0.021130314       0.022641412         0.693608922
 4   -2.001511098   0.001488561       0.001511098         0.825841116
 5   -2.000022537   0.000022515       0.000022537         0.727100987
 6   -2.000000022   0.000000022       0.000000022
 7   -2.000000000   0.000000000       0.000000000

Next, we observe the relationship between the secant method and Newton's method. For a polynomial function f(x), the two-point secant formula g(pk, pk−1) reduces to the one-point Newton formula if pk is replaced by pk−1. In fact, if pk is replaced by pk−1 in equation (2.41), then the right-hand side becomes the same as the right-hand side of equation (2.33).

SUMMARY
In this chapter you have studied some numerical techniques
of approximating a root of nonlinear equations of a single
variable. The various methods have their own advantages and disadvantages; in real applications we might, for example, employ two techniques in succession, using a bracketing method to locate a root roughly and a faster method to refine it.

EXERCISES

1. Use fixed-point iteration to find a fixed-point of


x = 0.5 x3 + 0.3 . Take p0 = 0.1 as the initial approximation
and compute p1 , p2 , p3 .

2. Use the bisection method to find the zero of f ( x) = x 2 − 3


on the interval [1, 2] .

3. Use the method of false position to find the zero of


f ( x) = x3 − 2 on the interval [1, 2] .

4. Use the Newton-Raphson method to find the zero of


f ( x) = x3 + 4 x 2 − 10 . Let p0 = 1 .

5. Use the secant method to estimate the root of


f ( x) = e− x − x . Start with p0 = 0 and p1 = 1.

ANSWERS

1. p1 = 0.3005, p2 = 0.31357, p3 = 0.31542

2. The results of five iterations are tabulated below:

 k   Left Endpoint a_k   Midpoint c_k   Right Endpoint b_k   Function Value f(c_k)
 0   1.0000              1.5000         2.0000               -0.7500
 1   1.5000              1.7500         2.0000                0.0625
 2   1.5000              1.6250         1.7500               -0.3594
 3   1.6250              1.6875         1.7500               -0.1523
 4   1.6875              1.7188         1.7500               -0.0459

3. The results of five iterations are tabulated below:

 k   Left Endpoint a_k   Point of Intersection c_k   Right Endpoint b_k   Function Value f(c_k)
 0   1.0000              1.1429                      2                    -0.50729
 1   1.1429              1.2097                      2                    -0.22986
 2   1.2097              1.2388                      2                    -0.098736
 3   1.2388              1.2512                      2                    -0.041433
 4   1.2512              1.2563                      2                    -0.017216

4. The results of five iterations are tabulated below:

k pk f ( pk )
0 1.4545454545 1.5401953418
1 1.3689004011 0.0607196886
2 1.3652366002 0.0001087706
3 1.3652300134 0.0000000004

5. The results of four iterations are tabulated below:

k pk f ( pk )
0 0 1.0000
1 1 -0.63212
2 0.61270 -0.07081
3 0.56384 0.00518
4 0.56717 -0.00004


