
RMIT

Geospatial Science

COMBINED LEAST SQUARES: THE GENERAL LEAST SQUARES ADJUSTMENT TECHNIQUE

A common treatment of the least squares technique of estimation starts with simple linear mathematical models having observations (or measurements) as explicit functions of parameters, with non-linear models developed as extensions. This adjustment technique is generally described as adjustment of indirect observations (also called parametric least squares). Cases where the mathematical models contain only measurements are usually treated separately and this technique is often described as adjustment of observations only (also called condition equations). Both techniques are of course particular cases of a general adjustment model, sometimes called Combined Least Squares, the solution of which is set out below. The general adjustment technique also assumes that the parameters, if any, can be treated as "observables", i.e., they have an a priori covariance matrix. This concept allows the general technique to be adapted to sequential processing of data where parameters are updated by the addition of new observations.

In general, least squares solutions require iteration, since a non-linear model is assumed. The iterative process is explained below. In addition, a proper treatment of covariance propagation is presented and cofactor matrices given for all the computed and derived quantities in the adjustment process. Finally, the particular cases of the general least squares technique are described.

The Combined Least Squares Adjustment Model

Consider the following set of non-linear equations representing the mathematical model in an adjustment

\[ F(\hat{l}, \hat{x}) = 0 \tag{1} \]

where $l$ is a vector of $n$ observations and $x$ is a vector of $u$ parameters; $\hat{l}$ and $\hat{x}$ refer to estimates derived from the least squares process such that

\[ \hat{l} = l + v \tag{2} \]

\[ \hat{x} = x + \delta x \tag{3} \]


where $v$ is a vector of residuals or small corrections and $\delta x$ is a vector of small corrections. As is usual, the independent observations $l$ have an a priori diagonal cofactor matrix $Q_{ll}$ containing estimates of the variances of the observations, and in this general adjustment, the parameters $x$ are treated as "observables" with a full a priori cofactor matrix $Q_{xx}$. The diagonal elements of $Q_{xx}$ contain estimates of variances of the parameters and the off-diagonal elements contain estimates of the covariances between parameters. Cofactor matrices $Q_{ll}$ and $Q_{xx}$ are related to the covariance matrices $\Sigma_{ll}$ and $\Sigma_{xx}$ by the variance factor $\sigma_0^2$

\[ \Sigma_{ll} = \sigma_0^2 Q_{ll} \tag{4} \]

\[ \Sigma_{xx} = \sigma_0^2 Q_{xx} \tag{5} \]

Also, weight matrices $W$ are useful and are defined, in general, as the inverse of the cofactor matrices

\[ W = Q^{-1} \tag{6} \]

and covariance, cofactor and weight matrices are all symmetric, hence $Q^T = Q$ and $W^T = W$, where the superscript $T$ denotes the transpose of the matrix.

Note also that in this development, where $Q$ and $W$ are written without subscripts they refer to the observations, i.e., $Q_{ll} = Q$ and $W_{ll} = W$.

Linearizing (1) using Taylor's theorem and ignoring 2nd and higher order terms gives

\[ F(\hat{l}, \hat{x}) = F(l, x) + \left.\frac{\partial F}{\partial \hat{l}}\right|_{l,x} (\hat{l} - l) + \left.\frac{\partial F}{\partial \hat{x}}\right|_{l,x} (\hat{x} - x) = 0 \tag{7} \]

and with $v = \hat{l} - l$ and $\delta x = \hat{x} - x$ from (2) and (3), we may write the linearized model in symbolic form as

\[ A v + B\,\delta x = f \tag{8} \]

Equation (8) represents a system of $m$ equations that will be used to estimate the $u$ parameters from $n$ observations. It is assumed that this is a redundant system where

\[ n \ge m \ge u \tag{9} \]

and

\[ r = m - u \tag{10} \]

is the redundancy or degrees of freedom.

In equation (8) the coefficient matrices $A$ and $B$ are design matrices containing partial derivatives of the function evaluated using the observations $l$ and the "observed" parameters $x$.

\[ A_{m,n} = \left.\frac{\partial F}{\partial \hat{l}}\right|_{l,x} \tag{11} \]

\[ B_{m,u} = \left.\frac{\partial F}{\partial \hat{x}}\right|_{l,x} \tag{12} \]

The vector $f$ contains $m$ numeric terms calculated from the functional model using $l$ and $x$.

\[ f_{m,1} = -\{F(l, x)\} \tag{13} \]
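As a minimal sketch of (11) to (13), assume a hypothetical circle-fitting model in which each observed point $(X_i, Y_i)$ must satisfy $F_i = (X_i - x_c)^2 + (Y_i - y_c)^2 - R^2 = 0$. The observations are the $n = 2m$ coordinates and the parameters are $(x_c, y_c, R)$, so $u = 3$. The model, the function name `linearize_circle` and the sample numbers are illustrative assumptions and do not come from these notes.

```python
import numpy as np

def linearize_circle(l, x):
    """Return A (m x n), B (m x u) and f (m,) for the assumed circle model
    F_i = (X_i - xc)^2 + (Y_i - yc)^2 - R^2 = 0,
    evaluated at l = [X1, Y1, X2, Y2, ...] and x = [xc, yc, R]."""
    pts = np.asarray(l, dtype=float).reshape(-1, 2)
    xc, yc, R = x
    m = pts.shape[0]
    dX = pts[:, 0] - xc
    dY = pts[:, 1] - yc
    idx = np.arange(m)
    A = np.zeros((m, 2 * m))
    A[idx, 2 * idx] = 2 * dX        # dF_i/dX_i, equation (11)
    A[idx, 2 * idx + 1] = 2 * dY    # dF_i/dY_i, equation (11)
    # dF_i/dxc, dF_i/dyc, dF_i/dR, equation (12)
    B = np.column_stack([-2 * dX, -2 * dY, np.full(m, -2.0 * R)])
    f = -(dX**2 + dY**2 - R**2)     # equation (13)
    return A, B, f

# Four points on the unit circle and a slightly wrong a priori centre
l = [1.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0, -1.0]
x = [0.1, 0.0, 1.0]
A, B, f = linearize_circle(l, x)
print(A.shape, B.shape, f)
```

Here $m = 4$, $n = 8$ and $u = 3$, so the example satisfies (9) with redundancy $r = 1$.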

The Least Squares Solution of the Combined Model

The least squares solution of (8), i.e., the solution which makes the sums of the squares of the weighted residuals a minimum, is obtained by minimizing the scalar function

\[ \varphi = v^T W v + \delta x^T W_{xx}\,\delta x - 2k^T (A v + B\,\delta x - f) \tag{14} \]

where $k$ is a vector of $m$ Lagrange multipliers. $\varphi$ is a minimum when its derivatives with respect to $v$ and $\delta x$ are equated to zero, i.e.

\[ \frac{\partial \varphi}{\partial v} = 2v^T W - 2k^T A = 0^T \]

\[ \frac{\partial \varphi}{\partial\,\delta x} = 2\,\delta x^T W_{xx} - 2k^T B = 0^T \]

These equations can be simplified by dividing both sides by two, transposing and changing signs to give

\[ -W v + A^T k = 0 \tag{15} \]

\[ -W_{xx}\,\delta x + B^T k = 0 \tag{16} \]

Equations (15) and (16) can be combined with (8) and arranged in matrix form as

\[ \begin{bmatrix} -W & A^T & 0 \\ A & 0 & B \\ 0 & B^T & -W_{xx} \end{bmatrix} \begin{bmatrix} v \\ k \\ \delta x \end{bmatrix} = \begin{bmatrix} 0 \\ f \\ 0 \end{bmatrix} \tag{17} \]

Equation (17) can be solved by the following reduction process given by Cross (1992, pp. 22-23). Consider the partitioned matrix equation $P y = u$ given as

\[ \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \tag{18} \]

which can be expanded to give

\[ P_{11} y_1 + P_{12} y_2 = u_1 \]

or

\[ y_1 = P_{11}^{-1} (u_1 - P_{12} y_2) \tag{19} \]

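Eliminating $y_1$ by substituting (19) into (18) gives

\[ \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} P_{11}^{-1} (u_1 - P_{12} y_2) \\ y_2 \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \]

Expanding the matrix equation gives

\[ P_{21} P_{11}^{-1} (u_1 - P_{12} y_2) + P_{22} y_2 = u_2 \]

\[ P_{21} P_{11}^{-1} u_1 - P_{21} P_{11}^{-1} P_{12} y_2 + P_{22} y_2 = u_2 \]

and an expression for $y_2$ is given by

\[ (P_{22} - P_{21} P_{11}^{-1} P_{12})\, y_2 = u_2 - P_{21} P_{11}^{-1} u_1 \tag{20} \]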
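The reduction in (19) and (20) can be checked numerically. The following is a small sketch, assuming an arbitrary symmetric positive-definite test matrix partitioned into blocks of size 2 and 3; the matrix and right-hand side are randomly generated and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical symmetric positive-definite test matrix, partitioned 2 + 3
M = rng.standard_normal((5, 5))
P = M @ M.T + 5 * np.eye(5)
u = rng.standard_normal(5)

P11, P12 = P[:2, :2], P[:2, 2:]
P21, P22 = P[2:, :2], P[2:, 2:]
u1, u2 = u[:2], u[2:]

# Equation (20): reduced system for y2
S = P22 - P21 @ np.linalg.solve(P11, P12)
y2 = np.linalg.solve(S, u2 - P21 @ np.linalg.solve(P11, u1))
# Equation (19): back-substitution for y1
y1 = np.linalg.solve(P11, u1 - P12 @ y2)

# Solving the full system directly should give the same vector
y_direct = np.linalg.solve(P, u)
print(np.allclose(np.concatenate([y1, y2]), y_direct))
```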

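Now partitioning (17) in the same way as (18)

\[ \left[\begin{array}{c|cc} -W & A^T & 0 \\ \hline A & 0 & B \\ 0 & B^T & -W_{xx} \end{array}\right] \left[\begin{array}{c} v \\ \hline k \\ \delta x \end{array}\right] = \left[\begin{array}{c} 0 \\ \hline f \\ 0 \end{array}\right] \tag{21} \]

then eliminating $v$ by applying (20) gives

\[ \left( \begin{bmatrix} 0 & B \\ B^T & -W_{xx} \end{bmatrix} - \begin{bmatrix} A \\ 0 \end{bmatrix} (-W)^{-1} \begin{bmatrix} A^T & 0 \end{bmatrix} \right) \begin{bmatrix} k \\ \delta x \end{bmatrix} = \begin{bmatrix} f \\ 0 \end{bmatrix} - \begin{bmatrix} A \\ 0 \end{bmatrix} (-W)^{-1} \begin{bmatrix} 0 \end{bmatrix} \]

Remembering that $Q = W^{-1}$ the equation can be simplified as

\[ \begin{bmatrix} AQA^T & B \\ B^T & -W_{xx} \end{bmatrix} \begin{bmatrix} k \\ \delta x \end{bmatrix} = \begin{bmatrix} f \\ 0 \end{bmatrix} \tag{22} \]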

Again, applying (20) to the partitioned equation (22) gives

\[ \left( -W_{xx} - B^T (AQA^T)^{-1} B \right) \delta x = 0 - B^T (AQA^T)^{-1} f \]

and re-arranging gives the normal equations

\[ \left( B^T (AQA^T)^{-1} B + W_{xx} \right) \delta x = B^T (AQA^T)^{-1} f \tag{23} \]

Mikhail (1976, p. 114) simplifies (23) by introducing equivalent observations $l_e$ where

\[ l_e = A\,l \tag{24} \]

Applying the matrix rule for cofactor propagation (Mikhail 1976, pp. 76-79) gives the cofactor matrix of the equivalent observations as

\[ Q_e = AQA^T \tag{25} \]

With the usual relationship between weight matrices and cofactor matrices, see (6), we may write

\[ W_e = Q_e^{-1} = (AQA^T)^{-1} \tag{26} \]

Using (26) in (23) gives the normal equations as

\[ (B^T W_e B + W_{xx})\,\delta x = B^T W_e f \tag{27} \]

With the auxiliaries $N$ and $t$

\[ N = B^T W_e B \tag{28} \]

\[ t = B^T W_e f \tag{29} \]

the vector of corrections $\delta x$ is given by

\[ \delta x = (N + W_{xx})^{-1} t \tag{30} \]

The vector of Lagrange multipliers $k$ is obtained from (22) by applying (19) to give

\[ k = (AQA^T)^{-1} (f - B\,\delta x) = W_e (f - B\,\delta x) \tag{31} \]

and the vector of residuals $v$ is obtained from the first row of (21) as

\[ -W v + A^T k = 0 \]

giving

\[ v = W^{-1} A^T k = Q A^T k \tag{32} \]
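As a numerical sketch of the solution path (26) to (32), the step-by-step results can be compared with a direct solution of the bordered system (17). The matrices below are random stand-ins for $A$, $B$, $f$, $Q$ and $W_{xx}$ (assumed values, chosen only so that every inverse exists).

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, u = 4, 6, 2                      # equations, observations, parameters
A = rng.standard_normal((m, n))        # hypothetical design matrices
B = rng.standard_normal((m, u))
f = rng.standard_normal(m)
Q = np.diag(rng.uniform(0.5, 2.0, n))  # a priori cofactors of the observations
W = np.linalg.inv(Q)
Wxx = np.diag([0.1, 0.2])              # a priori weights of the parameters

We = np.linalg.inv(A @ Q @ A.T)        # (26)
N = B.T @ We @ B                       # (28)
t = B.T @ We @ f                       # (29)
dx = np.linalg.solve(N + Wxx, t)       # (30)
k = We @ (f - B @ dx)                  # (31)
v = Q @ A.T @ k                        # (32)

# Direct solution of the bordered system (17) for comparison
K = np.block([
    [-W,               A.T,              np.zeros((n, u))],
    [A,                np.zeros((m, m)), B],
    [np.zeros((u, n)), B.T,              -Wxx],
])
rhs = np.concatenate([np.zeros(n), f, np.zeros(u)])
sol = np.linalg.solve(K, rhs)          # stacked [v, k, dx]
print(np.allclose(sol, np.concatenate([v, k, dx])))
```

The reduced equations avoid factoring the full $(n + m + u)$-square matrix, which is the practical motivation for the elimination.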

The Iterative Process of Solution

Recall from (3) that $\hat{x} = x + \delta x$, where $x$ is the vector of a priori estimates of the parameters, $\delta x$ is a vector of corrections and $\hat{x}$ is the least squares estimate of the parameters.

At the beginning of the iterative solution, $x$ is set equal to the a priori estimates $x_1$ and a set of corrections $\delta x_1$ is computed. These are added to $x_1$ to give an updated set $x_2$. $A$ and $B$ are recalculated and a new weight matrix $W_{xx}$ is computed by cofactor propagation. The corrections are computed again, and the whole process cycles through until the corrections fall below some predetermined value, which terminates the process.

\[ x_{n+1} = x_n + \delta x_n \tag{33} \]
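The cycle in (33) can be sketched as a short loop. This example assumes a hypothetical circle model $F_i = (X_i - x_c)^2 + (Y_i - y_c)^2 - R^2 = 0$ with error-free points, and it sets $W_{xx} = 0$ (unweighted parameters) so that the update of $W_{xx}$ by cofactor propagation is left out. The model, data, tolerance and iteration limit are all assumptions made for illustration.

```python
import numpy as np

def linearize_circle(l, x):
    """A, B and f of (11) to (13) for the assumed circle model."""
    pts = l.reshape(-1, 2)
    xc, yc, R = x
    m = len(pts)
    dX, dY = pts[:, 0] - xc, pts[:, 1] - yc
    idx = np.arange(m)
    A = np.zeros((m, 2 * m))
    A[idx, 2 * idx] = 2 * dX
    A[idx, 2 * idx + 1] = 2 * dY
    B = np.column_stack([-2 * dX, -2 * dY, np.full(m, -2.0 * R)])
    f = -(dX**2 + dY**2 - R**2)
    return A, B, f

# Error-free points on a circle of centre (2, 3) and radius 5 (assumed data)
theta = np.array([0.0, 1.0, 2.0, 3.5, 5.0])
l = np.column_stack([2 + 5 * np.cos(theta), 3 + 5 * np.sin(theta)]).ravel()
Q = np.eye(l.size)
Wxx = np.zeros((3, 3))            # assumption: no a priori weight on parameters

x = np.array([1.5, 3.5, 4.0])     # a priori estimates x_1
for iteration in range(20):
    A, B, f = linearize_circle(l, x)      # recompute the design matrices
    We = np.linalg.inv(A @ Q @ A.T)
    N = B.T @ We @ B
    t = B.T @ We @ f
    dx = np.linalg.solve(N + Wxx, t)      # (30)
    x = x + dx                            # (33)
    if np.max(np.abs(dx)) < 1e-10:        # predetermined tolerance
        break
print(iteration, x)
```

Because the sample points are error-free, the corrections shrink rapidly and the loop stops well before the iteration limit.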

Derivation of Cofactor Matrices

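In this section, the cofactor matrices of the vectors $\hat{x}$, $\delta x$, $v$ and $\hat{l}$ will be derived. The law of propagation of variances (or cofactors) will be used and is defined as follows (Mikhail 1976, pp. 76-89).

Given a functional relationship

\[ z = F(x) \tag{34} \]

between two random vectors $z$ and $x$ and the variance-covariance matrix $\Sigma_{xx}$, the variance-covariance matrix of $z$ is given by

\[ \Sigma_{zz} = J_{zx}\,\Sigma_{xx}\,J_{zx}^T \tag{35} \]

where $J_{zx}$ is a matrix of partial derivatives

\[ J_{zx} = \frac{\partial z}{\partial x} = \frac{\partial F}{\partial x} = \begin{bmatrix} \dfrac{\partial z_1}{\partial x_1} & \dfrac{\partial z_1}{\partial x_2} & \cdots & \dfrac{\partial z_1}{\partial x_n} \\ \dfrac{\partial z_2}{\partial x_1} & \dfrac{\partial z_2}{\partial x_2} & \cdots & \dfrac{\partial z_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial z_m}{\partial x_1} & \dfrac{\partial z_m}{\partial x_2} & \cdots & \dfrac{\partial z_m}{\partial x_n} \end{bmatrix} \]

Using the relationship between variance-covariance matrices and cofactor matrices, see (5), the law of cofactor propagation may be obtained from (35) as

\[ Q_{zz} = J_{zx}\,Q_{xx}\,J_{zx}^T \tag{36} \]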

For a function $z$ containing two independent random variables $x$ and $y$ with cofactor matrices $Q_{xx}$ and $Q_{yy}$

\[ z = F(x, y) \tag{37} \]

the law of propagation of variances gives the cofactor matrix $Q_{zz}$ as

\[ Q_{zz} = J_{zx}\,Q_{xx}\,J_{zx}^T + J_{zy}\,Q_{yy}\,J_{zy}^T \tag{38} \]
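A small worked instance of the propagation law (36) may help. It assumes a hypothetical conversion from polar to rectangular coordinates, $z = (r\cos\theta,\ r\sin\theta)$, with made-up cofactors for $r$ and $\theta$; none of these numbers come from the notes.

```python
import numpy as np

r, theta = 100.0, np.pi / 2
Qxx = np.diag([0.01**2, 0.0002**2])   # assumed cofactors of r and theta (radians)

# z = (r cos(theta), r sin(theta)); J holds the partial derivatives dz/d(r, theta)
J = np.array([
    [np.cos(theta), -r * np.sin(theta)],
    [np.sin(theta),  r * np.cos(theta)],
])
Qzz = J @ Qxx @ J.T                   # law of cofactor propagation (36)
print(Qzz)
```

At $\theta = \pi/2$ the uncertainty in $\theta$ maps entirely into the first coordinate (scaled by $r^2$) and the uncertainty in $r$ into the second, so $Q_{zz}$ is very nearly diagonal.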

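Cofactor Matrix for $\hat{x}$

According to equations (33) and (30) with (29) the least squares estimate $\hat{x}$ is

\[ \hat{x} = x + (N + W_{xx})^{-1} B^T W_e f \tag{39} \]

and $\hat{x}$ is a function of the a priori parameters $x$ (the "observables") and the observations $l$ since the vector of numeric terms $f$ contains functions of both. Applying the law of propagation of cofactors gives

\[ Q_{\hat{x}\hat{x}} = \left( \frac{\partial \hat{x}}{\partial x} \right) Q_{xx} \left( \frac{\partial \hat{x}}{\partial x} \right)^T + \left( \frac{\partial \hat{x}}{\partial l} \right) Q \left( \frac{\partial \hat{x}}{\partial l} \right)^T \tag{40} \]

The partial derivatives of (39) are

\[ \frac{\partial \hat{x}}{\partial x} = I + (N + W_{xx})^{-1} B^T W_e \frac{\partial f}{\partial x} \tag{41} \]

\[ \frac{\partial \hat{x}}{\partial l} = (N + W_{xx})^{-1} B^T W_e \frac{\partial f}{\partial l} \tag{42} \]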

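From (13), $f = -F(x, l)$, so the partial derivatives $\partial f / \partial x$ and $\partial f / \partial l$ are the negatives of the design matrices $B$ and $A$ given by (12) and (11)

\[ \frac{\partial f}{\partial x} = -B \tag{43} \]

\[ \frac{\partial f}{\partial l} = -A \tag{44} \]

Substituting (43) and (44) into (41) and (42) with the auxiliary $N = B^T W_e B$ gives

\[ \frac{\partial \hat{x}}{\partial x} = I - (N + W_{xx})^{-1} B^T W_e B = I - (N + W_{xx})^{-1} N \tag{45} \]

\[ \frac{\partial \hat{x}}{\partial l} = -(N + W_{xx})^{-1} B^T W_e A \tag{46} \]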

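Substituting (45) and (46) into (40) gives

\[ \begin{aligned} Q_{\hat{x}\hat{x}} = {} & \left\{ I - (N + W_{xx})^{-1} N \right\} Q_{xx} \left\{ I - (N + W_{xx})^{-1} N \right\}^T \\ & + \left\{ -(N + W_{xx})^{-1} B^T W_e A \right\} Q \left\{ -(N + W_{xx})^{-1} B^T W_e A \right\}^T \end{aligned} \tag{47} \]

With the auxiliary

\[ \tilde{N} = N + W_{xx} \tag{48} \]

and noting that the matrices $I$, $N$, $\tilde{N}$ and $W_{xx}$ are all symmetric, (47) may be simplified as

\[ Q_{\hat{x}\hat{x}} = \left( I - \tilde{N}^{-1} N \right) Q_{xx} \left( I - N \tilde{N}^{-1} \right) + \left( \tilde{N}^{-1} B^T W_e A \right) Q \left( A^T W_e B \tilde{N}^{-1} \right) \]

Remembering that $Q_e = AQA^T$ and $W_e = Q_e^{-1}$

\[ Q_{\hat{x}\hat{x}} = Q_{xx} - Q_{xx} N \tilde{N}^{-1} - \tilde{N}^{-1} N Q_{xx} + \tilde{N}^{-1} N Q_{xx} N \tilde{N}^{-1} + \tilde{N}^{-1} N \tilde{N}^{-1} \tag{49} \]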

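The last two terms of (49) can be simplified as follows, with $\tilde{N} = N + W_{xx}$ and $Q_{xx} W_{xx} = I$

\[ \begin{aligned} \tilde{N}^{-1} N Q_{xx} N \tilde{N}^{-1} + \tilde{N}^{-1} N \tilde{N}^{-1} &= \tilde{N}^{-1} N Q_{xx} \left( N \tilde{N}^{-1} + W_{xx} \tilde{N}^{-1} \right) \\ &= \tilde{N}^{-1} N Q_{xx} \left( N + W_{xx} \right) \tilde{N}^{-1} \\ &= \tilde{N}^{-1} N Q_{xx} \tilde{N} \tilde{N}^{-1} \\ &= \tilde{N}^{-1} N Q_{xx} \end{aligned} \]

and substituting this result into (49) gives

\[ \begin{aligned} Q_{\hat{x}\hat{x}} &= Q_{xx} - Q_{xx} N \tilde{N}^{-1} - \tilde{N}^{-1} N Q_{xx} + \tilde{N}^{-1} N Q_{xx} \\ &= Q_{xx} - Q_{xx} N \tilde{N}^{-1} \end{aligned} \tag{50} \]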

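Further simplification gives

\[ \begin{aligned} Q_{\hat{x}\hat{x}} &= Q_{xx} \left( I - N \tilde{N}^{-1} \right) \\ &= Q_{xx} \left( \tilde{N} - N \right) \tilde{N}^{-1} \\ &= Q_{xx} \left( N + W_{xx} - N \right) \tilde{N}^{-1} \\ &= Q_{xx} W_{xx} \tilde{N}^{-1} \end{aligned} \tag{51} \]

and since $Q_{xx} W_{xx} = I$ the cofactor matrix of the least squares estimates $\hat{x}$ is

\[ Q_{\hat{x}\hat{x}} = \tilde{N}^{-1} = (N + W_{xx})^{-1} \tag{52} \]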
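The collapse of the propagation formula (47) to the compact result (52) can be confirmed numerically. The sketch below uses randomly generated stand-ins for $A$, $B$, $Q$ and $Q_{xx}$ (assumed values only) and builds (47) directly from the partial derivatives (45) and (46).

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, u = 5, 7, 3
A = rng.standard_normal((m, n))    # hypothetical design matrices
B = rng.standard_normal((m, u))
Q = np.diag(rng.uniform(0.5, 2.0, n))
G = rng.standard_normal((u, u))
Qxx = G @ G.T + np.eye(u)          # full, symmetric positive-definite
Wxx = np.linalg.inv(Qxx)

We = np.linalg.inv(A @ Q @ A.T)
N = B.T @ We @ B
Nt_inv = np.linalg.inv(N + Wxx)    # inverse of the auxiliary N + Wxx

Jx = np.eye(u) - Nt_inv @ N        # (45)
Jl = -Nt_inv @ B.T @ We @ A        # (46)
Q_hat = Jx @ Qxx @ Jx.T + Jl @ Q @ Jl.T   # (47)

print(np.allclose(Q_hat, Nt_inv))  # (52)
```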

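Cofactor Matrix for $\hat{l}$

Beginning with the final adjusted observations given by (2)

\[ \hat{l} = l + v \tag{53} \]

and using (32) and (31) we have

\[ \begin{aligned} \hat{l} &= l + Q A^T k \\ &= l + Q A^T W_e (f - B\,\delta x) \\ &= l + Q A^T W_e f - Q A^T W_e B\,\delta x \end{aligned} \]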

Substituting the expression for $\delta x$ given by (30) with the auxiliaries $t$ and $\tilde{N} = N + W_{xx}$ given by (29) and (48) respectively gives

\[ \begin{aligned} \hat{l} &= l + Q A^T W_e f - Q A^T W_e B (N + W_{xx})^{-1} t \\ &= l + Q A^T W_e f - Q A^T W_e B (N + W_{xx})^{-1} B^T W_e f \\ &= l + Q A^T W_e f - Q A^T W_e B \tilde{N}^{-1} B^T W_e f \end{aligned} \tag{54} \]

and $\hat{l}$ is a function of the observables $x$ and the observations $l$ since $f = -F(x, l)$. Applying the law of propagation of variances to (54) gives

\[ Q_{\hat{l}\hat{l}} = \left( \frac{\partial \hat{l}}{\partial x} \right) Q_{xx} \left( \frac{\partial \hat{l}}{\partial x} \right)^T + \left( \frac{\partial \hat{l}}{\partial l} \right) Q \left( \frac{\partial \hat{l}}{\partial l} \right)^T \tag{55} \]

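and the partial derivatives are obtained from (54) as

\[ \frac{\partial \hat{l}}{\partial x} = Q A^T W_e \frac{\partial f}{\partial x} - Q A^T W_e B \tilde{N}^{-1} B^T W_e \frac{\partial f}{\partial x} \]

\[ \frac{\partial \hat{l}}{\partial l} = I + Q A^T W_e \frac{\partial f}{\partial l} - Q A^T W_e B \tilde{N}^{-1} B^T W_e \frac{\partial f}{\partial l} \]

With $\dfrac{\partial f}{\partial x} = -B$ and $\dfrac{\partial f}{\partial l} = -A$, and with the auxiliary $N = B^T W_e B$ the partial derivatives become

\[ \begin{aligned} \frac{\partial \hat{l}}{\partial x} &= Q A^T W_e B \tilde{N}^{-1} B^T W_e B - Q A^T W_e B \\ &= Q A^T W_e B \tilde{N}^{-1} N - Q A^T W_e B \end{aligned} \tag{56} \]

\[ \frac{\partial \hat{l}}{\partial l} = I + Q A^T W_e B \tilde{N}^{-1} B^T W_e A - Q A^T W_e A \tag{57} \]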

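Substituting (56) and (57) into (55) gives

\[ Q_{\hat{l}\hat{l}} = \{\text{1st term}\} + \{\text{2nd term}\} \tag{58} \]

where

\[ \begin{aligned} \{\text{1st term}\} = {} & Q A^T W_e B \tilde{N}^{-1} N Q_{xx} N \tilde{N}^{-1} B^T W_e A Q - Q A^T W_e B \tilde{N}^{-1} N Q_{xx} B^T W_e A Q \\ & - Q A^T W_e B Q_{xx} N \tilde{N}^{-1} B^T W_e A Q + Q A^T W_e B Q_{xx} B^T W_e A Q \end{aligned} \]

\[ \begin{aligned} \{\text{2nd term}\} = {} & Q + Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q - Q A^T W_e A Q + Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q \\ & + Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q \\ & - Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q A^T W_e A Q - Q A^T W_e A Q \\ & - Q A^T W_e A Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q + Q A^T W_e A Q A^T W_e A Q \end{aligned} \]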


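The 1st term can be simplified as

\[ \begin{aligned} \{\text{1st term}\} &= Q A^T W_e B \left( \tilde{N}^{-1} N Q_{xx} N \tilde{N}^{-1} - \tilde{N}^{-1} N Q_{xx} - Q_{xx} N \tilde{N}^{-1} + Q_{xx} \right) B^T W_e A Q \\ &= Q A^T W_e B \left( \tilde{N}^{-1} N \left( Q_{xx} N \tilde{N}^{-1} - Q_{xx} \right) - Q_{xx} N \tilde{N}^{-1} + Q_{xx} \right) B^T W_e A Q \end{aligned} \]

but we know from (50) that $Q_{\hat{x}\hat{x}} = Q_{xx} - Q_{xx} N \tilde{N}^{-1}$, and from (52) that $Q_{\hat{x}\hat{x}} = \tilde{N}^{-1}$ so

\[ \begin{aligned} \{\text{1st term}\} &= Q A^T W_e B \left( Q_{\hat{x}\hat{x}} - \tilde{N}^{-1} N Q_{\hat{x}\hat{x}} \right) B^T W_e A Q \\ &= Q A^T W_e B \left( \tilde{N}^{-1} - \tilde{N}^{-1} N \tilde{N}^{-1} \right) B^T W_e A Q \\ &= Q A^T W_e B \tilde{N}^{-1} \left( I - N \tilde{N}^{-1} \right) B^T W_e A Q \end{aligned} \]

The term in brackets has been simplified in (51) as $W_{xx} \tilde{N}^{-1}$ which gives the 1st term as

\[ \{\text{1st term}\} = Q A^T W_e B \tilde{N}^{-1} W_{xx} \tilde{N}^{-1} B^T W_e A Q \tag{59} \]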

The 2nd term of (58) can be simplified by remembering that $AQA^T = Q_e = W_e^{-1}$ so that after some cancellation of terms we have

\[ \{\text{2nd term}\} = Q + Q A^T W_e B \tilde{N}^{-1} N \tilde{N}^{-1} B^T W_e A Q - Q A^T W_e A Q \tag{60} \]

Substituting (59) and (60) into (58) gives the cofactor matrix of the adjusted observations as

\[ Q_{\hat{l}\hat{l}} = Q + Q A^T W_e B (N + W_{xx})^{-1} B^T W_e A Q - Q A^T W_e A Q \tag{61} \]

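Cofactor Matrix for $\delta x$

From (30) and (29)

\[ \delta x = (N + W_{xx})^{-1} B^T W_e f = \tilde{N}^{-1} B^T W_e f \tag{62} \]

and applying the law of propagation of variances gives

\[ Q_{\delta x\,\delta x} = \left( \tilde{N}^{-1} B^T W_e \right) Q_{ff} \left( \tilde{N}^{-1} B^T W_e \right)^T \tag{63} \]

The cofactor matrix $Q_{ff}$ is obtained from $f = -F(x, l)$ as

\[ \begin{aligned} Q_{ff} &= \left( \frac{\partial f}{\partial x} \right) Q_{xx} \left( \frac{\partial f}{\partial x} \right)^T + \left( \frac{\partial f}{\partial l} \right) Q \left( \frac{\partial f}{\partial l} \right)^T \\ &= (-B)\,Q_{xx}\,(-B)^T + (-A)\,Q\,(-A)^T \\ &= B Q_{xx} B^T + A Q A^T \\ &= B Q_{xx} B^T + Q_e \end{aligned} \tag{64} \]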
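Substituting (64) into (63) and simplifying gives

\[ Q_{\delta x\,\delta x} = (N + W_{xx})^{-1} N Q_{xx} N (N + W_{xx})^{-1} + (N + W_{xx})^{-1} N (N + W_{xx})^{-1} \tag{65} \]

Equation (65) can be simplified further as

\[ \begin{aligned} Q_{\delta x\,\delta x} &= \tilde{N}^{-1} N Q_{xx} N \tilde{N}^{-1} + \tilde{N}^{-1} N \tilde{N}^{-1} \\ &= \tilde{N}^{-1} N Q_{xx} \left( N \tilde{N}^{-1} + W_{xx} \tilde{N}^{-1} \right) \\ &= \tilde{N}^{-1} N Q_{xx} \left( N + W_{xx} \right) \tilde{N}^{-1} \\ &= \tilde{N}^{-1} N Q_{xx} \tilde{N} \tilde{N}^{-1} \end{aligned} \]

or

\[ Q_{\delta x\,\delta x} = \tilde{N}^{-1} N Q_{xx} = (N + W_{xx})^{-1} N Q_{xx} \tag{66} \]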

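Cofactor Matrix for $v$

From (32), (31) and (30) we may write the following

\[ \begin{aligned} v &= Q A^T k \\ &= Q A^T W_e (f - B\,\delta x) \\ &= Q A^T W_e f - Q A^T W_e B\,\delta x \\ &= Q A^T W_e f - Q A^T W_e B (N + W_{xx})^{-1} t \end{aligned} \]

and with (29) and the auxiliary $\tilde{N}^{-1} = (N + W_{xx})^{-1}$

\[ v = Q A^T W_e f - Q A^T W_e B \tilde{N}^{-1} B^T W_e f \tag{67} \]

$v$ is a function of the observables $x$ and the observations $l$ since $f = -F(x, l)$, and applying the law of propagation of variances gives

\[ Q_{vv} = \left( \frac{\partial v}{\partial x} \right) Q_{xx} \left( \frac{\partial v}{\partial x} \right)^T + \left( \frac{\partial v}{\partial l} \right) Q \left( \frac{\partial v}{\partial l} \right)^T \tag{68} \]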

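The partial derivatives of (67) are

\[ \frac{\partial v}{\partial x} = Q A^T W_e \frac{\partial f}{\partial x} - Q A^T W_e B \tilde{N}^{-1} B^T W_e \frac{\partial f}{\partial x} \]

\[ \frac{\partial v}{\partial l} = Q A^T W_e \frac{\partial f}{\partial l} - Q A^T W_e B \tilde{N}^{-1} B^T W_e \frac{\partial f}{\partial l} \]

With $\dfrac{\partial f}{\partial x} = -B$ and $\dfrac{\partial f}{\partial l} = -A$, and with the auxiliary $N = B^T W_e B$ the partial derivatives become

\[ \frac{\partial v}{\partial x} = Q A^T W_e B \tilde{N}^{-1} N - Q A^T W_e B \tag{69} \]

\[ \frac{\partial v}{\partial l} = Q A^T W_e B \tilde{N}^{-1} B^T W_e A - Q A^T W_e A \tag{70} \]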

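Substituting (69) and (70) into (68) gives

\[ Q_{vv} = \{\text{1st term}\} + \{\text{2nd term}\} \tag{71} \]

where

\[ \begin{aligned} \{\text{1st term}\} = {} & Q A^T W_e B \tilde{N}^{-1} N Q_{xx} N \tilde{N}^{-1} B^T W_e A Q - Q A^T W_e B \tilde{N}^{-1} N Q_{xx} B^T W_e A Q \\ & - Q A^T W_e B Q_{xx} N \tilde{N}^{-1} B^T W_e A Q + Q A^T W_e B Q_{xx} B^T W_e A Q \end{aligned} \]

\[ \begin{aligned} \{\text{2nd term}\} = {} & Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q - Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q A^T W_e A Q \\ & - Q A^T W_e A Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q + Q A^T W_e A Q A^T W_e A Q \end{aligned} \]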


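The 1st term above is identical to the 1st term of (58) which simplifies to (59) as

\[ \{\text{1st term}\} = Q A^T W_e B \tilde{N}^{-1} W_{xx} \tilde{N}^{-1} B^T W_e A Q \tag{72} \]

The 2nd term above can be simplified by remembering that $AQA^T = Q_e = W_e^{-1}$ so that after some manipulation we have

\[ \{\text{2nd term}\} = Q A^T W_e B \left( \tilde{N}^{-1} N \tilde{N}^{-1} - \tilde{N}^{-1} \right) B^T W_e A Q - Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q + Q A^T W_e A Q \]

The term in brackets can be expressed as

\[ \begin{aligned} \tilde{N}^{-1} N \tilde{N}^{-1} - \tilde{N}^{-1} &= \tilde{N}^{-1} \left( N - \tilde{N} \right) \tilde{N}^{-1} \\ &= \tilde{N}^{-1} \left( N - (N + W_{xx}) \right) \tilde{N}^{-1} \\ &= -\tilde{N}^{-1} W_{xx} \tilde{N}^{-1} \end{aligned} \]

and the 2nd term becomes

\[ \{\text{2nd term}\} = -Q A^T W_e B \tilde{N}^{-1} W_{xx} \tilde{N}^{-1} B^T W_e A Q - Q A^T W_e B \tilde{N}^{-1} B^T W_e A Q + Q A^T W_e A Q \tag{73} \]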

Substituting (72) and (73) into (71) gives the cofactor matrix of the residuals $v$ as

\[ Q_{vv} = -Q A^T W_e B (N + W_{xx})^{-1} B^T W_e A Q + Q A^T W_e A Q \tag{74} \]

and by inspection of (61) and (74)

\[ Q_{vv} = Q - Q_{\hat{l}\hat{l}} \tag{75} \]
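The closed forms (61) and (74) can be checked against direct propagation through the partial derivatives (69) and (70). The following sketch uses randomly generated stand-ins for $A$, $B$, $Q$ and $Q_{xx}$; all values and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, u = 5, 7, 3
A = rng.standard_normal((m, n))    # hypothetical design matrices
B = rng.standard_normal((m, u))
Q = np.diag(rng.uniform(0.5, 2.0, n))
G = rng.standard_normal((u, u))
Qxx = G @ G.T + np.eye(u)
Wxx = np.linalg.inv(Qxx)

We = np.linalg.inv(A @ Q @ A.T)
N = B.T @ We @ B
Nt_inv = np.linalg.inv(N + Wxx)    # inverse of the auxiliary N + Wxx
QAW = Q @ A.T @ We                 # recurring product Q A^T We

# Propagation (68) through the partial derivatives (69) and (70)
Jx = QAW @ B @ Nt_inv @ N - QAW @ B
Jl = QAW @ B @ Nt_inv @ B.T @ We @ A - QAW @ A
Qvv_prop = Jx @ Qxx @ Jx.T + Jl @ Q @ Jl.T

# Closed forms (74) and (61)
Qvv = QAW @ A @ Q - QAW @ B @ Nt_inv @ B.T @ We @ A @ Q
Qll_hat = Q + QAW @ B @ Nt_inv @ B.T @ We @ A @ Q - QAW @ A @ Q

print(np.allclose(Qvv_prop, Qvv), np.allclose(Qvv, Q - Qll_hat))
```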

Covariance Matrix $\Sigma_{\hat{x}\hat{x}}$

\[ \Sigma_{\hat{x}\hat{x}} = \sigma_0^2\, Q_{\hat{x}\hat{x}} \tag{76} \]

The estimated variance factor is

\[ \hat{\sigma}_0^2 = \frac{v^T W v + \delta x^T W_{xx}\,\delta x}{r} \tag{77} \]

and the degrees of freedom $r$ are

\[ r = m - u + u_x \tag{78} \]

where $m$ is the number of equations used to estimate the $u$ parameters from $n$ observations and $u_x$ is the number of weighted parameters. [Equation (78) is given by Krakiwsky (1975, p. 17, eqn 2-62) who notes that it is an approximation only and directs the reader to Bossler (1972) for a complete and rigorous treatment.]

Generation of the Standard Least Squares Cases

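Combined Case with Weighted Parameters ($A$, $B$, $W$, $W_{xx} \ne 0$)

The general case of a non-linear implicit model with weighted parameters treated as observables is known as the Combined Case with Weighted Parameters. It has a solution given by the following equations (30), (28), (29), (26), (3), (31), (32), (2), (65), (52), (74), (61), (64), (77) and (78).

\[ \delta x = (N + W_{xx})^{-1} t \tag{79} \]

with

\[ N = B^T W_e B \tag{80} \]

\[ t = B^T W_e f \tag{81} \]

\[ W_e = Q_e^{-1} = (AQA^T)^{-1} \tag{82} \]

\[ \hat{x} = x + \delta x \tag{83} \]

\[ k = W_e (f - B\,\delta x) \tag{84} \]

\[ v = W^{-1} A^T k = Q A^T k \tag{85} \]

\[ \hat{l} = l + v \tag{86} \]

\[ \begin{aligned} Q_{\delta x\,\delta x} &= (N + W_{xx})^{-1} N Q_{xx} N (N + W_{xx})^{-1} + (N + W_{xx})^{-1} N (N + W_{xx})^{-1} \\ &= (N + W_{xx})^{-1} N Q_{xx} \end{aligned} \tag{87} \]

\[ Q_{\hat{x}\hat{x}} = (N + W_{xx})^{-1} \tag{88} \]

\[ Q_{vv} = Q A^T W_e A Q - Q A^T W_e B (N + W_{xx})^{-1} B^T W_e A Q \tag{89} \]

\[ Q_{\hat{l}\hat{l}} = Q + Q A^T W_e B (N + W_{xx})^{-1} B^T W_e A Q - Q A^T W_e A Q \tag{90} \]

\[ Q_{ff} = B Q_{xx} B^T + Q_e \tag{91} \]

\[ \hat{\sigma}_0^2 = \frac{v^T W v + \delta x^T W_{xx}\,\delta x}{r} \tag{92} \]

\[ r = m - u + u_x \tag{93} \]

\[ \Sigma_{\delta x\,\delta x} = \sigma_0^2\, Q_{\delta x\,\delta x} \tag{94} \]

\[ \Sigma_{\hat{x}\hat{x}} = \sigma_0^2\, Q_{\hat{x}\hat{x}} \tag{95} \]

\[ \Sigma_{vv} = \sigma_0^2\, Q_{vv} \tag{96} \]

\[ \Sigma_{\hat{l}\hat{l}} = \sigma_0^2\, Q_{\hat{l}\hat{l}} \tag{97} \]

\[ \Sigma_{ff} = \sigma_0^2\, Q_{ff} \tag{98} \]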
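The equations of the Combined Case with Weighted Parameters can be gathered into one routine. The following is a minimal sketch of a single linearized step, assuming that $A$, $B$ and $f$ have already been evaluated; the function name `combined_weighted`, its dictionary keys and the random test data are hypothetical, and $u_x$ defaults to $u$ on the assumption that every parameter is weighted.

```python
import numpy as np

def combined_weighted(A, B, f, Q, Qxx, ux=None):
    """One linearized step of equations (79) to (93); a sketch only."""
    m, u = B.shape
    W = np.linalg.inv(Q)
    Wxx = np.linalg.inv(Qxx)
    Qe = A @ Q @ A.T
    We = np.linalg.inv(Qe)                     # (82)
    N = B.T @ We @ B                           # (80)
    t = B.T @ We @ f                           # (81)
    Nt_inv = np.linalg.inv(N + Wxx)
    dx = Nt_inv @ t                            # (79)
    k = We @ (f - B @ dx)                      # (84)
    v = Q @ A.T @ k                            # (85)
    QAW = Q @ A.T @ We
    Qvv = QAW @ A @ Q - QAW @ B @ Nt_inv @ B.T @ We @ A @ Q   # (89)
    ux = u if ux is None else ux               # assumption: all parameters weighted
    r = m - u + ux                             # (93)
    return {
        "dx": dx, "k": k, "v": v,
        "Q_dx": Nt_inv @ N @ Qxx,              # (87), compact form
        "Q_xhat": Nt_inv,                      # (88)
        "Q_vv": Qvv,
        "Q_lhat": Q - Qvv,                     # (90), written via (75)
        "Q_ff": B @ Qxx @ B.T + Qe,            # (91)
        "r": r,
        "s0_sq": (v @ W @ v + dx @ Wxx @ dx) / r,   # (92)
    }

# Illustrative random inputs (not from the notes)
rng = np.random.default_rng(4)
m, n, u = 4, 6, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, u))
f = rng.standard_normal(m)
Q = np.diag(rng.uniform(0.5, 2.0, n))
Qxx = np.diag([4.0, 9.0])
res = combined_weighted(A, B, f, Q, Qxx)
print(res["dx"], res["r"], res["s0_sq"])
```

Explicit inverses are used here to mirror the notation of the equations; in production code a factorization-based solve would usually be preferred.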

Combined Case ($A$, $B$, $W$, $W_{xx} = 0$)

The Combined Case is a non-linear implicit mathematical model with no weights on the parameters. The set of equations for the solution is deduced from the Combined Case with Weighted Parameters by considering that if there are no weights then $W_{xx} = 0$ and $Q_{xx} = 0$. This implies that $x$ is a constant vector (denoted by $x^0$) of approximate values of the parameters, and partial derivatives with respect to $x^0$ are undefined. Substituting these two null matrices and the constant vector $x = x^0$ into equations (1) to (78) gives the following results.

\[ \delta x = N^{-1} t \tag{99} \]

with

\[ N = B^T W_e B \tag{100} \]

\[ t = B^T W_e f^0 \tag{101} \]

\[ f^0 = -F(x^0, l) \tag{102} \]

\[ W_e = Q_e^{-1} = (AQA^T)^{-1} \tag{103} \]

\[ \hat{x} = x^0 + \delta x \tag{104} \]

\[ k = W_e (f^0 - B\,\delta x) \tag{105} \]

\[ v = W^{-1} A^T k = Q A^T k \tag{106} \]

\[ \hat{l} = l + v \tag{107} \]

\[ Q_{\delta x\,\delta x} = Q_{\hat{x}\hat{x}} = N^{-1} \tag{108} \]

\[ Q_{vv} = Q A^T W_e A Q - Q A^T W_e B N^{-1} B^T W_e A Q \tag{109} \]

\[ Q_{\hat{l}\hat{l}} = Q + Q A^T W_e B N^{-1} B^T W_e A Q - Q A^T W_e A Q \tag{110} \]

\[ Q_{f^0 f^0} = Q_e \tag{111} \]

\[ \hat{\sigma}_0^2 = \frac{v^T W v}{r} \tag{112} \]

\[ r = m - u \tag{113} \]

\[ \Sigma_{\hat{x}\hat{x}} = \Sigma_{\delta x\,\delta x} = \sigma_0^2\, Q_{\hat{x}\hat{x}} \tag{114} \]

\[ \Sigma_{vv} = \sigma_0^2\, Q_{vv} \tag{115} \]

\[ \Sigma_{\hat{l}\hat{l}} = \sigma_0^2\, Q_{\hat{l}\hat{l}} \tag{116} \]

\[ \Sigma_{f^0 f^0} = \sigma_0^2\, Q_{f^0 f^0} \tag{117} \]

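Parametric Case ($A = I$, $B$, $W$, $W_{xx} = 0$)

The Parametric Case is a mathematical model with the observations $l$ explicitly expressed by some non-linear function of the parameters $x$ only. This implies that the design matrix $A$ is equal to the identity matrix $I$. Setting $A = I$ in the Combined Case (with no weights), so that $W_e = (IQI^T)^{-1} = W$, leads to the following equations.

\[ \delta x = N^{-1} t \tag{118} \]

with

\[ N = B^T W B \tag{119} \]

\[ t = B^T W f^0 \tag{120} \]

\[ f^0 = -F(x^0, l) \tag{121} \]

\[ \hat{x} = x^0 + \delta x \tag{122} \]

\[ k = W (f^0 - B\,\delta x) \tag{123} \]

\[ v = W^{-1} k = f^0 - B\,\delta x \tag{124} \]

\[ \hat{l} = l + v \tag{125} \]

\[ Q_{\delta x\,\delta x} = Q_{\hat{x}\hat{x}} = N^{-1} \tag{126} \]

\[ Q_{vv} = Q - B N^{-1} B^T \tag{127} \]

\[ Q_{\hat{l}\hat{l}} = B N^{-1} B^T \tag{128} \]

\[ Q_{f^0 f^0} = Q \tag{129} \]

\[ \hat{\sigma}_0^2 = \frac{v^T W v}{r} \tag{130} \]

\[ r = n - u \tag{131} \]

\[ \Sigma_{\hat{x}\hat{x}} = \Sigma_{\delta x\,\delta x} = \sigma_0^2\, Q_{\hat{x}\hat{x}} \tag{132} \]

\[ \Sigma_{vv} = \sigma_0^2\, Q_{vv} \tag{133} \]

\[ \Sigma_{\hat{l}\hat{l}} = \sigma_0^2\, Q_{\hat{l}\hat{l}} \tag{134} \]

\[ \Sigma_{f^0 f^0} = \sigma_0^2\, Q_{f^0 f^0} \tag{135} \]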
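As a sketch of the Parametric Case, consider fitting a straight line $g(x) = a + b\,t$ to four observations. The model is written here as $F(\hat{l}, \hat{x}) = \hat{l} - g(\hat{x}) = 0$ so that $A = I$; this sign convention, the data values and the variable names are assumptions made for the example. The result is compared with NumPy's ordinary least squares routine.

```python
import numpy as np

tt = np.array([0.0, 1.0, 2.0, 3.0])       # assumed abscissae (error-free)
l = np.array([1.1, 2.9, 5.2, 6.8])        # assumed observations
Q = np.eye(4)
W = np.linalg.inv(Q)

x0 = np.zeros(2)                          # approximate values of a and b
G = np.column_stack([np.ones_like(tt), tt])   # dg/dx for g = a + b t
B = -G                                    # (12) for F = l - g(x), so that A = I
f0 = -(l - G @ x0)                        # (121)

N = B.T @ W @ B                           # (119)
t = B.T @ W @ f0                          # (120)
dx = np.linalg.solve(N, t)                # (118)
x_hat = x0 + dx                           # (122)
v = f0 - B @ dx                           # (124)
l_hat = l + v                             # (125)
Q_xhat = np.linalg.inv(N)                 # (126)
Q_vv = Q - B @ Q_xhat @ B.T               # (127)
r = l.size - x_hat.size                   # (131)
s0_sq = (v @ W @ v) / r                   # (130)

x_ols = np.linalg.lstsq(G, l, rcond=None)[0]
print(x_hat, x_ols, s0_sq)
```

Because the model is linear in $a$ and $b$, a single pass gives the final estimates and no iteration is needed.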

Condition Case ($A$, $B = 0$, $W$, $W_{xx} = 0$)

The Condition Case is characterized by a non-linear model consisting of observations only. Setting $B = 0$ in the Combined Case (with no weights) leads to the following equations.

\[ k = W_e f \tag{136} \]

with

\[ W_e = Q_e^{-1} = (AQA^T)^{-1} \tag{137} \]

\[ f = -F(l) \tag{138} \]

\[ v = W^{-1} A^T k = Q A^T k \tag{139} \]

\[ \hat{l} = l + v \tag{140} \]

\[ Q_{vv} = Q A^T W_e A Q \tag{141} \]

\[ Q_{\hat{l}\hat{l}} = Q - Q A^T W_e A Q \tag{142} \]

\[ \hat{\sigma}_0^2 = \frac{v^T W v}{r} \tag{143} \]

\[ r = m \tag{144} \]

\[ \Sigma_{vv} = \sigma_0^2\, Q_{vv} \tag{145} \]

\[ \Sigma_{\hat{l}\hat{l}} = \sigma_0^2\, Q_{\hat{l}\hat{l}} \tag{146} \]
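A familiar instance of the Condition Case is the adjustment of the three observed angles of a plane triangle, which must sum to 180 degrees. The sketch below assumes the single condition $F(\hat{l}) = \hat{l}_1 + \hat{l}_2 + \hat{l}_3 - 180 = 0$ with equally weighted angles; the observed values are invented for illustration.

```python
import numpy as np

l = np.array([50.001, 60.002, 70.003])   # assumed observed angles (degrees)
Q = np.eye(3)                             # equal a priori cofactors
W = np.linalg.inv(Q)

A = np.array([[1.0, 1.0, 1.0]])           # dF/dl for F = l1 + l2 + l3 - 180
f = -np.array([l.sum() - 180.0])          # (138), minus the misclosure

We = np.linalg.inv(A @ Q @ A.T)           # (137)
k = We @ f                                # (136)
v = Q @ A.T @ k                           # (139)
l_hat = l + v                             # (140)
Q_vv = Q @ A.T @ We @ A @ Q               # (141)
Q_lhat = Q - Q_vv                         # (142)
r = A.shape[0]                            # (144)
s0_sq = (v @ W @ v) / r                   # (143)

print(v, l_hat.sum())
```

With equal weights the misclosure of 0.006 degrees is distributed equally, so each angle receives a correction of one third of the misclosure with opposite sign.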

REFERENCES

Bossler, John D. 1972, 'Bayesian Inference in Geodesy', Ph.D. Dissertation, Department of Geodetic Science, Ohio State University, Columbus, Ohio, USA.

Cross, P.A. 1992, Advanced Least Squares Applied to Position Fixing, Working Paper No. 6, Department of Land Information, University of East London.

Krakiwsky, E.J. 1975, A Synthesis of Recent Advances in the Method of Least Squares, Lecture Notes No. 42, 1992 reprint, Department of Surveying Engineering, University of New Brunswick, Fredericton, Canada.

Mikhail, E.M. 1976, Observations and Least Squares, IEPA Dun-Donnelley, New York.
