Least-Norm Solutions of Undetermined Equations: EE263 Autumn 2007-08 Stephen Boyd

EE263 Autumn 2007-08 Stephen Boyd
Lecture 8
Least-norm solutions of undetermined equations
• least-norm solution of underdetermined equations
• minimum norm solutions via QR factorization
• derivation via Lagrange multipliers
• relation to regularized least-squares
• general norm minimization with equality constraints
Underdetermined linear equations
we consider
y = Ax
where $A \in \mathbf{R}^{m \times n}$ is fat ($m < n$), i.e.,
• there are more variables than equations
• x is underspecified, i.e., many choices of x lead to the same y
we’ll assume that $A$ is full rank ($m$), so for each $y \in \mathbf{R}^m$, there is a solution
set of all solutions has form
$\{ x \mid Ax = y \} = \{ x_p + z \mid z \in \mathcal{N}(A) \}$
where $x_p$ is any (‘particular’) solution, i.e., $Ax_p = y$

• z characterizes available choices in solution
• solution has $\dim \mathcal{N}(A) = n - m$ ‘degrees of freedom’
• can choose z to satisfy other specs or optimize among solutions
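The parametrization of the solution set can be checked numerically. The sketch below is an illustration only: the dimensions, the random data, and the names `x_p`, `Z`, and `w` are assumptions, and the nullspace basis is taken from the SVD (one of several ways to obtain it).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 5                       # fat: fewer equations than variables
A = rng.standard_normal((m, n))   # full rank with probability one
y = rng.standard_normal(m)

# a particular solution (any solution of Ax = y would do here)
x_p = np.linalg.lstsq(A, y, rcond=None)[0]

# orthonormal basis for N(A): the last n - m right singular vectors of A
_, _, Vt = np.linalg.svd(A)
Z = Vt[m:].T                      # shape (n, n - m)

# x_p + Z w solves Ax = y for every choice of w
w = rng.standard_normal(n - m)
x = x_p + Z @ w
residual = np.linalg.norm(A @ x - y)
print(Z.shape, residual)
```

The `n - m` columns of `Z` are the degrees of freedom; `w` is what remains to be chosen.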

Least-norm solution
one particular solution is
$x_{\mathrm{ln}} = A^T (AA^T)^{-1} y$
($AA^T$ is invertible since $A$ is full rank)
in fact, $x_{\mathrm{ln}}$ is the solution of $y = Ax$ that minimizes $\|x\|$
i.e., $x_{\mathrm{ln}}$ is solution of optimization problem
minimize $\|x\|$
subject to $Ax = y$
(with variable $x \in \mathbf{R}^n$)
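A minimal sketch of computing the least-norm solution, assuming random full-rank data (the sizes and variable names are illustrative). It uses a linear solve with $AA^T$ instead of forming the inverse, and compares against `np.linalg.lstsq`, which returns the minimum-norm solution when the system is underdetermined.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# x_ln = A^T (A A^T)^{-1} y, computed with a solve rather than an explicit inverse
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

# reference: lstsq picks the smallest 2-norm solution among all minimizers
x_ref = np.linalg.lstsq(A, y, rcond=None)[0]

print(np.linalg.norm(A @ x_ln - y), np.linalg.norm(x_ln - x_ref))
```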

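suppose $Ax = y$, so $A(x - x_{\mathrm{ln}}) = 0$ and
$(x - x_{\mathrm{ln}})^T x_{\mathrm{ln}} = (x - x_{\mathrm{ln}})^T A^T (AA^T)^{-1} y$
$= (A(x - x_{\mathrm{ln}}))^T (AA^T)^{-1} y$
$= 0$
i.e., $(x - x_{\mathrm{ln}}) \perp x_{\mathrm{ln}}$, so
$\|x\|^2 = \|x_{\mathrm{ln}} + x - x_{\mathrm{ln}}\|^2 = \|x_{\mathrm{ln}}\|^2 + \|x - x_{\mathrm{ln}}\|^2 \geq \|x_{\mathrm{ln}}\|^2$
i.e., $x_{\mathrm{ln}}$ has smallest norm of any solution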
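The orthogonality argument can be spot-checked on random data. In this sketch (sizes and names are assumptions), a second solution `x` is built by adding a nullspace vector `z` to the least-norm solution; `z` is obtained by stripping from a random vector its component in the range of $A^T$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

# another solution: add a nullspace component z (so that A z = 0)
v = rng.standard_normal(n)
z = v - A.T @ np.linalg.solve(A @ A.T, A @ v)
x = x_ln + z

inner = (x - x_ln) @ x_ln                     # orthogonality: should be ~0
lhs = np.linalg.norm(x) ** 2                  # ||x||^2
rhs = np.linalg.norm(x_ln) ** 2 + np.linalg.norm(x - x_ln) ** 2
print(inner, lhs, rhs)
```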

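[figure: the solution set $\{ x \mid Ax = y \}$ drawn as a translate of $\mathcal{N}(A) = \{ x \mid Ax = 0 \}$, with $x_{\mathrm{ln}}$ marked]
• orthogonality condition: $x_{\mathrm{ln}} \perp \mathcal{N}(A)$
• projection interpretation: $x_{\mathrm{ln}}$ is projection of 0 on solution set $\{ x \mid Ax = y \}$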

• $A^\dagger = A^T (AA^T)^{-1}$ is called the pseudo-inverse of full rank, fat $A$
• $A^T (AA^T)^{-1}$ is a right inverse of $A$
• $I - A^T (AA^T)^{-1} A$ gives projection onto $\mathcal{N}(A)$
cf. analogous formulas for full rank, skinny matrix $A$:
• $A^\dagger = (A^T A)^{-1} A^T$
• $(A^T A)^{-1} A^T$ is a left inverse of $A$
• $A (A^T A)^{-1} A^T$ gives projection onto $\mathcal{R}(A)$
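The three properties for a fat matrix can be verified directly. This is a sketch on assumed random data; the explicit inverse is formed only because the matrices are tiny, and `np.linalg.pinv` serves as an independent reference.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 7
A = rng.standard_normal((m, n))

A_dag = A.T @ np.linalg.inv(A @ A.T)    # pseudo-inverse of fat, full-rank A
P = np.eye(n) - A_dag @ A               # projection onto N(A)

right_inverse_ok = np.allclose(A @ A_dag, np.eye(m))            # A A_dag = I
matches_pinv = np.allclose(A_dag, np.linalg.pinv(A))            # agrees with SVD-based pinv
projector_ok = np.allclose(P @ P, P) and np.allclose(A @ P, 0)  # idempotent, maps into N(A)
print(right_inverse_ok, matches_pinv, projector_ok)
```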

Least-norm solution via QR factorization
find QR factorization of $A^T$, i.e., $A^T = QR$, with
• $Q \in \mathbf{R}^{n \times m}$, $Q^T Q = I_m$
• $R \in \mathbf{R}^{m \times m}$ upper triangular, nonsingular
then
• $x_{\mathrm{ln}} = A^T (AA^T)^{-1} y = Q R^{-T} y$
• $\|x_{\mathrm{ln}}\| = \|R^{-T} y\|$
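A sketch of the QR route, assuming random full-rank data. `np.linalg.qr` returns the reduced factorization by default, which matches the shapes above. The general-purpose `np.linalg.solve` is used for $R^T z = y$ to keep the example NumPy-only; a triangular solver would exploit the structure of $R^T$.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

Q, R = np.linalg.qr(A.T)          # reduced QR: Q is n x m, R is m x m upper triangular
z = np.linalg.solve(R.T, y)       # z = R^{-T} y
x_qr = Q @ z                      # x_ln = Q R^{-T} y

# compare with the normal-equations style formula A^T (A A^T)^{-1} y
x_formula = A.T @ np.linalg.solve(A @ A.T, y)
print(np.linalg.norm(x_qr - x_formula), np.linalg.norm(x_qr), np.linalg.norm(z))
```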

Derivation via Lagrange multipliers
• least-norm solution solves optimization problem
minimize $x^T x$
subject to $Ax = y$
• introduce Lagrange multipliers: $L(x, \lambda) = x^T x + \lambda^T (Ax - y)$
• optimality conditions are
$\nabla_x L = 2x + A^T \lambda = 0, \quad \nabla_\lambda L = Ax - y = 0$
• from first condition, $x = -A^T \lambda / 2$
• substitute into second to get $\lambda = -2 (AA^T)^{-1} y$
• hence $x = A^T (AA^T)^{-1} y$
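The two optimality conditions can also be solved together as one linear system in $(x, \lambda)$. The sketch below stacks them into a block matrix (the name `K` and the random data are assumptions) and recovers the same $x$ and $\lambda$ as the closed-form expressions.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 2, 6
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# stack  2x + A^T lam = 0  and  A x = y  into one system K [x; lam] = [0; y]
K = np.block([[2 * np.eye(n), A.T],
              [A, np.zeros((m, m))]])
rhs = np.concatenate([np.zeros(n), y])
sol = np.linalg.solve(K, rhs)
x, lam = sol[:n], sol[n:]

# closed-form values from the derivation
x_closed = A.T @ np.linalg.solve(A @ A.T, y)
lam_closed = -2 * np.linalg.solve(A @ A.T, y)
print(np.linalg.norm(x - x_closed), np.linalg.norm(lam - lam_closed))
```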

Example: transferring mass unit distance
• unit mass at rest subject to forces $x_i$ for $i - 1 < t \leq i$, $i = 1, \ldots, 10$
• $y_1$ is position at $t = 10$, $y_2$ is velocity at $t = 10$
• $y = Ax$ where $A \in \mathbf{R}^{2 \times 10}$ ($A$ is fat)
• find least norm force that transfers mass unit distance with zero final velocity, i.e., $y = (1, 0)$

[figure: three plots versus $t$: the least-norm force $x_{\mathrm{ln}}$, the resulting position, and the resulting velocity]
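The example can be reproduced in a few lines. The entries of $A$ are not given above, so the sketch assumes the standard model: zero initial position and velocity, and force $x_i$ held constant on $i - 1 < t \leq i$, which gives final velocity $\sum_i x_i$ and final position $\sum_i (10 - i + 1/2) x_i$. The simulation at integer times is likewise an illustration.

```python
import numpy as np

n = 10
i = np.arange(1, n + 1)

# Assumed model (not spelled out in the notes): unit mass starting at rest
# at the origin, force x_i constant on i-1 < t <= i.
#   row 1: position at t = 10 is sum_i (10 - i + 1/2) x_i
#   row 2: velocity at t = 10 is sum_i x_i
A = np.vstack([n - i + 0.5, np.ones(n)])
y = np.array([1.0, 0.0])

x_ln = A.T @ np.linalg.solve(A @ A.T, y)

# velocity and position at t = 0, 1, ..., 10 under the force x_ln
vel = np.concatenate([[0.0], np.cumsum(x_ln)])
pos = np.concatenate([[0.0], np.cumsum(vel[:-1] + 0.5 * x_ln)])
print(x_ln)
print(pos[-1], vel[-1])
```

Under this model the force decreases linearly in $i$ and is antisymmetric about $t = 5$: the mass is pushed forward in the first half of the interval and braked in the second half.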

Relation to regularized least-squares
• suppose $A \in \mathbf{R}^{m \times n}$ is fat, full rank
• define $J_1 = \|Ax - y\|^2$, $J_2 = \|x\|^2$
• least-norm solution minimizes $J_2$ with $J_1 = 0$
• minimizer of weighted-sum objective $J_1 + \mu J_2 = \|Ax - y\|^2 + \mu \|x\|^2$ is
$x_\mu = (A^T A + \mu I)^{-1} A^T y$
• fact: $x_\mu \to x_{\mathrm{ln}}$ as $\mu \to 0$, i.e., regularized solution converges to least-norm solution as $\mu \to 0$
• in matrix terms: as $\mu \to 0$,
$(A^T A + \mu I)^{-1} A^T \to A^T (AA^T)^{-1}$
(for full rank, fat $A$)
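The convergence can be observed numerically. In this sketch the random data, the helper name `x_reg`, and the particular values of $\mu$ are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 3, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

def x_reg(mu):
    """Minimizer of ||Ax - y||^2 + mu ||x||^2 for mu > 0."""
    return np.linalg.solve(A.T @ A + mu * np.eye(n), A.T @ y)

mus = [1.0, 1e-2, 1e-4, 1e-6]
errors = [np.linalg.norm(x_reg(mu) - x_ln) for mu in mus]
for mu, err in zip(mus, errors):
    print(f"mu = {mu:g}: ||x_mu - x_ln|| = {err:.3e}")
```

Note that $A^T A$ alone is singular when $A$ is fat, so $\mu > 0$ is what makes the solve well posed; very small $\mu$ makes the system ill conditioned.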

General norm minimization with equality constraints
consider problem
minimize $\|Ax - b\|$
subject to $Cx = d$
with variable $x$
• includes least-squares and least-norm problems as special cases
• equivalent to
minimize $(1/2) \|Ax - b\|^2$
subject to $Cx = d$
• Lagrangian is
$L(x, \lambda) = (1/2) \|Ax - b\|^2 + \lambda^T (Cx - d)$
$= (1/2) x^T A^T A x - b^T A x + (1/2) b^T b + \lambda^T C x - \lambda^T d$
• optimality conditions are
$\nabla_x L = A^T A x - A^T b + C^T \lambda = 0, \quad \nabla_\lambda L = Cx - d = 0$
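• write in block matrix form as
$\begin{bmatrix} A^T A & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A^T b \\ d \end{bmatrix}$
• if the block matrix is invertible, we have
$\begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A^T A & C^T \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} A^T b \\ d \end{bmatrix}$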
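A minimal sketch of solving the block system numerically, assuming small random $A$, $b$, $C$, $d$ for which the block matrix is invertible (the sizes and the name `K` are illustrative). The system is solved directly instead of forming the inverse.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)
C = rng.standard_normal((2, 4))
d = rng.standard_normal(2)
n, p = A.shape[1], C.shape[0]

# block system  [A^T A  C^T; C  0] [x; lam] = [A^T b; d]
K = np.block([[A.T @ A, C.T],
              [C, np.zeros((p, p))]])
rhs = np.concatenate([A.T @ b, d])
sol = np.linalg.solve(K, rhs)
x, lam = sol[:n], sol[n:]

print(np.linalg.norm(C @ x - d))                          # feasibility
print(np.linalg.norm(A.T @ (A @ x - b) + C.T @ lam))      # stationarity
```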

if $A^T A$ is invertible, we can derive a more explicit (and complicated) formula for $x$
• from first block equation we get
$x = (A^T A)^{-1} (A^T b - C^T \lambda)$
• substitute into $Cx = d$ to get
$C (A^T A)^{-1} (A^T b - C^T \lambda) = d$
so
$\lambda = \left( C (A^T A)^{-1} C^T \right)^{-1} \left( C (A^T A)^{-1} A^T b - d \right)$
• recover $x$ from equation above (not pretty)
$x = (A^T A)^{-1} \left( A^T b - C^T \left( C (A^T A)^{-1} C^T \right)^{-1} \left( C (A^T A)^{-1} A^T b - d \right) \right)$
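The explicit formula can be evaluated step by step with linear solves. This sketch assumes a skinny, full-rank random $A$ (so that $A^T A$ is invertible) and a full-rank $C$; the intermediate names `G`, `Ginv_ATb`, and `Ginv_CT` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)
C = rng.standard_normal((2, 4))
d = rng.standard_normal(2)

G = A.T @ A                                  # invertible: A is skinny and full rank
Ginv_ATb = np.linalg.solve(G, A.T @ b)       # (A^T A)^{-1} A^T b
Ginv_CT = np.linalg.solve(G, C.T)            # (A^T A)^{-1} C^T

# lam = (C (A^T A)^{-1} C^T)^{-1} (C (A^T A)^{-1} A^T b - d)
lam = np.linalg.solve(C @ Ginv_CT, C @ Ginv_ATb - d)
# x = (A^T A)^{-1} (A^T b - C^T lam)
x = Ginv_ATb - Ginv_CT @ lam

print(np.linalg.norm(C @ x - d))                     # constraint Cx = d
print(np.linalg.norm(G @ x - A.T @ b + C.T @ lam))   # first optimality condition
```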
