Contents

1 The General Optimization Problem
2 Basic properties of solutions and algorithms
   2.3 Homework
3 Basic MATLAB
   3.1 Files and Directories in UNIX
   3.2 Other UNIX Commands
   3.3 Starting and quitting MATLAB
   3.4 Matrices
   3.5 Graphics
   3.6 Scripts and functions
   3.7 Files
   3.8 Homework
4 Basic descent methods
   4.6 Homework
5 The method of steepest descent
   5.1 The quadratic case
   5.2 Applying the method in Matlab
   5.3 Homework
6 Newton and quasi-Newton methods
   6.2 Extensions
   6.5 The function fminunc
   6.6 Examples
   6.7 Homework
   6.8 Project 1
7 Constrained Minimization Conditions
   7.2 Examples
   7.5 Sensitivity
   7.6 Homework
8 Lagrange methods
   8.1 Quadratic programming
      8.1.1 Equality constraints
      8.1.2 Inequality constraints
   8.2 Sequential Quadratic Programming
   8.3 Newton's Method
   8.4 Structured Methods
   8.5 Merit function
   8.6 Enlargement of the feasible region
   8.7 The Han–Powell method
   8.8 Constrained minimization in Matlab
   8.9 Constrained minimization in Matlab (using the function fmincon)
   8.10 Homework
9 Large scale problems
   9.2 Minimization with no constraints. Hessian provided
   9.3 Minimization with no constraints. Hessian not provided
   9.4 Minimization with constraints
   9.5 Project 2
10 Penalty and Barrier Methods
   10.2 Barrier method
1 The General Optimization Problem
$$\min_{x \in \mathbb{R}^d} f(x)$$
subject to:
$$g_i(x) = 0, \quad i = 1, \dots, m_e$$
$$g_i(x) \le 0, \quad i = m_e + 1, \dots, m$$
$$x_l \le x \le x_u.$$
Line descent methods: Here we deal with algorithms for finding the minimum in the case where $d = 1$. These algorithms are the basic building blocks when solving more complex optimization problems.
Lagrange methods: These methods are based on the Lagrange first-order conditions of a solution. The method is applied for quadratic programming.
2 Basic properties of solutions and algorithms
Example 2.3 Let the production function be $f(x_1, \dots, x_d)$, where the $x_i$ are the inputs. The unit price of the produced commodity is $q$ and the unit price of the $i$th input is $p_i$. The producer wants to maximize
$$q f(x_1, \dots, x_d) - p_1 x_1 - \cdots - p_d x_d.$$
The first-order conditions can be interpreted as stating that the marginal value increase must be equal to $p_i$.
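Explicitly, setting the partial derivatives of the profit to zero gives
$$q \frac{\partial f}{\partial x_i}(x_1, \dots, x_d) = p_i, \qquad i = 1, \dots, d.$$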
Theorem 2.3 (Second-order sufficient conditions.) Let $f \in C^2$. Assume that $x^* \in \Omega_0$. If $\dot f(x^*) = 0$ and $\ddot f(x^*)$ is positive definite, then $x^*$ is a strict relative minimum.
1. $x_k \to x$ and
2. $y_k \to y$, $y_k \in A(x_k)$, imply
3. $y \in A(x)$.
Example 2.6 Suppose for $x \in \mathbb{R}$ we define $A(x) = [-|x|/2,\ |x|/2]$. Starting at $x_0 = 100$, each of the sequences
$$100,\ 50,\ 25,\ 12,\ -6,\ -2,\ 1,\ 1/2,\ \dots$$
$$100,\ -40,\ 20,\ -5,\ -2,\ 1,\ 1/4,\ 1/8,\ \dots$$
$$100,\ 10,\ 1/16,\ 1/100,\ -1/1000,\ 1/10000,\ \dots$$
might be generated from iterative application of the algorithm. The given algorithm is closed.
2.3 Homework
$$f(x, y, z) = 2x^2 + xy + y^2 + yz + z^2 - 6x - 7y - 8z + 9.$$
for a given $x_0$. Find the equations that determine the first-order conditions.
4. Define the point-to-set mapping on $\mathbb{R}^n$ by
$$A(x) = \{y : y'x \le b\},$$
3 Basic MATLAB
The home directory is "~", the current directory is "." and one directory up is "..".
3.4 Matrices
>> A = magic(4)
A =
    16     3     2    13
     5    10    11     8
     9     6     7    12
     4    15    14     1
>> s = 1 -1/2 + 1/3 - 1/4 + 1/5 - 1/6 + 1/7 ...
-1/8 + 1/9 -1/10
s =
0.6456
>> A’*A
ans =
378 212 206 360
212 370 368 206
206 368 370 212
360 206 212 378
>> det(A)
ans =
0
>> eig(A)
ans =
34.0000
8.9443
0.0000
-8.9443
>> (A/34)^5
ans =
0.2507 0.2495 0.2494 0.2504
0.2497 0.2501 0.2502 0.2500
0.2500 0.2498 0.2499 0.2503
0.2496 0.2506 0.2505 0.2493
>> A’.*A
ans =
256 15 18 52
15 100 66 120
18 66 49 168
52 120 168 1
>> n = (0:3)';
>> pows = [n n.^2 2.^n]
pows =
0 0 1
1 1 2
2 4 4
3 9 8
3.5 Graphics
>> t = 0:pi/100:2*pi;
>> y = sin(t);
>> plot(t,y)
>> y2 = sin(t-0.25);
>> y3 = sin(t-0.5);
>> plot(t,y,t,y2,t,y3)
>> [x,y,z]=peaks;
>> contour(x,y,z,20,'k')
>> hold on
>> pcolor(x,y,z)
>> hold off
3.6 Scripts and functions
M-files are text files containing MATLAB code. M-file names end with the extension .m. Functions are M-files that can accept input arguments and return output arguments. Variables are, in general, local. MATLAB provides many functions. You can also write your own function in an M-file:
function h = falling(t)
global GRAVITY
h = 1/2*GRAVITY*t.^2;
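The global variable must be set in the workspace before the function is called; for example:

>> global GRAVITY
>> GRAVITY = 32;
>> y = falling((0:0.1:5)');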
3.7 Files
>> save B A
>> A = 0
A =
0
>> load B
>> A
A =
16 3 2 13
5 10 11 8
9 6 7 12
4 15 14 1
function t = logtab1(n)
x = 0.01;
for k = 1:n
    y(k) = log10(x);
    x = x + 0.01;
end
t = y;

function t = logtab2(n)
x = 0.01:0.01:(n*0.01);
t = log10(x);
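For example (the values are $\log_{10}$ of 0.01, 0.02 and 0.03):

>> logtab2(3)
ans =
   -2.0000   -1.6990   -1.5229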
3.8 Homework
1. Let $f(x) = ax^2 - 2bx + c$. Under which conditions does $f$ have a minimum? What is the minimizing $x$?
3. Write a MATLAB function that finds the location and value of the mini-
mum of a quadratic function.
4 Basic descent methods
The best known method of line search is Newton's method. Assume not only that the function is continuous but also that it is smooth. Given the first and second derivatives of the function at $x_n$, one can write the Taylor expansion
$$f(x) \approx q(x) = f(x_n) + f'(x_n)(x - x_n) + \tfrac{1}{2} f''(x_n)(x - x_n)^2.$$
Setting $q'(x) = 0$ yields the iteration
$$x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}.$$
(Note that this approach can be generalized to the problem of finding the zeros of the function $g(x) = q'(x)$.)
We can expect that the solution of an iterative procedure of this type will satisfy
$$x^* = x^* - \frac{f'(x^*)}{f''(x^*)} \quad \Longrightarrow \quad f'(x^*) = 0.$$
We say that an algorithm converges at rate $p$ at least to a solution $x^*$ if
$$\lim_n \frac{\|x_{n+1} - x^*\|}{\|x_n - x^*\|^p} < \infty.$$
Theorem 4.1 Let the function $g$ have a continuous second derivative and let $x^*$ be such that $g(x^*) = 0$ and $g'(x^*) \neq 0$. Then the Newton method converges with an order of convergence of at least two, provided that $x_0$ is sufficiently close to $x^*$.
In order for Matlab to be able to read and write files on disk D you should use the command
>> cd d:
>> grid on
>> fplot('[2*sin(x+3), humps(x)]', [-5 5])
>> fmin('humps',0.3,1)
ans =
0.6370
>> fmin('humps',0.3,1,1)
Func evals x f(x) Procedure
1 0.567376 12.9098 initial
2 0.732624 13.7746 golden
3 0.465248 25.1714 golden
4 0.644416 11.2693 parabolic
5 0.6413 11.2583 parabolic
6 0.637618 11.2529 parabolic
7 0.636985 11.2528 parabolic
8 0.637019 11.2528 parabolic
9 0.637052 11.2528 parabolic
ans =
0.6370
Assume we are given $x_1 < x_2 < x_3$ and the values of $f(x_i)$, $i = 1, 2, 3$, which satisfy
$$f(x_2) < f(x_1) \quad \text{and} \quad f(x_2) < f(x_3).$$
The quadratic passing through these points is given by
$$q(x) = \sum_{i=1}^{3} f(x_i) \frac{\prod_{j \neq i}(x - x_j)}{\prod_{j \neq i}(x_i - x_j)}.$$
It can be shown that the order of convergence to the solution is (approximately) 1.3.
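A sketch of one step of this quadratic-fit search in MATLAB (the function name is mine; the formula is the standard vertex of the interpolating parabola):

function xmin = quadfit_step(x, fx)
% x, fx: length-3 vectors with x(1) < x(2) < x(3), fx(i) = f(x(i)),
% and fx(2) the smallest value; returns the minimizer of the fitted parabola.
num = (x(2)-x(1))^2*(fx(2)-fx(3)) - (x(2)-x(3))^2*(fx(2)-fx(1));
den = (x(2)-x(1))*(fx(2)-fx(3)) - (x(2)-x(3))*(fx(2)-fx(1));
xmin = x(2) - 0.5*num/den;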
Given $x_1$ and $x_2$, together with $f(x_1)$, $f'(x_1)$, $f(x_2)$ and $f'(x_2)$, one can consider a cubic polynomial of the form
$$q(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3,$$
which satisfies
$$q''(x) = 2a_2 + 6a_3 x > 0.$$
It follows that the appropriate interpolation is given by
$$x_3 = x_2 - (x_2 - x_1)\,\frac{f'(x_2) + \beta_2 - \beta_1}{f'(x_2) - f'(x_1) + 2\beta_2},$$
where
$$\beta_1 = f'(x_1) + f'(x_2) - 3\,\frac{f(x_1) - f(x_2)}{x_1 - x_2},$$
$$\beta_2 = \left(\beta_1^2 - f'(x_1) f'(x_2)\right)^{1/2}.$$
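In MATLAB, one step of the cubic fit might look as follows (a sketch; the function name is mine):

function x3 = cubicfit_step(x1, x2, f1, fp1, f2, fp2)
% One cubic-interpolation step: f1 = f(x1), fp1 = f'(x1), etc.
beta1 = fp1 + fp2 - 3*(f1 - f2)/(x1 - x2);
beta2 = sqrt(beta1^2 - fp1*fp2);
x3 = x2 - (x2 - x1)*(fp2 + beta2 - beta1)/(fp2 - fp1 + 2*beta2);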
4.6 Homework
3.(a) Given $f(x_n)$, $f'(x_n)$ and $f'(x_{n-1})$, show that
$$q(x) = f(x_n) + f'(x_n)(x - x_n) + \frac{f'(x_{n-1}) - f'(x_n)}{x_{n-1} - x_n} \cdot \frac{(x - x_n)^2}{2},$$
Starting at any x > 0 show that, through a series of halving and doubling
of x and evaluation of the corresponding f (x)’s, a three-point pattern can
be determined.
5 The method of steepest descent
$$x_{n+1} = x_n - \alpha_n \dot f(x_n),$$
where $\alpha_n$ is the nonnegative scalar that minimizes $f(x_n - \alpha \dot f(x_n))$. It can be shown that relative to the solution set $\{x^* : \dot f(x^*) = 0\}$, the algorithm is descending and closed, and thus converging.
Assume
$$f(x) = \tfrac{1}{2} x'Qx - x'b = \tfrac{1}{2}(x - x^*)'Q(x - x^*) - \tfrac{1}{2} x^{*\prime} Q x^*,$$
where $Q$ is a positive definite and symmetric matrix and $x^* = Q^{-1}b$ is the minimizer of $f$. Note that in this case $\dot f(x) = Qx - b$, and
$$f(x_n - \alpha \dot f(x_n)) = \tfrac{1}{2}(x_n - \alpha \dot f(x_n))'Q(x_n - \alpha \dot f(x_n)) - (x_n - \alpha \dot f(x_n))'b,$$
which is minimized at
$$\alpha_n = \frac{\dot f(x_n)' \dot f(x_n)}{\dot f(x_n)' Q \dot f(x_n)}.$$
It follows that
$$\tfrac{1}{2}(x_{n+1} - x^*)'Q(x_{n+1} - x^*) = \left[1 - \frac{(\dot f(x_n)' \dot f(x_n))^2}{\dot f(x_n)' Q \dot f(x_n)\; \dot f(x_n)' Q^{-1} \dot f(x_n)}\right] \times \tfrac{1}{2}(x_n - x^*)'Q(x_n - x^*).$$
Proof: By a change of variables we may assume that $Q$ is diagonal, in which case
$$\frac{(x'x)^2}{(x'Qx)(x'Q^{-1}x)} = \frac{\left(\sum_{i=1}^d x_i^2\right)^2}{\left(\sum_{i=1}^d \lambda_i x_i^2\right)\left(\sum_{i=1}^d x_i^2/\lambda_i\right)}.$$
Denoting $\xi_i = x_i^2 / \sum_{j=1}^d x_j^2$, the above becomes
$$\frac{1}{\left(\sum_{i=1}^d \xi_i \lambda_i\right)\left(\sum_{i=1}^d \xi_i/\lambda_i\right)} = \frac{\varphi(\xi_1, \dots, \xi_d)}{\psi(\xi_1, \dots, \xi_d)}.$$
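A MATLAB sketch of the method with this exact step size for the quadratic case (the names are mine):

function x = sd_quad(Q, b, x, tol)
% Steepest descent for f(x) = x'*Q*x/2 - x'*b with the exact line search.
g = Q*x - b;                    % gradient of f at x
while norm(g) > tol
    alpha = (g'*g)/(g'*Q*g);    % exact minimizer of f(x - alpha*g)
    x = x - alpha*g;
    g = Q*x - b;
end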
>> Q = [0.78 -0.02 -0.12 -0.14; -0.02 0.86 -0.04 0.06; ...
-0.12 -0.04 0.72 -0.08; -0.14 0.06 -0.08 0.74]
Q =
0.7800 -0.0200 -0.1200 -0.1400
-0.0200 0.8600 -0.0400 0.0600
-0.1200 -0.0400 0.7200 -0.0800
-0.1400 0.0600 -0.0800 0.7400
>> b = [0.76 0.08 1.12 0.68];
>> eig(Q)
ans =
0.8800
0.9400
0.7600
0.5200
>> ((0.94 - 0.52)/(0.94 + 0.52))^2
ans =
    0.0827
x =
0.5000 -1.0000
>> fun(x)
ans =
1.3029e-010
>> x=[-1,1];
>> options(6)=2;
>> x = fminu('fun',x,options)
x =
0.5000 -1.0000
function f = fun1(x)
f = 100*(x(2)-x(1)^2)^2 + (1 - x(1))^2;
>> x=[-1,1];
>> options(1)=1;
>> options(6)=0;
>> x = fminu('fun1',x,options)
f-COUNT FUNCTION STEP-SIZE GRAD/SD
4 4 0.500001 -16
9 3.56611e-009 0.500001 0.0208
14 7.36496e-013 0.000915682 -3.1e-006
21 1.93583e-013 9.12584e-005 -1.13e-006
24 1.55454e-013 4.56292e-005 -7.16e-007
Optimization Terminated Successfully
Search direction less than 2*options(2)
Gradient in the search direction less than 2*options(3)
NUMBER OF FUNCTION EVALUATIONS=24
x =
1.0000 1.0000
>> x=[-1,1];
>> options(6)=2;
>> x = fminu('fun1',x,options)
f-COUNT FUNCTION STEP-SIZE GRAD/SD
4 4 0.500001 -16
9 3.56611e-009 0.500001 0.0208
15 1.11008e-012 0.000519178 -4.82e-006
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 1.931503e-017.
> In c:\matlab\toolbox\optim\cubici2.m at line 10
In c:\matlab\toolbox\optim\searchq.m at line 54
In c:\matlab\toolbox\optim\fminu.m at line 257
....
5.3 Homework
6 Newton and quasi-Newton methods
Proof:
6.2 Extensions
$$x_{n+1} = x_n - \alpha_n S_n \dot f(x_n),$$
Assume
$$f(x) = \tfrac{1}{2} x'Qx - x'b = \tfrac{1}{2}(x - x^*)'Q(x - x^*) - \tfrac{1}{2} x^{*\prime} Q x^*,$$
where $Q$ is a positive definite and symmetric matrix and $x^* = Q^{-1}b$ is the minimizer of $f$. Note that in this case $\dot f(x) = Qx - b$, and
$$f(x_n - \alpha S_n \dot f(x_n)) = \tfrac{1}{2}(x_n - \alpha S_n \dot f(x_n))'Q(x_n - \alpha S_n \dot f(x_n)) - (x_n - \alpha S_n \dot f(x_n))'b,$$
which is minimized at
$$\alpha_n = \frac{\dot f(x_n)' S_n \dot f(x_n)}{\dot f(x_n)' S_n Q S_n \dot f(x_n)}.$$
It follows that
$$\tfrac{1}{2}(x_{n+1} - x^*)'Q(x_{n+1} - x^*) = \left[1 - \frac{(\dot f(x_n)' S_n \dot f(x_n))^2}{\dot f(x_n)' S_n Q S_n \dot f(x_n)\; \dot f(x_n)' Q^{-1} \dot f(x_n)}\right] \times \tfrac{1}{2}(x_n - x^*)'Q(x_n - x^*).$$
1. Minimize $f(x_n - \alpha S_n \dot f(x_n))$ to obtain $x_{n+1}$, $\Delta_n x = x_{n+1} - x_n = -\alpha_n S_n \dot f(x_n)$, $\dot f(x_{n+1})$ and $\Delta_n \dot f = \dot f(x_{n+1}) - \dot f(x_n)$.
2. Set
$$S_{n+1} = S_n + \frac{\Delta_n x \, \Delta_n x'}{\Delta_n x' \Delta_n \dot f} - \frac{S_n \Delta_n \dot f \, \Delta_n \dot f' S_n}{\Delta_n \dot f' S_n \Delta_n \dot f}.$$
3. Go to 1.
It follows, since $\Delta_n x' \dot f(x_{n+1}) = 0$, that if $S_n$ is positive definite then so is $S_{n+1}$.
Proof: Define $\Delta x = x_{n+1} - x_n$, $\Delta \dot f = \dot f(x_{n+1}) - \dot f(x_n)$. Since
it follows that
2. Set
$$H_{n+1} = H_n + \frac{\Delta_n \dot f \, \Delta_n \dot f'}{\Delta_n \dot f' \Delta_n x} - \frac{H_n \Delta_n x \, \Delta_n x' H_n}{\Delta_n x' H_n \Delta_n x}.$$
3. Go to 1.
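A one-line MATLAB sketch of this update (the names are mine; dx and dg stand for $\Delta_n x$ and $\Delta_n \dot f$):

function H = bfgs_update(H, dx, dg)
% BFGS update of the approximation H to the Hessian.
H = H + (dg*dg')/(dg'*dx) - (H*(dx*dx')*H)/(dx'*H*dx);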
6.5 The function fminunc
the value of the gradient of FUN at the solution X.
[X,FVAL,EXITFLAG,OUTPUT,GRAD,HESSIAN]=FMINUNC(FUN,X0,...)
returns the value of the Hessian of the objective function FUN at the solution
X.
6.6 Examples
function f = myfun(x)
f = 3*x(1)^2 + 2*x(1)*x(2) + x(2)^2; % cost function
>> x0 = [1,1];
>> [x,fval] = fminunc('myfun',x0)
After a couple of iterations, the solution, x, and the value of the function at
x, fval, are returned:
x =
1.0e-008 *
-0.7914 0.2260
fval =
1.5722e-016
To minimize this function with the gradient provided, modify the M-file
myfun.m so the gradient is the second output argument
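A sketch of the modified file, together with a call that tells fminunc to use the gradient (the gradient follows by differentiating the cost function above; the exact call is an assumption, consistent with the fval2 printed below):

function [f, g] = myfun(x)
f = 3*x(1)^2 + 2*x(1)*x(2) + x(2)^2;        % cost function
if nargout > 1
    g = [6*x(1) + 2*x(2); 2*x(1) + 2*x(2)]; % gradient of the cost
end

>> options = optimset('GradObj', 'on');
>> [x, fval2] = fminunc('myfun', x0, options)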
After several iterations the solution x and fval, the value of the function at
x, are returned:
x =
1.0e-015 *
-0.6661 0
fval2 =
1.3312e-030
6.7 Homework
What is the rate if $\delta$ is larger than the smallest eigenvalue of $(\ddot f(x^*))^{-1}$?
4. Read the help file on the function fminu. Investigate the effect of supplying the gradients with the parameter grad on the performance of the procedure. Compare, in particular, the functions bilinear and fun1.
6.8 Project 1
Use the function fminunc to identify local minima of the function. Try the procedure with and without providing the gradient. Try it for different $n$'s. What is the largest $n$ for which the convergence is successful?
7 Constrained Minimization Conditions
$$\min_{x \in \mathbb{R}^d} f(x)$$
subject to:
$$g_i(x) = 0, \quad i = 1, \dots, m_e$$
$$g_i(x) \le 0, \quad i = m_e + 1, \dots, m$$
$$x_l \le x \le x_u.$$
The first me constraints are called equality constraints and the last m − me
constraints are the inequality constraints.
$$M = \{y : \dot g(x^*)'y = 0\}.$$
The necessary conditions can be formulated as $\dot l = 0$. The matrix of partial second derivatives of $l$ (with respect to $x$) at $x^*$ is
$$\ddot l_x(x^*) = \ddot f(x^*) + \ddot g(x^*)\lambda = \ddot f(x^*) + \sum_{j=1}^m \ddot g_j(x^*)\lambda_j.$$
We say that this matrix is positive semidefinite over $M$ if $x' \ddot l_x(x^*) x \ge 0$ for all $x \in M$.
7.2 Examples
y+z+λ = 0
x+z+λ = 0
x+y+λ = 0
x + y + z = 3.
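(Subtracting the first two equations gives $y = x$; similarly $z = x$, so the last equation yields $x = y = z = 1$, and then $\lambda = -2$.)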
The problem can be formulated as
$$\min f(p_1, \dots, p_d)$$
subject to:
$$\sum_{i=1}^d p_i = 1, \qquad \sum_{i=1}^d x_i p_i = m,$$
$$0 \le p_i, \quad i = 1, \dots, d.$$
The first-order conditions are
$$-\log(p_i) - 1 + \lambda_1 + \lambda_2 x_i = 0, \quad i = 1, \dots, d,$$
which leads to
$$p_i = \exp\{(\lambda_1 - 1) + \lambda_2 x_i\}, \qquad \sum_{i=1}^d p_i = 1, \qquad \sum_{i=1}^d x_i p_i = m.$$
Example 7.3 A chain is suspended from two hooks that are t meters apart
on a horizontal line. The chain consists of d links. Each link is 1 meter
long (measured from the inside). What is the shape of the chain?
The constraints are:
$$\sum_{i=1}^d y_i = 0, \qquad \sum_{i=1}^d x_i = \sum_{i=1}^d \sqrt{1 - y_i^2} = t,$$
which leads to
$$y_i = -\frac{d - i + 0.5 + \lambda_1}{[\lambda_2^2 + (d - i + 0.5 + \lambda_1)^2]^{1/2}}.$$
Let
$$\ddot l_x(x^*) = \ddot f(x^*) + \sum_{j=1}^m \lambda_j \ddot g_j(x^*).$$
Theorem 7.4 (Second-order condition) Let $x^*$ be a local extremum point of $f$ subject to the constraints $g_j(x) = 0$, $1 \le j \le m_e$, and $g_j(x) \le 0$, $m_e + 1 \le j \le m$. Assume that $x^*$ is a regular point of these constraints, and let $\lambda \in \mathbb{R}^m$ be such that $\lambda_j \ge 0$ for all $j > m_e$, and
$$\dot f(x^*) + \sum_{j=1}^m \lambda_j \dot g_j(x^*) = 0.$$
Then the matrix $\ddot l_x(x^*)$ is positive semidefinite on the tangent subspace of the active constraints.
Suppose also that the matrix $\ddot l_x(x^*)$ is positive definite on $M$. Then $x^*$ is a strict local minimum for the constrained optimization problem.
$$M' = \{y : \dot g_j(x^*)'y = 0, \ j \in J\}$$
Example 7.4 Consider the problem:
$$\min\ 2x^2 + 2xy + y^2 - 10x - 10y$$
$$\text{subject to } x^2 + y^2 \le 5, \quad 3x + y \le 6.$$
The first-order conditions are
$$4x + 2y - 10 + 2\lambda_1 x + 3\lambda_2 = 0$$
$$2x + 2y - 10 + 2\lambda_1 y + \lambda_2 = 0$$
$$\lambda_1 \ge 0, \quad \lambda_2 \ge 0$$
$$\lambda_1(x^2 + y^2 - 5) = 0$$
$$\lambda_2(3x + y - 6) = 0.$$
One should check different subsets of active and inactive constraints. For example, if we set $J = \{1\}$ then
$$4x + 2y - 10 + 2\lambda_1 x + 3\lambda_2 = 0$$
$$2x + 2y - 10 + 2\lambda_1 y + \lambda_2 = 0$$
$$x^2 + y^2 = 5,$$
7.5 Sensitivity
minimize f (x)
subject to g(x) = c.
For each c, assume the existence of a solution point x∗ (c). Under appropriate
regularity conditions the function x∗ (c) is well behaved with x∗ (0) = x∗ .
Theorem 7.7 (Sensitivity Theorem) Let $f, g \in C^2$ and consider the family of problems defined above. Suppose that for $c = 0$ there is a local solution $x^*$ that is a regular point and that, together with its associated Lagrange multiplier vector $\lambda$, satisfies the second-order sufficient conditions for
a strict local minimum. Then for every c in a neighborhood of 0 there is
x∗ (c), continuous in c, such that x∗ (0) = x∗ , x∗ (c) is a local minimum of
the constrained problem indexed by c, and
7.6 Homework
2. Find the rectangle of given perimeter that has greatest area by solving
the first-order necessary conditions. Verify that the second-order sufficient
conditions are satisfied.
3. Three types of items are to be stored. Item A costs one dollar, item B costs two dollars and item C costs four dollars. The demands for the three items are independent and uniformly distributed in the range [0, 3000]. How many of each type should be stored if the total budget is 4,000 dollars?
is non-singular.
$$\min\ x'Qx - 2b'x \quad \text{subject to } Ax = c.$$
8 Lagrange methods
8.1.2 Inequality constraints
For the general quadratic programming problem, the active set method is used. A working set of constraints $W_n$ is updated in each iteration. The set $W_n$ contains all constraints that are suspected to satisfy an equality relation at the solution point. In particular, it contains the equality constraints. An algorithm for solving the general quadratic problem is:
1. Start with a feasible point $x_0$ and a working set $W_0$. Set $n = 0$.

If $d_n^* = 0$ go to 4.
Take $x_0 = (0, 0)'$ and $W_0 = \{2, 3\}$. Then $d_0^* = (0, 0)'$. Both Lagrange multipliers are negative, but the one corresponding to (2) is more negative. Drop that constraint, and put $W_1 = \{3\}$. Minimizing along the line $y = 0$ leads to $x_1 = (3, 0)'$. The Lagrange multiplier of the active constraint is negative, thus $W_2 = \emptyset$. Also, $d_1^* = (-1, 4)$, the direction to the overall optimum at $(2, 4)'$. We move to the constraint (1), and write $W_3 = \{1\}$. Finally, we move along this constraint to the solution.
8.2 Sequential Quadratic Programming
$$\min f(x)$$
$$\text{subject to } g_i(x) = 0, \quad i = 1, \dots, m_e,$$
$$g_i(x) \le 0, \quad i = m_e + 1, \dots, m.$$
Consider the case of equality constraints only. At each iteration the problem
These methods are modifications of the basic Newton method, with approximations replacing the Hessian. One can rewrite the solution to the Newton step in the form
$$\begin{pmatrix} x_{n+1} \\ \lambda_{n+1} \end{pmatrix} = \begin{pmatrix} x_n \\ \lambda_n \end{pmatrix} - \begin{pmatrix} \ddot l_n & \dot g_n' \\ \dot g_n & 0 \end{pmatrix}^{-1} \begin{pmatrix} \dot l_n \\ g_n \end{pmatrix}.$$
Instead, one can use the formula
$$\begin{pmatrix} x_{n+1} \\ \lambda_{n+1} \end{pmatrix} = \begin{pmatrix} x_n \\ \lambda_n \end{pmatrix} - \alpha_n \begin{pmatrix} H_n & \dot g_n' \\ \dot g_n & 0 \end{pmatrix}^{-1} \begin{pmatrix} \dot l_n \\ g_n \end{pmatrix},$$
where $H_n$ is an approximation of $\ddot l_n$ and $\alpha_n$ is a step size.
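A MATLAB sketch of one such damped step (all names are mine; H approximates $\ddot l_n$, dg is the Jacobian of $g$ at $x_n$, dl the gradient of the Lagrangian, gval $= g(x_n)$, and alpha the step size):

function [x, lam] = lagrange_newton_step(x, lam, H, dg, dl, gval, alpha)
% One damped Lagrange-Newton step for the equality-constrained problem.
d = length(x);
m = size(dg, 1);
step = [H, dg'; dg, zeros(m)] \ [dl; gval];  % solve the linear system above
x = x - alpha*step(1:d);
lam = lam - alpha*step(d+1:end);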
8.5 Merit function
In order to choose $\alpha_n$ and to ensure that the algorithm converges, a merit function is associated with the problem, such that a solution of the constrained problem is a (local) minimum of the merit function. The algorithm should be descending with respect to the merit function.
Consider, for example, the problem with inequality constraints only:
minimize f (x)
subject to gi (x) ≤ 0, i = 1, . . . , m.
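The derivation below uses the merit function
$$Z(x) = f(x) + c \sum_{i=1}^m g_i(x)^+, \qquad g^+ = \max\{g, 0\},$$
for a constant $c > 0$ (this is the $Z(x)$ that appears in the expansion).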
$$Z(x + \alpha d) = f(x) + \alpha \dot f(x)'d + c \sum_{i=1}^m \left[g_i(x) + \alpha \dot g_i(x)'d + o(\alpha)\right]^+$$
$$= f(x) + \alpha \dot f(x)'d + c \sum_{i=1}^m g_i(x)^+ + \alpha c \sum_{j \in J(x)} \dot g_j(x)'d + o(\alpha)$$
$$= Z(x) + \alpha \dot f(x)'d + \alpha c \sum_{j \in J(x)} \dot g_j(x)'d + o(\alpha).$$
Here we applied condition (5) in order to infer that $\dot g_j(x)'d \le 0$ if $g_j(x) = 0$. Using this condition again we get
$$c \sum_{j \in J(x)} \dot g_j(x)'d \le c \sum_{j \in J(x)} (-g_j(x)) = -c \sum_{j=1}^m g_j(x)^+. \qquad (8)$$
The conclusion follows from the assumption that $H$ is positive definite and the assumption $c \ge \max_j \lambda_j$.
minimize f (x)
subject to gi (x) ≤ 0, i = 1, . . . , m.
Assume that at the current iteration xn = x and Hn = H. Then one
wants to consider the QP problem:
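Write the M-file fun4.m; the version below is a sketch, reconstructed to be consistent with the constraint gradients in grudf4.m and with the session output:

function [f, g] = fun4(x)
% Objective and inequality constraints g(x) <= 0 for the example.
f = exp(x(1))*(4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
g(1) = x(1)*x(2) - x(1) - x(2) + 1.5;
g(2) = -x(1)*x(2) - 10;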
and run the Matlab session:
>> x0 = [-1,1];
>> x = constr('fun4', x0)
x =
-9.5474 1.0474
>> [f,g] = fun4(x)
f =
0.0236
g =
1.0e-014 *
0.1110
-0.1776
>> options = [];
>> vlb = [0,0];
>> vlu = [];
>> x = constr('fun4', x0, options, vlb, vlu)
x =
0 1.5000
>> [f,g] = fun4(x)
f =
8.5000
g =
0
-10
Write the M-file grudf4.m:
function [df,dg] = grudf4(x)
f = exp(x(1))*(4*x(1)^2+2*x(2)^2+4*x(1)*x(2)+2*x(2)+1);
df = [f + exp(x(1))*(8*x(1) + 4*x(2)), exp(x(1))*(4*x(1) + 4*x(2) + 2)];
dg = [x(2) - 1, -x(2); x(1) - 1, -x(1)];
and run the session:
>> vlb = [];
>> x = constr('fun4', x0, options, vlb, vlu, 'grudf4')
x =
-9.5474 1.0474
Let us now consider the same problem, using the function fmincon. Write the M-file obj5.m:
function f = obj5(x)
f=exp(x(1)) * (4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
and con5.m:
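A sketch of con5.m, consistent with the constraint values printed below:

function [c, ceq] = con5(x)
c = [x(1)*x(2) - x(1) - x(2) + 1.5;  % nonlinear inequalities, c(x) <= 0
     -x(1)*x(2) - 10];
ceq = [];                            % no nonlinear equality constraints

Then run fmincon (the exact call here is my assumption):

>> x0 = [-1,1];
>> [x,fval,exitflag,output] = fmincon('obj5',x0,[],[],[],[],[],[],'con5');
>> x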
x =
-9.5474 1.0474
>> fval
fval =
    0.0236
>> [c, ceq] = con5(x)
c =
1.0e-014 *
0.1110
-0.1776
ceq =
[]
>> output.funcCount
ans =
38
The above problem can be solved more efficiently and accurately if gradients are supplied by the user. Create the M-files obj5grad.m and con5grad.m:
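Sketches of these files (the gradients match grudf4.m above; fmincon expects the constraint gradients one per column):

function [f, df] = obj5grad(x)
f = exp(x(1))*(4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
df = [f + exp(x(1))*(8*x(1) + 4*x(2));   % df/dx1
      exp(x(1))*(4*x(1) + 4*x(2) + 2)];  % df/dx2

function [c, ceq, dc, dceq] = con5grad(x)
c = [x(1)*x(2) - x(1) - x(2) + 1.5; -x(1)*x(2) - 10];
ceq = [];
dc = [x(2) - 1, -x(2);                   % gradient of c(i) in column i
      x(1) - 1, -x(1)];
dceq = [];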
>> x0
x0 =
-1 1
>> options = optimset('LargeScale','off');
>> options = optimset(options,'GradObj','on','GradConstr','on');
>> [x,fval,exitflag,output] = fmincon('obj5grad',x0,[],[],[],[],[],[],...
'con5grad',options);
Optimization terminated successfully:
Search direction less than 2*options.TolX and
maximum constraint violation is less than options.TolCon
Active Constraints:
1
2
>> x
x =
-9.5474 1.0474
>> fval
fval =
0.0236
>> [c, ceq] = con5grad(x)
c =
1.0e-014 *
0.1110
-0.1776
ceq =
[]
>> output.funcCount
ans =
20
8.10 Homework
1. Read the help files of the functions constr and fmincon. Redo the ex-
ample given in class with the function fmincon. Compare the properties
of the two functions in QP problems of different magnitude.
3. Extend the result in 2 to the case where $H = H_n$ changes, but $\epsilon\|x\|^2 \le x'H_n x \le c\|x\|^2$ for some $0 < \epsilon < c < \infty$ and for all $x$ and $n$.
9 Large scale problems
Large scale problems require special techniques to deal with memory problems and numerical complications.
Consider the issue in the context of unconstrained minimization of a function $f(x)$. The basic approaches to minimization can be described by the following general heuristic: Define a neighborhood $N$ of the current $x$. Approximate the function $f$ by a function $q$ over $N$. The solution of the minimization problem for $q$ provides a candidate for a new $x$, $x + d$. This candidate is adopted if $f(x + d) < f(x)$. (For example, $q(d) = f(x) + \dot f(x)'d + (1/2) d' \ddot f(x) d$ and $N = \{\|Dd\| \le \Delta\}$, for $D$ a diagonal scaling matrix.)
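A rough MATLAB sketch of one step of this heuristic (all names are mine; the candidate is the Newton step, truncated to the neighborhood $N$ with $D = I$):

function x = basic_step(f, g, B, x, Delta)
% f: objective handle; g: gradient at x; B: (approximate) Hessian at x.
d = -B \ g;                    % minimizer of the quadratic model q
if norm(d) > Delta             % candidate falls outside N:
    d = -(Delta/norm(g))*g;    % fall back to a scaled gradient step
end
if f(x + d) < f(x)             % adopt the candidate only if it descends
    x = x + d;
end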
However, when the dimension of the problem is large this approach is not feasible. The alternative used in MATLAB is to choose a two-dimensional subspace $S$ and to constrain the analysis to that subspace. The subspace is formed by taking the direction of the gradient and either the Newton direction, i.e. the solution to
$$\ddot f(x) d_2 = -\dot f(x),$$
9.2 Minimization with no constraints. Hessian provided
Let us consider the function which was analyzed as part of the first project:
$$f(x) = \sum_{i=1}^{n-1} \left[ (x_i^2)^{(x_{i+1}^2 + 1)} + (x_{i+1}^2)^{(x_i^2 + 1)} \right].$$
Let us first minimize this function, providing the sparse tridiagonal Hessian matrix. Start with the M-file brownfgh.m:
v(i+1)=v(i+1)+4*(x(i).^2+1).*(x(i+1).^2).*(x(i).^2).*((x(i+1).^2).^(x(i).^2-1));
v0=v;
v=zeros(n-1,1);
v(i)=4*x(i+1).*x(i).*((x(i).^2).^(x(i+1).^2))+...
4*x(i+1).*(x(i+1).^2+1).*x(i).*((x(i).^2).^(x(i+1).^2)).*log(x(i).^2);
v(i)=v(i)+ 4*x(i+1).*x(i).*((x(i+1).^2).^(x(i).^2)).*log(x(i+1).^2);
v(i)=v(i)+4*x(i).*((x(i+1).^2).^(x(i).^2)).*x(i+1);
v1=v;
i=[(1:n)';(1:(n-1))'];
j=[(1:n)';(2:n)'];
s=[v0;2*v1];
H=sparse(i,j,s,n,n);
H=(H+H')/2;
end
>> n = 5;
>> v0=ones(n,1);
>> v1=ones(n-1,1);
>> s=[v0;2*v1];
>> i=[(1:n)';(1:(n-1))'];
>> j=[(1:n)';(2:n)'];
>> H=sparse(i,j,s,n,n);
>> full(H)
ans =
1 2 0 0 0
0 1 2 0 0
0 0 1 2 0
0 0 0 1 2
0 0 0 0 1
>> n = 1000;
>> xstart = -ones(n,1);
>> xstart(2:2:n,1) = 1;
>> options = optimset('GradObj', 'on', 'Hessian', 'on');
>> [x, fval, exitflag, output] = fminunc('brownfgh',xstart,options);
Optimization terminated successfully:
First-order optimality less than OPTIONS.TolFun, and no negative/zero curvature detected
>> exitflag
exitflag =
1
>> fval
fval =
2.8709e-017
>> output.iterations
ans =
8
Now, let's redo the problem, but without the Hessian. The algorithm will approximate it, using sparse finite differences. Note that the gradient must be provided in large-scale problems. Start with the M-file brownfg.m:
>> i=[(1:n)’;(1:(n-1))’];
>> j=[(1:n)’;(2:n)’];
>> v0=ones(n,1);
>> v1=ones(n-1,1);
>> s=[v0;v1];
>> H=sparse(i,j,s,n,n);
>> Hstr = (H + H')/2;
>> spy(Hstr);
Optimization terminated successfully:
Relative function value changing by less than OPTIONS.TolFun
>> fval
fval =
205.9313
>> output.iterations
ans =
22
9.5 Project 2
$$f(x) = 1 + \sum_{i=1}^{n} \left[(3 - 2x_i)x_i - x_{i-1} - x_{i+1} + 1\right]^4 + \sum_{i=1}^{n/2} \left[x_i + x_{i+n/2}\right]^4,$$
10 Penalty and Barrier Methods
minimize f (x)
subject to gi (x) ≤ 0, i = 1, . . . , m.
Choose a continuous penalty function which is zero inside the feasible set
and positive outside of it. For example,
$$P(x) = \frac{1}{2} \sum_{i=1}^m \left(\max\{0, g_i(x)\}\right)^2.$$
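A MATLAB sketch of the resulting penalty method (the names and the schedule for $c$ are my choices; f and g are handles, with g returning the vector of constraint values):

function x = penalty_method(f, g, x0)
% Minimize f(x) + c*P(x) for an increasing sequence of penalty constants c.
c = 1;
x = x0;
for k = 1:10
    q = @(z) f(z) + c*0.5*sum(max(0, g(z)).^2);  % penalized objective
    x = fminsearch(q, x);   % unconstrained minimization stage
    c = 10*c;               % strengthen the penalty
end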
Proof:
Adding (12) and (13) yields
Lemma 10.2 Let $x^*$ be the solution of the original problem. Then for each $n$
$$f(x^*) \ge q(c_n, x_n^*) \ge f(x_n^*).$$
Proof:
Let $f^*$ be the optimal value associated with the problem. Then, according to Lemmas 10.1 and 10.2, the sequence of values $q(c_n, x_n^*)$ is nondecreasing and bounded by $f^*$. Thus
10.2 Barrier method
$$\min f(x)$$
$$\text{subject to } g_i(x) \le 0, \quad i = 1, \dots, m,$$
and assume that the feasible set is the closure of its interior. A barrier function is a continuous and positive function over the interior of the feasible set which goes to $\infty$ as $x$ approaches the boundary. For example,
$$B(x) = -\sum_{i=1}^m \frac{1}{g_i(x)}.$$
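A matching MATLAB sketch of the barrier method (again the names and the schedule for $c$ are my choices; the starting point must be strictly feasible):

function x = barrier_method(f, g, x0)
% Minimize f(x) + (1/c)*B(x), with B(x) = -sum(1./g(x)), for increasing c.
c = 1;
x = x0;                     % requires g(x0) < 0 componentwise
for k = 1:10
    r = @(z) f(z) - (1/c)*sum(1 ./ g(z));   % barrier objective
    x = fminsearch(r, x);
    c = 10*c;               % let the barrier term fade away
end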