next up previous contents
Next: Generating . Up: Unconstrained Optimization Previous: An estimation of the   Contents


The secondary Trust-Region subproblem

The material of this chapter is based on the following reference: [Pow00].
We seek an approximation to the solution $ s^*$, of the maximization problem:

$\displaystyle \max_{s \in \Re^n} \vert q(x_k+s) \vert \equiv \vert f(x_k)$ $\displaystyle + \langle g_k,s \rangle +
 \frac{1}{2}\langle s, H_k s\rangle \vert$    
subject to $\displaystyle \Vert s \Vert _2 < \Delta$    

The following maximization problem is equivalent (after a translation) and will be discussed in the chapter:

$\displaystyle \max_{s \in \Re^n} \vert q(s) \vert \equiv \vert \langle g,s \rangle$ $\displaystyle +
 \frac{1}{2}\langle s, H s\rangle \vert$ (5.1)
subject to $\displaystyle \Vert s \Vert _2 \leq \Delta$    

We will indifferently use the term, polynomial, quadratic or model. The ``trust region'' is defined by the set of all points which respect the constraint $ \Vert s \Vert _2 \leq \Delta $.
Further, the shape of the trust region allows $ s$ to be replaced by $ -s$, it's thus equivalent to consider the computation

$\displaystyle \max_{s \in \Re^n} \vert \langle g,s \rangle \vert$ $\displaystyle +
 \frac{1}{2} \vert \langle s, H s\rangle \vert$ (5.2)
subject to $\displaystyle \Vert s \Vert _2 < \Delta$    

Now, if $ \hat{s}$ and $ \tilde{s}$ are the values that maximize $ \vert
\langle g,s \rangle \vert$ and $ \vert\langle s, H s\rangle \vert$, respectively, subject to $ \Vert s \Vert _2 < \Delta$, then $ s$ may be an adequate solution of the problem 5.1, if it is the choice between $ \pm \hat{s}$ and $ \pm \tilde{s}$ that gives the largest value of the objective function of the problem. Indeed, for every feasible $ s$, including the exact solution of the present computation, we find the elementary bound
$\displaystyle \vert \langle g,s \rangle \vert + \frac{1}{2} \vert \langle s, H s\rangle \vert$ $\displaystyle \leq$ $\displaystyle \Bigg( \vert \langle g, \hat{s} \rangle \vert + \frac{1}{2} \vert...
...le \vert + \frac{1}{2} \vert \langle \tilde{s}, H
\tilde{s}\rangle \vert \Bigg)$  
  $\displaystyle \leq$ $\displaystyle 2 \max\Bigg[ \vert \langle g, \hat{s} \rangle \vert + \frac{1}{2}...
...le \vert + \frac{1}{2} \vert \langle \tilde{s}, H
\tilde{s}\rangle \vert \Bigg]$ (5.3)

It follows that the proposed choice of $ s$ gives a value of $ \vert q(s)\vert$ that is, at least, half of the optimal value. Now, $ \hat{s}$ is the vector $ \pm \rho g/\Vert g\Vert$, while $ \tilde{s}$ is an eigenvector of an eigenvalue of $ H$ of largest modulus, which would be too expensive to compute. We will now discuss how to generate $ \tilde{s}$. We will use a method inspired by the power method for obtaining large eigenvalues.
Because $ \vert \langle \tilde{s}, H \tilde{s}\rangle \vert $ is large only if $ \Vert H \tilde{s}\Vert $ is substantial, the technique begins by finding a column of $ H$, $ \omega$ say, that has the greatest Euclidean norm. Hence letting $ v_1,v_2, \ldots,v_n$ be the columns of the symmetric matrix $ H$, we deduce the bound
$\displaystyle \Vert H \omega\Vert$ $\displaystyle \geq$ $\displaystyle \Vert\omega\Vert^2 = \max_k \{ \Vert v_k\Vert: k=1,\dots
,n\} \Vert\omega\Vert$  
  $\displaystyle \geq$ $\displaystyle \Vert\omega\Vert \sqrt{ \frac{1}{n}
\sum_{k=1}^n \Vert v_k\Vert^2 }$ (5.4)
  $\displaystyle \geq$ $\displaystyle \frac{\Vert\omega\Vert }{\sqrt{n}}
\; \sigma (H)$ (5.5)

Where $ \sigma (H)$ is the spectral radius of $ H$. It may be disastrous, however to set $ \tilde{s}$ to a multiple of $ \omega$, because $ \langle \omega, H \omega \rangle$ is zero in the case:

$\displaystyle H=\begin{pmatrix}
 1 & 1 & 1 & 1 \\
 1 & -1 & -2/3 & -2/3 \\
 1 & -2/3 & -1 & -2/3 \\
 1 & -2/3 & -2/3 & -1 \\
 \end{pmatrix}$ (5.6)

Therefore, the algorithm picks $ \tilde{s}$ from the two dimensional linear subspace of $ \Re^n$ that is spanned by $ \omega$ and $ H \omega$. Specifically, $ \tilde{s}$ has the form $ \alpha \omega + \beta H \omega$, where the ratio $ \alpha / \beta$ is computed to maximize the expression

$\displaystyle \frac {\vert \langle \alpha \omega + \beta H \omega, H (\alpha
 \...
... + \beta H \omega )\rangle \vert}{ \Vert\alpha \omega + \beta H
 \omega\Vert^2}$ (5.7)

which determines the direction of $ \tilde{s}$. Then the length of $ \tilde{s}$ is defined by $ \Vert\tilde{s}\Vert= \rho$, the sign of $ \tilde{s}$, being unimportant.

Subsections
next up previous contents
Next: Generating . Up: Unconstrained Optimization Previous: An estimation of the   Contents
Frank Vanden Berghen 2004-04-19