Rosenbrock Method
The Rosenbrock method is a 0th-order search algorithm (meaning that it does not
require any derivatives of the target function; only simple evaluations of the
objective function are used). Yet it approximates a gradient search, thus combining
the advantages of 0th-order and 1st-order strategies. It was published by Rosenbrock (1)
in 1960.
This method is particularly well suited when the objective function does not
require a great deal of computing power. In such a case it is pointless to use
a very complicated optimization algorithm: the time saved on function evaluations
is lost again in the optimization calculations themselves, whereas a few extra
evaluations of a cheap objective function lead, in the end, to a shorter overall
calculation time. If the objective function takes a long time to evaluate (more
than a few seconds), you should use a more complex algorithm.
In the first iteration, it is a simple 0th-order search along the base vectors
of an n-dimensional coordinate system (in the figure above, n=2). In the case of
a success, i.e. an attempt yielding a new minimum value of the target function,
the step width is increased, while in the case of a failure it is decreased and
the opposite direction is tried next (see points 1 to 16 in the figure above).
Once a success has been found and exploited in each base direction, the coordinate
system is rotated so that the first base vector points in the direction of the
gradient (points 13, 16 and 17 define the new base in the figure above). All step
widths are then re-initialized and the process is repeated in the rotated
coordinate system (points 16 to 23).
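To make the success/failure rule concrete, here is a minimal C++ sketch of one such exploration stage. The placeholder objective, the variable names and the simplified stopping test are my own illustrative choices, not taken from the downloadable implementation; only the expansion factor 3 and the shrink-and-reverse factor -0.5 follow the classical description.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Placeholder objective: any cheap-to-evaluate function can be plugged in here.
static double f(const std::vector<double>& x)
{
    return (x[0] - 1.0) * (x[0] - 1.0) + 10.0 * (x[1] + 2.0) * (x[1] + 2.0);
}

// One exploration stage of the Rosenbrock search: try each base direction in
// turn, expanding the step after a success (a new minimum is found) and
// shrinking and reversing it after a failure, until every direction has seen
// at least one success and one failure. Basis rotation happens between stages.
static void rosenbrockStage(std::vector<double>& x,
                            const std::vector<std::vector<double>>& base,
                            std::vector<double>& step)
{
    const double expand = 3.0;      // step growth factor after a success
    const double contract = -0.5;   // shrink and reverse after a failure
    const std::size_t n = x.size();
    double fBest = f(x);
    std::vector<bool> hadSuccess(n, false), hadFailure(n, false);

    for (int guard = 0; guard < 1000; ++guard) {       // safety cap for the sketch
        for (std::size_t i = 0; i < n; ++i) {
            std::vector<double> trial = x;
            for (std::size_t j = 0; j < n; ++j)
                trial[j] += step[i] * base[i][j];       // move along base vector i
            const double fTrial = f(trial);
            if (fTrial < fBest) {                       // success: accept point, grow step
                x = trial;
                fBest = fTrial;
                step[i] *= expand;
                hadSuccess[i] = true;
            } else {                                    // failure: shrink and reverse step
                step[i] *= contract;
                hadFailure[i] = true;
            }
        }
        bool stageDone = true;
        for (std::size_t i = 0; i < n; ++i)
            if (!hadSuccess[i] || !hadFailure[i]) stageDone = false;
        if (stageDone) break;                           // every direction exploited
    }
}

int main()
{
    std::vector<double> x = { 0.0, 0.0 };                        // starting point
    std::vector<std::vector<double>> base = { { 1.0, 0.0 },      // initial axes
                                              { 0.0, 1.0 } };
    std::vector<double> step = { 0.1, 0.1 };                     // initial step widths
    rosenbrockStage(x, base, step);
    std::printf("after one stage: x = (%f, %f), f = %f\n", x[0], x[1], f(x));
    return 0;
}
```

Between two such stages, the net displacements accumulated along each direction are combined and orthogonalized to form the rotated coordinate system (the Gram-Schmidt step discussed further below).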
Nelder-Mead methods (also called non-linear simplex) are very similar to the
Rosenbrock algorithm. However, they should be avoided because there is no proof
of convergence for these algorithms. In fact, Nelder-Mead methods can fail to
find a local optimum of a very simple optimization problem based on the minimization
of a polynomial (4). This is due to degeneracy inside the simplex. Even when no
degeneracy occurs, the convergence speed can become arbitrarily slow. The Rosenbrock
algorithm does not suffer from these problems, and it has been proved to always
converge (4) (global convergence to a local optimum is assured). Please stop using
Nelder-Mead methods! They are slow and do not always converge.
The creation of a new rotated coordinate system is usually done using a Gram-Schmidt
orthogonalization procedure. In extreme cases, this procedure becomes numerically
unstable, and this instability can lead to a premature ending of the optimization
algorithm. The Gram-Schmidt orthogonalization procedure also becomes very
time consuming when the dimension n of the search space increases (its complexity
is n^3). The newer orthogonalization procedure of J. R. Palmer (3) solves both issues.
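For reference, a minimal C++ sketch of the classical Gram-Schmidt step is given below. The input rows are assumed to be the displacement vectors accumulated during the last stage; Palmer's faster O(n^2) recurrence from (3) is not reproduced here.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Classical Gram-Schmidt: orthonormalize the rows of 'a' in place.
// In the Rosenbrock method, row i would be the sum of all moves made along
// base directions i..n-1 during the last stage; after the orthonormalization
// the first row points roughly along the estimated gradient direction.
// The cost is O(n^3); Palmer's recurrence builds the same basis in O(n^2)
// and avoids the numerical trouble flagged below.
static void gramSchmidt(std::vector<std::vector<double>>& a)
{
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Subtract the components along the already-built directions.
        for (std::size_t k = 0; k < i; ++k) {
            double dot = 0.0;
            for (std::size_t j = 0; j < n; ++j) dot += a[i][j] * a[k][j];
            for (std::size_t j = 0; j < n; ++j) a[i][j] -= dot * a[k][j];
        }
        // Normalize; a near-zero norm is the degenerate case in which the
        // classical procedure becomes numerically unstable.
        double norm = 0.0;
        for (std::size_t j = 0; j < n; ++j) norm += a[i][j] * a[i][j];
        norm = std::sqrt(norm);
        if (norm > 1e-12)
            for (std::size_t j = 0; j < n; ++j) a[i][j] /= norm;
    }
}

int main()
{
    // Two accumulated displacement vectors from a hypothetical 2-D stage.
    std::vector<std::vector<double>> a = { { 3.0, 1.0 }, { 1.0, 0.0 } };
    gramSchmidt(a);
    std::printf("new base: (%f, %f) and (%f, %f)\n",
                a[0][0], a[0][1], a[1][0], a[1][1]);
    return 0;
}
```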
Initializing the step widths to rather large values enables the strategy to leave
local optima and to continue the search towards more global minima. If you set the
initial step width properly, you will almost always find the global optimum. If you
want to be certain of always finding the global optimum, you must use a global-search
optimization method, for example:
- one parameter to optimize: the STEP method (5)
- more than one parameter to optimize: a "grid-search" method like DIRECT (6)
or MCS (7). Please don't use any heuristic or stochastic algorithm: they are
slow and imprecise.
These global-search methods (STEP, DIRECT and MCS) assume that the objective
function is "fast" to evaluate. If that is not the case, then you should use CONDOR.
Inside CONDOR, if you set the initial step width (the "rho_start" parameter)
properly, you will almost always find the global optimum.
It has turned out that the Rosenbrock approach is more stable than many sophisticated
algorithms and requires far fewer evaluations of the target function than
higher-order strategies (2).
Finally, a user who is not an optimization expert has a real chance of understanding
it and of setting and tuning its parameters properly.
Bibliography
(1) "An
automatic Method for findong the greatest or least Value of a Function",
Rosenbrock, H. H., Comp. J., 3, pp 175-184, (1960). The paper is mirrored here.
(2) Schwefel,
H.-P., Num. Opt. v. Comp.-modellen m. d. Evol.-strat., Basel, Birkhäuser, 1977.
(3) "An
improved procedure for orthogonalising the search vectors in Rosenbrock's and
Swann's direct search optimization methods", J.R.Palmer, The Computer
Journal, Volume 12, Issue 1, pp. 69-71. The paper is mirrored here.
(4) "Nonlinear
Programming: Theory and Algorithms, 2nd Edition", Mokhtar S. Bazaraa,
Hanif D. Sherali, C. M. Shetty.
(5) "STEP: The Easiest Way to Optimize a Function",
Stephan Langerman, Grégory Seront, Hugues Bersini.
(6) "Direct
Optimization Algortihm", Daniel E. Finkel. A local copy of a paper
describing the method is here.
(7) "Gloabl
optimization by multilvel Coordinate Search", Waltraud Huyer, Arnold
Neumaier. A local copy of a paper describing the method is here.
Links & Download
The original paper by Rosenbrock is available online here. It is mirrored here.
The paper describing the improvement of J. R. Palmer is available here. The paper
is mirrored here.
Some "bonus" material: a discussion
about convergence speed of the algorithm.
The code of my implementation of the Rosenbrock algorithm is available here.
The code of the optimizer is standard C and doesn't use any special libraries.
It can be compiled under Windows or Unix without any problem. The code has been
highly optimized to be as fast as possible (with extensive use of the memcpy function,
special fast matrix manipulation, and so on). The improvement of J. R. Palmer
is used; it allows still greater speed (the complexity of Palmer's orthogonalization
procedure is only n^2). The whole algorithm is only 107 lines long (with proper
indentation). It is written in purely structured programming style (there is no
goto instruction), so it is very easy to understand and customize.
A small example of use (testR1.cpp) is available. In this example, the standard
Rosenbrock banana function is minimized.
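For completeness, the Rosenbrock banana function minimized in testR1.cpp is f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, whose global minimum f(1, 1) = 0 lies at the end of a narrow curved valley. A small stand-alone C++ version of this test function is sketched below; the call to the downloaded optimizer itself is omitted, since its exact interface is not reproduced here.

```cpp
#include <cstdio>

// The classical Rosenbrock "banana" function:
// f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, global minimum f(1, 1) = 0.
// Its narrow curved valley makes it a standard test case for optimizers.
static double banana(double x, double y)
{
    const double a = 1.0 - x;
    const double b = y - x * x;
    return a * a + 100.0 * b * b;
}

int main()
{
    std::printf("f(-1.2, 1.0) = %f\n", banana(-1.2, 1.0));  // usual starting point
    std::printf("f( 1.0, 1.0) = %f\n", banana(1.0, 1.0));   // global minimum: 0
    return 0;
}
```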