Descent: An optimization point of view on different fields
Introduction
The idea of this paper is to develop the following geometrical view on the complete analysis of finite-dimensional minimization (and hence maximization) problems with continuous variables. The name of the game of this analysis is to find, for as many admissible points P as possible, a small continuous curve of admissible points, originating in P and having everywhere outside P a lower value of the objective function than at P, to be called a descent. These points P clearly cannot be local minima, and so need not be considered. By a comparison of the few remaining admissible points, one finds the global minimum (or minima). To be rigorous, one has to establish the existence of a global minimum in advance, always by means of (a variant of) the theorem of Weierstrass. This theorem states that a continuous function on a non-empty closed and bounded set in ℝⁿ assumes its global minimum. A useful variant states that a continuous function that is coercive (this means that outside a suitable closed and bounded set all values taken are higher than at some point of this set) assumes its minimum. This is for example the case for a continuous function f for which lim_{∣x∣→+∞} f(x) = +∞ (where ∣·∣ denotes the modulus or Euclidean norm). In the present paper we will construct standard and nonstandard descents. The former give the standard necessary conditions such as the Fermat theorem, the Lagrange multiplier rule, the Karush–Kuhn–Tucker theorem, and the second order conditions (involving the definiteness of Hessians). The latter give nonstandard necessary conditions, which can be used to prove the following famous deep results: the fundamental theorem of algebra, the maximum and the minimum principle from complex function theory, the orthogonal diagonalization of symmetric matrices, the implicit function theorem and the separation theorem for convex sets.
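As a quick numeric illustration of the coercivity variant of the Weierstrass theorem (the concrete function below is our own example, not one from the paper): f(x) = x⁴ − 3x² satisfies lim_{∣x∣→+∞} f(x) = +∞, so it attains a global minimum on ℝ, which a coarse grid search can locate.

```python
# Minimal sketch (our own example function, not the paper's):
# f(x) = x**4 - 3*x**2 is coercive, so by the Weierstrass variant
# it attains a global minimum on R, namely -9/4 at x = ±sqrt(3/2).
import numpy as np

def f(x):
    return x**4 - 3 * x**2

xs = np.linspace(-3.0, 3.0, 600001)  # grid with step 1e-5
i = np.argmin(f(xs))
print(xs[i], f(xs[i]))  # near x = ±sqrt(3/2), where f = -9/4
```

The grid search is of course no substitute for the descent analysis; it merely confirms that a minimum exists and where it sits.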
It might be a surprise that these theorems can be proved at all using optimization methods; in fact optimization leads to proofs that are simpler than the usual ones, and are all based on one and the same strategy: descent.
Many optimization problems have been solved in the recent and distant past with great effort by some of the greatest scientists. The advantage of the descent view is its user-friendliness: it allows a wide circle of practitioners to understand fully the methods of optimization and to use these methods to solve optimization problems in a routine way. Thus the high art of great experts is turned into a basic craft.
The descent approach has a constructive, algorithmic character, and will hopefully be of help in the numerical solution of problems; for example, the descent proof of the theorem on symmetric matrices is closely related to the Jacobi methods, which are efficient algorithms for finding an orthonormal basis of eigenvectors of a symmetric matrix.
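To make the connection to the Jacobi methods concrete, here is a minimal sketch of the classical Jacobi eigenvalue iteration (the function names and the example matrix are ours, not the paper's). Each rotation zeroes the largest off-diagonal entry, which strictly decreases the sum of squared off-diagonal entries, so the iteration is itself a descent for that auxiliary objective.

```python
import numpy as np

def jacobi_diagonalize(A, tol=1e-10, max_iter=100):
    """Orthogonally diagonalize a symmetric matrix by Jacobi rotations.

    Each rotation annihilates the currently largest off-diagonal entry;
    the sum of squared off-diagonal entries strictly decreases -- a
    'descent' for that auxiliary objective.
    """
    A = A.copy().astype(float)
    n = A.shape[0]
    Q = np.eye(n)
    for _ in range(max_iter):
        off = np.abs(A - np.diag(np.diag(A)))
        p, q = np.unravel_index(np.argmax(off), off.shape)
        if off[p, q] < tol:
            break  # effectively diagonal
        # rotation angle that zeroes A[p, q]
        theta = 0.5 * np.arctan2(2.0 * A[p, q], A[q, q] - A[p, p])
        c, s = np.cos(theta), np.sin(theta)
        J = np.eye(n)
        J[p, p] = J[q, q] = c
        J[p, q], J[q, p] = s, -s
        A = J.T @ A @ J   # similarity transform zeroing A[p, q]
        Q = Q @ J         # accumulate the orthonormal eigenvector basis
    return np.diag(A), Q

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 1.0]])
eigs, Q = jacobi_diagonalize(A)
```

The columns of Q converge to an orthonormal basis of eigenvectors and the diagonal of the transformed matrix to the eigenvalues.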
The scope of the descent approach extends to dynamic optimization, that is, to the calculus of variations and optimal control. Moreover, it would be of interest to discover new necessary conditions for optimality by constructing new descents.
This paper is organized as follows. In Section 2 we explain the precise sense in which the word ‘descent’ will be used in this paper, and its role in the analysis of optimization problems. In Section 3 a first illustration is given: the standard descents are used to derive the first, second and higher order necessary conditions for unconstrained optimization problems. In Section 4 nonstandard descents are given and applied to give efficient proofs of central theorems from different fields: complex numbers (the fundamental theorem of algebra), complex function theory (the minimum and maximum principle), convex analysis (separation of convex sets), symmetric matrices (orthogonal diagonalization, that is, existence of an orthonormal basis of eigenvectors), differential calculus (the implicit function theorem and the tangent space theorem). In Section 5 we use descents to give efficient proofs of the necessary conditions for optimization problems with equality and inequality constraints (the Lagrange multiplier rule and the Karush–Kuhn–Tucker conditions). Section 6 concludes.
Descent
In the present paper the word descent will always be used in the following precise sense. For a set A ⊆ ℝⁿ and a point x̂ ∈ A, a variation for x̂ in the set A is defined to be a curve x_α, α ⩾ 0, in the set A with endpoint x₀ = x̂ that is continuous at α = 0. For a function f : A → ℝ, we consider the problem to minimize f on A, to be denoted as

f(x) → min, x ∈ A,

and define a descent for an admissible point x̂ to be a variation x_α, α ⩾ 0, for x̂ inside the set of admissible points A, for which

f(x_α) < f(x̂) for all α > 0.
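A concrete instance of this definition (the function and curve below are our own illustration): to minimize f(x) = x² over A = ℝ, at the admissible point x̂ = 1 the curve x_α = e^{−α} stays in A, starts at x̂, and satisfies f(x_α) < f(x̂) for every α > 0, so it is a descent and x̂ cannot be a minimum.

```python
# Our own toy example of a descent in the sense just defined.
import math

def f(x):
    return x * x

xhat = 1.0
# sample both very small and large alpha > 0
alphas = [10.0 ** (-k) for k in range(1, 8)] + [1.0, 5.0, 50.0]
ok = all(f(math.exp(-a)) < f(xhat) for a in alphas)
print(ok)  # the sampled curve values all lie below f(xhat) = 1
```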
Applications of standard descents without constraints
In this section we illustrate the concept of a descent by means of two fundamental results of optimization theory: the first and second order necessary conditions for unconstrained problems. These results are applied in the complete analysis of all concrete unconstrained optimization problems. We emphasize that the core of these results is the construction of the following two types of descent.
- 1.
If the derivative of a function of n variables at a point is nonzero, then going in the direction opposite to the gradient (a direction of steepest descent) yields a descent; in particular, such a point cannot be a local minimum.
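This first-order descent can be checked numerically (the example function below is ours): at a point with nonzero gradient, stepping against the gradient lowers the function value for small step sizes.

```python
# Our own numeric illustration of the first-order descent:
# x_alpha = xhat - alpha * grad f(xhat) decreases f for small alpha > 0.
import numpy as np

def f(x):
    return (x[0] - 1.0) ** 2 + 3.0 * x[1] ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 6.0 * x[1]])

xhat = np.array([2.0, 1.0])   # f(xhat) = 4, grad nonzero
g = grad_f(xhat)
vals = [f(xhat - a * g) for a in (1e-1, 1e-2, 1e-3)]
print(all(v < f(xhat) for v in vals))  # prints True
```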
Nonstandard descents and central theorems
Now we will formulate and prove some well-known results. Each one of these is the central result of an entire field. The usual proofs are complicated. None of these results seems to have anything to do with optimization. However, we will give straightforward proofs using optimization methods: in each case the heart of the matter is the construction of a suitable type of nonstandard descent for a suitable auxiliary optimization problem, giving a nonstandard necessary condition for this problem.

Fundamental theorem of algebra
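The descent idea behind the fundamental theorem of algebra can be sketched numerically (the polynomial and the crude search below are our own illustration, not the paper's proof): ∣p(z)∣ is coercive, so it attains a minimum; at any point where p(z) ≠ 0 a descent direction exists, so the minimum value must be 0, i.e. a root.

```python
# Our own rough sketch: a compass search that descends |p(z)| to a root.
def p(z):
    return z * z + 1.0  # roots at z = ±i

z, step = complex(1.0, 0.5), 1.0
dirs = [complex(1, 0), complex(-1, 0), complex(0, 1), complex(0, -1),
        complex(1, 1), complex(1, -1), complex(-1, 1), complex(-1, -1)]
for _ in range(200):
    best = min((z + step * d for d in dirs), key=lambda w: abs(p(w)))
    if abs(p(best)) < abs(p(z)):
        z = best          # strict descent step for |p|
    else:
        step *= 0.5       # no improving neighbor at this scale: refine
print(z, abs(p(z)))       # approaches the root i
```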
Applications of standard descents with constraints
In this section we present further illustrations of the construction of descents, in order to establish first order necessary conditions for optimization problems with constraints. To begin with, we consider the Lagrange multiplier rule, the first order conditions for equality constrained problems. The novelty of the proof is the presentation in terms of descent, as well as the use of the tangent space theorem. This use leads to a more intuitive proof than the usual one, which proceeds by an application of the implicit function theorem.
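A worked instance of the multiplier rule may help fix ideas (the concrete problem below is our own example): minimize f(x, y) = x + y subject to g(x, y) = x² + y² − 1 = 0. At the minimizer the gradients must be parallel, ∇f = λ∇g.

```python
# Our own worked example of the Lagrange multiplier rule.
import math

# candidate minimizer of f(x,y) = x + y on the unit circle
x = y = -1.0 / math.sqrt(2.0)
grad_f = (1.0, 1.0)
grad_g = (2.0 * x, 2.0 * y)
lam = grad_f[0] / grad_g[0]   # multiplier from the first component
# the second component must then match as well: grad_f = lam * grad_g
print(lam, abs(grad_f[1] - lam * grad_g[1]))
```

Here λ = −1/√2 and the second component of ∇f − λ∇g vanishes, confirming the first order condition at this point.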
Conclusion
We have demonstrated that descents can be used to give a unified optimization point of view on central theorems from fields that are usually considered distinct from optimization. These theorems include the fundamental theorem of algebra, the maximum and minimum principle of complex function theory, the separation theorem for convex sets, the orthogonal diagonalization of symmetric matrices and the implicit function theorem. Moreover, we have seen how the descent point of view gives efficient, unified proofs of the standard first and second order necessary conditions for unconstrained and constrained optimization problems.