Continuous Optimization
Descent: An optimization point of view on different fields

https://doi.org/10.1016/j.ejor.2006.06.008

Abstract

The aim of this paper is to present a novel, transparent approach to a well-established field: the deep methods and applications of the complete analysis of continuous optimization problems. Standard descents give a unified approach to all standard necessary conditions, including the Lagrange multiplier rule, the Karush–Kuhn–Tucker conditions and the second order conditions. Nonstandard descents lead to new necessary conditions. These can be used to give surprising proofs of deep central results of fields that are generally viewed as distinct from optimization: the fundamental theorem of algebra, the maximum and the minimum principle of complex function theory, the separation theorems for convex sets, the orthogonal diagonalization of symmetric matrices and the implicit function theorem. These optimization proofs compare favorably with the usual proofs and are all based on the same strategy. This paper is addressed to all practitioners of optimization methods from many fields who are interested in fully understanding the foundations of these methods and of the central results above.

Introduction

The idea of this paper is to develop the following geometrical view on the complete analysis of finite-dimensional minimization (and so maximization) problems with continuous variables. The name of the game of this analysis is to find, for as many admissible points P as possible, a small continuous curve of admissible points, originating in P and having everywhere outside P a lower value of the objective function than at P, to be called a descent. Such points P clearly cannot be local minima, and so need not be considered. By comparing the few remaining admissible points, one finds the global minimum (or minima). To be rigorous, one has to establish the existence of a global minimum in advance, always by means of (a variant of) the theorem of Weierstrass. This theorem states that a continuous function on a non-empty closed and bounded set in R^n attains its global minimum. A useful variant states that a continuous function that is coercive (meaning that outside a suitable closed and bounded set all values taken are higher than at some point of this set) attains its minimum. This is for example the case for a continuous function f: R^n → R for which lim_{|x| → +∞} f(x) = +∞, where |·| denotes the modulus or Euclidean norm. In the present paper we will construct standard and nonstandard descents. The former give the standard necessary conditions such as the Fermat theorem, the Lagrange multiplier rule, the Karush–Kuhn–Tucker theorem, and the second order conditions (involving the definiteness of Hessians). The latter give nonstandard necessary conditions, which can be used to prove the following famous deep results: the fundamental theorem of algebra, the maximum and the minimum principle from complex function theory, the orthogonal diagonalization of symmetric matrices, the implicit function theorem and the separation theorem for convex sets.
It might be a surprise that these theorems can be proved at all using optimization methods; in fact optimization leads to proofs that are simpler than the usual ones, and are all based on one and the same strategy: descent.
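The descent strategy can be illustrated numerically. The sketch below is an illustration of the general idea only, not a construction from the paper: for a coercive smooth function, it repeatedly follows the standard descent x_α = x − α ∇f(x); when no such descent exists the gradient vanishes, and the surviving point is the global minimum guaranteed by the Weierstrass variant.

```python
# Illustrative sketch (not from the paper): follow standard descents of a
# coercive function until no descent remains; that point is the global minimum.

def grad_f(x, y):
    # f(x, y) = (x - 1)**2 + (y + 2)**2 is coercive: f -> +inf as |(x, y)| -> +inf
    return (2.0 * (x - 1.0), 2.0 * (y + 2.0))

def descend(x, y, alpha=0.1, tol=1e-10, max_iter=10000):
    for _ in range(max_iter):
        gx, gy = grad_f(x, y)
        if gx * gx + gy * gy < tol * tol:   # no standard descent exists here
            break
        x, y = x - alpha * gx, y - alpha * gy
    return x, y

x_min, y_min = descend(5.0, 5.0)
print(round(x_min, 4), round(y_min, 4))   # close to the global minimum (1, -2)
```

The choice f(x, y) = (x − 1)² + (y + 2)² is of course hypothetical; any coercive differentiable function would do.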

Many optimization problems have been solved in the recent and distant past with great effort by some of the greatest scientists. The advantage of the descent view is its user-friendliness: it allows a wide circle of practitioners to understand fully the methods of optimization and to use these methods to solve optimization problems in a routine way. Thus the high art of great experts is turned into a basic craft.

The descent approach has a constructive, algorithmic character, and will hopefully be of help in the numerical solution of problems; for example, the descent approach to the theorem on symmetric matrices is closely related to the Jacobi methods, which are efficient algorithms for finding an orthonormal basis of eigenvectors of a symmetric matrix.
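The classical Jacobi iteration alluded to here can be sketched as follows (a standard textbook algorithm, not code from the paper): each step applies a plane rotation that annihilates the largest off-diagonal entry of a symmetric matrix, driving it toward diagonal form, whose diagonal then holds the eigenvalues.

```python
# Sketch of the classical Jacobi eigenvalue iteration for symmetric matrices
# (standard algorithm; illustrative only). Each rotation M^T A M zeroes the
# largest off-diagonal entry; the diagonal converges to the eigenvalues.
import math

def jacobi_eigenvalues(a, tol=1e-12, max_sweeps=100):
    a = [row[:] for row in a]                 # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        # locate the largest off-diagonal entry a[p][q]
        p, q, big = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(a[i][j]) > big:
                    p, q, big = i, j, abs(a[i][j])
        if big < tol:
            break
        # rotation angle that annihilates a[p][q]
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        # apply M^T A M in place: first the rows p and q, then the columns
        for k in range(n):
            apk, aqk = a[p][k], a[q][k]
            a[p][k] = c * apk - s * aqk
            a[q][k] = s * apk + c * aqk
        for k in range(n):
            akp, akq = a[k][p], a[k][q]
            a[k][p] = c * akp - s * akq
            a[k][q] = s * akp + c * akq
    return sorted(a[i][i] for i in range(n))

print(jacobi_eigenvalues([[2.0, 1.0], [1.0, 2.0]]))  # eigenvalues of [[2,1],[1,2]] are 1 and 3
```

Accumulating the rotations (rather than discarding them, as above) yields the orthonormal basis of eigenvectors mentioned in the text.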

The scope of the descent approach extends to dynamic optimization, that is, to the calculus of variations and optimal control. Moreover, it would be of interest to discover new necessary conditions for optimality by constructing new descents.

This paper is organized as follows. Section 2 explains in which precise sense the word ‘descent’ is used in this paper, and what its role is in the analysis of optimization problems. Section 3 gives a first illustration: standard descents are used to derive the first, second and higher order necessary conditions for unconstrained optimization problems. Section 4 presents nonstandard descents and applies them to give efficient proofs of central theorems from different fields: complex numbers (the fundamental theorem of algebra), complex function theory (the minimum and maximum principle), convex analysis (separation of convex sets), symmetric matrices (orthogonal diagonalization, that is, the existence of an orthonormal basis of eigenvectors), and differential calculus (the implicit function theorem and the tangent space theorem). Section 5 uses descents to give efficient proofs of the necessary conditions for optimization problems with equality and inequality constraints (the Lagrange multiplier rule and the Karush–Kuhn–Tucker conditions). Section 6 concludes.

Section snippets

Descent

In the present paper the word descent will always be used in the following precise sense. For a set A ⊆ R^n and a point x̄ ∈ A, a variation for the point x̄ in the set A is defined to be a curve x_α, α ≥ 0, in the set A with endpoint x̄ = x_0 that is continuous at α = 0. For a set A ⊆ R^n and a function f: A → R, we consider the problem to minimize f on A, to be denoted as

f(x) → min, x ∈ A,

and define a descent for an admissible point x̄ ∈ A to be a variation x_α, α ≥ 0, for x̄ inside the set of admissible points A, for which
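The snippet breaks off here, but by the introduction the defining property of a descent is that the objective takes a strictly lower value along the curve away from x̄. A concrete instance (an illustrative example, not from the paper): for f(x) = x² → min over A = R, the curve x_α = 1 − α is a descent for x̄ = 1, so x̄ = 1 cannot be a local minimum.

```python
# Illustrative example of the definition: x_alpha = 1 - alpha is a variation
# for x_bar = 1 in A = R, and it is a descent because f(x_alpha) < f(x_bar)
# for every small alpha > 0.

def f(x):
    return x * x

x_bar = 1.0
alphas = [0.5, 0.25, 0.1, 0.01, 0.001]
values = [f(x_bar - a) for a in alphas]
print(all(v < f(x_bar) for v in values))  # True: every point of the curve improves on f(x_bar)
```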

Applications of standard descents without constraints

In this section we illustrate the concept of a descent by means of the following two fundamental results of optimization theory: the first and second order necessary conditions for unconstrained problems. These results are applied in the complete analysis of all concrete unconstrained optimization problems. We emphasize that the core of these results is the construction of the following two types of descent.

  • 1. If the derivative of a function of n variables at a point x̄ is nonzero, then going in a
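This first standard descent can be checked numerically. The snippet below is an illustrative computation under the usual differentiability assumptions, not the paper's construction: it verifies that moving against a nonzero gradient lowers the function value for small α > 0.

```python
# Illustrative check of the standard descent: if grad f(x_bar) != 0, then
# x_alpha = x_bar - alpha * grad f(x_bar) is a descent for small alpha > 0.

def f(x, y):
    return x * x + 3.0 * y * y + x * y

def grad_f(x, y):
    return (2.0 * x + y, 6.0 * y + x)

x_bar, y_bar = 1.0, 1.0
gx, gy = grad_f(x_bar, y_bar)          # (3, 7): nonzero gradient at x_bar
base = f(x_bar, y_bar)
improved = all(
    f(x_bar - a * gx, y_bar - a * gy) < base
    for a in (0.1, 0.01, 0.001)
)
print(improved)  # True: the curve descends for these small alpha
```

The quadratic f above is a hypothetical example; any differentiable function with a nonzero gradient at x̄ behaves the same way.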

Nonstandard descents and central theorems

Now we will formulate and prove some well-known results. Each one of these is the central result of an entire field. The usual proofs are complicated. None of these results seems to have anything to do with optimization. However, we will give straightforward proofs using optimization methods: in each case the heart of the matter is the construction of a suitable type of nonstandard descent for a suitable auxiliary optimization problem, giving a nonstandard necessary condition for this problem.

Fundamental theorem of algebra
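The optimization proof rests on two facts stated in the introduction: |p(z)| is coercive, so it attains its global minimum, and at any point where p does not vanish a descent exists (d'Alembert's lemma), so the minimum value must be 0, i.e. a root. The toy computation below is a numerical illustration of this descent idea only, not the paper's argument: it follows crude descents of |p(z)| until a root is reached.

```python
# Toy illustration of the descent proof of the fundamental theorem of algebra:
# repeatedly take any step (from a fixed set of directions) that lowers |p(z)|,
# shrinking the step size when no improvement is found at the current scale.

def p(z):
    return z * z + 1.0          # hypothetical example polynomial; roots at +i, -i

def descend_to_root(z, step=1.0, shrink=0.5, tol=1e-8, max_iter=100000):
    dirs = [1, -1, 1j, -1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
    for _ in range(max_iter):
        if abs(p(z)) <= tol or step < 1e-14:
            break
        best = min((z + step * d for d in dirs), key=lambda w: abs(p(w)))
        if abs(p(best)) < abs(p(z)):
            z = best            # a descent step exists: take it
        else:
            step *= shrink      # no improvement at this scale: refine
    return z

root = descend_to_root(2.0 + 1.0j)
print(abs(p(root)) < 1e-6)      # True: the descent ends at a (numerical) root
```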

Applications of standard descents with constraints

In this section we present further illustrations of the construction of descents in order to establish first order necessary conditions for optimization problems with constraints. To begin with we consider the Lagrange multiplier rule—the first order conditions for equality constrained problems. The novelty of the proof is the presentation in terms of descent as well as the use of the tangent space theorem. This use leads to a more intuitive proof than the usual one, which proceeds by a
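Although the snippet breaks off here, the multiplier rule itself is easy to check on a concrete problem. The snippet below is an illustrative example, not the paper's proof: for the minimization of x + y on the unit circle, it verifies that at the minimizer the gradients of objective and constraint are parallel, with multiplier λ = −1/√2.

```python
# Illustrative check of the Lagrange multiplier rule: minimize f(x, y) = x + y
# subject to g(x, y) = x**2 + y**2 - 1 = 0. At the minimizer grad f = lam * grad g.
import math

# minimizer found analytically: x = y = -1/sqrt(2)
x = y = -1.0 / math.sqrt(2.0)

grad_f = (1.0, 1.0)
grad_g = (2.0 * x, 2.0 * y)

# parallel gradients <=> zero 2x2 determinant
det = grad_f[0] * grad_g[1] - grad_f[1] * grad_g[0]
lam = grad_f[0] / grad_g[0]

print(abs(det) < 1e-12, round(lam, 6))   # True -0.707107: gradients parallel, lam = -1/sqrt(2)
```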

Conclusion

We have demonstrated that descents can be used to give a unified optimization point of view on central theorems from fields that are usually considered as distinct from optimization. These theorems include the fundamental theorem of algebra, the maximum and minimum principle of complex function theory, the separation theorem for convex sets, the orthogonal diagonalization of symmetric matrices and the implicit function theorem. Moreover, we have seen how the descent point of view gives

