Comparison of Lasserre's measure--based bounds for polynomial optimization to bounds obtained by simulated annealing
Etienne de Klerk, Monique Laurent

TL;DR
This paper compares Lasserre's measure-based upper bounds for polynomial optimization with bounds obtained from simulated annealing. The comparison shows that, for polynomials over convex bodies, the Lasserre hierarchy converges at a faster rate than was previously known.
Contribution
The paper relates Lasserre's measure-based bounds to simulated annealing bounds and uses this link to show that, for polynomial functions over convex bodies, the Lasserre hierarchy converges at a faster rate than previously established.
Findings
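Lasserre's bounds are linked to simulated annealing bounds through a relation between the degree and temperature parameters.
A faster convergence rate than previously known is established for the Lasserre hierarchy when minimizing polynomials over convex bodies.
Numerical examples suggest that the established relation between the two types of bounds is far from tight.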
Abstract
We consider the problem of minimizing a continuous function $f$ over a compact set $K \subset \mathbb{R}^n$. We compare the hierarchy of upper bounds proposed by Lasserre in [SIAM J. Optim.] to bounds that may be obtained from simulated annealing. We show that, when $f$ is a polynomial and $K$ a convex body, this comparison yields a faster rate of convergence of the Lasserre hierarchy than what was previously known in the literature.

Table 1. Polynomial test functions.

| Name | Convex? |
|---|---|
| Booth function | yes |
| Matyas function | yes |
| Motzkin polynomial | no |
| Three-Hump Camel function | no |
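
Table 2. Lasserre upper bounds and simulated annealing (SA) upper bounds for the test functions of Table 1.

| Booth: Lasserre | Booth: SA | Matyas: Lasserre | Matyas: SA | Three-Hump Camel: Lasserre | Three-Hump Camel: SA | Motzkin: Lasserre | Motzkin: SA | |
|---|---|---|---|---|---|---|---|---|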
| 118.383 | 367.834 | 4.2817 | 15.4212 | 29.0005 | 247.462 | 1.0614 | 4.0250 | |
| 97.6473 | 356.113 | 3.8942 | 14.8521 | 9.5806 | 241.700 | 0.8294 | 3.9697 | |
| 69.8174 | 345.043 | 3.6894 | 14.3143 | 9.5806 | 236.102 | 0.8010 | 3.9157 | |
| 63.5454 | 334.585 | 2.9956 | 13.8062 | 4.4398 | 230.663 | 0.8010 | 3.8631 | |
| 47.0467 | 324.701 | 2.5469 | 13.3262 | 4.4398 | 225.381 | 0.7088 | 3.8118 | |
| 41.6727 | 315.354 | 2.0430 | 12.8726 | 2.5503 | 220.251 | 0.5655 | 3.7618 | |
| 34.2140 | 306.510 | 1.8335 | 12.4441 | 2.5503 | 215.269 | 0.5655 | 3.7130 | |
| 28.7248 | 298.138 | 1.4784 | 12.0390 | 1.7127 | 210.431 | 0.5078 | 3.6654 | |
| 25.6050 | 290.206 | 1.3764 | 11.6560 | 1.7127 | 205.734 | 0.4060 | 3.6190 | |
| 21.1869 | 282.687 | 1.1178 | 11.2938 | 1.2775 | 201.173 | 0.4060 | 3.5737 | |
| 19.5588 | 275.554 | 1.0686 | 10.9511 | 1.2775 | 196.745 | 0.3759 | 3.5296 | |
| 16.5854 | 268.782 | 0.8742 | 10.6267 | 1.0185 | 192.446 | 0.3004 | 3.4865 | |
| 15.2815 | 262.348 | 0.8524 | 10.3195 | 1.0185 | 188.272 | 0.3004 | 3.4444 | |
| 13.4626 | 256.230 | 0.7020 | 10.0284 | 0.8434 | 184.220 | 0.2819 | 3.4034 | |
| 12.2075 | 250.408 | 0.6952 | 9.75250 | 0.8434 | 180.287 | 0.2300 | 3.3633 | |
| 11.0959 | 244.863 | 0.5760 | 9.49071 | 0.7113 | 176.469 | 0.2300 | 3.3242 | |
| 9.9938 | 239.577 | 0.5760 | 9.24220 | 0.7113 | 172.762 | 0.2185 | 3.2860 | |
| 9.2373 | 234.534 | 0.4815 | 9.00615 | 0.6064 | 169.164 | 0.1817 | 3.2487 | |
Comparison of Lasserre’s measure–based bounds for polynomial optimization to bounds obtained by simulated annealing
Etienne de Klerk, Tilburg University and Delft University of Technology
Monique Laurent, Centrum Wiskunde & Informatica (CWI), Amsterdam and Tilburg University
Abstract
We consider the problem of minimizing a continuous function $f$ over a compact set $K \subset \mathbb{R}^n$. We compare the hierarchy of upper bounds proposed by Lasserre in [SIAM J. Optim.] to bounds that may be obtained from simulated annealing.
We show that, when $f$ is a polynomial and $K$ a convex body, this comparison yields a faster rate of convergence of the Lasserre hierarchy than what was previously known in the literature.
Keywords: Polynomial optimization; Semidefinite optimization; Lasserre hierarchy; simulated annealing
AMS classification: 90C22; 90C26; 90C30
1 Introduction
We consider the problem of minimizing a continuous function $f:\mathbb{R}^n \to \mathbb{R}$ over a compact set $K \subseteq \mathbb{R}^n$. That is, we consider the problem of computing the parameter:
$$f_{\min,K} := \min_{x \in K} f(x).$$
Our goal is to compare two convergent hierarchies of upper bounds on $f_{\min,K}$, namely measure-based bounds introduced by Lasserre [10], and simulated annealing bounds, as studied by Kalai and Vempala [6]. The bounds of Lasserre are obtained by minimizing over measures on $K$ with sum-of-squares polynomial density functions of growing degree, while simulated annealing bounds use Boltzmann distributions on $K$ with decreasing temperature parameters.
In this note we establish a relationship between these two approaches, linking the degree and temperature parameters in the two bounds (see Theorem 4.1 for a precise statement). As an application, when $f$ is a polynomial and $K$ is a convex body, we can show a faster convergence rate for the measure-based bounds of Lasserre. The new convergence rate is in $O(1/r)$ (see Corollary 4.3), where $r$ is the degree of the sum-of-squares polynomial density function, while the dependence was in $O(1/\sqrt{r})$ in the previously best known result from [4].
Polynomial optimization has been a very active research area in recent years, since the seminal works of Lasserre [8] and Parrilo [13] (see also, e.g., the book [9] and the survey [11]). In particular, hierarchies of (lower and upper) bounds for the parameter $f_{\min,K}$ have been proposed, based on sum-of-squares polynomials and semidefinite programming.
For a general compact set $K$, upper bounds for $f_{\min,K}$ have been introduced by Lasserre [10], obtained by searching for a sum-of-squares polynomial density function of given maximum degree $r$, so as to minimize the integral of $f$ with respect to the corresponding probability measure on $K$. When $f$ is Lipschitz continuous and under some mild assumption on $K$ (which holds, e.g., when $K$ is a convex body), estimates for the convergence rate of these bounds have been proved in [4] that are of order $O(1/\sqrt{r})$. Improved rates have subsequently been shown when restricting to special sets $K$, namely the hypercube $[0,1]^n$ or $[-1,1]^n$. In [3] the authors show a hierarchy of upper bounds using the Beta distribution, with the same convergence rate in $O(1/\sqrt{r})$, but whose computation needs only elementary operations; moreover an improved convergence in $O(1/r)$ can be shown, e.g., when $f$ is quadratic. In addition, a convergence rate in $O(1/r^2)$ is shown in [2], using distributions based on Jackson kernels and a larger class of sum-of-squares density functions.
In this paper we investigate the hierarchy of measure-based upper bounds of [10] and show that when $K$ is a convex body, convexity can be exploited to show an improved convergence rate in $O(1/r)$, even for nonconvex functions. The key ingredient for this is to establish a relationship with upper bounds based on simulated annealing and to use a known convergence rate result from [6] for simulated annealing bounds in the convex case.
Simulated annealing was introduced by Kirkpatrick et al. [7] as a randomized search procedure for general optimization problems. It has enjoyed renewed interest for convex optimization problems since it was shown by Kalai and Vempala [6] that a polynomial-time implementation is possible. This requires so-called hit-and-run sampling from $K$, as introduced by Smith [14], that was shown to be a polynomial-time procedure by Lovász [12]. Most recently, Abernethy and Hazan [1] showed formal equivalence with a certain interior point method for convex optimization.
This unexpected equivalence between seemingly different methods has motivated the present work, which relates the bounds by Lasserre [10] to the simulated annealing bounds as well.
In what follows, we first introduce the measure-based upper bounds of Lasserre [10]. Then we recall the bounds based on simulated annealing and the known convergence results for a linear objective function $f$, and we give an explicit proof of their extension to the case of a general convex function $f$. After that we state our main result and the next section is devoted to its proof. In the last section we conclude with numerical examples showing the quality of the two types of bounds and some final remarks.
2 Lasserre’s hierarchy of upper bounds
Throughout, is the set of polynomials in variables with real coefficients and, for an integer , is the set of polynomials with degree at most . Any polynomial can be written , where we set for and . We let denote the set of sums of squares of polynomials, and consists of all sums of squares of polynomials with degree at most .
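We recall the following reformulation for $f_{\min,K}$, established by Lasserre [10]:
$$f_{\min,K} = \inf_{h \in \Sigma[x]} \int_K h(x) f(x)\,dx \quad \text{s.t.} \quad \int_K h(x)\,dx = 1.$$
By bounding the degree of the polynomial $h$, we can define the parameter:
$$\underline{f}^{(r)}_K := \inf_{h \in \Sigma[x]_r} \int_K h(x) f(x)\,dx \quad \text{s.t.} \quad \int_K h(x)\,dx = 1. \tag{1}$$
Clearly, the inequality $f_{\min,K} \le \underline{f}^{(r)}_K$ holds for all $r$. Lasserre [10] gave conditions under which the infimum is attained in the program (1). De Klerk, Laurent and Sun [4, Theorem 3] established the following rate of convergence for the bounds $\underline{f}^{(r)}_K$.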
Theorem 2.1 (De Klerk, Laurent, and Sun [4]).
Let $f \in \mathbb{R}[x]$ and $K$ a convex body. There exist constants $C_{f,K}$ (depending only on $f$ and $K$) and $r_K$ (depending only on $K$) such that
$$\underline{f}^{(r)}_K - f_{\min,K} \le \frac{C_{f,K}}{\sqrt{r}} \quad \text{for all } r \ge r_K.$$
That is, the following asymptotic convergence rate holds: $\underline{f}^{(r)}_K - f_{\min,K} = O(1/\sqrt{r})$.
This result of [4] holds in fact under more general assumptions, namely when $f$ is Lipschitz continuous and $K$ satisfies a technical assumption (Assumption 1 in [4]), which says (roughly) that around any point in $K$ there is a ball whose intersection with $K$ is at least a constant fraction of the unit ball.
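As explained in [10] the parameter $\underline{f}^{(r)}_K$ from (1) can be computed using semidefinite programming, assuming one knows the moments $m_\alpha(K)$ of the Lebesgue measure on $K$, where
$$m_\alpha(K) := \int_K x^\alpha\,dx \quad \text{for } \alpha \in \mathbb{N}^n.$$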
Indeed suppose has degree . Writing as , the parameter from (1) can be reformulated as follows:
[TABLE]
Since the sum-of-squares condition on the density may be written as a linear matrix inequality, this is a semidefinite program. In fact, since it only has one linear equality constraint, it may even be rewritten as a generalised eigenvalue problem. In particular, the parameter from (1) is equal to the smallest generalised eigenvalue of the system:
[TABLE]
where the symmetric matrices and are of order with rows and columns indexed by , and
[TABLE]
For more details, see [10, 4, 3].
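To illustrate the generalised eigenvalue formulation, the following Python sketch computes the bound in a very simple setting: a univariate polynomial on $K = [-1,1]$, where the moments of the Lebesgue measure are known in closed form. It is a minimal sketch under stated assumptions, not the implementation behind the numerical results: the density is taken to be a quadratic form in the monomials $1, x, \dots, x^r$ (so it has degree at most $2r$), the matrices are built as $A_{ij} = \int_K f(x)\,x^{i+j}\,dx$ and $B_{ij} = \int_K x^{i+j}\,dx$, and the smallest generalised eigenvalue is found by shifted inverse iteration in pure Python. All function names are hypothetical.

```python
from math import sqrt

def moment(k):
    """k-th moment of the Lebesgue measure on [-1, 1]: integral of x**k."""
    return 2.0 / (k + 1) if k % 2 == 0 else 0.0

def build_matrices(coeffs, r):
    """A[i][j] = int f(x) x^(i+j) dx and B[i][j] = int x^(i+j) dx over [-1, 1].

    coeffs[k] is the coefficient of x**k in f.
    """
    size = r + 1
    A = [[sum(c * moment(i + j + k) for k, c in enumerate(coeffs))
          for j in range(size)] for i in range(size)]
    B = [[moment(i + j) for j in range(size)] for i in range(size)]
    return A, B

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def lasserre_upper_bound(coeffs, r, shift, iters=2000):
    """Smallest generalised eigenvalue of A v = lambda B v.

    Uses inverse iteration on (A - shift*B). The shift is assumed to lie
    strictly below min f on [-1, 1], which makes A - shift*B positive definite.
    """
    A, B = build_matrices(coeffs, r)
    S = [[a - shift * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    v = [1.0] * (r + 1)
    for _ in range(iters):
        y = solve(S, matvec(B, v))
        scale = sqrt(dot(y, matvec(B, y)))
        v = [yi / scale for yi in y]
    # Rayleigh quotient of the converged vector
    return dot(v, matvec(A, v)) / dot(v, matvec(B, v))

# Hypothetical test instance: f(x) = x**2 on K = [-1, 1], with minimum 0.
f_coeffs = [0.0, 0.0, 1.0]
bounds = [lasserre_upper_bound(f_coeffs, r, shift=-1.0) for r in (1, 2, 3)]
for r, value in zip((1, 2, 3), bounds):
    print(r, value)
```

For this instance the computed values are upper bounds on the minimum $0$ and do not increase with $r$; consecutive values may coincide when the additional monomial does not help, which is also visible in the repeated entries of Table 2.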
3 Bounds from simulated annealing
Given a continuous function $f$, consider the associated Boltzmann distribution over the set $K$, defined by the density function:
$$P_f(x) := \frac{\exp(-f(x)/T)}{\int_K \exp(-f(x')/T)\,dx'}.$$
Write $X \sim P_f$ if the random variable $X$ takes values in $K$ according to the Boltzmann distribution.
The idea of simulated annealing is to sample $X \sim P_f$ where $T > 0$ is a fixed ‘temperature’ parameter, that is subsequently decreased. Clearly, for any $T > 0$, we have
$$\mathbb{E}_{X \sim P_f}[f(X)] \ge f_{\min,K}.$$
The point is that, under mild assumptions, these bounds converge to the minimum of $f$ over $K$ (see, e.g., [15]):
$$\lim_{T \downarrow 0} \mathbb{E}_{X \sim P_f}[f(X)] = f_{\min,K}.$$
The key step in the practical utilization of these bounds is therefore to perform the sampling of $X \sim P_f$.
Example 3.1.
Consider the minimization of the Motzkin polynomial
$$f(x_1, x_2) = x_1^4 x_2^2 + x_1^2 x_2^4 - 3 x_1^2 x_2^2 + 1$$
over $K = [-2,2]^2$, where there are four global minimizers at the points $(\pm 1, \pm 1)$, and $f_{\min,K} = 0$. Figure 1 shows the corresponding Boltzmann density function for . Note that this density has four modes, roughly positioned at the four global minimizers of $f$ in $K$. The corresponding upper bound on $f_{\min,K}$ is ().
To obtain a better upper bound on from the Lasserre hierarchy, one needs to use a degree s.o.s. polynomial density; in particular, one has (degree ) and (degree ). More detailed numerical results are given in Section 5.
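The simulated annealing bound of this example can be approximated by direct numerical integration. The sketch below is only an illustration under our own assumptions: it uses the standard Motzkin polynomial, takes the box $[-2,2]^2$ as the feasible set, replaces the integrals by a midpoint rule on a uniform grid, and evaluates a few arbitrary temperatures. The function names and the grid size are hypothetical choices.

```python
from math import exp

def motzkin(x, y):
    """Motzkin polynomial; nonnegative, with value 0 at (+-1, +-1)."""
    return x**4 * y**2 + x**2 * y**4 - 3 * x**2 * y**2 + 1

def boltzmann_bound(f, lo, hi, T, m=200):
    """Midpoint-rule estimate of E[f(X)], X with density ~ exp(-f/T) on [lo, hi]^2."""
    h = (hi - lo) / m
    pts = [lo + (i + 0.5) * h for i in range(m)]
    num = 0.0
    den = 0.0
    for x in pts:
        for y in pts:
            fx = f(x, y)
            w = exp(-fx / T)   # unnormalised Boltzmann weight
            num += fx * w
            den += w
    return num / den

temperatures = (5.0, 1.0, 0.5, 0.1)   # arbitrary illustrative values
bounds = {T: boltzmann_bound(motzkin, -2.0, 2.0, T) for T in temperatures}
for T in temperatures:
    print(T, bounds[T])
```

Each printed value is an upper bound on the minimum $0$, and the values shrink as the temperature is lowered, in line with the convergence statement above.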
When $f$ is linear and $K$ a convex body, Kalai and Vempala [6, Lemma 4.1] show that the rate of convergence of the bounds in (5) is linear in the temperature $T$.
Theorem 3.2 (Kalai and Vempala [6]).
Let $f(x) = c^T x$ where $c$ is a unit vector, and let $K \subset \mathbb{R}^n$ be a convex body. Then, for any $T > 0$, we have
$$\mathbb{E}_{X \sim P_f}[f(X)] - \min_{x \in K} f(x) \le nT.$$
We indicate how to extend the result of Kalai and Vempala in Theorem 3.2 to the case of an arbitrary convex function $f$. This more general result is hinted at in §6 of [6], where the authors write
“… a statement analogous to [Theorem 2] holds also for general convex functions …”
but no precise statement is given there. In any event, as we will now show, the more general result may readily be derived from Theorem 3.2 (in fact, from the special case of a linear coordinate function $f(x) = x_i$ for some $i$).
Corollary 3.3.
Let $f$ be a convex function and let $K \subset \mathbb{R}^n$ be a convex body. Then, for any $T > 0$, we have
$$\mathbb{E}_{X \sim P_f}[f(X)] - \min_{x \in K} f(x) \le nT.$$
Proof.
Set
[TABLE]
Then we have
[TABLE]
Define the set
[TABLE]
Then is a convex body and we have
[TABLE]
Accordingly, define the parameter
[TABLE]
Corollary 3.3 will follow if we show that
[TABLE]
To this end set and , where we define
[TABLE]
[TABLE]
We work out the parameters and (using integration by parts):
[TABLE]
[TABLE]
Then, using the fact that , we obtain:
[TABLE]
which proves relation (6).
We can now derive the result of Corollary 3.3. Indeed, using Theorem 3.2 applied to and the linear function , we get
[TABLE]
∎
The bound in the corollary is tight asymptotically, as the following example shows.
Example 3.4.
Consider the univariate problem . Thus, in this case, , and . For given temperature , we have
[TABLE]
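The tightness of the bound $nT$ can be checked in closed form on a univariate linear instance. The sketch below assumes $f(x) = x$ on $K = [0,1]$ (so $n = 1$ and the minimum is $0$); this instance is chosen here for illustration. Integrating by parts gives $\mathbb{E}[X] = T - 1/(e^{1/T} - 1)$ for the density proportional to $e^{-x/T}$, and the code evaluates the ratio of this gap to $T$. The function name is hypothetical.

```python
from math import expm1

def sa_gap_linear(T):
    """E[X] - min x for X on [0, 1] with density proportional to exp(-x/T)."""
    return T - 1.0 / expm1(1.0 / T)

for T in (1.0, 0.1, 0.01):
    gap = sa_gap_linear(T)
    print(f"T={T}: gap={gap:.6f}, gap/T={gap / T:.6f}")
```

The gap never exceeds $T$, and the ratio approaches $1$ as $T$ decreases, so the linear dependence on the temperature cannot be improved in general.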
4 Main results
We will prove the following relationship between the sum-of-squares based upper bound (1) of Lasserre and the bound (5) based on simulated annealing.
Theorem 4.1.
Let be a polynomial of degree , let be a compact set and set Then we have
[TABLE]
For the problem of minimizing a convex polynomial function over a convex body, we obtain the following improved convergence rate for the sum-of-squares based bounds of Lasserre.
Corollary 4.2.
Let $f$ be a convex polynomial of degree $d$ and let $K$ be a convex body. Then for any integer $r \ge 1$ one has
[TABLE]
for some constant that does not depend on . (For instance, .)
Proof.
Let and set . Combining Theorems 3.2 and 4.1, we get
[TABLE]
∎
For convex polynomials $f$, this improves on the known result from Theorem 2.1. One may in fact use the last corollary to obtain the same rate of convergence in terms of $r$ for all polynomials, without the convexity assumption, as we will now show.
Corollary 4.3.
If $f$ is a polynomial and $K$ a convex body, then there is a constant $c > 0$ depending on $f$ and $K$ only, so that
[TABLE]
A suitable value for is
[TABLE]
where and .
Proof.
We first define a convex quadratic function that upper bounds $f$ on $K$ as follows:
[TABLE]
where , and is the minimizer of on . Note that for all by Taylor’s theorem, and .
By definition of the Lasserre hierarchy,
[TABLE]
Invoking Corollary 4.2 and using that the degree of is , we obtain:
[TABLE]
where .∎
The last result improves on the known rate in Theorem 2.1.
Proof of Theorem 4.1
The key idea in the proof of Theorem 4.1 is to replace the Boltzmann density function by a polynomial approximation.
To this end, we first recall a basic result on approximating the exponential function by its truncated Taylor series.
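Lemma 4.4 (De Klerk, Laurent and Sun [4]).
Let $\phi_{2r}(t)$ denote the (univariate) polynomial of degree $2r$ obtained by truncating the Taylor series expansion of $e^{-t}$ at the order $2r$. That is,
$$\phi_{2r}(t) := \sum_{k=0}^{2r} \frac{(-t)^k}{k!}.$$
Then $\phi_{2r}$ is a sum of squares of polynomials. Moreover, we have
$$0 \le \phi_{2r}(t) - e^{-t} \le \frac{t^{2r+1}}{(2r+1)!} \quad \text{for all } t \ge 0. \tag{7}$$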
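The two properties used from this lemma can be sanity-checked numerically. The sketch below assumes the truncation $\phi_{2r}(t) = \sum_{k=0}^{2r} (-t)^k/k!$ and the standard Taylor remainder estimate $0 \le \phi_{2r}(t) - e^{-t} \le t^{2r+1}/(2r+1)!$ for $t \ge 0$; it evaluates both sides on a few sample points. This is a numerical illustration only, not a proof, and the helper names are hypothetical.

```python
from math import exp, factorial

def phi(t, order):
    """Taylor polynomial of exp(-t) around 0, truncated at the given order."""
    return sum((-t) ** k / factorial(k) for k in range(order + 1))

def check(r, t):
    """Return (phi_{2r}(t) - exp(-t), t^(2r+1)/(2r+1)!) for t >= 0."""
    gap = phi(t, 2 * r) - exp(-t)
    remainder_bound = t ** (2 * r + 1) / factorial(2 * r + 1)
    return gap, remainder_bound

for r in (1, 2, 4):
    for t in (0.5, 1.0, 3.0):
        gap, bound = check(r, t)
        print(r, t, gap, bound)
```

In each printed row the gap is nonnegative and below the remainder bound; an even truncation order is essential here, since odd-order truncations take negative values for large $t$.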
We now define the following approximation of the Boltzmann density:
[TABLE]
By construction, is a sum-of-squares polynomial probability density function on , with degree if is a polynomial of degree . Moreover, by relation (7) in Lemma 4.4, we obtain
[TABLE]
From this we can derive the following result.
Lemma 4.5.
For any continuous and scalar one has
[TABLE]
Proof.
As is a polynomial of degree and a probability density function on (by (8)), we have:
[TABLE]
Using the above inequality (10) for we can upper bound the integral on the right hand side:
[TABLE]
Combining with the inequality (12) gives the desired result. ∎
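The effect of replacing the Boltzmann density by its polynomial approximation can be observed numerically. The following sketch assumes a hypothetical univariate instance, $f(x) = x^2$ on $K = [-1,1]$ with $T = 1/2$, and takes the polynomial density to be proportional to $\phi_{2r}(f(x)/T)$, where $\phi_{2r}$ is the truncated Taylor series of $e^{-t}$. Since this density is a feasible sum-of-squares density, its mean of $f$ is an upper bound on the minimum; the code compares it with the Boltzmann mean using a midpoint rule.

```python
from math import exp, factorial

def phi(t, order):
    """Taylor polynomial of exp(-t) around 0, truncated at the given order."""
    return sum((-t) ** k / factorial(k) for k in range(order + 1))

def weighted_mean(f, weight, lo, hi, m=2000):
    """Midpoint-rule estimate of (int f*weight) / (int weight) on [lo, hi]."""
    h = (hi - lo) / m
    num = 0.0
    den = 0.0
    for i in range(m):
        x = lo + (i + 0.5) * h
        w = weight(x)
        num += f(x) * w
        den += w
    return num / den

def f(x):
    return x * x          # hypothetical test polynomial, minimum 0 on [-1, 1]

T = 0.5                   # arbitrary illustrative temperature
boltzmann = weighted_mean(f, lambda x: exp(-f(x) / T), -1.0, 1.0)
for r in (1, 2, 4, 6):
    surrogate = weighted_mean(f, lambda x: phi(f(x) / T, 2 * r), -1.0, 1.0)
    print(r, surrogate, boltzmann)
```

As $r$ grows the mean under the polynomial density approaches the Boltzmann mean, which is the mechanism behind the comparison of the two hierarchies.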
We now proceed to the proof of Theorem 4.1. In view of Lemma 4.5, we only need to bound the last right-hand-side term in (11):
[TABLE]
and to show that .
By the definition of we have
[TABLE]
which implies
[TABLE]
Combining with the Stirling approximation inequality,
[TABLE]
applied to , we obtain:
[TABLE]
Consider , so that . Then, using the fact that , we obtain
[TABLE]
This concludes the proof of Theorem 4.1.
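The proof relies on a Stirling-type lower bound for the factorial. As a quick sanity check, the sketch below assumes the standard form $n! \ge \sqrt{2\pi n}\,(n/e)^n$ (the exact variant used in the proof may differ) and compares both sides for a few values of $n$. The helper name is hypothetical.

```python
from math import e, factorial, pi, sqrt

def stirling_lower(n):
    """Standard Stirling lower bound sqrt(2*pi*n) * (n/e)**n for n!."""
    return sqrt(2 * pi * n) * (n / e) ** n

for n in (1, 2, 5, 10, 20):
    print(n, factorial(n), stirling_lower(n))
```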
5 Concluding remarks
We conclude with a numerical comparison of the two hierarchies of bounds. By Theorem 4.1, it is reasonable to compare the bounds and , with and the degree of . Thus we define, for the purpose of comparison:
[TABLE]
We calculated the bounds for the polynomial test functions listed in Table 1.
The bounds are shown in Table 2. The Lasserre bounds were taken from [2], while the simulated annealing bounds were computed via numerical integration, in particular using the Matlab routine sum2 of the package Chebfun [5].
The results in the table show that the bound in Theorem 4.1 is far from tight for these examples. In fact, it may well be that the convergence rates of the Lasserre bounds and of the simulated annealing bounds are different for convex $f$. We know that a rate linear in the temperature $T$ is the exact convergence rate for the simulated annealing bounds for convex $f$ (cf. Example 3.4), but it was speculated in [2] that one may in fact have a rate of $O(1/r^2)$ for the Lasserre bounds, even for non-convex $f$. Determining the exact convergence rate remains an open problem.
Finally, one should point out that it is not really meaningful to compare the computational complexities of computing the two bounds and , as explained below.
For any polynomial and convex body , may be computed by solving a generalised eigenvalue problem with matrices of order , as long as the moments of the Lebesgue measure on are known. The generalised eigenvalue computation may be done in operations; see [3] for details. Thus this is a polynomial-time procedure for fixed values of .
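As a rough illustration of how the size of the eigenvalue problem grows, the sketch below counts the monomials of degree at most $r$ in $n$ variables, $\binom{n+r}{r}$, assuming (as in Section 2) that the rows and columns of the matrices are indexed by these monomials. The helper name is hypothetical.

```python
from math import comb

def matrix_order(n, r):
    """Number of monomials of degree at most r in n variables."""
    return comb(n + r, r)

for n, r in [(2, 3), (2, 10), (5, 5), (10, 5)]:
    print(n, r, matrix_order(n, r))
```

For fixed $r$ this count is polynomial in $n$, which is consistent with the procedure being polynomial-time for fixed values of $r$.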
For non-convex , the complexity of computing is not known. When is linear, it is shown in [1] that with may be obtained in oracle membership calls for , where the notation suppresses logarithmic factors.
Since the assumptions on the available information are different for the two types of bounds, there is no simple way to compare these respective complexities.
- [1] J. Abernethy and E. Hazan. Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier. arXiv:1507.02528, July 2015.
- [2] E. de Klerk, R. Hess and M. Laurent. Improved convergence rates for Lasserre-type hierarchies of upper bounds for box-constrained polynomial optimization. SIAM J. Optim. (to appear), arXiv:1603.03329v1 (2016).
- [3] E. de Klerk, J.-B. Lasserre, M. Laurent, and Z. Sun. Bound-constrained polynomial optimization using only elementary calculations. Mathematics of Operations Research (to appear), arXiv:1507.04404v2 (2016).
- [4] E. de Klerk, M. Laurent, and Z. Sun. Convergence analysis for Lasserre’s measure-based hierarchy of upper bounds for polynomial optimization. Math. Program. Ser. A (2016). doi:10.1007/s10107-016-1043-1.
- [5] T. A. Driscoll, N. Hale, and L. N. Trefethen, editors. Chebfun Guide. Pafnuty Publications, Oxford, 2014.
- [6] A. T. Kalai and S. Vempala. Simulated annealing for convex optimization. Mathematics of Operations Research 31(2), 253–266 (2006).
- [7] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi. Optimization by simulated annealing. Science 220, 671–680 (1983).
- [8] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001).
