On semi-infinite systems of convex polynomial inequalities and polynomial optimization problems

Feng Guo; Xiaoxia Sun

arXiv:1812.10987·math.OC·August 6, 2019

TL;DR

This paper develops a method to approximate semi-infinite convex polynomial inequality systems using semidefinite programming, enabling solutions to related convex polynomial optimization problems with guarantees on accuracy and exactness in special cases.

Contribution

It introduces a procedure for constructing approximate semidefinite representations of semi-infinite convex polynomial inequality systems and applies this to convex polynomial optimization.

Findings

01

Constructed semidefinite representations that approximate the feasible set.

02

Provided an SDP relaxation method for convex polynomial optimization over these sets.

03

Achieved exact SDP relaxation and minimizer extraction in special cases.

Abstract

We consider the semi-infinite system of polynomial inequalities of the form \[ \mathbf{K}:=\{x\in\mathbb{R}^m\mid p(x,y)\ge 0,\ \ \forall y\in S\subseteq\mathbb{R}^n\}, \] where $p(x,y)$ is a real polynomial in the variables $x$ and the parameters $y$, the index set $S$ is a basic semialgebraic set in $\mathbb{R}^{n}$, and $-p(x,y)$ is convex in $x$ for every $y\in S$. We propose a procedure to construct approximate semidefinite representations of $\mathbf{K}$. These approximate semidefinite representations are indexed by two indices. As the two indices increase, these semidefinite representation sets expand and contract, respectively, and can approximate $\mathbf{K}$ as closely as possible under some assumptions. In some special cases, we can fix one of the two indices or both. Then, we consider the optimization problem of minimizing a convex polynomial over $\mathbf{K}$. We present an SDP…

Equations

\[ \mathbf{K}:=\{x\in\mathbb{R}^{m}\mid p(x,y)\geq 0,\ \forall y\in S\subseteq\mathbb{R}^{n}\}, \]

\[ S:=\{y\in\mathbb{R}^{n}\mid g_{1}(y)\geq 0,\ldots,g_{s}(y)\geq 0\}, \]

\[ \left\{x\in{\mathbb{R}}^{m}\ \Big{|}\ \exists w\in{\mathbb{R}}^{l},\ \text{s.t.}\ A_{0}+\sum_{i=1}^{m}A_{i}x_{i}+\sum_{j=1}^{l}B_{j}w_{j}\succeq 0\right\} \]

\[ (\mathbf{P})\qquad f^{*}:=\inf_{x\in\mathbf{K}}f(x), \] where $\mathbf{K}$ is defined in (1) and $f(x)\in\mathbb{R}[x]$ is convex.

\[ \min_{x\in\mathbb{R}^{m}}f(x)\quad\text{s.t.}\quad p(x,y_{1})\geq 0,\ldots,p(x,y_{l})\geq 0. \]

\[ L_{f}(x):=f(x)-f^{*}-\sum_{i=1}^{l}\lambda_{i}p(x,y_{i}) \]

\[ f(x)-\sum_{i=1}^{l}\lambda_{i}p(x,y_{i})\geq f(u)-\sum_{i=1}^{l}\lambda_{i}p(u,y_{i}),\ \forall x\in\mathbb{R}^{m}\quad\text{and}\quad\lambda_{i}p(u,y_{i})=0,\ i=1,\ldots,l, \]

\[ \mathbf{qmodule}(G):=\left\{\sum_{j=0}^{s}g_{j}q_{j}^{2}\ \Big{|}\ q_{j}\in{\mathbb{R}}[y],j=0,1,\ldots,s\right\} \]

\[ \mathbf{qmodule}_{k}(G):=\left\{\sum_{j=0}^{s}g_{j}q_{j}^{2}\ \Big{|}\ q_{j}\in{\mathbb{R}}[y],\,\deg(g_{j}q_{j}^{2})\leq 2k,j=0,1,\ldots,s\right\} \]

\[ \|\psi\|:=\max_{\alpha}\frac{|\psi_{\alpha}|}{\binom{|\alpha|}{\alpha}}. \]

\[ k\geq c\exp\left[\left(d^{2}n^{d}\frac{\|\psi\|\tau_{S}^{d}}{\min_{y\in S}\psi(y)}\right)^{c}\right]. \]

\[ \mathscr{H}(y^{\alpha})=\int y^{\alpha}\,d\mu(y),\quad\forall\alpha\in\mathbb{N}^{n}. \]

\[ (\mathbf{qmodule}_{k}(G))^{*}=\{\mathscr{H}\in(\mathbb{R}[y]_{2k})^{*}\mid\mathscr{H}(g_{j}q_{j}^{2})\geq 0,\ \forall q_{j}\in\mathbb{R}[y],\ \deg(g_{j}q_{j}^{2})\leq 2k,\ j=0,1,\ldots,s\}. \]

\[ d_{j}:=\lceil\deg(g_{j})/2\rceil,\ \forall j=1,\ldots,s,\quad\text{and}\quad d_{S}:=\max_{j}d_{j}. \]

\[ \mbox{rank}\,M_{k-d_{S}}(\mathscr{H})=\mbox{rank}\,M_{k}(\mathscr{H}). \]

\[ \widetilde{S}_{>} \]

\[ \widetilde{S} \]

\[ \{x\in\mathbb{R}^{m}\mid p^{\text{hom}}(x,\tilde{y})\geq 0,\ \forall\tilde{y}\in{\sf closure}(\widetilde{S}_{>})\}. \]

\[ \widetilde{\mathbf{K}}:=\{x\in\mathbb{R}^{m}\mid p^{\text{hom}}(x,\tilde{y})\geq 0,\ \forall\tilde{y}\in\widetilde{S}\}. \]

\[ \widehat{S}:=\{y\in\mathbb{R}^{n}\mid\hat{g}_{1}(y)\geq 0,\ldots,\hat{g}_{s}(y)\geq 0,\ \|y\|_{2}^{2}=1\}. \]

\[ \Theta_{r}(x)=\sum_{i=1}^{m}\left(\frac{x_{i}}{\tau_{\mathbf{K}}}\right)^{2r}\in\mathbb{R}[x]. \]

\[ d_{x}=\deg_{x}(p(x,y)),\quad d_{y}=\deg_{y}(p(x,y))\quad\text{and}\quad d_{\mathbf{K}}:=\max\{\lceil d_{y}/2\rceil,d_{S}\}. \]

\[ \Lambda_{r,t}:=\left\{(\mathscr{L}(x_{1}),\ldots,\mathscr{L}(x_{m}))\in\mathbb{R}^{m}\ :\ \begin{array}{l}\mathscr{L}\in(\mathbb{R}[x]_{2r})^{*},\ \mathscr{L}(1)=1,\\ \mathscr{L}(q^{2})\geq 0,\ \forall q\in\mathbb{R}[x]_{r},\\ \mathscr{L}(\Theta_{k})\leq 1,\ k=\lceil d_{x}/2\rceil,\ldots,r,\\ \mathscr{L}(p(x,y))\in\mathbf{qmodule}_{t}(G)\end{array}\right\}. \]

\[ a^{T}x-b+\frac{\varepsilon}{2}(1+\Theta_{r})=\tilde{q}^{2}+\sum_{j=1}^{l}\lambda_{j}p(x,y_{j}) \]

\[ 0\leq\lambda_{j}\leq\frac{a^{T}u_{0}-b}{p(u_{0},y_{j})}\leq\frac{2\tau_{\mathbf{K}}}{p(u_{0},y_{j})}\leq\frac{2\tau_{\mathbf{K}}}{\min_{j=1,\ldots,l}p(u_{0},y_{j})}\leq\frac{2\tau_{\mathbf{K}}}{\min_{y\in S}p(u_{0},y)}\leq\frac{2\tau_{\mathbf{K}}}{p_{u_{0}}^{*}}, \]

\[ \begin{aligned} 0&>a^{T}v-b+\varepsilon\\ &\geq\mathscr{L}(a^{T}x-b)+\frac{\varepsilon}{2}\mathscr{L}(1+\Theta_{r})\\ &=\mathscr{L}\Big(\tilde{q}^{2}+\int_{S}p(x,y)\,d\mu(y)\Big)\\ &=\mathscr{L}(\tilde{q}^{2})+\int_{S}\mathscr{L}(p(x,y))\,d\mu(y)\geq 0, \end{aligned} \]

\[ p(\bar{u},y)\geq\lambda\,p(u_{0},y). \]

\[ p(\bar{u},y)\geq\kappa(\varepsilon)\,p(u_{0},y)\geq\kappa(\varepsilon)\,p_{u_{0}}^{*}>0 \]

Taxonomy

Topics: Advanced Optimization Algorithms Research · Polynomial and algebraic computation · Advanced Control Systems Optimization

Full text

On semi-infinite systems of convex polynomial inequalities and polynomial optimization problems

Feng Guo

School of Mathematical Sciences,

Dalian University of Technology, Dalian, 116024, China

Xiaoxia Sun

School of Mathematics,

Dongbei University of Finance and Economics, Dalian, 116025, China

Abstract

We consider the semi-infinite system of polynomial inequalities of the form

\[ \mathbf{K}:=\{x\in\mathbb{R}^{m}\mid p(x,y)\geq 0,\ \forall y\in S\subseteq\mathbb{R}^{n}\}, \]

where $p(x,y)$ is a real polynomial in the variables $x$ and the parameters $y$, the index set $S$ is a basic semialgebraic set in ${\mathbb{R}}^{n}$, and $-p(x,y)$ is convex in $x$ for every $y\in S$. We propose a procedure to construct approximate semidefinite representations of $\mathbf{K}$. These approximate semidefinite representations are indexed by two indices. As the two indices increase, these semidefinite representation sets expand and contract, respectively, and can approximate $\mathbf{K}$ as closely as possible under some assumptions. In some special cases, we can fix one of the two indices or both. Then, we consider the optimization problem of minimizing a convex polynomial over $\mathbf{K}$. We present an SDP relaxation method for this optimization problem, using strategies similar to those used in constructing approximate semidefinite representations of $\mathbf{K}$. Under certain assumptions, some approximate minimizers of the optimization problem can also be obtained from the SDP relaxations. In some special cases, we show that the SDP relaxation for the optimization problem is exact and all minimizers can be extracted.

keywords:

semi-infinite systems, convex polynomials, semidefinite representations, semidefinite programming relaxations, sum of squares, polynomial optimization

MSC:

[2010] 65K05, 90C22, 90C34

1 Introduction

We consider the following semi-infinite system of polynomial inequalities

\[ \mathbf{K}:=\{x\in\mathbb{R}^{m}\mid p(x,y)\geq 0,\ \forall y\in S\subseteq\mathbb{R}^{n}\}, \tag{1} \]

where $p(x,y)\in{\mathbb{R}}[x,y]:={\mathbb{R}}[x_{1},\ldots,x_{m},y_{1},\ldots,y_{n}]$, the polynomial ring in $x$ and $y$ over the real field, and the index set $S$ is a basic semialgebraic set defined by

\[ S:=\{y\in\mathbb{R}^{n}\mid g_{1}(y)\geq 0,\ldots,g_{s}(y)\geq 0\}, \tag{2} \]

where $g_{j}(y)\in{\mathbb{R}}[y]$, $j=1,\dots,s$. In this paper, we assume that $-p(x,y)\in{\mathbb{R}}[x]$ is convex for every $y\in S$ and hence $\mathbf{K}$ is a convex set in ${\mathbb{R}}^{m}$.

We say a convex set $C$ in ${\mathbb{R}}^{m}$ is semidefinitely representable (or linear matrix inequality representable) if there exist some integers $l,k$ and real $k\times k$ symmetric matrices $\{A_{i}\}_{i=0}^{m}$ and $\{B_{j}\}_{j=1}^{l}$ such that $C$ is identical with

\[ \left\{x\in{\mathbb{R}}^{m}\ \Big{|}\ \exists w\in{\mathbb{R}}^{l},\ \text{s.t.}\ A_{0}+\sum_{i=1}^{m}A_{i}x_{i}+\sum_{j=1}^{l}B_{j}w_{j}\succeq 0\right\} \tag{3} \]

and (3) is called the semidefinite representation (or linear matrix inequality representation) of $C$. Many interesting convex sets are semidefinitely representable; see the collection in Ben-Tal and Nemirovski (2001). Clearly, optimizing a linear function over a semidefinitely representable set can be cast as a semidefinite programming (SDP) problem. SDP has an extremely wide range of applications and can be solved efficiently by interior-point methods to a given accuracy in polynomial time (cf. Wolkowicz et al. (2000)). Semidefinite representations of convex sets can help us to build SDP relaxations of many computationally intractable optimization problems. Consequently, one of the basic issues in convex algebraic geometry is to characterize the convex sets in ${\mathbb{R}}^{m}$ which are semidefinitely representable and to give systematic procedures for obtaining their semidefinite representations. Clearly, if a set in ${\mathbb{R}}^{m}$ is semidefinitely representable, then it is convex and semialgebraic. Conversely, Nemirovski asked in his plenary address at the 2006 ICM whether every convex semialgebraic set is semidefinitely representable. A negative answer was recently given by Scheiderer (2018). Hence, it is reasonable to study how to construct approximate semidefinite representations of $C$, that is, a sequence of semidefinitely representable sets of the form (3) which converge to $C$ in some sense.
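To make the definition of a semidefinite representation concrete, here is a minimal sketch for one textbook example that is our own choice and not taken from the paper: the closed unit disk in ${\mathbb{R}}^{2}$, written as a $2\times 2$ linear matrix inequality with no lifting variables $w$ (so $l=0$). The matrices `a0`, `a1`, `a2` and all function names are illustrative assumptions.

```python
def lmi_matrix(x1, x2):
    """Return A0 + x1*A1 + x2*A2 for a 2x2 LMI describing the unit disk."""
    a0 = [[1.0, 0.0], [0.0, 1.0]]
    a1 = [[1.0, 0.0], [0.0, -1.0]]
    a2 = [[0.0, 1.0], [1.0, 0.0]]
    return [[a0[i][j] + x1 * a1[i][j] + x2 * a2[i][j] for j in range(2)]
            for i in range(2)]


def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol


def in_disk_via_lmi(x1, x2):
    """Membership test through the LMI instead of x1^2 + x2^2 <= 1."""
    return is_psd_2x2(lmi_matrix(x1, x2))


print(in_disk_via_lmi(0.3, 0.4))   # point inside the disk
print(in_disk_via_lmi(1.0, 0.5))   # point outside the disk
```

The matrix is $\begin{pmatrix}1+x_{1}&x_{2}\\ x_{2}&1-x_{1}\end{pmatrix}$, whose trace is $2$ and whose determinant is $1-x_{1}^{2}-x_{2}^{2}$, so positive semidefiniteness is equivalent to $x_{1}^{2}+x_{2}^{2}\leq 1$.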

For a given basic semialgebraic set in ${\mathbb{R}}^{m}$, Lasserre (2009b) and Gouveia et al. (2010) proposed methods to construct semidefinite outer approximations of the closure of its convex hull. These approaches are based on sums of squares representations of linear functions which are nonnegative on a basic semialgebraic set. If the basic semialgebraic set is compact, these approximations can be made arbitrarily close and become exact under some favorable conditions. Some extensions of these semidefinite approximations to noncompact basic semialgebraic sets are given in Guo et al. (2015). For a convex semialgebraic set, Helton and Nie (2009, 2010) proposed some sufficient conditions, in terms of curvature conditions for the boundary, for its semidefinite representability. These conditions were recently modified and improved by Kriel and Schweighofer (2018). In this paper, we first consider the construction of approximate semidefinite representations of the set $\mathbf{K}$ in (1). This problem differs from those in the literature in that $\mathbf{K}$ is defined by infinitely many convex real polynomials. As there is a quantifier in the definition (1), $\mathbf{K}$ is in fact a semialgebraic set by the Tarski-Seidenberg principle (cf. Bochnak et al. (1998)). Theoretically, $\mathbf{K}$ can be decomposed as a finite union of basic closed semialgebraic sets and hence, as proved in Helton and Nie (2009), semidefinite approximations of $\mathbf{K}$ can be obtained by gluing together the Lasserre (2009b) relaxations of many small pieces of $\mathbf{K}$. Such a decomposition of $\mathbf{K}$ can possibly be obtained by quantifier elimination with algebraic techniques (cf. Bochnak et al. (1998)). However, in practice (exact) quantifier elimination is very costly and limited to problems of very modest size. These obstacles make the problem studied in this paper nontrivial. To the best of our knowledge, there is very little related work in the literature addressing this issue. In Lasserre (2015); Magron et al. (2015), some tractable methods using semidefinite programs are proposed to approximate semialgebraic sets defined with quantifiers. Clearly, the set $\mathbf{K}$ studied in this paper falls into this class, with a universal quantifier. However, rather than approximate semidefinite representations of $\mathbf{K}$, their approach generates a sequence of sublevel sets of a single polynomial to approximate $\mathbf{K}$.

In the second part of this paper, we consider the following convex minimization problem

\[ (\mathbf{P})\qquad f^{*}:=\inf_{x\in\mathbf{K}}f(x), \]

where $\mathbf{K}$ is defined in (1) and $f(x)\in{\mathbb{R}}[x]$ is convex.

This problem is NP-hard. Indeed, the problem of minimizing a polynomial $h(y)\in{\mathbb{R}}[y]$ over $S$ can be regarded as a special case of $(\mathbf{P})$. As is well known, the polynomial optimization problem is NP-hard even when $n>1$, $h(y)$ is a nonconvex quadratic polynomial and the $g_{j}(y)$'s are linear (cf. Pardalos and Vavasis (1991)). Hence, the general problem $(\mathbf{P})$ cannot be expected to be solved in polynomial time unless P=NP.
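One way to make this reduction explicit is the following epigraph-type instance. It is our own illustration: the choices $m=1$, $f(x)=-x$ and $p(x,y)=h(y)-x$ are assumptions and are not spelled out in the text.

\[ \mathbf{K}=\{x\in\mathbb{R}\mid h(y)-x\geq 0,\ \forall y\in S\}=\Big(-\infty,\ \min_{y\in S}h(y)\Big],\qquad f^{*}=\inf_{x\in\mathbf{K}}(-x)=-\min_{y\in S}h(y). \]

Here $-p(x,y)=x-h(y)$ is linear, hence convex, in $x$ for every $y$, and $f$ is linear, so the instance has the form of $(\mathbf{P})$ whenever the minimum of $h$ over $S$ is attained.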

The problem $(\mathbf{P})$ can be seen as a special branch of convex semi-infinite programming (SIP), in which the involved functions are not necessarily polynomials. Numerically, SIP problems can be solved by different approaches including, for instance, discretization methods, local reduction methods, exchange methods and simplex-like methods. See Hettich and Kortanek (1993); López and Still (2007); Goberna and López (2017) and the references therein for details. One of the main difficulties in the numerical treatment of general SIP problems is that the feasibility test of a point $\bar{u}\in{\mathbb{R}}^{m}$ is equivalent to globally solving the lower level subproblem $\min_{y\in S}p(\bar{u},y)$, which is generally nonlinear and nonconvex. To the best of our knowledge, few of the numerical methods mentioned above are specially designed to exploit features of polynomial optimization problems. Parpas and Rustem (2009) proposed a discretization-like method to solve minimax polynomial optimization problems, which can be reformulated as semi-infinite polynomial programming (SIPP) problems. Using polynomial approximation and an appropriate hierarchy of SDP relaxations, Lasserre presented an algorithm to solve the generalized SIPP problems in Lasserre (2012). Based on an exchange scheme, an SDP relaxation method for solving SIPP problems was proposed in Wang and Guo (2013). By using representations of nonnegative polynomials in the univariate case, an SDP method was given in Xu et al. (2015) for linear SIPP problems (a special case of $(\mathbf{P})$) with $S$ being closed intervals.
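The feasibility test mentioned above can be illustrated with a small sketch. It uses a toy polynomial $p(x,y)=1-x_{1}y-x_{2}y^{2}$ with $S=[-1,1]$, both of which are our own assumptions, and it replaces the global lower level solve by a plain grid search. A grid search is only a heuristic stand-in for global optimization and is not the method of the paper.

```python
def p(x, y):
    """Toy polynomial p(x, y) = 1 - x1*y - x2*y^2 (linear in x)."""
    return 1.0 - x[0] * y - x[1] * y * y


def lower_level_min_on_grid(x, num_points=2001):
    """Approximate min over y in [-1, 1] of p(x, y) on a uniform grid."""
    best_val, best_y = float("inf"), None
    for i in range(num_points):
        y = -1.0 + 2.0 * i / (num_points - 1)
        val = p(x, y)
        if val < best_val:
            best_val, best_y = val, y
    return best_val, best_y


def looks_feasible(x, tol=1e-9):
    """Heuristic feasibility test: grid minimum of p(x, .) is nonnegative."""
    value, _ = lower_level_min_on_grid(x)
    return value >= -tol


print(looks_feasible((0.5, 0.2)))              # passes on the grid
print(lower_level_min_on_grid((2.0, 0.0)))     # violated at y = 1
```

A point rejected by the grid test is certainly infeasible, while a point accepted by it is only feasible up to the grid resolution; this gap is exactly why a global solve of the lower level problem is needed in general.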

Here are some contributions and novelties in this paper:

(i)

We first propose a procedure to construct approximate semidefinite representations of $\mathbf{K}$ (Section 3.2). The construction is based on some representations of linear functions nonnegative on $\mathbf{K}$. On the one hand, we use the high degree perturbation proposed in Lasserre and Netzer (2007) to approximate the Lagrangian associated with the considered linear function by sums of squares of polynomials. On the other hand, as there is an integration with respect to some unknown measure in the Lagrangian, we employ Putinar’s Positivstellensatz to replace the integration by some linear functionals in the dual spaces of quadratic modules. Consequently, some semidefinite representation sets with two indices are obtained to approximate $\mathbf{K}$. As the two indices increase, these semidefinite representation sets expand and contract, respectively, and can approximate $\mathbf{K}$ as closely as possible under some assumptions (Theorem 3.7). In some special cases, we can fix one of the two indices or both (Remark 3.8).

(ii)

As the second contribution of this paper, we present some new SDP relaxation methods for the problem $(\mathbf{P})$, using strategies similar to those used in constructing approximate semidefinite representations of $\mathbf{K}$. Approximate values of $f^{*}$ can be obtained from the proposed SDP relaxations with two indices and converge to $f^{*}$ as the two indices tend to $\infty$ (Theorem 4.4). If $(\mathbf{P})$ has a unique minimizer, approximate minimizers of $(\mathbf{P})$ can also be obtained from the SDP relaxations (Remark 4.5). Compared with some existing related work, the convexity in $(\mathbf{P})$ is well exploited here and the assumptions needed are quite mild. In the case when $f$ and $-p(x,y)$ are s.o.s-convex for every $y\in S$, the indices in the SDP relaxations can be reduced to one. If, moreover, $S$ is a bounded interval, we show that the SDP relaxation of $(\mathbf{P})$ is exact and all minimizers can be extracted (Theorem 4.8).

This paper is organized as follows. In Section 2, we give some notation and preliminaries used in this paper. Approximate semidefinite representations of $\mathbf{K}$ as well as some examples are proposed in Section 3. We study SDP relaxations of the problem $(\mathbf{P})$ in Section 4.

2 Notation and Preliminaries

Here is some notation used in this paper. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$ ) denotes the set of nonnegative integers (resp., real numbers). For any $t\in\mathbb{R}$ , $\lceil t\rceil$ (resp. $\lfloor t\rfloor$ ) denotes the smallest (resp. largest) integer that is not smaller (resp. larger) than $t$ . For $x=(x_{1},\ldots,x_{m})\in\mathbb{R}^{m}$ , $\|x\|_{2}$ denotes the standard Euclidean norm of $x$ . For $\alpha=(\alpha_{1},\ldots,\alpha_{n})\in\mathbb{N}^{n}$ , $|\alpha|=\alpha_{1}+\cdots+\alpha_{n}$ . For $k\in\mathbb{N}$ , denote $\mathbb{N}^{n}_{k}=\{\alpha\in\mathbb{N}^{n}\mid|\alpha|\leq k\}$ and $|\mathbb{N}^{n}_{k}|$ its cardinality. For variables $x=(x_{1},\ldots,x_{m})$ , $y=(y_{1},\ldots,y_{n})$ and $\beta\in\mathbb{N}^{m},\alpha\in\mathbb{N}^{n}$ , $x^{\beta}$ , $y^{\alpha}$ denote $x_{1}^{\beta_{1}}\cdots x_{m}^{\beta_{m}}$ , $y_{1}^{\alpha_{1}}\cdots y_{n}^{\alpha_{n}}$ , respectively. ${\mathbb{R}}[x]$ (resp., $\mathbb{R}[y]$ ) denotes the ring of polynomials in $x$ (resp., $y$ ) with real coefficients. For $k\in\mathbb{N}$ , denote by ${\mathbb{R}}[x]_{k}$ (resp., ${\mathbb{R}}[y]_{k}$ ) the set of polynomials in ${\mathbb{R}}[x]$ (resp., ${\mathbb{R}}[y]$ ) of total degree up to $k$ . For $A={\mathbb{R}}[x],\ {\mathbb{R}}[y],\ {\mathbb{R}}[x]_{k},\ {\mathbb{R}}[y]_{k}$ , denote by $A^{*}$ the dual space of linear functionals from $A$ to ${\mathbb{R}}$ .
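The multi-index notation can be checked on small cases with the following sketch. The helper names are ours; the only fact used beyond the definitions above is the standard count $|\mathbb{N}^{n}_{k}|=\binom{n+k}{k}$.

```python
from itertools import product
from math import comb


def multi_indices(n, k):
    """List all alpha in N^n with |alpha| = alpha_1 + ... + alpha_n <= k."""
    return [alpha for alpha in product(range(k + 1), repeat=n)
            if sum(alpha) <= k]


def monomial(y, alpha):
    """Evaluate y^alpha = y_1^alpha_1 * ... * y_n^alpha_n at a point y."""
    value = 1
    for yi, ai in zip(y, alpha):
        value *= yi ** ai
    return value


print(len(multi_indices(2, 2)), comb(2 + 2, 2))   # both counts agree
print(monomial((2, 3), (1, 2)))                   # 2^1 * 3^2
```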

Definition 2.1

We say that the Slater condition holds for $\mathbf{K}$ if there exists $u\in\mathbf{K}$ such that $p(u,y)>0$ for all $y\in S$ and the point $u$ is called a Slater point.

Theorem 2.2

(cf. Borwein (1981); Levin (1969)) Assume that the Slater condition holds for $\mathbf{K}$ and the index set $S$ is compact in the problem $(\mathbf{P})$. Then for any convex $f(x)\in{\mathbb{R}}[x]$, there exist points $y_{1},\ldots,y_{l}\in S$ with $l\leq n$ such that $f^{*}$ is equal to the optimal value of the discretization problem

\[ \min_{x\in\mathbb{R}^{m}}f(x)\quad\text{s.t.}\quad p(x,y_{1})\geq 0,\ldots,p(x,y_{l})\geq 0. \tag{4} \]

Corollary 2.3

Suppose that the assumptions in Theorem 2.2 hold for $(\mathbf{P})$. Then for any convex $f(x)\in{\mathbb{R}}[x]$, there exist $u\in{\mathbb{R}}^{m}$, $y_{1},\ldots,y_{l}\in S$ and nonnegative Lagrange multipliers $\lambda_{1},\ldots,\lambda_{l}\in{\mathbb{R}}$ with $l\leq n$ such that the Lagrangian

\[ L_{f}(x):=f(x)-f^{*}-\sum_{i=1}^{l}\lambda_{i}p(x,y_{i}) \]

satisfies $L_{f}(x)\geq L_{f}(u)=0$ for all $x\in{\mathbb{R}}^{m}$ and $\nabla L_{f}(u)=0$.

Proof. Consider the discretization problem (4). As the Slater condition holds for (4), by convex programming duality (cf. (Bertsekas, 2009, Proposition 5.3.1)), there is no duality gap between (4) and its Lagrange dual problem, which has an optimal solution, say $\lambda=(\lambda_{1},\ldots,\lambda_{l})$ where $\lambda_{i}\geq 0$. By a Frank-Wolfe type theorem proved in Belousov (1977), (4) also has an optimal solution, say $u$. Then, by convex programming optimality conditions (cf. (Bertsekas, 2009, Proposition 5.3.2)), we get

\[ f(x)-\sum_{i=1}^{l}\lambda_{i}p(x,y_{i})\geq f(u)-\sum_{i=1}^{l}\lambda_{i}p(u,y_{i}),\ \forall x\in\mathbb{R}^{m}\quad\text{and}\quad\lambda_{i}p(u,y_{i})=0,\ i=1,\ldots,l, \]

which implies that $L_{f}(x)\geq L_{f}(u)=0$ for all $x\in{\mathbb{R}}^{m}$ and hence $\nabla L_{f}(u)=0$. $\square$
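As a small sanity check of Corollary 2.3, consider a toy instance of our own choosing (not from the paper): $m=n=1$, $S=[0,1]$, $p(x,y)=x-y$ and $f(x)=x^{2}$. Then $\mathbf{K}=[1,\infty)$, the Slater condition holds (for instance at $x=2$), and $f^{*}=1$ is attained at $u=1$. Taking the single point $y_{1}=1$ with multiplier $\lambda_{1}=2$ gives

\[ L_{f}(x)=x^{2}-1-2(x-1)=(x-1)^{2}\geq 0=L_{f}(1),\qquad\nabla L_{f}(1)=0, \]

and the complementarity condition $\lambda_{1}p(u,y_{1})=2\cdot(1-1)=0$ holds as well. In this instance the Lagrangian is itself a square, which is the kind of structure the s.o.s approximations recalled next are meant to capture.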

Next we recall some background about representations of polynomials positive (nonnegative) on a basic semialgebraic set and the dual theory. A polynomial $\phi(x)\in{\mathbb{R}}[x]$ is said to be a sum of squares (s.o.s) of polynomials if it can be written as $\phi(x)=\sum_{i=1}^{t}\phi_{i}(x)^{2}$ for some $\phi_{1}(x),\ldots,\phi_{t}(x)\in{\mathbb{R}}[x]$. Notice that not every nonnegative polynomial can be written as an s.o.s, see Reznick (2000). Lasserre and Netzer (2007) gave the following s.o.s approximations of nonnegative polynomials via simple high degree perturbations.

Theorem 2.4

(Lasserre and Netzer, 2007, cf. Theorems 3.1, 3.2 and Corollary 3.3) For a given $h\in{\mathbb{R}}[x]$, the following are true.

(i)

For any $r\geq\lceil\deg(h)/2\rceil$, there exists $\varepsilon^{*}_{r}\geq 0$ such that $h+\varepsilon(1+\sum_{j=1}^{m}x_{j}^{2r})$ is s.o.s if and only if $\varepsilon\geq\varepsilon^{*}_{r}$;

(ii)

If $h$ is nonnegative on $[-1,1]^{m}$, then $\varepsilon^{*}_{r}$ in (i) decreasingly converges to $0$ as $r$ tends to $\infty$;

(iii)

For any $\varepsilon>0$, if $h$ is nonnegative on $[-1,1]^{m}$, then there exists some $r(h,\varepsilon)\in\mathbb{N}$ such that $h+\varepsilon(1+\sum_{j=1}^{m}x_{j}^{2r})$ is s.o.s for every $r\geq r(h,\varepsilon)$.

Moreover, $\varepsilon^{*}_{r}$ in Theorem 2.4 is computable by solving an SDP problem, see (Lasserre and Netzer, 2007, Theorem 3.1).
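The behaviour of $\varepsilon^{*}_{r}$ in Theorem 2.4 can be observed on a univariate toy case chosen by us: $h(x)=1-x^{2}$, which is nonnegative on $[-1,1]$ but not on ${\mathbb{R}}$. The sketch below relies on the standard fact that a univariate polynomial is s.o.s if and only if it is nonnegative, so no SDP solver is needed here; in several variables one would have to solve the SDP mentioned above instead. All function names are hypothetical, and the sketch covers $r\geq 2$ only (for $r=1$ a direct computation gives $\varepsilon^{*}_{1}=1$).

```python
import math


def min_perturbed(eps, r):
    """Global minimum of 1 - x^2 + eps*(1 + x^(2r)) for r >= 2, eps > 0.

    With t = x^2 >= 0 this is g(t) = 1 + eps - t + eps*t^r, which is convex
    in t, so the minimum sits at the stationary point t_star.
    """
    t_star = (1.0 / (eps * r)) ** (1.0 / (r - 1))
    return 1.0 + eps - t_star + eps * t_star ** r


def eps_star(r, iters=100):
    """Smallest eps making the perturbed polynomial nonnegative (bisection)."""
    lo, hi = 0.0, 1.0          # eps = 1 is always enough for this h
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if min_perturbed(mid, r) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi


for r in (2, 3, 5, 10):
    print(r, eps_star(r))      # values shrink as r grows
```

For $r=2$ the threshold can be computed by hand from the discriminant of $\varepsilon t^{2}-t+(1+\varepsilon)$, giving $\varepsilon^{*}_{2}=(\sqrt{2}-1)/2$, which the bisection reproduces.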

In the rest of this paper, we let $G:=\{g_{1},\ldots,g_{s}\}$ be the set of polynomials that defines the semialgebraic set $S$ in (2) and set $g_{0}=1$ for convenience.

We denote by

\[ \mathbf{qmodule}(G):=\left\{\sum_{j=0}^{s}g_{j}q_{j}^{2}\ \Big{|}\ q_{j}\in{\mathbb{R}}[y],j=0,1,\ldots,s\right\} \]

the quadratic module generated by $G$ and denote by

\[ \mathbf{qmodule}_{k}(G):=\left\{\sum_{j=0}^{s}g_{j}q_{j}^{2}\ \Big{|}\ q_{j}\in{\mathbb{R}}[y],\,\deg(g_{j}q_{j}^{2})\leq 2k,j=0,1,\ldots,s\right\} \]

its $k$-th quadratic module. It is clear that if $\psi\in\mathbf{qmodule}(G)$, then $\psi(y)\geq 0$ for any $y\in S$. However, the converse is not necessarily true. Note that checking $\psi\in\mathbf{qmodule}_{k}(G)$ for a fixed $k\in\mathbb{N}$ is an SDP feasibility problem, see Lasserre (2001); Parrilo and Sturmfels (2003).
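A membership certificate for a quadratic module can be verified by plain coefficient arithmetic once it is known. The sketch below checks one hand-made certificate for a toy case of our own: $S=[-1,1]$ described by $g_{1}(y)=1-y^{2}$, and $\psi(y)=1+y$, with $q_{0}=(1+y)/\sqrt{2}$ and $q_{1}=1/\sqrt{2}$. Finding such a certificate in general is the SDP feasibility problem mentioned above, which this sketch does not attempt.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (index = power of y)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def trim(a):
    """Drop trailing zero coefficients."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a


half = Fraction(1, 2)
psi = [1, 1]                                           # 1 + y
g1 = [1, 0, -1]                                        # 1 - y^2
sigma0 = [half * c for c in poly_mul([1, 1], [1, 1])]  # q0^2 = (1 + y)^2 / 2
sigma1_g1 = [half * c for c in g1]                     # g1 * q1^2, q1^2 = 1/2
certificate = trim(poly_add(sigma0, sigma1_g1))
print(certificate == psi)                              # identity holds exactly
```

Both terms have degree $2$, so the certificate shows $1+y\in\mathbf{qmodule}_{1}(G)$ for this choice of $G$.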

Definition 2.5

We say that $\mathbf{qmodule}(G)$ is Archimedean if there exists $\psi\in\mathbf{qmodule}(G)$ such that the inequality $\psi(y)\geq 0$ defines a compact set in ${\mathbb{R}}^{n}$.

Note that the Archimedean property implies that $S$ is compact but the converse is not necessarily true. However, for any compact set $S$ we can always force the associated quadratic module to be Archimedean by adding a redundant constraint $M-\|y\|^{2}_{2}\geq 0$ in the description of $S$ for sufficiently large $M$ .

Theorem 2.6

(Putinar, 1993, Putinar’s Positivstellensatz) Suppose that $\mathbf{qmodule}(G)$ is Archimedean. If a polynomial $\psi\in{\mathbb{R}}[y]$ is positive on $S$, then $\psi\in\mathbf{qmodule}_{k}(G)$ for some $k\in\mathbb{N}$.

For a polynomial $\psi(y)=\sum_{\alpha}\psi_{\alpha}y^{\alpha}\in{\mathbb{R}}[y]$, define the norm

\[ \|\psi\|:=\max_{\alpha}\frac{|\psi_{\alpha}|}{\binom{|\alpha|}{\alpha}}. \]

We have the following result for an estimation of the order $k$ in Theorem 2.6.

Theorem 2.7

(Nie and Schweighofer, 2007, Theorem 6) Suppose that $\mathbf{qmodule}(G)$ is Archimedean and $S\subseteq(-\tau_{S},\tau_{S})^{n}$ for some $\tau_{S}>0$. Then there is some positive $c\in{\mathbb{R}}$ (depending only on the $g_{j}$'s) such that for all $\psi\in{\mathbb{R}}[y]$ of degree $d$ with $\min_{y\in S}\psi(y)>0$, we have $\psi\in\mathbf{qmodule}_{k}(G)$ whenever

\[ k\geq c\exp\left[\left(d^{2}n^{d}\frac{\|\psi\|\tau_{S}^{d}}{\min_{y\in S}\psi(y)}\right)^{c}\right]. \]

We say that a linear functional $\mathscr{H}\in({\mathbb{R}}[y])^{*}$ has a representing measure $\mu$ if there exists a Borel measure $\mu$ on ${\mathbb{R}}^{n}$ such that

\[ \mathscr{H}(y^{\alpha})=\int y^{\alpha}\,d\mu(y),\quad\forall\alpha\in\mathbb{N}^{n}. \]

For $k\in\mathbb{N}$ , we say $\mathscr{H}\in({\mathbb{R}}[y]_{k})^{*}$ has a representing measure $\mu$ if the above holds for all $\alpha\in\mathbb{N}^{n}_{k}$ .

A basic problem in the theory of moments concerns the characterization of linear functionals in $({\mathbb{R}}[y])^{*}$ which have some representing measure.

Theorem 2.8

(Berg and Maserick, 1984, Theorem 2.1) Let $\mathscr{H}$ be a linear functional in $({\mathbb{R}}[y])^{*}$ such that $\mathscr{H}(q^{2})\geq 0$ for all $q\in{\mathbb{R}}[y]$. If there exist $a,c>0$ such that $|\mathscr{H}(y^{\alpha})|\leq ca^{|\alpha|}$ for every $\alpha\in\mathbb{N}^{n}$, then $\mathscr{H}$ has exactly one representing measure $\mu$ on ${\mathbb{R}}^{n}$ with support contained in $[-a,a]^{n}$.

Haviland (1935) proved that $\mathscr{H}\in({\mathbb{R}}[y])^{*}$ has a representing measure $\mu$ supported on $S$ in (2) if and only if $\mathscr{H}(h)\geq 0$ for every $h\in{\mathbb{R}}[y]$ nonnegative on $S$ . Clearly,

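\[ (\mathbf{qmodule}_{k}(G))^{*}=\{\mathscr{H}\in(\mathbb{R}[y]_{2k})^{*}\mid\mathscr{H}(g_{j}q_{j}^{2})\geq 0,\ \forall q_{j}\in\mathbb{R}[y],\ \deg(g_{j}q_{j}^{2})\leq 2k,\ j=0,1,\ldots,s\}. \]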

Hence, in a dual view, Putinar’s Positivstellensatz reads

Theorem 2.9

(Putinar, 1993, Putinar’s Positivstellensatz) Suppose that $\mathbf{qmodule}(G)$ is Archimedean. If $\mathscr{H}\in(\mathbf{qmodule}_{k}(G))^{*}$ for all $k\in\mathbb{N}$, then $\mathscr{H}$ has a representing measure $\mu$ supported on $S$.

Let

\[ d_{j}:=\lceil\deg(g_{j})/2\rceil,\ \forall j=1,\ldots,s,\quad\text{and}\quad d_{S}:=\max_{j}d_{j}. \tag{7} \]

For $k\geq d_{S}$, we have the following sufficient condition for a linear functional $\mathscr{H}\in(\mathbf{qmodule}_{k}(G))^{*}$ to have a representing measure supported on $S$. Denote by $M_{k}(\mathscr{H})$ the $k$-th moment matrix associated with a linear functional $\mathscr{H}\in({\mathbb{R}}[y]_{2k})^{*}$, which is indexed by $\mathbb{N}^{n}_{k}$, with $(\alpha,\beta)$-th entry $\mathscr{H}(y^{\alpha+\beta})$ for $\alpha,\beta\in\mathbb{N}^{n}_{k}$.

Condition 2.10

A linear functional $\mathscr{H}\in({\mathbb{R}}[y]_{2k})^{*}$ satisfies the flat extension condition when

\[ \mbox{rank}\,M_{k-d_{S}}(\mathscr{H})=\mbox{rank}\,M_{k}(\mathscr{H}). \]

Theorem 2.11

(Curto and Fialkow, 2005, Theorem 1.1) Suppose that $\mathscr{H}\in(\mathbf{qmodule}_{k}(G))^{*}$ satisfies the flat extension condition with $r:=\mbox{rank}M_{k}(\mathscr{H})$. Then $\mathscr{H}$ has a unique $r$-atomic representing measure supported on $S$.
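The flat extension condition can be checked by hand on a tiny univariate example, sketched below under our own assumptions: $n=1$, $S=[-1,1]$ described by $g_{1}(y)=1-y^{2}$ (so $d_{S}=1$), and $\mathscr{H}$ given by the moments of the two-atom measure $\frac{1}{2}\delta_{-1}+\frac{1}{2}\delta_{1}$. The rank routine is a generic exact Gaussian elimination, not a routine from the paper.

```python
from fractions import Fraction


def moments_of_atomic_measure(atoms, weights, max_degree):
    """Moments H(y^j) = sum_i w_i * a_i^j for j = 0..max_degree (n = 1)."""
    return [sum(w * a ** j for a, w in zip(atoms, weights))
            for j in range(max_degree + 1)]


def moment_matrix(moments, k):
    """Univariate moment matrix M_k(H) with (i, j) entry H(y^(i+j))."""
    return [[moments[i + j] for j in range(k + 1)] for i in range(k + 1)]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(row) for row in matrix]
    rank_count = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank_count, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [a - factor * b
                       for a, b in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


atoms = [Fraction(-1), Fraction(1)]
weights = [Fraction(1, 2), Fraction(1, 2)]
moments = moments_of_atomic_measure(atoms, weights, 4)   # needed for M_2
k, d_S = 2, 1
print(rank(moment_matrix(moments, k - d_S)), rank(moment_matrix(moments, k)))
```

Both ranks equal $2$, so the condition holds with $k=2$, consistent with Theorem 2.11 predicting a $2$-atomic representing measure.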

To end this section, let us recall a very interesting subclass of convex polynomials in ${\mathbb{R}}[x]$ introduced by Helton and Nie (2010).

Definition 2.12

(Helton and Nie (2010)) A polynomial $h\in{\mathbb{R}}[x]$ is s.o.s-convex if its Hessian $\nabla^{2}h$ is an s.o.s, i.e., there is some integer $r$ and some matrix polynomial $H\in{\mathbb{R}}[x]^{r\times m}$ such that $\nabla^{2}h(x)=H(x)^{T}H(x)$.

While checking the convexity of a polynomial is generally NP-hard (c.f. Ahmadi et al. (2013)), s.o.s-convexity can be checked numerically by solving an SDP, see Helton and Nie (2010). The following result plays a significant role in this paper.

Lemma 2.13

(Helton and Nie, 2010, Lemma 8) Let $h\in{\mathbb{R}}[x]$ be s.o.s-convex. If $h(u)=0$ and $\nabla h(u)=0$ for some $u\in{\mathbb{R}}^{m}$, then $h$ is s.o.s.

3 Approximate semidefinite representations of $\mathbf{K}$

Since the index set $S$ in the definition of $\mathbf{K}$ is always assumed to be compact in this paper, we first show that a set $\mathbf{K}$ with a generic noncompact index set $S$ can be converted into a system with compact index set. Hereafter, by saying that a property holds for a generic index set $S$, we mean that it holds for $S$ in the following sense. If we consider the space of all coefficients of the generators $g_{j}$ of all possible sets $S$ of form (2) in the canonical monomial basis of ${\mathbb{R}}[y]_{d}$ with $d=\max_{j}\deg(g_{j})$, then the coefficients of the $g_{j}$'s of those index sets $S$ for which the property does not hold lie in a Zariski closed set of the space.

3.1 Noncompact case

In this subsection, we consider the set $\mathbf{K}$ in (1) with noncompact index set $S$. We use the technique of homogenization proposed in Wang and Guo (2013) to convert a semi-infinite system (1) with a generic noncompact index set into a system with compact index set.

For a polynomial $g(y)\in{\mathbb{R}}[y]$, denote its homogenization by ${g}^{\text{hom}}(\tilde{y})\in{\mathbb{R}}[\tilde{y}]$, where $\tilde{y}=(y_{0},y_{1},\ldots,y_{n})$, i.e., ${g}^{\text{hom}}(\tilde{y})=y_{0}^{\deg(g)}g(y/y_{0})$. For the basic semialgebraic set $S$ in (2), define

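\[ \begin{aligned} \widetilde{S}_{>}&:=\{\tilde{y}\in\mathbb{R}^{n+1}\mid g_{1}^{\text{hom}}(\tilde{y})\geq 0,\ldots,g_{s}^{\text{hom}}(\tilde{y})\geq 0,\ y_{0}>0,\ \|\tilde{y}\|_{2}^{2}=1\},\\ \widetilde{S}&:=\{\tilde{y}\in\mathbb{R}^{n+1}\mid g_{1}^{\text{hom}}(\tilde{y})\geq 0,\ldots,g_{s}^{\text{hom}}(\tilde{y})\geq 0,\ y_{0}\geq 0,\ \|\tilde{y}\|_{2}^{2}=1\}. \end{aligned} \]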
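The homogenization $g^{\text{hom}}(\tilde{y})=y_{0}^{\deg(g)}g(y/y_{0})$ used above amounts to padding every monomial with a power of $y_{0}$ up to the total degree. A minimal sketch, with a sparse dictionary representation and function names that are our own assumptions:

```python
from math import prod


def homogenize(poly):
    """Homogenize a polynomial given as {exponent tuple: nonzero coefficient}.

    Each monomial y^alpha becomes y0^(deg - |alpha|) * y^alpha, and y0 is
    stored as the first exponent of the new tuple.
    """
    degree = max(sum(alpha) for alpha in poly)
    return {(degree - sum(alpha),) + tuple(alpha): coeff
            for alpha, coeff in poly.items()}


def evaluate(poly, point):
    """Evaluate a sparse polynomial at a numeric point."""
    return sum(coeff * prod(v ** e for v, e in zip(point, alpha))
               for alpha, coeff in poly.items())


# g(y1, y2) = y1 - y2^2 + 3, so g_hom = y0*y1 - y2^2 + 3*y0^2
g = {(1, 0): 1, (0, 2): -1, (0, 0): 3}
g_hom = homogenize(g)
y0, y = 2.0, (3.0, 4.0)
print(evaluate(g_hom, (y0,) + y))
print(y0 ** 2 * evaluate(g, tuple(v / y0 for v in y)))   # same value
```

The two printed numbers agree, which is the defining identity of $g^{\text{hom}}$ checked at one sample point with $y_{0}\neq 0$.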

Proposition 3.1

(Wang and Guo, 2013, Proposition 4.2) For any $g(y)\in{\mathbb{R}}[y]$, $g(y)\geq 0$ on $S$ if and only if ${g}^{\text{hom}}(\tilde{y})\geq 0$ on ${\sf closure}(\widetilde{S}_{>})$.

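Let $d_{y}:=\deg_{y}(p(x,y))$ and $p^{\text{hom}}(x,\tilde{y})$ be the homogenization of $p(x,y)$ with respect to the variables $y$. It follows that the set $\mathbf{K}$ in (1) is equivalent to

\[ \{x\in\mathbb{R}^{m}\mid p^{\text{hom}}(x,\tilde{y})\geq 0,\ \forall\tilde{y}\in{\sf closure}(\widetilde{S}_{>})\}. \]

Replacing ${\sf closure}(\widetilde{S}_{>})$ by the basic semialgebraic set $\widetilde{S}$, we get the following set

\[ \widetilde{\mathbf{K}}:=\{x\in\mathbb{R}^{m}\mid p^{\text{hom}}(x,\tilde{y})\geq 0,\ \forall\tilde{y}\in\widetilde{S}\}. \]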

It is obvious that $\widetilde{\mathbf{K}}\subseteq\mathbf{K}$ since ${\sf closure}(\widetilde{S}_{>})\subseteq\widetilde{S}$ .

Definition 3.2

(Nie (2013)) $S$ is said to be closed at $\infty$ if ${\sf closure}(\widetilde{S}_{>})=\widetilde{S}$.

Remark 3.3

Clearly, $\mathbf{K}=\widetilde{\mathbf{K}}$ when $S$ is closed at $\infty$. Note that not every set $S$ of form (2) is closed at $\infty$ even when it is compact (Nie, 2012, Example 5.2). However, it is shown in (Wang and Guo, 2013, Theorem 4.10) that the closedness at $\infty$ is a generic property. It follows that $\mathbf{K}=\widetilde{\mathbf{K}}$ for generic index sets $S$. Note that $\widetilde{S}_{>}$ depends only on $S$, while $\widetilde{S}$ depends not only on $S$ but also on the choice of the inequalities $g_{1}(y)\geq 0,\ldots,g_{s}(y)\geq 0$. In some cases, we can add some redundant inequalities in the description of $S$ to force it to be closed at $\infty$ (cf. Guo et al. (2015)).

For any polynomial $g(y)\in{\mathbb{R}}[y]$ , denote $\hat{g}(y)$ as its homogeneous part of the highest degree. Define

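\[ \widehat{S}:=\{y\in\mathbb{R}^{n}\mid\hat{g}_{1}(y)\geq 0,\ldots,\hat{g}_{s}(y)\geq 0,\ \|y\|_{2}^{2}=1\}. \]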

In particular, denote $\hat{p}(x,y)$ as the homogeneous parts of $p(x,y)$ with respect to $y$ of the highest degree $d_{y}$ .

Definition 3.4

We say that the extended Slater condition holds for $\mathbf{K}$ if there exists a point $u\in\mathbf{K}$ such that $p(u,y)>0$ for all $y\in S$ and $\hat{p}(u,y)>0$ for all $y\in\widehat{S}$. We call $u$ an extended Slater point of $\mathbf{K}$.

Proposition 3.5

The Slater condition holds for $\widetilde{\mathbf{K}}$ if and only if the extended Slater condition holds for $\mathbf{K}$ .

Proof. Suppose that $u$ is an extended Slater point of $\mathbf{K}$ . For any $\tilde{v}=(v_{0},v)\in\widetilde{S}$ , we have $v\in\widehat{S}$ if $v_{0}=0$ and $v/v_{0}\in S$ otherwise. It is straightforward to verify that the Slater condition also holds for $\widetilde{\mathbf{K}}$ at $u$ .

Suppose that the Slater condition holds for $\widetilde{\mathbf{K}}$ at $u\in{\mathbb{R}}^{m}$ . For any point $v\in{\mathbb{R}}^{n}$ , we have $(0,v)\in\widetilde{S}$ if $v\in\widehat{S}$ and $\left(\frac{1}{\sqrt{1+\|v\|_{2}^{2}}},\frac{v}{\sqrt{1+\|v\|_{2}^{2}}}\right)\in\widetilde{S}$ if $v\in S$ . Then similarly, it implies that the extended Slater condition holds for $\mathbf{K}$ at $u$ . $\square$

As a result of the above arguments, it is reasonable to consider the following assumption in the rest of this paper.

Assumption 3.6

The set $S$ is compact, $-p(x,y)\in{\mathbb{R}}[x]$ is convex for any $y\in S$ and the Slater condition holds for $\mathbf{K}$ .

3.2 Approximate semidefinite representations of $\mathbf{K}$

We assume that $\mathbf{K}$ in (1) is compact and a scalar $\tau_{\mathbf{K}}$ such that $\|x\|_{2}\leq\tau_{\mathbf{K}}$ for any $x\in\mathbf{K}$ is known. For $r\in\mathbb{N}$ , define

[TABLE]

It is clear that $\Theta_{r}(x)\leq 1$ for any $x\in\mathbf{K}$ and $r\in\mathbb{N}$ . Denote by $\mathbf{B}$ the unit ball in ${\mathbb{R}}^{m}$ . Recall the notation $d_{S}$ in (7) and let

[TABLE]

For $\mathscr{L}\in({\mathbb{R}}[x])^{*}$ (resp., $\mathscr{H}\in({\mathbb{R}}[y])^{*}$ ), denote by $\mathscr{L}(p(x,y))$ (resp., $\mathscr{H}(p(x,y))$ ) the image of $\mathscr{L}$ (resp., $\mathscr{H}$ ) on $p(x,y)$ regarded as an element in ${\mathbb{R}}[x]$ (resp., ${\mathbb{R}}[y]$ ) with coefficients in ${\mathbb{R}}[y]$ (resp., ${\mathbb{R}}[x]$ ), i.e., $\mathscr{L}(p(x,y))\in{\mathbb{R}}[y]$ (resp., $\mathscr{H}(p(x,y))\in{\mathbb{R}}[x]$ ). Hence, some notation, like $\mathscr{H}(\mathscr{L}(p(x,y)))$ , should cause no confusion once the dual spaces where the linear functionals $\mathscr{L}$ and $\mathscr{H}$ come from are specified in the context.
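To make the notation $\mathscr{L}(p(x,y))\in{\mathbb{R}}[y]$ concrete, here is a small sketch under stated assumptions. The dictionary encoding of $p$ (keys are pairs of exponent tuples in $x$ and in $y$) and the helper names `apply_in_x` and `atomic_moments` are illustrative choices, not part of the paper. The polynomial is the one from Example 3.9 (2), and the two functionals are built from hypothetical atomic measures: a single point evaluation, and an equal-weight mixture of two points.

```python
def apply_in_x(moments, p):
    """Apply a linear functional on R[x], given by its moments L(x^alpha),
    to p(x, y) coefficientwise; the result is a polynomial in y."""
    image = {}
    for (alpha, beta), coeff in p.items():
        image[beta] = image.get(beta, 0.0) + coeff * moments[alpha]
    return image

def monomial(u, alpha):
    """Value of x^alpha at the point u."""
    value = 1.0
    for ui, ai in zip(u, alpha):
        value *= ui ** ai
    return value

def atomic_moments(atoms, weights, alphas):
    """Moments of the atomic measure sum_k weights[k] * Dirac(atoms[k])."""
    return {alpha: sum(w * monomial(u, alpha) for u, w in zip(atoms, weights))
            for alpha in alphas}

# p(x, y) = -x1^2 - 2*y2*x1*x2 - y1*x2^2 - x1 - x2, keyed by (alpha, beta).
p = {
    ((2, 0), (0, 0)): -1.0,
    ((1, 1), (0, 1)): -2.0,
    ((0, 2), (1, 0)): -1.0,
    ((1, 0), (0, 0)): -1.0,
    ((0, 1), (0, 0)): -1.0,
}
alphas = {alpha for alpha, _ in p}

# Point evaluation at u = (-0.5, -0.5): the image is p(u, y).
point = atomic_moments([(-0.5, -0.5)], [1.0], alphas)
image_point = apply_in_x(point, p)

# Equal-weight mixture of two atoms: the image is NOT p(mean, y) in general.
mixture = atomic_moments([(-0.5, -0.5), (0.5, 0.5)], [0.5, 0.5], alphas)
image_mixture = apply_in_x(mixture, p)
print(image_point, image_mixture)
```

For the point evaluation the image is $0.75-0.25y_{1}-0.5y_{2}=p(u,y)$. For the mixture, whose mean is the origin, the image is $-0.25-0.25y_{1}-0.5y_{2}$, which differs from $p(0,y)=0$ because the functional also records second-order moments.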

Theorem 3.7

Suppose that $\mathbf{K}$ is compact. For any integers $r\geq\lceil d_{x}/2\rceil$ and $t\geq d_{\mathbf{K}}$ , define

[TABLE]

Then, $\Lambda_{r_{2},t}\subseteq\Lambda_{r_{1},t}$ for any $r_{2}>r_{1}\geq\lceil d_{x}/2\rceil$ and $\Lambda_{r,t_{2}}\supseteq\Lambda_{r,t_{1}}$ for any $t_{2}>t_{1}\geq d_{\mathbf{K}}$ . If Assumption 3.6 holds, then the following are true.

(i)

For any $\varepsilon>0$ , there exists an integer $r(\varepsilon)\geq\lceil d_{x}/2\rceil$ such that for every $r\geq r(\varepsilon)$ and $t\geq d_{\mathbf{K}}$ , it holds that $\Lambda_{r,t}\subseteq\mathbf{K}+\varepsilon\mathbf{B}$ . If $\mathbf{qmodule}(G)$ is Archimedean, then there exists an integer $t(\varepsilon)\geq d_{\mathbf{K}}$ such that for every $r\geq\lceil d_{x}/2\rceil$ and $t\geq t(\varepsilon)$ , it holds that $\mathbf{K}\subseteq\Lambda_{r,t}+\varepsilon\mathbf{B}$ . Consequently, $\Lambda_{r,t}$ converges to $\mathbf{K}$ as $r$ and $t$ both tend to $\infty$ ;

(ii)

If the Lagrangian $L_{f}(x)$ as defined in $(\ref{eq::lag})$ is s.o.s for every linear $f\in{\mathbb{R}}[x]$ , then $\mathbf{K}\supseteq\Lambda_{r,t_{2}}\supseteq\Lambda_{r,t_{1}}$ for any $r\geq\lceil d_{x}/2\rceil$ , $t_{2}>t_{1}\geq d_{\mathbf{K}}$ . For any $\varepsilon>0$ , if moreover, $\mathbf{qmodule}(G)$ is Archimedean, then there exists integer $t(\varepsilon)\geq d_{\mathbf{K}}$ such that $\mathbf{K}\subseteq\Lambda_{r,t}+\varepsilon\mathbf{B}$ for any $r\geq\lceil d_{x}/2\rceil$ , $t\geq t(\varepsilon)$ . Consequently, $\Lambda_{r,t}$ converges to $\mathbf{K}$ as $t$ tends to $\infty$ for any $r\geq\lceil d_{x}/2\rceil$ .

Proof. For a fixed $x\in\Lambda_{r_{2},t}$ , there exists $\mathscr{L}\in({\mathbb{R}}[x]_{2r_{2}})^{*}$ satisfying conditions in (10) for $\Lambda_{r_{2},t}$ . Let $\mathscr{L}^{\prime}$ be the restriction of $\mathscr{L}$ to ${\mathbb{R}}[x]_{2r_{1}}$ . Then, it is clear that $\mathscr{L}^{\prime}$ satisfies all conditions in (10) for $\Lambda_{r_{1},t}$ and thus $x\in\Lambda_{r_{1},t}$ . Similarly, if $x\in\Lambda_{r,t_{1}}$ , then $x\in\Lambda_{r,t_{2}}$ for any $t_{2}>t_{1}\geq d_{\mathbf{K}}$ .

(i). Fix an $\varepsilon>0$ and a point $v\not\in\mathbf{K}+\varepsilon\mathbf{B}$ . Now we prove that there is some integer $r(\varepsilon)$ that does not depend on $v$ such that $v\not\in\Lambda_{r,t}$ for every $r\geq r(\varepsilon)$ and $t\geq d_{\mathbf{K}}$ , which implies that $\Lambda_{r,t}\subseteq\mathbf{K}+\varepsilon\mathbf{B}$ . By (Lasserre, 2009b, Lemma 5), there exist $a\in{\mathbb{R}}^{m}$ and $b=\min_{x\in\mathbf{K}}a^{T}x$ satisfying $\|a\|_{2}=1$ and $|b|\leq\tau_{\mathbf{K}}$ such that $a^{T}x-b\geq 0$ for any $x\in\mathbf{K}$ and $a^{T}v-b<-\varepsilon$ . Consider the optimization problem $\min_{x\in\mathbf{K}}a^{T}x-b$ . By Corollary 2.3, the associated Lagrangian $L_{a,b}(x):=a^{T}x-b-\sum_{j=1}^{l}\lambda_{j}p(x,y_{j})$ as defined in (5) is nonnegative on ${\mathbb{R}}^{m}$ for some $y_{1},\ldots,y_{l}\in S$ and nonnegative $\lambda_{1},\ldots,\lambda_{l}\in{\mathbb{R}}$ . In particular, $L_{a,b}$ is nonnegative on $[-\tau_{\mathbf{K}},\tau_{\mathbf{K}}]^{m}$ . By Theorem 2.4 (iii), there is some integer $r(\varepsilon)\geq\lceil d_{x}/2\rceil$ such that for any $r\geq r(\varepsilon)$ , it holds that

[TABLE]

for some $\tilde{q}\in{\mathbb{R}}[x]$ . As $r\geq r(\varepsilon)\geq\lceil d_{x}/2\rceil$ , we have $\deg(\tilde{q}^{2})\leq 2r$ . Now we show that $r(\varepsilon)$ does not depend on $v$ . According to (Lasserre and Netzer, 2007, Sec. 3.3), $r(\varepsilon)$ depends on $\varepsilon$ , the dimension $m$ and the size of $a$ , $b$ , $\lambda_{j}$ ’s and the coefficients $p(x,y_{j})$ regarded as polynomials in ${\mathbb{R}}[x]$ . Fix a Slater point $u_{0}\in\mathbf{K}$ , since $a^{T}u_{0}-b-\sum_{j=1}^{l}\lambda_{j}p(u_{0},y_{j})\geq 0$ , as proved in (Lasserre, 2009b, Lemma 7), we have

[TABLE]

where $p_{u_{0}}^{*}:=\min_{y\in S}p(u_{0},y)>0$ since $u_{0}$ is a Slater point and $S$ is compact. Write $p(x,y_{j})=\sum_{\alpha}p_{x,\alpha}(y_{j})x^{\alpha}$ , then $p_{x,\alpha}(y_{j})\leq\max_{\alpha}\max_{y\in S}p_{x,\alpha}(y)$ . Hence, all $a$ , $b$ , $\lambda_{j}$ ’s and $p_{x,\alpha}(y_{j})$ ’s are uniformly bounded, which means that $r(\varepsilon)$ does not depend on $v$ . Now, for any $r\geq r(\varepsilon)$ and $t\geq d_{\mathbf{K}}$ , assume to the contrary that $v\in\Lambda_{r,t}$ . Then, there exists $\mathscr{L}$ satisfying the conditions in (10) for $\Lambda_{r,t}$ with $\mathscr{L}(x_{i})=v_{i}$ . Let $\mu=\sum_{j=1}^{l}\lambda_{j}\delta_{y_{j}}$ where $\delta_{y_{j}}$ denotes the Dirac measure at $y_{j}$ . As $\deg(\tilde{q}^{2})\leq 2r$ , it holds that

[TABLE]

which is a contradiction. Thus, $v\not\in\Lambda_{r,t}$ and $\Lambda_{r,t}\subseteq\mathbf{K}+\varepsilon\mathbf{B}$ .
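The separating pair $(a,b)$ used at the start of this argument can be written down explicitly in a toy case. The sketch below assumes, purely for illustration, that $\mathbf{K}$ is the closed unit ball of ${\mathbb{R}}^{2}$ (so $\tau_{\mathbf{K}}=1$); the function names are hypothetical. For $v$ with $\|v\|_{2}>1+\varepsilon$ it takes $a=-v/\|v\|_{2}$ and $b=\min_{x\in\mathbf{K}}a^{T}x=-1$, so that $a^{T}v-b=1-\|v\|_{2}<-\varepsilon$.

```python
import math
import random

def separating_pair(v):
    """For K the unit ball (illustrative assumption), return a unit vector a
    and b = min_{x in K} a^T x with a^T x - b >= 0 on K."""
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    a = tuple(-vi / norm_v for vi in v)
    b = -1.0  # minimum of a^T x over the unit ball, attained at x = -a
    return a, b

def affine(a, b, x):
    """Value of the affine function a^T x - b at x."""
    return sum(ai * xi for ai, xi in zip(a, x)) - b

def sample_unit_ball(count, dim, seed=0):
    """Rejection-sample points of the unit ball from the cube [-1, 1]^dim."""
    rng = random.Random(seed)
    points = []
    while len(points) < count:
        x = tuple(rng.uniform(-1.0, 1.0) for _ in range(dim))
        if sum(xi * xi for xi in x) <= 1.0:
            points.append(x)
    return points

eps = 0.25
v = (1.2, -0.9)                      # |v| = 1.5 > 1 + eps
a, b = separating_pair(v)
gap_at_v = affine(a, b, v)           # equals 1 - |v|
min_on_ball = min(affine(a, b, x) for x in sample_unit_ball(1000, 2))
print(gap_at_v, min_on_ball)
```

Here $\|a\|_{2}=1$ and $|b|\leq\tau_{\mathbf{K}}$, matching the normalization in the proof; for a general $\mathbf{K}$ the pair comes from the cited lemma rather than from a closed formula.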

Fix a Slater point $u_{0}\in\mathbf{K}$ . Let $u\in\mathbf{K}$ be arbitrary. Now we first prove that there exist a point $\bar{u}\in\mathbf{K}$ and an integer $t(\varepsilon)$ that does not depend on $u$ (in fact, it depends on $\varepsilon,\mathbf{K},S,u_{0},p(x,y),g_{j}$ ’s) such that $\|u-\bar{u}\|_{2}\leq\varepsilon$ and $\bar{u}\in\Lambda_{r,t}$ for every $r\geq\lceil d_{x}/2\rceil$ and $t\geq t(\varepsilon)$ , which implies that $\mathbf{K}\subseteq\Lambda_{r,t}+\varepsilon\mathbf{B}$ . If $\|u-u_{0}\|_{2}\leq\varepsilon$ , then let $\bar{u}=u_{0}$ ; otherwise, let $\lambda=\varepsilon/{\|u_{0}-u\|_{2}}$ and $\bar{u}=\lambda u_{0}+(1-\lambda)u$ , then we have $1>\lambda\geq\frac{\varepsilon}{2\tau_{\mathbf{K}}}$ , $\|u-\bar{u}\|_{2}=\lambda\|u_{0}-u\|_{2}=\varepsilon$ and

[TABLE]

Let $\kappa(\varepsilon):=\min\{\frac{\varepsilon}{2\tau_{\mathbf{K}}},1\}$ . Then, in either case, it follows that

[TABLE]

for any $y\in S$ . Write $p(\bar{u},y)=\sum_{\beta}p_{y,\beta}(\bar{u})y^{\beta}\in{\mathbb{R}}[y]$ . Recall the norm defined in (6), then

[TABLE]

As $\mathbf{K}$ is compact, $N_{p}$ is well-defined. Note that $N_{p}$ does not depend on $u$ but only on $p$ and $\mathbf{K}$ . By Theorem 2.7, there exists some positive $c$ depending on $g_{j}$ ’s such that $p(\bar{u},y)\in\mathbf{qmodule}_{t}(G)$ whenever

[TABLE]

For any $r\geq\lceil d_{x}/2\rceil$ , define a linear functional $\mathscr{L}\in({\mathbb{R}}[x]_{2r})^{*}$ by $\mathscr{L}(x^{\alpha})=\bar{u}^{\alpha}$ for all $\alpha\in\mathbb{N}^{m}_{2r}$ . Then, it is clear that $\mathscr{L}(x_{i})=\bar{u}_{i}$ for $i=1,\ldots,m$ , $\mathscr{L}(\Theta_{k})\leq 1$ for $k=\lceil d_{x}/2\rceil,\ldots,r$ and $\mathscr{L}(q^{2})\geq 0$ for all $q\in{\mathbb{R}}[x]_{r}$ . We have $\mathscr{L}(p(x,y))=p(\bar{u},y)$ . It implies that $\bar{u}\in\Lambda_{r,t}$ and thus $\mathbf{K}\subseteq\Lambda_{r,t}+\varepsilon\mathbf{B}$ for every $r\geq\lceil d_{x}/2\rceil$ and $t\geq t(\varepsilon)$ .
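The first step of this part of the proof, moving $u$ a distance at most $\varepsilon$ toward the Slater point $u_{0}$, can be checked on a toy instance. The sketch below is an illustration under assumptions that are not from the paper: $p(x,y)=1-(x_{1}-y)^{2}-x_{2}^{2}$ with $S=[-0.5,0.5]$ (so $\mathbf{K}$ is an intersection of unit disks and $\tau_{\mathbf{K}}=1$ is a valid bound), $u_{0}=0$, and minima over $S$ computed on a grid. Because $p(\cdot,y)$ is concave, $p(\bar{u},y)\geq\lambda p(u_{0},y)+(1-\lambda)p(u,y)\geq\lambda p_{u_{0}}^{*}$ for $u\in\mathbf{K}$.

```python
import math

def p(x, y):
    """Toy constraint, concave in x for every y (illustrative assumption)."""
    return 1.0 - (x[0] - y) ** 2 - x[1] ** 2

# Grid on the hypothetical index set S = [-0.5, 0.5], endpoints included.
S_GRID = [-0.5 + k / 100 for k in range(101)]

def min_over_S(x):
    """Grid approximation of min_{y in S} p(x, y)."""
    return min(p(x, y) for y in S_GRID)

def pull_toward(u, u0, eps):
    """Return (u_bar, lam) with u_bar = lam*u0 + (1-lam)*u and |u - u_bar| <= eps."""
    dist = math.dist(u, u0)
    if dist <= eps:
        return tuple(u0), 1.0
    lam = eps / dist
    return tuple(lam * a + (1.0 - lam) * b for a, b in zip(u0, u)), lam

tau = 1.0
eps = 0.1
u0 = (0.0, 0.0)                       # Slater point: p(u0, y) = 1 - y^2 > 0
u = (0.0, math.sqrt(0.75))            # boundary point of K
u_bar, lam = pull_toward(u, u0, eps)

p_star = min_over_S(u0)               # 0.75, attained at y = +-0.5
kappa = min(eps / (2.0 * tau), 1.0)
print(min_over_S(u_bar), lam * p_star, kappa * p_star)
```

The point $\bar{u}$ stays within $\varepsilon$ of $u$ while $p(\bar{u},\cdot)$ is bounded below on $S$ by a positive constant that depends only on $\varepsilon$, $\tau_{\mathbf{K}}$ and $p_{u_{0}}^{*}$, which is what allows a uniform order $t(\varepsilon)$.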

(ii). By (i), we only need to prove $\Lambda_{r,t}\subseteq\mathbf{K}$ for any $r\geq\lceil d_{x}/2\rceil$ and $t\geq d_{\mathbf{K}}$ . Fix a point $v\not\in\mathbf{K}$ . By the Separation Theorem of convex sets, there exist $a\in{\mathbb{R}}^{m}$ and $b\in{\mathbb{R}}$ such that $a^{T}x-b\geq 0$ for any $x\in\mathbf{K}$ and $a^{T}v-b<0$ . As proved in (i), there are some $y_{1},\ldots,y_{l}\in S$ and nonnegative $\lambda_{1},\ldots,\lambda_{l}\in{\mathbb{R}}$ such that $a^{T}x-b-\sum_{j=1}^{l}\lambda_{j}p(x,y_{j})$ is nonnegative on ${\mathbb{R}}^{m}$ . Since the associated Lagrangian $L_{f}(x)$ is s.o.s for every linear function $f$ , we have

[TABLE]

for some $\tilde{q}\in{\mathbb{R}}[x]$ . To the contrary, assume that $v\in\Lambda_{r,t}$ . Then, there exist $\mathscr{L}$ satisfying the conditions in (10) for $\Lambda_{r,t}$ . Define $\mu$ as in (i). Like in (12), we get that

[TABLE]

which is a contradiction. Thus, $v\not\in\Lambda_{r,t}$ and hence $\Lambda_{r,t}\subseteq\mathbf{K}$ . $\square$

Remark 3.8

(i). According to the proof, the conclusions (i) and (ii) in Theorem 3.7 are still true if we replace the condition $\mathscr{L}(\Theta_{k})\leq 1,\ k=\lceil d_{x}/2\rceil,\ldots,r$ in $(\ref{eq::lambda})$ by $\mathscr{L}(\Theta_{r})\leq 1$ .

(ii). In practice, we can let $r=t$ in $\Lambda_{r,t}$ and approximate $\mathbf{K}$ by one sequence $\{\Lambda_{r,r}\}$ . Suppose that $\mathbf{qmodule}(G)$ is Archimedean, then by Theorem 3.7 (i), for any $\varepsilon>0$ , there exists $r\geq\max\{\lceil d_{x}/2\rceil,d_{\mathbf{K}}\}$ such that $\Lambda_{r,r}\subseteq\mathbf{K}+\varepsilon\mathbf{B}$ and $\mathbf{K}\subseteq\Lambda_{r,r}+\varepsilon\mathbf{B}$ . That is, $\{\Lambda_{r,r}\}$ can approximate $\mathbf{K}$ as closely as possible as $r$ increases.

(iii). If $S$ is compact but $\mathbf{qmodule}(G)$ is not Archimedean, then the set $\mathbf{qmodule}_{t}(G)$ in the definition of $\Lambda_{r,t}$ in $(\ref{eq::lambda})$ can be replaced by the $t$ -th order preordering in Schmüdgen’s representations of polynomials positive on $S$ (Schmüdgen (1991)). Moreover, if we have exact representations of polynomials nonnegative on $S$ in some cases, we may fix the order $t$ in $\Lambda_{r,t}$ and only let $r$ increase. Then, a sequence of nested outer approximate semidefinite representations of $\mathbf{K}$ can be obtained. For instance, consider the case

[TABLE]

By the representations of univariate polynomials nonnegative on an interval (cf. Powers and Reznick (2000); Laurent (2009)), we can fix $t=d_{\mathbf{K}}$ and then the sequence $\Lambda_{r,d_{\mathbf{K}}}$ converges to $\mathbf{K}$ as $r$ tends to $\infty$ . We omit the details to keep the paper concise.

(iv). If the Lagrangian $L_{f}(x)$ is s.o.s for every linear $f\in{\mathbb{R}}[x]$ , by the proof of Theorem 3.7 (ii), the condition $\mathscr{L}(\Theta_{k})\leq 1,\ k=\lceil d_{x}/2\rceil,\ldots,r$ is redundant and can be removed. In general, it may be difficult to check whether or not the Lagrangian $L_{f}(x)$ is s.o.s for every linear $f\in{\mathbb{R}}[x]$ . However, when $-p(x,y)$ is s.o.s-convex in $x$ for any $y\in S$ , by Corollary 2.3 and Lemma 2.13, $L_{f}(x)$ is indeed s.o.s for any s.o.s-convex $f\in{\mathbb{R}}[x]$ (in particular, for every linear $f\in{\mathbb{R}}[x]$ ). Moreover, if $S$ is in the case (15) and $-p(x,y)$ is s.o.s-convex in $x$ for any $y\in S$ , then we have the exact semidefinite representation $\mathbf{K}=\Lambda_{r,t}$ for any $r\geq\lceil d_{x}/2\rceil$ and $t\geq d_{\mathbf{K}}$ . $\hfill\square$

Note that the standard semidefinite representation (3) of $\Lambda_{r,t}$ can be easily generated using Yalmip (Löfberg (2004)). Moreover, for $m=2$ and $3$ , we can first generate the form (3) of $\Lambda_{r,t}$ and then draw it using the software package Bermeja (Rostalski (2010)).

Example 3.9

Now we present some illustrative examples. As we shall see, the approximate semidefinite representations defined in this section are very tight for some given sets $\mathbf{K}$ .

(1).

Consider the polynomial

[TABLE]

It is proved in Ahmadi and Parrilo (2012) that $f(x_{1},x_{2},1)\in{\mathbb{R}}[x_{1},x_{2}]$ is convex but not s.o.s-convex. Rotate the shape in the $(x_{1},x_{2})$ -plane defined by $f(x_{1},x_{2},1)\leq 100$ continuously around the origin by $90^{\circ}$ clockwise. Denote by $\mathbf{K}$ the common area of these shapes in this process. We illustrate $\mathbf{K}$ in the left of Figure 1. In other words, the set $\mathbf{K}$ is defined by

[TABLE]

where $p(x_{1},x_{2},y_{1},y_{2})=100-f(y_{1}x_{1}-y_{2}x_{2},y_{2}x_{1}+y_{1}x_{2},1)$ and

[TABLE]

It is clear that the assumptions in Theorem 3.7 hold for $\mathbf{K}$ and $d_{x}=d_{y}=8$ , $d_{\mathbf{K}}=4$ . Using the software Bermeja, the semidefinite representation set $\Lambda_{4,4}$ as defined in $(\ref{eq::lambda})$ is drawn in gray bounded by the red curve in the right of Figure 1.

(2).

Consider the set

[TABLE]

where $p(x_{1},x_{2},y_{1},y_{2})=-x_{1}^{2}-2y_{2}x_{1}x_{2}-y_{1}x_{2}^{2}-x_{1}-x_{2}$ and

[TABLE]

We illustrate $\mathbf{K}$ in the left of Figure 2 by using some grid of $S$ . The Hessian matrix of $p$ with respect to $x_{1}$ and $x_{2}$ is

\[ \nabla_{x}^{2}p=\begin{bmatrix}-2&-2y_{2}\\ -2y_{2}&-2y_{1}\end{bmatrix}. \]

Clearly, $-p(x_{1},x_{2},y_{1},y_{2})$ is s.o.s-convex in $(x_{1},x_{2})$ for every $y\in S$ . We have $d_{x}=2,d_{y}=1$ and $d_{\mathbf{K}}=1$ . The semidefinite representation set $\Lambda_{1,1}$ is drawn in gray bounded by the red curve in the right of Figure 2.
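The grid-based picture of $\mathbf{K}$ mentioned in this example can be imitated with a short membership test. The sketch below is only an illustration: the index set of the example is not reproduced here, so a stand-in $S=\{y\mid y_{2}^{2}\leq y_{1}\leq 1\}$ is assumed (chosen because $-p(\cdot,y)$ is convex exactly when $y_{1}\geq y_{2}^{2}$), and the names `grid_of_S`, `J` and `in_outer_approx` are hypothetical. Checking $p(x,y)\geq 0$ only on grid points of $S$ relaxes the constraints, so the test describes a set containing $\mathbf{K}$ for this stand-in $S$.

```python
def p(x, y):
    """p from Example 3.9 (2)."""
    x1, x2 = x
    y1, y2 = y
    return -x1 ** 2 - 2.0 * y2 * x1 * x2 - y1 * x2 ** 2 - x1 - x2

def grid_of_S(n=40):
    """Grid points of the stand-in index set {y : y2^2 <= y1 <= 1}
    (an assumption for illustration, not the set used in the paper)."""
    points = []
    for i in range(n + 1):
        y2 = -1.0 + 2.0 * i / n
        for j in range(n + 1):
            y1 = j / n
            if y1 >= y2 * y2:
                points.append((y1, y2))
    return points

def J(x, grid):
    """Grid approximation of min_{y in S} p(x, y)."""
    return min(p(x, y) for y in grid)

def in_outer_approx(x, grid, tol=0.0):
    """True if p(x, y) >= -tol at every grid point y."""
    return J(x, grid) >= -tol

grid = grid_of_S()
print(in_outer_approx((-0.5, 0.0), grid))   # p = 0.25 for every y
print(in_outer_approx((1.0, 1.0), grid))    # p(x, (0, 0)) = -3 < 0
```

Refining the grid shrinks the sampled set toward $\mathbf{K}$, whereas $\Lambda_{r,t}$ is obtained without any discretization of $S$.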

3.3 More discussions

Now we would like to interpret the semidefinite approximations $\Lambda_{r,t}$ for $\mathbf{K}$ in a dual view. We shall explain why these semidefinite approximations need two indices and whether or not we can approximate the convex hull of $\mathbf{K}$ in a similar way if the convexity in $x$ is removed from the constraints functions $-p(x,y)$ for $y\in S$ .

It is clear that the convex hull of a subset in ${\mathbb{R}}^{m}$ is the intersection of half spaces defined by hyperplanes tangent to this subset. Hence, to obtain semidefinite approximations of the convex hull of a subset in ${\mathbb{R}}^{m}$ , it is key to characterize linear functions nonnegative on the subset via s.o.s of polynomials. If $\mathbf{K}$ is defined by finitely many polynomial inequalities, a linear function $a^{T}x+b$ nonnegative on $\mathbf{K}$ can be represented by Putinar’s (or Schmüdgen’s) Positivstellensatz and convergent semidefinite approximations of $\mathbf{K}$ can be derived by increasing the degrees of s.o.s of polynomials involved in the representation, see Lasserre (2009b); Gouveia et al. (2010). However, as the set $\mathbf{K}$ in our case is defined by infinitely many polynomial inequalities, the Positivstellensatz cannot be directly used here. Nevertheless, when $-p(x,y)$ is convex in $x$ for all $y\in S$ and the assumptions in Corollary 2.3 hold, there exists an (atomic) measure $\mu$ supported on $S$ for each $a^{T}x+b$ such that the associated Lagrangian $L_{a,b}(x)=a^{T}x+b-\int p(x,y)\mathrm{d}\mu(y)\geq 0$ on ${\mathbb{R}}^{m}$ . Then, to obtain semidefinite approximations of $\mathbf{K}$ , we can use Lasserre’s s.o.s representation via high degree perturbations (Theorem 2.4) to characterize this inequality and the dual of Putinar’s Positivstellensatz (Theorem 2.9) to replace the unknown measure $\mu$ by a linear functional in $(\mathbf{qmodule}_{t}(G))^{*}$ . Consequently, in the dual, the resulting semidefinite approximations $\Lambda_{r,t}$ are defined in the way (10) and need two indices, i.e., one to bound the degree of the perturbation and the other to bound the order of the quadratic module.

From the above arguments, we can also see that if the convexity in $x$ is removed from $-p(x,y)$ for $y\in S$ , the convex hull of $\mathbf{K}$ cannot be approximated as closely as possible in a way similar to how $\Lambda_{r,t}$ is defined. To see this, recall that even in the finitely many constraints case mentioned above, one needs to increase the degree of s.o.s of polynomials involved in Putinar’s (or Schmüdgen’s) representation of $a^{T}x+b$ to obtain convergent semidefinite approximations. In the infinitely many constraints case, to formulate the nonnegative Lagrangian $L_{a,b}$ , we need a measure $\mu$ for $a^{T}x+b$ to encode those active $y\in S$ (Corollary 2.3). As $\mu$ is unknown, we cannot further parameterize unknown s.o.s of polynomials in the integral $\int p(x,y)\mathrm{d}\mu(y)$ and increase the degree; otherwise, bi-linearity occurs and thus semidefinite approximations cannot be derived. Moreover, such a measure $\mu$ for $a^{T}x+b$ may not even exist if the convexity in $x$ is removed from $-p(x,y)$ . Therefore, it is still a challenge to construct semidefinite approximations of the convex hull of semi-algebraic sets defined by infinitely many arbitrary polynomial inequalities.

In Lasserre (2015); Magron et al. (2015), some tractable methods using semidefinite programs are proposed to approximate semi-algebraic sets defined with quantifiers. Clearly, the set $\mathbf{K}$ studied in this paper is such a set, with a universal quantifier. To end this section, we would like to point out the differences in methodology and contributions between the present paper and the above two references. The following is the basic idea of Lasserre (2015); Magron et al. (2015) to get approximations of $\mathbf{K}$ in (1). For $x\in{\mathbb{R}}^{m}$ , define the map $J_{p}(x):=\min_{y\in S}p(x,y)$ . Then, we have $\mathbf{K}=\{x\in{\mathbb{R}}^{m}\mid J_{p}(x)\geq 0\}$ . Suppose that $\mathbf{K}$ is contained in a compact set $\mathscr{B}$ in ${\mathbb{R}}^{m}$ , then it can be proved that there exists a sequence of polynomials $\{q_{k}\}_{k}\subseteq{\mathbb{R}}[x]$ such that $q_{k}(x)\leq J_{p}(x)$ for all $x\in\mathscr{B}$ and $q_{k}$ converges to $J_{p}$ for the $L_{1}(\mathscr{B})$ -norm. Hence, $\mathbf{K}$ can be approximated by $\{x\in{\mathbb{R}}^{m}\mid q_{k}(x)\geq 0\}$ . As $p(x,y)-q_{k}(x)\geq 0$ for every $x\in\mathscr{B}$ and $y\in S$ , we can use Positivstellensatz in ${\mathbb{R}}[x,y]$ to reduce the problem of computing such a sequence $\{q_{k}\}_{k}$ to SDP problems. Therefore, the method in Lasserre (2015); Magron et al. (2015) works for $\mathbf{K}$ in a general form without requiring $-p(x,y)$ to be convex in $x$ and approximates $\mathbf{K}$ by a sequence of sublevel sets, each defined by a single polynomial. Instead, we exploit the convexity of the defining polynomials of $\mathbf{K}$ and construct semidefinite approximations for it. Note that the polynomials $q_{k}$ ’s in the method of Lasserre (2015); Magron et al. (2015) can be enforced to be convex for $\mathbf{K}$ in (1) (see (Lasserre, 2015, Section 4.2)), but the convergence and the semidefinite representability of the sublevel sets are not clear to the best of our knowledge.

4 SDP relaxations of convex semi-infinite polynomial programming

For a convex polynomial $f(x)\in{\mathbb{R}}[x]$ , consider the following convex semi-infinite polynomial programming problem

[TABLE]

Let $d_{P}:=\max\{\deg(f),d_{x}\}$ and $\mathcal{M}(S)$ be the set of all (nonnegative) Borel measures supported on $S$ .

4.1 General case

Consider the case when $\mathbf{K}$ is compact and Assumption 3.6 holds. In the following, we will obtain SDP relaxations of $(\mathbf{P})$ in two steps.

In the first step, for any integer $r\geq\lceil d_{P}/2\rceil$ , we convert $(\mathbf{P})$ to the problem

[TABLE]

For $\mathscr{L}\in({\mathbb{R}}[x]_{2r})^{*}$ , $\xi\geq 0$ , $\rho\in{\mathbb{R}}$ , $\eta\geq 0$ , $\mu\in\mathcal{M}(S)$ and $q\in{\mathbb{R}}[x]_{r}$ , consider the Lagrange dual function of (16):

[TABLE]

Then,

[TABLE]

Hence, the Lagrange dual problem of (16) reads

[TABLE]

Definition 4.1

We call $\mathscr{L}^{(r)}\in({\mathbb{R}}[x]_{2r})^{*}$ with $r\geq\lceil d_{P}/2\rceil$ a nearly optimal solution of $(\ref{eq::f*rdual})$ if $\mathscr{L}^{(r)}$ is feasible for $(\ref{eq::f*rdual})$ and the limit of $\mathscr{L}^{(r)}(f)$ is equal to the limit of the optimal values of $(\mathbf{P}^{*}_{r})$ as $r\rightarrow\infty$ .

Theorem 4.2

Suppose that $f(x)$ is convex, $\mathbf{K}$ is compact and Assumption 3.6 holds. Let $\mathscr{L}^{(r)}$ be a nearly optimal solution of $(\ref{eq::f*rdual})$ and $\mathscr{L}^{(r)}(x)=(\mathscr{L}^{(r)}(x_{1}),\ldots,\mathscr{L}^{(r)}(x_{m}))$ .

(i)

$f^{*}_{r}\leq f^{*}$ and $f^{*}_{r}$ converges to $f^{*}$ as $r$ tends to $\infty$ ;

(ii)

$f^{*}_{r}$ is attainable in $(\ref{eq::f*r})$ and there is no duality gap between $(\ref{eq::f*r})$ and $(\ref{eq::f*rdual})$ ;

(iii)

Assume that $\tau_{\mathbf{K}}=1$ $($ possibly after scaling $)$ . Then, for any convergent subsequence $\{\mathscr{L}^{(r_{i})}(x)\}_{i}$ of $\{\mathscr{L}^{(r)}(x)\}_{r}$ , $\lim_{i\rightarrow\infty}\mathscr{L}^{(r_{i})}(x)$ is a minimizer of $(\mathbf{P})$ . Consequently, if $u^{*}$ is the unique minimizer of $(\mathbf{P})$ , then $\lim_{r\rightarrow\infty}{\mathscr{L}^{(r)}(x)}=u^{*}$ ;

(iv)

If moreover, the Lagrangian $L_{f}(x)$ as defined in $(\ref{eq::lag})$ is s.o.s, then $f_{r}^{*}=f^{*}$ for any $r\geq\lceil d_{P}/2\rceil$ and it is also attainable in $(\ref{eq::f*rdual})$ .

Proof. (i) For any $x\in\mathbf{K}$ and $y\in S$ , we have $\Theta_{r}(x)\leq 1$ and $p(x,y)\geq 0$ . Consequently, for any feasible point $(\rho,\eta,\mu,q)$ of (16) and any $x\in\mathbf{K}$ , it holds that

[TABLE]

which implies that $f^{*}_{r}\leq f^{*}$ .

Conversely, by Corollary 2.3, there exist some $y_{1},\ldots,y_{l}\in S$ and nonnegative Lagrange multipliers $\lambda_{1},\ldots,\lambda_{l}\in{\mathbb{R}}$ such that

[TABLE]

where $\mu=\sum_{j=1}^{l}\lambda_{j}\delta_{y_{j}}\in\mathcal{M}(S)$ and $\delta_{y_{j}}$ is the Dirac measure at $y_{j}$ . For any fixed $r\in\mathbb{N}$ with $r\geq\lceil d_{P}/2\rceil$ , by Theorem 2.4 (i), there exists an $\varepsilon_{r}^{*}\geq 0$ such that

[TABLE]

if and only if $\eta\geq\varepsilon_{r}^{*}$ . It means that $(\ref{eq::f*r})$ is feasible and $f^{*}_{r}\geq f^{*}-2\varepsilon^{*}_{r}$ . Moreover, by Theorem 2.4 (ii), $\varepsilon_{r}^{*}$ decreasingly converges to $0$ as $r$ tends to $\infty$ . It then follows that $f_{r}^{*}$ converges to $f^{*}$ as $r$ tends to $\infty$ .

(ii) Fix a Slater point $u$ of $\mathbf{K}$ . Since $S$ is compact, there exists a neighborhood $\mathcal{O}_{u}$ of $u$ such that every point in $\mathcal{O}_{u}$ is a Slater point of $\mathbf{K}$ . Let $\nu$ be the probability measure with uniform distribution in $\mathcal{O}_{u}$ and set $\mathscr{L}\in({\mathbb{R}}[x]_{2r})^{*}$ where $\mathscr{L}(x^{\alpha})=\int x^{\alpha}d\nu$ . It is easy to see that $\mathscr{L}$ is strictly admissible for (17). The conclusion follows due to the duality theory in convex optimization.

(iii) For any $r\geq\lceil d_{P}/2\rceil$ , as $\tau_{\mathbf{K}}=1$ and $\mathscr{L}^{(r)}(\Theta_{r})\leq 1$ , it is clear that $\mathscr{L}^{(r)}(x_{i}^{2r})\leq 1$ for all $i=1,\ldots,m$ . Since $\mathscr{L}^{(r)}(1)=1$ and $\mathscr{L}^{(r)}(q^{2})\geq 0$ for all $q\in{\mathbb{R}}[x]_{r}$ , we then deduce that $|\mathscr{L}^{(r)}(x^{\alpha})|\leq 1$ for any $|\alpha|\leq 2r$ by (Lasserre and Netzer, 2007, Lemma 4.1 and 4.3). Extend $\mathscr{L}^{(r)}\in({\mathbb{R}}[x]_{2r})^{*}$ to $({\mathbb{R}}[x])^{*}$ by letting $\mathscr{L}^{(r)}(x^{\alpha})=0$ for all $|\alpha|>2r$ and denote it by $\widetilde{\mathscr{L}}^{(r)}$ . Then, it holds that $\widetilde{\mathscr{L}}^{(r)}(x^{\alpha})\in[-1,1]$ for all $\alpha\in\mathbb{N}^{m}$ .

Let $\{\mathscr{L}^{(r_{i})}(x)\}_{i}$ be a convergent subsequence of $\{\mathscr{L}^{(r)}(x)\}_{r}$ . By Tychonoff’s theorem, there exists a convergent subsequence of the corresponding $\{\widetilde{\mathscr{L}}^{(r_{i})}(x^{\alpha})\mid\alpha\in\mathbb{N}^{m}\}_{i}$ in the product topology. Without loss of generality, we assume that the whole sequence $\{\widetilde{\mathscr{L}}^{(r_{i})}(x^{\alpha})\mid\alpha\in\mathbb{N}^{m}\}_{i}$ converges as $i\rightarrow\infty$ and denote by $\widetilde{\mathscr{L}}^{*}\in({\mathbb{R}}[x])^{*}$ the limit. From the pointwise convergence, we have $\widetilde{\mathscr{L}}^{*}(q^{2})\geq 0$ for all $q\in{\mathbb{R}}[x]$ and $\widetilde{\mathscr{L}}^{*}(x^{\alpha})\in[-1,1]$ for all $\alpha\in\mathbb{N}^{m}$ . By Theorem 2.8, $\widetilde{\mathscr{L}}^{*}$ has exactly one representing measure $\nu$ with support contained in $[-1,1]^{m}$ . Since $\mathscr{L}^{(r)}$ is a nearly optimal solution of (17), we obtain $\widetilde{\mathscr{L}}^{*}(f)=\int fd\nu(x)=f^{*}$ by (i) and (ii). We have

[TABLE]

For any $\varepsilon>0$ , from the proof of Theorem 3.7 (i) and Remark 3.8 (i), it is easy to see that there exists an integer $r(\varepsilon)$ such that $\mathscr{L}^{(r_{i})}(x)\in\mathbf{K}+\varepsilon\mathbf{B}$ whenever $r_{i}\geq r(\varepsilon)$ . By the pointwise convergence, we deduce that $\widetilde{\mathscr{L}}^{*}(x)\in\mathbf{K}$ . Then, since $f$ is convex and $\widetilde{\mathscr{L}}^{*}$ has a representing measure, by Jensen’s inequality, $f^{*}\leq f(\widetilde{\mathscr{L}}^{*}(x))\leq\widetilde{\mathscr{L}}^{*}(f)=f^{*}$ . Hence, $\widetilde{\mathscr{L}}^{*}(x)$ is indeed a minimizer of (17).

Assume that $u^{*}$ is the unique minimizer of (17). We have shown that $\{\mathscr{L}^{(r)}(x)\}_{r}$ is contained in $[-1,1]^{m}$ and $\lim_{i\rightarrow\infty}{\mathscr{L}^{(r_{i})}(x)}=u^{*}$ for any convergent subsequence $\{\mathscr{L}^{(r_{i})}(x)\}_{i}$ , therefore the whole sequence $\{\mathscr{L}^{(r)}(x)\}_{r}$ converges to $u^{*}$ .

(iv) Under the assumption, (19) holds for $\eta=0$ and any $r\geq\lceil d_{P}/2\rceil$ . Hence, $f_{r}^{*}=f^{*}$ for any $r\geq\lceil d_{P}/2\rceil$ by the proof of (i). As $\mathbf{K}$ is compact, suppose that $f^{*}$ is attainable in $(\mathbf{P})$ at a minimizer $x^{*}\in\mathbf{K}$ . Define $\mathscr{L}^{*}\in({\mathbb{R}}[x]_{2r})^{*}$ by letting $\mathscr{L}^{*}(x^{\alpha})=(x^{*})^{\alpha}$ for all $\alpha\in\mathbb{N}^{m}_{2r}$ , then $f^{*}_{r}=f^{*}$ is attainable in (17) at $\mathscr{L}^{*}$ . $\square$
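The Jensen step in part (iii), $f(\widetilde{\mathscr{L}}^{*}(x))\leq\widetilde{\mathscr{L}}^{*}(f)$ for a functional with a representing measure, can be illustrated on a small atomic measure. This is a sketch with hypothetical data: the convex polynomial `f`, the atoms in $[-1,1]^{2}$ and the weights are chosen only for illustration and do not come from the paper.

```python
def f(x):
    """Convex polynomial x1^4 + x1^2 + x1*x2 + x2^2; its Hessian
    [[12*x1^2 + 2, 1], [1, 2]] is positive definite everywhere."""
    x1, x2 = x
    return x1 ** 4 + x1 ** 2 + x1 * x2 + x2 ** 2

# Hypothetical probability measure supported on three atoms of [-1, 1]^2.
atoms = [(1.0, 0.0), (-1.0, 0.5), (0.0, -1.0)]
weights = [0.5, 0.25, 0.25]

def integrate(func, atoms, weights):
    """Integral of func against the atomic measure sum_k w_k * Dirac(u_k)."""
    return sum(w * func(u) for u, w in zip(atoms, weights))

# First-order moments (L(x_1), L(x_2)) and the value L(f).
mean = tuple(integrate(lambda u, i=i: u[i], atoms, weights) for i in range(2))
f_at_mean = f(mean)
mean_of_f = integrate(f, atoms, weights)
print(mean, f_at_mean, mean_of_f)
```

In this instance the mean is $(0.25,-0.125)$, with $f$ at the mean equal to $0.05078125$ and the integral of $f$ equal to $1.6875$. In the proof the inequality is combined with $\widetilde{\mathscr{L}}^{*}(x)\in\mathbf{K}$ and $\widetilde{\mathscr{L}}^{*}(f)=f^{*}$ to conclude that the first-order moments form a minimizer.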

Consider the problem $(\mathbf{P}_{r})$ . The integration $\int_{S}\cdot d\mu(y)$ can be seen as a linear functional in $({\mathbb{R}}[y])^{*}$ . In the second step, to obtain SDP relaxations of $(\mathbf{P}_{r})$ , we need to characterize those linear functionals $\mathscr{H}\in({\mathbb{R}}[y])^{*}$ which have representing measures in $\mathcal{M}(S)$ . In a dual view, we need a representation of $\mathscr{L}(p(x,y))\in{\mathbb{R}}[y]$ in (17) which is nonnegative on $S$ . Here, Putinar’s Positivstellensatz (Theorems 2.6 and 2.9) comes into play.

For any $t\geq d_{\mathbf{K}}$ , consider the SDP relaxation of (16)

[TABLE]

Similar to (16), the Lagrange dual function of (20) is

[TABLE]

where $\mathscr{L}\in({\mathbb{R}}[x]_{2r})^{*}$ , $\xi\geq 0$ , $\rho\in{\mathbb{R}}$ , $\eta\geq 0$ , $\mathscr{H}\in(\mathbf{qmodule}_{t}(G))^{*}$ and $q\in{\mathbb{R}}[x]_{r}$ . Similar to the duality between (16) and (17), the Lagrange dual problem of (20) can be derived as

[TABLE]

Theorem 4.3

For any integer $r\geq\lceil d_{P}/2\rceil$ , the following are true.

(i)

If $\mathbf{qmodule}(G)$ is Archimedean and the Slater condition holds for $\mathbf{K}$ , then $f_{r,t}^{\mbox{\tiny psdp}}$ and $f_{r,t}^{\mbox{\tiny dsdp}}$ decreasingly converge to $f_{r}^{*}$ as $t$ tends to $\infty$ ;

(ii)

For some order $t\geq d_{\mathbf{K}}$ , if the flat extension condition holds for $\mathscr{H}^{*}$ in the solution $(\rho^{*},\eta^{*},\mathscr{H}^{*},q^{*})$ of $(\ref{eq::f*rt})$ , then $f_{r,t}^{\mbox{\tiny psdp}}=f_{r}^{*}$ ;

(iii)

If $S$ is in the case (15), then we have $f_{r,d_{\mathbf{K}}}^{\mbox{\tiny psdp}}=f_{r,d_{\mathbf{K}}}^{\mbox{\tiny dsdp}}=f_{r}^{*}$ .

Proof. (i) For any feasible point $(\rho,\eta,\mu,q)$ of (16), define $\mathscr{H}\in(\mathbf{qmodule}_{t}(G))^{*}$ by letting $\mathscr{H}(y^{\beta})=\int y^{\beta}d\mu$ for all $\beta\in\mathbb{N}^{n}_{2t}$ , then $(\rho,\eta,\mathscr{H},q)$ is feasible for (20) and hence $f_{r,t}^{\mbox{\tiny psdp}}\geq f^{*}_{r}$ for any $t\geq d_{\mathbf{K}}$ . Then by the weak duality and Theorem 4.2, we have $f^{*}_{r}\leq f_{r,t}^{\mbox{\tiny psdp}}\leq f_{r,t}^{\mbox{\tiny dsdp}}$ for any $t\geq d_{\mathbf{K}}$ . It is sufficient to prove that $\lim_{t\rightarrow\infty}f_{r,t}^{\mbox{\tiny dsdp}}=f_{r}^{*}$ .

Fixing an arbitrary $\varepsilon>0$ , we show that there is some $t\geq d_{\mathbf{K}}$ such that $0\leq f_{r,t}^{\mbox{\tiny dsdp}}-f_{r}^{*}\leq\varepsilon$ . Fix a Slater point $u$ of $\mathbf{K}$ and define $\mathscr{L}^{\prime}\in({\mathbb{R}}[x]_{2r})^{*}$ with $\mathscr{L}^{\prime}(x^{\alpha})=u^{\alpha}$ for all $\alpha\in\mathbb{N}^{m}_{2r}$ . Then $\mathscr{L}^{\prime}$ is feasible for (21) for some $t^{\prime}\geq d_{\mathbf{K}}$ by Putinar’s Positivstellensatz. If $\mathscr{L}^{\prime}(f)-f^{*}_{r}\leq\varepsilon$ , then $0\leq f_{r,t^{\prime}}^{\mbox{\tiny dsdp}}-f_{r}^{*}\leq\varepsilon$ . Next, we assume that $\mathscr{L}^{\prime}(f)-f^{*}_{r}>\varepsilon$ . Then, we can choose another feasible point $\overline{\mathscr{L}}$ of (17) such that $\mathscr{L}^{\prime}(f)-\overline{\mathscr{L}}(f)>0$ and $\overline{\mathscr{L}}(f)-f^{*}_{r}\leq\varepsilon/2$ . Let

[TABLE]

Then, we have $0<\delta<1$ and hence

[TABLE]

Hence, $\widehat{\mathscr{L}}$ is feasible for (21) for some $\hat{t}\geq d_{\mathbf{K}}$ by Putinar’s Positivstellensatz. We have

[TABLE]

As $\varepsilon$ is arbitrary, the conclusion follows.

(ii) Suppose that the flat extension condition holds for $\mathscr{H}^{*}$ in the solution $(\rho^{*},\eta^{*},\mathscr{H}^{*},q^{*})$ of (20) at some order $t\geq d_{\mathbf{K}}$ . Then, by Theorem 2.11, $\mathscr{H}^{*}$ admits some representing measure $\mu^{*}$ supported on $S$ . As $f_{r}^{*}\leq f_{r,t}^{\mbox{\tiny psdp}}$ and $(\rho^{*},\eta^{*},\mu^{*},q^{*})$ is feasible for (16), we conclude that $f_{r}^{*}=f_{r,t}^{\mbox{\tiny psdp}}$ .

(iii) By the proof of (i), the conclusion follows due to the representations of univariate polynomials nonnegative on an interval (c.f. Powers and Reznick (2000); Laurent (2009)) and Theorem 4.2 (ii). $\square$

Theorem 4.4

Suppose $f(x)$ is convex , $\mathbf{K}$ is compact and Assumption 3.6 holds. Then, for any $\varepsilon>0$ , the following are true.

(i)

There exists an $r(\varepsilon)\in\mathbb{N}$ such that $f^{\mbox{\tiny dsdp}}_{r,t}\geq f^{\mbox{\tiny psdp}}_{r,t}\geq f^{*}-\varepsilon$ holds for any $r\geq r(\varepsilon)$ and $t\geq d_{\mathbf{K}}$ ;

(ii)

If $\mathbf{qmodule}(G)$ is Archimedean and the Slater condition holds for $\mathbf{K}$ , then for any $r\geq\lceil d_{P}/2\rceil$ , there exists a $t(r,\varepsilon)\in\mathbb{N}$ such that $f^{\mbox{\tiny psdp}}_{r,t}\leq f^{\mbox{\tiny dsdp}}_{r,t}\leq f^{*}+\varepsilon$ holds for any $t\geq t(r,\varepsilon)$ ;

(iii)

If $S$ is in the case (15), we have $\lim_{r\rightarrow\infty}f_{r,d_{\mathbf{K}}}^{\mbox{\tiny psdp}}=\lim_{r\rightarrow\infty}f_{r,d_{\mathbf{K}}}^{\mbox{\tiny dsdp}}=f^{*}$ .

Proof. (i) It is clear that $f_{r}^{*}\leq f^{\mbox{\tiny psdp}}_{r,t}\leq f^{\mbox{\tiny dsdp}}_{r,t}$ holds for any $r\geq\lceil d_{P}/2\rceil$ and $t\geq d_{\mathbf{K}}$. By Theorem 4.2 (i), there exists an $r(\varepsilon)\in\mathbb{N}$ such that $f^{*}_{r}\geq f^{*}-\varepsilon$ holds for any $r\geq r(\varepsilon)$. Thus, (i) follows.

(ii) Due to Theorem 4.3 (i), for any $r\geq\lceil d_{P}/2\rceil$ , there exists a $t(r,\varepsilon)\in\mathbb{N}$ such that $f^{\mbox{\tiny psdp}}_{r,t}\leq f^{\mbox{\tiny dsdp}}_{r,t}\leq f^{*}_{r}+\varepsilon$ holds for any $t\geq t(r,\varepsilon)$ . Then (ii) follows since $f^{*}_{r}\leq f^{*}$ for any $r\geq\lceil d_{P}/2\rceil$ by Theorem 4.2 (i).

(iii) It is clear by Theorem 4.2 (i) and Theorem 4.3 (iii). $\square$
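Parts (i) and (ii) can be read together as a two-sided bound. The display below is only a restatement, assuming the hypotheses of both parts hold simultaneously (convexity of $f$, compactness of $\mathbf{K}$, Assumption 3.6, Archimedean $\mathbf{qmodule}(G)$ and the Slater condition); the thresholds $r(\varepsilon)$ and $t(r,\varepsilon)$ are those of the theorem.

```latex
\[
f^{*}-\varepsilon \;\le\; f^{\mbox{\tiny psdp}}_{r,t} \;\le\; f^{\mbox{\tiny dsdp}}_{r,t} \;\le\; f^{*}+\varepsilon
\qquad\text{for all } r\ge\max\{r(\varepsilon),\lceil d_{P}/2\rceil\}
\text{ and } t\ge\max\{t(r,\varepsilon),d_{\mathbf{K}}\}.
\]
```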

Remark 4.5

(i). Theorem 4.4 (i) and (ii) imply that we can approximate $f^{*}$ by $f^{\mbox{\tiny psdp}}_{r,t}$ and $f^{\mbox{\tiny dsdp}}_{r,t}$ as closely as possible with $r$ and $t$ both large enough. In practice, we can let $t=r$ and then $\lim_{r\rightarrow\infty}f^{\mbox{\tiny psdp}}_{r,r}=\lim_{r\rightarrow\infty}f^{\mbox{\tiny dsdp}}_{r,r}=f^{*}$ under the assumptions in Theorem 4.4 (i) and (ii).

(ii). Assume that $\tau_{\mathbf{K}}=1$. By Theorem 4.3 (i), for any $r\geq\lceil d_{P}/2\rceil$, there exists $t(r)\in\mathbb{N}$ such that $f_{r,t(r)}^{\mbox{\tiny dsdp}}\leq f_{r}^{*}+1/r$. Denote by $\mathscr{L}^{(r,t(r))}$ a minimizer of $f_{r,t(r)}^{\mbox{\tiny dsdp}}$; then $\{\mathscr{L}^{(r,t(r))}\}_{r}$ is a sequence of nearly optimal solutions of (17) and Theorem 4.2 (iii) holds for the corresponding sequence $\{\mathscr{L}^{(r,t(r))}(x)\}_{r}$. In particular, when $(\mathbf{P})$ has a unique minimizer $u^{*}$ and $r,t$ are large enough, we can expect that the point $\mathscr{L}^{(r,t)}(x)$ for any approximate solution $\mathscr{L}^{(r,t)}$ of (21) lies in a small neighborhood of $u^{*}$. $\hfill\square$

4.2 S.O.S-Convex case

Recall Remark 3.8 (iv) and Theorem 4.2 (iv). We now strengthen Assumption 3.6 to

Assumption 4.6

The set $S$ is compact, $-p(x,y)\in{\mathbb{R}}[x]$ is s.o.s-convex for any $y\in S$ and the Slater condition holds for $\mathbf{K}$ .

If Assumption 4.6 holds and $f(x)$ is s.o.s-convex, then the Lagrangian $L_{f}(x)$ as defined in (5) is s.o.s according to Remark 3.8 (iv). As in the general case, in the first step, we convert $(\mathbf{P})$ to

[TABLE]

For $\mathscr{L}\in({\mathbb{R}}[x]_{d_{P}})^{*}$ , $\rho\in{\mathbb{R}}$ , $\mu\in\mathcal{M}(S)$ and $q\in{\mathbb{R}}[x]_{\lfloor d_{P}/2\rfloor}$ , consider the Lagrange dual function of (22):

[TABLE]

Then,

[TABLE]

Hence, the Lagrange dual problem of (22) reads

[TABLE]

Theorem 4.7

Assume that $f(x)$ is s.o.s-convex and Assumption 4.6 holds. Then the following are true.

(i) The optimal values of (22) and (23) are both equal to $f^{*}$, which is attainable in (22). Moreover, if $f^{*}$ is attainable in $(\mathbf{P})$, then it is also attainable in (23);

(ii) If $\mathscr{L}^{*}$ is a minimizer of (23), then $\mathscr{L}^{*}(x)=(\mathscr{L}^{*}(x_{1}),\ldots,\mathscr{L}^{*}(x_{m}))$ is a minimizer of $(\mathbf{P})$.

Proof. (i) Denote by $f^{*}_{\mbox{\tiny sos}}$ the optimal value of (22). Since Assumption 3.6 holds, recalling (18), there exists $\mu\in\mathcal{M}(S)$ such that $L_{f}(x)=f(x)-f^{*}-\int_{S}p(x,y)d\mu\geq 0$ for all $x\in{\mathbb{R}}^{m}$ . Note that the degree of $L_{f}(x)$ is even and at most $2\lfloor d_{P}/2\rfloor$ . As $L_{f}$ is s.o.s, it holds that

[TABLE]

which means that (22) is feasible and $f^{*}_{\mbox{\tiny sos}}\geq f^{*}$ . For any $x\in\mathbf{K}$ and feasible point $(\rho,\mu,q)$ of (22), it holds that $f(x)-\rho\geq 0$ which implies that $f^{*}_{\mbox{\tiny sos}}\leq f^{*}$ . Consequently, we have $f^{*}_{\mbox{\tiny sos}}=f^{*}$ . Since (23) is strictly feasible (see the proof of Theorem 4.2 (ii)), (22) has an optimal solution and there is no dual gap between (22) and (23).

Suppose that $f^{*}$ is attainable in $(\mathbf{P})$ at a minimizer $u^{*}\in\mathbf{K}$ . Define $\mathscr{L}^{*}\in({\mathbb{R}}[x]_{d_{P}})^{*}$ by letting $\mathscr{L}^{*}(x^{\alpha})=(u^{*})^{\alpha}$ for all $\alpha\in\mathbb{N}^{m}_{d_{P}}$ , then $f^{*}$ is attainable in (23) at $\mathscr{L}^{*}$ .

(ii) Compare the feasible set of (23) with the definition of $\Lambda_{r,t}$ in (10) and recall the proof of Theorem 3.7 (ii). Note that to show $\Lambda_{r,t}\subseteq\mathbf{K}$ for any $r\geq\lceil d_{x}/2\rceil$ and $t\geq d_{\mathbf{K}}$ , the constraints $\mathscr{L}(\Theta_{k})\leq 1$ in (10) are redundant. Moreover, the inequality (14) still holds for $\mathscr{L}$ feasible to (23). Hence, we can obtain that $\mathscr{L}^{*}(x)\in\mathbf{K}$ . As $f(x)$ is s.o.s-convex, by (Lasserre, 2009a, Theorem 2.6), the extension of Jensen’s inequality $f(\mathscr{L}^{*}(x))\leq\mathscr{L}^{*}(f)=f^{*}$ holds, which implies that $\mathscr{L}^{*}(x)$ is a minimizer of $(\mathbf{P})$ . $\square$
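The two ingredients used in this proof, the point-evaluation functional $\mathscr{L}(x^{\alpha})=u^{\alpha}$ and the Jensen-type inequality $f(\mathscr{L}(x))\leq\mathscr{L}(f)$, can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: the functional comes from a finitely atomic probability measure (so the inequality reduces to the classical Jensen inequality, not the s.o.s-convex extension of Lasserre (2009a)), the atoms and weights are hypothetical, and $f(x_{1},x_{2})=(x_{1}-1)^{2}+x_{2}^{2}$ is borrowed from Example 4.10 (2).

```python
from itertools import product
from math import prod

def monomials(m, d):
    """All exponent tuples alpha in N^m with |alpha| <= d."""
    return [a for a in product(range(d + 1), repeat=m) if sum(a) <= d]

def riesz_functional(atoms, weights, m, d):
    """Moments L(x^alpha) = sum_k w_k * u_k^alpha of a finitely atomic measure."""
    return {
        a: sum(w * prod(ui ** ai for ui, ai in zip(u, a))
               for u, w in zip(atoms, weights))
        for a in monomials(m, d)
    }

def apply_functional(L, poly):
    """Evaluate L(poly) for a polynomial given as {exponent tuple: coefficient}."""
    return sum(c * L[a] for a, c in poly.items())

# f(x1, x2) = (x1 - 1)^2 + x2^2 = x1^2 - 2*x1 + 1 + x2^2
f_coeffs = {(2, 0): 1.0, (1, 0): -2.0, (0, 0): 1.0, (0, 2): 1.0}

def f_eval(x):
    return (x[0] - 1.0) ** 2 + x[1] ** 2

# Hypothetical atoms and weights (illustration only, not from the paper).
atoms = [(0.0, 0.0), (1.0, 1.0), (0.5, -1.0)]
weights = [0.2, 0.3, 0.5]

L = riesz_functional(atoms, weights, m=2, d=2)
mean_point = (L[(1, 0)], L[(0, 1)])      # L(x) = (L(x1), L(x2))
jensen_lhs = f_eval(mean_point)          # f(L(x))
jensen_rhs = apply_functional(L, f_coeffs)  # L(f)

# A single atom gives the point-evaluation functional, where equality holds.
u = (0.25, 0.75)
L_dirac = riesz_functional([u], [1.0], m=2, d=2)
dirac_value = apply_functional(L_dirac, f_coeffs)

print(mean_point, jensen_lhs, jensen_rhs, dirac_value)
```

For a Dirac functional the two sides coincide, which is why a minimizer $u^{*}$ of $(\mathbf{P})$ yields a functional attaining $f^{*}$; for a general feasible functional only the inequality is available.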

In the same way as we derive the SDP relaxations (20) and (21) from (16) and (17), we next obtain corresponding SDP relaxations of (22) and (23) as

[TABLE]

and its dual

[TABLE]

Theorem 4.8

Assume that $f(x)$ is s.o.s-convex and Assumption 4.6 holds. Then the following are true.

(i) If $\mathbf{qmodule}(G)$ is Archimedean and the Slater condition holds for $\mathbf{K}$, then $\lim_{t\rightarrow\infty}f_{t}^{\mbox{\tiny psdp}}=\lim_{t\rightarrow\infty}f_{t}^{\mbox{\tiny dsdp}}=f^{*}$;

(ii) For some order $t\geq d_{\mathbf{K}}$, if the flat extension condition holds for $\mathscr{H}^{*}$ in the solution $(\rho^{*},\mathscr{H}^{*},q^{*})$ of (24), then $f_{t}^{\mbox{\tiny psdp}}=f^{*}$;

(iii) Let $\{\mathscr{L}^{(t)}\}_{t}$ be a sequence of nearly optimal solutions of (25) and $\mathscr{L}^{(t)}(x)=(\mathscr{L}^{(t)}(x_{1}),\ldots,\mathscr{L}^{(t)}(x_{m}))$. For any convergent subsequence $\{\mathscr{L}^{(t_{i})}(x)\}_{i}$ of $\{\mathscr{L}^{(t)}(x)\}_{t}$, $\lim_{i\rightarrow\infty}\mathscr{L}^{(t_{i})}(x)$ is a minimizer of $(\mathbf{P})$. Consequently, if $\{\mathscr{L}^{(t)}(x)\}_{t}$ is bounded and $u^{*}$ is the unique minimizer of $(\mathbf{P})$, then $\lim_{t\rightarrow\infty}{\mathscr{L}^{(t)}(x)}=u^{*}$;

(iv) If $S$ is in the case (15), then we have $f^{\mbox{\tiny psdp}}_{d_{\mathbf{K}}}=f^{\mbox{\tiny dsdp}}_{d_{\mathbf{K}}}=f^{*}$. If $(\mathbf{P})$ is solvable, then $u^{*}$ is a minimizer of $(\mathbf{P})$ if and only if there exists a minimizer $\mathscr{L}^{*}$ of (25) with $t=d_{\mathbf{K}}$ such that $\mathscr{L}^{*}(x)=u^{*}$.

Proof. (i) and (ii): See Theorem 4.7 (i) and the proofs of Theorem 4.3 (i) and (ii).

(iii): Since $f(x)$ is s.o.s-convex, due to the extended Jensen’s inequality (Lasserre, 2009a, Theorem 2.6), it holds that $f(\mathscr{L}^{(t_{i})}(x))\leq\mathscr{L}^{(t_{i})}(f)$ for every $i$. As $f$ is continuous, it follows that $f(\lim_{i\rightarrow\infty}\mathscr{L}^{(t_{i})}(x))\leq\lim_{i\rightarrow\infty}\mathscr{L}^{(t_{i})}(f)=f^{*}$. From the proofs of Theorem 3.7 (ii) and Theorem 4.7 (ii), it is easy to see that the sequence $\{\mathscr{L}^{(t)}(x)\}\subset\mathbf{K}$ and hence, $\mathbf{K}$ being closed, $\lim_{i\rightarrow\infty}\mathscr{L}^{(t_{i})}(x)\in\mathbf{K}$. Thus, $\lim_{i\rightarrow\infty}\mathscr{L}^{(t_{i})}(x)$ is a minimizer of $(\mathbf{P})$.

(iv): By Theorem 4.7 (i) and the weak duality, it holds that $f^{*}\leq f^{\mbox{\tiny psdp}}_{t}\leq f^{\mbox{\tiny dsdp}}_{t}$ for any $t\geq d_{\mathbf{K}}$. For any $\varepsilon>0$, there exists a point $u^{(\varepsilon)}\in\mathbf{K}$ such that $f(u^{(\varepsilon)})\leq f^{*}+\varepsilon$. Define $\mathscr{L}^{(\varepsilon)}\in({\mathbb{R}}[x]_{d_{P}})^{*}$ by letting $\mathscr{L}^{(\varepsilon)}(x^{\alpha})=(u^{(\varepsilon)})^{\alpha}$ for all $\alpha\in\mathbb{N}^{m}_{d_{P}}$. By the representations of univariate polynomials nonnegative on an interval (cf. Powers and Reznick (2000); Laurent (2009)), $\mathscr{L}^{(\varepsilon)}$ is feasible to (25) with $t=d_{\mathbf{K}}$, which implies that $f^{\mbox{\tiny dsdp}}_{d_{\mathbf{K}}}\leq f^{*}+\varepsilon$. Since $\varepsilon$ is arbitrary, it holds that $f^{\mbox{\tiny psdp}}_{d_{\mathbf{K}}}=f^{\mbox{\tiny dsdp}}_{d_{\mathbf{K}}}=f^{*}$.

Clearly, we only need to prove the “if” part. Since Assumption 4.6 holds, from the proofs of Theorem 3.7 (ii) and Theorem 4.7 (ii), we have $\mathscr{L}^{*}(x)\in\mathbf{K}$ . As $f(x)$ is s.o.s-convex, due to the extended Jensen’s inequality (Lasserre, 2009a, Theorem 2.6), it holds that $f^{*}\leq f(\mathscr{L}^{*}(x))\leq\mathscr{L}^{*}(f)=f^{*}$ . Thus, $\mathscr{L}^{*}(x)$ is a minimizer of $(\mathbf{P})$ . $\square$

Remark 4.9

(i) Note that we do not require $\mathbf{K}$ to be compact in Theorems 4.7 and 4.8.

(ii) In the special case when $f(x),\ p(x,y)$ are linear in $x$ for every $y\in S$, the SDP relaxation (24) agrees with the SDP relaxation of generalized problems of moments proposed in Lasserre (2008). $\hfill\square$

Example 4.10

Now we consider two convex semi-infinite polynomial programming problems using the sets $\mathbf{K}$ defined in Example 3.9 (1) and (2). Notice that the constraints in the dual SDP relaxations (21) and (25) can be easily generated by YALMIP. Hence, we solve the following problems using the corresponding dual SDP relaxations, which also give us some information about the minimizers of the problems.

(1). Recall the sets $\mathbf{K}$ and $S$ defined in Example 3.9 (1) where the polynomial $p(x_{1},x_{2},y_{1},y_{2})\in{\mathbb{R}}[x_{1},x_{2}]$ is convex but not s.o.s-convex for every $y\in S$. Ahmadi and Parrilo (2013) constructed a polynomial

[TABLE]

(see (Ahmadi and Parrilo, 2013, (5.2))) which is convex but not s.o.s-convex. In order to better illustrate the efficiency of the SDP relaxations (21), we shift and scale $\tilde{f}$ to define $f(x_{1},x_{2}):=\tilde{f}(x_{1}-1,x_{2}-1)/10000$, which is still convex but not s.o.s-convex. Then, consider the problem $\min_{x\in\mathbf{K}}f(x_{1},x_{2})$, where $d_{P}=d_{x}=8$. Letting $r=t=4$, we get $f_{4,4}^{\mbox{\tiny dsdp}}=0.15234$ achieved at $\mathscr{L}^{(4,4)}$ and an approximate minimizer $\mathscr{L}^{(4,4)}(x)=(0.4245,0.6373)$. To show the accuracy of the solution, we draw some contour lines of $f$, including $f(x_{1},x_{2})=0.15234$, and mark the point $\mathscr{L}^{(4,4)}(x)$ by a red ‘$+$’ in Figure 3 (left). As we can see, the contour line $f(x_{1},x_{2})=0.15234$ is almost tangent to $\mathbf{K}$ at the point $\mathscr{L}^{(4,4)}(x)$.

(2). Recall the sets $\mathbf{K}$ and $S$ defined in Example 3.9 (2). Let $f(x_{1},x_{2}):=(x_{1}-1)^{2}+x_{2}^{2}$, i.e., the squared distance from a point to $(1,0)$, and consider the problem $\min_{x\in\mathbf{K}}f(x_{1},x_{2})$. Then, the polynomials $f(x_{1},x_{2})$ and $-p(x_{1},x_{2},y_{1},y_{2})$ for all $y\in S$ are s.o.s-convex. As $d_{\mathbf{K}}=1$, solving the SDP relaxation (25) with $t=1$, we get $f_{1}^{\mbox{\tiny dsdp}}=0.80942$ achieved at $\mathscr{L}^{(1)}$ and an approximate minimizer $\mathscr{L}^{(1)}(x)=(0.1311,-0.2335)$. The corresponding contour lines and the point $\mathscr{L}^{(1)}(x)$ are shown in Figure 3 (right).
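As a quick sanity check on the numbers reported in (2), one can evaluate $f$ at the printed approximate minimizer. This is a worked evaluation using only the rounded values above, not a re-run of the SDP relaxation.

```latex
\[
f\big(\mathscr{L}^{(1)}(x)\big)=(0.1311-1)^{2}+(-0.2335)^{2}
\approx 0.75499+0.05452 = 0.80951,
\qquad
\big|0.80951-f_{1}^{\mbox{\tiny dsdp}}\big|=|0.80951-0.80942|\approx 9\times 10^{-5}.
\]
```

The discrepancy is of the size one would expect from rounding the coordinates to four decimals and from solver tolerance, so the reported point and the reported optimal value are consistent with each other.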

To end this section, we compare our SDP relaxation method for the convex semi-infinite polynomial programming problem $(\mathbf{P})$ with the approach given in Wang and Guo (2013), which can also solve $(\mathbf{P})$ via SDP relaxations. In fact, the method proposed in Wang and Guo (2013) is based on the exchange scheme and works for semi-infinite polynomial programming problems without requiring convexity.

Generally speaking, given a finite subset $S_{k}\subseteq S$ in an iteration, one obtains at least one global minimizer $u^{(k)}$ of $f(x)$ under the associated finitely many constraints and then computes the global minimum $p^{k}$ and minimizers $y^{(1)},\ldots,y^{(l)}$ of $p(u^{(k)},y)$ over $S$. If $p^{k}\geq 0$, stop; otherwise, update $S_{k+1}=S_{k}\cup\{y^{(1)},\ldots,y^{(l)}\}$ and proceed to the next iteration. Therefore, to guarantee the success of the exchange method, the subproblems in each iteration need to be globally solved and at least one minimizer of each subproblem must be extracted. The subproblems can be solved by Lasserre’s SDP relaxation method and minimizers can be extracted when the flat extension condition holds.
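The loop structure of the exchange scheme can be sketched on a toy instance. This is a minimal illustration and not the implementation of Wang and Guo (2013): the instance $\min x$ subject to $p(x,y)=x-g(y)\geq 0$ for all $y\in S=[0,1]$ with $g(y)=4y(1-y)$ is hypothetical, the upper-level subproblem is solved in closed form, and the lower-level global minimization is replaced by a grid search where the method described above would use Lasserre's SDP relaxations. Only one lower-level minimizer is added per iteration.

```python
def exchange(g, grid, s0, tol=1e-9, max_iter=50):
    """Exchange scheme for: min x  s.t.  p(x, y) = x - g(y) >= 0 for all y in S.

    `grid` is a finite sample of S used for the lower-level search
    (an assumption standing in for a global SDP-based solver).
    """
    S_k = list(s0)
    for k in range(max_iter):
        # Upper level: min x s.t. x >= g(y) for all y in S_k (closed form).
        u_k = max(g(y) for y in S_k)
        # Lower level: global minimum of p(u_k, y) over the sampled index set.
        y_star = min(grid, key=lambda y: u_k - g(y))
        p_min = u_k - g(y_star)
        if p_min >= -tol:
            return u_k, S_k, k      # u_k is feasible for all sampled y: stop
        S_k.append(y_star)          # S_{k+1} = S_k with the violated index added
    raise RuntimeError("exchange scheme did not terminate within max_iter")

def g(y):
    return 4.0 * y * (1.0 - y)

grid = [i / 1000 for i in range(1001)]
u_final, S_final, iterations = exchange(g, grid, s0=[0.0])
print(u_final, S_final, iterations)
```

On this instance the first upper-level solution $u^{(0)}=0$ violates the constraint most strongly at $y=0.5$; after adding that index the next iterate $u^{(1)}=1$ is feasible and the loop stops. The sketch makes visible why each iteration needs both a global lower-level minimum and at least one extracted minimizer.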

However, Lasserre’s SDP relaxation method for the lower level subproblem of minimizing $p(u^{(k)},y)$ over $S$ does not necessarily have finite convergence. Even if it does, it may not be possible to extract a minimizer of the lower level subproblem. In particular, when there are infinitely many minimizers, the flat extension condition fails (cf. (Laurent, 2009, Sec. 6.6)). For polynomial optimization problems with generic coefficient data, according to (Nie, 2014, Theorem 1.2) and (Nie and Ranestad, 2009, Proposition 2.1), there are finitely many minimizers and Lasserre’s SDP relaxation method has finite convergence. However, even if the coefficient data in $(\mathbf{P})$ is generic, it is unclear whether the method in Wang and Guo (2013) succeeds when applied to $(\mathbf{P})$. This is because, in the lower level subproblems, the coefficients of $p(u^{(k)},y)$ depend on the solutions of the upper level subproblem of the same iteration.
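The role of the rank condition can be illustrated on univariate moment sequences. The sketch below is a simplified stand-in: it compares $\operatorname{rank} M_{1}$ with $\operatorname{rank} M_{2}$ for Hankel moment matrices, which is only the univariate flavour of a flat extension test and does not reproduce the precise condition used in the paper. The two measures (two atoms at $\pm 1$, and the uniform probability measure on $[-1,1]$) are hypothetical examples chosen to contrast finite and infinite support.

```python
from fractions import Fraction

def hankel(moments, t):
    """Moment matrix M_t = (m_{i+j}) for 0 <= i, j <= t of a univariate sequence."""
    return [[moments[i + j] for j in range(t + 1)] for i in range(t + 1)]

def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# Moments m_0..m_4 of (delta_{-1} + delta_{+1}) / 2: 1 for even k, 0 for odd k.
two_atoms = [Fraction(1) if k % 2 == 0 else Fraction(0) for k in range(5)]
# Moments m_0..m_4 of the uniform probability measure on [-1, 1]:
# 1/(k+1) for even k, 0 for odd k.
uniform = [Fraction(1, k + 1) if k % 2 == 0 else Fraction(0) for k in range(5)]

ranks_two_atoms = (rank(hankel(two_atoms, 1)), rank(hankel(two_atoms, 2)))
ranks_uniform = (rank(hankel(uniform, 1)), rank(hankel(uniform, 2)))
print(ranks_two_atoms, ranks_uniform)
```

For the two-atom measure the rank stabilizes, which is the situation in which atoms (candidate minimizers) can be read off; for the measure with infinite support the rank keeps growing with the order, so no such stabilization can be observed.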

For example, consider the problem

[TABLE]

It is easy to see that the feasible set is $\{x\in{\mathbb{R}}^{2}\mid x_{1}+x_{2}-1\geq 0,\ 1-x_{2}\geq 0\}$ and the minimizer is $(\frac{1}{2},\frac{1}{2})$. If we choose the initial set $S_{0}=\{(0,0,0)\}$, then the upper level subproblem has a unique minimizer $u^{(0)}=(0,0)$. Then, for the lower level subproblem of minimizing $p(u^{(0)},y)=-y_{1}^{2}-y_{2}^{2}-y_{3}^{2}+1$ over $S$, it is clear that the solution set is $\{y\in{\mathbb{R}}^{3}\mid y_{1}^{2}+y_{2}^{2}=1,y_{3}^{2}=1\}$, which is infinite, so the flat extension condition does not hold for Lasserre’s SDP relaxations. As none of the minimizers can be extracted, the method in Wang and Guo (2013) fails for this problem. Since the objective function is s.o.s-convex and Assumption 4.6 holds, we can solve the above problem by our SDP relaxations (25). Letting $t=1$, we get $f_{1}^{\mbox{\tiny dsdp}}=0.2500$ achieved at $\mathscr{L}^{(1)}$ and an approximate minimizer $\mathscr{L}^{(1)}(x)=(0.5000,0.5000)$.

Acknowledgments

The authors are very grateful for the comments of two anonymous referees which helped to improve the presentation. The first author was supported by the Chinese National Natural Science Foundation under grants 11401074, 11571350. The second author was supported by the Chinese National Natural Science Foundation under grant 11801064, the Foundation of Liaoning Education Committee under grant LN2017QN043.

Bibliography

Ahmadi, A., Parrilo, P., 2013. A complete characterization of the gap between convexity and sos-convexity. SIAM Journal on Optimization 23 (2), 811–833.
Ahmadi, A. A., Olshevsky, A., Parrilo, P. A., Tsitsiklis, J. N., 2013. NP-hardness of deciding convexity of quartic polynomials and related problems. Mathematical Programming 137 (1), 453–476.
Ahmadi, A. A., Parrilo, P. A., 2012. A convex polynomial that is not sos-convex. Mathematical Programming 135 (1), 275–292.
Belousov, E., 1977. Introduction to Convex Analysis and Integer Programming. Moscow University Publ.: Moscow.
Ben-Tal, A., Nemirovski, A., 2001. Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. MOS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics.
Berg, C., Maserick, P. H., 1984. Exponentially bounded positive definite functions. Illinois Journal of Mathematics 28, 162–179.
Bertsekas, D. P., 2009. Convex optimization theory. Athena Scientific.
Bochnak, J., Coste, M., Roy, M.-F., 1998. Real Algebraic Geometry. Springer.


3 Approximate semidefinite representations of $\mathbf{K}$