New Dependencies of Hierarchies in Polynomial Optimization

Adam Kurpisz; Timo de Wolff

arXiv:1903.04996·cs.DS·March 13, 2019


TL;DR

This paper compares four hierarchies for solving constrained polynomial optimization problems (SOS, SDSOS, SONC, and Sherali Adams), establishing containments and incomparabilities among them, with implications for Positivstellensatz results.

Contribution

It establishes new dependencies and incomparabilities among these hierarchies for general and Boolean hypercube cases, including polynomial equivalences and containment results.

Findings

1. SONC and SOS hierarchies are polynomially incomparable.

2. SDSOS is contained within SONC hierarchy.

3. Schmüdgen-like versions of SDSOS*, SONC*, and SA* are polynomially equivalent.

Abstract

We compare four key hierarchies for solving Constrained Polynomial Optimization Problems (CPOP): Sum of Squares (SOS), Sum of Diagonally Dominant Polynomials (SDSOS), Sum of Nonnegative Circuits (SONC), and the Sherali Adams (SA) hierarchies. We prove a collection of dependencies among these hierarchies both for general CPOPs and for optimization problems on the Boolean hypercube. Key results for the general case include that the SONC and SOS hierarchies are polynomially incomparable, while SDSOS is contained in SONC. A direct consequence is the non-existence of a Putinar-like Positivstellensatz for SDSOS. On the Boolean hypercube, we show as a main result that Schmüdgen-like versions of the hierarchies SDSOS*, SONC*, and SA* are polynomially equivalent. Moreover, we show that SA* is contained in any Schmüdgen-like hierarchy that provides an O(n) degree bound.

Equations

\min f(\boldsymbol{x}),

\mathcal{G} := \{g_{0}:=1,\, g_{1},\ldots,g_{m} \,:\, g_{i}\in\mathbb{R}[\boldsymbol{x}]\text{ for all } i\in[m]\}

\mathcal{G}_{+} := \{\boldsymbol{x}\in\mathbb{R}^{n} \mid g(\boldsymbol{x})\geq 0\text{ for all } g\in\mathcal{G}\}\subseteq\mathbb{R}^{n}.

\mathcal{K}(\mathcal{G}_{+}) := \{f\in\mathbb{R}[\boldsymbol{x}] \mid f(\boldsymbol{x})\geq 0\text{ for all }\boldsymbol{x}\in\mathcal{G}_{+}\}.

f_{\mathcal{G}}^{*} := \min\{f(\boldsymbol{x}) \mid \boldsymbol{x}\in\mathcal{G}_{+}\} = \max\{\lambda\in\mathbb{R} \mid f-\lambda\in\mathcal{K}(\mathcal{G}_{+})\}.

\operatorname{Prep}(\mathcal{G}) := \left\{\sum_{i=0}^{s}c_{i}g_{0}^{k_{i,0}}\cdots g_{m}^{k_{i,m}} \ \middle|\ c_{i}\in\mathbb{R}_{\geq 0},\ g_{j}\in\mathcal{G},\ k_{i,j}\in\mathbb{N}\right\}.

\operatorname{Hier}_{\mathcal{G}}^{2d}(\mathbb{GEN}) := \left\{\sum_{i=1}^{m}s_{i}g_{i}\in\mathbb{R}[\boldsymbol{x}] \ \middle|\ s_{i}\in\mathbb{GEN},\ g_{i}\in\mathcal{G},\text{ and }\deg(s_{i}g_{i})\leq 2d\text{ for all }i\in[m]\right\}.

\mathbb{GEN}_{\mathcal{G}}^{2d}

\mathbb{GEN}_{\operatorname{Prep}(\mathcal{G})}^{2d}

\mathbb{SOS}_{\mathcal{G}}^{2d}

\widetilde{\mathbb{E}}_{\mu}[h] := \sum_{\boldsymbol{x}\in\{0,1\}^{n}}\mu(\boldsymbol{x})\cdot h(\boldsymbol{x}),

\overline{\mathbb{SOS}}_{\mathcal{G}}^{2d}

0 \leq \widetilde{\mathbb{E}}_{\mu}[g h^{2}] = \widetilde{\mathbb{E}}_{\mu}\Big[\sum_{\substack{I,J\subseteq[n]\\ |I|,|J|\leq d'}} v_{I}v_{J}\prod_{k\in I}x_{k}\prod_{k\in J}x_{k}\, g\Big] = \sum_{\substack{I,J\subseteq[n]\\ |I|,|J|\leq d'}} v_{I}v_{J}\,\widetilde{\mathbb{E}}_{\mu}\Big[g\cdot\prod_{k\in I\cup J}x_{k}\Big] = v^{\top}M_{g}^{2d}v,

\overline{f}_{\mathbb{SOS},\mathcal{G}}^{2d} := \min\left\{\widetilde{\mathbb{E}}_{\mu}[f] \ \middle|\ \mu:\{0,1\}^{n}\to\mathbb{R},\ \widetilde{\mathbb{E}}_{\mu}[1]=1,\ M_{g}^{2d}\succeq 0\text{ for all }g\in\mathcal{G}\right\}.

\mathbb{SDSOS}_{\mathcal{G}}^{2d}

M = \sum_{i,j\in[m]} c_{ij}w_{ij}w_{ij}^{\top}

s(z) = z^{\top}DQDz = z^{\top}D\Big(\sum_{\substack{I,J\subseteq[n]\\ |I|,|J|\leq d}} c_{IJ}v_{IJ}v_{IJ}^{\top}\Big)Dz = \sum_{\substack{I,J\subseteq[n]\\ |I|,|J|\leq d}} c_{IJ}\,(z^{\top}Dv_{IJ})^{2},

\overline{\mathbb{SDSOS}}_{\mathcal{G}}^{2d}

\overline{f}_{\mathbb{SDSOS},\mathcal{G}}^{2d} = \min_{\mu:\{0,1\}^{n}\rightarrow\mathbb{R}}\left\{\widetilde{\mathbb{E}}_{\mu}[f] \ \middle|\ \widetilde{\mathbb{E}}_{\mu}[1]=1,\ {M_{g}^{2d}}_{|I}\succeq 0\text{ for all }g\in\mathcal{G},\ |I|=2\right\}.

\mathbb{SA} := \left\{h\in\mathbb{R}[\boldsymbol{x}] \ \middle|\ h=\sum_{I,J}\alpha_{I,J}\cdot\boldsymbol{x}_{I}\overline{\boldsymbol{x}_{J}},\ \alpha_{I,J}\in\mathbb{R}_{\geq 0},\ I,J\subseteq[n]\right\}.

\mathbb{SA}_{\mathcal{G}}^{2d}

\overline{\mathbb{SA}}_{\mathcal{G}}^{2d}

\mathbb{SONC}_{\mathcal{G}}^{2d}

f(\mathbf{x}) := f_{\boldsymbol{\beta}}\mathbf{x}^{\boldsymbol{\beta}}+\sum_{j=0}^{r}f_{\boldsymbol{\alpha}(j)}\mathbf{x}^{\boldsymbol{\alpha}(j)},

\Theta_{f} := \prod_{j=0}^{r}\left(\frac{f_{\boldsymbol{\alpha}(j)}}{\lambda_{j}}\right)^{\lambda_{j}}.

\mathbb{SONC} := \left\{h=\sum_{i=1}^{k}\mu_{i}p_{i} \ \middle|\ p_{i}\text{ is a nonnegative circuit polynomial},\ \mu_{i}\geq 0,\ k\in\mathbb{N}^{*}\right\}.

\mathcal{V}(N_{n}) = \{\boldsymbol{x}\in\mathbb{R}^{n} : \|\boldsymbol{x}\|_{1}=1\}.

\min N_{n}\text{ such that } g_{1},\ldots,g_{s}\geq 0

M_{n} := 1+\sum_{j=1}^{n}\mathbf{x}^{2(\mathbf{e}+\mathbf{e}_{j})}-(n+1)\mathbf{x}^{\mathbf{e}}.

\operatorname{New}(M_{n})\cap(2\mathbb{Z})^{n} = \operatorname{Vert}(\operatorname{New}(M_{n}))\cup\{2\mathbf{e}\}.


Full text

New Dependencies of Hierarchies in Polynomial Optimization

Adam Kurpisz, ETH Zürich, Department of Mathematics, Rämistrasse 101, 8092 Zürich, Switzerland

and

Timo de Wolff, Technische Universität Berlin, Institut für Mathematik, Sekr. MA 6-2, Straße des 17. Juni 136, 10623 Berlin, Germany

Abstract.

We compare four key hierarchies for solving Constrained Polynomial Optimization Problems (CPOP): Sum of Squares (SOS), Sum of Diagonally Dominant Polynomials (SDSOS), Sum of Nonnegative Circuits (SONC), and the Sherali Adams (SA) hierarchies. We prove a collection of dependencies among these hierarchies both for general CPOPs and for optimization problems on the Boolean hypercube. Key results for the general case include that the SONC and SOS hierarchies are polynomially incomparable, while SDSOS is contained in SONC. A direct consequence is the non-existence of a Putinar-like Positivstellensatz for SDSOS. On the Boolean hypercube, we show as a main result that Schmüdgen-like versions of the hierarchies SDSOS∗, SONC∗, and SA∗ are polynomially equivalent. Moreover, we show that SA∗ is contained in any Schmüdgen-like hierarchy that provides an $O(n)$ degree bound.

Key words and phrases:

Hierarchy, nonnegativity, polynomial comparable, polynomial optimization, Sherali Adams, sum of diagonally dominant polynomials, sum of nonnegative circuit polynomials, sum of squares

2010 Mathematics Subject Classification:

Primary: 14P10, 68Q25, 90C60; Secondary: 14Q20; ACM Subject Classification: Theory of computation $\rightarrow$ Proof complexity; Theory of computation $\rightarrow$ Linear programming, Semidefinite programming, Convex optimization

1. Introduction

A Constrained Polynomial Optimization Problem (CPOP) is of the form

$$\min f(\mathbf{x}) \quad \text{subject to} \quad g_{i}(\mathbf{x})\geq 0 \ \text{ for } i=1,\ldots,m,$$

where $f(\mathbf{x})$ and $g_{1}(\mathbf{x}),\ldots,g_{m}(\mathbf{x})$ are $n$-variate real polynomials. Solving CPOP is a crucial nonconvex optimization problem, which lies at the core of both theoretical and applied computer science. A special case of CPOP is a Binary Constrained Polynomial Optimization Problem (BCPOP), where the polynomials $\pm(x_{i}^{2}-x_{i})$ are among the polynomials defining the feasibility set. Many important optimization problems belong to the BCPOP class. However, solving these is NP-hard in general.

A CPOP can be equivalently seen as the problem of maximizing a real $\lambda$ such that $f-\lambda$ is nonnegative over the semialgebraic set defined by the polynomials $g_{1}(\mathbf{x}),\ldots,g_{m}(\mathbf{x})$. This is an interesting perspective since various techniques from real algebraic geometry provide methods for certifying nonnegativity of a real polynomial over semialgebraic sets. The class of such theorems is called Positivstellensätze. These theorems state that, under some assumptions, a polynomial $f$ that is positive (or nonnegative) over the feasibility set can be expressed in a particular algebraic way. Typically, this algebraic expression is a sum of nonnegative polynomials from a chosen ground set of nonnegative polynomials multiplied by the polynomials defining the feasibility set. Choosing a proper ground set of nonnegative polynomials is crucial from the perspective of optimization. Ideally, both testing membership in the ground set and deciding nonnegativity of a polynomial in the ground set should be possible efficiently. Moreover, fixing the maximum degree of the polynomials in the ground set that are used for a representation of $f$ provides a family of algorithms parameterized by an integer $d$, which gives a sequence of lower bounds for the value of CPOP. If the ground set of polynomials is chosen properly, then the sequence of lower bounds converges in $d$ to the optimal value of CPOP.
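As a toy illustration of this equivalence, the following sketch brute-forces a small BCPOP and checks that the minimum coincides with the largest $\lambda$ for which $f-\lambda$ stays nonnegative on the feasible set. The objective `f`, the constraint `g`, and all function names are hypothetical choices made for this example; they do not appear in the paper.

```python
from itertools import product


def f(x):
    # Hypothetical objective on {0,1}^3 (not taken from the paper).
    return x[0] * x[1] - x[1] - x[2] + 2 * x[0] * x[2]


def g(x):
    # Hypothetical constraint g(x) >= 0: at least one variable equals 1.
    return x[0] + x[1] + x[2] - 1


def feasible_points(n, constraints):
    """All points of {0,1}^n that satisfy every constraint c(x) >= 0."""
    return [x for x in product((0, 1), repeat=n)
            if all(c(x) >= 0 for c in constraints)]


def brute_force_min(objective, n, constraints):
    """Minimum of the objective over the feasible set, with one minimizer."""
    best = min(feasible_points(n, constraints), key=objective)
    return objective(best), best


def is_nonnegative_shift(objective, lam, n, constraints):
    """True if objective - lam is nonnegative on the whole feasible set."""
    return all(objective(x) - lam >= 0
               for x in feasible_points(n, constraints))
```

Here `brute_force_min(f, 3, [g])` finds the minimum by enumeration, and `is_nonnegative_shift` accepts exactly those $\lambda$ that do not exceed it. The enumeration has $2^{n}$ points, which is the cost that the hierarchies discussed in this paper aim to avoid.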

One of the most successful approaches for constructing theoretically efficient algorithms is the Sum of Squares (SOS) method [GV01, Nes00, Par00, Sho87], also known as the Lasserre relaxation [Las01]. The method relies on Putinar’s Positivstellensatz [Put93] using sums of squares of polynomials as the ground set. Finding a degree $d$ SOS certificate for nonnegativity of $f$ can be performed by solving a semidefinite programming (SDP) formulation of size $\binom{n+d}{d}^{O(1)}$. Finally, for every (feasible) $n$-variate hypercube optimization problem with constraints of degree at most $d$, there exists a degree $2(n+\lceil d/2\rceil)$ SOS certificate; see e.g., [BS16].
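The quantity $\binom{n+d}{d}$ in the size bound is the number of monomials in $n$ variables of total degree at most $d$. The sketch below computes this count and cross-checks it by explicit enumeration; the function names are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations_with_replacement
from math import comb


def num_monomials(n, d):
    """Number of monomials in n variables of total degree at most d."""
    return comb(n + d, d)


def monomial_exponents(n, d):
    """Exponent vectors of all monomials in n variables of degree <= d."""
    exponents = []
    for k in range(d + 1):
        # A multiset of k variable indices is one monomial of degree k.
        for combo in combinations_with_replacement(range(n), k):
            e = [0] * n
            for i in combo:
                e[i] += 1
            exponents.append(tuple(e))
    return exponents
```

For fixed $d$ the count grows like $n^{d}$, which is why the formulation is polynomial in $n$ only for constant degree.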

The SOS algorithm is a frontier method in algorithm design. It was used to provide the best available algorithms for a variety of combinatorial optimization problems. The Lovász $\theta$-function [Lov79] for the Independent Set problem is implied by the SOS algorithm of degree 2. Moreover, the Goemans-Williamson relaxation [GW95] for the Max Cut problem and the Goemans-Linial relaxation for the Sparsest Cut problem (analyzed in [ARV09]) can be obtained by the SOS algorithm of degree 2 and 6, respectively. SOS was also proven to be a successful method for Maximum Constraint Satisfaction problems (Max CSP). For Max CSP, the SOS algorithm is as powerful as any SDP relaxation of comparable size [LRS15]. Furthermore, SOS was applied to problems in dictionary learning [BKS15, SS17], tensor completion and decomposition [BM16, HSSS16, PS17], and robust estimation [KSS18]. For other applications of the SOS method see e.g., [BRS11, BCG09, Chl07, CS08, CGM13, dlVKM07, GS11, MM09, Mas17, RT12], and the surveys [CT12, Lau03, Lau09].

From a practical perspective, however, solving SDP problems is known to be very time consuming. Moreover, from a theoretical point of view, it is an open problem whether an SDP of size $n^{O(d)}$ can be solved in time $n^{O(d)}$ [O’D16, RW17]. Hence, various methods have been proposed that choose different ground sets of polynomials to make the resulting problem easier to solve, but still effective.

In [AM14], Ahmadi and Majumdar propose an algorithmic framework that chooses the ground set of polynomials to be scaled diagonally-dominant polynomials (SDSOS). SDSOS polynomials can be seen as sums of binomial squares. Thus, the SDSOS algorithm is not stronger than the SOS algorithm. However, searching for a degree-$d$ SDSOS certificate can be performed using a Second Order Conic Program (SOCP) of size $\binom{n+d}{d}^{O(1)}$; see [AM14]. Since, in practice, an SOCP can be solved much faster than an SDP, the algorithm has attracted a lot of attention and has been used to solve problems in Robotics and Control [AMT14, Leo18, PP15, SA16, ZFP18], Option Pricing [AM14], Power Flow [KGNSZ18, SSTL18], and Discrete Geometry [DL16].
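The link between diagonal dominance and binomial squares can be made concrete in the unscaled case. The sketch below assumes a symmetric, diagonally dominant matrix with nonnegative diagonal (that is, the scaling matrix is taken to be the identity, which is a simplifying assumption) and splits it into terms $c\,ww^{\top}$ with $c>0$, where $w$ has at most two nonzero entries equal to $\pm 1$. Each such term contributes a monomial square or a binomial square $c\,(w^{\top}z)^{2}$ to the quadratic form. The function names are hypothetical.

```python
def dd_decompose(A):
    """Split a symmetric diagonally dominant matrix with nonnegative
    diagonal into terms (c, w) with c > 0 such that A = sum c * w w^T,
    where each w has at most two nonzero entries, each +1 or -1."""
    n = len(A)
    terms = []
    for i in range(n):
        off_diagonal = sum(abs(A[i][j]) for j in range(n) if j != i)
        slack = A[i][i] - off_diagonal
        if slack < 0:
            raise ValueError("matrix is not diagonally dominant")
        if slack > 0:
            # Leftover diagonal mass gives a single-entry vector e_i.
            w = [0] * n
            w[i] = 1
            terms.append((slack, w))
        for j in range(i + 1, n):
            if A[i][j] != 0:
                # Off-diagonal entry a_ij gives |a_ij| (e_i +/- e_j)(e_i +/- e_j)^T.
                w = [0] * n
                w[i] = 1
                w[j] = 1 if A[i][j] > 0 else -1
                terms.append((abs(A[i][j]), w))
    return terms


def rebuild(terms, n):
    """Sum the rank-one terms c * w w^T back into an n x n matrix."""
    M = [[0] * n for _ in range(n)]
    for c, w in terms:
        for i in range(n):
            for j in range(n):
                M[i][j] += c * w[i] * w[j]
    return M
```

For example, `rebuild(dd_decompose(A), 3)` returns `A` again for `A = [[3, -1, 2], [-1, 2, 0], [2, 0, 4]]`. The scaled case replaces $z$ by $Dz$ for a positive diagonal matrix $D$, which this sketch does not search for.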

An alternative approach, more tractable than SOS, was initiated by Sherali and Adams in [SA90]. The technique was first introduced as a method to tighten Linear Program (LP) relaxations for BCPOP problems, and in such settings finding a degree $d$ certificate can be done by solving an LP of size $\binom{n+d}{d}^{O(1)}$. The Sherali Adams (SA) algorithm arises from using the set of polynomials that depend on at most $d$ variables and are nonnegative on the Boolean hypercube. These polynomials are called $d$-juntas. The SA algorithm was used to construct some of the most prominent algorithms with good asymptotic running time in combinatorial optimization [CLRS16, LR16, TZ17], logic [AM13], and other fields of computer science.
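To see why such juntas lead to linear programs, note that any function of $d$ Boolean variables can be written in the basis of products $\prod_{a_i=1}x_i\prod_{a_i=0}(1-x_i)$, and the function is nonnegative on the hypercube exactly when all coefficients in this basis are nonnegative. The sketch below illustrates this under the assumption that the complemented variables are $1-x_j$; the example function `h` and the helper names are hypothetical.

```python
from itertools import product


def indicator_term(a):
    """Function x -> prod_{a_i = 1} x_i * prod_{a_i = 0} (1 - x_i)."""
    def term(x):
        value = 1
        for ai, xi in zip(a, x):
            value *= xi if ai == 1 else (1 - xi)
        return value
    return term


def sa_representation(h, d):
    """Coefficients alpha_a = h(a) of h in the indicator-product basis."""
    return {a: h(a) for a in product((0, 1), repeat=d)}


def evaluate(coeffs, x):
    """Evaluate sum_a alpha_a * indicator_a(x)."""
    return sum(alpha * indicator_term(a)(x) for a, alpha in coeffs.items())


def h(x):
    # Hypothetical 2-junta, nonnegative on {0,1}^2.
    return 1 - x[0] * x[1]
```

On $\{0,1\}^{d}$ each basis product is the indicator of a single point, so the coefficient attached to the point $a$ is simply the value $h(a)$.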

Finally, a method independent from SOS was introduced in [IdW16] using Sum of Nonnegative Circuit Polynomials (SONC) as a ground set. These polynomials form a full dimensional cone in the cone of nonnegative polynomials, which is not contained in the SOS cone. For example, the well-known Motzkin polynomial is a nonnegative circuit polynomial, but not an SOS. Moreover, SONCs generalize polynomials which are certified to be nonnegative via the arithmetic-geometric mean inequality [Rez89]. SONC certificates of degree $d$ can be computed via a convex optimization program called *Relative Entropy Programming (REP)* of size $\binom{n+d}{d}^{O(1)}$ [DIdW17, Theorem 5.3]; see also [CS16, CMW18]. Recently, an experimental comparison of SONC with the SOS method for unconstrained optimization was presented in [SdW18].
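The Motzkin polynomial $x^{4}y^{2}+x^{2}y^{4}+1-3x^{2}y^{2}$ illustrates the arithmetic-geometric mean argument: the three positive terms have geometric mean $x^{2}y^{2}$, so their sum is at least $3x^{2}y^{2}$. The following sketch evaluates the polynomial exactly on a rational grid as a sanity check. A grid check is only numerical evidence, and the grid parameters are arbitrary choices for this example.

```python
from fractions import Fraction


def motzkin(x, y):
    """Motzkin polynomial x^4 y^2 + x^2 y^4 + 1 - 3 x^2 y^2."""
    return x**4 * y**2 + x**2 * y**4 + 1 - 3 * x**2 * y**2


def grid_minimum(steps=12, scale=4):
    """Exact minimum of the Motzkin polynomial on the rational grid
    {k / scale : -steps <= k <= steps}^2, here a grid in [-3, 3]^2."""
    points = [Fraction(k, scale) for k in range(-steps, steps + 1)]
    return min(motzkin(x, y) for x in points for y in points)
```

The grid contains the points $(\pm 1,\pm 1)$, where the three positive terms are equal and the polynomial vanishes.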

For all presented algorithms, one can define a potentially stronger algorithm without changing the corresponding ground set of polynomials, by using a more general construction for the certificate of nonnegativity. Such a certificate expresses a polynomial, which is nonnegative over a given semialgebraic set, as a sum of polynomials from the ground set multiplied by the product of polynomials defining the semialgebraic set; see section 2 for further details. We call the resulting systems $SOS^{*}$, $SDSOS^{*}$, $SA^{*}$, and $SONC^{*}$. Some of these extensions were intensively studied in the literature; see e.g., [GHP02, Wor15].

Our Results

In this paper, we provide an extensive comparison of the presented semialgebraic proof systems. More precisely, following the definitions in e.g., [BFI+18], we analyze their polynomial comparability:

Definition 1.1.

Let $P$ and $Q$ be semialgebraic proof systems. $P$ contains $Q$ if for every semialgebraic set $\mathcal{G}_{+}$ and every polynomial $f$ admitting a degree $d$ certificate of nonnegativity over $\mathcal{G}_{+}$ in $Q$, $f$ also admits a degree $O(d)$ certificate in $P$. System $P$ strictly contains $Q$ if $P$ contains $Q$ but $Q$ does not contain $P$. Systems $P$ and $Q$ are polynomially equivalent if $P$ contains $Q$ and $Q$ contains $P$. Finally, systems $P$ and $Q$ are polynomially incomparable if neither $P$ contains $Q$ nor $Q$ contains $P$.

∎

For a more detailed definition of proof systems and their comparability, see subsection 2.5.

In this article, we show the dependencies between the proof systems presented in figure 1.

In particular, in section 3, we show that for general CPOP problems the SOS proof system is polynomially incomparable with the SONC proof system. We also prove that the same relation holds for the SOS∗ and SONC∗ proof systems; see corollary 3.7. So far, it was only known that the cones of SOS and SONC polynomials are not contained in each other [IdW16, Proposition 7.2]; however, this has no direct implication for the relation between the SOS and the SONC methods for CPOP. Similarly, in a very recent result [CMW18], the authors point out that the SONC cone contains the SDSOS cone. In this paper, in section 4, we extend this result to CPOP problems by proving that the SONC certificate strictly contains the SDSOS certificate, and that the same relation holds for the SONC∗ and SDSOS∗ certificates; see corollary 4.3. As a consequence, we conclude that there exists no Putinar-like Positivstellensatz for SDSOS; see corollary 4.4.

For the BCPOP we provide a general, sufficient condition for a proof system to contain the $SA^{*}$ proof system; see theorem 5.1. This, combined with the results from subsection 5.2 and subsection 5.3, proves the polynomial equivalence of $SONC^{*}$, $SDSOS^{*}$, and $SA^{*}$ on the Boolean hypercube. Moreover, by proving some properties of SONC, SDSOS, and SA polynomials in lemma 4.1 and lemma 5.6, we prove additional dependencies between the hierarchies in subsection 5.2, subsection 5.3, and subsection 5.4.

We remark that all results in this article concern the minimal degrees for certificates in a particular proof system as these are the standard way to measure the complexity of algorithms in theoretical computer science. Our results do not directly imply a particular behaviour of actual runtimes in an experimental setting, as these depend on various further factors other than the degree.

Acknowledgements

AK is supported by SNSF project PZ00P2_174117; TdW is supported by the DFG grant WO 2206/1-1.

2. Preliminaries

In this section, we introduce the proof systems used in this article. Moreover, for the sake of clarity, we provide dual formulations for some of the presented proof systems for the BCPOP case. We begin by introducing basic notation. For any $n,d\in\mathbb{N}$ we denote $[n]=\{1,\ldots,n\}$ and $\binom{n}{\leq d} := \sum_{i=0}^{d}\binom{n}{i}$. Let $\mathbb{N}^{*}=\mathbb{N}\setminus\{0\}$ and let $\mathbb{R}_{\geq 0}$ ($\mathbb{R}_{>0}$) be the set of nonnegative (positive) real numbers. Let $\mathbb{R}[\boldsymbol{x}]=\mathbb{R}[x_{1},\ldots,x_{n}]$ be the ring of *$n$-variate real polynomials*, and for every $f\in\mathbb{R}[\boldsymbol{x}]$ we define the *real zero set* as $\mathcal{V}(f)=\{\boldsymbol{x}\in\mathbb{R}^{n}\ |\ f(\boldsymbol{x})=0\}$. We denote the *Newton polytope* of $f$ by $\operatorname{New}(f)$ and the vertices of $\operatorname{New}(f)$ by $\operatorname{Vert}\left(\operatorname{New}(f)\right)$. A lattice point is called even if it is in $(2\mathbb{N})^{n}$, and a term $f_{\boldsymbol{\alpha}}\mathbf{x}^{\boldsymbol{\alpha}}$ is called a monomial square if $f_{\boldsymbol{\alpha}}\in\mathbb{R}_{\geq 0}$ and $\boldsymbol{\alpha}$ is even.
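The counting notation and the parity conditions above translate directly into code. The sketch below represents an exponent vector as a tuple of integers; the helper names are illustrative assumptions, not notation from the paper.

```python
from math import comb


def binom_le(n, d):
    """The quantity binom(n, <= d): number of subsets of [n] of size <= d."""
    return sum(comb(n, i) for i in range(d + 1))


def is_even(alpha):
    """True if the lattice point alpha lies in (2N)^n."""
    return all(a >= 0 and a % 2 == 0 for a in alpha)


def is_monomial_square(coeff, alpha):
    """True if coeff * x^alpha has a nonnegative coefficient and an even exponent."""
    return coeff >= 0 and is_even(alpha)
```

On the Boolean hypercube, `binom_le(n, d)` also counts the multilinear monomials of degree at most $d$, since these correspond to subsets of $[n]$ of size at most $d$.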

In what follows we introduce different proof systems and their notation. In addition to the specific sources that we provide later in the section, we refer the reader to introductory literature such as [BPT13, Lau09, Las15, Mar08] on the mathematical side, and [Raz16, Rot13] on the computer science side. Moreover, we fix the notation

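$$\mathcal{G} := \{g_{0}:=1,\, g_{1},\ldots,g_{m} \,:\, g_{i}\in\mathbb{R}[\boldsymbol{x}]\text{ for all } i\in[m]\}$$

for a set of polynomials. Throughout the paper we assume that the cardinality $m$ of the set $\mathcal{G}$ is polynomial in $n$. For a given $\mathcal{G}$, we define the corresponding *semialgebraic set*

$$\mathcal{G}_{+} := \{\boldsymbol{x}\in\mathbb{R}^{n} \mid g(\boldsymbol{x})\geq 0\text{ for all } g\in\mathcal{G}\}\subseteq\mathbb{R}^{n}.$$

Furthermore, for any given semialgebraic set $\mathcal{G}_{+}\subseteq\mathbb{R}^{n}$, we consider the set of nonnegative polynomials with respect to $\mathcal{G}_{+}$

$$\mathcal{K}(\mathcal{G}_{+}) := \{f\in\mathbb{R}[\boldsymbol{x}] \mid f(\boldsymbol{x})\geq 0\text{ for all }\boldsymbol{x}\in\mathcal{G}_{+}\}.$$

For a given $f\in\mathbb{R}[\boldsymbol{x}]$ and a set of constraints $\mathcal{G}$, we define the corresponding constrained polynomial optimization problem (CPOP) as (see e.g., [BV04])

$$f_{\mathcal{G}}^{*} := \min\{f(\boldsymbol{x}) \mid \boldsymbol{x}\in\mathcal{G}_{+}\} = \max\{\lambda\in\mathbb{R} \mid f-\lambda\in\mathcal{K}(\mathcal{G}_{+})\}.$$

Hence, $\mathcal{G}_{+}$ corresponds to the feasibility region of the program (CPOP).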

The problem (CPOP) is NP-hard in general. Thus, one chooses proper subsets $\mathcal{K}(\mathcal{G}_{+})'\subseteq\mathcal{K}(\mathcal{G}_{+})$ such that, on the one hand, the corresponding polynomial optimization problem provides a lower bound on the value of (CPOP) and, on the other hand, is computationally tractable. Such subsets are called *certificates of nonnegativity*. The choice of a suitable certificate of nonnegativity is crucial for obtaining a good lower bound for the problem (CPOP).

Let us be more specific. For a given $\mathcal{G}$ the induced *preprime* is given by

$$\operatorname{Prep}(\mathcal{G}) := \left\{\sum_{i=0}^{s}c_{i}g_{0}^{k_{i,0}}\cdots g_{m}^{k_{i,m}} \ \middle|\ c_{i}\in\mathbb{R}_{\geq 0},\ g_{j}\in\mathcal{G},\ k_{i,j}\in\mathbb{N}\right\}.$$

Note that $\mathcal{K}(\mathcal{G}_{+})=\mathcal{K}(\operatorname{Prep}(\mathcal{G})_{+})$. Throughout the paper we assume that, for a given $\mathcal{G}$, $m_{2d}$ is the cardinality of the set $\operatorname{Prep}(\mathcal{G})$ restricted to polynomials of degree at most $2d$. In order to relax (CPOP) to a finite size optimization problem we introduce polynomial hierarchies.

Definition 2.1.

Let $\mathcal{G}$ be a collection of polynomials and let $\mathbb{GEN}$ be a subset of $\mathcal{K}(\mathcal{G}_{+})$. We define the following hierarchy of certificates of nonnegativity, depending on the degree $d\in\mathbb{N}$:

$$\operatorname{Hier}_{\mathcal{G}}^{2d}(\mathbb{GEN}) := \left\{\sum_{i=1}^{m}s_{i}g_{i}\in\mathbb{R}[\boldsymbol{x}] \ \middle|\ s_{i}\in\mathbb{GEN},\ g_{i}\in\mathcal{G},\text{ and }\deg(s_{i}g_{i})\leq 2d\text{ for all }i\in[m]\right\}.$$

In several contexts it is more useful to consider the preprime of the constraints, i.e., $\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}(\mathbb{GEN})$. Every such hierarchy of polynomials yields a sequence of lower bounds given by the following optimization program:

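$$f_{\mathbb{GEN},\mathcal{G}}^{2d} := \max\left\{\lambda\in\mathbb{R} \ \middle|\ f-\lambda\in\operatorname{Hier}_{\mathcal{G}}^{2d}(\mathbb{GEN})\right\}. \qquad (\mathbb{GEN}_{\mathcal{G}}^{2d})$$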

∎

Throughout this paper we assume that the set $\mathcal{G}$ is chosen such that $\operatorname{Prep}(\mathcal{G})$ is Archimedean, a property which is, e.g., implied by the compactness of $\mathcal{G}_{+}$. In what follows we occasionally enforce compactness of $\operatorname{Prep}(\mathcal{G})$ by adding box constraints $l_{i}:=N\pm x_{i}\geq 0$ to $\mathcal{G}$ with $N\in\mathbb{N}$ sufficiently large for $i\in[n]$.

Under this assumption we obtain from Krivine’s general Positivstellensatz [Kri64a, Kri64b], see also [Mar08, Theorem 5.4.4], the following Schmüdgen-type Positivstellensatz; see [Sch02, Theorem 5.1]:

Theorem 2.2.

Let $\operatorname{Prep}(\mathcal{G})$ be Archimedean and let $\mathbb{GEN}\subseteq\mathcal{K}(\mathcal{G}_{+})$ such that $\mathbb{GEN}$ is closed under addition. Let $f(\boldsymbol{x})>0$ for all $\boldsymbol{x}\in\mathcal{G}_{+}$. Then there exists a $d\in\mathbb{N}$ such that $f_{\mathbb{GEN},\operatorname{Prep}(\mathcal{G})}^{2d}=f_{\mathcal{G}}^{*}$.

For the SOS hierarchy this theorem was first shown by Schmüdgen in [Sch91].

In the following subsections we introduce some of the most prominent inner approximations of the cone $\mathcal{K}(\mathcal{G}_{+})$.

2.1. Sum of Squares

The *SOS method* approximates the cone $\mathcal{K}(\mathcal{G}_{+})$ by using the set of *sum of squares polynomials* instead of the entire set of nonnegative polynomials. Let $\mathbb{SOS} := \{s \mid s=\sum_{i=1}^{k}f_{i}^{2},\ f_{i}\in\mathbb{R}[\boldsymbol{x}],\ k\in\mathbb{N}^{*}\}$ be the set of (finite) sum of squares polynomials (SOS). The SOS program of degree $2d$ takes the following form:

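$$f_{\mathbb{SOS},\mathcal{G}}^{2d} := \max\left\{\lambda\in\mathbb{R} \ \middle|\ f-\lambda\in\operatorname{Hier}_{\mathcal{G}}^{2d}(\mathbb{SOS})\right\}, \qquad (\mathbb{SOS}_{\mathcal{G}}^{2d})$$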

analogously, the SOS∗ program of degree $2d$ takes the form ($\mathbb{SOS}_{\operatorname{Prep}(\mathcal{G})}^{2d}$). For the SOS hierarchy, Putinar proved the following Positivstellensatz, which is an improvement of Schmüdgen’s Positivstellensatz.

Theorem 2.3 (Putinar’s Positivstellensatz; [Put93]).

Let $\mathcal{G}$ be a set of polynomial constraints with $\bigcup_{d\in\mathbb{N}}\operatorname{Hier}_{\mathcal{G}}^{2d}(\mathbb{SOS})$ being Archimedean, and let $f\in\mathbb{R}[\boldsymbol{x}]$ with $f(\boldsymbol{x})>0$ for all $\boldsymbol{x}\in\mathcal{G}_{+}$. Then there exists a $d\in\mathbb{N}$ such that $f_{\mathbb{SOS},\mathcal{G}}^{2d}=f_{\mathcal{G}}^{*}$.

Theorem 2.3 provides a sequence of cones that approximate $\mathcal{K}(\mathcal{G}_{+})$ from the inside, such that the values of $f_{\mathbb{SOS},\mathcal{G}}^{2d}$ give a sequence of lower bounds that converges in $d$ to the optimal value of (CPOP).

The program ($\mathbb{SOS}_{\mathcal{G}}^{2d}$) can be solved using a semidefinite program (SDP) of size $\binom{n+d}{d}^{O(1)}$; see e.g., [Las01, Nes00, Par00, Sho87]. This is implied by the following fact; see e.g., [Par00].

Theorem 2.4.

A polynomial $p\in\mathbb{R}[\boldsymbol{x}]$ is a SOS of degree $d$ if and only if there exists a positive semidefinite matrix $G$ , called the Gram matrix, such that $p=\boldsymbol{z}^{\top}G\boldsymbol{z}$ , for $\boldsymbol{z}$ being the vector of $n$ -variate monomials of total degree at most $d$ .
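The Gram matrix characterization can be checked by hand on a small instance. The following sketch, written for illustration only, takes $p=(1-x_{1}-x_{2})^{2}$ with the monomial vector $\boldsymbol{z}=(1,x_{1},x_{2})$ and the rank-one Gram matrix $G=\boldsymbol{v}\boldsymbol{v}^{\top}$, $\boldsymbol{v}=(1,-1,-1)$. The helper names `det`, `is_psd` and `gram_form` are hypothetical, and positive semidefiniteness is tested through all principal minors, which is only practical for very small matrices.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant via exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[c])]
    return sign * result

def is_psd(m):
    """A symmetric matrix is PSD iff all of its principal minors are nonnegative."""
    n = len(m)
    return all(
        det([[m[i][j] for j in idx] for i in idx]) >= 0
        for k in range(1, n + 1)
        for idx in combinations(range(n), k)
    )

# Gram matrix G = v v^T of p = (1 - x1 - x2)^2 in the basis z = (1, x1, x2).
v = [1, -1, -1]
G = [[a * b for b in v] for a in v]

def p(x1, x2):
    return (1 - x1 - x2) ** 2

def gram_form(x1, x2):
    """Evaluate z^T G z at the point (x1, x2)."""
    z = [1, x1, x2]
    return sum(z[i] * G[i][j] * z[j] for i in range(3) for j in range(3))
```

In an actual SOS program the entries of $G$ are the unknowns of the SDP; here $G$ is fixed in advance and only verified.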

The size of the SDP is $\binom{n+d}{d}^{O(1)}$. Moreover, for BCPOP problems, when the hypercube constraints $\pm(x_{i}^{2}-x_{i})$ are incorporated in $\mathcal{G}$, it is known for $d_{\mathcal{G}}:=\max\{\deg(g) \mid g\in\mathcal{G}\}$ that $\mathbb{SOS}^{2n+2\lceil d_{\mathcal{G}}/2\rceil}$ solves the problem exactly, i.e., ${f}_{\mathbb{SOS},\mathcal{G}}^{2n+2\lceil d_{\mathcal{G}}/2\rceil}={f}_{\mathcal{G}}^{*}$; see e.g., [BS14].

2.1.1. SOS - The dual perspective: Lasserre hierarchy

Consider a BCPOP. Let $\lambda\in\mathbb{R}$ be such that $f-\lambda\notin\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SOS}})$ for some $d\in\mathbb{N}^{*}$. By the hyperplane separation theorem for convex cones, there exists a hyperplane that separates $f-\lambda$ from $\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SOS}})$. Note that for BCPOP we can restrict to polynomials defined on the hypercube $\{0,1\}^{n}\to\mathbb{R}$, i.e., to the vector space of multilinear polynomials. The hyperplane is represented by the polynomial $\mu:\{0,1\}^{n}\to\mathbb{R}$, which is a normal vector to the hyperplane, such that for every polynomial $h\in\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SOS}})$ we have $\sum_{\boldsymbol{x}\in\{0,1\}^{n}}\mu(\boldsymbol{x})\cdot h(\boldsymbol{x})\geq 0$, while $\sum_{\boldsymbol{x}\in\{0,1\}^{n}}\mu(\boldsymbol{x})\cdot(f-\lambda)(\boldsymbol{x})<0$. By scaling we can assume that $\sum_{\boldsymbol{x}\in\{0,1\}^{n}}\mu(\boldsymbol{x})=1$. To every function $\mu$ we can associate a linear operator $\widetilde{\mathbb{E}}_{\mu}:\mathbb{R}[\boldsymbol{x}]\to\mathbb{R}$, mapping polynomials to real numbers, defined by

[TABLE]

which is called the pseudoexpectation. The dual problem to ( $\mathbb{\mathbb{SOS}}_{\mathcal{G}}^{2d}$ ) is the $\overline{\text{SOS}}$ program of degree $2d$ . It takes the form

[TABLE]

and is known as the Lasserre relaxation (of degree $2d$). It can be solved using an SDP of size $\binom{n}{\leq d}^{O(1)}$ [Las01]. Analogously, the $\overline{\text{SOS}}^{\,*}$ program of degree $2d$ takes the form $\overline{\mathbb{SOS}}_{\operatorname{Prep}(\mathcal{G})}^{2d}$.

Problem ( $\mathbb{\overline{\mathbb{SOS}}}_{\mathcal{G}}^{2d}$ ) can be reformulated in terms of moment and localizing matrices. Consider $h=s^{2}\cdot g\in\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SOS}})$, for $s\in\mathbb{R}[\boldsymbol{x}]$ and $g\in\mathcal{G}$. Let $d^{\prime}=\lfloor\frac{2d-\deg(g)}{2}\rfloor$ and let $\boldsymbol{v}$ be the *vector of coefficients* of $s$, such that $s(\boldsymbol{x})=\sum_{I\subseteq[n],\,|I|\leq d^{\prime}}v_{I}\prod_{i\in I}x_{i}$. We can write

[TABLE]

where $M_{g}^{2d}\in\mathbb{R}^{\binom{n}{\leq d^{\prime}}\times\binom{n}{\leq d^{\prime}}}$ is a real, symmetric matrix whose rows and columns are indexed by sets $I,J\subseteq[n]$ of size at most $d^{\prime}$ such that $M_{g}^{2d}(I,J)=\widetilde{\mathbb{E}}_{\mu}[g\cdot\prod_{k\in I\cup J}x_{k}]$. For $g=1$ the matrix is called the moment matrix, and for all other $g$ it is called the localizing matrix for the constraint $g$. Since the requirement that $\boldsymbol{v}^{\top}M\boldsymbol{v}\geq 0$ holds for every real vector $\boldsymbol{v}$ is equivalent to $M$ being positive semidefinite (PSD), denoted by $M\succeq 0$, we can reformulate ( $\mathbb{\overline{\mathbb{SOS}}}_{\mathcal{G}}^{2d}$ ) as:

[TABLE]
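To make the moment matrix concrete, the following sketch builds it for $n=2$ and $d^{\prime}=1$ from an explicit function $\mu$ on $\{0,1\}^{2}$. It assumes that the pseudoexpectation is the weighted sum $\widetilde{\mathbb{E}}_{\mu}[p]=\sum_{\boldsymbol{x}}\mu(\boldsymbol{x})p(\boldsymbol{x})$ and uses the entry rule $M(I,J)=\widetilde{\mathbb{E}}_{\mu}[\prod_{k\in I\cup J}x_{k}]$ stated above for $g=1$. All function names are hypothetical, and the PSD test via principal minors is only meant for tiny matrices.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant via exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[c])]
    return sign * result

def is_psd(m):
    """A symmetric matrix is PSD iff all of its principal minors are nonnegative."""
    n = len(m)
    return all(
        det([[m[i][j] for j in idx] for i in idx]) >= 0
        for k in range(1, n + 1)
        for idx in combinations(range(n), k)
    )

def subsets_up_to(n, d):
    """All subsets of {0, ..., n-1} of size at most d, ordered by size."""
    return [frozenset(c) for k in range(d + 1) for c in combinations(range(n), k)]

def pseudo_expectation(mu, monomial):
    """Assumed definition: sum of mu(x) over all x with x_k = 1 for every k in monomial."""
    return sum(w for x, w in mu.items() if all(x[k] for k in monomial))

def moment_matrix(mu, n, d):
    """Matrix with entries M(I, J) = E_mu[prod_{k in I u J} x_k]."""
    index = subsets_up_to(n, d)
    return [[pseudo_expectation(mu, I | J) for J in index] for I in index]

# A genuine probability distribution: uniform on the points (0,1) and (1,0).
mu = {(0, 1): Fraction(1, 2), (1, 0): Fraction(1, 2)}
M = moment_matrix(mu, n=2, d=1)

# A signed function with total mass 1 that is not a valid pseudo-distribution.
bad_mu = {(0, 0): Fraction(2), (1, 1): Fraction(-1)}
M_bad = moment_matrix(bad_mu, n=2, d=1)
```

For the true distribution `mu` the moment matrix is PSD, whereas `M_bad` has a negative diagonal entry and therefore violates the constraint $M\succeq 0$.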

2.2. Scaled Diagonally Dominant Sum of Squares

In [AM14] Ahmadi and Majumdar proposed an approximation of the cone $\mathbb{SOS}$ based on scaled diagonally-dominant polynomials (SDSOS), defined below in subsection 2.2.1. Let $\mathbb{SDSOS}$ be the set of finite sums of scaled diagonally-dominant polynomials. We obtain the following program:

[TABLE]

analogously, the SDSOS${}^{*}$ program takes the form $\mathbb{SDSOS}_{\operatorname{Prep}(\mathcal{G})}^{2d}$. Since $\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})\subseteq\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SOS}})\subseteq\mathcal{K}_{\mathcal{G}}$ for every $d\in\mathbb{N}$, we have ${f}_{\mathbb{SDSOS},\mathcal{G}}^{2d}\leq{f}_{\mathbb{SOS},\mathcal{G}}^{2d}\leq{f}_{\mathcal{G}}^{*}$. Moreover, $\mathbb{SDSOS}^{2d}$ can be solved using second order cone programming (SOCP) of size $\binom{n+d}{d}^{O(1)}$; see [AM14].

2.2.1. Scaled diagonally-dominant polynomials

We introduce the formal details for SDSOS certificates.

Definition 2.5.

A real symmetric $m\times m$ matrix $M$ is called diagonally-dominant (dd) if for every $i\in[m]$ we have $M(i,i)\geq\sum_{j\neq i}|M(i,j)|$. Moreover, $M$ is called scaled diagonally-dominant (sdd) if there exists a positive real diagonal matrix $D$ such that $DMD$ is dd. A polynomial $p(\boldsymbol{x})\in\mathbb{R}[\boldsymbol{x}]$ of total degree $d$ is scaled diagonally-dominant, denoted $p\in\mathbb{SDSOS}$, if there exists an sdd matrix $M$ such that $p=\boldsymbol{z}^{\top}M\boldsymbol{z}$, for $\boldsymbol{z}$ being the vector of $n$-variate monomials of total degree at most $d$. ∎

Every SDSOS polynomial is an SOS polynomial: By definition 2.5, every sdd matrix can be written as $DMD$ with $M$ a dd matrix and $D$ a positive diagonal matrix. By the Gershgorin circle theorem, the matrix $M$ is PSD. Moreover, $DMD=DMD^{\top}$. Since $DMD^{\top}$ is a congruence transformation of $M$, which does not change the signs of the eigenvalues, the matrix $DMD^{\top}$ is also PSD.
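A small numerical illustration of the dd and sdd conditions: the matrix below is not diagonally dominant, but becomes so after scaling with a positive diagonal matrix, and it is PSD as the argument above predicts. The function names and the chosen scaling vector are illustrative assumptions, not part of the original text.

```python
def is_dd(m):
    """Check M(i,i) >= sum_{j != i} |M(i,j)| for every row i."""
    n = len(m)
    return all(
        m[i][i] >= sum(abs(m[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

def scale(m, d):
    """Return D M D for the positive diagonal matrix D = diag(d)."""
    n = len(m)
    return [[d[i] * m[i][j] * d[j] for j in range(n)] for i in range(n)]

def is_psd_2x2(m):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are nonnegative."""
    return m[0][0] >= 0 and m[1][1] >= 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] >= 0

M = [[1, 2], [2, 5]]          # not dd: the first row has 1 < 2
scaled = scale(M, [2, 1])     # [[4, 4], [4, 5]], which is dd
```

Finding a suitable scaling in general is part of the optimization problem; here the vector `[2, 1]` was simply chosen by inspection.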

Next, we provide a further characterization of SDSOS polynomials. We start with recalling the known characterization of diagonally dominant (dd) matrices.

Lemma 2.6 ([BC75]).

A symmetric $m\times m$ matrix $M$ is dd if and only if

[TABLE]

for $c_{ij}\geq 0$ and $\{\boldsymbol{w}_{ij}\}_{i,j\in[m]}\subseteq\mathbb{R}^{m}$ being a set of vectors, each with at most two nonzero entries, at positions $i$ and $j$, which equal $\pm 1$.
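One direction of lemma 2.6 is constructive, and the construction can be sketched in a few lines. The code below assumes the decomposition has the form $M=\sum c_{ij}\boldsymbol{w}_{ij}\boldsymbol{w}_{ij}^{\top}$: each off-diagonal entry contributes a vector with two entries $\pm 1$, and the remaining diagonal slack contributes unit vectors. The names `dd_decomposition` and `reconstruct` are hypothetical.

```python
def dd_decomposition(m):
    """Return a list of (c, w) with c > 0 and M = sum of c * w w^T, for a dd matrix M."""
    n = len(m)
    terms = []
    for i in range(n):
        # Diagonal slack left after paying for all off-diagonal entries of row i.
        slack = m[i][i] - sum(abs(m[i][j]) for j in range(n) if j != i)
        if slack < 0:
            raise ValueError("matrix is not diagonally dominant")
        if slack > 0:
            w = [0] * n
            w[i] = 1
            terms.append((slack, w))
        for j in range(i + 1, n):
            if m[i][j] != 0:
                w = [0] * n
                w[i] = 1
                w[j] = 1 if m[i][j] > 0 else -1
                terms.append((abs(m[i][j]), w))
    return terms

def reconstruct(terms, n):
    """Sum the rank-one matrices c * w w^T."""
    return [[sum(c * w[i] * w[j] for c, w in terms) for j in range(n)] for i in range(n)]

Q = [[3, -1, 2], [-1, 2, 0], [2, 0, 2]]
terms = dd_decomposition(Q)
```

Each term $c\,\boldsymbol{w}\boldsymbol{w}^{\top}$ corresponds to a square of a form with at most two monomials, which is what yields the binomial square representation of SDSOS polynomials.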

By definition 2.5 and lemma 2.6, every $n$ -variate sdd polynomial $s$ of degree at most $d$ is of the form

[TABLE]

where $\boldsymbol{z}$ is the vector of $n$-variate monomials of maximal degree $d$, $Q$ is a dd matrix, and $D$ is a positive diagonal matrix. Since every vector $\boldsymbol{v}_{IJ}$ has at most two nonzero entries, both equal to $\pm 1$, the SDSOS polynomial $s$ is always of the form $s(x)=\sum_{k}(a_{k}p_{k}(x)+b_{k}q_{k}(x))^{2}$, where $p_{k},q_{k}$ are monomials and $a_{k},b_{k}\in\mathbb{R}$.

2.2.2. SDSOS - Dual perspective

For the BCPOP the dual of the problem ( $\mathbb{\mathbb{SDSOS}}_{\mathcal{G}}^{2d}$ ) is a relaxation of the problem ( $\mathbb{\overline{\mathbb{SOS}}}_{\mathcal{G}}^{2d}$ ). Indeed, similarly to formulation ( $\mathbb{\overline{\mathbb{SOS}}}_{\mathcal{G}}^{2d}$ ), conic duality theory can be used to transform program ( $\mathbb{\mathbb{SDSOS}}_{\mathcal{G}}^{2d}$ ) into its dual of the form

[TABLE]

for $\widetilde{\mathbb{E}}$ being a linear map, defined as in subsection 2.1.1. Analogously, the $\overline{\text{SDSOS}}^{\,*}$ program of degree $2d$ takes the form $\overline{\mathbb{SDSOS}}_{\operatorname{Prep}(\mathcal{G})}^{2d}$.

Similarly to (2.1), formulation ( $\mathbb{\overline{\mathbb{SDSOS}}}_{\mathcal{G}}^{2d}$ ) can be transformed into matrix form. In this case we obtain a set of $2\times 2$ matrices that are required to be PSD. More formally, let $K\subseteq\{I \mid I\subseteq[n],\ |I|\leq d\}$. For $M\in\mathbb{R}^{\binom{n}{\leq d}\times\binom{n}{\leq d}}$ being a real, symmetric matrix whose rows and columns are indexed by sets $I,J\subseteq[n]$ of size at most $d$, let $M_{\big|K}$ be the principal submatrix of $M$ consisting of the entries that lie in the rows and columns indexed by the sets in $K$.

We obtain that $\mathbb{\overline{\mathbb{SDSOS}}}_{\mathcal{G}}^{2d}$ is equivalent to:

[TABLE]

For BCPOP, both ( $\mathbb{\overline{\mathbb{SDSOS}}}_{\mathcal{G}}^{2d}$ ) and (2.3) are solvable via an SOCP of size $\binom{n}{\leq d}^{O(1)}$ . For more details we refer the reader to [AM17].
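The difference between requiring $M\succeq 0$ and requiring only that $2\times 2$ principal submatrices are PSD can be seen on a $3\times 3$ example. The sketch below assumes that the matrix form of the SDSOS dual imposes exactly the $2\times 2$ condition; the matrix `M`, the vector `witness` and the function names are chosen for illustration.

```python
from itertools import combinations

def is_psd_2x2(a, b, c):
    """[[a, b], [b, c]] is PSD iff a >= 0, c >= 0 and a*c - b*b >= 0."""
    return a >= 0 and c >= 0 and a * c - b * b >= 0

def all_2x2_principal_psd(m):
    """Check every 2x2 principal submatrix of the symmetric matrix m."""
    n = len(m)
    return all(
        is_psd_2x2(m[i][i], m[i][j], m[j][j])
        for i, j in combinations(range(n), 2)
    )

def quadratic_form(m, v):
    """Evaluate v^T M v."""
    n = len(m)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

M = [[1, 1, -1], [1, 1, 1], [-1, 1, 1]]
witness = [1, -1, 1]   # a vector with negative quadratic form, so M is not PSD
```

The matrix passes all $2\times 2$ checks but fails the full PSD condition, which illustrates why the SDSOS dual is a relaxation of the Lasserre relaxation.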

2.3. Sherali Adams

An alternative method to approximate the sum of squares cone is based on nonnegative polynomials that depend on a limited number of variables, called *$d$-juntas*. The resulting program is called the Sherali Adams algorithm (SA) and was first introduced in [SA90] as a method to tighten the linear programming relaxations for 0/1 hypercube optimization problems. Thus, we assume throughout the section and whenever we consider (SA) that the $\{0,1\}^{n}$ hypercube constraints are contained in $\mathcal{G}$, meaning that $\mathcal{G}=\{1,\pm(x_{1}^{2}-x_{1}),\ldots,\pm(x_{n}^{2}-x_{n}),g_{1},\ldots,g_{m}\}$.

For $I\subseteq[n]$ we denote $\boldsymbol{x}_{I}=\prod_{i\in I}x_{i}$ and $\overline{\boldsymbol{x}}_{I}=\prod_{i\in I}(1-x_{i})$. Let

[TABLE]

A nonnegative $d$-junta is a function $f:\{0,1\}^{n}\to\mathbb{R}_{\geq 0}$ which depends on at most $d$ input coordinates. It is easy to check that the set $\{h\in\mathbb{R}[\boldsymbol{x}] \mid h\in\mathbb{SA},\ \deg(h)\leq d\}$ is precisely the set of nonnegative $d$-juntas over the Boolean hypercube $\{0,1\}^{n}$. The *degree-$2d$ Sherali Adams* program is the following problem:

[TABLE]

analogously, SA∗ takes the form $\mathbb{SA}_{\operatorname{Prep}(\mathcal{G})}^{2d}$. Note that the superscript in $\operatorname{Hier}_{\mathcal{G}}^{d}({\mathbb{SA}})$ is $d$ (not $2d$) because of the way SA was defined historically, which yields ${f}_{\mathbb{SA},\mathcal{G}}^{2d}\leq{f}_{\mathbb{SOS},\mathcal{G}}^{2d}$. However, this does not affect the polynomial equivalence between the proof systems; see definition 1.1.

The program $\mathbb{SA}^{2d}$ can be solved using a linear program (LP) of size $\binom{n}{\leq d}^{O(1)}$.
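The correspondence between nonnegative juntas and nonnegative combinations of products $\boldsymbol{x}_{I}\overline{\boldsymbol{x}}_{J}$ can be made explicit. The sketch below assumes that $\mathbb{SA}$ consists of such nonnegative combinations and expands a junta depending on the coordinates in $S$ as $\sum_{I\subseteq S}f(\mathbf{1}_{I})\,\boldsymbol{x}_{I}\overline{\boldsymbol{x}}_{S\setminus I}$, so the coefficients are simply the function values. The function name and the example junta $\max(x_{0},x_{2})$ are illustrative.

```python
from itertools import product

def evaluate_expansion(values, S, x):
    """Evaluate sum over bit patterns of values[bits] * x_I * xbar_{S minus I} at x.

    values maps each 0/1 pattern on the coordinates S to the junta's value there;
    a pattern bit of 1 contributes the factor x_k and a bit of 0 contributes (1 - x_k).
    """
    total = 0
    for bits, coefficient in values.items():
        term = coefficient
        for k, bit in zip(S, bits):
            term *= x[k] if bit else (1 - x[k])
        total += term
    return total

# The 2-junta f(x) = max(x_0, x_2) on {0,1}^3, given by its values on coordinates (0, 2).
S = (0, 2)
values = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
points = list(product((0, 1), repeat=3))
```

Since all coefficients are values of a nonnegative function, the expansion is a nonnegative combination of the basis products, each of degree $|S|\leq d$.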

2.3.1. SA - Dual perspective

As in subsection 2.1.1 and subsection 2.2.2, one can use conic duality theory to transform the program ( $\mathbb{\mathbb{SA}}_{\mathcal{G}}^{2d}$ ) into its dual of the form:

[TABLE]

The program ( $\mathbb{\overline{\mathbb{SA}}}_{\mathcal{G}}^{2d}$ ) is a linear system of size $\binom{n}{\leq d}^{O(1)}$. Analogously, the $\overline{\text{SA}}^{\,*}$ program of degree $2d$ takes the form $\overline{\mathbb{SA}}_{\operatorname{Prep}(\mathcal{G})}^{2d}$.

2.4. Sum of Nonnegative Circuit

A method for approximating the cone $\mathcal{K}({\mathcal{G}_{+}})$, which is independent of SOS, is based on sums of nonnegative circuit polynomials (SONC), defined below in subsection 2.4.1. The technique was introduced by Iliman and the second author in [IdW16]. Let $\mathbb{SONC}$ be the set of finite sums of nonnegative circuit polynomials. We consider the following program:

[TABLE]

analogously, $\mathbb{SONC}^{*}$ takes the form $\mathbb{SONC}_{\operatorname{Prep}(\mathcal{G})}^{2d}$. As shown in [DIdW17, Theorem 4.8], for an arbitrary real polynomial that is strictly positive on a compact, basic closed semialgebraic set $\mathcal{G}_{+}$ there exists a $\textsc{SONC}^{*}$ certificate of nonnegativity, i.e., the Schmüdgen-type Positivstellensatz theorem 2.2 applies to SONC. Moreover, searching through the space of degree $d$ certificates can be done via a relative entropy program (REP) [DIdW17] of size $\binom{n+d}{d}^{O(1)}$; see also [CS17, CS16, CMW18]. REPs are convex optimization programs and are efficiently solvable with interior point methods; see e.g., [CS17, NN94] for more details.

2.4.1. Nonnegative Circuit Polynomials

We recall the most relevant statements about SONCs.

Definition 2.7.

A polynomial $f\in\mathbb{R}[\mathbf{x}]$ is called a circuit polynomial if it is of the form

[TABLE]

with $r\leq n$, exponents $\boldsymbol{\alpha}(j)$, $\boldsymbol{\beta}\in A$, and coefficients $f_{\boldsymbol{\alpha}(j)}\in\mathbb{R}_{>0}$, $f_{\boldsymbol{\beta}}\in\mathbb{R}$, such that $\operatorname{New}(f)$ is a simplex with even vertices $\boldsymbol{\alpha}(0),\boldsymbol{\alpha}(1),\ldots,\boldsymbol{\alpha}(r)$ and the exponent $\boldsymbol{\beta}$ is in the strict interior of $\operatorname{New}(f)$.

For every circuit polynomial we define the corresponding circuit number as

[TABLE]

∎

One determines nonnegativity of a circuit polynomial via its circuit number $\Theta_{f}$ as follows:

Theorem 2.8 ([IdW16], Theorem 3.8).

Let $f$ be a circuit polynomial. Then $f$ is nonnegative if and only if either $|f_{\boldsymbol{\beta}}|\leq\Theta_{f}$ and $\boldsymbol{\beta}\not\in(2\mathbb{N})^{n}$, or $f_{\boldsymbol{\beta}}\geq-\Theta_{f}$ and $\boldsymbol{\beta}\in(2\mathbb{N})^{n}$.
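The nonnegativity test of theorem 2.8 is easy to carry out numerically. The sketch below assumes the circuit number is $\Theta_{f}=\prod_{j}(f_{\boldsymbol{\alpha}(j)}/\lambda_{j})^{\lambda_{j}}$, where the $\lambda_{j}$ are the barycentric coordinates of $\boldsymbol{\beta}$ with respect to the vertices $\boldsymbol{\alpha}(j)$, and applies it to the classical Motzkin polynomial $1+x_{1}^{4}x_{2}^{2}+x_{1}^{2}x_{2}^{4}-3x_{1}^{2}x_{2}^{2}$. Function names and the floating-point tolerance are illustrative choices.

```python
from fractions import Fraction

def barycentric(vertices, beta):
    """Solve sum_j lam_j * alpha(j) = beta together with sum_j lam_j = 1, exactly."""
    r = len(vertices)
    # One row per coordinate plus the normalisation row; the last column is the right-hand side.
    rows = [[Fraction(v[i]) for v in vertices] + [Fraction(beta[i])] for i in range(len(beta))]
    rows.append([Fraction(1)] * r + [Fraction(1)])
    pivot_row = 0
    for c in range(r):
        p = next((k for k in range(pivot_row, len(rows)) if rows[k][c] != 0), None)
        if p is None:
            raise ValueError("vertices are affinely dependent")
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        lead = rows[pivot_row][c]
        rows[pivot_row] = [v / lead for v in rows[pivot_row]]
        for k in range(len(rows)):
            if k != pivot_row and rows[k][c] != 0:
                factor = rows[k][c]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[pivot_row])]
        pivot_row += 1
    return [rows[j][-1] for j in range(r)]

def circuit_number(coeffs, lams):
    """Theta_f = prod_j (f_alpha(j) / lam_j) ** lam_j, computed in floating point."""
    theta = 1.0
    for c, lam in zip(coeffs, lams):
        theta *= (float(c) / float(lam)) ** float(lam)
    return theta

def is_nonnegative_circuit(vertices, coeffs, beta, f_beta, tol=1e-9):
    """Apply the criterion of theorem 2.8; beta is assumed to lie in the strict interior."""
    theta = circuit_number(coeffs, barycentric(vertices, beta))
    if all(b % 2 == 0 for b in beta):
        return f_beta >= -theta - tol
    return abs(f_beta) <= theta + tol

motzkin_vertices = [(0, 0), (4, 2), (2, 4)]
motzkin_nonnegative = is_nonnegative_circuit(motzkin_vertices, [1, 1, 1], (2, 2), -3)
```

For the Motzkin polynomial the barycentric coordinates are all $1/3$ and the circuit number is $3$, so the inner coefficient $-3$ sits exactly on the boundary of the nonnegativity region.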

Let

[TABLE]

Following Reznick, we define maximal mediated sets; note that these objects are well-defined due to [Rez89, Theorem 2.2].

Definition 2.9.

Let $A\subseteq\mathbb{Z}^{n}$ be such that $\operatorname{Vert}\left(\operatorname{conv}(A)\right)\subseteq(2\mathbb{Z})^{n}$. We call a set $M\subseteq\operatorname{conv}(A)\cap\mathbb{Z}^{n}$ ($A$-)mediated if every element of $M$ is the midpoint of two distinct points in $\operatorname{conv}(A)\cap(2\mathbb{Z})^{n}$.

We define the maximal mediated set $\operatorname{conv}(A)^{*}$ as the unique $A$-mediated set which contains every other $A$-mediated set.

Let $\operatorname{conv}(A)$ be a simplex. If $\operatorname{conv}(A)^{*}=\operatorname{conv}(A)\cap\mathbb{Z}^{n}$, then we call $\operatorname{conv}(A)$ an $H$-simplex. If $\operatorname{conv}(A)^{*}$ consists only of $\operatorname{Vert}\left(\operatorname{conv}(A)\right)$ and the midpoints of the vertices, then we call $\operatorname{conv}(A)$ an $M$-simplex.

∎

Generalizing a result by Reznick in [Rez89], Iliman and the second author proved that maximal mediated sets are exactly the correct object for determining whether a nonnegative circuit polynomial is a sum of squares.

Theorem 2.10 ([IdW16], Theorem 5.2).

Let $f$ be a nonnegative circuit polynomial with inner term $f_{\boldsymbol{\beta}}\boldsymbol{x}^{\boldsymbol{\beta}}$ . Then $f$ is a sum of squares if and only if $f$ is a sum of monomial squares or if $\boldsymbol{\beta}\in\operatorname{New}(f)^{*}$ .

In particular, $f$ is always an SOS if $\operatorname{New}(f)$ is an $H$-simplex, and $f$ is never an SOS if $\operatorname{New}(f)$ is an $M$-simplex.

For further details about SONCs see e.g., [dW15, DIdW17, DKdW18, IdW16, SdW18]. A description of the dual of the SONC cone was recently provided in [DNT18], which we, however, do not need for the purpose of this article.

2.5. Comparing proof systems

In this section we introduce the notation used for comparing the proof systems presented in Section 2, from the proof complexity perspective. For a gentle introduction to proof complexity we refer the reader to e.g., [Raz16].

Following the notation in definition 2.1: Let $\mathbb{GEN}$ be a set of polynomials which we axiomatically assume to be nonnegative and $\mathcal{G}$ be a set of polynomials, which form the semialgebraic set $\mathcal{G}_{+}$. The GEN proof system is the set of all algebraic derivations $f$ such that $f\in\bigcup_{d\in\mathbb{N}}\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{GEN}})$ deducing nonnegativity of polynomials $f$ over $\mathcal{G}_{+}$. Analogously, the GEN${}^{*}$ proof system is the set of all algebraic derivations $f$ such that $f\in\bigcup_{d\in\mathbb{N}}\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{GEN}})$ deducing nonnegativity of polynomials $f$ over $\mathcal{G}_{+}$. The proof systems SOS, SOS${}^{*}$, SDSOS, SDSOS${}^{*}$, SA, SA${}^{*}$, SONC and SONC${}^{*}$ are defined analogously.

The complexity of the certificate depends on the degree $d$ needed to certify the nonnegativity. Revising definition 1.1, we say that a proof system $P$ *contains* a proof system $Q$ if for every set of polynomials $\mathcal{G}$ and every polynomial $f$ admitting a degree $d$ certificate of nonnegativity over $\mathcal{G}_{+}$ in $Q$, $f$ also admits a degree $O(d)$ certificate in $P$. A system $P$ *strictly contains* $Q$ if $P$ contains $Q$ but $Q$ does not contain $P$, i.e., there exists at least one set $\mathcal{G}$ and a polynomial $f$ nonnegative over $\mathcal{G}_{+}$ such that $f$ admits a degree $d$ certificate in $P$ but for every $c\in\mathbb{N}$, $f$ does not admit a degree $cd$ certificate in $Q$. Systems $P$ and $Q$ are *polynomially equivalent* if $P$ contains $Q$ and $Q$ contains $P$. Finally, systems $P$ and $Q$ are *polynomially incomparable* if neither $P$ contains $Q$ nor $Q$ contains $P$, i.e., there exist sets $\mathcal{G}$, $\mathcal{G}^{\prime}$ and polynomials $f$, $f^{\prime}$ nonnegative over $\mathcal{G}_{+}$, $\mathcal{G}_{+}^{\prime}$, respectively, such that $f$ admits a degree $d$ certificate in $P$ but for every $c\in\mathbb{N}$, $f$ does not admit a degree $cd$ certificate in $Q$, and $f^{\prime}$ admits a degree $d^{\prime}$ certificate in $Q$ but for every $c\in\mathbb{N}$, $f^{\prime}$ does not admit a degree $cd^{\prime}$ certificate in $P$.

3. SOS vs. SONC

It is well-known that the $\mathbb{SOS}$ cone and the $\mathbb{SONC}$ cone are not contained in each other [IdW16, Proposition 7.2]. This statement, however, does not determine whether these systems are polynomially equivalent for CPOPs. In this section we show that for every $n$ there exist CPOPs such that the difference between the minimal degrees of an SOS and a SONC certificate is arbitrarily large, and vice versa.

3.1. SONC does not contain SOS

We consider the following family of polynomials:

Definition 3.1.

We define the family of signed quadrics $(N_{n})_{n\in\mathbb{N}^{*}}$ by $N_{n}:=\left(1-\sum_{j=1}^{n}x_{j}\right)^{2}$. ∎

It is obvious that every $N_{n}$ is SOS and that its zero set is the unit ball of the 1-norm, i.e., for all $n\in\mathbb{N}^{*}$ we have

[TABLE]

The support of $N_{2}$ and $N_{3}$ is depicted together with their Newton polytopes in figure 2.

It is known that for every $n$ the function $N_{n}$ cannot be written as a combination of $(n-1)$-juntas [Lee15, Theorem 1.12]. It is, however, also straightforward to conclude that for every $n\geq 2$ the polynomial $N_{n}$ is not a SONC polynomial.

Lemma 3.2.

For all $n\geq 2$ it holds that $N_{n}\notin\mathbb{SONC}$ .

Proof.

By equation 3.1 the real zero set $\mathcal{V}(N_{n})$ is equal to the boundary of the $n$-dimensional cross-polytope; see e.g., [Zie07]. In particular, it is an $(n-1)$-dimensional piecewise-linear set. A SONC, however, has at most $2^{n}$ many distinct real zeros by [IdW16, Corollary 3.9]. ∎

In [SdW18, Example 3.7] it is shown that $N_{2}$ is not a SONC via a term-by-term inspection. We point out that one could build on that argument and reprove lemma 3.2 inductively, using the fact that the support set of $N_{2}$ equals the restriction of the support set of $N_{n}$ to a specific $2$-face of $\operatorname{New}(N_{n})$.

Corollary 3.3.

For every $n\in\mathbb{N}$ with $n\geq 2$ and every $t\geq 1$ there exist infinitely many systems $\mathcal{G}$ such that $\operatorname{Hier}_{\mathcal{G}}^{2}({\mathbb{SOS}})\not\subset\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2t}({\mathbb{SONC}})$ .

Proof.

Let $n\in\mathbb{N}_{\geq 2}$ be fixed. Consider a system

[TABLE]

where $N_{n}$ is the signed quadric and $\mathcal{G}=\{g_{1},\ldots,g_{s}\}$ is a system of polynomials such that $\min_{i\in[s]}\{\deg(g_{i})\}\geq 2t+1$ , $\mathcal{V}(N_{n})\subseteq\mathcal{G}_{+}$ , and $\mathcal{G}_{+}$ compact.

On the one hand, there exists an SOS certificate of degree $2$ for the system equation 3.2 given by $N_{n}$ alone, as $N_{n}$ is already an SOS and moreover $\min_{\mathbf{x}\in\mathbb{R}^{n}}N_{n}=\min_{\mathbf{x}\in\mathcal{G}_{+}}N_{n}$ .

On the other hand, due to the Positivstellensatz result for SONCs [DIdW17, Theorem 4.8], there exists a SONC certificate of the form $N_{n}\in\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{SONC}})$ , for some value of $d$ . But since $N_{n}$ is not a SONC due to lemma 3.2 the certificate necessarily has to incorporate at least one of the constraints defining the set $\mathcal{G}$ . Hence, $d>t$ . ∎

3.2. SOS does not contain SONC

In this section we show the converse of the result from subsection 3.1, namely that SONC is not contained in SOS.

Definition 3.4 (Generalized Motzkin Polynomial).

Let $\mathbf{e}:=\sum_{j=1}^{n}\mathbf{e}_{j}$. For every $n\in\mathbb{N}$ with $n\geq 2$ we define the Generalized Motzkin Polynomial as

[TABLE]

∎

Note that $M_{2}$ is the usual *Motzkin polynomial* $1+x_{1}^{2}x_{2}^{4}+x_{1}^{4}x_{2}^{2}-3x_{1}^{2}x_{2}^{2}$. The supports of $M_{2}$ and $M_{3}$ are depicted together with their Newton polytopes in figure 3.

Proposition 3.5.

For every $n\in\mathbb{N}_{\geq 2}$ we have $M_{n}\in\mathbb{SONC}$ , but $M_{n}\notin\mathbb{SOS}$ , and moreover $\mathcal{V}(M_{n})=\{-1,1\}^{n}$ .

Note that $M_{n}$ not being $\mathbb{SOS}$ for $n\geq 2$ (not only for $n=2$) was in fact already shown by Motzkin; see [Mot67] and also [Rez89, Section 6]. Furthermore, $\operatorname{New}(M_{n})$ being an $M$-simplex, which implies that $M_{n}$ is not $\mathbb{SOS}$, was shown by Reznick in [Rez89, Theorem 6.9]. We provide our own proof of all these facts here for the convenience of the reader.

Proof.

Let $n\geq 2$ . First, we show that $M_{n}$ is a nonnegative circuit polynomial which vanishes exactly on the Boolean hypercube $\{-1,1\}^{n}$ .

According to definition 2.7, $M_{n}$ is a circuit polynomial with inner term $-(n+1)\mathbf{x}^{2\mathbf{e}}$. A straightforward computation yields the barycentric coordinates $\lambda_{j}=1/(n+1)$ for every $j=0,\ldots,n$, and hence the circuit number $\Theta_{M_{n}}=n+1$. Thus, $M_{n}$ is nonnegative by theorem 2.8. As mentioned before, every nonnegative circuit polynomial has at most one zero in every orthant [IdW16, Corollary 3.9]. Moreover, an evaluation shows that $M_{n}(\mathbf{x})=0$ for every $\mathbf{x}\in\{-1,1\}^{n}$. Thus, $\mathcal{V}(M_{n})=\{-1,1\}^{n}$.

Second, we show that $M_{n}$ is not a sum of squares. Since $M_{n}$ is a nonnegative circuit polynomial, which is not a sum of monomial squares, this is equivalent to the fact that the lattice point $2\mathbf{e}$ does not belong to the maximal mediated set $\operatorname{New}(M_{n})^{*}$ by theorem 2.10. Here, we show more generally that $\operatorname{New}(M_{n})$ is even an $M$ -simplex, which implies $2\mathbf{e}\notin\operatorname{New}(M_{n})^{*}$ . We observe:

[TABLE]

This follows from

[TABLE]

and the fact that every lattice point $\mathbf{w}\in(2\mathbb{Z})^{n}$ with $||\mathbf{w}-2\mathbf{e}||_{1}=2$ satisfies either $\mathbf{w}\in\operatorname{Vert}\left(\operatorname{New}(M_{n})\right)$, or $\mathbf{w}\notin\operatorname{New}(M_{n})$ due to $w_{i}=0$, $w_{j}\neq 0$ for some $i,j\in[n]$ with $i\neq j$.

It follows that $2\mathbf{e}\notin\operatorname{New}(M_{n})^{*}$: by definition, every point in $\operatorname{New}(M_{n})^{*}$ is the midpoint of two distinct points in $\operatorname{New}(M_{n})^{*}\cap(2\mathbb{Z})^{n}$, which is impossible for $2\mathbf{e}$ due to (3.3), (3.4), and the fact that the convex combination in (3.4) is unique since $\operatorname{New}(M_{n})$ is a simplex. Thus, $\operatorname{New}(M_{n})^{*}\cap(2\mathbb{Z})^{n}=\operatorname{Vert}\left(\operatorname{New}(M_{n})\right)$ and $\operatorname{New}(M_{n})$ is an $M$-simplex by [Rez89, Theorem 2.5]. ∎
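The claims about the zero set of $M_{n}$ can be sanity-checked by direct evaluation. The sketch below assumes the explicit form $M_{n}=1+\sum_{j=1}^{n}\mathbf{x}^{2\mathbf{e}+2\mathbf{e}_{j}}-(n+1)\mathbf{x}^{2\mathbf{e}}$, which agrees with the Motzkin polynomial $M_{2}$ and with the inner term $-(n+1)\mathbf{x}^{2\mathbf{e}}$ used in the proof. It evaluates $M_{n}$ on a small integer grid; this is a finite check, not a proof of nonnegativity.

```python
from itertools import product
from math import prod

def motzkin(n, x):
    """Evaluate the assumed form 1 + sum_j x^(2e + 2e_j) - (n + 1) * x^(2e)."""
    square = prod(v * v for v in x)                      # the monomial x^(2e)
    return 1 + sum(square * v * v for v in x) - (n + 1) * square

def zeros_on_grid(n, radius=2):
    """Integer points in [-radius, radius]^n at which M_n vanishes."""
    grid = product(range(-radius, radius + 1), repeat=n)
    return {x for x in grid if motzkin(n, x) == 0}

def min_on_grid(n, radius=2):
    """Smallest value of M_n over the integer grid."""
    return min(motzkin(n, x) for x in product(range(-radius, radius + 1), repeat=n))
```

On the grid, the only zeros found are the $2^{n}$ sign vectors, one per orthant, which matches the bound from [IdW16, Corollary 3.9].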

Corollary 3.6.

For every $n\in\mathbb{N}$ with $n\geq 2$ and every $t\geq n+1$ there exist infinitely many systems $\mathcal{G}$ such that $\operatorname{Hier}_{\mathcal{G}}^{n+1}({\mathbb{SONC}})\not\subset\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{t}({\mathbb{SOS}})$ .

Proof.

Let $n\in\mathbb{N}_{\geq 2}$ be fixed. Consider a system

[TABLE]

where $M_{n}$ is the generalized Motzkin polynomial and $\mathcal{G}=\{g_{1},\ldots,g_{s}\}$ is a system of polynomials such that $\min_{i\in[s]}\{\deg(g_{i})\}\geq t+1$ , $\{\pm 1\}^{n}\subseteq\mathcal{G}_{+}$ and $\mathcal{G}_{+}$ compact.

On the one hand, there exists a SONC certificate of degree $n+1$ for the system equation 3.5 given by $M_{n}$ alone as $M_{n}$ is already a SONC by Proposition 3.5 and moreover $\min_{\mathbf{x}\in\mathbb{R}^{n}}M_{n}=\min_{\mathbf{x}\in\mathcal{G}_{+}}M_{n}$ .

On the other hand, due to theorem 2.3, there exists an SOS certificate of the form $M_{n}\in\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{d}({\mathbb{SOS}})$ for some $d\in\mathbb{N}_{+}$ . But since $M_{n}$ is not a SOS due to Proposition 3.5, the certificate has to necessarily involve at least one constraint from $\mathcal{G}$ . Thus $d>t$ . ∎

Corollary 3.7.

The pairs of systems SONC and SOS; SONC and SOS${}^{*}$; SONC${}^{*}$ and SOS; SONC${}^{*}$ and SOS${}^{*}$ are polynomially incomparable.

Proof.

Follows immediately from corollary 3.3 and corollary 3.6. ∎

4. SDSOS vs SONC

In this section, we show that for constrained polynomial optimization problems (CPOPs), SONC strictly contains SDSOS. The same relation holds for the SONC∗ and SDSOS∗ algorithms.

4.1. SDSOS is SONC

We start with proving that every SDSOS polynomial of degree $d$ is also a SONC polynomial of degree $d$ .

Lemma 4.1.

Every scaled diagonally dominant polynomial is a SONC polynomial.

Proof.

By [AM14] we know that every scaled diagonally dominant polynomial $s$ can be written as a sum of binomial squares, i.e., in the form $s(\boldsymbol{x})=\sum_{k}s_{k}(\boldsymbol{x})$, for $s_{k}(\boldsymbol{x})=(a_{k}p_{k}(\boldsymbol{x})+b_{k}q_{k}(\boldsymbol{x}))^{2}$, where $p_{k},q_{k}$ are monomials and $a_{k},\ b_{k}\in\mathbb{R}$. Thus, $s(\boldsymbol{x})=\sum_{k}\left((a_{k}p_{k}(\boldsymbol{x}))^{2}+(b_{k}q_{k}(\boldsymbol{x}))^{2}+2a_{k}b_{k}p_{k}(\boldsymbol{x})q_{k}(\boldsymbol{x})\right)$, where $a_{k}^{2},\ b_{k}^{2}\in\mathbb{R}_{\geq 0}$ for every index $k$ in the summation. Moreover, in the notation of definition 2.7, for every $k$, $\operatorname{New}(s_{k})$ is a one-dimensional simplex with two even vertices $\boldsymbol{\alpha}(0),\boldsymbol{\alpha}(1)$, given by the exponents of the squared monomials, and the exponent $\boldsymbol{\beta}$ of the term $2a_{k}b_{k}p_{k}(\boldsymbol{x})q_{k}(\boldsymbol{x})$ is in the strict interior of $\operatorname{New}(s_{k})$ since $\boldsymbol{\beta}=1/2\ \boldsymbol{\alpha}(0)+1/2\ \boldsymbol{\alpha}(1)$. Finally, with $\lambda_{0}=\lambda_{1}=1/2$, the circuit number $\Theta_{s_{k}}$ is equal to $\left(\frac{f_{\boldsymbol{\alpha}(0)}}{\lambda_{0}}\right)^{\lambda_{0}}\cdot\left(\frac{f_{\boldsymbol{\alpha}(1)}}{\lambda_{1}}\right)^{\lambda_{1}}=2|a_{k}b_{k}|$, which is the absolute value of the inner coefficient $2a_{k}b_{k}$; thus, by theorem 2.8, $s_{k}$ is a nonnegative circuit polynomial. ∎
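The computation in the proof can be replayed on a concrete binomial square. The sketch below expands $(a\,\boldsymbol{x}^{p}+b\,\boldsymbol{x}^{q})^{2}$ for distinct exponent vectors $p\neq q$ and nonzero $a,b$, and evaluates the circuit number with $\lambda_{0}=\lambda_{1}=1/2$. The function name and the returned dictionary layout are hypothetical.

```python
from math import isclose

def binomial_square_as_circuit(a, p, b, q):
    """Describe (a*x^p + b*x^q)^2 as a circuit polynomial; assumes p != q and a, b nonzero."""
    alpha0 = tuple(2 * e for e in p)                 # exponent of (a x^p)^2
    alpha1 = tuple(2 * e for e in q)                 # exponent of (b x^q)^2
    beta = tuple(e + f for e, f in zip(p, q))        # midpoint of alpha0 and alpha1
    lam = 0.5
    theta = (a * a / lam) ** lam * (b * b / lam) ** lam
    return {
        "outer": {alpha0: a * a, alpha1: b * b},
        "inner": (beta, 2 * a * b),
        "theta": theta,
    }

# Example: (3*x1 - 2*x2^2)^2 = 9*x1^2 + 4*x2^4 - 12*x1*x2^2.
example = binomial_square_as_circuit(3, (1, 0), -2, (0, 2))
```

In the example the inner coefficient is $-12$ and the circuit number is $12$, so the criterion $|f_{\boldsymbol{\beta}}|\leq\Theta$ of theorem 2.8 holds with equality, as it does for every binomial square.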

We note that a similar statement to lemma 4.1 was very recently, independently observed in [CMW18].

Corollary 4.2.

For every $d\in\mathbb{N}^{*}$ we have ${f}_{\mathbb{SDSOS},\mathcal{G}}^{2d}\leq{f}_{\mathbb{SONC},\mathcal{G}}^{2d}$ and ${f}_{\mathbb{SDSOS},\operatorname{Prep}(\mathcal{G})}^{2d}\leq{f}_{\mathbb{SONC},\operatorname{Prep}(\mathcal{G})}^{2d}$ .

Proof.

Assume that for some polynomial $f\in\mathbb{R}[\boldsymbol{x}]$ and $\lambda\in\mathbb{R}$ there exists a $d\in\mathbb{N}^{*}$ such that $f-\lambda\in\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})$ . We show that necessarily it has to hold $f-\lambda\in\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SONC}})$ . Since $f-\lambda\in\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})$ there necessarily exist SDSOS polynomials $s_{i}$ such that $f-\lambda=\sum_{i=0}^{m}s_{i}g_{i}$ for $g_{i}\in\mathcal{G}$ and $\deg(s_{i}g_{i})\leq 2d$ for every $i\in[m]$ . Moreover, by lemma 4.1 every SDSOS polynomial is a SONC polynomial, thus we have $f-\lambda=\sum_{i=0}^{m}s_{i}g_{i}\in\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SONC}})$ . The proof works analogously for the second inequality. ∎

It remains to show that for every $n$ there exist CPOPs such that the ratio of the minimal degrees of an SDSOS and a SONC certificate is not bounded by a constant.

Corollary 4.3.

The SONC proof system strictly contains the SDSOS proof system, and the same relation holds for SONC${}^{*}$ and SDSOS${}^{*}$.

Proof.

The corollary follows from corollary 4.2 and the fact that $\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})\subseteq\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SOS}})$ ( $\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{SDSOS}})\subseteq\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{SOS}})$ ) together with Proposition 3.5. ∎

As a consequence we show that there exists no equivalent of Putinar’s Positivstellensatz for the SDSOS hierarchy.

Corollary 4.4.

There exist $f\in\mathbb{R}[\boldsymbol{x}]$ and infinitely many systems of polynomials $\mathcal{G}=\{g_{1},\ldots,g_{s}\}\subset\mathbb{R}[\boldsymbol{x}]$ such that $\mathcal{G}_{+}$ is Archimedean, $f(\boldsymbol{x})>0$ for all $\boldsymbol{x}\in\mathcal{G}_{+}$ but for all $d\in\mathbb{N}$ , $f\notin\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})$ .

Proof.

By constructing an explicit example, it was shown in [DKdW18, Theorem 5.1] that the equivalent statement holds for the SONC case. Thus, the statement follows immediately from corollary 4.2. ∎

4.2. Closures under Changes of Bases and Relations to SOCP

In the rest of this section we provide two observations regarding the behaviour of $\mathbb{SONC}$ and $\mathbb{SDSOS}$ under a change of bases and their relation to second order cone programming.

In [DIdW17, Lemma 4.1] the authors showed that the SONC cone is not closed under multiplication, i.e., if $s_{1},s_{2}\in\mathbb{SONC}^{2d}$ , then this does not imply $s_{1}\cdot s_{2}\in\mathbb{SONC}^{4d}$ in general. Moreover, $\mathbb{SONC}^{2d}$ is not closed under affine transformations or, more generally, a change of bases; see [DKdW18, Corollary 3.2]. These results are in sharp contrast to the SOS cone, which is closed both under multiplication and under a change of bases. Similarly to SONC, it is well-known that $\mathbb{SDSOS}$ is closed neither under multiplication nor under a change of bases; see, e.g., [AH17].

More precisely, we define

[TABLE]

analogously for $\operatorname{cls}(\mathbb{SONC})$ .

Currently, it is an open problem in the community to decide whether the closure of $\mathbb{SDSOS}^{2d}$ with respect to a change of bases equals $\mathbb{SOS}^{2d}$ .

Problem 4.5.

Is $\operatorname{cls}(\mathbb{SDSOS})=\mathbb{SOS}$ ?

From the results of this section we obtain the following corollary.

Corollary 4.6.

If $\operatorname{cls}(\mathbb{SDSOS})=\mathbb{SOS}$ , then $\mathbb{SOS}\subsetneq\operatorname{cls}(\mathbb{SONC})$ .

Proof.

Follows directly from lemma 4.1, which does not depend on the chosen basis, and from $\mathbb{SOS}\neq\mathbb{SONC}$ ; see [IdW16, Proposition 7.2]. ∎

Moreover, we obtain the following consequence about the relation of $\mathbb{SONC}$ and second order cone programming ( $\mathbb{SOCP}$ ). Since we use SOCPs only in the following corollary, we omit a full definition of SOCP and refer the reader to the standard literature like [BV04]. The bounds ${f}_{\mathbb{SOCP},\mathcal{G}}^{2d}$ and ${f}_{\mathbb{SOCP},\operatorname{Prep}(\mathcal{G})}^{2d}$ are defined analogously to the other hierarchies.

Corollary 4.7.

For every $d\in\mathbb{N}^{*}$ we have ${f}_{\mathbb{SOCP},\mathcal{G}}^{2d}\leq{f}_{\mathbb{SONC},\mathcal{G}}^{2d}$ and ${f}_{\mathbb{SOCP},\operatorname{Prep}(\mathcal{G})}^{2d}\leq{f}_{\mathbb{SONC},\operatorname{Prep}(\mathcal{G})}^{2d}$ .

Proof.

Ahmadi and Majumdar showed that every $\mathbb{SDSOS}$ certificate is $\mathbb{SOCP}$ [AM17, Theorem 10], and, very recently, Ding and Lim showed that every $\mathbb{SOCP}$ certificate is $\mathbb{SDSOS}$ [DL18, Theorem 3.3]. Thus, the statement follows immediately from corollary 4.2. ∎

We remark that, however, Averkov [Ave18, Theorem 17] recently showed that the semidefinite extension degree of SONC equals two, and thus SONC is an SOCP-lift; see also [GPT15] for further details on lifts.

5. Hierarchies on the Boolean Hypercube

In this section we prove the dependencies between various hierarchies on the Boolean hypercube $\{0,1\}^{n}$ .

Let, for this section, $\mathcal{G}$ be a collection of polynomials such that for all $i\in[n]$ we have $\pm(x_{i}^{2}-x_{i})\in\mathcal{G}$ and $l_{i}:=N\pm x_{i}\in\mathcal{G}$ for $N\in\mathbb{R}_{>1}$ , such that $\mathcal{G}_{+}\subseteq\{0,1\}^{n}$ . Let $\mathbb{GEN}\subset\mathcal{K}({\mathcal{G}_{+}})$ be an arbitrary class of polynomials, which are nonnegative on the Boolean hypercube. We consider the corresponding optimization problem (2.1). We start with proving a general statement saying that every proof certificate that can certify nonnegativity of an $n$ -variate polynomial over the unconstrained Boolean hypercube with an $O(n)$ degree certificate is at least as strong as the $\textsc{SA}^{*}$ hierarchy.

Theorem 5.1.

Let $f\in\mathbb{R}[\boldsymbol{x}]$ . Assume that there exists a $c\in\mathbb{N}^{*}$ such that for every $n\in\mathbb{N}$ , and $\mathcal{G}:=\{\pm(x_{i}^{2}-x_{i}):\leavevmode\nobreak\ i\in[n]\}$ such that

[TABLE]

Then for every finite set of polynomial constraints $\mathcal{G}^{\prime}$ with $\mathcal{G}^{\prime}_{+}\subseteq\{0,1\}^{n}$ and for every $d\in\mathbb{N}$ with $d\leq n$ it holds that ${f}_{\mathbb{SA},\operatorname{Prep}(\mathcal{G}^{\prime})}^{2d}\leq\ {f}_{\mathbb{GEN},\operatorname{Prep}(\mathcal{G}^{\prime})}^{2cd}$ .

Proof.

Consider a polynomial $f\in\mathbb{R}[\boldsymbol{x}]$ , a set $\mathcal{G}^{\prime}_{+}\subseteq\{0,1\}^{n}$ and a real number $\lambda$ such that $f-\lambda\in\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G}^{\prime})}^{2d}({\mathbb{SA}})$ . We show that $f-\lambda\in\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G}^{\prime})}^{2cd}({\mathbb{GEN}})$ . Let $m_{2d}^{\prime}$ be the cardinality of the set $\operatorname{Prep}(\mathcal{G}^{\prime})$ restricted to polynomials of degree at most $2d$ . By definition, $f-\lambda\in\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G}^{\prime})}^{2d}({\mathbb{SA}})$ implies that there exists a certificate $f-\lambda=\sum_{i=0}^{m_{2d}^{\prime}}p_{i}G_{i}$ with $G_{i}\in\operatorname{Prep}(\mathcal{G}^{\prime})$ such that every polynomial $p_{i}$ is of the form $p_{i}=\sum_{\ell=0}^{r_{i}}q_{i\ell}$ , where every $q_{i\ell}$ is a nonnegative $k_{i}$ -junta with $k_{i}:=\lfloor\frac{2d-\deg(G_{i})}{2}\rfloor$ .

For every $i\in[m_{2d}^{\prime}]$ we consider a set of polynomials $\mathcal{G}^{\prime\prime}_{i}\subseteq\mathcal{G}^{\prime}$ such that $\{\pm(x_{j}^{2}-x_{j}),l_{j}\ |\ j\in[k_{i}]\}\subseteq\mathcal{G}^{\prime\prime}_{i}$ and $\mathcal{G}^{\prime\prime}_{i+}=\{0,1\}^{k_{i}}$ . Let $m_{i,2d}^{\prime\prime}$ be the cardinality of the set $\operatorname{Prep}(\mathcal{G}^{\prime\prime}_{i})$ restricted to polynomials of degree at most $2d$ . By the assumption (5.1) there exists a $c\in\mathbb{N}^{*}$ such that every $k$ -variate polynomial, which is nonnegative over the Boolean hypercube $\{0,1\}^{k}$ , has a degree $2ck$ certificate using polynomials in $\mathbb{GEN}$ . Thus, we have in particular $q_{i\ell}\in\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G}^{\prime\prime}_{i})}^{2ck_{i}}({\mathbb{GEN}})$ for every $i\in[m_{2d}^{\prime}]$ and $\ell\in[r_{i}]$ . Hence, we can write $q_{i\ell}=\sum_{j=0}^{m_{2d}^{\prime\prime}}s_{i\ell j}G_{i\ell j}$ such that $s_{i\ell j}\in\mathbb{GEN}$ , $G_{i\ell j}\in\operatorname{Prep}(\mathcal{G}^{\prime\prime}_{i})$ , and for every $j\in[m_{2d}^{\prime\prime}]$ we have $\deg(s_{i\ell j}G_{i\ell j})\leq 2ck_{i}$ . In summary, we obtain:

[TABLE]

where for every $i\in[m_{2d}^{{}^{\prime}}]$ , $j\in[m_{2d}^{{}^{\prime\prime}}]$ , and $\ell\in[r_{i}]$ the degree $\deg(s_{i\ell j}G_{i\ell j}G_{i})$ is at most $2ck_{i}+\deg(G_{i})=2c\lfloor\frac{2d-\deg(G_{i})}{2}\rfloor+\deg(G_{i})\leq 2cd$ , and hence

[TABLE]

By the containment $\mathcal{G}^{{}^{\prime\prime}}_{i}\subseteq\mathcal{G}^{{}^{\prime}}$ , for every $i\in[m_{2d}^{{}^{\prime\prime}}]$ , we get that

[TABLE]

and the statement follows. ∎
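The degree estimate used in the proof above can be spelled out as follows. This is only a sketch of the arithmetic; it uses nothing beyond $c\geq 1$ , $\deg(G_{i})\geq 0$ , and $\lfloor t\rfloor\leq t$ .

```latex
\begin{align*}
2c\left\lfloor\frac{2d-\deg(G_{i})}{2}\right\rfloor+\deg(G_{i})
  &\leq c\,\bigl(2d-\deg(G_{i})\bigr)+\deg(G_{i})\\
  &= 2cd-(c-1)\deg(G_{i})\\
  &\leq 2cd .
\end{align*}
```

In particular, the bound is attained with equality only if $c=1$ or $\deg(G_{i})=0$ , and $2d-\deg(G_{i})$ is even.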

5.1. Properties of the $\mathbb{GEN}$ proof system

In theorem 5.1 we gave a sufficient condition for a proof system to be at least as strong as $\textsc{SA}^{*}$ . As a consequence, every proof system satisfying that condition inherits the properties of the SA proof system. In particular, this applies to the conditioning property, which has been widely used to construct algorithmic results for various BCPOPs; see, e.g., [LR16]. In what follows we provide a formal description of the property.

Lemma 5.2 (Conditioning).

For every $d\in\mathbb{N}^{*}$ , let $\mathbb{L}$ be a linear operator feasible for $\overline{\mathbb{SA}}_{\mathcal{G}}^{2d}$ . Let $i\in[n]$ be an index such that $0<\mathbb{L}[x_{i}]<1$ . We define $\mathbb{L}_{i,(0)},\mathbb{L}_{i,(1)}:\mathbb{R}[\boldsymbol{x}]\to\mathbb{R}$ such that:

[TABLE]

Then it holds that $\mathbb{L}[\cdot]=\mathbb{L}[x_{i}]\mathbb{L}_{i,(1)}[\cdot]+(1-\mathbb{L}[x_{i}])\mathbb{L}_{i,(0)}[\cdot]$ . Moreover, both operators $\mathbb{L}_{i,(0)},\mathbb{L}_{i,(1)}$ are feasible for $\overline{\mathbb{SA}}_{\mathcal{G}}^{2d-2}$ .

The proof can be found in, e.g., [Rot13, Lemma 2]. Note that for every $i\in[n]$ and $\mathbb{L}_{i,(0)},\mathbb{L}_{i,(1)}$ satisfying the requirements in lemma 5.2 we have $\mathbb{L}_{i,(0)}[x_{i}]=0$ and $\mathbb{L}_{i,(1)}[x_{i}]=1$ . Applied iteratively, lemma 5.2 implies for every set $S\subseteq[n]$ with $|S|\leq d$ that every linear operator $\mathbb{L}[\cdot]$ feasible for $\overline{\mathbb{SA}}_{\mathcal{G}}^{2d+2\lceil d_{\mathcal{G}}/2\rceil}$ can be written as a convex combination of linear operators, feasible for $\overline{\mathbb{SA}}_{\mathcal{G}}^{2\lceil d_{\mathcal{G}}/2\rceil}$ , that map the variables with indices in $S$ to 0 or 1. In other words, $\mathbb{L}=\sum_{i=1}^{2^{d}}\mathbb{L}_{i}$ such that for every $i$ and $j\in S$ we have $\mathbb{L}_{i}[x_{j}]\in\{0,1\}$ .
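The decomposition in lemma 5.2 can be illustrated on a toy instance. The following is a minimal sketch under two loudly stated assumptions: the operator $\mathbb{L}$ is taken to be the expectation under an actual probability distribution on 0/1 points (a special case of a feasible operator, not a general one), and the conditioned operators are assumed to be given by the standard formulas $\mathbb{L}_{i,(1)}[p]=\mathbb{L}[x_{i}p]/\mathbb{L}[x_{i}]$ and $\mathbb{L}_{i,(0)}[p]=\mathbb{L}[(1-x_{i})p]/(1-\mathbb{L}[x_{i}])$ . The helper names `expectation_operator` and `condition`, the distribution, and the test polynomial are hypothetical.

```python
from fractions import Fraction


def expectation_operator(dist):
    """Return L[.] mapping a function p on 0/1 tuples to sum_x dist[x] * p(x)."""
    def L(p):
        return sum(weight * p(x) for x, weight in dist.items())
    return L


def condition(L, i):
    """Split L along x_i using the (assumed) standard conditioning formulas."""
    li = L(lambda x: x[i])
    if not 0 < li < 1:
        raise ValueError("conditioning requires 0 < L[x_i] < 1")

    def L1(p):  # operator conditioned on x_i = 1
        return L(lambda x: x[i] * p(x)) / li

    def L0(p):  # operator conditioned on x_i = 0
        return L(lambda x: (1 - x[i]) * p(x)) / (1 - li)

    return li, L0, L1


# Hypothetical data: uniform weights on the 0/1 points with x_1 + x_2 <= 3/2.
dist = {(0, 0): Fraction(1, 3), (1, 0): Fraction(1, 3), (0, 1): Fraction(1, 3)}
L = expectation_operator(dist)
li, L0, L1 = condition(L, 0)


def p(x):
    # An arbitrary test polynomial, chosen only for illustration.
    return 1 + 2 * x[0] + 3 * x[1] - x[0] * x[1]


lhs = L(p)
rhs = li * L1(p) + (1 - li) * L0(p)
print(lhs == rhs, L0(lambda x: x[0]), L1(lambda x: x[0]))
```

The identity holds by linearity, since $x_{i}p+(1-x_{i})p=p$ ; the sketch says nothing about feasibility of the conditioned operators at the lower degree, which is the nontrivial part of the lemma.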

Example 5.3.

Consider the set of polynomials $\mathcal{G}=\{1,\pm(x_{1}^{2}-x_{1}),\pm(x_{2}^{2}-x_{2}),3/2-x_{1}-x_{2}\}$ . The feasibility set of $\overline{\mathbb{SA}}_{\mathcal{G}}^{2d+2\lceil d_{\mathcal{G}}/2\rceil}$ is the convex hull of its integral solutions. The feasibility set for $\overline{\mathbb{SA}}_{\mathcal{G}}^{6}$ is shown in figure 4 a. The feasibility set for a standard relaxation (replacing the integrality constraints with constraints $0\leq x_{i}\leq 1$ ), which corresponds to the feasibility region of $\overline{\mathbb{SA}}_{\mathcal{G}}^{2}$ , can be seen in figure 4 b. The set of linear operators $\mathbb{L}[\cdot]$ feasible for $\overline{\mathbb{SA}}_{\mathcal{G}}^{2}$ that additionally can be expressed as a convex combination of operators integral on the variable $x_{1}$ ( $x_{2}$ ) can be seen in figure 4 c (d), respectively. Finally, the set of operators feasible for $\overline{\mathbb{SA}}_{\mathcal{G}}^{4}$ can be seen in figure 4 e. Note that the feasibility region in figure 4 e is the intersection of the regions in figure 4 c and figure 4 d. ∎

In the construction of an algorithm, the conditioning property allows one to take the solution of $\overline{\mathbb{SA}}_{\mathcal{G}}^{2n+2\lceil d_{\mathcal{G}}/2\rceil}$ , choose a crucial variable that was assigned a fractional value, and require this value to be either 0 or 1, depending on the user's preferences. The resulting solution is conditioned to be integral on this variable and stays feasible for the degree $2n+2\lceil d_{\mathcal{G}}/2\rceil-2$ relaxation $\overline{\mathbb{SA}}$ . Clearly, the more variables are conditioned to be integral, the higher the required degree of the $\overline{\mathbb{SA}}$ solution.

In general the conditioning property might not be easy to prove for a dual formulation of a given proof system. Assume that $\mathbb{GEN}\subset\mathcal{K}({\mathcal{G}_{+}})$ is such that the corresponding conic program admits no duality gap (the minimum value of the primal problem equals the maximum value of the dual program). Note that this assumption is necessary since theorem 5.1 provides arguments on the primal side, namely the semialgebraic proof system. If the corresponding conic program admits a duality gap, then the implications of theorem 5.1 on the dual side might not be correct.

In what follows we provide a corollary giving sufficient conditions for the proof system to admit the conditioning property.

Corollary 5.4.

Every proof system $\mathbb{GEN}$ satisfying the requirements of theorem 5.1 admits the conditioning property stated in lemma 5.2.

5.2. SONC vs SA

In this section we show two results. First, we show that $\textsc{SONC}^{*}$ is at least as strong as $\textsc{SA}^{*}$ , meaning that SONC${}^{*}$ contains SA${}^{*}$ . Second, by showing that every circuit polynomial is a $d$ -junta, we show that SA is at least as strong as SONC and that the same holds for the systems strengthened with a Schmüdgen-like Positivstellensatz, i.e., SA contains SONC and SA${}^{*}$ contains SONC${}^{*}$ . As a result, we get that SA${}^{*}$ and SONC${}^{*}$ are polynomially equivalent.

Lemma 5.5.

There exists a $c\in\mathbb{N}$ such that for every $d\in\mathbb{N}^{*}$ we have ${f}_{\mathbb{SA},\operatorname{Prep}(\mathcal{G}^{\prime})}^{2d}\leq\ {f}_{\mathbb{SONC},\operatorname{Prep}(\mathcal{G}^{\prime})}^{2cd}$ , meaning that SONC${}^{*}$ contains SA${}^{*}$ .

Proof.

By [DKdW18, Theorem 4.7], $\textsc{SONC}^{*}$ satisfies the condition of theorem 5.1, and thus the statement follows. ∎

Next, we prove that a converse of the inequality in lemma 5.5 also holds over the Boolean hypercube. We start with a technical lemma.

Lemma 5.6.

Let $f$ be a circuit polynomial of total degree $d$ . Then $f$ is a $d$ -junta.

Proof.

Consider a nonnegative circuit polynomial of the form

[TABLE]

Since, by the definition 2.7, $\boldsymbol{\beta}=\sum_{j=0}^{r}\lambda_{j}\boldsymbol{\alpha}(j)$ with $\sum_{j=0}^{r}\lambda_{j}=1$ , $\lambda_{j}>0$ and $\boldsymbol{\beta}\in\mathbb{N}^{r}$ we have $\lambda_{j}\leq\frac{1}{r+1}$ for at least one $j$ . Thus, the total degree of $f$ satisfies $\deg(f)\geq r+1$ .

Consequently, if $f$ is of degree $d$ , then the homogenization of $f$ contains at most $d$ variables, which implies that $f$ is a $d$ -junta by definition; see subsection 2.3. ∎
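As an illustration of lemma 5.6, consider the classical Motzkin polynomial. The sketch below assumes that definition 2.7 is the usual one, with outer exponents $\boldsymbol{\alpha}(j)$ spanning a simplex and a single inner exponent $\boldsymbol{\beta}$ ; the labelling of the exponents is ours.

```latex
\[
  M(x_{1},x_{2}) \;=\; 1 + x_{1}^{4}x_{2}^{2} + x_{1}^{2}x_{2}^{4} - 3\,x_{1}^{2}x_{2}^{2},
  \qquad
  \boldsymbol{\beta}=(2,2)
  =\tfrac{1}{3}\,(0,0)+\tfrac{1}{3}\,(4,2)+\tfrac{1}{3}\,(2,4).
\]
```

Here $r=2$ and $\lambda_{0}=\lambda_{1}=\lambda_{2}=\frac{1}{3}=\frac{1}{r+1}$ . The total degree is $6\geq r+1=3$ , and $M$ depends on only two variables, so it is in particular a $6$ -junta.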

We obtain the following corollary:

Corollary 5.7.

Let $f\in\mathbb{R}[\boldsymbol{x}]$ . For every $d\in\mathbb{N}^{*}$ we have ${f}_{\mathbb{SONC},\mathcal{G}}^{2d}\leq\ {f}_{\mathbb{SA},\mathcal{G}}^{4d}$ and ${f}_{\mathbb{SONC},\operatorname{Prep}(\mathcal{G})}^{2d}\leq\ {f}_{\mathbb{SA},\operatorname{Prep}(\mathcal{G})}^{4d}$ . In other words, SA contains SONC and SA${}^{*}$ contains SONC${}^{*}$ .

Proof.

Assume that there exists a degree- $2d$ SONC certificate for $f$ of the form $f=\sum_{i=1}^{m}s_{i}g_{i}$ such that the $s_{i}$ are SONCs and $g_{i}\in\mathcal{G}$ . For every $i\in[m]$ we write $s_{i}=\sum_{j=1}^{k_{i}}c_{ij}$ where $k_{i}\in\mathbb{N}$ and every $c_{ij}$ is a nonnegative circuit polynomial satisfying $\deg(c_{ij}g_{i})\leq 2d$ . By lemma 5.6, every $c_{ij}$ is a $\deg(c_{ij})$ -junta, which yields the first inequality. For the second inequality the proof works analogously with $g_{i}\in\operatorname{Prep}(\mathcal{G})$ . ∎

5.3. SONC vs. SDSOS

In section 4, we saw that in general every SDSOS certificate is a SONC certificate, but not vice versa. Here, we show that the situation is more special on the Boolean hypercube: It turns out that the relation of the two certificates depends on the type of hierarchy, which we allow for SDSOS. We show the following result:

Theorem 5.8.

Let $\mathcal{G}$ be as before with $\mathcal{G}_{+}\subseteq\{0,1\}^{n}$ . Then for all $d\in\mathbb{N}$ we have the following dependencies: (1) $\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{SDSOS}})=\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{SONC}})$ ; (2) $\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})\subsetneq\operatorname{Hier}_{\operatorname{Prep}(\mathcal{G})}^{2d}({\mathbb{SONC}})$ ; (3) $\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SDSOS}})\subseteq\operatorname{Hier}_{\mathcal{G}}^{2d}({\mathbb{SONC}})$ . This implies that SONC${}^{*}$ is polynomially equivalent to SDSOS${}^{*}$ , SONC${}^{*}$ strictly contains SDSOS, and SONC contains SDSOS.

Proof.

Following [DKdW18, Definition 10], for every $\mathbf{v}\in\{0,1\}^{n}$ the function

[TABLE]

is called the Kronecker delta (function) of the vector $\mathbf{v}$ . By [DKdW18, Theorem 12] $f$ has a certificate of the form

[TABLE]

where $s_{1},\ldots,s_{2n}$ are SONCs of degree at most $n-2$ , $c_{\mathbf{v}}\in\mathbb{R}_{\geq 0}$ , $g_{i}\in\operatorname{Prep}(\mathcal{G})$ , and $p_{\mathbf{v}}\in\mathcal{G}$ .

Part (1): By [DKdW18, Lemma 14], $\delta_{\boldsymbol{v}}(\boldsymbol{x})\in\operatorname{Prep}(\mathcal{G})$ . Thus, by corollary 4.2, it only remains to show that every SONC involved in the certificate is a binomial square. But this was shown in Case 2 of the proof of [DKdW18, Theorem 16].

Part (2): Follows immediately from the fact that the example from [DKdW18, Theorem 19] to prove corollary 4.4 is an example defined over the Boolean hypercube.

Part (3): Follows from the fact that, by lemma 4.1, every SDSOS certificate can be rewritten as a SONC certificate; see also corollary 4.2. ∎

5.4. SDSOS vs SA

In this section we show that for BCPOPs, SA contains SDSOS.

Theorem 5.9.

Let $\mathcal{G}$ be as before with $\mathcal{G}_{+}\subseteq\{0,1\}^{n}$ . Let $f\in\mathbb{R}[\boldsymbol{x}]$ . For every $d\in\mathbb{N}^{*}$ we have ${f}_{\mathbb{SDSOS},\mathcal{G}}^{2d}\leq\ {f}_{\mathbb{SA},\mathcal{G}}^{4d}$ .

The theorem follows immediately from corollary 5.7 and theorem 5.8. We provide, however, an independent proof here, which works on the dual side and which we consider to be of independent interest.

For every set $K\subseteq[n]$ we denote its power set by $\mathcal{P}(K)$ .

Proof.

Consider the problem

[TABLE]

This problem is a relaxation of the problem $\overline{\mathbb{SOS}}^{4d}$ , see (2.2), since, by Sylvester’s Theorem, a symmetric matrix $M$ is PSD if and only if all the principal submatrices are PSD.

First, we show that the degree $4d$ Sherali Adams relaxation is equivalent to (5.4); a similar reasoning can also be found, e.g., in [Lau03, Section 3.2, Equation (19)]. Consider a constraint $g\in\mathcal{G}$ and a set $K\subseteq[n]$ such that $|K|\leq\lfloor\frac{4d-\deg(g)}{2}\rfloor$ . The Möbius matrix $Z_{K}^{-1}\in\{-1,0,1\}^{2^{|K|}\times 2^{|K|}}$ is the square matrix indexed by subsets $I,J$ of $K$ such that $Z_{K}^{-1}(I,J)=(-1)^{|J\setminus I|}$ if $I\subseteq J$ and $Z_{K}^{-1}(I,J)=0$ otherwise. Compute the matrix:

[TABLE]

One can check that ${D_{g}^{4d}}_{\big{|}\mathcal{P}(K)}$ is a diagonal matrix with entries

[TABLE]

see, e.g., [Lau03, Lemma 2].

Finally, since $Z_{K}^{-1}{M_{g}^{4d}}_{\big{|}\mathcal{P}(K)}\left(Z_{K}^{-1}\right)^{\top}$ is a congruence transformation of ${M_{g}^{4d}}_{\big{|}\mathcal{P}(K)}$ , which preserves the signs of the eigenvalues (Sylvester's law of inertia), we get that ${M_{g}^{4d}}_{\big{|}\mathcal{P}(K)}\succeq 0\Leftrightarrow{D_{g}^{4d}}_{\big{|}\mathcal{P}(K)}\succeq 0\Leftrightarrow{D_{g}^{4d}}_{\big{|}\mathcal{P}(K)}(I,I)\geq 0$ for all $I\in\mathcal{P}(K)$ . Setting $J=K\setminus I$ we get a one-to-one mapping to functions in $\mathbb{SA}_{\mathcal{G}}^{4d}$ .
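The diagonalization by the Möbius matrix can be checked on a small instance. The following is a minimal sketch under stated assumptions: we take $g=1$ , $K=\{1,2\}$ , and a moment matrix $M(I,J)=\mathbb{E}[\prod_{i\in I\cup J}x_{i}]$ coming from an actual probability distribution on $\{0,1\}^{K}$ (a hypothetical choice; feasible operators need not arise this way). Under these assumptions the diagonal entry at $I$ is expected to be the probability that exactly the variables in $I$ equal 1. All helper names are ours.

```python
from fractions import Fraction
from itertools import combinations


def subsets(K):
    """All subsets of K as frozensets, ordered by size."""
    items = sorted(K)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mobius_matrix(P):
    """Entry (I, J) is (-1)^{|J minus I|} if I is a subset of J, else 0."""
    return [[(-1) ** len(J - I) if I <= J else 0 for J in P] for I in P]


def moment_matrix(dist, P):
    """M(I, J) = E[prod_{i in I u J} x_i]; dist maps supports to weights."""
    def moment(S):
        return sum(w for support, w in dist.items() if S <= support)
    return [[moment(I | J) for J in P] for I in P]


def matmul(A, B):
    """Plain exact matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Hypothetical distribution on {0,1}^K, indexed by the set of coordinates equal to 1.
K = {1, 2}
dist = {frozenset(): Fraction(1, 2),
        frozenset({1}): Fraction(1, 4),
        frozenset({2}): Fraction(1, 4)}

P = subsets(K)
Zinv = mobius_matrix(P)
M = moment_matrix(dist, P)
D = matmul(matmul(Zinv, M), transpose(Zinv))  # Z^{-1} M (Z^{-1})^T
```

Exact rational arithmetic is used so that the off-diagonal zeros are verified exactly rather than up to rounding.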

Next, we show that every linear operator $\widetilde{\mathbb{E}}[\cdot]$ that is feasible for $\overline{\mathbb{SA}}^{{{}^{\prime}}4d}$ is also feasible for $\overline{\mathbb{SDSOS}}^{{{}^{\prime}}2d}$ , that is, for every $g\in\mathcal{G}$ and every $K,L\subseteq[n]$ such that $|K|,|L|\leq\lfloor\frac{2d-\deg(g)}{2}\rfloor$ the matrix ${M_{g}^{2d}}_{\big{|}\{K,L\}}$ is PSD.

Consider the set $H=K\cup L$ . Since $|H|\leq 2\lfloor\frac{2d-\deg(g)}{2}\rfloor$ , for $d,\deg(g)\in\mathbb{N}_{+}$ we have $|H|\leq\lfloor\frac{4d-\deg(g)}{2}\rfloor$ . Now consider the submatrix ${M_{g}^{4d}}_{\big{|}\mathcal{P}(H)}$ . Every linear operator $\widetilde{\mathbb{E}}[\cdot]$ that is feasible for $\overline{\mathbb{SA}}^{{{}^{\prime}}4d}$ has to satisfy ${M_{g}^{4d}}_{\big{|}\mathcal{P}(H)}\succeq 0$ . Finally, since ${M_{g}^{2d}}_{\big{|}\{K,L\}}$ is a principal submatrix of ${M_{g}^{4d}}_{\big{|}\mathcal{P}(H)}$ , by Sylvester’s criterion every linear operator $\widetilde{\mathbb{E}}[\cdot]$ that is feasible for $\overline{\mathbb{SA}}^{{{}^{\prime}}4d}$ also has to satisfy ${M_{g}^{2d}}_{\big{|}\{K,L\}}\succeq 0$ . This finishes the proof. ∎
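The cardinality bound on $H$ used in the proof can be verified step by step. The sketch below only uses $\lfloor t\rfloor\leq t$ and the fact that $a\geq\lceil a/2\rceil$ for every nonnegative integer $a$ .

```latex
\begin{align*}
|H| \;=\; |K\cup L| \;\leq\; |K|+|L|
  &\leq 2\left\lfloor\frac{2d-\deg(g)}{2}\right\rfloor
   \;\leq\; 2d-\deg(g)\\
  &\leq 2d-\left\lceil\frac{\deg(g)}{2}\right\rceil
   \;=\; \left\lfloor\frac{4d-\deg(g)}{2}\right\rfloor .
\end{align*}
```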

Bibliography


[AH17] A. A. Ahmadi and G. Hall, Sum of squares basis pursuit with linear and second order cone programming, Algebraic and geometric methods in discrete mathematics. AMS special session on algebraic and geometric methods in applied discrete mathematics, San Antonio, TX, USA, January 11, 2015. Proceedings, Providence, RI: American Mathematical Society (AMS), 2017, pp. 27–53.
[AM13] A. Atserias and E. Maneva, Sherali–Adams relaxations and indistinguishability in counting logics, SIAM Journal on Computing 42 (2013), no. 1, 112–137.
[AM14] A. A. Ahmadi and A. Majumdar, DSOS and SDSOS optimization: LP and SOCP-based alternatives to sum of squares optimization, 48th Annual Conference on Information Sciences and Systems, CISS 2014, Princeton, NJ, USA, March 19-21, 2014, 2014, pp. 1–5.
[AM17] A. A. Ahmadi and A. Majumdar, DSOS and SDSOS optimization: More tractable alternatives to sum of squares and semidefinite optimization, CoRR abs/1706.02586 (2017).
[AMT14] A. A. Ahmadi, A. Majumdar, and R. Tedrake, Control and verification of high-dimensional systems with DSOS and SDSOS programming, CDC 2014, Los Angeles, CA, USA, December 15-17, 2014, 2014, pp. 394–401.
[ARV09] S. Arora, S. Rao, and U. V. Vazirani, Expander flows, geometric embeddings and graph partitioning, J. ACM 56 (2009), no. 2, 5:1–5:37.
[Ave18] G. Averkov, Optimal size of linear matrix inequalities in semidefinite approaches to polynomial optimization, 2018, Preprint; see arXiv:1806.08656.
[BC75] G. P. Barker and D. Carlson, Cones of diagonally dominant matrices, Pacific J. Math. 57 (1975), no. 1, 15–32.
