Algorithm for Optimization and Interpolation based on Hyponormality

Cédric Josz

arXiv:1703.06604·math.OC·March 21, 2017

TL;DR

This paper introduces a unified linear algebra-based algorithm leveraging hyponormality to solve complex polynomial optimization and exponential interpolation problems, extending existing methods and avoiding algebraic geometry.

Contribution

It presents what appears to be the first algorithm capable of extracting global solutions from complex polynomial optimization problems, and it offers an alternative to Prony's method for interpolation.

Findings

1. Extracts global solutions from complex polynomial optimization problems.

2. Provides a linear-algebra-based alternative to Prony's method for interpolation.

3. Demonstrates that hyponormality can be enforced to strengthen relaxation solutions.

Tables

Table 1: Properties of shift operators

Truncated data	Shift operators	Experiment
General case (only Hermitian)	Existence not guaranteed	Example 6.1
Hermitian but not hyponormal	$T_{1}, \dots, T_{n}, T_{1}^{*}, \dots, T_{n}^{*}$ do not commute	Example 6.2
Hyponormal	$T_{1}, \dots, T_{n}, T_{1}^{*}, \dots, T_{n}^{*}$ commute	Example 6.3
Enforced hyponormality	$T_{1}, \dots, T_{n}, T_{1}^{*}, \dots, T_{n}^{*}$ are made to commute	Example 6.4
Hermitian Toeplitz	Unitary $T_{k}^{*} = T_{k}^{- 1}, k = 1, \dots, n$	Example 6.5
Hermitian Hankel	Real symmetric $T_{k}^{T} = T_{k}, k = 1, \dots, n$	Example 6.6
Complex Hankel	Complex symmetric $T_{k}^{T} = T_{k}, k = 1, \dots, n$	Example 6.7

Equations

$\begin{array}{ll}\inf_{z\in\mathbb{C}^{n}}&f(z):=\sum\limits_{\alpha,\beta}f_{\alpha,\beta}\bar{z}^{\alpha}z^{\beta}\\ \mathrm{s.t.}&g_{i}(z):=\sum\limits_{\alpha,\beta}g_{i,\alpha,\beta}\bar{z}^{\alpha}z^{\beta}\geqslant 0,\quad i=1,\ldots,m\end{array}$

$K:=\{z\in\mathbb{C}^{n}~:~g_{i}(z)\geqslant 0,~i=1,\ldots,m\}.$

$d_{K}:=\max\{1,k_{1},\ldots,k_{m}\}$

$\begin{array}{c}\inf_{y}~L_{y}(f)~~~\text{s.t.}~~~M_{d}(y)\succcurlyeq 0~\text{and}~M_{d-k_{i}}(g_{i}y)\succcurlyeq 0,~i=1,\ldots,m,\\ \\ \sup_{\lambda\in\mathbb{R},\sigma_{j}\in\Sigma_{d}[z]}~\lambda~~~\text{s.t.}~~~f-\lambda=\sigma_{0}+\sigma_{1}g_{1}+\ldots+\sigma_{m}g_{m}\end{array}$

$M_{d}(y):=(y_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$

$M_{d-k_{i}}(g_{i}y):=\left(\sum\limits_{\gamma,\delta}g_{i,\gamma,\delta}\,y_{\alpha+\gamma,\beta+\delta}\right)_{|\alpha|,|\beta|\leqslant d-k_{i}}.$

$\sigma(z)=\sum\limits_{k}\left|\sum\limits_{|\alpha|\leqslant d}p_{k,\alpha}z^{\alpha}\right|^{2}~~\text{where}~~p_{k,\alpha}\in\mathbb{C}.$

$y_{\alpha,\beta}=\int_{K}\bar{z}^{\alpha}z^{\beta}d\mu,\quad\forall\,|\alpha|,|\beta|\leqslant d.$

$\begin{array}{rccl}f:&\mathbb{C}^{n}&\longrightarrow&\mathbb{C}\\ &z&\longmapsto&\sum\limits_{k=1}^{d}w_{k}\exp\left(\sum\limits_{i=1}^{n}f_{ik}z_{i}\right)\end{array}$

$f(\alpha)=y_{\alpha},\quad\forall\,|\alpha|\leqslant 2d.$

$f(\alpha)=\sum\limits_{k=1}^{d}w_{k}\exp\left(\sum\limits_{i=1}^{n}f_{ik}\alpha_{i}\right)=\sum\limits_{k=1}^{d}w_{k}\left(\exp(f_{k})\right)^{\alpha}=\int_{\mathbb{C}^{n}}z^{\alpha}d\nu$

$\nu:=\sum\limits_{k=1}^{d}w_{k}\delta_{\exp(f_{k})}.$

$\mathcal{H}_{d}(y):=(y_{\alpha+\beta})_{|\alpha|,|\beta|\leqslant d}$

$f(t)=\sum\limits_{k=1}^{d}A_{k}\exp(\sigma_{k}t)\cos(w_{k}t+\phi_{k})$

$p(z):=\sum\limits_{|\alpha|\leqslant d}p_{\alpha}z^{\alpha}:=(z-\exp(f_{1}))\cdots(z-\exp(f_{d}))$

$\begin{array}{rcl}\sum\limits_{|\beta|\leqslant d}f(\alpha+\beta)~p_{\beta}&=&\sum\limits_{|\beta|\leqslant d}\sum\limits_{k=1}^{d}w_{k}\exp(f_{k}(\alpha+\beta))~p_{\beta}\\ &=&\sum\limits_{k=1}^{d}w_{k}\exp(f_{k}\alpha)~\sum\limits_{|\beta|\leqslant d}p_{\beta}\exp(f_{k})^{\beta}\\ &=&\sum\limits_{k=1}^{d}w_{k}\exp(f_{k}\alpha)~p(\exp(f_{k}))=0\end{array}$

$\begin{pmatrix}1&\cdots&1\\ \exp(f_{1})&\cdots&\exp(f_{d})\\ \vdots&&\vdots\\ \exp(f_{1})^{d-1}&\cdots&\exp(f_{d})^{d-1}\end{pmatrix}\begin{pmatrix}w_{1}\\ \vdots\\ w_{d}\end{pmatrix}=\begin{pmatrix}f(0)\\ \vdots\\ f(d-1)\end{pmatrix}.$

$\text{normal}~\Longrightarrow~\text{subnormal}~\Longrightarrow~\text{hyponormal}$

$\begin{array}{rcl}\langle u,[T^{*},T]u\rangle&=&\|Tu\|^{2}-\|T^{*}u\|^{2}\\ &=&\|Nu\|^{2}-\|PN^{*}u\|^{2}\\ &=&\|N^{*}u\|^{2}-\|P(N^{*}u)\|^{2}\geqslant 0\end{array}$

$\begin{array}{rccc}T:&l^{2}(\mathbb{N})&\longrightarrow&l^{2}(\mathbb{N})\\ &(u_{0},u_{1},u_{2},\ldots)&\longmapsto&(0,u_{0},u_{1},\ldots)\end{array}$

$\begin{array}{rccc}T^{*}:&l^{2}(\mathbb{N})&\longrightarrow&l^{2}(\mathbb{N})\\ &(u_{0},u_{1},u_{2},\ldots)&\longmapsto&(u_{1},u_{2},u_{3},\ldots)\end{array}$

$\begin{array}{rccc}N:&l^{2}(\mathbb{Z})&\longrightarrow&l^{2}(\mathbb{Z})\\ &(\ldots,u_{-2},u_{-1},u_{0},u_{1},u_{2},\ldots)&\longmapsto&(\ldots,u_{-3},u_{-2},u_{-1},u_{0},u_{1},\ldots)\end{array}$

$\begin{pmatrix}[T_{1}^{*},T_{1}]&[T_{2}^{*},T_{1}]&\cdots&[T_{n}^{*},T_{1}]\\ [T_{1}^{*},T_{2}]&[T_{2}^{*},T_{2}]&\cdots&[T_{n}^{*},T_{2}]\\ \vdots&\vdots&&\vdots\\ [T_{1}^{*},T_{n}]&[T_{2}^{*},T_{n}]&\cdots&[T_{n}^{*},T_{n}]\end{pmatrix}\succcurlyeq 0$

$\sum\limits_{i,j=1}^{n}\langle u_{i},[T_{j}^{*},T_{i}]u_{j}\rangle\geqslant 0.$

$\begin{pmatrix}A&B^{*}\\ B&C\end{pmatrix}=\begin{pmatrix}I&0\\ BA^{-1}&I\end{pmatrix}\begin{pmatrix}A&0\\ 0&C-BA^{-1}B^{*}\end{pmatrix}\begin{pmatrix}I&A^{-1}B^{*}\\ 0&I\end{pmatrix}.$

$\begin{pmatrix}I&T_{1}^{*}&T_{2}^{*}&\cdots&T_{n}^{*}\\ T_{1}&T_{1}^{*}T_{1}&T_{2}^{*}T_{1}&\cdots&T_{n}^{*}T_{1}\\ T_{2}&T_{1}^{*}T_{2}&T_{2}^{*}T_{2}&\cdots&T_{n}^{*}T_{2}\\ \vdots&\vdots&\vdots&&\vdots\\ T_{n}&T_{1}^{*}T_{n}&T_{2}^{*}T_{n}&\cdots&T_{n}^{*}T_{n}\end{pmatrix}\succcurlyeq 0$

$\sum\limits_{i,j}\overline{t_{i}}\,t_{j}\,[T_{i}^{*},T_{j}]\succcurlyeq 0.$

$[T_{i}^{*},T_{i}]\succcurlyeq 0~\Longleftrightarrow~[T_{i}^{*},T_{i}]=0.$

$\begin{pmatrix}I&T_{i}^{*}&T_{j}^{*}\\ T_{i}&T_{i}^{*}T_{i}&T_{j}^{*}T_{i}\\ T_{j}&T_{i}^{*}T_{j}&T_{j}^{*}T_{j}\end{pmatrix}\succcurlyeq 0.$

$\left\{\begin{array}{c}T_{1}=UD_{1}U^{*}\\ T_{2}=UD_{2}U^{*}\\ \vdots\\ T_{n}=UD_{n}U^{*}\end{array}\right.$

Taxonomy

Topics: Polynomial and algebraic computation · Advanced Optimization Algorithms Research · Numerical Methods and Algorithms

Full text

Cédric Josz

Laboratory for Analysis and Architecture of Systems (LAAS), French National Center for Scientific Research (CNRS), 7, avenue du Colonel Roche, Toulouse, 31000, France. The research was funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement 666981 TAMING).

Abstract

On one hand, consider the problem of finding global solutions to a polynomial optimization problem and, on the other hand, consider the problem of interpolating a set of points with a complex exponential function. This paper proposes a single algorithm to address both problems. It draws on the notion of hyponormality in operator theory. Concerning optimization, it seems to be the first algorithm that is capable of extracting global solutions from a polynomial optimization problem where the variables and data are complex numbers. It also applies to real polynomial optimization, a special case of complex polynomial optimization, and thus extends the work of Henrion and Lasserre implemented in GloptiPoly. Concerning interpolation, the algorithm provides an alternative to Prony’s method based on the Autonne-Takagi factorization and it avoids solving a Vandermonde system. The algorithm and its proof are based exclusively on linear algebra. They are devoid of notions from algebraic geometry, contrary to existing methods for interpolation. The algorithm is tested on a series of examples, each illustrating a different facet of the approach. One of the examples demonstrates that hyponormality can be enforced numerically to strengthen a convex relaxation and to force its solution to have rank one.

Keywords: Autonne-Takagi factorization, Cholesky factorization, Hankel matrix, hyponormality, moment problem, Toeplitz matrix.

AMS subject classifications: 49M20, 65F99, 47N10.

1 Introduction

Consider the problem of finding global solutions to the following complex polynomial optimization problem

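$\begin{array}{ll}\inf_{z\in\mathbb{C}^{n}}&f(z):=\sum\limits_{\alpha,\beta}f_{\alpha,\beta}\bar{z}^{\alpha}z^{\beta}\\ \mathrm{s.t.}&g_{i}(z):=\sum\limits_{\alpha,\beta}g_{i,\alpha,\beta}\bar{z}^{\alpha}z^{\beta}\geqslant 0,\quad i=1,\ldots,m\end{array}$ (1)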

where we use the multi-index notation $z^{\alpha}:=z_{1}^{\alpha_{1}}\cdots z_{n}^{\alpha_{n}}$ for $z\in{\mathbb{C}}^{n}$ , $\alpha\in{\mathbb{N}}^{n}$ , and $\bar{z}$ stands for the conjugate of $z$ . As usual, $\mathbb{C}$ denotes the set of complex numbers and $\mathbb{R}$ will denote the set of real numbers. The functions $f,g_{1},\ldots,g_{m}$ are real-valued polynomials so that in the above sums only a finite number of coefficients $f_{\alpha,\beta}$ and $g_{i,\alpha,\beta}$ are nonzero and they satisfy $\overline{f_{\alpha,\beta}}=f_{\alpha,\beta}$ and $\overline{g_{i,\alpha,\beta}}=g_{i,\alpha,\beta}$ . The feasibility set is defined as

$K:=\{z\in\mathbb{C}^{n}~:~g_{i}(z)\geqslant 0,~i=1,\ldots,m\}.$

We define its degree to be

$d_{K}:=\max\{1,k_{1},\ldots,k_{m}\}$

where $k_{i}:=\max\{|\alpha|,|\beta|~{}\text{s.t.}~{}g_{i,\alpha,\beta}\neq 0\}$ is the maximal degree $|\alpha|:=\sum_{k=1}^{n}\alpha_{k}$ in either the conjugate or non-conjugate powers of the polynomial $g_{i}$ . Note that this is different from the degree of the polynomial, which is related to the sum of the conjugate and non-conjugate powers, i.e. $\text{deg}(g_{i}):=\max\{|\alpha|+|\beta|~{}\text{s.t.}~{}g_{i,\alpha,\beta}\neq 0\}$ .
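The distinction between $k_{i}$ and $\text{deg}(g_{i})$ can be made concrete with a small Python sketch. The dictionary representation of a polynomial (keys are pairs of exponent tuples $(\alpha,\beta)$) and the helper names are illustrative assumptions, not notation from the paper.

```python
# Hypothetical sparse representation: {(alpha, beta): coefficient},
# where alpha holds the powers of conj(z) and beta the powers of z.

def half_degree(poly):
    """k := max{|alpha|, |beta|} over nonzero coefficients."""
    return max(max(sum(a), sum(b)) for (a, b), c in poly.items() if c != 0)

def total_degree(poly):
    """deg := max{|alpha| + |beta|} over nonzero coefficients."""
    return max(sum(a) + sum(b) for (a, b), c in poly.items() if c != 0)

def feasible_set_degree(constraints):
    """d_K := max{1, k_1, ..., k_m}."""
    return max([1] + [half_degree(g_i) for g_i in constraints])

# g(z) = 1 - |z_1|^2 - |z_2|^2 in n = 2 complex variables
g = {
    ((0, 0), (0, 0)): 1.0,
    ((1, 0), (1, 0)): -1.0,
    ((0, 1), (0, 1)): -1.0,
}

print(half_degree(g), total_degree(g), feasible_set_degree([g]))
```

For this ball constraint, the conjugate and non-conjugate powers each have degree one, whereas the polynomial has degree two in the usual sense.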

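To solve (1), Molzahn and I proposed [43] a semidefinite programming relaxation hierarchy in complex numbers which generalizes Lasserre’s hierarchy [49, 50, 60, 61] and relies on the recent results [25, 68]. It consists in the following primal-dual problems

$\begin{array}{c}\inf_{y}~L_{y}(f)~~~\text{s.t.}~~~M_{d}(y)\succcurlyeq 0~\text{and}~M_{d-k_{i}}(g_{i}y)\succcurlyeq 0,~i=1,\ldots,m,\\ \\ \sup_{\lambda\in\mathbb{R},\sigma_{j}\in\Sigma_{d}[z]}~\lambda~~~\text{s.t.}~~~f-\lambda=\sigma_{0}+\sigma_{1}g_{1}+\ldots+\sigma_{m}g_{m}\end{array}$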

where $\succcurlyeq 0$ stands for positive semidefinite and the integer $d$ is the truncation order. The moment matrix is defined by

$M_{d}(y):=(y_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$

and the localizing matrix is defined by

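$M_{d-k_{i}}(g_{i}y):=\left(\sum\limits_{\gamma,\delta}g_{i,\gamma,\delta}\,y_{\alpha+\gamma,\beta+\delta}\right)_{|\alpha|,|\beta|\leqslant d-k_{i}}.$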

A polynomial $\sigma(z)=\sum_{|\alpha|,|\beta|\leqslant d}\sigma_{\alpha,\beta}\bar{z}^{\alpha}z^{\beta}$ is a Hermitian sum of squares, i.e. it belongs to $\Sigma_{d}[z]$ , if it is of the form

$\sigma(z)=\sum\limits_{k}\left|\sum\limits_{|\alpha|\leqslant d}p_{k,\alpha}z^{\alpha}\right|^{2}~~\text{where}~~p_{k,\alpha}\in\mathbb{C}.$

This is equivalent to $(\sigma_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}\succcurlyeq 0$ (see [43] for an explanation). In the above equation, $|\cdot|$ stands for the modulus of a complex number.

If one were to convert a complex polynomial optimization problem to real numbers and apply Lasserre’s hierarchy, the moment matrix would be asymptotically $2^{d}$ times bigger (for a large number of variables), hence the advantage of the complex hierarchy described above. This comes at the cost of a potentially lower optimal value at a given truncation order $d$ . Significant numerical advantages have been observed on the optimal power flow problem in electrical engineering [12, 57, 43], on instances with several thousand variables and constraints.
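The $2^{d}$ factor can be checked numerically with a short Python sketch. It assumes the standard monomial counts: the complex moment matrix is indexed by the $\binom{n+d}{d}$ monomials $z^{\alpha}$ with $|\alpha|\leqslant d$, and the real one by the $\binom{2n+d}{d}$ monomials in $2n$ real variables. The function names are hypothetical.

```python
from math import comb

def complex_moment_matrix_size(n, d):
    # number of monomials z^alpha with |alpha| <= d in n complex variables
    return comb(n + d, d)

def real_moment_matrix_size(n, d):
    # same count after splitting each complex variable into two real ones
    return comb(2 * n + d, d)

for n in (10, 100, 1000):
    for d in (1, 2, 3):
        ratio = real_moment_matrix_size(n, d) / complex_moment_matrix_size(n, d)
        print(n, d, round(ratio, 3), 2 ** d)
```

For fixed $d$, the printed ratio approaches $2^{d}$ as $n$ grows, which is the asymptotic statement made above.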

Global convergence is guaranteed in the presence of a sphere constraint, i.e. $|z_{1}|^{2}+\ldots+|z_{n}|^{2}=R^{2}$ . At the cost of an additional variable, any complex polynomial optimization problem with compact feasible set can be solved by this approach, as explained in [43]. In that work, conditions for extracting global minimizers were given, but a general procedure for extracting them was left for future work. One of the objectives of this paper is to fill this gap. Specifically, we propose an algorithm to extract an atomic measure $\mu$ from the truncated data $M_{d}(y)$ , i.e. that satisfies

$y_{\alpha,\beta}=\int_{K}\bar{z}^{\alpha}z^{\beta}d\mu,\quad\forall\,|\alpha|,|\beta|\leqslant d.$

The atoms are then global solutions to the polynomial optimization problem. To the best of our knowledge, no method in the present literature achieves this. Our algorithm also applies to real polynomial optimization, for which a method already exists [39] and was implemented in GloptiPoly [40]. A variant of that method was later proposed in [53]. We next consider a seemingly unrelated problem to which the same algorithm applies.

Consider the following sum of complex exponential functions

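$\begin{array}{rccl}f:&\mathbb{C}^{n}&\longrightarrow&\mathbb{C}\\ &z&\longmapsto&\sum\limits_{k=1}^{d}w_{k}\exp\left(\sum\limits_{i=1}^{n}f_{ik}z_{i}\right)\end{array}$ (9)

composed of weights $w_{1},\ldots,w_{d}\in\mathbb{C}$ and frequencies $f_{1},\ldots,f_{d}\in\mathbb{C}^{n}$ (using the shorthand $f_{k}=(f_{1k},\ldots,f_{nk})^{T}$ where $(\cdot)^{T}$ stands for transpose). Say we want to interpolate a set of imposed values $(y_{\alpha})_{|\alpha|\leqslant 2d}$ with such a function

$f(\alpha)=y_{\alpha},\quad\forall\,|\alpha|\leqslant 2d.$ (10)

In other words, the problem consists in computing weights and frequencies that match the interpolation values. As is well known (e.g., [47]), these values satisfy

$f(\alpha)=\sum\limits_{k=1}^{d}w_{k}\exp\left(\sum\limits_{i=1}^{n}f_{ik}\alpha_{i}\right)=\sum\limits_{k=1}^{d}w_{k}\left(\exp(f_{k})\right)^{\alpha}=\int_{\mathbb{C}^{n}}z^{\alpha}d\nu$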

where

$\nu:=\sum\limits_{k=1}^{d}w_{k}\delta_{\exp(f_{k})}.$

We use the notation $\exp(f_{k}):=(\exp(f_{1k}),\ldots,\exp(f_{nk}))^{T}$ and $\delta_{z}$ stands for the Dirac measure at $z\in\mathbb{C}^{n}$ . The interpolation values are thus the moments on $\mathbb{C}^{n}$ of the measure $\nu$ . In this setting, we can consider the following moment matrix

$\mathcal{H}_{d}(y):=(y_{\alpha+\beta})_{|\alpha|,|\beta|\leqslant d}$ (13)

which is a complex Hankel matrix (i.e. $y_{\alpha,\beta}=y_{\gamma,\delta}$ for all $|\alpha|,|\beta|,|\gamma|,|\delta|\leqslant d$ such that $\alpha+\beta=\gamma+\delta$ ). Our algorithm extracts the sought measure $\nu$ from this matrix by using the Autonne-Takagi factorization [77, 7], also known as symmetric singular value decomposition, which applies to complex symmetric matrices. To the best of our knowledge, this factorization has not been used in this context. For a thorough survey on the applications of complex symmetry, see [31]; for recent developments on the Autonne-Takagi factorization, see [41]. Another feature of our algorithm is that it avoids solving a Vandermonde linear system, as in the recent preprint [36]. Existing methods were initiated by Baron Gaspard Riche de Prony in 1795 [26], and have been subject to formidable developments [9, 62, 71, 5, 58, 46] and applications [8, 32, 64, 63, 70, 76]. One application is to recover a damped sinusoidal function of real variable $t$ of the form

$f(t)=\sum\limits_{k=1}^{d}A_{k}\exp(\sigma_{k}t)\cos(w_{k}t+\phi_{k})$

from a small number of its evaluations, where

• $A_{k}$ : amplitude

• $\sigma_{k}$ : damping

• $w_{k}$ : angular frequency

• $\phi_{k}$ : phase shift.

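Indeed, such a function is a special case of the complex exponential function presented above since $\cos(w_{k}t+\phi_{k})=1/2\exp(iw_{k}t+i\phi_{k})+1/2\exp(-iw_{k}t-i\phi_{k})$ . We briefly recall Prony’s method in the univariate setting of (9)-(10) above. It consists in defining the polynomial

$p(z):=\sum\limits_{|\alpha|\leqslant d}p_{\alpha}z^{\alpha}:=(z-\exp(f_{1}))\cdots(z-\exp(f_{d}))$

which, for all $|\alpha|\leqslant d$ , satisfies

$\begin{array}{rcl}\sum\limits_{|\beta|\leqslant d}f(\alpha+\beta)~p_{\beta}&=&\sum\limits_{|\beta|\leqslant d}\sum\limits_{k=1}^{d}w_{k}\exp(f_{k}(\alpha+\beta))~p_{\beta}\\ &=&\sum\limits_{k=1}^{d}w_{k}\exp(f_{k}\alpha)~\sum\limits_{|\beta|\leqslant d}p_{\beta}\exp(f_{k})^{\beta}\\ &=&\sum\limits_{k=1}^{d}w_{k}\exp(f_{k}\alpha)~p(\exp(f_{k}))=0\end{array}$

so that its coefficients lie in the kernel of the Hankel matrix $\mathcal{H}_{d}(y)$ in (13). The kernel is one-dimensional and $p_{d}=1$ , so the coefficients are uniquely determined. The roots of $p$ are $\exp(f_{1}),\ldots,\exp(f_{d})$ , which give the frequencies of the interpolation function, and the weights can then be deduced from the Vandermonde system

$\begin{pmatrix}1&\cdots&1\\ \exp(f_{1})&\cdots&\exp(f_{d})\\ \vdots&&\vdots\\ \exp(f_{1})^{d-1}&\cdots&\exp(f_{d})^{d-1}\end{pmatrix}\begin{pmatrix}w_{1}\\ \vdots\\ w_{d}\end{pmatrix}=\begin{pmatrix}f(0)\\ \vdots\\ f(d-1)\end{pmatrix}.$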
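The recap of Prony’s method above can be sketched in a few lines of Python. This is a minimal illustration of the classical univariate method (the one to which this paper proposes an alternative), not the algorithm of Section 5. The function name `prony`, the use of NumPy, and the test values are assumptions made for the example.

```python
import numpy as np

def prony(y, d):
    """Classical univariate Prony sketch; y holds samples y_0, ..., y_{2d}."""
    y = np.asarray(y, dtype=complex)
    # Hankel matrix H_d(y) = (y_{a+b}) for 0 <= a, b <= d
    H = np.array([[y[a + b] for b in range(d + 1)] for a in range(d + 1)])
    # Kernel vector normalised by p_d = 1: solve for p_0, ..., p_{d-1}
    p_low = np.linalg.solve(H[:d, :d], -H[:d, d])
    p = np.append(p_low, 1.0)
    # np.roots expects the leading coefficient first; the roots are exp(f_k)
    nodes = np.roots(p[::-1])
    # Vandermonde system: sum_k w_k * nodes_k**j = y_j for j = 0, ..., d-1
    V = np.vander(nodes, d, increasing=True).T
    weights = np.linalg.solve(V, y[:d])
    # principal logarithm: frequencies are recovered modulo 2*pi*i
    return np.log(nodes), weights

# Hypothetical test data with d = 2 exponential terms
f_true = np.array([0.1 + 0.5j, -0.2 - 1.0j])
w_true = np.array([1.0, 2.0 - 1.0j])
samples = [np.sum(w_true * np.exp(f_true * a)) for a in range(5)]
f_est, w_est = prony(samples, d=2)
print(f_est, w_est)
```

The sketch assumes distinct nodes and nonzero weights, so that the leading $d\times d$ Hankel block is invertible. The final Vandermonde solve is the step that the algorithm of this paper avoids.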

This paper is organized as follows. Section 2 summarizes the contributions of this paper. Section 3 provides some background on hyponormality and compares the notion in infinite and finite dimensions. Section 4 reviews a result on the moment problem in complex numbers and provides a new result for Toeplitz matrices. The same machinery is then applied to the truncated moment problem which arises in exponential interpolation. The constructive proofs of Section 4 yield an algorithm for optimization and interpolation in Section 5. It is illustrated on various examples in Section 6. Finally, Section 7 concludes our work.

2 Contributions

The main contribution of this paper is to apply hyponormality to polynomial optimization and exponential interpolation. To the best of our knowledge, this has not been considered in past literature. We obtain the following new results using hyponormality.

1. We propose a procedure to extract global solutions from a polynomial optimization problem when using the moment/sum-of-squares hierarchy in complex numbers (algorithm in Section 5).

2. We propose a variant of Prony’s method based on the Autonne-Takagi factorization using the same algorithm in Section 5, thereby unifying two a priori unrelated problems: optimization and interpolation.

3. We propose a moment/sum-of-squares hierarchy for complex polynomial optimization in which hyponormality is enforced via convex constraints and illustrate it on an example (Example 6.4 in Section 6).

4. We propose a solution to the truncated moment problem on a semi-algebraic set when the data forms a Toeplitz matrix (Theorem 2 in Section 4). We find that Toeplitz and Hankel matrices play analogous roles with respect to the moment problem.

5. We analyse the different properties of shift operators associated with truncated data (e.g. unitary, symmetric) and relate them to applications in optimization and interpolation (Examples 6.1 through 6.7 in Section 6).
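The Autonne-Takagi factorization used in the second contribution writes a complex symmetric matrix $A$ as $A=Q\Sigma Q^{T}$, where $Q$ has orthonormal columns and $\Sigma$ is diagonal with nonnegative entries. The Python sketch below obtains it from an ordinary singular value decomposition. This route is an assumption chosen for brevity, it is only valid when the nonzero singular values are distinct, and it is not claimed to be how the paper computes the factorization. The name `takagi` and the test data are hypothetical.

```python
import numpy as np

def takagi(A, tol=1e-10):
    """Return Q, s with A ~ Q @ np.diag(s) @ Q.T for complex symmetric A.

    Assumes the nonzero singular values of A are distinct."""
    U, s, Vh = np.linalg.svd(A)
    r = int(np.sum(s > tol * s[0]))          # numerical rank
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # For symmetric A, V* equals a diagonal phase matrix times U^T;
    # the phases are read off the diagonal of V* conj(U).
    phases = np.diag(Vh @ U.conj())
    Q = U * np.sqrt(phases)                  # absorb half of each phase
    return Q, s

# Hypothetical data: 3 x 3 complex Hankel matrix of rank 2
nodes = np.array([0.9 * np.exp(0.3j), 1.1 * np.exp(-0.7j)])
w = np.array([1.0 + 0.5j, 2.0 - 1.0j])
y = [np.sum(w * nodes ** a) for a in range(5)]
H = np.array([[y[a + b] for b in range(3)] for a in range(3)])

Q, s = takagi(H)
print(np.linalg.norm(Q @ np.diag(s) @ Q.T - H))
```

Note the plain transpose `Q.T` rather than the conjugate transpose: this is what distinguishes the factorization from the usual singular value decomposition.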

3 Joint hyponormality

We recall some fundamental notions in operator theory in order to discuss joint hyponormality. Let $\mathcal{B}(\mathcal{H})$ denote the set of bounded linear operators acting on a complex Hilbert space $(\mathcal{H},\langle\cdot,\cdot\rangle)$ . We adopt the convention that the inner product is conjugate-linear in its first variable and linear in its second. An element $T$ of $\mathcal{B}(\mathcal{H})$ is said to be positive, denoted $T\succcurlyeq 0$ , if $\langle u,Tu\rangle\geqslant 0$ for all $u\in\mathcal{H}$ . In addition, the commutator of $A,B\in\mathcal{B}(\mathcal{H})$ is defined as $[A,B]:=AB-BA$ . Finally, let $A^{*}$ denote the adjoint of $A\in\mathcal{B}(\mathcal{H})$ . Given this notation, an operator $T\in\mathcal{B}(\mathcal{H})$ is said to be $\ldots$

• normal if $[T^{*},T]=T^{*}T-TT^{*}=0$ ;

• subnormal if it can be extended to a normal operator $N$ on a larger Hilbert space $\mathcal{K}$ ;

• hyponormal if $[T^{*},T]=T^{*}T-TT^{*}\succcurlyeq 0$ .

The following implications hold

$\text{normal}~\Longrightarrow~\text{subnormal}~\Longrightarrow~\text{hyponormal}$

The first implication is obvious. Concerning the second, let $P$ denote the orthogonal projection of $\mathcal{K}$ on $\mathcal{H}$ . Then, for all $u\in\mathcal{H}$ , we have $Tu=NPu$ and thus, $T^{*}=(NP)^{*}=P^{*}N^{*}=PN^{*}$ as projections are self-adjoint. Following [20], we have

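$\begin{array}{rcl}\langle u,[T^{*},T]u\rangle&=&\|Tu\|^{2}-\|T^{*}u\|^{2}\\ &=&\|Nu\|^{2}-\|PN^{*}u\|^{2}\\ &=&\|N^{*}u\|^{2}-\|P(N^{*}u)\|^{2}\geqslant 0\end{array}$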

where $\|\cdot\|$ stands for the norm induced by the inner product. The notions of subnormality and hyponormality were introduced by Halmos [35] in 1950 when studying the unilateral shift operator, that is

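$\begin{array}{rccc}T:&l^{2}(\mathbb{N})&\longrightarrow&l^{2}(\mathbb{N})\\ &(u_{0},u_{1},u_{2},\ldots)&\longmapsto&(0,u_{0},u_{1},\ldots)\end{array}$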

where $l^{2}(\mathbb{N})$ is the Hilbert space of square-summable sequences of complex numbers indexed by the natural numbers $\mathbb{N}$ . Its adjoint is

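$\begin{array}{rccc}T^{*}:&l^{2}(\mathbb{N})&\longrightarrow&l^{2}(\mathbb{N})\\ &(u_{0},u_{1},u_{2},\ldots)&\longmapsto&(u_{1},u_{2},u_{3},\ldots)\end{array}$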

It is not normal since for all $u\in l^{2}(\mathbb{N})$ , we have $(T^{*}T-TT^{*})u=(u_{0},0,\ldots,0,\ldots)$ . However, it admits the following normal extension

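$\begin{array}{rccc}N:&l^{2}(\mathbb{Z})&\longrightarrow&l^{2}(\mathbb{Z})\\ &(\ldots,u_{-2},u_{-1},u_{0},u_{1},u_{2},\ldots)&\longmapsto&(\ldots,u_{-3},u_{-2},u_{-1},u_{0},u_{1},\ldots)\end{array}$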

where $l^{2}(\mathbb{Z})$ is the Hilbert space of square-summable sequences indexed by the integers $\mathbb{Z}$ . By definition, $T$ is subnormal, and, as a consequence, it is hyponormal. We can check that for all $u\in l^{2}(\mathbb{N})$ , we have $\langle u,[T^{*},T]u\rangle=|u_{0}|^{2}\geqslant 0$ . Hyponormality has been subject to various developments since it was first introduced, see for example [14, 22, 27, 28, 29, 56, 78, 69, 15].

In this paper, we will define a shift operator for each variable (optimization variable or variable of interpolation function). We will thus rely on an extension of hyponormality to multiple operators. Following the definition of Athavale [6], operators $T_{1},\ldots,T_{n}\in\mathcal{B}(\mathcal{H})$ are jointly hyponormal if

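$\begin{pmatrix}[T_{1}^{*},T_{1}]&[T_{2}^{*},T_{1}]&\cdots&[T_{n}^{*},T_{1}]\\ [T_{1}^{*},T_{2}]&[T_{2}^{*},T_{2}]&\cdots&[T_{n}^{*},T_{2}]\\ \vdots&\vdots&&\vdots\\ [T_{1}^{*},T_{n}]&[T_{2}^{*},T_{n}]&\cdots&[T_{n}^{*},T_{n}]\end{pmatrix}\succcurlyeq 0$ (22)

in the sense that for all $u_{1},\ldots,u_{n}\in\mathcal{H}$ , there holds

$\sum\limits_{i,j=1}^{n}\langle u_{i},[T_{j}^{*},T_{i}]u_{j}\rangle\geqslant 0.$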

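By the Schur complement, if $A,B,C\in\mathcal{B}(\mathcal{H})$ and $A$ is invertible, then

$\begin{pmatrix}A&B^{*}\\ B&C\end{pmatrix}=\begin{pmatrix}I&0\\ BA^{-1}&I\end{pmatrix}\begin{pmatrix}A&0\\ 0&C-BA^{-1}B^{*}\end{pmatrix}\begin{pmatrix}I&A^{-1}B^{*}\\ 0&I\end{pmatrix}.$

Joint hyponormality is therefore equivalent to

$\begin{pmatrix}I&T_{1}^{*}&T_{2}^{*}&\cdots&T_{n}^{*}\\ T_{1}&T_{1}^{*}T_{1}&T_{2}^{*}T_{1}&\cdots&T_{n}^{*}T_{1}\\ T_{2}&T_{1}^{*}T_{2}&T_{2}^{*}T_{2}&\cdots&T_{n}^{*}T_{2}\\ \vdots&\vdots&\vdots&&\vdots\\ T_{n}&T_{1}^{*}T_{n}&T_{2}^{*}T_{n}&\cdots&T_{n}^{*}T_{n}\end{pmatrix}\succcurlyeq 0$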

but it is stronger than requiring $t_{1}T_{1}+\ldots+t_{n}T_{n}$ to be hyponormal for all $t_{1},\ldots,t_{n}\in\mathbb{C}$ , that is

$\sum\limits_{i,j}\overline{t_{i}}\,t_{j}\,[T_{i}^{*},T_{j}]\succcurlyeq 0.$

The key ingredient to making the notion of joint hyponormality relevant for practical purposes is that in finite dimensions, the trace of $[T_{i}^{*},T_{i}]$ is equal to zero, and thus

$[T_{i}^{*},T_{i}]\succcurlyeq 0~\Longleftrightarrow~[T_{i}^{*},T_{i}]=0.$
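The trace argument can be observed on a finite section of the unilateral shift. The Python sketch below is an illustration only, and the truncation size `n = 5` is an arbitrary choice. In infinite dimension the commutator of the shift is positive, whereas the truncated commutator has zero trace and therefore picks up a negative eigenvalue.

```python
import numpy as np

n = 5
# Finite section of the unilateral shift: ones on the first subdiagonal,
# so that T e_j = e_{j+1} for j < n - 1 and T e_{n-1} = 0.
T = np.diag(np.ones(n - 1), k=-1)

# Commutator [T*, T] = T* T - T T*
C = T.conj().T @ T - T @ T.conj().T

print(np.diag(C))               # diagonal entries (1, 0, 0, 0, -1)
print(np.trace(C))              # the trace vanishes
print(np.linalg.eigvalsh(C))    # one negative eigenvalue: not hyponormal
```

A matrix with zero trace is positive semidefinite only if it is zero, which is why hyponormality collapses to normality in finite dimension.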

This brings about the following equivalences in finite dimension, which will be used throughout the paper:

1. $T_{1},\ldots,T_{n}$ are jointly hyponormal.

2. The inequality in (22) is an equality.

3. $T_{1},\ldots,T_{n},T_{1}^{*},\ldots,T_{n}^{*}$ commute pairwise.

4. For all pairs $(T_{i},T_{j})$ with $i<j$ , it holds that

$\begin{pmatrix}I&T_{i}^{*}&T_{j}^{*}\\ T_{i}&T_{i}^{*}T_{i}&T_{j}^{*}T_{i}\\ T_{j}&T_{i}^{*}T_{j}&T_{j}^{*}T_{j}\end{pmatrix}\succcurlyeq 0.$

5. There exists a unitary matrix $U$ (i.e. $U^{*}U=UU^{*}=I$ where $I$ denotes the identity) and diagonal matrices $D_{1},\ldots,D_{n}$ such that

$\left\{\begin{array}{c}T_{1}=UD_{1}U^{*}\\ T_{2}=UD_{2}U^{*}\\ \vdots\\ T_{n}=UD_{n}U^{*}\end{array}\right.$
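The link between pairwise commutation and simultaneous unitary diagonalization can be illustrated numerically. The Python sketch below builds matrices of the form $T_{k}=UD_{k}U^{*}$ from random data, checks that they and their adjoints commute, and then recovers a common eigenbasis from a random linear combination. The random-combination trick is an assumption made for the example (it requires the combination to have distinct eigenvalues) and is not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
r, n = 4, 2

# Random unitary U (QR of a complex Gaussian matrix) and diagonal D_k
U, _ = np.linalg.qr(rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r)))
D = [np.diag(rng.normal(size=r) + 1j * rng.normal(size=r)) for _ in range(n)]
T = [U @ Dk @ U.conj().T for Dk in D]

def comm(A, B):
    return A @ B - B @ A

# T_1, ..., T_n and their adjoints commute pairwise
ops = T + [Tk.conj().T for Tk in T]
max_comm = max(np.linalg.norm(comm(A, B)) for A in ops for B in ops)

# Recover a common eigenbasis from a random combination of the T_k
coeffs = rng.normal(size=n)
S = sum(c * Tk for c, Tk in zip(coeffs, T))
_, P = np.linalg.eig(S)

# In that basis, every T_k should be (numerically) diagonal
recovered = [P.conj().T @ Tk @ P for Tk in T]
off_diag = max(np.linalg.norm(R - np.diag(np.diag(R))) for R in recovered)

print(max_comm, off_diag)
```

Both printed quantities should be at the level of rounding error.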

These equivalences bring a new perspective on the moment problem to the best of our knowledge. They were implicitly used in our preprint [43]. They allow one to deal with a kind of truncated moment problem that arises in practice, but which has not received much attention from a theoretical perspective.

4 Truncated moment problem

The moment problem is an old yet active subject of research [37, 3, 48, 18, 17, 20, 66, 72, 16, 19, 45, 44, 74, 73, 23, 54, 38, 10, 59, 52, 2]. We now focus on a kind of moment problem that arises in practice. Consider an integer $d\in\mathbb{N}$ and an infinite sequence of complex numbers $(y_{\alpha,\beta})_{\alpha,\beta\in\mathbb{N}^{n}}$ . In the context of polynomial optimization, we are interested in knowing whether there exists a positive Borel measure $\mu$ supported on the semi-algebraic set $K$ such that

[TABLE]

Should it exist, we are also interested in computing the measure from its moments $(y_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$ . In the context of exponential interpolation, a complex-valued measure supported on $\mathbb{C}^{n}$ is presumed to exist such that

[TABLE]

and we are solely interested in computing the measure from its moments.

In previous work [18, 16, 19], the authors raise the question of whether there exists a positive Borel measure $\mu$ supported on the semi-algebraic set $K$ such that

[TABLE]

This corresponds to a degree truncation, as opposed to a square truncation as above. The discrepancy is illustrated in Figure 1 below. It calls for different notions of moment matrices, as shown in Figure 2. The moment matrix resulting from square truncation is referred to as pruned complex moment matrix in [51], but the moment problem with square truncation is not considered. The moment problem with degree truncation is equivalent in some sense (see [19, Theorem 5.2]) to the even-dimensional real moment problem, i.e. where we seek a measure on a real semi-algebraic set such that

[TABLE]

given a real sequence $(y_{\alpha})_{\alpha\in\mathbb{N}^{2n}}$ . In contrast, the square truncation we consider captures the real truncated moment problem as a special case (both even- and odd-dimensional). It corresponds to the case where the moment data forms a Hankel matrix (see [43, Corollary 3.9] and Theorem 2 below).

In Theorem 1 and Theorem 2 below, solutions to the moment problem with square truncation are given using the notion of hyponormality presented in Section 3. The proofs are constructive and the algorithm proposed in Section 5 replicates each step of the proofs. To introduce the results, we need some notation. In the multivariate case, we say that truncated data is Hermitian if $\overline{y_{\beta,\alpha}}=y_{\alpha,\beta}$ for all $|\alpha|,|\beta|\leqslant d$ . Given an integer $r\in\mathbb{N}$ , an $r$ -atomic measure is a sum of $r$ Dirac measures at $r$ distinct points (called atoms) with nonzero weights. It is said to be positive if all the weights are positive, and supported on a set if the atoms lie in the set. In the setting of interpolation, which we treat after Theorem 1 and Theorem 2, the weights are complex numbers.

Theorem 1 ([43, Theorem 3.8]).

Consider some complex numbers $(y_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$ with $d\geqslant d_{K}$ forming a Hermitian matrix. Assume that $K$ contains the constraint $\sum_{k=1}^{n}|z_{k}|^{2}\leqslant R^{2}$ for some radius $R\geqslant 0$ . Then there exists a positive $\text{rank}M_{d-d_{K}}(y)$ -atomic measure $\mu$ supported on $K$ such that

[TABLE]

if and only if

1. $M_{d}(y)\succcurlyeq 0$ and $M_{d-k_{i}}(g_{i}y)\succcurlyeq 0,~i=1,\ldots,m$ ;

2. $\text{rank}M_{d}(y)=\text{rank}M_{d-d_{K}}(y)$ ;

3. $\begin{pmatrix}M_{d-d_{K}}(y)&M_{d-d_{K}}(\bar{z}_{i}y)&M_{d-d_{K}}(\bar{z}_{j}y)\\ M_{d-d_{K}}(z_{i}y)&M_{d-d_{K}}(|z_{i}|^{2}y)&M_{d-d_{K}}(\bar{z}_{j}z_{i}y)\\ M_{d-d_{K}}(z_{j}y)&M_{d-d_{K}}(\bar{z}_{i}z_{j}y)&M_{d-d_{K}}(|z_{j}|^{2}y)\end{pmatrix}\succcurlyeq 0,~\forall 1\leqslant i<j\leqslant n$ .

Moreover, $\mu$ is the unique $\text{rank}M_{d-d_{K}}(y)$ -atomic measure satisfying (33), and for each $1\leqslant i\leqslant m$ , it has exactly $\text{rank}M_{d}(y)-\text{rank}M_{d-d_{K}}(g_{i}y)$ atoms that are zeros of $g_{i}$ . In the univariate case $n=1$ , Condition 3 must be replaced by

[TABLE]

Proof.

We provide a sketch of the proof. We focus on the “if” part, as it is important in the sequel. The positive semidefinite moment matrix of rank $r:=\text{rank}M_{d}(y)$ can be factorized as $M_{d}(y)=X^{*}X$ where $(\cdot)^{*}$ stands for adjoint, i.e. conjugate transpose. This can be achieved with the Cholesky factorization [13]. The rows of $X=:(x_{i,\alpha})$ are indexed by $1\leqslant i\leqslant r$ and the columns of $X$ are indexed by $|\alpha|\leqslant d$ . Thanks to Conditions 1 and 2, there exist complex matrices $T_{1},\ldots,T_{n}$ of order $r$ , called shift operators, such that for each $1\leqslant k\leqslant n$ , we have

[TABLE]

Here, $x_{\alpha}$ denotes the $\alpha$ -column of $X$ and $e_{k}$ is the row vector of size $n$ that contains only zeros apart from 1 in position $k$ . We now explain why the shift operators exist. Consider the finite-dimensional Hilbert space $\mathcal{H}:=\text{span}(x_{\alpha})_{|\alpha|\leqslant d}$ . Condition 2 implies that $\mathcal{H}=\text{span}(x_{\alpha})_{|\alpha|\leqslant d-1}$ . Given some complex numbers $(u_{\alpha})_{|\alpha|\leqslant d-1}$ , Condition 1 implies that

[TABLE]

where $\|x\|:=\sqrt{x^{*}x}$ denotes the 2-norm of a vector $x\in\mathbb{C}^{r}$ . As a result, given two possibly different sets of coefficients $(u_{\alpha})_{|\alpha|\leqslant d-1}$ and $(v_{\alpha})_{|\alpha|\leqslant d-1}$ , if $\sum_{|\alpha|\leqslant d-1}u_{\alpha}x_{\alpha}=\sum_{|\alpha|\leqslant d-1}v_{\alpha}x_{\alpha}$ , then $\sum_{|\alpha|\leqslant d-1}u_{\alpha}x_{\alpha+e_{k}}=\sum_{|\alpha|\leqslant d-1}v_{\alpha}x_{\alpha+e_{k}}$ . In other words, each element of $\mathcal{H}$ has a unique image by $T_{k}$ , which makes the shift well-defined. In fact, it is bounded by the radius $R$ .

Condition 3 implies that for all $1\leqslant i<j\leqslant n$ , we have

[TABLE]

As discussed in Section 3, this makes the shifts jointly hyponormal. Thus there exists a unitary matrix $P$ such that $T_{k}=PD_{k}P^{*}$ where $D_{k}=:\text{diag}(d_{k1},\ldots,d_{kr})$ is a diagonal matrix for each $1\leqslant k\leqslant n$ . Bearing in mind that $M_{d}(y)=X^{*}X$ , we have for all $|\alpha|,|\beta|\leqslant d$ that

[TABLE]

where $P=:(p_{1}\ldots p_{r})$ denotes the columns of $P$ and $d_{j}:=(d_{1j},\ldots,d_{nj})$ . As a result, the eigenvalues of the shift operators correspond to the support of a measure, and their eigenvectors yield the weights of a measure. Precisely, the following measure

[TABLE]

solves the truncated problem up to order $d$ , i.e. (33). The uniqueness of the measure can easily be deduced from Lemma 3 and Lemma 4 in the Appendix. ∎
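The steps of the proof can be sketched in Python on synthetic data. The sketch makes several assumptions that should be read as such: the shift relation is taken to be $T_{k}x_{\alpha}=x_{\alpha+e_{k}}$ for $|\alpha|\leqslant d-1$, the weight of atom $j$ is taken to be $|p_{j}^{*}x_{0}|^{2}$ (which follows from $y_{\alpha,\beta}=x_{\alpha}^{*}x_{\beta}$ and $T_{k}=PD_{k}P^{*}$), an eigendecomposition replaces the Cholesky factorization because the moment matrix is singular, and the common eigenbasis is obtained from a random combination of the shifts. The atoms and weights are hypothetical test values, and constraints $g_{i}$ are ignored.

```python
import itertools
import numpy as np

n, d = 2, 2
# Multi-indices alpha in N^n with |alpha| <= d, ordered by degree
idx = sorted((a for a in itertools.product(range(d + 1), repeat=n) if sum(a) <= d),
             key=lambda a: (sum(a), a))
pos = {a: i for i, a in enumerate(idx)}

# Hypothetical test data: a positive 2-atomic measure on C^2
atoms = np.array([[0.5 + 0.2j, -0.3 + 0.4j],
                  [-0.1 - 0.6j, 0.7 + 0.1j]])
weights = np.array([0.4, 0.6])

# V[j, alpha] = atom_j ** alpha, so that M_d(y) = V* diag(weights) V
V = np.array([[np.prod(z ** np.array(a)) for a in idx] for z in atoms])
M = V.conj().T @ (weights[:, None] * V)

# Factor M = X* X with X of size r x (number of monomials)
lam, E = np.linalg.eigh(M)
keep = lam > 1e-10 * lam.max()
X = np.sqrt(lam[keep])[:, None] * E[:, keep].conj().T

# Shift operators: T_k x_alpha = x_{alpha + e_k} for |alpha| <= d - 1
low = [a for a in idx if sum(a) <= d - 1]
X_low = X[:, [pos[a] for a in low]]
T = []
for k in range(n):
    shifted = [tuple(a[i] + (i == k) for i in range(n)) for a in low]
    T.append(X[:, [pos[a] for a in shifted]] @ np.linalg.pinv(X_low))

# Common eigenbasis of the (commuting, normal) shifts
rng = np.random.default_rng(0)
_, P = np.linalg.eig(sum(c * Tk for c, Tk in zip(rng.normal(size=n), T)))

# Eigenvalues give the atoms, eigenvectors give the weights
rec_atoms = np.array([[P[:, j].conj() @ Tk @ P[:, j] for Tk in T]
                      for j in range(P.shape[1])])
rec_weights = np.abs(P.conj().T @ X[:, pos[(0,) * n]]) ** 2

print(rec_atoms)
print(rec_weights)
```

The recovered atoms and weights match the test data up to the ordering of the atoms.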

We will say that the truncated moment data $(y_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$ is hyponormal if it satisfies Condition 3 of Theorem 1. Indeed, this condition corresponds to the joint hyponormality of the shift operators. We propose to enforce the joint hyponormality of the shift operators in the complex hierarchy by requiring the truncated data to be hyponormal. This yields the following primal problem

[TABLE]

and its dual counterpart

[TABLE]

where a polynomial $p$ belongs to $\mathbb{R}_{d}[z,\bar{z}]$ if it is of the form $\sum_{|\alpha|,|\beta|\leqslant d}p_{\alpha,\beta}\bar{z}^{\alpha}z^{\beta}$ where $p_{\alpha,\beta}\in\mathbb{C}$ . Recall that for complex polynomials, the coefficients $p_{\alpha,\beta}$ are unique (see Section 3.2, footnote 4 in [43] for a comparison with real polynomials). Thus the complex polynomial $p$ can be identified with the Hermitian matrix $(p_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$ . This is what we do in the semidefinite constraints in the above dual problem. In Example 6.4, we apply this hierarchy of relaxations to a bivariate optimization problem. Compared to the complex moment/sum-of-squares hierarchy, it solves this particular problem at a lower order.

There exist several identifiable cases where the shift operators are naturally jointly hyponormal. It was noticed in [43, Corollary 3.9] that this is the case when the truncated data forms a Hermitian Hankel matrix (thus real-valued). This corresponds exactly to the truncated data generated by the Lasserre hierarchy for real polynomial optimization. In that case, the shift operators are real symmetric. Below, we show that hyponormality is also guaranteed if we assume that the truncated data forms a Toeplitz matrix. To the best of our knowledge, this result has not been presented in the literature. In the multivariate case, we say that the truncated data is Hankel if $y_{\alpha,\beta}=y_{\gamma,\delta}$ for all $|\alpha|,|\beta|,|\gamma|,|\delta|\leqslant d$ such that $\alpha+\beta=\gamma+\delta$ , and we say that the truncated data is Toeplitz if $y_{\alpha,\beta}=y_{\gamma,\delta}$ for all $|\alpha|,|\beta|,|\gamma|,|\delta|\leqslant d$ such that $\alpha-\beta=\gamma-\delta$ . In other words, $y_{\alpha,\beta}$ only depends on $\alpha+\beta$ in a Hankel matrix, and it only depends on $\alpha-\beta$ in a Toeplitz matrix.
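As a small illustration of the two structures (a sketch that is not taken from the paper), the following Python snippet builds univariate truncated data $y_{\alpha,\beta}=\sum_{j}w_{j}\bar{z}_{j}^{\alpha}z_{j}^{\beta}$ from a hypothetical atomic measure and tests both properties. The helper names `moment_matrix`, `is_hankel`, and `is_toeplitz`, as well as the two test measures, are assumptions made for the example. Atoms on the unit circle produce Toeplitz data, whereas real atoms produce Hankel data.

```python
import numpy as np

def moment_matrix(atoms, weights, d):
    """Univariate data y[a, b] = sum_j w_j * conj(z_j)**a * z_j**b for a, b <= d."""
    z = np.asarray(atoms, dtype=complex)
    w = np.asarray(weights, dtype=float)
    V = z[:, None] ** np.arange(d + 1)[None, :]   # V[j, a] = z_j**a
    return V.conj().T @ (w[:, None] * V)

def is_hankel(M, tol=1e-10):
    """True if M[a, b] depends only on a + b (constant anti-diagonals)."""
    N = M.shape[0]
    return all(abs(M[a, b] - M[a + 1, b - 1]) < tol
               for a in range(N - 1) for b in range(1, N))

def is_toeplitz(M, tol=1e-10):
    """True if M[a, b] depends only on a - b (constant diagonals)."""
    N = M.shape[0]
    return all(abs(M[a, b] - M[a + 1, b + 1]) < tol
               for a in range(N - 1) for b in range(N - 1))

# Hypothetical measures: two atoms on the unit circle, and two real atoms.
circle = moment_matrix(np.exp(1j * np.array([0.3, 2.0])), [0.4, 0.6], d=2)
real = moment_matrix([0.5, -1.2], [0.4, 0.6], d=2)

print(is_toeplitz(circle), is_hankel(circle))
print(is_toeplitz(real), is_hankel(real))
```

Both matrices are Hermitian by construction, so the two cases correspond to the Hermitian Toeplitz and Hermitian Hankel settings considered next.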

Theorem 2.

Consider some complex numbers $(y_{\alpha,\beta})_{|\alpha|,|\beta|\leqslant d}$ with $d\geqslant d_{K}$ forming either a Hermitian Toeplitz matrix or a Hermitian Hankel matrix. There exists a positive $\text{rank}M_{d-d_{K}}(y)$ -atomic measure $\mu$ supported on $K$ such that

[TABLE]

if and only if

1. $M_{d}(y)\succcurlyeq 0$ and $M_{d-d_{K}}(g_{i}y)\succcurlyeq 0,~i=1,\ldots,m$ ;

2. $\text{rank}M_{d}(y)=\text{rank}M_{d-d_{K}}(y)$ .

Moreover, $\mu$ is the unique $\text{rank}M_{d-d_{K}}(y)$ -atomic measure satisfying (42), and for each $1\leqslant i\leqslant m$ , the measure $\mu$ has exactly $\text{rank}M_{d}(y)-\text{rank}M_{d-d_{K}}(g_{i}y)$ atoms that are zeros of $g_{i}$ .

Proof.

( $\Longrightarrow$ ) Same proof as in [43, Theorem 3.8]. ( $\Longleftarrow$ ) The proof is similar to that of Theorem 1. We focus on the areas where it differs and only consider the case of Toeplitz matrices. The Toeplitz property implies that the shift operators in the proof of Theorem 1 are well-defined and unitary. Indeed, for all complex numbers $(u_{\alpha})_{|\alpha|\leqslant d-1}$ , it holds that

[TABLE]

Using the same argument as in the proof of Theorem 1, the shift $T_{k}$ is well-defined. In addition, it is isometric and thus satisfies $T_{k}^{*}T_{k}=T_{k}T_{k}^{*}=I$ . Now, observe that $(T_{1},\ldots,T_{n})$ is a pair-wise commuting tuple of operators on the Hilbert space $\mathcal{H}$ . As a consequence, $(T_{1},\ldots,T_{n},T_{1}^{*},\ldots,T_{n}^{*})=(T_{1},\ldots,T_{n},T_{1}^{-1},\ldots,T_{n}^{-1})$ is also a pair-wise commuting tuple of operators. Indeed, if two invertible square matrices $A$ and $B$ commute, so do $A^{-1}$ and $B^{-1}$ (since $A^{-1}B^{-1}ABB^{-1}A^{-1}=A^{-1}B^{-1}BAB^{-1}A^{-1}$ ), and so do $A$ and $B^{-1}$ (since $B^{-1}ABB^{-1}=B^{-1}BAB^{-1}$ ). It follows that $T_{1},\ldots,T_{n}$ are jointly hyponormal. The rest of the proof is identical to that of Theorem 1. ∎
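For completeness, the two commutation identities invoked above can be written out in full. The display below only expands the parenthetical argument for two commuting invertible matrices $A$ and $B$.

```latex
\begin{align*}
AB = BA
&\;\Longrightarrow\; A^{-1}B^{-1} = (BA)^{-1} = (AB)^{-1} = B^{-1}A^{-1},\\
AB = BA
&\;\Longrightarrow\; AB^{-1} = B^{-1}(BA)B^{-1} = B^{-1}(AB)B^{-1} = B^{-1}A.
\end{align*}
```

Applied to $A=T_{i}$ and $B=T_{j}$ with $T_{j}^{-1}=T_{j}^{*}$ , the second line shows that each $T_{i}$ commutes with each $T_{j}^{*}$ .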

In the univariate case $n=1$ with support equal to the full space $K=\mathbb{C}$ , the truncated moment problem in Theorem 2 with Toeplitz data is actually the truncated trigonometric moment problem. A solution to this problem has been given by [42, P. 211], [4, Theorem I.I.12], and [21, Theorem 6.12]. It can be stated as follows. A Toeplitz matrix with positive upper left element is positive semidefinite if and only if it is represented by a positive Borel measure. In other words, the rank need not be preserved (Condition 2 of Theorem 2) for there to exist a measure. The trigonometric moment problem has been considered more recently in [30, 1, 81].

In the setting of interpolation, the shift operators are simpler to study since we assume the existence of a representing measure (i.e. $\nu=\sum_{k=1}^{d}w_{k}\delta_{\exp(f_{k})}$ ). The measure is uniquely determined by its moments $y_{\alpha}$ up to degree $2d$ , and thus the interpolation problem has a unique solution with $d$ exponentials. In addition, the rank is preserved, i.e. $\text{rank}\mathcal{H}_{d}(y)=\text{rank}\mathcal{H}_{d-1}(y)$ . These claims can easily be deduced from Lemma 3 and Lemma 4 in the Appendix. We next prove that the shift operators associated with the Hankel moment matrix are guaranteed to exist, that they are simultaneously diagonalizable, and that they are complex symmetric.

The moment matrix $\mathcal{H}_{d}(y)$ is of Hankel type and is thus complex symmetric. According to the Autonne-Takagi factorization [7] [77, Theorem II], which applies to any square complex symmetric matrix, there exist a unitary matrix $U$ (i.e. $U^{*}U=UU^{*}=I$ ) and a diagonal matrix $D$ with real nonnegative entries such that $\mathcal{H}_{d}(y)=UDU^{T}$ . Note that the diagonal values of $D$ are the square roots of the eigenvalues of $\mathcal{H}_{d}(y)\mathcal{H}_{d}(y)^{*}$ , whose rank is equal to that of $\mathcal{H}_{d}(y)$ . Defining $X:=\sqrt{D}~U^{T}$ , we may in fact write that $\mathcal{H}_{d}(y)=X^{T}X$ where the rows of $X=:(x_{i,\alpha})$ are indexed by $1\leqslant i\leqslant d$ and the columns of $X$ are indexed by $|\alpha|\leqslant d$ . In addition, since our data is represented by the measure $\nu$ , we know that $\mathcal{H}_{d}(y)=V^{T}V$ where the rows of $V=:(s_{k}\exp(f_{k})^{\alpha})$ are indexed by $1\leqslant k\leqslant d$ and the columns of $V$ are indexed by $|\alpha|\leqslant d$ . Here, $s_{k}$ is a complex number such that $s_{k}^{2}=w_{k}$ where $w_{k}$ is a complex weight of the measure $\nu$ . As a result, $X^{T}X=V^{T}V$ and thus $X$ and $V$ have the same ranges. Hence, there exists an invertible matrix $P$ such that $X=PV$ . Thus $V^{T}P^{T}PV=V^{T}V$ , or, in other words, $V^{T}(P^{T}P-I)V=0$ . According to Lemma 3 and Lemma 4 in the Appendix, the range of $V$ is equal to $\mathbb{C}^{d}$ , so that we in fact have $v^{T}(P^{T}P-I)v=0$ for all $v\in\mathbb{C}^{d}$ . As a result, $P^{T}P=PP^{T}=I$ . Going back to the relationship $X=PV$ , we have in particular that $x_{\alpha}=Pv_{\alpha}$ for all $|\alpha|\leqslant d$ . Defining $D_{k}:=\text{diag}(\exp(f_{1k}),\ldots,\exp(f_{dk}))$ , we have that $D_{k}v_{\alpha}=v_{\alpha+e_{k}}$ , and thus $PD_{k}P^{T}x_{\alpha}=x_{\alpha+e_{k}}$ . The shift operators are thus $T_{k}:=PD_{k}P^{T}$ for $1\leqslant k\leqslant n$ . Remembering that $\mathcal{H}_{d}(y)=X^{T}X$ , we have

[TABLE]

so that the sought measure is entirely determined

[TABLE]

We conclude by proving that the shifts are complex symmetric. Since $\text{rank}\mathcal{H}_{d}(y)=\text{rank}\mathcal{H}_{d-1}(y)$ , we have that $\text{span}(x_{\alpha})_{|\alpha|\leqslant d}=\text{span}(x_{\alpha})_{|\alpha|\leqslant d-1}$ . Consider some complex numbers $(u_{\alpha})_{|\alpha|\leqslant d-1}$ and $(v_{\alpha})_{|\alpha|\leqslant d-1}$ . Let $u:=\sum_{|\alpha|\leqslant d-1}u_{\alpha}x_{\alpha}$ and $v:=\sum_{|\alpha|\leqslant d-1}v_{\alpha}x_{\alpha}$ and compute

[TABLE]

whence $T_{k}=T_{k}^{T}$ .
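The construction above can be tried on a small univariate instance. The sketch below is an illustration under stated assumptions and is not the implementation used in the paper: the function `takagi_factor` builds a factor $X$ with $X^{T}X=\mathcal{H}_{d}(y)$ from a singular value decomposition, which is only valid when the nonzero singular values are distinct, and the frequencies `f` and weights `w` are hypothetical test data.

```python
import numpy as np

def takagi_factor(H, tol=1e-10):
    """Return X (r x N) with X.T @ X == H for a complex symmetric matrix H.

    Assumption: the nonzero singular values of H are distinct.
    """
    U, s, Vh = np.linalg.svd(H)
    r = int(np.sum(s > tol * s[0]))           # numerical rank
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # For symmetric H, conj(v_i) = phi_i * u_i with |phi_i| = 1.
    phi = np.einsum("ij,ij->j", U.conj(), Vh.T)
    phi = phi / np.abs(phi)
    Q = U * np.sqrt(phi)                      # H = Q diag(s) Q^T
    return np.sqrt(s)[:, None] * Q.T          # X = sqrt(D) Q^T

# Hypothetical data: y_a = sum_k w_k * exp(f_k * a) for a = 0, ..., 4.
f = np.array([0.1 + 0.7j, -0.3 + 2.0j])       # frequencies
w = np.array([1.5 + 0.0j, 0.8 - 0.4j])        # complex weights
y = np.array([np.sum(w * np.exp(f * a)) for a in range(5)])
H = np.array([[y[a + b] for b in range(3)] for a in range(3)])  # Hankel H_2(y)

X = takagi_factor(H)                          # 2 x 3 factor, H = X^T X
T = X[:, 1:3] @ np.linalg.inv(X[:, 0:2])      # T x_0 = x_1 and T x_1 = x_2
atoms = np.linalg.eigvals(T)                  # eigenvalues are exp(f_k)
freqs = np.log(atoms)                         # imaginary parts modulo 2*pi
```

On this instance the shift `T` comes out complex symmetric, in line with the argument above, and the logarithms of its eigenvalues return the frequencies because their imaginary parts were chosen in $(-\pi,\pi]$ .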

5 Algorithm

Input:

• number of variables $n$

• truncation order $d$

• moment matrix of rank $r$

Output:

• $r$ -atomic measure

Below, the notation $\bullet$ stands either for the conjugate transpose or for the transpose, depending on whether the algorithm is being applied to polynomial optimization ( $\bullet=*$ ) or interpolation ( $\bullet=T$ ). In the case of real polynomial optimization, $\bullet=*=T$ .

1. Factorize the moment matrix into $X^{\bullet}X$ . The rows of $X=(x_{i,\alpha})$ are indexed by $1\leqslant i\leqslant r$ and the columns of $X$ are indexed by $|\alpha|\leqslant d$ .

2. Find a subset of the columns of $X$ that generate the column space of $X$ . Let $\alpha(1),\ldots,\alpha(r)\in\mathbb{N}^{n}$ denote their indexes.

3. Compute the shift operators $T_{1},\ldots,T_{n}\in\mathbb{C}^{r\times r}$ by applying them only to the column basis, i.e. $T_{k}x_{\alpha(i)}=x_{\alpha(i)+e_{k}}$ for $1\leqslant i\leqslant r$ .

4. Generate some random $t_{1},\ldots,t_{n}\in\mathbb{R}$ and diagonalize the matrix $\sum_{i=1}^{n}t_{i}T_{i}=PDP^{\bullet}$ with $P^{\bullet}P=PP^{\bullet}=I$ . Let $P=:(p_{1}\ldots p_{r})$ denote the columns of $P$ .

5. Compute the measure $\mu=\sum\limits_{k=1}^{r}x_{0}^{\bullet}p_{k}p_{k}^{\bullet}x_{0}~\delta_{(p_{k}^{\bullet}T_{i}p_{k})_{1\leqslant i\leqslant n}}$ .

In the case of polynomial optimization, the atoms of the measure are global solutions. In the case of interpolation, the arguments of the atoms are the frequencies (modulo $2\pi$ for their imaginary parts) and the weights of the measure are the weights of the complex exponential sum.
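The five steps can be sketched in Python for the polynomial optimization case ( $\bullet=*$ ). This is a minimal illustration under assumptions, not the Matlab code used in the experiments: the function names are hypothetical, the moment matrix is generated from a known measure with positive weights instead of a relaxation, the rank is assumed to be preserved so that every shifted basis index exists, and the random combination is assumed to have distinct eigenvalues so that the unit-norm eigenvectors returned by `numpy.linalg.eig` are orthonormal.

```python
import itertools
import numpy as np

def monomials(n, d):
    """Exponents alpha in N^n with |alpha| <= d, sorted by total degree."""
    exps = [a for a in itertools.product(range(d + 1), repeat=n) if sum(a) <= d]
    return sorted(exps, key=sum)

def moment_matrix(atoms, weights, exps):
    """M[a, b] = sum_j w_j * conj(z_j^a) * z_j^b for a hypothetical measure."""
    V = np.array([[np.prod(z ** np.array(a)) for a in exps] for z in atoms])
    return V.conj().T @ (np.asarray(weights)[:, None] * V)

def extract_measure(M, exps, n, tol=1e-8, seed=0):
    index = {a: j for j, a in enumerate(exps)}
    # Step 1: factorize M = X^* X, keeping only the positive eigenvalues.
    lam, E = np.linalg.eigh(M)
    keep = lam > tol * lam.max()
    X = np.sqrt(lam[keep])[:, None] * E[:, keep].conj().T
    r = X.shape[0]
    # Step 2: greedily select independent columns, lowest degree first.
    basis = []
    for j in range(len(exps)):
        if np.linalg.matrix_rank(X[:, basis + [j]], tol=tol) > len(basis):
            basis.append(j)
    # Step 3: shifts defined on the basis by T_k x_alpha = x_{alpha + e_k}.
    B_inv = np.linalg.inv(X[:, basis])
    shifts = []
    for k in range(n):
        e_k = [int(i == k) for i in range(n)]
        shifted = [index[tuple(a + b for a, b in zip(exps[j], e_k))]
                   for j in basis]
        shifts.append(X[:, shifted] @ B_inv)
    # Step 4: diagonalize a random real combination of the shifts.
    t = np.random.default_rng(seed).standard_normal(n)
    T = sum(tk * Tk for tk, Tk in zip(t, shifts))
    _, P = np.linalg.eig(T)
    # Step 5: atoms (p_k^* T_i p_k)_i and weights |p_k^* x_0|^2.
    x0 = X[:, index[(0,) * n]]
    atoms = np.array([[P[:, k].conj() @ Tk @ P[:, k] for Tk in shifts]
                      for k in range(r)])
    weights = np.array([abs(P[:, k].conj() @ x0) ** 2 for k in range(r)])
    return atoms, weights

# Hypothetical measure with three atoms in C^2 and truncation order d = 2.
true_atoms = np.array([[1 + 1j, 0.5], [-0.5j, 2.0], [0.3, -1 + 0.2j]])
true_weights = np.array([0.5, 0.3, 0.2])
exps = monomials(2, 2)
M = moment_matrix(true_atoms, true_weights, exps)
atoms, weights = extract_measure(M, exps, n=2)
```

The recovered atoms come back in an arbitrary order, so they should be matched to the expected ones by distance rather than by position.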

6 Numerical experiments

We use Matlab 2015b, CVX [24, 33], and SeDuMi [75]. The Cholesky factorization of positive semidefinite matrices is computed via an eigendecomposition followed by a QR factorization. The Autonne-Takagi factorization is computed using the implementation of Guo, Luk, Xu, and Piao [80, 79, 34, 55]. Another algorithm for this factorization is discussed in [11]. Table 1 below summarizes the experiments. Each of them illustrates a different property of the shift operators that arises in applications.

Example 6.1 (Nonexistent shifts).

Consider the following truncated data:

[TABLE]

The existence of shift operators is guaranteed if Conditions 1 and 2 of Theorem 1 hold. Condition 2 is satisfied because $\text{rank}M_{1}(y)=\text{rank}M_{2}(y)=1$ . However, Condition 1 is not satisfied. The spectrum of $M_{2}(y)$ is equal to $\{0,6\}$ but there does not exist $R\geqslant 0$ such that $M_{1}[(R^{2}-|z|^{2})y]\succcurlyeq 0$ . Indeed, the characteristic polynomial of $M_{1}[(R^{2}-|z|^{2})y]$ with indeterminate $X$ is equal to

[TABLE]

The moment matrix can be factorized as $M_{2}(y)=X^{*}X$ where

[TABLE]

There does not exist a shift operator $T$ acting on $\mathbb{C}$ , i.e. a scalar $T\in\mathbb{C}$ , such that $Tx_{0}=x_{1}$ and $Tx_{1}=x_{2}$ . That would imply that $T\times 1=1$ and $T\times 1=2$ , which is absurd.
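The obstruction can also be observed numerically. The sketch below assumes the factor $X=(1~~1~~2)$ , which is what the relations $T\times 1=1$ and $T\times 1=2$ suggest and which is consistent with the spectrum $\{0,6\}$ ; it should be read as an inferred illustration rather than a reproduction of the experiment. A least-squares fit of the shift leaves a nonzero residual, so no exact shift exists.

```python
import numpy as np

X = np.array([[1.0, 1.0, 2.0]])       # assumed factor: columns x_0, x_1, x_2
M = X.conj().T @ X                    # moment matrix M_2(y) = X^* X
spectrum = np.linalg.eigvalsh(M)      # expected to contain only 0 and 6

# A shift must satisfy T [x_0 x_1] = [x_1 x_2]; fit it in the least-squares sense.
A, B = X[:, :2], X[:, 1:]
Tt, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)   # solves A^T T^T = B^T
T = Tt.T
residual = np.linalg.norm(T @ A - B)  # strictly positive: no exact shift
print(T, residual)
```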

Example 6.2 (Non-hyponormal shifts).

Consider the following truncated data which was randomly generated:

[TABLE]

We chose the data so that $M_{2}(y)\succcurlyeq 0$ and $M_{1}(y)$ is invertible. As a result, $M_{1}[(R^{2}-|z_{1}|^{2}-|z_{2}|^{2})y]\succcurlyeq 0$ holds as long as $R>0$ is large enough. Hence Condition 1 of Theorem 1 holds. In addition, we chose the data so that the rank is preserved, i.e. $\text{rank}M_{1}(y)=\text{rank}M_{2}(y)=3$ , and so Condition 2 holds. All the conditions of Theorem 1 are thus satisfied, apart from Condition 3. We now show that Condition 3 fails to hold, and thus no atomic measure may be extracted from the data.

We have the Cholesky decomposition $M_{2}(y)=X^{*}X$ where

[TABLE]

for which the column basis indexed by $\{1,z_{1},z_{2}\}$ is readily identified. We then obtain the shift operators

[TABLE]

and

[TABLE]

For example, the image by $T_{1}$ of the column of $X$ indexed by $z_{2}$ is equal to the column of $X$ indexed by $z_{1}z_{2}$ :

[TABLE]

Using an eigendecomposition, we find that

[TABLE]

and

[TABLE]

where $\text{sp}\{\cdot\}$ stands for spectrum. This shows that Condition 3 of Theorem 1 does not hold, that is to say, $T_{1}$ and $T_{2}$ are not jointly hyponormal.
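Condition 3 is stated on the moment data. At the level of operators, a common definition (the one introduced by Athavale [6]) calls a tuple jointly hyponormal when the block matrix of commutators $([T_{j}^{*},T_{i}])_{i,j}$ is positive semidefinite. The Python sketch below assumes that definition and uses hypothetical matrices, not the data of this example; the function names are illustrative.

```python
import numpy as np

def commutator_matrix(shifts):
    """Block matrix whose (i, j) block is [T_j^*, T_i] = T_j^* T_i - T_i T_j^*."""
    return np.block([[Tj.conj().T @ Ti - Ti @ Tj.conj().T for Tj in shifts]
                     for Ti in shifts])

def is_jointly_hyponormal(shifts, tol=1e-9):
    C = commutator_matrix(shifts)
    C = (C + C.conj().T) / 2           # symmetrize against round-off
    return bool(np.linalg.eigvalsh(C).min() >= -tol)

# Two commuting normal matrices sharing the unitary eigenbasis Q.
Q = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)
T1 = Q @ np.diag([1 + 1j, -0.5]) @ Q.conj().T
T2 = Q @ np.diag([2j, 0.3]) @ Q.conj().T

# A nilpotent matrix, which is not normal.
N = np.array([[0, 1], [0, 0]], dtype=complex)

print(is_jointly_hyponormal([T1, T2]))
print(is_jointly_hyponormal([N]))
```

The first tuple passes because all commutators vanish, while the nilpotent matrix has the commutator $\text{diag}(-1,1)$ , which has a negative eigenvalue.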

Example 6.3 (Complex polynomial optimization over an ellipse).

Minimize

[TABLE]

over $z_{1},z_{2}\in\mathbb{C}$ subject to

[TABLE]

The feasible set of this example is taken from [43, Example 3.2], itself originally from [67]. It is represented in Figure 3.

The first order complex relaxation is not defined because the variables appear to the second power (notice that $d_{K}=2$ ). The second order relaxation yields the value 1.00047 and a spectral decomposition indicates that $\text{rank}M_{0}(y)=1$ and $\text{rank}M_{2}(y)=3$ . Thus Condition 2 of Theorem 1 does not hold. The third order relaxation yields 1.93291 and the following moment matrix:

[TABLE]

Using a spectral decomposition, we find that $\text{rank}M_{1}(y)=\text{rank}M_{3}(y)=2$ . In addition, we find the Cholesky decomposition $M_{3}(y)=X^{*}X$ where

[TABLE]

for which the column basis indexed by $\{1,z_{1}\}$ is readily identified. We then obtain the shift operators

[TABLE]

For example, the image by $T_{1}$ of the column of $X$ indexed by $z_{1}z_{2}$ is equal to the column of $X$ indexed by $z_{1}^{2}z_{2}$ :

[TABLE]

Using an eigendecomposition, we find that

[TABLE]

and

[TABLE]

This implies that $T_{1}$ and $T_{2}$ are jointly hyponormal and all three conditions of Theorem 1 hold. An atomic measure can be extracted from the truncated data. Due to hyponormality, we may diagonalize the shifts simultaneously. Taking a random linear combination, we have

[TABLE]

with

[TABLE]

and

[TABLE]

The first coordinate of the atoms is given by the diagonal of

[TABLE]

The second coordinate of the atoms is given by the diagonal of

[TABLE]

The atoms can now be read by looking at the first diagonal elements, then the second diagonal elements:

[TABLE]

With $x_{1}$ denoting the column of $X$ indexed by $1$ , and with $p_{1}$ and $p_{2}$ denoting the first and second columns of $P$ , the respective weights of the atoms are

[TABLE]

Example 6.4 (Enforcing joint hyponormality in a convex relaxation).

Minimize

[TABLE]

over $z_{1},z_{2}\in\mathbb{C}$ subject to

[TABLE]

The feasible set of this example is the same as in Example 6.3 and Figure 3.

The first order complex relaxation is not defined because the variables appear to the second power (notice that $d_{K}=2$ ). The second order relaxation yields the value 0.155089 and the following moment matrix:

[TABLE]

A spectral decomposition reveals that $\text{rank}M_{1}(y)=\text{rank}M_{2}(y)=3$ . In addition, we have the Cholesky decomposition $M_{2}(y)=X^{*}X$ where

[TABLE]

for which the column basis indexed by $\{1,z_{1},z_{2}\}$ is readily identified. We then obtain the shift operators

[TABLE]

Using an eigendecomposition, we find that

[TABLE]

and

[TABLE]

This shows that Condition 3 of Theorem 1 does not hold and that $T_{1}$ and $T_{2}$ are not hyponormal. Thus no measure can be extracted from the data.

The third order complex relaxation yields the value 0.428175 and the moment matrix satisfies $\text{rank}M_{1}(y)=\text{rank}M_{3}(y)=1$ . Thus a Dirac measure can be extracted, with support $(z_{1},z_{2})=(-0.8165i,1.5275)$ and weight $1.0000$ . Instead of computing the third order relaxation, we enforce hyponormality by adding the following constraint in the second order relaxation:

[TABLE]

We then obtain the value 0.428175 and the following moment matrix:

[TABLE]

which satisfies $\text{rank}M_{0}(y)=\text{rank}M_{2}(y)=1$ . In addition, we have the Cholesky decomposition $M_{2}(y)=X^{*}X$ where

[TABLE]

The shift operators can be read from $X$ and they are equal to

[TABLE]

Using an eigendecomposition, we find that

[TABLE]

and

[TABLE]

Joint hyponormality has been successfully enforced. It has reduced the rank of the moment matrix from rank 3 to rank 1. The eigenvalues of $T_{1}$ and $T_{2}$ are their values, thus we find the Dirac measure with support $(z_{1},z_{2})=(-0.8165i,1.5275)$ and weight $1.0000$ , precisely as in the third order relaxation. Notice that in the third order relaxation, there is one conic constraint of size $10\times 10$ and one of size $6\times 6$ . In the second order relaxation augmented with (71), there is one conic constraint of size $9\times 9$ , one of size $6\times 6$ , and one of size $3\times 3$ .

Example 6.5 (Complex polynomial optimization on the torus).

Minimize

[TABLE]

over $z\in\mathbb{C}$ subject to

[TABLE]

The feasible set is represented in Figure 4.

The first and second order complex relaxations are not defined, while the third relaxation yields the value $0.999999$ and the following moment matrix:

[TABLE]

Notice that the Toeplitz property holds. For instance, the term indexed by $(\bar{z},z^{2})$ is equal to the term indexed by $(\bar{z}^{2},z^{3})$ (their common value is $0.2500+0.4330i$ ). The moment matrix satisfies $\text{rank}M_{0}(y)=1$ and $\text{rank}M_{1}(y)=\text{rank}M_{2}(y)=\text{rank}M_{3}(y)=2$ . While Condition 2 in Theorem 2 does not hold since $d=d_{K}=3$ , the rank is preserved between the current truncation order ( $d=3$ ) and the previous one ( $d-1=2$ ). This suffices to guarantee the extraction of a $\text{rank}M_{d-1}(y)$ -atomic measure, i.e. one with 2 atoms. However, the atoms will not necessarily lie in the feasible set, so we must check for this in the end. We have the Cholesky decomposition $M_{3}(y)=X^{*}X$ where

[TABLE]

for which the column basis indexed by $\{1,z\}$ is readily identified. We then obtain the shift operator

[TABLE]

which is unitary

[TABLE]

We have $T=PDP^{*}$ where

[TABLE]

and

[TABLE]

whence the two atoms are

[TABLE]

Their respective weights are

[TABLE]

It is easy to check that the atoms found are third roots of unity. In the fourth order relaxation, all conditions of Theorem 2 hold, and we find the optimal value $1.0000$ and the same solutions.

Example 6.6 (Real polynomial optimization).

Minimize

[TABLE]

over $x_{1},x_{2},x_{3}\in\mathbb{R}$ subject to

[TABLE]

This example is taken from [50, Ex. 5]. Henrion and Lasserre’s method for extracting global minimizers is illustrated on this example in [39, Section 2.3]. The feasible set is represented in Figure 5.

The first order relaxation yields the value -3.0000 and a spectral decomposition indicates that $\text{rank}M_{0}(y)=1$ and $\text{rank}M_{1}(y)=3$ . Thus, the rank is not preserved. The second order relaxation yields -2.0000 and the following moment matrix:

[TABLE]

in which the rank is preserved since $\text{rank}M_{1}(y)=\text{rank}M_{2}(y)=3$ . Notice that the Hankel property holds. For instance the term indexed by $(x_{1}^{2},x_{2})$ is equal to the term indexed by $(x_{1},x_{1}x_{2})$ (their common value is 4.9625). We have the Cholesky decomposition $M_{2}(y)=X^{T}X$ where

[TABLE]

for which the column basis indexed by $\{1,x_{1},x_{2}\}$ is readily identified. We then obtain the shift operators

[TABLE]

which are symmetric matrices. For example, the image by $T_{1}$ of the column of $X$ indexed by $x_{2}$ is equal to the column of $X$ indexed by $x_{1}x_{2}$ :

[TABLE]

Taking a random linear combination, we have

[TABLE]

where

[TABLE]

The first coordinate of the atoms is given by the diagonal of

[TABLE]

while the second coordinate of the atoms is given by the diagonal of

[TABLE]

The atoms can now be read

[TABLE]

With $x_{1}$ denoting the column of $X$ indexed by $1$ , and with $p_{1},p_{2},p_{3}$ denoting the columns of $P$ , the respective weights of the atoms are

[TABLE]

Example 6.7 (Prony’s method via Autonne-Takagi factorization).

Consider the following complex exponential function where $z_{1},z_{2}\in\mathbb{C}$ :

[TABLE]

Its real part is represented in Figure 6.

We reconstruct this function from its evaluations $f(\alpha_{1},\alpha_{2})$ for $\alpha_{1},\alpha_{2}\in\mathbb{N}$ and $\alpha_{1}+\alpha_{2}\leqslant 2d$ . We start with $d=1$ and increment $d$ until the rank is preserved, i.e. $\text{rank}\mathcal{H}_{d}(y)=\text{rank}\mathcal{H}_{d-1}(y)$ . We have $\text{rank}\mathcal{H}_{0}(y)=1$ and $\text{rank}\mathcal{H}_{1}(y)=\text{rank}\mathcal{H}_{2}(y)=2$ . Thus we stop at the second order. The second order Hankel moment matrix contains the evaluations $f(\alpha_{1},\alpha_{2})$ for $\alpha_{1}+\alpha_{2}\leqslant 4$ . Precisely, $\mathcal{H}_{2}(y)=(f(\alpha+\beta))_{|\alpha|,|\beta|\leqslant 2}$ and we have

[TABLE]

The Autonne-Takagi factorization yields $\mathcal{H}_{2}(y)=X^{T}X$ where

[TABLE]

in which the column basis indexed by $\{1,z_{1}\}$ can be identified. We then obtain the complex symmetric shift operators

[TABLE]

For example, the image by $T_{2}$ of the column of $X$ indexed by $z_{1}$ is equal to the column of $X$ indexed by $z_{1}z_{2}$ :

[TABLE]

Taking a random linear combination, we get

[TABLE]

where

[TABLE]

We confirm that the inverse of $P$ is equal to its transpose since

[TABLE]

The first coordinate of the atoms is given by the diagonal of

[TABLE]

while the second coordinate of the atoms is given by the diagonal of

[TABLE]

The atoms can now be read

[TABLE]

With $x_{1}$ denoting the column of $X$ indexed by $1$ , and with $p_{1}$ and $p_{2}$ denoting the first and second columns of $P$ , the respective weights of the atoms are

[TABLE]

Taking the complex logarithm of the atoms, we obtain

[TABLE]

As for the weights, they can be written

[TABLE]

which leads to the following function of $z_{1}$ and $z_{2}$

[TABLE]

The two terms in the sum can be visualized in Figure 7.

7 Conclusion

An algorithm is proposed for finding global solutions to polynomial optimization problems and for exponential interpolation. It is founded on the notion of hyponormality and is related to the truncated moment problem. Various numerical applications are provided along with graphical illustrations.

Acknowledgements

Many thanks to Paulo Ricardo Arantes Gilz, Didier Henrion, Jean Bernard Lasserre, and Mihai Putinar for fruitful discussions.

Appendix A First Lemma

Lemma 3.

If $z^{(1)},\ldots,z^{(d)}$ are distinct points of $\mathbb{C}^{n}$ , then $v_{d-1}(z^{(1)}),\ldots,v_{d-1}(z^{(d)})$ are linearly independent vectors, where $v_{d}(z):=(z^{\alpha})_{|\alpha|\leqslant d}$ .

Proof.

Consider some complex numbers $c_{1},\ldots,c_{d}$ such that

[TABLE]

Given $1\leqslant l\leqslant d$ , define the Lagrange interpolation polynomial

[TABLE]

where $i(k)\in\{1,\ldots,n\}$ is an index such that $z^{(k)}_{i(k)}\neq z^{(l)}_{i(k)}$ . It satisfies $L^{(l)}(z^{(k)})=1$ if $k=l$ and $L^{(l)}(z^{(k)})=0$ if $k\neq l$ . The degree of $L^{(l)}(z)=:\sum_{\alpha}L^{(l)}_{\alpha}z^{\alpha}$ is equal to $d-1$ . Thus we may multiply the equation in (97) by $L^{(l)}_{\alpha}$ to obtain

[TABLE]

Summing over all $|\alpha|\leqslant d-1$ yields $\sum_{k=1}^{d}c_{k}~{}L^{(l)}(z^{(k)})=c_{l}=0$ . ∎
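Lemma 3 can be checked numerically on a small instance. The Python sketch below is only an illustration: the helper `monomial_vector` and the four points of $\mathbb{C}^{2}$ are hypothetical choices, and three of the points are deliberately taken collinear to show that distinctness alone is enough.

```python
import itertools
import numpy as np

def monomial_vector(z, d):
    """v_d(z) = (z^alpha)_{|alpha| <= d} for a point z in C^n."""
    z = np.asarray(z, dtype=complex)
    exps = [a for a in itertools.product(range(d + 1), repeat=len(z))
            if sum(a) <= d]
    return np.array([np.prod(z ** np.array(a)) for a in exps])

# d = 4 distinct points of C^2; the first three lie on the line z_2 = 0.
points = [(0, 0), (1, 0), (2, 0), (1j, 1 - 1j)]
d = len(points)
V = np.array([monomial_vector(z, d - 1) for z in points])   # rows v_{d-1}(z^(k))
rank = np.linalg.matrix_rank(V)
print(V.shape, rank)
```

A rank equal to the number of points confirms that the vectors $v_{d-1}(z^{(k)})$ are linearly independent on this example.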

Appendix B Second Lemma

Lemma 4.

If $u_{1},\ldots,u_{d}\in\mathbb{C}^{n}$ are linearly independent, and $c_{1},\ldots,c_{d}\in\mathbb{C}\setminus\{0\}$ , then $\mathcal{R}(\sum_{i=1}^{d}c_{i}u_{i}u_{i}^{T})=\mathcal{R}(\sum_{i=1}^{d}c_{i}u_{i}u_{i}^{*})=\text{span}\{u_{1},\ldots,u_{d}\}$ where $\mathcal{R}$ denotes the range.

Proof.

If $z\in\mathbb{C}^{n}$ , then $(\sum_{i=1}^{d}c_{i}u_{i}u_{i}^{T})z=\sum_{i=1}^{d}(c_{i}u_{i}^{T}z)u_{i}\in\text{span}\{u_{1},\ldots,u_{d}\}$ and $(\sum_{i=1}^{d}c_{i}u_{i}u_{i}^{*})z=\sum_{i=1}^{d}(c_{i}u_{i}^{*}z)u_{i}\in\text{span}\{u_{1},\ldots,u_{d}\}$ . Conversely, an element of the span $\sum_{i=1}^{d}\lambda_{i}u_{i}$ with $\lambda_{1},\ldots,\lambda_{d}\in\mathbb{C}$ belongs to the range of $\sum_{i=1}^{d}c_{i}u_{i}u_{i}^{T}$ if there exists $z\in\mathbb{C}^{n}$ such that

[TABLE]

which is equivalent to each of the next three lines:

[TABLE]

Since $(c_{1}u_{1}\ldots c_{d}u_{d})\in\mathbb{C}^{n\times d}$ has rank $d$ , its transpose has rank $d$ . Thus there exists a desired $z\in\mathbb{C}^{n}$ . Likewise, $\sum_{i=1}^{d}\lambda_{i}u_{i}$ belongs to the range of $\sum_{i=1}^{d}c_{i}u_{i}u_{i}^{*}$ if there exists $z\in\mathbb{C}^{n}$ such that

[TABLE]

Since $(c_{1}u_{1}\ldots c_{d}u_{d})\in\mathbb{C}^{n\times d}$ has rank $d$ , its conjugate transpose has rank $d$ . Thus there exists a desired $z\in\mathbb{C}^{n}$ . ∎
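As a numerical sanity check of Lemma 4, the Python sketch below forms both sums for two hypothetical linearly independent vectors of $\mathbb{C}^{4}$ with nonzero complex coefficients, and compares ranks. The data are assumptions chosen for illustration.

```python
import numpy as np

# Columns of U are the vectors u_1, u_2 in C^4 (linearly independent).
U = np.array([[1, 1j, 0, 2], [0, 1, 1 - 1j, 0]], dtype=complex).T
c = np.array([2 - 1j, -0.5j])                  # nonzero coefficients

S_T = (U * c) @ U.T                            # sum_i c_i u_i u_i^T
S_H = (U * c) @ U.conj().T                     # sum_i c_i u_i u_i^*

rank_T = np.linalg.matrix_rank(S_T)
rank_H = np.linalg.matrix_rank(S_H)
# Stacking U next to both sums does not increase the rank when the
# ranges coincide with span{u_1, u_2}.
rank_all = np.linalg.matrix_rank(np.hstack([U, S_T, S_H]))
print(rank_T, rank_H, rank_all)
```

All three ranks equal $d=2$ here, which is what the lemma predicts: each range is contained in the span and has the same dimension.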

Bibliography

[1] X. Li and A. S. Ranga, Szegö Polynomials and the Truncated Trigonometric Moment Problem, The Ramanujan Journal, 12 (2006), pp. 461–472.
[2] J. Agler and J. E. McCarthy, Pick Interpolation and Hilbert Function Spaces, American Mathematical Soc., 2002.
[3] N. I. Akhiezer, The Classical Moment Problem and Some Related Questions in Analysis, Hafner Publ. Co., New York, 1965.
[4] N. I. Akhiezer and M. Krein, Some Questions in the Theory of Moments, Transl. Math. Monographs 2, 58 (1962), pp. 164–168.
[5] F. Andersson, M. Carlsson, and M. V. de Hoop, Nonlinear Approximation of Functions in Two Dimensions by Sums of Wave Packets, Appl. Comput. Harmon. Anal., 29 (2010), pp. 198–213.
[6] A. Athavale, On Joint Hyponormality of Operators, Proceedings of the American Mathematical Society, 103 (1988).
[7] L. Autonne, Sur les Matrices Hypohermitiennes et les Matrices Unitaires, Annales de l’Université de Lyon, Nouvelle Série, I. Sciences, Médecine. Fascicule 38, (1915).
[8] T. Bäckström, Vandermonde Factorization of Toeplitz Matrices and Applications in Filtering and Warping, IEEE Trans. on Signal Processing, 61 (2013).
