Complete set of translation invariant measurements with Lipschitz bounds

Jameson Cahill; Andres Contreras; Andres Contreras Hip

arXiv:1903.02811·math.FA·March 8, 2019


TL;DR

This paper constructs low-dimensional, stable, and invariant signal representations under finite group actions, with explicit Lipschitz bounds, addressing stability and discriminability issues in invariant signal processing.

Contribution

It introduces a novel method to create low-dimensional, complete, and Lipschitz-invariant representations for signals under finite unitary group actions, using algebraic geometry tools.

Findings

01

Constructed invariant representations with explicit Lipschitz bounds.

02

Established existence of complete $\mathbb{Z}_m$-invariant representations for any $m$.

03

Provided a stable, discriminative transform applicable to signal classification.

Abstract

In image and audio signal classification, a major problem is to build stable representations that are invariant under rigid motions and, more generally, under small diffeomorphisms. Translation invariant representations of signals in $\mathbb{C}^{n}$ are of particular importance. The existence of such representations is intimately related to classical invariant theory, inverse problems in compressed sensing and deep learning. Despite an impressive body of literature on the subject, most representations available are either: i) not stable due to the presence of high frequencies; ii) non discriminative; iii) non invariant when projected to finite dimensional subspaces. In the present paper, we construct low dimensional representations of signals in $\mathbb{C}^{n}$ that are invariant under finite unitary group actions; as a special case we establish the existence of low-dimensional and complete…

Equations

$$\sigma:G\to U(n)\quad\text{and, for } g\in G \text{ and } x\in\mathbb{C}^{n},\quad gx=\sigma(g)x.$$

$$d_{G}([x],[y])=\inf_{g\in G}\|x-gy\|$$

$$f:\mathbb{C}^{n}/G\to\mathbb{C}^{2n+1}$$

$$c\,d_{G}([x],[y])\leq\|f([x])-f([y])\|\leq C\,d_{G}([x],[y])$$

$$T^{k}x(j)=x(j-k \bmod n)\quad\text{for } k=1,\ldots,n$$

$$\widehat{Tx}(j)=e^{2\pi ij/n}\,\hat{x}(j)$$

$$\|\Phi(x)-\Phi(y)\|\leq C\,d_{\mathbb{Z}_{m}}([x],[y]),$$

$$F(x):=(x_{0},x_{1}^{p},x_{2}^{p},\dots,x_{p-1}^{p},x_{1}^{p-2}x_{2},\dots,x_{1}^{p-k}x_{k},\dots,x_{1}x_{p-1}).$$

$$(\zeta^{n_{1}})^{p-k}\zeta^{n_{k}}=1,\quad\text{and so}\quad\zeta^{n_{k}}=(\zeta^{n_{1}})^{k}.$$

$$P(x)=(p_{i}(x))_{i=1}^{N}$$

$$\nu_{sep}=\{(x,y)\mid\text{for all } P\in\mathbb{C}[x]^{G},\text{ we have } P(x)=P(y)\}.$$

$$\nu_{sep}=\{(x,gx)\mid x\in\mathbb{C}^{n},\ g\in G\}.$$

$$\mathcal{P}_{T}=\{x_{i}^{m_{i}},\ x_{j}^{a_{jk}}x_{k}^{b_{jk}}:1\leq i\leq n,\ 1\leq j<k\leq n\}$$

$$F_{T}(x)=\left(x_{1}^{m_{1}},\dots,x_{n}^{m_{n}},\{x_{j}^{a_{jk}}x_{k}^{b_{jk}}\}_{1\leq j<k\leq n}\right),$$

$$\Phi(x)=(|\langle x,\varphi_{i}\rangle|^{2})_{i=1}^{N}$$

$$\langle S,T\rangle_{HS}=\operatorname{tr}(ST).$$

$$\langle xx^{*},yy^{*}\rangle_{HS}$$

$$\tilde{\Phi}(S)=(\operatorname{tr}(S\varphi_{i}\varphi_{i}^{*}))_{i=1}^{N}$$

$$P_{i}=\sum_{j}c_{i}\,p_{i,j},$$

$$f_{i,j}(x,y,t)$$

$$F_{i}$$

$$\dim(\mathrm{Im}(F))\leq 2n+1.$$

$$S=\{([\ell],[w]):0\neq w\in\mathrm{Im}(F),\ \ell w=0,\ [\ell]\in\mathbb{P}(\mathbb{C}^{k\times N}),\ [w]\in\mathbb{P}(\mathbb{C}^{N})\}$$

$$\pi_{1}([\ell],[w])=[\ell],\qquad\pi_{2}([\ell],[w])=[w].$$

$$\dim(S)=\dim(\pi_{2}^{-1}([w_{0}]))+\dim(\pi_{2}(S)).$$

$$\dim(\pi_{2}^{-1}([w_{0}]))=\dim(\{\ell\in\mathbb{C}^{k\times N}\mid\ell w_{0}=0\})-1=k(N-1)-1=kN-k-1$$

$$\dim(S)\leq kN-k-1+2n.$$

$$\pi_{1}(S)$$

$$\dim(\pi_{1}(S))\leq\dim(S)\leq kN-k-1+2n,$$

$$kN-k-1+2n<kN-1$$

$$\ker(\ell)\cap(\mathrm{Im}\,F-\mathrm{Im}\,F)=\{0\},$$


Full text

Complete set of translation invariant measurements with Lipschitz bounds

Jameson Cahill and Andres Contreras and Andres Contreras Hip

Department of Mathematical Sciences, New Mexico State University, Las Cruces, New Mexico, USA


Abstract.

In image and audio signal classification, a major problem is to build stable representations that are invariant under rigid motions and, more generally, under small diffeomorphisms. Translation invariant representations of signals in $\mathbb{C}^{n}$ are of particular importance. The existence of such representations is intimately related to classical invariant theory, inverse problems in compressed sensing and deep learning. Despite an impressive body of literature on the subject, most representations available are either: i) not stable due to the presence of high frequencies; ii) non discriminative; iii) non invariant when projected to finite dimensional subspaces. In the present paper, we construct low dimensional representations of signals in $\mathbb{C}^{n}$ that are invariant under finite unitary group actions; as a special case we establish the existence of low-dimensional and complete $\mathbb{Z}_{m}$-invariant representations for any $m\in\mathbb{N}$. Our construction yields a stable, discriminative transform with semi-explicit Lipschitz bounds on the dimension; this is particularly relevant for applications. Using some tools from Algebraic Geometry, we define a high dimensional homogeneous function that is injective. We then exploit the projective character of this embedding and see that the target space can be reduced significantly by using a generic linear transformation. Finally, we introduce the notion of a non-parallel map, a property enjoyed by our function, and employ it to construct a Lipschitz modification of it.

Keywords. invariant theory, signal classification, stable representation

1. Introduction

One of the most important problems in machine learning and signal processing is the classification of visual and audio signals, i.e., given an equivalence relation $\sim$ on $\mathbb{C}^{n}$ one would like to find a map $\Phi:\mathbb{C}^{n}\rightarrow\mathbb{C}^{N}$ with the property that $\Phi(x)=\Phi(y)$ if (and ideally only if) $x\sim y$. One would also like $N$ to be as small as possible and $\Phi$ to be relatively easy to compute. This is a nontrivial task, and the study of this and other related problems has sparked a wealth of exciting developments in mathematics in recent years [19, 3, 6, 5, 16]; see also [25] and references therein for more on the history of these types of problems. The sophisticated tools created and used for this purpose bridge several seemingly unrelated areas of mathematics.

One type of equivalence relation that comes up in many instances is when the equivalence classes are given by the orbits of some group action, that is, a group $G$ acting on $\mathbb{C}^{n}$ and $x\sim y$ whenever there is a $g\in G$ for which $x=gy$ . In all the applications considered in our study, we will assume that our action comes from a unitary representation of $G$ , i.e., we have a group homomorphism

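$$\sigma:G\to U(n)\quad\text{and, for } g\in G \text{ and } x\in\mathbb{C}^{n},\quad gx=\sigma(g)x. \tag{1.1}$$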

In this case we write $\mathbb{C}^{n}/G=\mathbb{C}^{n}/\sim$ for the space of orbits (this is a slight abuse of notation as this space depends on the specific representation of $G$ ) and the quotient metric on $\mathbb{C}^{n}/G$ is given by (see Lemma 3.3.6 in [7])

$$d_{G}([x],[y])=\inf_{g\in G}\|x-gy\|,$$

where $[x]$ denotes the orbit of $x$ under the action of $G$ .
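As a concrete illustration of the quotient metric, the following minimal sketch computes $d_{G}([x],[y])$ for one specific choice of action, the cyclic shifts of $\mathbb{Z}_{n}$ on $\mathbb{C}^{n}$. The function names and test vectors are hypothetical and chosen only for this example; since $G$ is finite, the infimum is a minimum over the $n$ shifts.

```python
import math

def cyclic_shift(x, k):
    """Return T^k x, i.e. (T^k x)(j) = x((j - k) mod n)."""
    n = len(x)
    return [x[(j - k) % n] for j in range(n)]

def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def quotient_distance(x, y):
    """d_G([x],[y]) = min over shifts g of ||x - g y|| for G = Z_n."""
    n = len(x)
    return min(
        norm([a - b for a, b in zip(x, cyclic_shift(y, k))])
        for k in range(n)
    )

# Illustrative data (not from the paper).
x = [1 + 0j, 2j, -1 + 0j, 0.5 + 0j]
y = cyclic_shift(x, 3)                 # lies in the orbit of x
z = [1 + 0j, 2j, -1 + 0j, 0.75 + 0j]   # small perturbation of x

d_same = quotient_distance(x, y)       # vanishes: same orbit
d_near = quotient_distance(x, z)       # size of the perturbation
```

Two signals in the same orbit are at distance zero, while a perturbation of one coordinate produces a distance equal to the size of that perturbation, provided no other shift aligns the signals better.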

By some elementary considerations, using small separating sets, one can show the existence of an injective map

$$f:\mathbb{C}^{n}/G\to\mathbb{C}^{2n+1}$$

that distinguishes orbits. This is essentially a restatement of Proposition 5.1.1 in [14], see Section 2.1. While this looks very good in principle because it solves the classification problem with a small number of measurements, it is not good enough for applications. In general one would like to obtain maps that are not only injective but also stable in the sense that the distortion of the representation is globally controlled. In addition, one would like to construct measurements that are robust enough to preserve injectivity under small perturbations, due to noise for example. To be more precise, one would want a bi-Lipschitz map $f$ , that is a map for which there are constants $0<c\leq C<\infty$ so that

$$c\,d_{G}([x],[y])\leq\|f([x])-f([y])\|\leq C\,d_{G}([x],[y]) \tag{1.3}$$

for every $x,y\in\mathbb{C}^{n}.$

In this work we study the classification problem under finite unitary actions of signals in $\mathbb{C}^{n}.$ We obtain a representation in low dimensions, the first of its kind in a nontrivial setting. Its simplicity makes it a good option for applications. We first obtain some general results that apply to arbitrary unitary actions of finite groups before specializing to the important case of discrete translations (cyclic groups) where our results can be made much more explicit.

One of the motivations of our study is the problem of finding a representation of functions in $L^{2}(\mathbb{R}^{n})$ that is invariant under translations. One such representation is given by the modulus of the Fourier transform; however, it is not injective and furthermore it is known to yield instabilities in the presence of high frequencies, and for this reason it is not suitable for implementation. The numerous works that have tackled different aspects of this problem in recent years rely mostly on statistical learning techniques and as such they inherit some of their problems and limitations. To develop a more satisfactory approach to classification of translation invariant signals in $L^{2}(\mathbb{R}^{n})$, Mallat introduced [19] a scattering transform that is invariant under translations but, more importantly, comes with a global upper Lipschitz bound (the scattering transform is non-expansive). To deal with the problem of instabilities created by the presence of high frequencies, Mallat tames the contributions of fine scale oscillations by using a wavelet-based convolutional network; a limiting procedure gives the desired object at infinite depth. Although the scattering transform provides a non-expansive map, invariant under general small diffeomorphisms, it is not discriminative and is not actually invariant in finite dimensions (invariance is only achieved in the limit, while for implementation one needs to cut off the process after a few iterations). In applications one always works in large but finite dimensions, and thus an alternative to the scattering transform is desirable, one that can be readily applied in this setting with few measurements and with a guarantee of no misclassification. The goal of our work is to provide a different approach to the group invariant classification problem in the relevant finite dimensional setting.

Of particular interest is the case where $G=\mathbb{Z}_{m}$ is a cyclic group. The representation of $G$ in this case will simply be the powers of some matrix $T$ that satisfies $T^{m}=I$ . Since the minimal polynomial of $T$ divides $x^{m}-1$ it follows that $T$ is diagonalizable and all of its eigenvalues are $m$ th roots of unity. To better understand complete translation invariant measurements in this setting, we further specialize to $G=\mathbb{Z}_{n}$ where the action is given by

$$T^{k}x(j)=x(j-k \bmod n)\quad\text{for } k=1,\ldots,n. \tag{1.4}$$

It is well known that $T$ is diagonalized by the discrete Fourier transform $\mathcal{F}x=\hat{x}$, and so in the Fourier domain we have

$$\widehat{Tx}(j)=e^{2\pi ij/n}\,\hat{x}(j).$$

Theorem 1.1 guarantees the existence of a Lipschitz map that distinguishes orbits under this finite translation action with a number of measurements that is linear in the dimension.
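The fact that the translation operator becomes a modulation in the Fourier domain can be checked numerically. The sketch below is an illustration under stated assumptions: it uses an unnormalized DFT with kernel $e^{-2\pi ijk/n}$, a convention chosen here, for which the modulation factor is $e^{-2\pi ij/n}$; the opposite kernel sign yields $e^{2\pi ij/n}$. In both cases the factor has modulus one, which is why the modulus of the Fourier transform is translation invariant. All names are hypothetical.

```python
import cmath

def dft(x):
    """Unnormalized DFT with kernel exp(-2*pi*i*j*k/n) (convention chosen here)."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def shift(x):
    """(T x)(j) = x((j - 1) mod n)."""
    n = len(x)
    return [x[(j - 1) % n] for j in range(n)]

# Illustrative signal (not from the paper).
x = [1 + 2j, -0.5j, 3 + 0j, 0.25 - 1j, 2 + 0j]
n = len(x)
x_hat = dft(x)
Tx_hat = dft(shift(x))

# With this kernel, the DFT of T x is the DFT of x times exp(-2*pi*i*j/n).
modulated = [cmath.exp(-2j * cmath.pi * j / n) * x_hat[j] for j in range(n)]
max_err = max(abs(a - b) for a, b in zip(Tx_hat, modulated))

# Consequently the Fourier modulus does not change under translation.
mod_err = max(abs(abs(a) - abs(b)) for a, b in zip(Tx_hat, x_hat))
```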

Another important application that is closely related to the problem of phase retrieval is the case where $T=\omega I$ where $\omega$ is an $m$ th root of unity, see Section 4.1.

A consequence of our study gives a stable and discriminative nonlinear transform that is $\mathbb{Z}_{m}$ -invariant. More precisely, we prove the following:

Theorem 1.1.

There is a $\mathbb{Z}_{m}$-invariant map $\Phi:\mathbb{C}^{n}\to\mathbb{C}^{2n+1}$ that induces an injective map $\tilde{\Phi}:\mathbb{C}^{n}/\mathbb{Z}_{m}\to\mathbb{C}^{2n+1}$, and a constant $C>0$ depending only on $m$ such that

$$\|\Phi(x)-\Phi(y)\|\leq C\,d_{\mathbb{Z}_{m}}([x],[y]),$$

for every $x,y\in\mathbb{C}^{n}.$

We provide more analytic and geometric information about the map $\Phi$ in Section 4, in particular we give explicit bounds on the optimal Lipschitz constant $C.$

Constructing complete invariant representations with few measurements

As mentioned, in this paper we follow a different approach to build translation invariant representations. Our perspective hinges upon the use of polynomial measurements (as opposed to an infinite chain of convolution-taking modulus operations). The alternative we propose here has the following advantages:

• it lends itself well to applications, because the measurement and coding of signals are usually treated in very high but finite dimensional spaces. The calculations involved in the transform are easy to handle and do not require complex mathematical or computational operations.

• the number of measurements needed to differentiate signals in different orbits is linear in the number of dimensions.

• the map we construct is actually $G$-invariant, as opposed to approximately invariant. As such, it is an accurate representation at the chosen level of precision (scale).

• the representation really solves the classification problem: it is truly injective.

• not only is the map produced stable, but it comes with almost explicit bounds.

Our representation is not entirely constructive since the argument relies on choosing a linear map in the dimension reduction part and this is done using an abstract result. This in itself does not limit the range of applicability of the transform which works for a generic choice.

The construction of an initial large set of polynomial measurements that allow us to separate orbits follows from simple and intuitive observations. However, this is just the starting point as a map consisting of such monomials does not satisfy any of the other properties we ask of our representations.

To better illustrate the ideas and challenges behind the proof of Theorem 1.1, let us restrict ourselves to the case where $G$ is $\mathbb{Z}_{p},$ where $p$ is a prime number and the action is given by translation as in (1.4). We let $M$ be the modulation operator which is the diagonalization of the translation operator, that is, $M=\text{diag }{\left(e^{2\pi ij/p}\right)}_{j=0,\ldots,p-1}$ . In the Fourier domain we can define

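$$F(x):=(x_{0},x_{1}^{p},x_{2}^{p},\dots,x_{p-1}^{p},x_{1}^{p-2}x_{2},\dots,x_{1}^{p-k}x_{k},\dots,x_{1}x_{p-1}).$$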

One can see that $F(\widehat{x})=F(\widehat{y})$ if and only if $\widehat{x}=M^{k}\widehat{y}$ for some $k\in\mathbb{N}$. Indeed, suppose first that $F(\widehat{x})=F(\widehat{y})$, where $\widehat{x}=(\widehat{x}_{0},\ldots,\widehat{x}_{p-1})$ and $\widehat{y}=(\widehat{y}_{0},\ldots,\widehat{y}_{p-1})$. Then $\widehat{x}_{0}=\widehat{y}_{0}$ and there are $n_{k}$, $k=1,\ldots,p-1$, such that $\widehat{x}_{k}=\zeta^{n_{k}}\widehat{y}_{k}$ for $k=1,\ldots,p-1$, where $\zeta=e^{2\pi i/p}$. On the other hand, whenever $\widehat{y}_{1}^{\,p-k}\widehat{y}_{k}\neq 0$, we have

$$(\zeta^{n_{1}})^{p-k}\zeta^{n_{k}}=1,\quad\text{and so}\quad\zeta^{n_{k}}=(\zeta^{n_{1}})^{k}.$$

But then, $\widehat{x}=M^{n_{1}}\widehat{y}.$

However, one can readily see that this map does not separate orbits in general, because if $x_{1}=0$, then the measurements determine each of $x_{2},\ldots,x_{p-1}$ only up to an independent $p$th root of unity. Also, since the measurements are polynomials, we cannot expect to have global Lipschitz bounds. To solve the separating problem, we can add a monomial for each pair of variables, but then we have $\mathcal{O}(p^{2})$ measurements. We will see in Section 3 that by taking generic linear combinations of these measurements we can reduce the dimension dramatically. Then the problem becomes how to turn the resulting map into a Lipschitz one without losing the discriminative property and without adding more measurements. It is at this point that a geometric property we introduce (see Definition 3.2) becomes crucial for building a stable transform. We combine this property satisfied by our polynomials together with the fact that our actions are unitary to reduce all our measurements to the unit sphere.
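Both observations, the invariance of $F$ under modulation and its failure to separate orbits when the first Fourier coefficient vanishes, can be checked directly. The following is a minimal sketch for $p=5$; the helper names and the sample vectors are hypothetical and serve only as an illustration.

```python
import cmath

def F(x_hat):
    """Monomial map from the text for G = Z_p, with x_hat indexed 0..p-1."""
    p = len(x_hat)
    powers = [x_hat[k] ** p for k in range(1, p)]                 # x_k^p
    mixed = [x_hat[1] ** (p - k) * x_hat[k] for k in range(2, p)]  # x_1^(p-k) x_k
    return [x_hat[0]] + powers + mixed

def modulate(x_hat, k=1):
    """Apply M^k, where M = diag(exp(2*pi*i*j/p)), j = 0..p-1."""
    p = len(x_hat)
    return [cmath.exp(2j * cmath.pi * j * k / p) * x_hat[j] for j in range(p)]

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

p = 5
x_hat = [0.3 + 0.1j, 1 - 0.5j, 0.7j, -0.4 + 0.2j, 0.9 + 0j]

# Invariance: F(M^k x_hat) = F(x_hat) for every k.
invariance_err = max(max_diff(F(modulate(x_hat, k)), F(x_hat)) for k in range(p))

# Failure to separate when the first Fourier coefficient vanishes:
# u and v give the same measurements but no M^k maps v to u.
zeta = cmath.exp(2j * cmath.pi / p)
u = [0j, 0j, 1 + 0j, 1 + 0j, 0j]
v = [0j, 0j, 1 + 0j, zeta, 0j]
same_measurements = max_diff(F(u), F(v))
in_same_orbit = any(max_diff(modulate(v, k), u) < 1e-9 for k in range(p))
```

Here $u$ and $v$ differ by a fifth root of unity in a single coordinate, so every $p$th power agrees and every mixed monomial vanishes, yet no single power of $M$ relates them.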

Although the maps we construct are injective, we show in Section 4.2 that their inverses are almost never Lipschitz. This of course does not rule out the possibility that other representations may come with a lower Lipschitz bound. However we believe such representations must be non-algebraic and therefore essentially different from ours.

The paper is organized as follows. In the second section we introduce some necessary algebraic background and we also discuss the connection with the well-known problem of phase retrieval. In Section 3 we prove some general results concerning the construction of effective low dimensional discriminative measurements under natural assumptions. In Section 4 we verify that the examples of interest for us satisfy the conditions of the theorems in Section 3 and we obtain explicit Lipschitz bounds for the resulting transforms. We finish the paper with a discussion on open questions and a few remarks in Section 5.

2. Background

2.1. Algebraic invariants

Given an action of a group $G$ on $\mathbb{C}^{n}$ we denote by $\mathbb{C}[x]^{G}$ the ring of polynomials in $\mathbb{C}[x]=\mathbb{C}[x_{1},...,x_{n}]$ that are invariant under this action. It is well known that this ring is finitely generated as a $\mathbb{C}$ -algebra, however the generating set could be arbitrarily large. Nonetheless one might hope that given a generating set $\{p_{i}\}_{i=1}^{N}$ then the map $P:\mathbb{C}^{n}\rightarrow\mathbb{C}^{N}$ given by

$$P(x)=(p_{i}(x))_{i=1}^{N}$$

would induce a map $\tilde{P}$ that is injective on $\mathbb{C}^{n}/G$. Unfortunately this turns out not to be true in general. A counterexample is $\mathbb{C}^{*}$, the multiplicative group of $\mathbb{C}$, acting on $\mathbb{C}^{2}$ via scalar multiplication (see [12], Example 2.3.1).

As in [20], we define

$$\nu_{sep}=\{(x,y)\mid\text{for all } P\in\mathbb{C}[x]^{G},\text{ we have } P(x)=P(y)\}.$$

Then a set $S\subset{\mathbb{C}[x]}^{G}$ is said to be separating if whenever $P(x)=P(y)$ for all $P\in S,$ we have that $(x,y)\in\nu_{sep}$ . It is known that separating sets can exist which are much smaller than the number of generators of $\mathbb{C}[x]^{G}$ , in fact in [14] the following is proved in Theorem 5.1.1:

Theorem 2.1.

If $G$ acts on $\mathbb{C}^{n}$ then a separating set of size at most $2n+1$ exists.

Furthermore, in [12], Section 2.3, it is noted that if $G$ is a finite group, then

$$\nu_{sep}=\{(x,gx)\mid x\in\mathbb{C}^{n},\ g\in G\}.$$

We now specialize to the case where $G=\mathbb{Z}_{m}.$ Without loss of generality we can assume that the action of $G$ is given by the powers of a diagonal matrix $T=\text{diag}(t_{1},...,t_{n})$ where $t_{i}$ is an $m$ th root of unity for every $i$ . It is proved in Theorem 5.2.1 in [14] that the following set of $n(n+1)/2$ monomials is a separating set:

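$$\mathcal{P}_{T}=\{x_{i}^{m_{i}},\ x_{j}^{a_{jk}}x_{k}^{b_{jk}}:1\leq i\leq n,\ 1\leq j<k\leq n\} \tag{2.2}$$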

where $m_{i}|m$ and $t_{i}$ is a primitive $m_{i}$th root of unity, and $a_{jk}$ is minimal such that there exists a $b_{jk}<m_{k}$ with $x_{j}^{a_{jk}}x_{k}^{b_{jk}}$ invariant. From the set of measurements (2.2), an invariant map $F_{T}:\mathbb{C}^{n}\to\mathbb{C}^{n(n+1)/2}$ defined by

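$$F_{T}(x)=\left(x_{1}^{m_{1}},\dots,x_{n}^{m_{n}},\{x_{j}^{a_{jk}}x_{k}^{b_{jk}}\}_{1\leq j<k\leq n}\right),$$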

induces an explicit injective map $\tilde{F}_{T}:\mathbb{C}^{n}/\mathbb{Z}_{m}\rightarrow\mathbb{C}^{n(n+1)/2}$. We will reduce the dimension of the target space by showing that a suitably generic linear map $\ell:\mathbb{C}^{n(n+1)/2}\to\mathbb{C}^{2n+1}$ is injective when restricted to the image of $F_{T}$. In general, this should be the optimal dimension, but we know already that for specific examples this can be reduced even further to $2n-1$, as can be seen from the special cases covered by [14], Proposition 5.2.2 (see also the discussion in Subsection 2.2 about real phase retrieval being a particular case of this problem).
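To make the monomial set (2.2) concrete, the sketch below computes the exponents $m_{i}$, $a_{jk}$, $b_{jk}$ for a diagonal action $T=\text{diag}(\omega^{e_{1}},\ldots,\omega^{e_{n}})$ with $\omega=e^{2\pi i/m}$ and evaluates the resulting map. It rests on assumptions made for this illustration only: the parametrization by integer exponents $e_{i}$, the reading of "minimal" as the smallest $a_{jk}\geq 1$ (taking the first admissible $b_{jk}$), and all function names. It is not the construction of [14] verbatim.

```python
import cmath
from math import gcd

def separating_exponents(m, e):
    """Exponent data for T = diag(w**e[0], ..., w**e[n-1]), w = exp(2*pi*i/m).

    Returns (orders, pairs): orders[i] = m_i, the order of t_i, and
    pairs[(j, k)] = (a_jk, b_jk) with a_jk >= 1 minimal such that some
    b_jk < m_k makes x_j**a_jk * x_k**b_jk invariant (assumed reading).
    """
    n = len(e)
    orders = [m // gcd(m, ei % m) for ei in e]
    pairs = {}
    for j in range(n):
        for k in range(j + 1, n):
            found = None
            for a in range(1, orders[j] + 1):   # a = m_j, b = 0 always works
                for b in range(orders[k]):
                    if (a * e[j] + b * e[k]) % m == 0:
                        found = (a, b)
                        break
                if found:
                    break
            pairs[(j, k)] = found
    return orders, pairs

def F_T(x, m, e):
    """Evaluate the n(n+1)/2 monomials built from the exponent data."""
    orders, pairs = separating_exponents(m, e)
    pure = [x[i] ** orders[i] for i in range(len(x))]
    mixed = [x[j] ** a * x[k] ** b for (j, k), (a, b) in pairs.items()]
    return pure + mixed

# Illustrative action of Z_6 on C^3 (not from the paper).
m, e = 6, [1, 2, 3]
x = [0.8 + 0.3j, -0.2 + 1.1j, 0.5 - 0.7j]
w = cmath.exp(2j * cmath.pi / m)
Tx = [w ** e[i] * x[i] for i in range(len(x))]

orders, pairs = separating_exponents(m, e)
values = F_T(x, m, e)
invariance_err = max(abs(a - b) for a, b in zip(values, F_T(Tx, m, e)))
```

For this example the orders are $6,3,2$ and the map has $n(n+1)/2=6$ components, each unchanged when $x$ is replaced by $Tx$.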

2.2. Phase retrieval

Phase retrieval is the problem of recovering a signal $x$ in $\mathbb{C}^{n}$ (or $\mathbb{R}^{n}$ ) up to a global phase factor from a collection of intensity measurements $(|\langle x,\varphi_{i}\rangle|^{2})_{i=1}^{N}$ . This type of problem comes up in many applications and has a rich history, but has seen considerable interest in the last decade or so since the publication of [1]. To state the problem in the setting of this paper let $\mathbb{T}=\{\lambda\in\mathbb{C}:|\lambda|=1\}$ denote the one dimensional torus and let $\mathbb{T}$ act on $\mathbb{C}^{n}$ by scalar multiplication. Then given any collection of vectors $\{\varphi_{i}\}_{i=1}^{N}$ the mapping $\Phi:\mathbb{C}^{n}\rightarrow\mathbb{R}^{N}$ given by

$$\Phi(x)=(|\langle x,\varphi_{i}\rangle|^{2})_{i=1}^{N}$$

is invariant under the action of $\mathbb{T}$ so we can consider the induced map $\tilde{\Phi}$ whose domain is $\mathbb{C}^{n}/\mathbb{T}$ . The first problem in phase retrieval is to understand when this map is injective.

Let $\mathbb{H}_{n}$ denote the space of $n\times n$ Hermitian matrices and note that $\mathbb{H}_{n}$ is a vector space over the real numbers (not the complex numbers) of dimension $n^{2}$ . The Hilbert-Schmidt inner product on $\mathbb{H}_{n}$ is given by

$$\langle S,T\rangle_{HS}=\operatorname{tr}(ST).$$

Now consider the map from $\mathbb{C}^{n}$ to $\mathbb{H}_{n}$ given by $x\mapsto xx^{*}$ . First note that $xx^{*}=yy^{*}$ if and only if $x=\lambda y$ for some $\lambda\in\mathbb{T}$ , so this map is injective on $\mathbb{C}^{n}/\mathbb{T}$ , and the image of this map is the set $\mathcal{S}$ of positive rank one matrices in $\mathbb{H}_{n}$ (which looks like $\mathbb{P}^{n-1}\times\mathbb{R}_{+}$ ). Next observe that for $x,y\in\mathbb{C}^{n}$

$$\langle xx^{*},yy^{*}\rangle_{HS}=\operatorname{tr}(xx^{*}yy^{*})=|\langle x,y\rangle|^{2}.$$

Given a collection of vectors $\{\varphi_{i}\}_{i=1}^{N}\subseteq\mathbb{C}^{n}$ define the linear map $\tilde{\Phi}:\mathbb{H}_{n}\rightarrow\mathbb{R}^{N}$ given by

$$\tilde{\Phi}(S)=(\operatorname{tr}(S\varphi_{i}\varphi_{i}^{*}))_{i=1}^{N}$$

and note that $\Phi(x)=\tilde{\Phi}(xx^{*})$ , so $\Phi$ is injective on $\mathbb{C}^{n}/\mathbb{T}$ if and only if $\tilde{\Phi}$ is injective when restricted to $\mathcal{S}$ . If $\tilde{\Phi}$ is not injective on $\mathcal{S}$ then there are $xx^{*}\neq yy^{*}$ so that $\tilde{\Phi}(xx^{*})=\tilde{\Phi}(yy^{*})$ and so $xx^{*}-yy^{*}\in\ker(\tilde{\Phi})$ . From this observation it is straightforward to prove the following (see Lemma 9 in [4]):

Lemma 2.2.

$\tilde{\Phi}$ is injective on $\mathcal{S}$ if and only if every nonzero matrix in $\ker(\tilde{\Phi})$ has rank at least 3.
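The lifting used in this subsection, namely that the intensity measurements of $x$ coincide with the linear measurements $\operatorname{tr}(xx^{*}\varphi_{i}\varphi_{i}^{*})$ of the rank one matrix $xx^{*}$, can be verified numerically. The sketch below is only an illustration: the four measurement vectors in $\mathbb{C}^{2}$ and all function names are hypothetical, and no claim is made that this particular collection yields an injective map.

```python
import cmath

def inner(x, phi):
    """<x, phi> = sum_j x_j * conj(phi_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, phi))

def Phi(x, frame):
    """Intensity measurements |<x, phi_i>|^2."""
    return [abs(inner(x, phi)) ** 2 for phi in frame]

def outer(x):
    """Rank-one Hermitian matrix x x^*."""
    return [[a * b.conjugate() for b in x] for a in x]

def trace_product(S, T):
    """tr(S T) for square matrices given as nested lists."""
    n = len(S)
    return sum(S[i][j] * T[j][i] for i in range(n) for j in range(n))

def Phi_tilde(S, frame):
    """Linear measurements tr(S phi_i phi_i^*) on Hermitian matrices."""
    return [trace_product(S, outer(phi)).real for phi in frame]

# Illustrative measurement vectors and signal (not from the paper).
frame = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1 + 0j], [1 + 0j, 1j]]
x = [0.6 - 0.2j, -1.1 + 0.4j]
lam = cmath.exp(0.7j)   # a unimodular scalar, i.e. an element of the torus

lifted_err = max(
    abs(a - b) for a, b in zip(Phi(x, frame), Phi_tilde(outer(x), frame))
)
phase_err = max(
    abs(a - b) for a, b in zip(Phi(x, frame), Phi([lam * c for c in x], frame))
)
```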

Since the set of $n\times n$ Hermitian matrices of rank at most 2 has dimension $4n-4$, the authors of [4] were led to conjecture that $N\geq 4n-4$ is necessary for $\Phi$ to be injective. If this were taking place over the complex numbers this would follow directly from the Projective Dimension Theorem. In [11] the conjecture was proven for infinitely many values of $n$; however, in [23] a counterexample is constructed with $n=4$ and $N=11$.

In most applications the measurements $\Phi(x)$ will never be exact and will be corrupted by noise of some form. Therefore we would like the map to be not just injective but bi-Lipschitz as defined in (1.3) (here we use the quotient metric on $\mathbb{C}^{n}/\mathbb{T}$); however, it is shown in [4] that this $\Phi$ can never be bi-Lipschitz in this sense. There are (at least) two ways of dealing with this situation. The first, as done in [4], is to modify the map to get a new map that is bi-Lipschitz. Another alternative, which is explored in [2], is to replace the quotient metric with a different metric on $\mathbb{C}^{n}/\mathbb{T}$ with respect to which $\Phi$ is bi-Lipschitz. In this paper we will encounter a similar situation where we will have an initial $G$-invariant map which is injective but not Lipschitz. Our approach will be to exploit a geometric property of this map to produce a new map which is still injective but also Lipschitz.

We can also study phase retrieval over the real numbers where we replace $\mathbb{C}^{n}$ with $\mathbb{R}^{n}$ and $\mathbb{T}$ with $\{1,-1\}$ , which corresponds to the $\mathbb{Z}_{2}$ action on $\mathbb{R}^{n}$ given by $-I$ . In this case the analysis above is still valid, but for $x\in\mathbb{R}^{n}$ we have that $xx^{*}=xx^{T}$ is real and symmetric, and the entries of $xx^{T}$ are precisely the monomials in $\mathcal{P}_{-I}$ (see (2.2)). Therefore, from this perspective real phase retrieval can be thought of as a very special case of the type of cyclic group actions that we consider in this paper.
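The identification of real phase retrieval with a $\mathbb{Z}_{2}$ action can be illustrated in a few lines: the distinct entries of $xx^{T}$ are the monomials $x_{j}x_{k}$ with $j\leq k$, there are $n(n+1)/2$ of them, and each is unchanged under $x\mapsto -x$. The function name and the sample vector below are hypothetical.

```python
def quadratic_monomials(x):
    """Upper-triangular entries x_j * x_k (j <= k) of the matrix x x^T."""
    n = len(x)
    return [x[j] * x[k] for j in range(n) for k in range(j, n)]

# Illustrative real signal (not from the paper).
x = [1.5, -2.0, 0.25]
neg_x = [-c for c in x]

monomials = quadratic_monomials(x)
monomials_neg = quadratic_monomials(neg_x)   # same values: the Z_2 action is invisible
```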

3. Main results

In this section we present a series of abstract results that yield a discriminative and stable representation $\Phi.$ It can be seen in the next section that the hypotheses needed for the existence of such $\Phi$ are satisfied by our objects of interest; we actually believe that the abstract framework here provided can find further applications in signal processing. The construction of the map $\Phi$ rests on discriminative polynomial measurements in some possibly high dimensional space. These are later mapped to an $\mathcal{O}(n)$ dimensional space via a generic linear map. Still, this map is not necessarily Lipschitz so we appeal to its geometric properties to modify it so that it becomes Lipschitz while preserving injectivity. In the rest of this section we will present the steps described.

3.1. Dimension reduction

We already have that the polynomial map $F_{T}$ defined in Section 2.1 is separating. We now turn to the problem of reducing the dimension and obtaining a Lipschitz bound. This has to be done carefully because we need to reduce the dimension while preserving injectivity and, finally, controlling the distortion. The next theorem reduces the dimension.

Theorem 3.1.

Let $G$ act on $\mathbb{C}^{n}$ and suppose $P:\mathbb{C}^{n}\rightarrow\mathbb{C}^{N}$ is a polynomial $G$ -invariant map such that the induced map $\tilde{P}:\mathbb{C}^{n}/G\rightarrow\mathbb{C}^{N}$ is injective. Then for $k\geq 2n+1,$ $\ell\circ\tilde{P}$ is injective for a generic linear map $\ell:\mathbb{C}^{N}\rightarrow\mathbb{C}^{k}$ .

Proof.

First, since the components of $P$ are polynomials, we can write $P=(P_{1},P_{2},\ldots,P_{N}),$ and

$$P_{i}=\sum_{j}c_{i}\,p_{i,j},$$

where $p_{i,j}$ is a monomial of degree $d_{i,j}.$ Then one of these monomials achieves maximum degree $d.$ Now we produce a new map $F(x,y,t)=(F_{1},...,F_{N})$ as follows:

[TABLE]

Note that $F$ is homogeneous and regular (it is a polynomial map), and so we know that $\mathrm{Im}(F)\subseteq\mathbb{C}^{N}$ is a projective variety. Therefore

$$\dim(\mathrm{Im}(F))\leq 2n+1.$$

For a linear map $\ell:\mathbb{C}^{N}\rightarrow\mathbb{C}^{k}$ if $\ell\circ\tilde{P}$ is not injective then there are $x,y\in\mathbb{C}^{n}$ so that $F(x,y,1)\neq 0$ but $F(x,y,1)\in\ker(\ell)$ . We now claim that if $k\geq 2n+1$ then for a generic $\ell$ we have $\ker(\ell)\cap\mathrm{Im}(F)=\{0\}$ which would prove the theorem. To this end let

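$$S=\{([\ell],[w]):0\neq w\in\mathrm{Im}(F),\ \ell w=0,\ [\ell]\in\mathbb{P}(\mathbb{C}^{k\times N}),\ [w]\in\mathbb{P}(\mathbb{C}^{N})\}$$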

(here $[\cdot]$ is a class in projective space). It is easy to see that $S$ is projective. Let us note that we can assume without loss of generality that $S$ is irreducible (if not, argue as below on each irreducible component). Also let $\pi_{1},\pi_{2}$ be the projections of $S$, that is:

$$\pi_{1}([\ell],[w])=[\ell],\qquad\pi_{2}([\ell],[w])=[w].$$

Then $\pi_{2}(S)=[\mathrm{im}(F)]$; therefore, $\dim(\pi_{2}(S))=\dim(\mathrm{im}(F))-1\leq 2n$. By [17], Corollary 11.13, we know that if we take $[w_{0}]\in\mathbb{P}^{N}$, then

$$\dim(S)=\dim(\pi_{2}^{-1}([w_{0}]))+\dim(\pi_{2}(S)).$$

On the other hand, we know

$$\dim(\pi_{2}^{-1}([w_{0}]))=\dim(\{\ell\in\mathbb{C}^{k\times N}\mid\ell w_{0}=0\})-1=k(N-1)-1=kN-k-1.$$

This implies that

$$\dim(S)\leq kN-k-1+2n.$$

Next observe that

[TABLE]

Therefore if $\dim(\pi_{1}(S))<\dim(\mathbb{P}(\mathbb{C}^{k\times N}))=kN-1$ then $\ell\circ\tilde{P}$ is injective for a generic $\ell$ . Finally,

$$\dim(\pi_{1}(S))\leq\dim(S)\leq kN-k-1+2n,$$

so we require

$$kN-k-1+2n<kN-1,$$

which means $k>2n.$ ∎

The main idea in the proof of Theorem 3.1 is very similar to that of Lemma 2.2, in that we want to show that a generic linear map of appropriate rank is injective when restricted to a particular algebraic variety, which means that the kernel of the linear map needs to avoid differences of pairs of vectors that are on the variety. The main difference between these two results is that the varieties under consideration in the phase retrieval case have a lot of structure, in particular they are projective, whereas ours are not. On the other hand, phase retrieval takes place over the real numbers (even in the complex case), which complicates the use of certain tools from algebraic geometry. We note that variants of these types of arguments have been used in other recovery problems [24, 22, 11].
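The dimension count at the end of the proof of Theorem 3.1 is simple arithmetic and can be tabulated. The sketch below checks for which $k$ the bound $kN-k-1+2n$ on $\dim(\pi_{1}(S))$ falls below $\dim(\mathbb{P}(\mathbb{C}^{k\times N}))=kN-1$, and then applies a random Gaussian matrix as a stand-in for a generic $\ell$. The function names, the seed and the choice $n=4$, $N=10$ are assumptions made for illustration; a random draw is expected to be generic, but the code does not verify injectivity.

```python
import random

def generic_map_suffices(n, N, k):
    """Dimension count from the proof: bound on dim(pi_1(S)) versus kN - 1."""
    return k * N - k - 1 + 2 * n < k * N - 1

def random_linear_map(k, N, seed=0):
    """A k x N complex Gaussian matrix, standing in for a generic linear map."""
    rng = random.Random(seed)
    return [
        [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
        for _ in range(k)
    ]

def apply_map(ell, w):
    """Matrix-vector product ell * w."""
    return [sum(a * b for a, b in zip(row, w)) for row in ell]

n, N = 4, 10   # e.g. N = n(n+1)/2 monomial measurements for n = 4
smallest_k = min(k for k in range(1, N + 1) if generic_map_suffices(n, N, k))

ell = random_linear_map(smallest_k, N)
w = [complex(i + 1, -i) for i in range(N)]   # placeholder measurement vector
reduced = apply_map(ell, w)
```

The smallest admissible $k$ is $2n+1$, in agreement with the statement of the theorem, and the reduced measurement vector has that many entries regardless of $N$.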

3.2. Non-parallel maps induce Lipschitz invariant representations

As we anticipated, we will discuss other general theorems that let us make modifications to turn a map into a Lipschitz one, provided the map we started with satisfies a geometric condition. To that end, we introduce the following concept:

Definition 3.2.

Suppose $G$ acts on $\mathbb{C}^{n}$ and $F:\mathbb{C}^{n}\rightarrow\mathbb{C}^{N}$ is $G$ -invariant. We say $F$ has the non-parallel property if the following holds: If $\|x\|=\|y\|=1$ and $F(x)=\lambda F(y)$ for some $\lambda>0,$ then $x=gy$ for some $g\in G$ .

Remark 3.3.

Note that if $F:\mathbb{C}^{n}\to\mathbb{C}^{N}$ satisfies the non-parallel property, and $\ell:\mathbb{C}^{N}\to\mathbb{C}^{m}$ is a linear map such that

$$\ker(\ell)\cap(\mathrm{Im}\,F-\mathrm{Im}\,F)=\{0\},$$

then $\ell\circ F$ also satisfies the non-parallel property.

The following will be used to get a candidate for a Lipschitz map.

Definition 3.4.

For any map $F:\mathbb{C}^{n}\to\mathbb{C}^{N},$ we define $\Phi_{F}$ by

[TABLE]

Proposition 3.5.

Suppose $G$ acts on $\mathbb{C}^{n}$ according to (1.1) and $F:\mathbb{C}^{n}\rightarrow\mathbb{C}^{N}$ is a $G$-invariant map. Then $\Phi_{F}$ as defined in (3.1) satisfies

(a) $\Phi_{F}$ is also $G$-invariant,

(b) If $F$ has the non-parallel property and if the induced map $\tilde{F}:\mathbb{C}^{n}/G\rightarrow\mathbb{C}^{N}$ is injective then the corresponding induced map $\tilde{\Phi}_{F}$ is also injective.

Proof.

Throughout the proof, we write $\Phi$ instead of $\Phi_{F}$.

(a) Note that if $x\in\mathbb{C}^{n}$ and $g\in G,$ then

[TABLE]

where we have used our general assumption (1.1). Since $F$ is invariant, we have that

[TABLE]

So $\Phi$ is invariant.

(b) Suppose $x,y\in\mathbb{C}^{n}$ are such that

[TABLE]

If $x=0,$ then

[TABLE]

which implies that $y=0=x.$ Now if $x\neq 0,$ we have that

[TABLE]

So in particular $F\left(\frac{x}{\|x\|}\right)$ and $F\left(\frac{y}{\|y\|}\right)$ are parallel. Since $F$ satisfies the non-parallel property, we have that $\frac{x}{\|x\|}=g\left(\frac{y}{\|y\|}\right)$ for some $g\in G$ . This implies that

[TABLE]

and since $\tilde{F}$ is injective we conclude $x=gy$ which means $\tilde{\Phi}$ is also injective. ∎
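The construction in Definition 3.4 and part (a) of Proposition 3.5 can be checked numerically. The following is a minimal sketch, assuming that (3.1) is the degree-one homogeneous extension $\Phi_{F}(x)=\|x\|F(x/\|x\|)$ with $\Phi_{F}(0)=0$ (consistent with $\Phi$ agreeing with $F$ on the unit sphere). The map `F_example` and the names `phi`, `invariant`, `homogeneous` are hypothetical and chosen only for illustration; `F_example` is not the map $F_{T}$ of (2.3).

```python
import numpy as np

def F_example(x):
    # Hypothetical invariant map: all degree-3 monomials in (x1, x2).
    # Each monomial is unchanged by x -> w*x whenever w**3 == 1.
    x1, x2 = x
    return np.array([x1**3, x1**2 * x2, x1 * x2**2, x2**3])

def phi(F, x):
    # Assumed form of (3.1): ||x|| * F(x / ||x||) for x != 0, and 0 at x = 0.
    x = np.asarray(x, dtype=complex)
    r = np.linalg.norm(x)
    if r == 0:
        e = np.zeros_like(x)
        e[0] = 1.0
        return np.zeros_like(F(e))  # zero vector with the output shape of F
    return r * F(x / r)

w = np.exp(2j * np.pi / 3)          # generator of Z_3 acting as w * I
x = np.array([1.0 + 2.0j, -0.5j])

# Invariance under the group action, and homogeneity of degree one.
invariant = np.allclose(phi(F_example, w * x), phi(F_example, x))
homogeneous = np.allclose(phi(F_example, 2.0 * x), 2.0 * phi(F_example, x))
print(invariant, homogeneous)
```

The rescaling is what trades the degree-3 growth of `F_example` for linear growth, which is the property a Lipschitz estimate needs away from the sphere.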

Remark 3.6.

Note that in the proof of (b) we really show that when $F$ has the non-parallel property then $\tilde{\Phi}_{F}(x)=\tilde{\Phi}_{F}(y)$ if and only if $\tilde{F}(x)=\tilde{F}(y)$ regardless of whether or not $\tilde{F}$ is injective.

In what follows, $\mathbb{S}$ will denote the unit sphere in $\mathbb{C}^{n}.$

Theorem 3.7.

Let $G$ be a unitary group acting on $\mathbb{C}^{n}$ according to (1.1). Let $H:\mathbb{C}^{n}\to\mathbb{C}^{N}$ be a $G$ -invariant map satisfying the non-parallel property. Assume $H$ is $C^{1}.$ Then the map $\Phi_{H}:\mathbb{C}^{n}\to\mathbb{C}^{N}$ defined in (3.1) is Lipschitz. Furthermore, the optimal Lipschitz constant $\alpha$ such that $\|\tilde{\Phi}_{H}([x])-\tilde{\Phi}_{H}([y])\|\leq\alpha d_{G}([x],[y])$ satisfies $\alpha\leq 3C,$ where

[TABLE]

Proof.

We will drop the subscript $H$ in order to ease the notation. Let $x,y\in\mathbb{C}^{n}.$ If $x=0$ and $y\neq 0,$ then

[TABLE]

Now if $x,y\neq 0,$ then

[TABLE]

Since $H$ is bounded on the sphere, we have that

[TABLE]

This implies that

[TABLE]

By symmetry on $x$ and $y,$ we have that

[TABLE]

We know that $H$ is differentiable when restricted to the unit sphere $\mathbb{S}$. Therefore, for $x,y\neq 0,$ we can use the mean value theorem to find

[TABLE]

Hence

[TABLE]

Then if we let

[TABLE]

we will have that

[TABLE]

If $\|y\|\leq\|x\|,$ then

[TABLE]

from where we obtain

[TABLE]

This implies that

[TABLE]

Since $\Phi$ is invariant, we know that for all $g\in G,$ we have that

[TABLE]

therefore

[TABLE]

This gives the Lipschitz bound. ∎
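The Lipschitz estimate is stated with respect to the quotient metric $d_{G}$. As a rough numerical companion, the sketch below assumes that $d_{G}([x],[y])=\min_{g\in G}\|x-gy\|$ and that (3.1) is $\Phi_{H}(x)=\|x\|H(x/\|x\|)$; the map `H_example`, the sampling scheme, and all function names are hypothetical. It only produces an empirical lower estimate of the optimal constant $\alpha$ and does not verify the bound $\alpha\leq 3C$.

```python
import numpy as np

def quotient_distance(x, y, t, m):
    # Assumed metric: d_G([x],[y]) = min_k ||x - T^k y||, T = diag(t) generating Z_m.
    return min(np.linalg.norm(x - (t ** k) * y) for k in range(m))

def H_example(u):
    # Hypothetical C^1 invariant map (degree-3 monomials, Z_3 acting as w*I).
    u1, u2 = u
    return np.array([u1**3, u1**2 * u2, u1 * u2**2, u2**3])

def phi(H, x):
    # Assumed form of (3.1); the origin is sent to 0.
    r = np.linalg.norm(x)
    return r * H(x / r) if r > 0 else np.zeros(4, dtype=complex)

m = 3
w = np.exp(2j * np.pi / m)
t = np.array([w, w])

rng = np.random.default_rng(0)
ratios = []
for _ in range(2000):
    x = rng.normal(size=2) + 1j * rng.normal(size=2)
    y = rng.normal(size=2) + 1j * rng.normal(size=2)
    d = quotient_distance(x, y, t, m)
    ratios.append(np.linalg.norm(phi(H_example, x) - phi(H_example, y)) / d)

empirical_constant = max(ratios)    # lower estimate of the optimal alpha
print(empirical_constant)
```

Random sampling can only underestimate the supremum, so the printed value is a sanity check rather than a certificate.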

As mentioned in the introduction, an ideal representation should not only be injective but also control the distance between classes in terms of a lower Lipschitz bound. Because the representation we construct is made of linear combinations of polynomials with no particular structure (as opposed to phase retrieval, where the measurements take the particular form $\{|\langle x,\phi_{k}\rangle|^{2}\}_{k=1,\ldots,M}$), we cannot expect to have a lower Lipschitz bound at our level of generality. Moreover, we will see in the next section that when $G=\mathbb{Z}_{m}$ with $m\geq 3,$ the map $\Phi$ is not bi-Lipschitz.

In phase retrieval the map $\ell$ sends the measurements to $(\mathbb{R}_{+})^{N}$ for some $N,$ and since the map is homogeneous of degree 2, one can take the square root of each component without destroying separation while making the map bi-Lipschitz. Our separating measurements are essentially complex valued: the phases contain essential information one cannot ignore, and so we cannot appeal to a similar idea.

4. Applications to cyclic groups

In this section, we discuss a few applications of the theorems in the previous section. Recall that in the case of $G=\mathbb{Z}_{m}$ we can assume the action is given by the powers of a diagonal matrix $T=\mathrm{diag}(t_{1},...,t_{n})$ where each $t_{i}$ is an $m$th root of unity.
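For a diagonal action, the simplest invariants are pure powers of the coordinates. The sketch below assumes that $m_{i}$ denotes the order of $t_{i}$ (so that $m_{i}\mid m$ and $x_{i}^{m_{i}}$ is invariant), which matches how $m_{i}$ is used in the proofs of this section. The choice $m=12$ with exponents `[1, 2, 3, 4, 6]` and the helper `orders` are hypothetical and are not taken from the paper or from [14].

```python
from math import gcd
import numpy as np

def orders(exponents, m):
    # Order of t_i = w**k_i among the m-th roots of unity: m / gcd(k_i, m).
    return [m // gcd(k % m, m) for k in exponents]

m = 12
exponents = [1, 2, 3, 4, 6]              # hypothetical choice of t_i = w**k_i
w = np.exp(2j * np.pi / m)
t = w ** np.array(exponents)             # diagonal entries of T
m_i = orders(exponents, m)

x = np.array([0.3 + 1j, -1.2, 2j, 0.7 - 0.4j, 1.5])
powers_before = x ** np.array(m_i)
powers_after = (t * x) ** np.array(m_i)  # apply T once, then take the powers
print(m_i, np.allclose(powers_before, powers_after))
```

Since $T$ generates the group, invariance under a single application of $T$ gives invariance under every power of $T$.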

Theorem 4.1.

Suppose the group $\mathbb{Z}_{m}$ acts on $\mathbb{C}^{n}.$ Then there is an injective map

$$\tilde{\Phi}:\mathbb{C}^{n}/\mathbb{Z}_{m}\to\mathbb{C}^{2n+1}.$$

Moreover, $\tilde{\Phi}$ has a Lipschitz constant $3m\|\ell\|.$

This theorem will be a consequence of the following lemmas:

Lemma 4.2.

*$F_{T}$ defined in (2.3) satisfies the non-parallel property.*

Proof.

Suppose we have $x,y\in\mathbb{C}^{n}$ and $\lambda>0$ such that

[TABLE]

Let $\omega$ be a primitive $m$th root of unity and

$$\tilde{y}=\left(\lambda^{\frac{1}{m_{1}}}y_{1},\ldots,\lambda^{\frac{1}{m_{n}}}y_{n}\right).$$

We will prove that $\lambda F_{T}(y)=F_{T}(\tilde{y}).$ Let $S=\{i\,:\,y_{i}\neq 0\}.$ Note that

[TABLE]

To prove that $\lambda y_{j}^{a_{jk}}y_{k}^{b_{jk}}=\tilde{y}_{j}^{a_{jk}}\tilde{y}_{k}^{b_{jk}},$ there are two cases:

Case 1: One of $j,k$ is not in $S.$

In this case, one of $y_{j},y_{k}$ is $0$ (by definition of $S$) and by the definition of $\tilde{y}_{j},$ one of $\tilde{y}_{j},\tilde{y}_{k}$ is $0.$ Therefore,

[TABLE]

Case 2: Both $j,k\in S.$

In this case, we know $y_{j},y_{k}\neq 0.$ We have $\lambda{y_{j}}^{m_{j}}={x_{j}}^{m_{j}},$ and so

[TABLE]

Then ${\left(\frac{x_{j}}{{\lambda}^{\frac{1}{m_{j}}}y_{j}}\right)}^{m_{j}}=1.$ Since $m_{j}|m,$ there is a $p_{j}$ such that $\frac{x_{j}}{{\lambda}^{\frac{1}{m_{j}}}y_{j}}={\omega}^{p_{j}},$ which implies that

[TABLE]

We know that $\lambda y_{j}^{a_{jk}}y_{k}^{b_{jk}}=x_{j}^{a_{jk}}x_{k}^{b_{jk}},$ which implies that

[TABLE]

and since $\frac{x_{j}}{y_{j}}={\omega}^{p_{j}}{\lambda}^{\frac{1}{m_{j}}},\frac{x_{k}}{y_{k}}={\omega}^{p_{k}}{\lambda}^{\frac{1}{m_{k}}},$ we have

[TABLE]

Taking the modulus on both sides yields $\lambda={\lambda}^{\frac{a_{jk}}{m_{j}}+\frac{b_{jk}}{m_{k}}},$ hence

[TABLE]

Therefore,

[TABLE]

which implies that $x\sim\tilde{y}.$ Thus for some $1\leq k\leq n$ we have that

[TABLE]

Since $T$ is unitary, we have that

[TABLE]

Since $y\neq 0$ (in fact $\|y\|=1$ ), we know that $\sum_{j=1}^{n}{\lambda}^{\frac{2}{m_{j}}}{|y_{j}|}^{2}$ is increasing in $\lambda.$ Hence $\lambda=1.$ This implies that $\Phi(x)=\Phi(\tilde{y})=\Phi(y).$ Since $\tilde{\Phi}$ is injective, $x\sim y.$ This proves the non-parallel property.

∎

Lemma 4.3.

The map $H=\ell\circ F_{T},$ where $F_{T}:\mathbb{C}^{n}\to\mathbb{C}^{n(n+1)/2}$ is the map defined in (2.3) and $\ell$ is any map satisfying the conclusion of Theorem 3.1, satisfies the non-parallel property. In particular, the map $\Phi$ defined in (3.1) for this choice of $H,$ induces an injective map $\tilde{\Phi}.$ Furthermore, $\tilde{\Phi}$ satisfies the Lipschitz bound

[TABLE]

where $\|\ell\|$ is the operator norm of $\ell.$

Proof.

Combining Remark 3.3, Theorem 3.1, and Lemma 4.2, we obtain that $\ell\circ F_{T}$ satisfies the non-parallel property. Now, using Proposition 3.5, we get

[TABLE]

∎

Lemma 4.4.

If $F_{T}$ is defined as in (2.3), then

[TABLE]

Proof.

Note that

[TABLE]

But if $x\in\mathbb{S},$ then

[TABLE]

Since

[TABLE]

we have that

[TABLE]

∎

Proof of Theorem 4.1.

We start with the map $F_{T}:\mathbb{C}^{n}\to\mathbb{C}^{n(n+1)/2}$ given by (2.3). It is proven in [14, Chapter 5] that this map is separating. By Lemma 4.2, it satisfies the non-parallel property. We can apply Theorem 3.1 to reduce the dimension of the target space to $2n+1.$ For a generic $\ell:\mathbb{C}^{n(n+1)/2}\to\mathbb{C}^{2n+1},$ we have that

[TABLE]

is a separating map. Now, by Remark 3.3, we have that $\ell\circ F_{T}$ also satisfies the non-parallel property. Hence we can use Theorem 3.7 for $\ell\circ F_{T},$ and we obtain an injective map $\tilde{\Phi}:\mathbb{C}^{n}/\mathbb{Z}_{m}\to\mathbb{C}^{2n+1}$ with a Lipschitz bound:

[TABLE]

where

[TABLE]

Using Lemma 4.4, we directly deduce that

[TABLE]

∎

Before proceeding we make a few remarks.

Since the invariants we use in this case are monomials of the form $x_{i}^{a}x_{j}^{b},$ it could make sense to arrange them in an $n\times n$ matrix. In fact, since we only have one monomial for each $(i,j),$ we can choose to make this matrix triangular, symmetric, or Hermitian if we so desire. As such, this problem fits naturally into the broader context of matrix recovery problems. In this paper we do not need to exploit this point of view, as our proofs do not benefit from it.

We now return to the original motivation of this work, which was to study translation invariant measurements on $\mathbb{C}^{n}$. Here $\mathbb{Z}_{n}$ acts on $\mathbb{C}^{n}$ via translation as in (1.4). The matrix $T$ is then not diagonal, but it is diagonalized by the Fourier transform. In the Fourier domain our action is given by powers of the modulation matrix

[TABLE]

One of the main motivations that Mallat cites in [19] for the development of the scattering transform is that although the modulus of the Fourier transform is translation invariant, it is not stable. In our context, by taking the modulus of the first $n$ entries of $F_{T}(x)$, i.e., the measurements of the form $x_{i}^{m_{i}}$, we can recover the modulus of the Fourier transform, but even these are not enough to get injectivity, which is why we need the rest of the measurements.
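The two facts used here, that translation becomes a diagonal modulation in the Fourier domain and that the Fourier modulus alone does not separate orbits, can be seen in a small numerical sketch. It uses NumPy's FFT sign convention, under which a shift by one sample multiplies the $k$th coefficient by $e^{-2\pi ik/n}$; the test signal and variable names are arbitrary illustrative choices.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 5.0])
n = len(x)
X = np.fft.fft(x)

# A cyclic shift by one sample multiplies the k-th Fourier coefficient
# by exp(-2*pi*i*k/n), i.e. it acts as a diagonal modulation matrix.
shifted = np.roll(x, 1)
modulation = np.exp(-2j * np.pi * np.arange(n) / n)
shift_is_modulation = np.allclose(np.fft.fft(shifted), modulation * X)

# The Fourier modulus is therefore shift invariant ...
same_modulus_shift = np.allclose(np.abs(np.fft.fft(shifted)), np.abs(X))

# ... but it does not separate orbits: rotate the phase of one coefficient.
Y = X.copy()
Y[1] *= 1j
y = np.fft.ifft(Y)
same_modulus_y = np.allclose(np.abs(np.fft.fft(y)), np.abs(X))
dist_to_orbit = min(np.linalg.norm(y - np.roll(x, s)) for s in range(n))
print(shift_is_modulation, same_modulus_shift, same_modulus_y, dist_to_orbit)
```

Here `y` has the same Fourier modulus as `x` but stays at a positive distance from every translate of `x`, so the modulus cannot be a complete invariant.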

4.1. The homogeneous case

In this subsection we will make a few remarks about the case when $F_{T}$ is homogeneous. We begin by showing that this only happens when $T=\omega I$ for some root of unity $\omega$ . Note that the case $\omega=-1$ which corresponds to real phase retrieval is a special case of this.

Proposition 4.5.

*$F_{T}$ is homogeneous if and only if $T=\omega I$ for some $m$th root of unity $\omega$.*

Proof.

If $T=\omega I$ then

[TABLE]

and so $F_{T}$ is homogeneous of degree $m$ .

Conversely, suppose $F_{T}$ is homogeneous of degree $m$ and $T=\mathrm{diag}(t_{1},...,t_{n})$. We readily see that each $t_{i}$ must be a primitive $m$th root of unity; if not, we would have at least one pair of monomials of the form $x_{i}^{m_{i}}$ and $x_{j}^{m_{j}}$ with $m_{i}\neq m_{j}$. Now observe that the monomial $x_{i}^{a}x_{j}^{b}$ must satisfy $a+b=m$. Then, since $t_{i}$ and $t_{j}$ are both primitive $m$th roots of unity, there is a $k$ so that $t_{i}=t_{j}^{k}$. Now by the division algorithm and the minimality of $a$ we see that $a=m\bmod k$ and $a+kb=m$. This implies $k=1$ and therefore $t_{i}=t_{j}=\omega$. ∎

Note that in the proof of Theorem 3.1 we needed to introduce a new variable to homogenize the map $F_{T}$; however, when $T=\omega I$ this is not necessary. This means we can slightly improve the conclusion of Theorem 1.1 as follows:

Theorem 4.6.

Let $\omega$ be an $m$ -th root of unity and let $\mathbb{Z}_{m}$ act on $\mathbb{C}^{n}$ via $T=\omega I$ . Then there exists a $\Phi:\mathbb{C}^{n}\to\mathbb{C}^{2n}$ satisfying the same conclusions as Theorem 4.1.

As we noted above we can arrange our monomials into an $n\times n$ matrix. In this case one such matrix would be

[TABLE]

where $x^{m-1}$ denotes the vector whose components are the components of $x$ raised to the power $m-1$. Note that this is a rank one matrix. This means that, just as in the case of phase retrieval, any linear map $\ell$ with the property that every nonzero matrix in $\ker(\ell)$ has rank at least 3 will satisfy the conclusion of Theorem 4.6. Therefore we can use such an $\ell$ to obtain the $\tilde{\Phi}$ in Theorem 4.1. However, the set of $n\times n$ matrices of rank at most two is a determinantal variety and is well known to have dimension $4n-4$, so by the Projective Dimension Theorem any linear map whose kernel avoids this variety must have rank at least $4n-4$, whereas Theorem 3.1 says a generic linear map of rank $2n$ will yield the desired conclusion. This is because we only need $\ker(\ell)$ to avoid rank two matrices of the form $x^{m-1}x^{T}-y^{m-1}y^{T}$, which form a much smaller subvariety.
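The rank statements in this paragraph are easy to check numerically. The sketch below reads the matrix as the outer product $x^{m-1}x^{T}$ (no conjugation), as in the expression $x^{m-1}x^{T}-y^{m-1}y^{T}$ above; the value $m=5$, the test vectors, and the helper name `monomial_matrix` are hypothetical choices made for illustration.

```python
import numpy as np

m = 5
w = np.exp(2j * np.pi / m)               # T = w * I with w**m == 1
x = np.array([1.0 + 0.5j, -2.0, 0.3j])
y = np.array([0.2, 1.0 - 1.0j, 4.0])

def monomial_matrix(v, m):
    # Entry (i, j) is v_i**(m-1) * v_j, i.e. the outer product v^{m-1} v^T.
    return np.outer(v ** (m - 1), v)

M = monomial_matrix(x, m)
rank_single = np.linalg.matrix_rank(M)
rank_difference = np.linalg.matrix_rank(M - monomial_matrix(y, m))

# Every entry has total degree m, so the matrix is invariant under x -> w*x.
invariant = np.allclose(monomial_matrix(w * x, m), M)
print(rank_single, rank_difference, invariant)
```

A difference of two such matrices has rank at most two, which is why only this special family of low-rank matrices has to be avoided by $\ker(\ell)$.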

4.2. No lower Lipschitz bounds

A natural question is whether or not the map $\Phi$ satisfies a lower Lipschitz bound. We show that we cannot expect this to be the case in general. The next proposition shows that $\Phi^{-1}$ is never Lipschitz when $G$ is cyclic and $n,|G|\geq 3.$

Proposition 4.7.

Let $m,n\geq 3.$ Let $T$ be a representation of $\mathbb{Z}_{m}$ acting on $\mathbb{C}^{n}.$ Then

[TABLE]

Proof.

Recall the definition of the map $F_{T}$ (2.3). Without loss of generality we may assume that $m_{i}\geq 2$ for all $i=1,\ldots,n.$ Since $m,n\geq 3,$ we see that there is a pair $(i,j)$ with $\max\{a_{ij},b_{ij}\}\geq 2.$ Again, without loss of generality, we assume $(1,2)$ is such a pair and that $a_{12}\geq 2.$

Let $x=(0,1,0,0,\ldots,0)$ and for $\varepsilon>0$ small consider the points $x_{\varepsilon}=(\varepsilon,\sqrt{1-\varepsilon^{2}},0,\ldots,0).$ We know that for sufficiently small $\varepsilon,$ $d_{\mathbb{Z}_{m}}([x_{\varepsilon}],[x])=\|x_{\varepsilon}-x\|,$ and therefore

[TABLE]

On the other hand, note that $\Phi$ is equal to $\ell\circ F_{T}$ on the unit sphere by definition. Also, any nonzero component of $\ell\circ F_{T}(x_{\varepsilon})$ is a linear combination of $x_{1}^{m_{1}},x_{1}^{a_{12}}x_{2}^{b_{12}},x_{2}^{m_{2}}.$ If $k$ is any such component, then

[TABLE]

which is $\mathcal{O}(\varepsilon^{2})$ because $m_{1},a_{12}\geq 2.$ This implies that

[TABLE]

The conclusion follows from (4.4) and (4.5).

∎
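The mechanism of the proof can be observed numerically: the input distance is of order $\varepsilon$ while the output distance is of order $\varepsilon^{2}$, so their ratio tends to zero. The sketch below is a hedged illustration only. It uses a hypothetical family of monomials for $\mathbb{Z}_{3}$ acting as $\omega I$ on $\mathbb{C}^{3}$ with $a_{12}=2$, takes $\ell$ to be the identity, and relies on $\Phi$ agreeing with the monomial map on the unit sphere; the names `F_example` and `ratio` are not from the paper.

```python
import numpy as np

def F_example(x):
    # Hypothetical monomials for Z_3 acting as w*I on C^3 (w**3 == 1):
    # one pure power per coordinate and one mixed monomial x_i**2 * x_j per pair.
    x1, x2, x3 = x
    return np.array([x1**3, x2**3, x3**3, x1**2 * x2, x1**2 * x3, x2**2 * x3])

def ratio(eps):
    x = np.array([0.0, 1.0, 0.0])
    x_eps = np.array([eps, np.sqrt(1.0 - eps**2), 0.0])
    # Both points lie on the unit sphere, where Phi coincides with F_example
    # (l is taken to be the identity for simplicity). For small eps the
    # quotient distance equals the Euclidean distance ||x_eps - x||.
    return np.linalg.norm(F_example(x_eps) - F_example(x)) / np.linalg.norm(x_eps - x)

ratios = [ratio(eps) for eps in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratios shrink roughly in proportion to $\varepsilon$, so no positive lower Lipschitz constant can hold near $x$ for this family.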

To illustrate the previous proposition, consider Example 5.2.1 in [14]: there $\mathbb{Z}_{12}$ acts on $\mathbb{C}^{5}$ and one has the explicit $\ell\circ F_{T}:\mathbb{C}^{5}\to\mathbb{C}^{8}$ given by:

[TABLE]

In this example we can take $x=(0,0,0,1,0)$ and $x_{\varepsilon}=(0,0,0,\sqrt{1-\varepsilon^{2}},\varepsilon).$

Although $\tilde{\Phi}^{-1}$ is not Lipschitz, it is always continuous.

Proposition 4.8.

Under the same hypotheses as in Proposition 3.5, the map $\tilde{\Phi}_{F}^{-1}$ is continuous.

Proof.

Let $x\in\mathbb{C}^{n}$ and $(x_{k})_{k\in\mathbb{N}}\subseteq\mathbb{C}^{n}$ be such that $\Phi_{F}(x_{k})\to\Phi_{F}(x)$ as $k\to\infty.$ Because $F$ is bounded away from zero on the sphere, we see that $\|x_{k}\|$ is bounded. Therefore, up to a subsequence (not relabelled), $x_{k}$ converges to some $y\in\mathbb{C}^{n}.$ By continuity of $\Phi_{F},$ we see that $\Phi_{F}(y)=\Phi_{F}(x),$ which then implies $y\sim x.$ We have shown that every subsequence of $([x_{k}])_{k\in\mathbb{N}}$ possesses a subsequence converging to $[x],$ thus the whole sequence converges to $[x].$

∎

5. Conclusions, open questions and final remarks

As can be seen from our analysis, effective $G$-invariant representations can be built in finite dimensions by exploiting underlying algebraic and geometric properties of polynomial invariants. Though our transforms are general enough to cover many problems of interest, there are still some natural open questions regarding the construction of complete measurements in the finite dimensional setting.

A first question has to do with the use of polynomial invariants. In the end our transform does not preserve any of the features or the algebraic structure of the maps leading up to it, so it would be very enlightening to explore ways to bypass the use of the invariant polynomials, and to appeal to the geometric non-parallel property to produce the final transform. A more analytical and more flexible approach would probably yield embeddings into even lower dimensions with much more explicit controls.

About the dimension reduction as we perform it here, it is certainly worth studying how to make a more constructive choice of the linear map $\ell$ than that provided by Theorem 3.1. Even though our maps do not have a lower Lipschitz bound, the choice of $\ell$ should play a significant role in any computation of the inverse map. More specifically, since we know that $\mathrm{ker}(\ell)$ must avoid nonzero vectors that are differences of elements of $\mathrm{Im}(F_{T})$ in order for $\tilde{\Phi}$ to be injective, it seems intuitively clear that having these vectors “bounded away” from $\mathrm{ker}(\ell)$ in some sense should provide numerical stability. There are a variety of ways we could define “bounded away” in this context. We expect properties similar to the nullspace property [13] and the restricted isometry property [9] to be of use here.

Another question that has to do with the algebraic approach employed here is how to treat more general group actions. Our assumption that $G$ is a finite group is essential for the separating polynomials to be actually separating in the sense that they discriminate orbits (see section 2.1 for more details). Also, another assumption that one should try to do without is that the action is unitary. This assumption is certainly convenient due to the handy representation and particular set of separating monomials, but it also plays a role in the rest of the construction and it affects the Lipschitz bound in a significant way because without it $\Phi$ would not be invariant. Thus, it should be clear that a different perspective is needed to treat more general situations.

One salient desired property that is absent in our construction is a quantitative control of the injectivity of $\Phi$; for example, our map does not come with a lower Lipschitz bound. Of course, this could be a matter of our use of polynomial invariants, but it could be that a much more delicate problem is at hand. We believe that even if complete sets of measurements with a lower Lipschitz constant could be constructed using other types of invariants, the bounds obtained could be very bad and would probably go to $0$ as $n\to\infty.$ This is known to be the case in phase retrieval [8].

Finally, one of the motivations behind this work was to provide an alternative to the scattering transform of Mallat [19] and the strategy preferred in the works [6, 5, 21, 18, 15, 10] based on neural networks; our main goal was to provide an approach better adapted to deal with finite dimensional problems. In [19] the transform obtained is non expansive, that is the Lipschitz constant is equal to $1.$ This is equivalent to, in our setting, having a bound independent of the dimension. We note that in principle, we could obtain non expansive maps simply by normalizing $\Phi$ (which depends on $n$ ) by the corresponding Lipschitz constant (which also depends on $n$ ). The challenge becomes then to understand the possible limits of these normalized transforms as $n\to\infty$ and how they relate to the scattering transform. The authors anticipate studying some of these problems in the future.

Acknowledgements. The work of A. Contreras was partially supported by a grant from the Simons Foundation (#426318).

Bibliography


[1] Radu Balan, Pete Casazza, and Dan Edidin. On signal reconstruction without phase. Applied and Computational Harmonic Analysis, 20(3):345–356, 2006.
[2] Radu Balan and Dongmian Zou. On Lipschitz analysis and Lipschitz synthesis for the phase retrieval problem. Linear Algebra and its Applications, 496:152–181, 2016.
[3] Afonso S Bandeira, Ben Blum-Smith, Joe Kileel, Amelia Perry, Jonathan Weed, and Alexander S Wein. Estimation under group actions: recovering orbits from invariants. arXiv preprint arXiv:1712.10163, 2017.
[4] Afonso S Bandeira, Jameson Cahill, Dustin G Mixon, and Aaron A Nelson. Saving phase: Injectivity and stability for phase retrieval. Applied and Computational Harmonic Analysis, 37(1):106–125, 2014.
[5] Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. Geometric deep learning: going beyond Euclidean data. IEEE Signal Processing Magazine, 34(4):18–42, 2017.
[6] Joan Bruna and Stéphane Mallat. Invariant scattering convolution networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1872–1886, 2013.
[7] Dmitri Burago and Yuri Burago. A course in metric geometry.
[8] Jameson Cahill, Peter Casazza, and Ingrid Daubechies. Phase retrieval in infinite-dimensional Hilbert spaces. Transactions of the American Mathematical Society, Series B, 3(3):63–76, 2016.
