Phase Transition in the One-bit Johnson-Lindenstrauss Lemma

Amadou Bah; Bryson Kagy; Emily Smith

arXiv:1903.02123·math.FA·March 7, 2019

TL;DR

This paper demonstrates a phase transition phenomenon in one-bit Johnson-Lindenstrauss embeddings, showing that a small increase in dimension m sharply increases the probability of a successful embedding, with bounds similar to the linear case.

Contribution

It establishes a phase transition in the probability of one-bit JL embeddings being RIP, matching bounds known for linear JL, and analyzes this using properties of Bernoulli variables.

Findings

01. Phase transition in embedding success probability with respect to m.

02. Bounds on m similar to linear JL Lemma.

03. Probabilistic analysis using Bernoulli variables.


Equations

$$m\geq 4\frac{\ln n}{\frac{\delta^{2}}{2}-\frac{\delta^{3}}{3}},$$

$$(1-\delta)\|x-y\|^{2}\leq\|f(x)-f(y)\|^{2}\leq(1+\delta)\|x-y\|^{2}$$

$$d_{\mathbb{H}_{m}}(x,y)=\frac{1}{m}\#\{i:x_{i}\neq y_{i}\}.$$

$$d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)=\frac{1}{m}\sum_{i=1}^{m}\frac{|\textup{sgn}(x\cdot\theta_{i})-\textup{sgn}(y\cdot\theta_{i})|}{2}.$$

$$d_{geo}(x,y)=\frac{\cos^{-1}(x\cdot y)}{\pi}.$$

$$d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)=\frac{1}{m}\sum_{j=1}^{m}\mathbf{1}_{\textup{Wedge}_{xy}}(\theta_{j}).$$

$$d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-d_{geo}(x,y)=\frac{1}{m}\sum_{j=1}^{m}\mathbf{1}_{\textup{Wedge}_{xy}}(\theta_{j})-d_{geo}(x,y).$$

$$|d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-d_{geo}(x,y)|\leq\delta.$$

$$d_{TV}(W,\textrm{Poi}(\lambda))\leq\min\left(1,\frac{1}{\lambda}\right)\left(\textrm{Var}(W)-\lambda+2\sum_{i\in I}P_{i}^{2}\right)$$

$$d_{TV}(W,\textrm{Poi}(\lambda))\leq\min\left(1,\frac{1}{\lambda}\right)\sum_{i\in I}\left[P_{i}^{2}+\sum_{j\in\mathscr{N}_{i}}\left(P_{i}P_{j}+\mathbb{E}(X_{i}X_{j})\right)\right]$$

$$m\geq\frac{\ln\frac{n^{2}}{2\epsilon}}{\ln\frac{1}{\delta}}.$$

$$m\geq 2\log_{2}n+\log_{2}\frac{1}{2\epsilon}.$$

$$\mathbb{P}(\textbf{B}_{m}\textrm{ is not one-to-one})\leq\sum_{x\backsim y}\mathbb{P}(\textbf{B}_{m}x=\textbf{B}_{m}y\textrm{ with }x\neq y)\leq\binom{n}{2}\delta^{m}.$$

$$m\geq\frac{\ln\frac{n^{2}}{2\epsilon}}{\ln\frac{1}{\delta}}.$$

$$m\geq 2\log_{2}n+\log_{2}\frac{1}{2\epsilon}.$$

$$\log_{2}\frac{n(n-1)}{2\ln\frac{1}{1-\frac{\epsilon_{1}}{1.01}}}\leq m$$

$$m\leq\log_{2}\frac{n(n-1)}{2\ln\frac{1}{1-\frac{\epsilon_{2}}{0.99}}}.$$

$$P_{\textrm{1-1}}(m)=\mathbb{P}(\textbf{B}_{m}\textrm{ is one-to-one})\in\left[e^{-\frac{\binom{n}{2}}{2^{m}}}-\binom{n}{2}\cdot 2^{-2m},\;e^{-\frac{\binom{n}{2}}{2^{m}}}+\binom{n}{2}\cdot 2^{-2m}\right].$$

$$W_{x\backsim y}=\begin{cases}1,&\textbf{B}_{m}x=\textbf{B}_{m}y\\0,&\textrm{otherwise.}\end{cases}$$

$$d_{TV}(W,\textrm{Poi}(\lambda))\leq\eta$$

$$\eta=\min\left(1,\frac{1}{\lambda}\right)\left[\textrm{Var}(W)-\lambda+2\sum_{x\backsim y}p^{2}\right].$$

$$\textrm{Var}(W)=\lambda-\sum_{x\backsim y}p^{2}.$$

$$\textrm{Var}(W)$$

$$1-\mathbb{P}(W\geq 1)\in\left[e^{-\frac{\binom{n}{2}}{2^{m}}}-\eta,\;e^{-\frac{\binom{n}{2}}{2^{m}}}+\eta\right].$$

$$\eta$$

$$|\mathbb{P}(W\geq 1)-\mathbb{P}(\textrm{Poi}(\lambda)\geq 1)|\leq\eta$$

$$1-\mathbb{P}(W\geq 1)\in\left[e^{-\frac{\binom{n}{2}}{2^{m}}}-\eta,\;e^{-\frac{\binom{n}{2}}{2^{m}}}+\eta\right].$$

$$1-e^{-\frac{\binom{n}{2}}{2^{m}}}+\eta$$

$$1-e^{-\frac{\binom{n}{2}}{2^{m}}}-\eta$$

$$\frac{1}{2}\cdot\frac{\binom{n}{2}}{2^{m}}\leq\frac{\binom{n}{2}}{2^{m}}-\frac{1}{2}\cdot\frac{\binom{n}{2}^{2}}{2^{2m}}.$$

Taxonomy

Topics: Topological and Geometric Data Analysis · Computational Geometry and Mesh Generation · Theoretical and Computational Physics

Full text

Phase Transition in the One-bit Johnson-Lindenstrauss Lemma

Amadou Bah, Bryson Kagy, and Emily Smith

Abstract.

The Johnson-Lindenstrauss Lemma (J-L Lemma) is a cornerstone of dimension reduction techniques. We study it in the one-bit context: we consider the unit sphere $\mathbb{S}^{N-1}$, with normalized geodesic metric, and map a finite set $\mathbf{X}\subset\mathbb{S}^{N-1}$ of cardinality $n$ into the Hamming cube $\mathbb{H}_{m}=\{0,1\}^{m}$, with normalized Hamming metric. We find that for $0<\delta<1$ and $m>\frac{\ln n}{2\delta^{2}}$ there is a $\delta$-RIP from $\mathbf{X}$ into $\mathbb{H}_{m}$. This is surprising, as the value of $m$ is virtually identical to the best known bound in the linear J-L Lemma. In both the linear and one-bit cases, the maps are randomly constructed. We show that the probability of $B_{m}$ being a $\delta$-RIP satisfies a phase transition: it passes from nearly zero to nearly one with a very small change in $m$. Our proof relies on delicate properties of Bernoulli random variables.

Research conducted during an REU sponsored by an NSF MCTP grant to the Georgia Institute of Technology.

1. Introduction

Compressive sensing was first introduced as a practical application of signal processing and has since proven useful in many areas, including MRI scanning, cell phone imaging, and electron microscopy [lustig2007sparse, fornasier2011compressive, binev2012compressed]. Johnson and Lindenstrauss [johnson1984extensions] showed that, given a very high dimensional data set in $\mathbb{R}^{N}$, it is possible, with little sacrifice, to map vectors from a finite subset of this $N$-dimensional space to a much lower, $m$-dimensional space. Recently, Alon and Klartag [2016arXiv161000239A] studied the minimum number of bits required to maintain the Euclidean distance between data points. Our results differ in that we maintain the geodesic distance between points. The non-linear geodesic metric is basic to our considerations.

Dasgupta and Gupta [dasgupta1999elementary] provide the best quantitative bounds in the J-L Lemma. For any $0<\delta<1$ and any integer $n$, let

$$m\geq 4\frac{\ln n}{\frac{\delta^{2}}{2}-\frac{\delta^{3}}{3}},$$

then for any set of $n$ points in $\mathbb{R}^{N}$ there exists a map $f:\mathbb{R}^{N}\rightarrow\mathbb{R}^{m}$ such that for all $x,y$ in the set we have:

$$(1-\delta)\|x-y\|^{2}\leq\|f(x)-f(y)\|^{2}\leq(1+\delta)\|x-y\|^{2}.$$

For comparison below, we remark that $\sqrt{1\pm\delta}\simeq 1\pm\delta/2$ .

We study the one-bit context. Consider the unit sphere $\mathbb{S}^{N-1}$ with the normalized geodesic metric. We map a finite set $\mathbf{X}\subset\mathbb{S}^{N-1}$ into the $m$-dimensional Hamming cube $\mathbb{H}_{m}$, with normalized Hamming metric. Our main result is the following: for $0<\delta<1$ and integer $n$, let $m>2\frac{\ln n}{\delta^{2}}$. Then for any set $\mathbf{X}\subset\mathbb{S}^{N-1}$ of cardinality $n$, there is a $\delta$-RIP from $\mathbf{X}$ into $\mathbb{H}_{m}$. The counter-intuitive fact is that our bound for $m$ is *virtually identical to the one that holds for the linear J-L Lemma.*

We prove the One-Bit J-L Lemma in §5. The simpler property of our random one-bit map being one-to-one is studied in §3. For special choices of $\mathbf{X}$, we make a finer analysis of the one-to-one and RIP properties. They satisfy phase transitions that depend only weakly on the number of points we are mapping; see §4 and §6. Some background information is recalled in §2.

2. Background

We formalize below several definitions we will use throughout the paper.

Hamming Cube

$\mathbb{H}_{m}=\{0,1\}^{m}$. Each $x\in\mathbb{H}_{m}$ is written $x=x_{1}x_{2}\ldots x_{m}$ where $x_{i}\in\{0,1\}$. For all $x,y\in\mathbb{H}_{m}$ we have the normalized metric

$$d_{\mathbb{H}_{m}}(x,y)=\frac{1}{m}\#\{i:x_{i}\neq y_{i}\}.$$

Random $m$ -dimensional One-Bit Map

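Let $\{\theta_{j}\;:\;1\leq j\leq m\}$ be i.i.d. uniformly distributed random vectors in $\mathbb{S}^{N-1}$. Define a map $\textbf{B}_{m}\;:\;\mathbb{S}^{N-1}\to\mathbb{H}_{m}$ by $\textbf{B}_{m}x=\{\textup{sgn}(x\cdot\theta_{j})\}_{j=1}^{m}$. Observe that

$$d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)=\frac{1}{m}\sum_{i=1}^{m}\frac{|\textup{sgn}(x\cdot\theta_{i})-\textup{sgn}(y\cdot\theta_{i})|}{2}.$$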

Geodesic Distance

Fix $x$, $y$ on $\mathbb{S}^{N-1}$. The geodesic distance $d_{geo}(x,y)$ is the length of the shortest path between the points $x$ and $y$ on the surface of the sphere, normalized by $\pi$. It is given by

$$d_{geo}(x,y)=\frac{\cos^{-1}(x\cdot y)}{\pi}.$$

Antipodal points are normalized to be distance one apart. Geodesic distance has this probabilistic interpretation: Let ${\textup{Wedge}}_{xy}=\{\theta\in\mathbb{S}^{N-1}:\textup{sgn}(x\cdot\theta)\neq\textup{sgn}(y\cdot\theta)\}.$ These are the $\theta$ which distinguish between $x$ and $y$ under the one-bit map. Selecting $\theta\in\mathbb{S}^{N-1}$ at random, the probability of being in ${\textup{Wedge}}_{xy}$ is $d_{geo}(x,y).$ This is an instance of the Crofton formula.

For the distance in (2.0.1), we then have

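$$d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)=\frac{1}{m}\sum_{j=1}^{m}\mathbf{1}_{\textup{Wedge}_{xy}}(\theta_{j}).$$

The right hand side is an average of Bernoulli random variables. In particular, the difference between the Hamming and geodesic metrics is

$$d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-d_{geo}(x,y)=\frac{1}{m}\sum_{j=1}^{m}\mathbf{1}_{\textup{Wedge}_{xy}}(\theta_{j})-d_{geo}(x,y).$$

Standard deviation inequalities for Bernoulli random variables apply to the right hand side above.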
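As an illustration of the definitions in this section, the following is a minimal Python sketch that is not taken from the paper. It assumes the $\theta_{j}$ are sampled by normalizing standard Gaussian vectors, which yields the uniform distribution on the sphere, and it encodes a non-negative sign as the bit 1. All function names are hypothetical.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Uniform point on the unit sphere, obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def one_bit_map(x, thetas):
    """One bit per theta: 1 if x . theta >= 0, else 0 (assumed encoding of sgn)."""
    return [1 if sum(a * b for a, b in zip(x, t)) >= 0 else 0 for t in thetas]


def hamming_distance(u, v):
    """Normalized Hamming distance: fraction of coordinates that differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def geodesic_distance(x, y):
    """Normalized geodesic distance arccos(x . y) / pi on the unit sphere."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot))) / math.pi


rng = random.Random(0)
dim, m = 5, 4000
thetas = [random_unit_vector(dim, rng) for _ in range(m)]
x = [1.0, 0.0, 0.0, 0.0, 0.0]  # two orthogonal points, geodesic distance 1/2
y = [0.0, 1.0, 0.0, 0.0, 0.0]
d_ham = hamming_distance(one_bit_map(x, thetas), one_bit_map(y, thetas))
d_geo = geodesic_distance(x, y)
print(d_ham, d_geo)
```

For large $m$ the two printed values should be close, since the Hamming distance is an average of $m$ Bernoulli random variables with parameter $d_{geo}(x,y)$.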

The Restricted Isometry Property

$\textbf{B}_{m}:\textbf{X}\to\mathbb{H}_{m}$ has the $\delta$ -RIP if for all pairs $x,y\in\textbf{X}$ :

$$|d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-d_{geo}(x,y)|\leq\delta.$$

Positively Associated Stein-Chen Approximation

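Roughly speaking, random variables are positively associated when their covariances are nonnegative, meaning they tend to increase or decrease together. In this case,

$$d_{TV}(W,\textrm{Poi}(\lambda))\leq\min\left(1,\frac{1}{\lambda}\right)\left(\textrm{Var}(W)-\lambda+2\sum_{i\in I}P_{i}^{2}\right)$$

where $W$ is a sum of positively associated Bernoullis with parameters $P_{i}$, $\lambda=\mathbb{E}(W)$, and $d_{TV}$ is the total variation distance.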

General Form of Stein-Chen Approximation

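In general [arratia1990poisson],

$$d_{TV}(W,\textrm{Poi}(\lambda))\leq\min\left(1,\frac{1}{\lambda}\right)\sum_{i\in I}\left[P_{i}^{2}+\sum_{j\in\mathscr{N}_{i}}\left(P_{i}P_{j}+\mathbb{E}(X_{i}X_{j})\right)\right]$$

where the $X_{i}$ are Bernoullis with parameters $P_{i}$, $W$ is the sum of all $X_{i}$, $\lambda=\mathbb{E}(W)$, $\mathscr{N}_{i}$ is the set of indices $j\neq i$ such that $X_{j}$ depends on $X_{i}$, and $d_{TV}$ is the total variation distance.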
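To get a feel for the size of such bounds, the sketch below (an illustration added here, not part of the paper) treats the simplest case of independent Bernoullis with a common parameter $p$, where every dependency neighborhood is empty and the bound reduces to $\min(1,\frac{1}{\lambda})\,np^{2}$. It assumes $d_{TV}$ is the supremum over events of the difference in probabilities, that is, half the $\ell^{1}$ distance between the mass functions. The function names are hypothetical, and the direct use of factorials limits it to small $n$.

```python
import math


def binomial_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)


def tv_binomial_poisson(n, p):
    """Total variation distance between Bin(n, p) and Poi(n p), as half the l1 distance."""
    lam = n * p
    head = sum(abs(binomial_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(n + 1))
    # Poisson mass above n, where the binomial has no mass.
    tail = max(0.0, 1.0 - sum(poisson_pmf(lam, k) for k in range(n + 1)))
    return 0.5 * (head + tail)


def stein_chen_bound_independent(n, p):
    """min(1, 1/lambda) * sum of p_i^2 for n independent Bernoulli(p) variables."""
    lam = n * p
    return min(1.0, 1.0 / lam) * n * p**2


n, p = 50, 0.02
tv = tv_binomial_poisson(n, p)
bound = stein_chen_bound_independent(n, p)
print(tv, bound)
```

The computed distance should not exceed the bound; the dependent sums used later in the paper add neighborhood terms on top of this baseline.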

3. A One-to-One Mapping From the Unit Sphere to the Hamming Cube

We start by analyzing a simpler property: that $\textbf{B}_{m}$ is one-to-one.

Theorem 3.1.

Let $0<\delta,\epsilon<1$, and let $\textbf{X}\subset\mathbb{S}^{N-1}$ be a subset of $n$ points with $d_{geo}(x,y)\geq 1-\delta$ for all $x\neq y\in\textbf{X}$. The random $m$-dimensional one-bit map $\textbf{B}_{m}:\textbf{X}\to\mathbb{H}_{m}$ will be one-to-one with probability at least $1-\epsilon$ provided that

$$m\geq\frac{\ln\frac{n^{2}}{2\epsilon}}{\ln\frac{1}{\delta}}.$$

In the special case when the points of $\textbf{X}$ are pairwise orthogonal, $d_{geo}(x,y)=\frac{1}{2}$ and the condition becomes

$$m\geq 2\log_{2}n+\log_{2}\frac{1}{2\epsilon}.$$

By the pigeonhole principle, $m$ must be at least $\log_{2}n$. Our result shows that if $m>2\log_{2}n$, then the random $m$-dimensional one-bit map is one-to-one with high probability.

Proof.

By the union bound, we know that:

$$\mathbb{P}(\textbf{B}_{m}\textrm{ is not one-to-one})\leq\sum_{x\backsim y}\mathbb{P}(\textbf{B}_{m}x=\textbf{B}_{m}y\textrm{ with }x\neq y)\leq\binom{n}{2}\delta^{m}.$$

In this expression, $\displaystyle\sum_{x\backsim y}$ means the sum over all unordered pairs $x\backsim y$ where $x\neq y$. There are $\binom{n}{2}$ such pairs $x\backsim y\in\textbf{X}$. The $i$th coordinates of $\textbf{B}_{m}x$ and $\textbf{B}_{m}y$ are equal with probability at most $\delta$. The coordinates are independent, hence the inequality above. We require $\binom{n}{2}\cdot\delta^{m}\leq\epsilon$, which is true if

$$m\geq\frac{\ln\frac{n^{2}}{2\epsilon}}{\ln\frac{1}{\delta}}.$$

This condition is sufficient for $\textbf{B}_{m}$ to be one-to-one with probability at least $1-\epsilon$. In the special case when $\textbf{X}$ consists of pairwise orthogonal vectors, $d_{geo}(x,y)=\frac{1}{2}$, the bound is

$$m\geq 2\log_{2}n+\log_{2}\frac{1}{2\epsilon}.$$

∎
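As a worked instance of Theorem 3.1, the following short Python sketch (an illustration, with hypothetical function names) evaluates the sufficient number of bits $m\geq\ln\frac{n^{2}}{2\epsilon}/\ln\frac{1}{\delta}$ and the union bound $\binom{n}{2}\delta^{m}$ it controls. The default $\delta=\frac{1}{2}$ corresponds to pairwise orthogonal points.

```python
import math


def min_bits_one_to_one(n, eps, delta=0.5):
    """Smallest integer m with m >= ln(n^2 / (2 eps)) / ln(1 / delta)."""
    return math.ceil(math.log(n * n / (2.0 * eps)) / math.log(1.0 / delta))


def union_bound_failure(n, m, delta=0.5):
    """Union bound C(n, 2) * delta^m on the probability that B_m is not one-to-one."""
    return math.comb(n, 2) * delta**m


m = min_bits_one_to_one(1000, 0.01)
print(m, union_bound_failure(1000, m))
```

For $n=1000$ orthogonal points and $\epsilon=0.01$ this gives $m=26$, compared with the pigeonhole lower bound $\log_{2}1000\approx 10$.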

4. A Phase Transition in One-to-One Property

For a special class of $\textbf{X}$, we analyze the property of $\textbf{B}_{m}$ being one-to-one. We show that the probability passes through a phase transition, and that the width of the phase transition is essentially independent of the cardinality of $\textbf{X}$.

Theorem 4.1.

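Fix $0<\epsilon_{2}<\epsilon_{1}<1$. Let $\textbf{X}$ be $n$ pairwise orthogonal vectors in $\mathbb{S}^{N-1}$, and let $P_{\textrm{1-1}}(m)$ be the probability that $\textbf{B}_{m}$ is one-to-one. Then for $n\geq 10$, $1-\epsilon_{1}<P_{\textrm{1-1}}(m)$ when:

$$\log_{2}\frac{n(n-1)}{2\ln\frac{1}{1-\frac{\epsilon_{1}}{1.01}}}\leq m$$

and $P_{\textrm{1-1}}(m)<1-\epsilon_{2}$ when

$$m\leq\log_{2}\frac{n(n-1)}{2\ln\frac{1}{1-\frac{\epsilon_{2}}{0.99}}}.$$

Additionally, the phase transition is bounded as follows:

$$P_{\textrm{1-1}}(m)=\mathbb{P}(\textbf{B}_{m}\textrm{ is one-to-one})\in\left[e^{-\frac{\binom{n}{2}}{2^{m}}}-\binom{n}{2}\cdot 2^{-2m},\;e^{-\frac{\binom{n}{2}}{2^{m}}}+\binom{n}{2}\cdot 2^{-2m}\right].$$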

We will analyze this from the perspective of the birthday problem. To do this, we will count all pairs of points that $\textbf{B}_{m}$ maps to the same point in the Hamming cube. Namely, for $x,y\in\textbf{X}$, let

$$W_{x\backsim y}=\begin{cases}1,&\textbf{B}_{m}x=\textbf{B}_{m}y\\0,&\textrm{otherwise.}\end{cases}$$

All $W_{x\backsim y}$ are identically distributed Bernoulli random variables with parameter $p=\frac{1}{2^{m}}$, and $W=\displaystyle\sum_{x\backsim y}W_{x\backsim y}$ is a sum of positively associated Bernoulli random variables. By the Stein-Chen approximation, $W$ is close to a Poisson distribution in total variation, denoted $d_{TV}$ below. We make this precise:

$$d_{TV}(W,\textrm{Poi}(\lambda))\leq\eta$$

where

$$\eta=\min\left(1,\frac{1}{\lambda}\right)\left[\textrm{Var}(W)-\lambda+2\sum_{x\backsim y}p^{2}\right].$$

Lemma 4.2.

We claim

$$\textrm{Var}(W)=\lambda-\sum_{x\backsim y}p^{2}.$$

Proof.

If the $W_{x\backsim y}$ have the property $\mathbb{E}[W_{x\backsim y}W_{r\backsim s}]=\mathbb{E}[W_{x\backsim y}]\mathbb{E}[W_{r\backsim s}]$, then, being Bernoulli random variables, the $W_{x\backsim y}$ are pairwise independent. Assuming this, variance adds, and $Var(W)$ can be calculated:

$$\textrm{Var}(W)=\sum_{x\backsim y}\textrm{Var}(W_{x\backsim y})=\sum_{x\backsim y}(p-p^{2})=\lambda-\sum_{x\backsim y}p^{2}.$$

It remains to prove that $\mathbb{E}[W_{x\backsim y}W_{r\backsim s}]=\mathbb{E}[W_{x\backsim y}]\mathbb{E}[W_{r\backsim s}]$. It will be sufficient to show that $\mathbb{E}[W_{x\backsim y}W_{r\backsim s}]=p^{2}$. The only non-trivial case is when $x\backsim y$ and $r\backsim s$ share exactly one point. We will write this as $\mathbb{E}[W_{x\backsim y}W_{y\backsim s}]$ where $y$ is the shared point. $\mathbb{E}[W_{x\backsim y}W_{y\backsim s}]$ is equal to $\mathbb{P}(\textbf{B}_{m}x=\textbf{B}_{m}y=\textbf{B}_{m}s)$, which means we have three distinct points on the sphere mapping to the same point on the Hamming cube. For pairwise orthogonal $x,y,s$, the three signs $\textup{sgn}(x\cdot\theta_{j})$, $\textup{sgn}(y\cdot\theta_{j})$, $\textup{sgn}(s\cdot\theta_{j})$ are independent and each is equally likely to be $\pm 1$, so they all agree with probability $\frac{1}{4}$ in a given coordinate and with probability $4^{-m}=p^{2}$ in all $m$ coordinates. Thus $\mathbb{P}(\textbf{B}_{m}x=\textbf{B}_{m}y=\textbf{B}_{m}s)=p^{2}$, giving us that $\mathbb{E}[W_{x\backsim y}W_{r\backsim s}]=p^{2}$.

∎

Lemma 4.3.

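We claim that $\eta=\binom{n}{2}\cdot 2^{-2m}$, where $\eta$ is the Stein-Chen error term defined before Lemma 4.2.

Using this, we can bound $1-\mathbb{P}(W\geq 1)$ in the window

$$1-\mathbb{P}(W\geq 1)\in\left[e^{-\frac{\binom{n}{2}}{2^{m}}}-\eta,\;e^{-\frac{\binom{n}{2}}{2^{m}}}+\eta\right].$$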

Proof.

We can find an expression for $\eta$ using the variance of $W$ :

[TABLE]

We can bound $1-\mathbb{P}(W\geq 1)$ which is equal to $P_{\textrm{1-1}}(m)$ :

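$$|\mathbb{P}(W\geq 1)-\mathbb{P}(\textrm{Poi}(\lambda)\geq 1)|\leq\eta$$

$$1-\mathbb{P}(W\geq 1)\in\left[e^{-\frac{\binom{n}{2}}{2^{m}}}-\eta,\;e^{-\frac{\binom{n}{2}}{2^{m}}}+\eta\right].$$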

∎

Solving For m. Fix $0<\epsilon_{2}<\epsilon_{1}<1$, let $\textbf{X}$ be $n$ pairwise orthogonal vectors in $\mathbb{S}^{N-1}$, and let $P_{\textrm{1-1}}(m)$ be the probability that $\textbf{B}_{m}$ is one-to-one. Then $1-\epsilon_{1}<P_{\textrm{1-1}}(m)$ when:

$$\epsilon_{1}>1-e^{-\frac{\binom{n}{2}}{2^{m}}}+\eta$$

and $P_{\textrm{1-1}}(m)<1-\epsilon_{2}$ when:

$$\epsilon_{2}<1-e^{-\frac{\binom{n}{2}}{2^{m}}}-\eta.$$

In order to ensure that $\eta$ is very small compared to the Poisson term, we want $\eta\leq 0.01(1-e^{-\frac{\binom{n}{2}}{2^{m}}})$. If we fix $n$ and choose $m$ such that $\frac{\binom{n}{2}}{2^{m}}\leq 1$, observe the inequality

$$\frac{1}{2}\cdot\frac{\binom{n}{2}}{2^{m}}\leq\frac{\binom{n}{2}}{2^{m}}-\frac{1}{2}\cdot\frac{\binom{n}{2}^{2}}{2^{2m}}.$$

It is sufficient to bound $\eta$ as

[TABLE]

Manipulating this statement we get:

[TABLE]

Since we already assumed that $\frac{\binom{n}{2}}{2^{m}}\leq 1$ , we can rewrite this inequality as $\frac{1}{\binom{n}{2}}\leq 0.005$ and solve for $n$ : $n\geq 10.$ This means that if $n\geq 10$ , we have $\eta\leq 0.01(1-e^{-\frac{\binom{n}{2}}{2^{m}}})$ which allows us to rewrite our inequalities and gain bounds on $m$ :

[TABLE]
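The sharpness of the transition in $m$ can be seen numerically. The sketch below is an illustration added here, not the authors' computation. It assumes, as in the proof of Lemma 6.5, that for pairwise orthogonal points the sign bits are independent fair coins, so the images $\textbf{B}_{m}x$ are independent uniform points of $\mathbb{H}_{m}$ and the one-to-one probability is the classical birthday-problem product. It prints this product next to the Poisson value $e^{-\binom{n}{2}/2^{m}}$; it is a numerical comparison only and does not test the error term $\eta$. Function names are hypothetical.

```python
import math


def exact_one_to_one_probability(n, m):
    """Birthday-problem probability that n independent uniform m-bit strings are distinct."""
    cells = 2**m
    prob = 1.0
    for k in range(n):
        prob *= max(0.0, 1.0 - k / cells)
    return prob


def poisson_one_to_one_probability(n, m):
    """Poisson approximation exp(-lambda) with lambda = C(n, 2) / 2^m."""
    return math.exp(-math.comb(n, 2) / 2**m)


n = 1000
for m in range(12, 31, 3):
    print(m, exact_one_to_one_probability(n, m), poisson_one_to_one_probability(n, m))
```

With $n=1000$ the probability is essentially zero for small $m$ and close to one by $m=30$, with the change concentrated in a window of a few bits around $m\approx 2\log_{2}n$.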

5. A Union Bound for the Restricted Isometry Property

This is the One-Bit version of the Johnson-Lindenstrauss Lemma. In particular, the quantitative bound on $m$ below is nearly identical to the best known bound in the linear Johnson-Lindenstrauss Lemma.

Theorem 5.1.

Fix $0<\delta<\frac{1}{2}$ and $0<\epsilon<1$ , let X be $n$ pairwise orthogonal vectors in $\mathbb{S}^{N-1}$ . The random $m$ dimensional one-bit map, $\textbf{B}_{m}:\textbf{X}\to\mathbb{H}_{m}$ , satisfies the $\delta$ -RIP with probability at least $1-\epsilon$ when

[TABLE]

In the Restricted Isometry Property (RIP), we want to preserve the pairwise distances between the points so that for all pairs $x,y\in\textbf{X}$

[TABLE]

To ensure that we satisfy the $\delta$-RIP with probability at least $1-\epsilon$, we require the probability of failure to be at most $\epsilon$:

[TABLE]

Using the Union bound,

[TABLE]

We will first analyze the probability for one pair $x\backsim y$ , namely:

[TABLE]

Lemma 5.2.

We claim that $\mathbb{P}(|d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-d_{geo}(x,y)|\geq\delta)\leq 2e^{-2\delta^{2}m}$ for all pairs $x\backsim y.$

Proof.

The difference in the metrics is the average of the random variables:

[TABLE]

These are independent centered Bernoulli random variables that satisfy a large deviation inequality which is uniform in the Bernoulli parameter [hoeffding1963probability],

[TABLE]

∎

The Union Bound for the RIP: The last expression above provides a bound for the probability that one pair $x\backsim y$ fails the $\delta$ -RIP. Now, summing over all pairs,

[TABLE]

We can bound this probability with $\epsilon$ and solve for $m$ :

[TABLE]

For all $m$ greater than this bound, $\textbf{B}_{m}$ satisfies the $\delta$-RIP with probability at least $1-\epsilon.$
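As a numerical illustration of this argument, the sketch below combines the single-pair bound $2e^{-2\delta^{2}m}$ of Lemma 5.2 with the union bound over $\binom{n}{2}$ pairs and solves $\binom{n}{2}\cdot 2e^{-2\delta^{2}m}\leq\epsilon$ for $m$, which gives $m\geq\frac{1}{2\delta^{2}}\ln\frac{n(n-1)}{\epsilon}$. This is a derivation made here under those two ingredients; the constants in the paper's displayed bound may be arranged differently. Function names are hypothetical.

```python
import math


def pair_failure_bound(m, delta):
    """Hoeffding-type bound 2 exp(-2 delta^2 m) for a single pair (Lemma 5.2)."""
    return 2.0 * math.exp(-2.0 * delta**2 * m)


def min_bits_rip(n, eps, delta):
    """Smallest integer m with C(n, 2) * 2 exp(-2 delta^2 m) <= eps."""
    return math.ceil(math.log(n * (n - 1) / eps) / (2.0 * delta**2))


m = min_bits_rip(1000, 0.01, 0.1)
print(m, math.comb(1000, 2) * pair_failure_bound(m, 0.1))
```

For $n=1000$, $\epsilon=0.01$, and $\delta=0.1$ this gives $m=921$. The dependence on $n$ is only logarithmic, while the dependence on $\delta$ is quadratic.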

6. A Phase Transition for the Restricted Isometry Property

For special $\textbf{X}$, we analyze the property that $\textbf{B}_{m}$ is a $\delta$-RIP. Again, the size of the window depends only very weakly on $\lvert\textbf{X}\rvert$. This time the dependence is in terms of $\ln\ln\lvert\textbf{X}\rvert$.

Theorem 6.1.

Fix $0<\delta<\frac{1}{2}$ , fix $0<\epsilon_{2}<\epsilon_{1}<0.99$ . Let $\bf{X}$ be $n$ pairwise orthogonal vectors in $\mathbb{S}^{N-1}$ , and let $P_{\textrm{RIP}}(m)$ be the probability that $\textbf{B}_{m}$ satisfies $\delta-\textrm{RIP}$ . If $n\geq 800$ , then $1-\epsilon_{1}<P_{\textrm{RIP}}(m)$ when:

[TABLE]

and $P_{\textrm{RIP}}(m)<1-\epsilon_{2}$ when

[TABLE]

Additionally, the phase transition is bounded as follows:

[TABLE]

where $\lambda_{1}^{\delta}=\frac{n(n-1)}{2}\cdot\frac{e^{\frac{-1}{6}}}{\sqrt{2\pi m}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}$, $\lambda_{2}^{\delta}=\frac{n(n-1)}{2}\cdot\frac{e^{\frac{1}{12}}\sqrt{m}}{\sqrt{2\pi}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}$, and $\eta^{\delta}=\binom{n}{2}\left[(p^{\delta})^{2}+4(n-2)(p^{\delta})^{2}\right]$ where $p^{\delta}\leq\frac{e^{\frac{1}{12}}\sqrt{m}}{\sqrt{2\pi}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}.$

The graph below shows a simulation of the RIP property with $\delta=0.2$ . The red line is the bound (6.1.1), the green line is (6.1.3). The jagged blue line is the simulated value of the probability of $\textbf{B}_{m}$ being a $0.2$ -RIP. The line is jagged, due to the discrete nature of the Hamming metric. The latter fact is of course a complication implicit in our proof.

We will again analyze the phase transition from the perspective of the birthday problem. To do this we will count all $x\backsim y\in\textbf{X}$ that fail the RIP property, namely:

[TABLE]

Then $W^{\delta}=\displaystyle\sum_{x\backsim y}W^{\delta}_{x\backsim y}$ is a sum of Bernoulli random variables. All $W^{\delta}_{x\backsim y}$ are identically distributed with parameter $p^{\delta}=\mathbb{P}(|d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-d_{geo}(x,y)|\geq\delta)$. By the general form of the Stein-Chen approximation, $W^{\delta}$ is close to a Poisson distribution in total variation. We make this precise below:

[TABLE]

where

[TABLE]

Lemma 6.2.

We claim that $\lambda^{\delta}=\binom{n}{2}\mathbb{P}(|Y-\frac{m}{2}|>m\delta)$ where $Y$ is $\textrm{Bin}(m,\frac{1}{2})$ .

Proof.

We know that $\lambda^{\delta}=\binom{n}{2}p^{\delta}.$ In this special case, the geodesic distance between any two distinct points of $\textbf{X}$ is $\frac{1}{2},$ which reduces $p^{\delta}$ to:

[TABLE]

For each $i$, $\frac{|\textup{sgn}(x\cdot\theta_{i})-\textup{sgn}(y\cdot\theta_{i})|}{2}$ is Bernoulli with parameter $d_{geo}(x,y)=\frac{1}{2}.$ The $\theta_{i}$ are independent, so $Y=\sum_{i=1}^{m}\frac{|\textup{sgn}(x\cdot\theta_{i})-\textup{sgn}(y\cdot\theta_{i})|}{2}$ is $\textrm{Bin}(m,\frac{1}{2})$ and we can rewrite $p^{\delta}$ as:

[TABLE]

∎
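The binomial tail in Lemma 6.2 can be evaluated exactly for moderate $m$. The sketch below is an illustration with hypothetical function names. It uses the non-strict inequality $|Y-\frac{m}{2}|\geq m\delta$, matching the definition of $p^{\delta}$ given before the lemma (the lemma statement itself is written with a strict inequality), and it uses exact rational arithmetic so that ties at the threshold are classified correctly. Passing $\delta$ as a string such as "0.2" is an assumption of this sketch that keeps the threshold exact.

```python
import math
from fractions import Fraction


def pair_failure_probability(m, delta):
    """P(|Y - m/2| >= m * delta) for Y ~ Bin(m, 1/2), by direct summation."""
    threshold = m * Fraction(delta)
    favorable = sum(
        math.comb(m, k)
        for k in range(m + 1)
        if abs(Fraction(2 * k - m, 2)) >= threshold
    )
    return favorable / 2**m


def expected_failures(n, m, delta):
    """lambda^delta = C(n, 2) * p^delta for n pairwise orthogonal points."""
    return math.comb(n, 2) * pair_failure_probability(m, delta)


print(pair_failure_probability(4, "0.25"))
# Poisson-style estimate exp(-lambda^delta) of the probability that B_m is a delta-RIP.
print(math.exp(-expected_failures(800, 400, "0.2")))
```

Scanning $m$ for fixed $n$ and $\delta$ with `expected_failures` shows how quickly $\lambda^{\delta}$ drops below one, which is where the transition occurs.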

Lemma 6.3.

We can bound $\lambda^{\delta}$ as follows:

[TABLE]

where $\Lambda=\frac{\binom{n}{2}}{\sqrt{2\pi}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}$. Additionally, for $\delta<0.25$, we can approximate this statement as:

[TABLE]

Proof.

As previously defined,

[TABLE]

Now we will use Stirling’s approximation,

[TABLE]

to obtain bounds for $\lambda^{\delta}$ , but for the upper bound, we will use the fact that $\displaystyle{e^{\frac{1}{12m}}\leq e^{\frac{1}{12}}}$ for $m\geq 1$ .

Let $A=\lceil\frac{m}{2}+m\delta\rceil$ and let us assess only the first term of the $\lambda^{\delta}$ sum since it is the largest.

[TABLE]

We can rewrite these bounds in terms of $\Lambda$ :

[TABLE]

Using this inequality for the first term in the sum for $\lambda^{\delta}$ above, we obtain a lower bound for $\lambda^{\delta}$:

[TABLE]

For the upper bound, we have at most $m$ summands in the sum for $\lambda^{\delta}$, so the upper bound is:

[TABLE]

For $\delta<0.25$ we can simplify the two statements above using the standard bounds for $\ln(1-x)$, namely $\frac{-x}{1-x}\leq\ln(1-x)\leq-x$:

[TABLE]

∎

Lemma 6.4.

We claim that $\eta^{\delta}=\binom{n}{2}\left[(p^{\delta})^{2}+4(n-2)(p^{\delta})^{2}\right]$, where $\eta^{\delta}$ is the Stein-Chen error term for $W^{\delta}$ defined above. Using this we can bound $1-\mathbb{P}(W^{\delta}\geq 1)$ in the window

[TABLE]

where $\lambda_{1}^{\delta}=\binom{n}{2}\cdot\frac{e^{-\frac{1}{6}}}{\sqrt{2\pi m}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}$ and $\lambda_{2}^{\delta}=\binom{n}{2}\cdot\frac{e^{\frac{1}{12}}\sqrt{m}}{\sqrt{2\pi}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}.$

Proof.

We recall:

[TABLE]

In order to estimate $\eta^{\delta}$, we need to estimate $|\mathscr{N}_{x\backsim{y}}|$. Because the only pairs $r\backsim s$ that are dependent on $x\backsim y$ are those that share exactly one point with $x\backsim y$, we have $|\mathscr{N}_{x\backsim{y}}|\leq 2(n-2)$. There are two ways this can happen: the shared point is either $x$ or $y$. In each case there are $n-2$ ways to choose the remaining point. Assuming pairwise independence, we can estimate the size of $\eta^{\delta}$:

[TABLE]

We can now bound $1-\mathbb{P}(W^{\delta}\geq 1)$ which is equal to $\mathbb{P}(\textbf{B}_{m}\textrm{ is }\delta-\textrm{RIP})$ .

[TABLE]

where $\lambda_{1}^{\delta}=\binom{n}{2}\cdot\frac{e^{-\frac{1}{6}}}{\sqrt{2\pi m}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}$ and $\lambda_{2}^{\delta}=\binom{n}{2}\cdot\frac{e^{\frac{1}{12}}\sqrt{m}}{\sqrt{2\pi}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}.$

∎

Lemma 6.5.

$W^{\delta}_{x\backsim y}$ *are pairwise independent for all pairs $x\backsim y$.*

Proof.

To show that the $W^{\delta}_{x\backsim y}$ are pairwise independent, it is sufficient to show that $\mathbb{E}(W^{\delta}_{x\backsim y}W^{\delta}_{r\backsim s})=(p^{\delta})^{2}:$

[TABLE]

The only non-trivial case is when $x\backsim y$ and $r\backsim s$ share a common point. We can rewrite this probability as:

[TABLE]

where $k$ is an element of $\mathbb{H}_{m}.$ After an orthogonal transformation, we can take $x,y,s$ to be the first three coordinate vectors $e_{1},e_{2},e_{3}.$ The distribution of the $\theta_{j}$ is unchanged. The signs of the coordinates of the $\theta_{j}$ are independent, so the events are independent. Because all of these events are independent, the probability can be written as:

[TABLE]

Each of these probabilities is equal to $\mathbb{P}\left(\left|d_{\mathbb{H}_{m}}(\textbf{B}_{m}x,\textbf{B}_{m}y)-\frac{1}{2}\right|\geq\delta\right).$ Since there are $2^{m}$ elements in $\mathbb{H}_{m}$, we get that $\mathbb{P}(W^{\delta}_{x\backsim y}=W^{\delta}_{r\backsim s}=1)=(p^{\delta})^{2}$, as desired. ∎

Solving For $\bf{m}$ : Using the previous bounds on $\lambda^{\delta}$ ,

[TABLE]

where $\lambda_{1}^{\delta}=\binom{n}{2}\cdot\frac{e^{-\frac{1}{6}}}{\sqrt{2\pi m}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}$ and $\lambda_{2}^{\delta}=\binom{n}{2}\cdot\frac{e^{\frac{1}{12}}\sqrt{m}}{\sqrt{2\pi}}\cdot e^{m[\frac{-1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1-2\delta}{1+2\delta}]}.$ Fix $0<\epsilon_{2}<\epsilon_{1}<1$, let $\textbf{X}$ be $n$ pairwise orthogonal vectors in $\mathbb{S}^{N-1}$, and let $P_{\textrm{RIP}}(m)$ be the probability that $\textbf{B}_{m}$ satisfies the $\delta$-RIP. Then $1-\epsilon_{1}<P_{\textrm{RIP}}(m)$ when:

[TABLE]

and $P_{\textrm{RIP}}(m)\leq 1-\epsilon_{2}$ when

[TABLE]

In order to ensure that $\eta^{\delta}$ is very small compared to $1-e^{-\lambda_{2}^{\delta}}$ and $1-e^{-\lambda_{1}^{\delta}}$ , we want $\eta^{\delta}\leq 0.01(1-e^{-\lambda_{2}^{\delta}}).$ If we fix $n$ and choose $m$ such that $\lambda_{2}^{\delta}\leq 1,$ then using the inequality $\frac{\lambda_{2}^{\delta}}{2}\leq\lambda_{2}^{\delta}-\frac{(\lambda_{2}^{\delta})^{2}}{2}$ it is sufficient to bound $\eta^{\delta}$ as

[TABLE]

Manipulating this statement we get:

[TABLE]

Because $1+2(n-2)\leq 4\frac{n(n-1)}{2}$, we can rewrite this inequality as:

[TABLE]

Since we assumed that $\lambda_{2}^{\delta}\leq 1$ , we can rewrite this inequality as $\frac{4}{n}\leq 0.005$ and solve for $n$ :

[TABLE]

This means that if $n\geq 800$ , we have $\eta^{\delta}\leq 0.01(1-e^{-\lambda_{2}^{\delta}})$ and $\eta^{\delta}\leq 0.01(1-e^{-\lambda_{1}^{\delta}})$ which allows us to rewrite our inequalities and get bounds on $m$ :

[TABLE]

and

[TABLE]

Let $q=\frac{1}{\frac{1}{2}\ln(1-4\delta^{2})+\delta\ln\frac{1+2\delta}{1-2\delta}}.$ We remark that $q$ is approximately $\frac{1}{2\delta^{2}}.$ Let $r=\ln\frac{n(n-1)}{2\sqrt{2\pi}}$ then:

[TABLE]

and

[TABLE]

By inspection, this is the statement of the main theorem.
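A short expansion, added here as a worked check rather than taken from the paper, makes the approximation $q\approx\frac{1}{2\delta^{2}}$ explicit. It uses only the standard series $\ln(1-u)=-u-\frac{u^{2}}{2}-\cdots$ and $\ln\frac{1+v}{1-v}=2\left(v+\frac{v^{3}}{3}+\cdots\right)$, applied with $u=4\delta^{2}$ and $v=2\delta$:

```latex
\begin{align*}
\tfrac{1}{2}\ln(1-4\delta^{2}) &= -2\delta^{2}-4\delta^{4}+O(\delta^{6}),\\
\delta\ln\frac{1+2\delta}{1-2\delta} &= 4\delta^{2}+\tfrac{16}{3}\delta^{4}+O(\delta^{6}),\\
\frac{1}{q} &= 2\delta^{2}+\tfrac{4}{3}\delta^{4}+O(\delta^{6}).
\end{align*}
```

So $q=\frac{1}{2\delta^{2}}\left(1+O(\delta^{2})\right)$, and the correction is small for small $\delta$.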

7. Acknowledgments

We would like to thank Dr. Michael Lacey and Dr. Robert Kesler for their assistance and mentorship. We would also like to thank the Georgia Institute of Technology and the NSF for their funding and support.
