A dimension-free reverse logarithmic Sobolev inequality for low-complexity functions in Gaussian space

Ronen Eldan; Michel Ledoux

arXiv:1903.07093·math.FA·March 19, 2019

TL;DR

This paper introduces a new, dimension-free reverse logarithmic Sobolev inequality for low-complexity functions in Gaussian space, improving upon previous results and providing novel proofs and formulations.

Contribution

It presents a dimension-free version of the reverse logarithmic Sobolev inequality for low-complexity functions, enhancing prior work by Eldan (2018) with new proofs and forms.

Findings

01 Dimension-free inequality established

02 Improved bounds for low-complexity functions in Gaussian space

03 New proofs and formulations provided

Abstract

We discuss new proofs, and new forms, of a reverse logarithmic Sobolev inequality, with respect to the standard Gaussian measure, for low complexity functions, measured in terms of Gaussian-width. In particular, we provide a dimension-free improvement for a related result given in [Eldan '18].

Equations

\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)\,=\,\int_{{\mathbb{R}}^{n}}f\,d\nu\qquad\mbox{and}\qquad\mathcal{I}(\nu)\,=\,\int_{{\mathbb{R}}^{n}}|\nabla f|^{2}d\nu

\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)\,\leq\,\frac{1}{2}\,\mathcal{I}(\nu).

\mathcal{D}(\nu)\,=\,\mathrm{GW}(K)\,=\,\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\gamma(y)

\frac{1}{2}\,\mathcal{I}(\nu)\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\frac{1}{2}\,M_{+}+\mathcal{D}(\nu)^{2/3}\,\mathcal{I}(\nu)^{1/3},

\frac{1}{2}\,\mathcal{I}(\nu)\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+M+\mathcal{D}(\nu).

\mathrm{W}_{2}^{2}(\nu,\gamma)\,=\,\inf\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}|x-y|^{2}d\pi(x,y)

\mathrm{W}_{2}^{2}(\nu,\gamma\star\mu)\,\leq\,16\,n^{1/3}\big(M(\nu)+\mathcal{D}(\nu)\big)^{2/3}.

\mathrm{W}_{2}^{2}(\nu,\gamma\star\mu)\,\leq\,C\big(M(\nu)+\mathcal{D}(\nu)\big),

\mathcal{I}(\nu)\,=\,\int_{{\mathbb{R}}^{n}}|\nabla f|^{2}d\nu\,=\,\int_{{\mathbb{R}}^{n}}|\nabla f|^{2}e^{f}d\gamma\,=\,\int_{{\mathbb{R}}^{n}}\langle\nabla(e^{f}),\nabla f\rangle\,d\gamma\,=\,-\int_{{\mathbb{R}}^{n}}e^{f}\,\mathrm{L}f\,d\gamma\,=\,-\int_{{\mathbb{R}}^{n}}\mathrm{L}f\,d\nu

\mathcal{I}(\nu)\,=\,-\int_{{\mathbb{R}}^{n}}\Delta f\,d\nu+\int_{{\mathbb{R}}^{n}}\langle x,\nabla f\rangle\,d\nu\,\leq\,M+\int_{{\mathbb{R}}^{n}}\langle x,\nabla f\rangle\,d\nu.

\int_{{\mathbb{R}}^{n}}\langle x,\nabla f(x)\rangle\,d\nu(x)\,=\,\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle x,\nabla f(x)\rangle\,d\pi(x,y)\,=\,\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle y,\nabla f(x)\rangle\,d\pi(x,y)+\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle x-y,\nabla f(x)\rangle\,d\pi(x,y).

\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle y,\nabla f(x)\rangle\,d\pi(x,y)\,\leq\,\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\pi(x,y)\,=\,\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\gamma(y).

\begin{split}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle x-y,&\nabla f(x)\rangle\,d\pi(x,y)\\ &\,\leq\,\frac{1}{2}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\big|\nabla f(x)\big|^{2}d\pi(x,y)+\frac{1}{2}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}|x-y|^{2}d\pi(x,y)\\ &\,=\,\frac{1}{2}\int_{{\mathbb{R}}^{n}}\big|\nabla f(x)\big|^{2}d\nu(x)+\frac{1}{2}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}|x-y|^{2}d\pi(x,y).\\ \end{split}

\int_{{\mathbb{R}}^{n}}\langle x,\nabla f\rangle\,d\nu\,\leq\,\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\gamma(y)+\frac{1}{2}\int_{{\mathbb{R}}^{n}}\big|\nabla f(x)\big|^{2}d\nu(x)+\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma).

\frac{1}{2}\,\mathcal{I}(\nu)\,\leq\,M+\mathcal{D}(\nu)+\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma).

\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma)\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)

\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)\,\leq\,\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma)+M+\mathcal{D}(\nu).

\int_{{\mathbb{R}}^{n}}e^{Z}d\gamma\,\leq\,e^{\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle x,t\rangle\,d\gamma}.

\int_{{\mathbb{R}}^{n}}Z\,d\nu\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\log\int_{{\mathbb{R}}^{n}}e^{Z}d\gamma.

\int_{{\mathbb{R}}^{n}}\Big[\langle x,\nabla f\rangle-\frac{1}{2}\,|\nabla f|^{2}\Big]d\nu\,\leq\,\int_{{\mathbb{R}}^{n}}Z\,d\nu\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle x,t\rangle\,d\gamma.

\frac{1}{2C}\,\mathrm{W}_{2}^{2}(\mu^{\prime},\mu)\,\leq\,\mathrm{D_{KL}}\big(\mu^{\prime}\,||\,\mu\big)

\frac{1}{2}\,\mathrm{W}_{2}^{2}(\mu^{\prime},\mu)\,=\,\sup\bigg[\int_{{\mathbb{R}}^{n}}\varphi\,d\mu^{\prime}+\int_{{\mathbb{R}}^{n}}\psi\,d\mu\bigg]

\varphi(x)+\psi(y)\,\leq\,\frac{1}{2}\,|x-y|^{2}

\int_{{\mathbb{R}}^{n}}e^{\frac{1}{C}\varphi}d\mu\,\leq\,e^{-\frac{1}{C}\int_{{\mathbb{R}}^{n}}\psi\,d\mu}.

\langle x,t\rangle-\frac{1}{2}\,|t|^{2}\,=\,\langle y,t\rangle+\langle x-y,t\rangle-\frac{1}{2}\,|t|^{2}\,\leq\,\langle y,t\rangle+\frac{1}{2}\,|x-y|^{2}.

\int_{{\mathbb{R}}^{n}}e^{\frac{1}{C}Z}d\mu\,\leq\,e^{\frac{1}{C}\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle x,t\rangle\,d\mu}

Z(t,x)\,=\,{\mathbb{E}}\big([e^{f}](x+B_{1-t})\big).

X_{0}=0,\qquad dX_{t}=dB_{t}+v_{t}\,dt

{\mathbb{E}}\bigg(\int_{0}^{1}|v_{t}|^{2}dt\bigg)\,=\,2\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big).

\int_{{\mathbb{R}}^{n}}|x|^{2}d\nu-\int_{{\mathbb{R}}^{n}}|x|^{2}d\gamma\,\leq\,2\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\mathcal{D}(\nu).

Full text

Ronen Eldan

Weizmann Institute, Israel. Incumbent of the Elaine Blond career development chair. Supported by a European Research Council Starting Grant and by an Israel Science Foundation grant no. 715/16

Michel Ledoux

University of Toulouse, France

1 A reverse logarithmic Sobolev inequality

The recent work [5] has put forward a reverse logarithmic Sobolev inequality, with respect to the standard Gaussian measure, for low complexity functions measured in terms of Gaussian-width. To briefly recall this inequality, we reuse the notation of [5].

Let $\gamma$ denote the standard Gaussian measure on ${\mathbb{R}}^{n}$ , and let $\nu$ be the probability measure $d\nu=e^{f}d\gamma$ where $f:{\mathbb{R}}^{n}\to{\mathbb{R}}$ is twice-differentiable. Let

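$$\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)\,=\,\int_{{\mathbb{R}}^{n}}f\,d\nu\qquad\mbox{and}\qquad\mathcal{I}(\nu)\,=\,\int_{{\mathbb{R}}^{n}}|\nabla f|^{2}d\nu$$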

be respectively the Kullback-Leibler divergence (relative entropy) and the Fisher information of $\nu$ with respect to $\gamma$ , assumed to be finite in the following. In particular $\nu$ has a second moment. The standard logarithmic Sobolev inequality of L. Gross (cf. e.g. [11, 3]) ensures that

$$\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)\,\leq\,\frac{1}{2}\,\mathcal{I}(\nu). \tag{1}$$

Let

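$$\mathcal{D}(\nu)\,=\,\mathrm{GW}(K)\,=\,\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\gamma(y)$$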

be the Gaussian-width of the set $K=\{\nabla f(x);x\in{\mathbb{R}}^{n}\}$ . The quantity $\mathcal{D}(\nu)$ is a measure of the complexity of $\nu$ (or rather of the gradient-complexity of $f$ ). It is assumed there that $y\mapsto\sup_{t\in K}\,\langle y,t\rangle$ is integrable with respect to $\gamma$ . The following reverse logarithmic Sobolev inequality for measures of low complexity was established in [5, Theorem 4]. Set $M=M(\nu):=-\inf_{x\in{\mathbb{R}}^{n}}\Delta f(x)$ , assumed to be finite. Then one has

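$$\frac{1}{2}\,\mathcal{I}(\nu)\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\frac{1}{2}\,M_{+}+\mathcal{D}(\nu)^{2/3}\,\mathcal{I}(\nu)^{1/3}, \tag{2}$$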

where $M_{+}:=\max(M,0)$ .
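
To get a feel for the size of the complexity term $\mathcal{D}(\nu)=\mathrm{GW}(K)$, the following standard Gaussian-width computations may help. The three sets $K$ are illustrative choices made here, not examples from the paper, and $g$ denotes a one-dimensional standard normal variable.

```latex
\begin{align*}
K=\{\alpha\}:\quad
  & \mathrm{GW}(K)=\int_{\mathbb{R}^n}\langle y,\alpha\rangle\,d\gamma(y)=0,\\
K=\{t:|t|\le 1\}:\quad
  & \mathrm{GW}(K)=\int_{\mathbb{R}^n}|y|\,d\gamma(y)
    \le\Big(\int_{\mathbb{R}^n}|y|^2\,d\gamma(y)\Big)^{1/2}=\sqrt{n},\\
K=[-1,1]^n:\quad
  & \mathrm{GW}(K)=\int_{\mathbb{R}^n}\sum_{i=1}^n|y_i|\,d\gamma(y)
    =n\,\mathbb{E}|g|=n\sqrt{2/\pi}.
\end{align*}
```

So a gradient image contained in the Euclidean unit ball has complexity at most $\sqrt{n}$, while one filling the whole cube has complexity of order $n$.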

Our first theorem gives the following related bound.

Theorem 1.

In the preceding notation,

$$\frac{1}{2}\,\mathcal{I}(\nu)\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+M+\mathcal{D}(\nu).$$

As discussed in [5], the inequality is sharp on the extremal functions $f(x)=\langle\alpha,x\rangle$ , $x\in{\mathbb{R}}^{n}$ , $\alpha\in{\mathbb{R}}^{n}$ , of the logarithmic Sobolev inequality which have complexity $M=\mathcal{D}(\nu)=0$ .
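
The sharpness claim can be checked by hand. One assumption is made in this sketch: the linear function is normalised by the constant $-\frac{1}{2}|\alpha|^{2}$ so that $e^{f}d\gamma$ is a probability measure. This changes neither $\nabla f$ nor $\Delta f$.

```latex
\begin{align*}
f(x)&=\langle\alpha,x\rangle-\tfrac12|\alpha|^2,
  \qquad \nu=\mathcal{N}(\alpha,\mathrm{Id}),\\
\mathrm{D_{KL}}(\nu\,||\,\gamma)&=\int_{\mathbb{R}^n}f\,d\nu
  =|\alpha|^2-\tfrac12|\alpha|^2=\tfrac12|\alpha|^2,\\
\mathcal{I}(\nu)&=\int_{\mathbb{R}^n}|\nabla f|^2\,d\nu=|\alpha|^2,\\
M&=-\inf_{x}\Delta f(x)=0,\qquad K=\{\alpha\},\qquad
  \mathcal{D}(\nu)=\int_{\mathbb{R}^n}\langle y,\alpha\rangle\,d\gamma(y)=0.
\end{align*}
```

Hence $\frac{1}{2}\,\mathcal{I}(\nu)=\mathrm{D_{KL}}(\nu||\gamma)+M+\mathcal{D}(\nu)$, with equality on both sides equal to $\frac{1}{2}|\alpha|^{2}$.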

To compare Theorem 1 with the bound (2), observe that the latter holds trivially when $\mathcal{I}(\nu)\leq\mathcal{D}(\nu)$ , since then $\mathcal{D}(\nu)^{2/3}\,\mathcal{I}(\nu)^{1/3}\geq\mathcal{I}(\nu)$ . We may therefore generally assume that $\mathcal{D}(\nu)\leq\mathcal{I}(\nu)$ , and hence that $\mathcal{D}(\nu)\leq\mathcal{D}(\nu)^{2/3}\,\mathcal{I}(\nu)^{1/3}$ . Unlike inequality (2), the bound of the theorem has the feature that both sides of the inequality are additive with respect to taking products, and in this sense it is dimension-free. In Section 2 below, we give a slightly different form which improves on equation (2) and also essentially improves on Theorem 1.
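
The additivity under products can be spelled out as follows. The notation is introduced here for illustration: $\nu=\nu_{1}\otimes\nu_{2}$ with $f(x_{1},x_{2})=f_{1}(x_{1})+f_{2}(x_{2})$, $K_{i}$ is the gradient image of $f_{i}$, and $\gamma$ stands for the standard Gaussian measure in whichever dimension is relevant.

```latex
\begin{align*}
\mathcal{I}(\nu)&=\mathcal{I}(\nu_1)+\mathcal{I}(\nu_2),\qquad
\mathrm{D_{KL}}(\nu\,||\,\gamma)
  =\mathrm{D_{KL}}(\nu_1\,||\,\gamma)+\mathrm{D_{KL}}(\nu_2\,||\,\gamma),\\
M(\nu)&=-\inf_{x_1,x_2}\big[\Delta f_1(x_1)+\Delta f_2(x_2)\big]
  =M(\nu_1)+M(\nu_2),\\
\mathcal{D}(\nu)&=\int\Big[\sup_{t_1\in K_1}\langle y_1,t_1\rangle
  +\sup_{t_2\in K_2}\langle y_2,t_2\rangle\Big]d\gamma(y_1)\,d\gamma(y_2)
  =\mathcal{D}(\nu_1)+\mathcal{D}(\nu_2),\\
\big(\mathcal{D}(\nu_1)+\mathcal{D}(\nu_2)\big)^{2/3}
\big(\mathcal{I}(\nu_1)+\mathcal{I}(\nu_2)\big)^{1/3}
  &\ge\mathcal{D}(\nu_1)^{2/3}\,\mathcal{I}(\nu_1)^{1/3}
     +\mathcal{D}(\nu_2)^{2/3}\,\mathcal{I}(\nu_2)^{1/3}.
\end{align*}
```

The first three lines show that every term in Theorem 1 adds up over the factors. The last line is Hölder's inequality: the mixed term in (2) is only superadditive, so applying (2) to a product can give a weaker bound than summing (2) over the factors.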

Such a reverse logarithmic Sobolev inequality is of theoretical interest in the study of approximations of partition functions and of low-complexity Gibbs measures on product spaces (cf. [5, 1]). An analogous definition of low-complexity for Boolean functions was considered in [5], where it is shown that a low-complexity condition implies that the measure can be decomposed as a mixture of approximate product measures.

In fact, it was very recently shown ([7]) that if a measure $\nu$ satisfies a reverse logarithmic Sobolev inequality, then it is close, in transportation distance, to a mixture of translated Gaussian measures. The combination of such a result with Theorem 1 gives a structure theorem for measures of low-complexity, analogous to the one given in [5], but for the Gaussian setting. We formulate this as a corollary.

Recall the quadratic Kantorovich metric $\mathrm{W}_{2}(\nu,\gamma)$ between $\nu$ and $\gamma$ defined by

$$\mathrm{W}_{2}^{2}(\nu,\gamma)\,=\,\inf\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}|x-y|^{2}d\pi(x,y)$$

where the infimum is taken over all couplings $\pi$ on ${\mathbb{R}}^{n}\times{\mathbb{R}}^{n}$ with respective marginals $\nu$ and $\gamma$ . Combining Theorem 1 with [7, Theorem 5] gives the following.

Corollary 2.

In the preceding notation, there exists a probability measure $\mu$ such that

$$\mathrm{W}_{2}^{2}(\nu,\gamma\star\mu)\,\leq\,16\,n^{1/3}\big(M(\nu)+\mathcal{D}(\nu)\big)^{2/3}.$$

In particular, the above corollary gives a meaningful result whenever $M(\nu)+\mathcal{D}(\nu)=o(n)$ . It is also conjectured that a dimension-free analogue of [7, Theorem 5] should hold true, which, combined with our bound would imply the existence of a probability measure $\mu$ such that

$$\mathrm{W}_{2}^{2}(\nu,\gamma\star\mu)\,\leq\,C\big(M(\nu)+\mathcal{D}(\nu)\big),$$

where $C>0$ is a universal constant.
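
A short calculation shows why $o(n)$ is the relevant threshold in Corollary 2. The parameter $\varepsilon$ is introduced here for illustration, and reading "meaningful" as "small compared with $n=\int_{{\mathbb{R}}^{n}}|x|^{2}d\gamma$" is an interpretation rather than a statement from the paper.

```latex
\[
M(\nu)+\mathcal{D}(\nu)=\varepsilon n
\quad\Longrightarrow\quad
16\,n^{1/3}\big(M(\nu)+\mathcal{D}(\nu)\big)^{2/3}=16\,\varepsilon^{2/3}\,n .
\]
```

The bound of the corollary is thus $o(n)$ exactly when $\varepsilon\to 0$, while the conjectured dimension-free bound would give $C\varepsilon n$, which is smaller for small $\varepsilon$.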

The proof of (2) strongly relies on a construction coming from stochastic control theory, of an entropy-optimal coupling of the measure $\nu$ to a Brownian motion. We will come back to it in Section 2. In contrast, our proof of Theorem 1 follows a simple and direct approach.

Proof of Theorem 1.

By integration by parts with respect to the Gaussian measure $\gamma$ ,

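$$\mathcal{I}(\nu)\,=\,\int_{{\mathbb{R}}^{n}}|\nabla f|^{2}d\nu\,=\,\int_{{\mathbb{R}}^{n}}|\nabla f|^{2}e^{f}d\gamma\,=\,\int_{{\mathbb{R}}^{n}}\langle\nabla(e^{f}),\nabla f\rangle\,d\gamma\,=\,-\int_{{\mathbb{R}}^{n}}e^{f}\,\mathrm{L}f\,d\gamma\,=\,-\int_{{\mathbb{R}}^{n}}\mathrm{L}f\,d\nu$$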

where $\mathrm{L}=\Delta-\langle x,\nabla\rangle$ is the Ornstein-Uhlenbeck operator. Therefore

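$$\mathcal{I}(\nu)\,=\,-\int_{{\mathbb{R}}^{n}}\Delta f\,d\nu+\int_{{\mathbb{R}}^{n}}\langle x,\nabla f\rangle\,d\nu\,\leq\,M+\int_{{\mathbb{R}}^{n}}\langle x,\nabla f\rangle\,d\nu. \tag{3}$$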

Let $\pi$ be a coupling on ${\mathbb{R}}^{n}\times{\mathbb{R}}^{n}$ with respective marginals $\nu$ and $\gamma$ . Then,

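$$\int_{{\mathbb{R}}^{n}}\langle x,\nabla f(x)\rangle\,d\nu(x)\,=\,\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle x,\nabla f(x)\rangle\,d\pi(x,y)\,=\,\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle y,\nabla f(x)\rangle\,d\pi(x,y)+\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle x-y,\nabla f(x)\rangle\,d\pi(x,y).$$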

Now, on the one hand,

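$$\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle y,\nabla f(x)\rangle\,d\pi(x,y)\,\leq\,\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\pi(x,y)\,=\,\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\gamma(y).$$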

On the other hand, by the standard quadratic inequality,

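$$\begin{aligned}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\langle x-y,&\nabla f(x)\rangle\,d\pi(x,y)\\ &\,\leq\,\frac{1}{2}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}\big|\nabla f(x)\big|^{2}d\pi(x,y)+\frac{1}{2}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}|x-y|^{2}d\pi(x,y)\\ &\,=\,\frac{1}{2}\int_{{\mathbb{R}}^{n}}\big|\nabla f(x)\big|^{2}d\nu(x)+\frac{1}{2}\int_{{\mathbb{R}}^{n}\times{\mathbb{R}}^{n}}|x-y|^{2}d\pi(x,y).\end{aligned}$$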

Taking the infimum over all couplings $\pi$ with respective marginals $\nu$ and $\gamma$ , it holds true that

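$$\int_{{\mathbb{R}}^{n}}\langle x,\nabla f\rangle\,d\nu\,\leq\,\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle y,t\rangle\,d\gamma(y)+\frac{1}{2}\int_{{\mathbb{R}}^{n}}\big|\nabla f(x)\big|^{2}d\nu(x)+\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma).$$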

Therefore, together with (3),

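$$\frac{1}{2}\,\mathcal{I}(\nu)\,\leq\,M+\mathcal{D}(\nu)+\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma). \tag{4}$$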

It remains to recall the quadratic transportation cost inequality by M. Talagrand (cf. [11, 3])

$$\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma)\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big) \tag{5}$$

and the proof is complete. ∎

Together with the logarithmic Sobolev inequality (1) $\mathrm{D_{KL}}(\nu||\gamma)\leq\frac{1}{2}\,\mathcal{I}(\nu)$ , the step (4) of the preceding proof actually also yields a reverse transportation cost inequality

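$$\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)\,\leq\,\frac{1}{2}\,\mathrm{W}_{2}^{2}(\nu,\gamma)+M+\mathcal{D}(\nu). \tag{6}$$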
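
Chaining Talagrand's inequality (5), the logarithmic Sobolev inequality (1) and the step (4) gives a two-sided picture. This is a restatement of inequalities already in the text, under the hypotheses of Theorem 1.

```latex
\[
\tfrac12\,\mathrm{W}_2^2(\nu,\gamma)
\;\le\;\mathrm{D_{KL}}(\nu\,||\,\gamma)
\;\le\;\tfrac12\,\mathcal{I}(\nu)
\;\le\;\tfrac12\,\mathrm{W}_2^2(\nu,\gamma)+M+\mathcal{D}(\nu).
\]
```

In particular the three quantities $\frac{1}{2}\mathrm{W}_{2}^{2}(\nu,\gamma)$, $\mathrm{D_{KL}}(\nu||\gamma)$ and $\frac{1}{2}\mathcal{I}(\nu)$ differ from one another by at most $M+\mathcal{D}(\nu)$.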

Theorem 1 may also be deduced from a classical integrability result for the supremum of a Gaussian process. Given a set $K\subset{\mathbb{R}}^{n}$ such that $x\mapsto\sup_{t\in K}\langle x,t\rangle$ is integrable with respect to $\gamma$ , setting $Z=Z(x)=\sup_{t\in K}\big{[}\langle x,t\rangle-\frac{1}{2}|t|^{2}\big{]}$ , $x\in{\mathbb{R}}^{n}$ , it holds true that

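$$\int_{{\mathbb{R}}^{n}}e^{Z}d\gamma\,\leq\,e^{\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle x,t\rangle\,d\gamma}. \tag{7}$$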

This inequality was originally put forward in [10, 12] in the context of concentration properties of suprema of Gaussian processes. Now, the classical entropic inequality (Gibbs variational principle) expresses that

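$$\int_{{\mathbb{R}}^{n}}Z\,d\nu\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\log\int_{{\mathbb{R}}^{n}}e^{Z}d\gamma.$$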

With $K=\{\nabla f(x);x\in{\mathbb{R}}^{n}\}$ , it therefore follows that

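$$\int_{{\mathbb{R}}^{n}}\Big[\langle x,\nabla f\rangle-\frac{1}{2}\,|\nabla f|^{2}\Big]d\nu\,\leq\,\int_{{\mathbb{R}}^{n}}Z\,d\nu\,\leq\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle x,t\rangle\,d\gamma.$$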

Again, together with (3), this yields the conclusion of the theorem.
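
For completeness, the last step of this second proof can be written out as below. It is a sketch that fills in the algebra using only (3), (7), the entropic inequality and the definitions of $Z$, $\mathcal{I}(\nu)$ and $\mathcal{D}(\nu)$.

```latex
\begin{align*}
Z(x)&\ge\langle x,\nabla f(x)\rangle-\tfrac12|\nabla f(x)|^2
  \qquad\text{(take } t=\nabla f(x)\in K\text{)},\\
\int_{\mathbb{R}^n}\langle x,\nabla f\rangle\,d\nu
  &\le\tfrac12\,\mathcal{I}(\nu)+\mathrm{D_{KL}}(\nu\,||\,\gamma)+\mathcal{D}(\nu),\\
\mathcal{I}(\nu)&\le M+\int_{\mathbb{R}^n}\langle x,\nabla f\rangle\,d\nu
  \le M+\tfrac12\,\mathcal{I}(\nu)+\mathrm{D_{KL}}(\nu\,||\,\gamma)+\mathcal{D}(\nu).
\end{align*}
```

Subtracting $\frac{1}{2}\,\mathcal{I}(\nu)$ from both sides of the last line gives the inequality of Theorem 1.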

At the same time, the integrability inequality (7) may be seen as a consequence of the transportation cost inequality (5) and the Kantorovich duality. The argument actually works for any probability $\mu$ on the Borel sets of ${\mathbb{R}}^{n}$ satisfying the transportation cost inequality

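$$\frac{1}{2C}\,\mathrm{W}_{2}^{2}(\mu^{\prime},\mu)\,\leq\,\mathrm{D_{KL}}\big(\mu^{\prime}\,||\,\mu\big) \tag{8}$$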

for some constant $C>0$ and every $\mu^{\prime}\ll\mu$ ( $C=1$ for $\mu=\gamma$ ).

Namely, the Kantorovich duality (cf. [11]) expresses that

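$$\frac{1}{2}\,\mathrm{W}_{2}^{2}(\mu^{\prime},\mu)\,=\,\sup\bigg[\int_{{\mathbb{R}}^{n}}\varphi\,d\mu^{\prime}+\int_{{\mathbb{R}}^{n}}\psi\,d\mu\bigg]$$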

where the supremum runs over the set of measurable functions $(\varphi,\psi)\in\mathrm{L}^{1}(\mu^{\prime})\times\mathrm{L}^{1}(\mu)$ satisfying

$$\varphi(x)+\psi(y)\,\leq\,\frac{1}{2}\,|x-y|^{2} \tag{9}$$

for $d\mu^{\prime}$ -almost all $x\in{\mathbb{R}}^{n}$ and $d\mu$ -almost all $y\in{\mathbb{R}}^{n}$ . Given then a couple of functions $(\varphi,\psi)$ satisfying (9), the choice in (8) of $\frac{d\mu^{\prime}}{d\mu}=\frac{e^{g}}{\int_{{\mathbb{R}}^{n}}e^{g}d\mu}$ where $g=\frac{1}{C}[\varphi+\int_{{\mathbb{R}}^{n}}\psi\,d\mu]$ yields that $\log\int_{{\mathbb{R}}^{n}}e^{g}d\mu\leq 0$ , that is

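$$\int_{{\mathbb{R}}^{n}}e^{\frac{1}{C}\varphi}d\mu\,\leq\,e^{-\frac{1}{C}\int_{{\mathbb{R}}^{n}}\psi\,d\mu}.$$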

For every $x,y\in{\mathbb{R}}^{n}$ , and $t\in K$ ,

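$$\langle x,t\rangle-\frac{1}{2}\,|t|^{2}\,=\,\langle y,t\rangle+\langle x-y,t\rangle-\frac{1}{2}\,|t|^{2}\,\leq\,\langle y,t\rangle+\frac{1}{2}\,|x-y|^{2}.$$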

Therefore, if $\varphi(x)=\sup_{t\in K}\big{[}\langle x,t\rangle-\frac{1}{2}|t|^{2}\big{]}$ , $x\in{\mathbb{R}}^{n}$ , then $\psi(y)=-\sup_{t\in K}\langle y,t\rangle$ is a valid candidate for (9). Hence

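$$\int_{{\mathbb{R}}^{n}}e^{\frac{1}{C}Z}d\mu\,\leq\,e^{\frac{1}{C}\int_{{\mathbb{R}}^{n}}\sup_{t\in K}\langle x,t\rangle\,d\mu}$$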

which amounts to (7) when $\mu=\gamma$ .
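
Two steps of the duality argument are stated rather quickly, and may be unpacked as follows. This is a sketch assuming the integrability needed for each expression to make sense, with $g=\frac{1}{C}[\varphi+\int_{{\mathbb{R}}^{n}}\psi\,d\mu]$ and $\frac{d\mu^{\prime}}{d\mu}=e^{g}/\int_{{\mathbb{R}}^{n}}e^{g}d\mu$ as in the text.

```latex
\begin{align*}
\mathrm{D_{KL}}(\mu'\,||\,\mu)
  &=\int_{\mathbb{R}^n}g\,d\mu'-\log\int_{\mathbb{R}^n}e^{g}\,d\mu,\\
\mathrm{D_{KL}}(\mu'\,||\,\mu)
  &\ge\frac{1}{2C}\,\mathrm{W}_2^2(\mu',\mu)
   \ge\frac1C\Big[\int_{\mathbb{R}^n}\varphi\,d\mu'
      +\int_{\mathbb{R}^n}\psi\,d\mu\Big]
   =\int_{\mathbb{R}^n}g\,d\mu',\\
\varphi(x)&=\sup_{t\in K}\Big[\langle x,t\rangle-\tfrac12|t|^2\Big]
   \le\sup_{t\in K}\langle y,t\rangle+\tfrac12|x-y|^2
   =-\psi(y)+\tfrac12|x-y|^2 .
\end{align*}
```

Comparing the first two lines gives $\log\int_{{\mathbb{R}}^{n}}e^{g}d\mu\leq 0$. The third line, obtained by taking the supremum over $t\in K$ in the pointwise inequality, is the admissibility condition (9) for the pair $(\varphi,\psi)$.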

2 Stochastic calculus and the Föllmer process

As mentioned above, the proof of (2) developed in [5] uses tools from stochastic control theory, and in particular the so-called Föllmer process [8] to achieve an entropy-optimal coupling of the measure $\nu$ to a Brownian motion. This argument has already proved useful in the study of various functional inequalities [4, 9, 6].

To summarize a few facts from [9, 5], let ${(B_{t})}_{t\geq 0}$ be standard Brownian motion in ${\mathbb{R}}^{n}$ (starting from the origin) adapted to a filtration ${(\mathcal{F}_{t})}_{t\geq 0}$ . Set $v(t,x)=\nabla\log Z(t,x)$ , $t\in[0,1]$ , $x\in{\mathbb{R}}^{n}$ , where

$$Z(t,x)\,=\,{\mathbb{E}}\big([e^{f}](x+B_{1-t})\big).$$

The Föllmer process ${(X_{t})}_{t\in[0,1]}$ solves the stochastic differential equation

$$X_{0}=0,\qquad dX_{t}=dB_{t}+v_{t}\,dt$$

where $v_{t}=v(t,X_{t})$ . Amongst its relevant properties, the random variable $X_{1}$ has distribution $\nu$ , ${(v_{t})}_{t\in[0,1]}$ is a martingale, and

$${\mathbb{E}}\bigg(\int_{0}^{1}|v_{t}|^{2}dt\bigg)\,=\,2\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big).$$
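
The listed properties can be verified explicitly in the simplest case. The example is chosen here for illustration and is not taken from the paper: the Gaussian shift $\nu=\mathcal{N}(\alpha,\mathrm{Id})$, that is $f(x)=\langle\alpha,x\rangle-\frac{1}{2}|\alpha|^{2}$.

```latex
\begin{align*}
Z(t,x)&=\mathbb{E}\,\exp\!\big(\langle\alpha,x+B_{1-t}\rangle-\tfrac12|\alpha|^2\big)
  =\exp\!\big(\langle\alpha,x\rangle-\tfrac{t}{2}|\alpha|^2\big),\\
v(t,x)&=\nabla\log Z(t,x)=\alpha,\qquad
  X_t=B_t+t\alpha,\qquad
  X_1=B_1+\alpha\sim\mathcal{N}(\alpha,\mathrm{Id}),\\
\mathbb{E}\int_0^1|v_t|^2\,dt&=|\alpha|^2=2\,\mathrm{D_{KL}}(\nu\,||\,\gamma).
\end{align*}
```

Here the drift is constant, so the martingale property of $(v_{t})$ is immediate, and the energy identity matches $\mathrm{D_{KL}}(\nu||\gamma)=\frac{1}{2}|\alpha|^{2}$.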

The arguments developed in [5] thus make use of these properties towards a proof of the inequality (2). Now, actually, a small variation in the same spirit allows for the following inequality.

Theorem 3.

In the notation of Section 1, assume that $\nu$ has a finite second moment. Then

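$$\int_{{\mathbb{R}}^{n}}|x|^{2}d\nu-\int_{{\mathbb{R}}^{n}}|x|^{2}d\gamma\,\leq\,2\,\mathrm{D_{KL}}\big(\nu\,||\,\gamma\big)+\mathcal{D}(\nu).$$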

Proof.

Note first that by integration by parts

[TABLE]

so that the inequality of the theorem amounts to

[TABLE]

Recall that $K=\{\nabla f(x);x\in{\mathbb{R}}^{n}\}$ . Arguing as for the proof of Theorem 1, for any coupling $\pi$ with respective marginals $\nu$ and $\gamma$ ,

[TABLE]

The inequality (10) would then follow if for some coupling $\pi$ ,

[TABLE]

But the point is that the Föllmer process actually produces an exact coupling for this identity to hold. Namely, by the definition and properties of ${(X_{t})}_{t\in[0,1]}$ , $X_{1}$ has law $\nu$ , $B_{1}$ has law $\gamma$ and

[TABLE]

Since ${(v_{t})}_{t\in[0,1]}$ is a martingale,

[TABLE]

from which the claim follows since $v_{1}=\nabla f(X_{1})$ . ∎
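
As a sanity check, the inequality of Theorem 3, namely $\int_{{\mathbb{R}}^{n}}|x|^{2}d\nu-\int_{{\mathbb{R}}^{n}}|x|^{2}d\gamma\leq 2\,\mathrm{D_{KL}}(\nu||\gamma)+\mathcal{D}(\nu)$, is an equality for a Gaussian shift. The example $\nu=\mathcal{N}(\alpha,\mathrm{Id})$ is an illustrative choice made here, for which $\mathrm{D_{KL}}(\nu||\gamma)=\frac{1}{2}|\alpha|^{2}$ and $\mathcal{D}(\nu)=0$.

```latex
\[
\int_{\mathbb{R}^n}|x|^2\,d\nu-\int_{\mathbb{R}^n}|x|^2\,d\gamma
=(n+|\alpha|^2)-n
=|\alpha|^2
=2\,\mathrm{D_{KL}}(\nu\,||\,\gamma)+\mathcal{D}(\nu).
\]
```

So Theorem 3, like Theorem 1, is sharp on the extremal functions of the logarithmic Sobolev inequality.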

For the sake of intuition, let us consider an equivalent form of the bound provided by the theorem. Denote,

[TABLE]

the differential entropy of $\nu$ . It is straightforward to check that the theorem is equivalent to

[TABLE]

Note that, in the special case that $\nu$ has the form $\log\frac{d\nu}{dx}=\sup_{t\in K}\big{[}\langle x,t\rangle-\frac{1}{2}|t|^{2}\big{]}+\mathrm{const}$ , this bound becomes somewhat similar to the bound (7).

It remains to connect Theorem 3, or rather inequality (10), to Theorem 1. By the definition of the Fisher information (cf. (3))

[TABLE]

so that (10) expresses that

[TABLE]

While presented and established with the quantity $M=-\inf_{x\in{\mathbb{R}}^{n}}\Delta f(x)$ , the proof of Theorem 1 shows in the same way that

[TABLE]

Hence, if $\mathcal{D}(\nu)\geq\int_{{\mathbb{R}}^{n}}\Delta f\,d\nu$ (which is likely), the inequality (11) improves upon (12). On the other hand, it does not seem possible to reach (11) as simply as (12), and in any case, the inequality of Theorem 3, even up to a constant, may not be deduced from Theorem 1.

Bibliography

[1] T. Austin. The structure of low-complexity Gibbs measures on product spaces. arXiv:1810.07278 (2018).
[2] T. Austin. Multi-variate correlation and mixtures of product measures. arXiv:1809.10272 (2018).
[3] D. Bakry, I. Gentil, M. Ledoux. Analysis and geometry of Markov diffusion operators. Grundlehren der mathematischen Wissenschaften 348. Springer (2014).
[4] C. Borell. Isoperimetry, log-concavity, and elasticity of option prices. In New directions in Mathematical Finance, 73–91, Wiley (2002).
[5] R. Eldan. Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations. Geom. Funct. Anal. 28, 1548–1596 (2018).
[6] R. Eldan, J. R. Lee. Regularization under diffusion and anti-concentration of the information content. Duke Math. J. 167, 969–993 (2018).
[7] R. Eldan, J. Lehec, Y. Shenfeld. Stability of the logarithmic Sobolev inequality via the Föllmer Process. arXiv:1903.04522 (2019).
[8] H. Föllmer. An entropy approach to the time reversal of diffusion processes. In Stochastic differential systems (Marseille-Luminy, 1984), Lecture Notes in Control and Inform. Sci. 69, 156–163. Springer (1985).
