Further investigations of Rényi entropy power inequalities and an entropic characterization of s-concave densities

Jiange Li; Arnaud Marsiglietti; James Melbourne

arXiv:1901.10616·math.PR·September 30, 2019

TL;DR

This paper explores the role of convexity in Rényi entropy power inequalities, showing their validity for s-concave densities and providing new entropic characterizations of such densities.

Contribution

It demonstrates that a general Rényi entropy power inequality fails for orders r in (0,1), establishes its validity for s-concave densities, and adds new convergence and characterization results.

Findings

01. Rényi entropy power inequality fails for r in (0,1) in general

02. s-concave densities satisfy Rényi entropy power inequalities

03. Convergence of Rényi entropies in the CLT for specific density classes

Abstract

We investigate the role of convexity in Rényi entropy power inequalities. After proving that a general Rényi entropy power inequality in the style of Bobkov-Chistyakov (2015) fails when the Rényi parameter $r \in (0, 1)$, we show that random vectors with $s$-concave densities do satisfy such a Rényi entropy power inequality. Along the way, we establish the convergence in the Central Limit Theorem for Rényi entropies of order $r \in (0, 1)$ for log-concave densities and for compactly supported, spherically symmetric and unimodal densities, complementing a celebrated result of Barron (1986). Additionally, we give an entropic characterization of the class of $s$-concave densities, which extends a classical result of Cover and Zhang (1994).

Equations

h_{r}(X) = \frac{1}{1-r} \log \int_{\mathbb{R}^{d}} f(x)^{r} \, dx.

N_{r}(X) = e^{2 h_{r}(X)/d}.

N_{r}(X_{1} + \cdots + X_{n}) \geq c \sum_{i=1}^{n} N_{r}(X_{i}).

N_{r}(X+Y)^{\alpha} \geq N_{r}(X)^{\alpha} + N_{r}(Y)^{\alpha}

N_{r}(X_{1} + \cdots + X_{n}) < \varepsilon \sum_{i=1}^{n} N_{r}(X_{i}).

f((1-\lambda)x + \lambda y) \geq \left((1-\lambda) f(x)^{s} + \lambda f(y)^{s}\right)^{1/s}

N_{r}(X_{1} + \cdots + X_{n}) \geq c \sum_{i=1}^{n} N_{r}(X_{i}).

c = r^{\frac{1}{1-r}} \left(1 + \frac{1}{n|r'|}\right)^{1+n|r'|} \left(\prod_{k=1}^{d} \frac{(1+ks)^{|r'|(n-1)} \left(1+\frac{ks}{r}\right)^{1+|r'|}}{\left(1+ks\left(1+\frac{1}{n|r'|}\right)\right)^{1+n|r'|}}\right)^{\frac{2}{d}},

N_{r}(X+Y)^{\alpha} \geq N_{r}(X)^{\alpha} + N_{r}(Y)^{\alpha}.

r_{0}

\alpha

C(s) = \frac{2}{d} \sum_{k=1}^{d} \left(\log\left(1+\frac{ks}{r}\right) + r \log(1+ks) - (r+1) \log\left(1+\frac{ks(r+1)}{2r}\right)\right).

N(X+Y)^{1/2} \leq N(X)^{1/2} + N(Y)^{1/2}

h((1-\lambda)X + \lambda Y) \leq h(X).

\sup_{X_{i} \sim f} h_{r}\left(\sum_{i=1}^{n} \lambda_{i} X_{i}\right) = h_{r}(X_{1})

Z_{n} = \frac{X_{1} + \cdots + X_{n}}{\sqrt{n}}.

\varphi_{X}(t) = \mathbb{E}\left[e^{i\langle t, X\rangle}\right], \quad t \in \mathbb{R}^{d}.

h_{r}(Z) - 1 < h_{r}(Z_{n_{0}}) < h_{r}(Z) + 1.

\int_{\mathbb{R}^{d}} |\varphi_{Z_{n_{0}}}(t)|^{2} \, dt = \int_{\mathbb{R}^{d}} |\varphi_{X_{1}}(t/\sqrt{n_{0}})|^{2n_{0}} \, dt < +\infty.

\int_{\mathbb{R}^{d}} |\varphi_{X_{1}}(t)|^{\nu} \, dt < +\infty.

\|\varphi_{Z_{n_{0}}}\|_{L^{r'}} \leq \frac{1}{(2\pi)^{d/r'}} \|\rho_{n_{0}}\|_{L^{r}},

\int_{\mathbb{R}^{d}} |\varphi_{X_{1}}(t)|^{\nu} \, dt < +\infty.

\lim_{n \to +\infty} \sup_{x \in \mathbb{R}^{d}} |\rho_{n}(x) - \phi_{\Sigma}(x)| = 0,

\int_{|x|>T} \rho_{n}(x) \, dx < \varepsilon,

\int_{|x|>T} \rho_{n}(x)^{r} \, dx \leq M^{r-1} \int_{|x|>T} \rho_{n}(x) \, dx < M^{r-1} \varepsilon.

\left| \int_{|x|>T} \rho_{n}(x)^{r} \, dx - \int_{|x|>T} \phi_{\Sigma}(x)^{r} \, dx \right| < \delta.

\lim_{n \to +\infty} \left| \int_{|x| \leq T} \rho_{n}(x)^{r} \, dx - \int_{|x| \leq T} \phi_{\Sigma}(x)^{r} \, dx \right| = 0.

\lim_{n \to \infty} h_{r}(Z_{n}) = h_{r}(Z),

\rho_{n}(x) \leq e^{-a_{n}|x| + b_{n}},

\int_{\mathbb{R}^{d}} \rho_{n}(x)^{r} \, dx \leq \int_{\mathbb{R}^{d}} e^{-r a_{n}|x| + r b_{n}} \, dx < +\infty.

\rho_{n}(0) > \frac{1}{2} \phi_{\Sigma}(0),

Taxonomy

Topics: Diffusion and Search Dynamics · Point processes and geometric inequalities · Sparse and Compressive Sensing Techniques

Full text

1 Introduction

Let $X$ be a random vector in $\mathbb{R}^{d}$. Suppose that $X$ has density $f$ with respect to the Lebesgue measure. For $r\in(0,1)\cup(1,\infty)$, the Rényi entropy of order $r$ (or simply, $r$-Rényi entropy) is defined as

$$h_{r}(X)=\frac{1}{1-r}\log\int_{\mathbb{R}^{d}}f(x)^{r}\,dx. \tag{1}$$

For $r\in\{0,1,\infty\}$, the $r$-Rényi entropy can be extended continuously such that the RHS of (1) is $\log|{\rm supp}(f)|$ for $r=0$; $-\int_{\mathbb{R}^{d}}f(x)\log f(x)\,dx$ for $r=1$; and $-\log\|f\|_{\infty}$ for $r=\infty$. The case $r=1$ corresponds to the classical Shannon differential entropy. Here, we denote by $|{\rm supp}(f)|$ the Lebesgue measure of the support of $f$, and $\|f\|_{\infty}$ represents the essential supremum of $f$. The $r$-Rényi entropy power is defined by

$$N_{r}(X)=e^{2h_{r}(X)/d}.$$

In the following, we drop the subscript when $r=1$ .
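As a numerical illustration of definition (1) and of the entropy power, the following minimal sketch approximates $h_{r}$ for a one-dimensional density by the midpoint rule and compares it with the closed form for a centered Gaussian of variance $\sigma^{2}$, namely $h_{r}=\frac{1}{2}\log(2\pi\sigma^{2})+\frac{\log r}{2(r-1)}$. The function names, the integration window and the step count are illustrative assumptions, not part of the paper.

```python
import math

def renyi_entropy_1d(density, r, lo, hi, steps=100_000):
    """Approximate h_r = log(int f^r dx) / (1 - r) on [lo, hi] (midpoint rule)."""
    if r <= 0 or r == 1:
        raise ValueError("this sketch only covers r in (0,1) or r > 1")
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        total += density(x) ** r
    return math.log(total * width) / (1.0 - r)

def renyi_entropy_power(h_r, d=1):
    """Entropy power N_r = exp(2 h_r / d)."""
    return math.exp(2.0 * h_r / d)

def gaussian_density(sigma):
    """Density of a centered Gaussian with standard deviation sigma (d = 1)."""
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return lambda x: norm * math.exp(-x * x / (2.0 * sigma ** 2))

def gaussian_renyi_entropy_1d(sigma, r):
    """Closed form of h_r for a centered Gaussian in dimension one."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + math.log(r) / (2.0 * (r - 1.0))
```

For instance, `renyi_entropy_1d(gaussian_density(1.0), 0.5, -60.0, 60.0)` should agree with `gaussian_renyi_entropy_1d(1.0, 0.5)` up to discretization error, and the value at $r=0.5$ is larger than the value at $r=2$, in line with the fact that $h_{r}$ is non-increasing in $r$.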

The classical Entropy Power Inequality (henceforth, EPI) of Shannon [39] and Stam [41] states that the entropy power $N(X)$ is super-additive on the sum of independent random vectors. There has been recent success in obtaining extensions of the EPI from the Shannon differential entropy to $r$-Rényi entropy. In [9, 10], Bobkov and Chistyakov showed that, at the expense of an absolute constant $c>0$, the following Rényi EPI of order $r\in[1,\infty]$ holds

$$N_{r}(X_{1}+\cdots+X_{n})\geq c\sum_{i=1}^{n}N_{r}(X_{i}). \tag{2}$$

Ram and Sason soon after gave a sharpened constant depending on the number of summands [36]. Madiman, Melbourne, and Xu sharpened constants in the $r=\infty$ case by identifying extremizers in [31, 32]. Savaré and Toscani [38] showed that a modified Rényi entropy power is concave along the solution of a nonlinear heat equation, which generalizes Costa’s concavity of entropy power [19]. Bobkov and Marsiglietti [11] proved the following variant of Rényi EPI

$$N_{r}(X+Y)^{\alpha}\geq N_{r}(X)^{\alpha}+N_{r}(Y)^{\alpha} \tag{3}$$

for $r>1$ and some exponent $\alpha$ only depending on $r$. It is clear that (3) holds for more than two summands. Improvement of the exponent $\alpha$ was given by Li [27].

One of our goals is to establish analogues of (2) and (3) when the Rényi parameter $r\in(0,1)$. Both (2) and (3) can be derived from Young’s convolution inequality in conjunction with the entropic comparison inequality $h_{r_{1}}(X)\geq h_{r_{2}}(X)$ for any $0\leq r_{1}\leq r_{2}$. The latter is an immediate consequence of Jensen’s inequality. When the Rényi parameter $r\in(0,1)$, analogues of (2) and (3) require a converse of this entropic comparison inequality. This technical issue prevents a general Rényi EPI of order $r\in(0,1)$ for generic random vectors. Our first result shows that a general Rényi EPI of the form (2) indeed fails for all $r\in(0,1)$.

Theorem 1.

For any $r\in(0,1)$ and $\varepsilon>0$, there exist independent random vectors $X_{1},\cdots,X_{n}$ in $\mathbb{R}^{d}$, for some $d\geq 1$ and $n\geq 2$, such that

$$N_{r}(X_{1}+\cdots+X_{n})<\varepsilon\sum_{i=1}^{n}N_{r}(X_{i}). \tag{4}$$

We have an explicit construction of such random vectors. They are essentially truncations of some spherically symmetric random vectors with finite covariance matrices and infinite Rényi entropies of order $r\in(0,1)$ . The key point is the convergence along the Central Limit Theorem (henceforth, CLT) for Rényi entropies of order $r\in(0,1)$ ; that is, the $r$ -Rényi entropy of their normalized sum converges to the $r$ -Rényi entropy of a Gaussian. This implies that, after appropriate normalization, the LHS of (4) is finite, but the RHS of (4) can be as large as possible. The entropic CLT has been studied for a long time. A celebrated result of Barron [3] shows the convergence in the CLT for Shannon entropy (see [26] for a multidimensional setting). The recent work of Bobkov and Marsiglietti [12] studies the convergence in the CLT for Rényi entropy of order $r>1$ for real-valued random variables (see also [7] for convergence in Rényi divergence, which is not equivalent to convergence in Rényi entropy unless $r=1$ ). In Section 2, we establish the analogue of [12, Theorem 1.1] in higher dimensions and we prove convergence along the CLT for Rényi entropies of order $r\in(0,1)$ for a large class of densities.

As mentioned above, the reverse entropic comparison inequality prevents Rényi EPIs of order $r\in(0,1)$ for generic random vectors. However, a large class of random vectors with the so-called $s$ -concave densities do satisfy such a reverse entropic comparison inequality. Our next results show that Rényi EPI of order $r\in(0,1)$ holds for such densities. This extends the earlier work of Marsiglietti and Melbourne [33, 34] for log-concave densities (which corresponds to the $s=0$ case).

Let $s\in[-\infty,\infty]$. A function $f\colon\mathbb{R}^{d}\to[0,\infty)$ is called $s$-concave if the inequality

$$f((1-\lambda)x+\lambda y)\geq\left((1-\lambda)f(x)^{s}+\lambda f(y)^{s}\right)^{1/s} \tag{5}$$

holds for all $x,y\in\mathbb{R}^{d}$ such that $f(x)f(y)>0$ and $\lambda\in(0,1)$. For $s\in\{-\infty,0,\infty\}$, the RHS of (5) is understood in the limiting sense; that is, $\min\{f(x),f(y)\}$ for $s=-\infty$, $f(x)^{1-\lambda}f(y)^{\lambda}$ for $s=0$, and $\max\{f(x),f(y)\}$ for $s=\infty$. The case $s=0$ corresponds to log-concave functions. The study of measures with $s$-concave densities was initiated by Borell in the seminal work [13, 14]. One can think of $s$-concave densities, in particular log-concave densities, as functional versions of convex sets. There has been a recent stream of research on a formal parallel relation between functional inequalities of $s$-concave densities and geometric inequalities of convex sets.
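To make definition (5) concrete, here is a small sketch that tests the inequality on a finite grid in dimension one. It is only a sanity check under stated assumptions (a finite grid cannot prove $s$-concavity), and the helper names are hypothetical. The example function $x\mapsto 1/(1+x^{2})$ is an unnormalized Cauchy-type density: its power $-1/2$ is the convex function $\sqrt{1+x^{2}}$, so it is $(-1/2)$-concave, but it is not log-concave.

```python
import math

def power_mean(a, b, lam, s):
    """Right-hand side of (5) for a, b > 0, with the limiting cases s in {-inf, 0, inf}."""
    if s == 0:
        return a ** (1.0 - lam) * b ** lam
    if s == -math.inf:
        return min(a, b)
    if s == math.inf:
        return max(a, b)
    return ((1.0 - lam) * a ** s + lam * b ** s) ** (1.0 / s)

def violates_s_concavity(f, s, x, y, lam, tol=1e-12):
    """True if inequality (5) fails at (x, y, lam), up to a small tolerance."""
    fx, fy = f(x), f(y)
    if fx * fy <= 0.0:
        return False  # (5) is only required where f(x) f(y) > 0
    return f((1.0 - lam) * x + lam * y) < power_mean(fx, fy, lam, s) - tol

def looks_s_concave(f, s, points, lams=(0.25, 0.5, 0.75)):
    """Check (5) for all pairs of grid points and a few weights (necessary, not sufficient)."""
    return not any(
        violates_s_concavity(f, s, x, y, lam)
        for x in points for y in points for lam in lams
    )

def cauchy_type(x):
    """Unnormalized Cauchy-type density; scaling does not affect s-concavity."""
    return 1.0 / (1.0 + x * x)

grid = [k / 4.0 for k in range(-40, 41)]
```

On this grid, `looks_s_concave(cauchy_type, -0.5, grid)` holds, while `looks_s_concave(cauchy_type, 0.0, grid)` fails, for instance at $x=2$, $y=4$, $\lambda=1/2$.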

Theorem 2.

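For any $s\in(-1/d,0)$ and $r\in(-sd,1)$, there exists $c=c(s,r,d,n)$ such that for all independent random vectors $X_{1},\cdots,X_{n}$ with $s$-concave densities in $\mathbb{R}^{d}$, we have

$$N_{r}(X_{1}+\cdots+X_{n})\geq c\sum_{i=1}^{n}N_{r}(X_{i}).$$

In particular, one can take

$$c=r^{\frac{1}{1-r}}\left(1+\frac{1}{n|r^{\prime}|}\right)^{1+n|r^{\prime}|}\left(\prod_{k=1}^{d}\frac{(1+ks)^{|r^{\prime}|(n-1)}\left(1+\frac{ks}{r}\right)^{1+|r^{\prime}|}}{\left(1+ks\left(1+\frac{1}{n|r^{\prime}|}\right)\right)^{1+n|r^{\prime}|}}\right)^{\frac{2}{d}},$$

where $r^{\prime}=r/(r-1)$ is the Hölder conjugate of $r$.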
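The explicit constant of Theorem 2 is easy to evaluate numerically. The sketch below transcribes the formula as stated, using $|r^{\prime}|=r/(1-r)$ and working with logarithms for stability; the function name is hypothetical, and allowing $s=0$ (the log-concave limiting case) is an assumption of this sketch rather than part of the theorem's hypotheses.

```python
import math

def renyi_epi_constant(s, r, d, n):
    """Evaluate c(s, r, d, n) from the formula stated in Theorem 2."""
    if d < 1 or n < 1:
        raise ValueError("d and n must be positive integers")
    if not (-1.0 / d < s <= 0.0):
        raise ValueError("need -1/d < s <= 0")
    if not (-s * d < r < 1.0):
        raise ValueError("need -s*d < r < 1")
    abs_rp = r / (1.0 - r)              # |r'| for r' = r / (r - 1)
    t = 1.0 + 1.0 / (n * abs_rp)        # the factor 1 + 1/(n |r'|)
    log_prod = 0.0
    for k in range(1, d + 1):
        log_prod += (
            abs_rp * (n - 1) * math.log1p(k * s)
            + (1.0 + abs_rp) * math.log1p(k * s / r)
            - (1.0 + n * abs_rp) * math.log1p(k * s * t)
        )
    prefactor = r ** (1.0 / (1.0 - r)) * t ** (1.0 + n * abs_rp)
    return prefactor * math.exp(2.0 * log_prod / d)
```

Two quick consistency checks: for a single summand ($n=1$) the formula returns $1$, since $1+1/|r^{\prime}|=1/r$, and for $s=0$ the product equals $1$, leaving only the prefactor $r^{\frac{1}{1-r}}\left(1+\frac{1}{n|r^{\prime}|}\right)^{1+n|r^{\prime}|}$.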

Theorem 3.

Given $s\in(-1/d,0)$ , there exist $0<r_{0}<1$ and $\alpha=\alpha(s,r,d)$ such that for $r\in(r_{0},1)$ and independent random vectors $X$ and $Y$ in $\mathbb{R}^{d}$ with $s$ -concave densities,

$$N_{r}(X+Y)^{\alpha}\geq N_{r}(X)^{\alpha}+N_{r}(Y)^{\alpha}.$$

In particular, one can take

[TABLE]

where

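$$C(s)=\frac{2}{d}\sum_{k=1}^{d}\left(\log\left(1+\frac{ks}{r}\right)+r\log(1+ks)-(r+1)\log\left(1+\frac{ks(r+1)}{2r}\right)\right).$$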

Owing to the convexity, random vectors with $s$ -concave densities also satisfy a reverse EPI, which was first proved by Bobkov and Madiman [8]. This can be seen as the functional lifting of Milman’s well known reverse Brunn-Minkowski inequality [35]. Motivated by Busemann’s theorem [17] in convex geometry, Ball, Nayar and Tkocz [2] conjectured that the following reverse EPI

$$N(X+Y)^{1/2}\leq N(X)^{1/2}+N(Y)^{1/2} \tag{6}$$

holds for any symmetric log-concave random vector $(X,Y)\in\mathbb{R}^{2}$ . The $r$ -Rényi entropy analogue was asked in [30], and the $r=2$ case was soon verified in [27]. It was also observed in [27] that the $r$ -Rényi entropy analogue is equivalent to the convexity of $p$ -cross-section body in convex geometry introduced by Gardner and Giannopoulos [23]. The equivalent linearization of (6) reads as follows. Let $(X,Y)$ be a symmetric log-concave random vector in $\mathbb{R}^{2}$ such that $h(X)=h(Y)$ . Then for any $\lambda\in[0,1]$ we have

$$h((1-\lambda)X+\lambda Y)\leq h(X).$$

Cover and Zhang [20] proved the above inequality under the stronger assumption that $X$ and $Y$ have the same log-concave distribution. They also showed that this provides a characterization of log-concave distributions on the real line. The following theorem extends Cover and Zhang’s result from log-concave densities to a more general class of $s$ -concave densities. This gives an entropic characterization of $s$ -concave densities and implies a reverse Rényi EPI for random vectors with the same $s$ -concave density.

Theorem 4.

Let $r>1-1/d$ . Let $f$ be a probability density function on $\mathbb{R}^{d}$ . For any fixed integer $n\geq 2$ , the identity

$$\sup_{X_{i}\sim f}h_{r}\left(\sum_{i=1}^{n}\lambda_{i}X_{i}\right)=h_{r}(X_{1})$$

holds for all $\lambda_{i}\geq 0$ such that $\sum_{i=1}^{n}\lambda_{i}=1$ if and only if the density $f$ is $(r-1)$ -concave.

The paper is organized as follows. In Section 2, we explore the convergence along the CLT for $r$ -Rényi entropies. For $r>1$ , the convergence is fully characterized for densities on $\mathbb{R}^{d}$ , while for $r\in(0,1)$ sufficient conditions are obtained for a large class of densities. More precisely, we prove the convergence for log-concave densities and for compactly supported, spherically symmetric and unimodal densities. As an application, we prove in Section 3 that a general $r$ -Rényi EPI fails when $r\in(0,1)$ , thus establishing Theorem 1. We also complement this result by proving Theorems 2 and 3. In the last section, we provide an entropic characterization of the class of $s$ -concave densities, and include a reverse Rényi EPI as an immediate consequence.

2 Convergence along the CLT for Rényi entropies

Let $\{X_{n}\}_{n\in\mathbb{N}}$ be a sequence of independent identically distributed (henceforth, i.i.d.) centered random vectors in $\mathbb{R}^{d}$ with finite covariance matrix. We denote by $Z_{n}$ the normalized sum

$$Z_{n}=\frac{X_{1}+\cdots+X_{n}}{\sqrt{n}}. \tag{7}$$

An important tool used to prove various forms of CLT is the characteristic function. Recall that the characteristic function of a random vector $X$ is defined by

$$\varphi_{X}(t)=\mathbb{E}\left[e^{i\langle t,X\rangle}\right],\quad t\in\mathbb{R}^{d}.$$

Before providing sufficient conditions for the convergence along the CLT for Rényi entropy of order $r\in(0,1)$ , we first extend [12, Theorem 1.1] to higher dimensions.

Theorem 5.

Let $r>1$ . Let $X_{1},\cdots,X_{n}$ be i.i.d. centered random vectors in $\mathbb{R}^{d}$ . We denote by $\rho_{n}$ the density of $Z_{n}$ defined in (7). The following statements are equivalent.

1. $h_{r}(Z_{n})\to h_{r}(Z)$ as $n\to+\infty$, where $Z$ is a Gaussian random vector with mean $0$ and the same covariance matrix as $X_{1}$.

2. $h_{r}(Z_{n_{0}})$ is finite for some integer $n_{0}$.

3. $\int_{\mathbb{R}^{d}}|\varphi_{X_{1}}(t)|^{\nu}\,dt<+\infty$ for some $\nu\geq 1$.

4. $Z_{n_{0}}$ has a bounded density $\rho_{n_{0}}$ for some integer $n_{0}$.

Proof.

$1\Longrightarrow 2$ : Assume that $h_{r}\left(Z_{n}\right)\to h_{r}(Z)$ as $n\to+\infty$ . Then there exists an integer $n_{0}$ such that

$$h_{r}(Z)-1<h_{r}(Z_{n_{0}})<h_{r}(Z)+1.$$

Since $h_{r}(Z)$ is finite, we conclude that $h_{r}(Z_{n_{0}})$ is finite as well.

$2\Longrightarrow 3$ : Assume that $h_{r}(Z_{n_{0}})$ is finite for some integer $n_{0}$ . Then $Z_{n_{0}}$ has a density $\rho_{n_{0}}\in L^{r}(\mathbb{R}^{d})$ .

Case 1: If $r\geq 2$ , we have $\rho_{n_{0}}\in L^{2}(\mathbb{R}^{d})$ . Using Plancherel’s identity, we have $\varphi_{Z_{n_{0}}}\in L^{2}(\mathbb{R}^{d})$ . It follows that

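$$\int_{\mathbb{R}^{d}}|\varphi_{Z_{n_{0}}}(t)|^{2}\,dt=\int_{\mathbb{R}^{d}}|\varphi_{X_{1}}(t/\sqrt{n_{0}})|^{2n_{0}}\,dt<+\infty.$$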

For $\nu=2n_{0}$ , we have

$$\int_{\mathbb{R}^{d}}|\varphi_{X_{1}}(t)|^{\nu}\,dt<+\infty.$$

Case 2: If $r\in(1,2)$ , we apply the Hausdorff-Young inequality to obtain

$$\|\varphi_{Z_{n_{0}}}\|_{L^{r^{\prime}}}\leq\frac{1}{(2\pi)^{d/r^{\prime}}}\|\rho_{n_{0}}\|_{L^{r}},$$

where $r^{\prime}$ is the conjugate of $r$ such that $1/r+1/r^{\prime}=1$ . Hence, for $\nu=r^{\prime}n_{0}$ , we have

$$\int_{\mathbb{R}^{d}}|\varphi_{X_{1}}(t)|^{\nu}\,dt<+\infty.$$

$3\Longrightarrow 4$: Since $\int_{\mathbb{R}^{d}}|\varphi_{X_{1}}(t)|^{\nu}\,dt<+\infty$ for some $\nu\geq 1$, one may apply Gnedenko’s local limit theorem (see [24]), which is valid in arbitrary dimensions (see [5]). In particular, we have

$$\lim_{n\to+\infty}\sup_{x\in\mathbb{R}^{d}}|\rho_{n}(x)-\phi_{\Sigma}(x)|=0, \tag{8}$$

where $\phi_{\Sigma}$ denotes the density of a Gaussian random vector with mean $0$ and the same covariance matrix as $X_{1}$. We deduce that there exist an integer $n_{0}$ and a constant $M>0$ such that $\rho_{n}\leq M$ for all $n\geq n_{0}$.

$4\Longrightarrow 1$: Since $\rho_{n_{0}}$ is bounded, we have $\rho_{n_{0}}\in L^{2}$, and we deduce by Plancherel’s identity that $\int_{\mathbb{R}^{d}}|\varphi_{X_{1}}(t)|^{\nu}\,dt<+\infty$ for $\nu=2n_{0}$. Hence, (8) holds and there exists $M>0$ such that $\rho_{n}\leq M$ for all $n\geq n_{0}$. Let us show that $\int_{\mathbb{R}^{d}}\rho_{n}(x)^{r}dx\to\int_{\mathbb{R}^{d}}\phi_{\Sigma}(x)^{r}dx$ as $n\to+\infty$, where $\phi_{\Sigma}$ denotes the density of a Gaussian random vector with mean $0$ and the same covariance matrix as $X_{1}$. By the CLT, for any $\varepsilon>0$, there exists $T>0$ such that for all $n$ large enough,

$$\int_{|x|>T}\rho_{n}(x)\,dx<\varepsilon,$$

which implies that

$$\int_{|x|>T}\rho_{n}(x)^{r}\,dx\leq M^{r-1}\int_{|x|>T}\rho_{n}(x)\,dx<M^{r-1}\varepsilon.$$

The function $\phi_{\Sigma}$ satisfies similar inequalities. Hence, for any $\delta>0$ , there exists $T>0$ such that for all $n$ large enough,

$$\left|\int_{|x|>T}\rho_{n}(x)^{r}\,dx-\int_{|x|>T}\phi_{\Sigma}(x)^{r}\,dx\right|<\delta.$$

On the other hand, by (8), for all $T>0$, the function $\rho_{n}^{r}(x){\bf 1}_{\{|x|\leq T\}}$ converges everywhere to $\phi_{\Sigma}^{r}(x){\bf 1}_{\{|x|\leq T\}}$ as $n\to+\infty$. Since $\rho_{n}^{r}(x){\bf 1}_{\{|x|\leq T\}}$ is dominated by the integrable function $M^{r}{\bf 1}_{\{|x|\leq T\}}$, one may use the Lebesgue dominated convergence theorem to conclude that

$$\lim_{n\to+\infty}\left|\int_{|x|\leq T}\rho_{n}(x)^{r}\,dx-\int_{|x|\leq T}\phi_{\Sigma}(x)^{r}\,dx\right|=0.$$

∎

*Remark 6**.*

Theorem 5 fails for $r\in(0,1)$. For example, one can consider i.i.d. random vectors with a bounded density $\rho(x)$ such that $\int_{\mathbb{R}^{d}}\rho(x)^{r}dx=+\infty$ (e.g., Cauchy-type distributions). The implication $4\Longrightarrow 2$ (and thus $4\Longrightarrow 1$) will not hold since, by Jensen’s inequality, $h_{r}(Z_{n})\geq h_{r}(X_{1}/\sqrt{n})=\infty$ for all $n\geq 1$. As observed by Barron [3], the implication $1\Longrightarrow 4$ does not necessarily hold in the Shannon entropy case $r=1$.

The following result yields a sufficient condition for convergence along the CLT to hold for Rényi entropies of order $r\in(0,1)$ for a large class of random vectors in $\mathbb{R}^{d}$ .

Theorem 7.

Let $r\in(0,1)$. Let $X_{1},\cdots,X_{n}$ be i.i.d. centered log-concave random vectors in $\mathbb{R}^{d}$. Then we have $h_{r}(Z_{n})<+\infty$ for all $n\geq 1$, and

$$\lim_{n\to\infty}h_{r}(Z_{n})=h_{r}(Z),$$

where $Z_{n}$ is the normalized sum in (7) and $Z$ is a Gaussian random vector with mean $0$ and the same covariance matrix as $X_{1}$.

Proof.

Since log-concavity is preserved under independent sum, $Z_{n}$ is log-concave for all $n\geq 1$ . Hence, for all $n\geq 1$ , $Z_{n}$ has a bounded log-concave density $\rho_{n}$ , which satisfies

$$\rho_{n}(x)\leq e^{-a_{n}|x|+b_{n}},$$

for all $x\in\mathbb{R}^{d}$ , and for some constants $a_{n}>0$ , $b_{n}\in\mathbb{R}$ possibly depending on the dimension (see, e.g., [16]). Hence, for all $n\geq 1$ , we have

$$\int_{\mathbb{R}^{d}}\rho_{n}(x)^{r}\,dx\leq\int_{\mathbb{R}^{d}}e^{-ra_{n}|x|+rb_{n}}\,dx<+\infty.$$

We deduce that $h_{r}(Z_{n})<+\infty$ for all $n\geq 1$ .

The boundedness of $\rho_{n}$ implies that (8) holds, and thus there exists an integer $n_{0}$ such that for all $n\geq n_{0}$ ,

$$\rho_{n}(0)>\frac{1}{2}\phi_{\Sigma}(0),$$

where $\Sigma$ is the covariance matrix of $X_{1}$ (and thus does not depend on $n$ ). Moreover, since $\rho_{n}$ is log-concave, one has for all $x\in\mathbb{R}^{d}$ that

[TABLE]

Hence, for all $T>0$ , we have

[TABLE]

where the last inequality follows from Markov’s inequality and the fact that

[TABLE]

Hence, for every $\varepsilon>0$ , one may choose a positive number $T$ such that for all $n$ large enough,

[TABLE]

and hence

[TABLE]

On the other hand, from (8), we conclude as in the proof of Theorem 5 that for all $T>0$ ,

[TABLE]

∎

A function $f\colon\mathbb{R}^{d}\to\mathbb{R}$ is called unimodal if the super-level sets $\{x\in\mathbb{R}^{d}:f(x)>t\}$ are convex for all $t\in\mathbb{R}$ . Next, we provide a convergence result for random vectors in $\mathbb{R}^{d}$ with unimodal densities under additional symmetry assumptions. First, we need the following stability result.

Proposition 8.

The class of spherically symmetric and unimodal random variables is stable under convolution.

Proof.

Let $f_{1}$ and $f_{2}$ be two spherically symmetric and unimodal densities. By assumption, each $f_{i}$ satisfies $f_{i}(Tx)=f_{i}(x)$ for every orthogonal map $T$, and $|x|\leq|y|$ implies $f_{i}(x)\geq f_{i}(y)$. By the layer cake decomposition, we write

[TABLE]

Apply Fubini’s theorem to obtain

[TABLE]

Notice that, since each $f_{i}$ is spherically symmetric and radially non-increasing, the super-level set

[TABLE]

is an origin symmetric ball. Thus we can write the integrand in (9) as

[TABLE]

This quantity is clearly dependent only on $|x|$ , giving spherical symmetry. In addition, as the convolution of two log-concave functions, ${\bf 1}_{L_{\lambda_{1}}}\star{\bf 1}_{L_{\lambda_{2}}}$ is log-concave as well. It follows that for every $\lambda_{1},\lambda_{2}$ , and $|x|\leq|y|$ we have

[TABLE]

Integrating this inequality completes the proof. ∎

Let us establish large deviation and pointwise inequalities for compactly supported, spherically symmetric and unimodal densities.

Theorem 9 (Hoeffding [25]).

Let $X_{1},\cdots,X_{n}$ be independent random variables with mean 0 and bounded in $(a_{i},b_{i})$ , respectively. One has for all $T>0$ ,

[TABLE]

The following result is Hoeffding’s inequality in higher dimensions.

Lemma 10.

Let $X_{1},\cdots,X_{n}$ be centered independent random vectors in $\mathbb{R}^{d}$ satisfying $\mathbb{P}(|X_{i}|>R)=0$ for some $R>0$ . One has for all $T>0$ that

[TABLE]

Proof.

Let $X_{i,j}$ be the $j$ -th coordinate of the random vector $X_{i}$ . Then we have

[TABLE]

where inequality (10) follows from the pigeon-hole principle, (11) from a union bound, and (12) follows from applying Theorem 9 to $X_{1,j}+\cdots+X_{n,j}$ and $(-X_{1,j})+\cdots+(-X_{n,j})$ . ∎

We deduce the following pointwise estimate for unimodal spherically symmetric and bounded random variables.

Corollary 11.

Let $X_{1},\cdots,X_{n}$ be i.i.d. random vectors with spherically symmetric unimodal density supported on the Euclidean ball $B_{R}=\{x:|x|\leq R\}$ for some $R>0$ . Let $\rho_{n}$ denote the density of the normalized sum $Z_{n}$ . Then there exists $c_{d}>0$ such that for all $n\geq 1$ and $|x|>2$ ,

[TABLE]

Proof.

Stating Lemma 10 in terms of $\rho_{n}$ , we have

[TABLE]

Since the class of spherically symmetric unimodal random variables is stable under independent summation by Proposition 8, $\rho_{n}$ is spherically symmetric and unimodal, so that

[TABLE]

where $B_{|x|}$ represents the Euclidean ball of radius $|x|$ centered at the origin and $\omega_{d}$ is the volume of the unit ball. Note that

[TABLE]

since $t\mapsto t^{d}-(t-1)^{d}$ is increasing, so that (14) follows. Now applying (13) we have

[TABLE]

and our result holds with

[TABLE]

∎

We are now ready to establish a convergence result for bounded spherically symmetric unimodal random vectors.

Theorem 12.

Let $r\in(0,1)$ . Let $X_{1},\cdots,X_{n}$ be i.i.d. random vectors in $\mathbb{R}^{d}$ with a spherically symmetric unimodal density with compact support. Then we have

[TABLE]

where $Z_{n}$ is the normalized sum in (7) and $Z$ is a Gaussian random vector with mean $0$ and the same covariance matrix as $X_{1}$.

Proof.

Let us denote by $\rho_{n}$ the density of $Z_{n}$ . Since $\rho_{1}$ is bounded, one may apply (8) together with Lebesgue dominated convergence to conclude that for all $T>0$ ,

[TABLE]

On the other hand, by Corollary 11, one may choose $T>0$ such that for all $n\geq 1$ ,

[TABLE]

and hence

[TABLE]

∎
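The convergence in Theorem 12 can be observed numerically in the simplest case. The following sketch is an illustration under assumptions, not the paper's method: it takes $d=1$ and $X_{i}$ uniform on $[-\sqrt{3},\sqrt{3}]$ (symmetric, unimodal, compactly supported, unit variance), approximates the density of $X_{1}+\cdots+X_{n}$ by a discrete convolution on a grid, and uses the scaling $h_{r}(aX)=h_{r}(X)+\log a$ to obtain $h_{r}(Z_{n})$. The Gaussian reference value uses the closed form $h_{r}(Z)=\frac{1}{2}\log(2\pi)+\frac{\log r}{2(r-1)}$ for unit variance. All function names and the grid size are hypothetical choices.

```python
import math

def convolve(a, b):
    """Plain discrete convolution of two lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def renyi_entropy_normalized_uniform_sum(n, r, cells=200):
    """Approximate h_r(Z_n) for Z_n = (X_1 + ... + X_n) / sqrt(n), X_i uniform."""
    half = math.sqrt(3.0)                 # uniform on [-sqrt(3), sqrt(3)] has variance 1
    dx = 2.0 * half / cells
    base = [1.0 / (2.0 * half)] * cells   # density values on the grid cells
    dens = base[:]
    for _ in range(n - 1):
        dens = [v * dx for v in convolve(dens, base)]
    integral = sum(v ** r for v in dens if v > 0.0) * dx   # int f_{S_n}^r
    # h_r(Z_n) = h_r(S_n) - (1/2) log n
    return math.log(integral) / (1.0 - r) - 0.5 * math.log(n)

def gaussian_renyi_entropy(r):
    """h_r of a standard Gaussian in dimension one."""
    return 0.5 * math.log(2.0 * math.pi) + math.log(r) / (2.0 * (r - 1.0))

def gap_to_gaussian(n, r):
    return gaussian_renyi_entropy(r) - renyi_entropy_normalized_uniform_sum(n, r)
```

For $r=1/2$ the gap `gap_to_gaussian(n, 0.5)` shrinks as $n$ goes from 1 to 2 to 4, which is the behavior Theorem 12 predicts in the limit; the grid approximation only illustrates the trend and says nothing about rates.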

3 Rényi EPIs of order $r\in(0,1)$

A striking difference between Rényi EPIs of orders $r\in(0,1)$ and $r\geq 1$ is the lack of an absolute constant. Indeed, it was shown in [10] that for $r\geq 1$ Rényi EPI of the form (2) holds for generic independent random vectors with an absolute constant $c\geq\frac{1}{e}r^{\frac{1}{r-1}}$ . In the following subsection, we show that such a Rényi EPI does not hold for $r\in(0,1)$ .

3.1 Failure of a generic Rényi EPI

Definition 13.

For $r\in[0,\infty]$ , we define $c_{r}$ as the largest number such that for all $n,d\geq 1$ and any independent random vectors $X_{1},\cdots,X_{n}$ in $\mathbb{R}^{d}$ , we have

[TABLE]

Then we can rephrase Theorem 1 as follows.

Theorem 14.

For $r\in(0,1)$ , the constant $c_{r}$ defined in (15) satisfies $c_{r}=0$ .

The motivating observation for this line of argument is the fact that for $r\in(0,1)$ , there exist distributions with finite covariance matrices and infinite $r$ -Rényi entropies. One might anticipate that this could contradict the existence of an $r$ -Rényi EPI, as the CLT forces the normalized sum of i.i.d. random vectors $X_{1},\cdots,X_{n}$ drawn from such a distribution to become “more Gaussian”. Heuristically, one anticipates that $N_{r}(X_{1}+\cdots+X_{n})/n=N_{r}(Z_{n})$ should approach $N_{r}(Z)$ for large $n$ , where $Z_{n}$ is the normalized sum in (7) and $Z$ is a Gaussian vector with the same covariance matrix as $X_{1}$ , while $\sum_{i=1}^{n}N_{r}(X_{i})/n=N_{r}(X_{1})$ is infinite.

Proof of Theorem 14.

Let us consider the following density

[TABLE]

with $p,R>0$ and $C_{R}$ implicitly determined to make $f_{R,p,d}$ a density. Since the density is spherically symmetric, its covariance matrix can be rewritten as $\sigma_{R}^{2}I$ for some $\sigma_{R}>0$ , where $I$ is the identity matrix. Computing in spherical coordinates one can check that $\lim_{R\to\infty}C_{R}$ is finite for $p>d$ , and we can thus define a density $f_{\infty,p,d}$ . What is more, when $p>d+2$ , the limiting density $f_{\infty,p,d}$ has a finite covariance matrix, and has finite Rényi entropy if and only if $p>d/r$ .

For fixed $r\in(0,1)$ , we take $p\in(d^{*}+2,d^{*}/r]$ , where $d^{*}=\min\{d\in\mathbb{N}:d>2r/(1-r)\}$ guarantees the existence of such $p$ . In this case, the limiting density $f_{\infty,p,d^{*}}$ is well defined and it has finite covariance matrix $\sigma_{\infty}^{2}I$ , but the corresponding $r$ -Rényi entropy is infinite. Now we select independent random vectors $X_{1},\cdots,X_{n}$ from the distribution $f_{R,p,d^{*}}$ . Since $f_{R,p,d^{*}}$ is a spherically symmetric and unimodal density with compact support, we can apply Theorem 12 to conclude that

[TABLE]

where $Z_{n}$ is the normalized sum in (7) and $Z_{Id}$ is the standard $d$ -dimensional Gaussian. Since $\lim_{R\to\infty}\sigma_{R}=\sigma_{\infty}<\infty$ , we can take $R$ large enough such that $|\sigma_{R}^{2}-\sigma_{\infty}^{2}|\leq 1$ . Then we can take $n$ large enough such that

[TABLE]

Since the limiting density $f_{\infty,p,d^{*}}$ has infinite $r$ -Rényi entropy, given $M>0$ , we can take $R$ large enough such that

[TABLE]

Combining (16) and (17), we conclude that for inequality (15) to hold we must have

[TABLE]

for all $M>0$ . Then the statement follows from taking the limit $M\to\infty$ . ∎
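The choice of dimension and exponent in the proof can be made explicit. The sketch below computes $d^{*}=\min\{d\in\mathbb{N}:d>2r/(1-r)\}$ exactly with rational arithmetic and returns the endpoints of the window $(d^{*}+2,\,d^{*}/r]$ for $p$; the window is non-empty precisely because $d^{*}>2r/(1-r)$ is equivalent to $d^{*}/r>d^{*}+2$. The function names are hypothetical, and $r$ should be passed as a `Fraction` (or an exactly representable float) to avoid rounding at the threshold.

```python
import math
from fractions import Fraction

def smallest_admissible_dimension(r):
    """d* = min{ d in N : d > 2r / (1 - r) } for r in (0, 1)."""
    r = Fraction(r)
    if not (0 < r < 1):
        raise ValueError("need 0 < r < 1")
    threshold = 2 * r / (1 - r)
    return math.floor(threshold) + 1      # smallest integer strictly above the threshold

def exponent_window(r):
    """Return (d*, lower, upper): p must satisfy lower < p <= upper."""
    r = Fraction(r)
    d_star = smallest_admissible_dimension(r)
    return d_star, d_star + 2, Fraction(d_star) / r
```

For example, $r=1/2$ gives $d^{*}=3$ and $p\in(5,6]$, while $r=1/3$ gives $d^{*}=2$ and $p\in(4,6]$.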

*Remark 15**.*

The random vectors in our proof have identical $s$-concave densities with $s\leq-r/d$. In the following section, we provide a complementary result by showing that a Rényi EPI of order $r\in(0,1)$ does hold for $s$-concave densities when $-r/d<s<0$.

3.2 Rényi EPIs for $s$ -concave densities

As shown above, a generic Rényi EPI of the form (2) fails for $r\in(0,1)$. In this part, we establish Rényi EPIs of the forms (2) and (3) for an important class of random vectors with $s$-concave densities (see (5)).

Following Lieb [29], we prove Theorems 2 and 3 by showing their equivalent linearizations. The following linearization of (2) and (3) is due to Rioul [37]. The $c=1$ case was used in [27].

Theorem 16 ([37]).

Let $X_{1},\cdots,X_{n}$ be independent random vectors in $\mathbb{R}^{d}$ . The following statements are equivalent.

1. There exist a constant $c>0$ and an exponent $\alpha>0$ such that

[TABLE]

2. For any $\lambda_{1},\cdots,\lambda_{n}\geq 0$ such that $\sum_{i=1}^{n}\lambda_{i}=1$, one has

[TABLE]

where $H(\lambda)\triangleq H(\lambda_{1},\cdots,\lambda_{n})$ is the discrete entropy defined as

[TABLE]

Inequality (19) is the linearized form of inequality (18). One of the ingredients used to establish (19) is Young’s sharp convolution inequality [4, 15]. Its information-theoretic formulation was given in [21], which we recall below. We denote by $r^{\prime}$ the Hölder conjugate of $r$ such that $1/r+1/r^{\prime}=1$ .

Theorem 17 ([15, 21]).

Let $r>0$ . Let $\lambda_{1},\cdots,\lambda_{n}\geq 0$ such that $\sum_{i=1}^{n}\lambda_{i}=1$ , and let $r_{1},\cdots,r_{n}$ be positive reals such that $\lambda_{i}=r^{\prime}/r_{i}^{\prime}$ . For any independent random vectors $X_{1},\cdots,X_{n}$ in $\mathbb{R}^{d}$ , one has

[TABLE]

The second ingredient is a comparison between Rényi entropies $h_{r}$ and $h_{r_{i}}$. When $r>1$, we have $1<r_{i}<r$, and Jensen’s inequality implies that $h_{r}\leq h_{r_{i}}$. In this case, one can deduce (19) from (20) with $h_{r_{i}}$ replaced by $h_{r}$. However, when $r\in(0,1)$, the order of $r$ and $r_{i}$ is reversed, i.e., $0<r<r_{i}<1$, and we need a reverse entropy comparison inequality. The so-called $s$-concave densities do satisfy such a reverse entropy comparison inequality. The following result of Fradelizi, Li, and Madiman [22] serves this purpose.

Theorem 18 ([22]).

Let $s\in\mathbb{R}$ . Let $f\colon\mathbb{R}^{d}\to[0,+\infty)$ be an integrable $s$ -concave function. The function

[TABLE]

is log-concave for $r>\max\{0,-sd\}$ , where

[TABLE]

We deduce the following Rényi entropic comparison for random vectors with $s$ -concave densities.

Corollary 19.

Let $X$ be a random vector in $\mathbb{R}^{d}$ with an $s$-concave density. For $-sd<r<q<1$, we have

[TABLE]

Proof.

Write $q=(1-\lambda)\cdot r+\lambda\cdot 1$ . Using the log-concavity of the function $G$ in Theorem 18, we have

[TABLE]

The above inequality can be rewritten in terms of entropy power as follows

[TABLE]

The desired statement follows from taking the logarithm of both sides of the above inequality. ∎

Theorem 17 together with Corollary 19 yields the following Rényi EPI with a single Rényi parameter $r\in(0,1)$ for $s$ -concave densities.

Theorem 20.

Let $s\in(-1/d,0)$ and $r\in(-sd,1)$ . Let $X_{1},\cdots,X_{n}$ be independent random vectors in $\mathbb{R}^{d}$ with $s$ -concave densities. For all $\lambda=(\lambda_{1},\cdots,\lambda_{n})\in[0,1]^{n}$ such that $\sum_{i=1}^{n}\lambda_{i}=1$ , we have

[TABLE]

where

[TABLE]

Proof.

Let $r_{i}$ be defined by $\lambda_{i}=r^{\prime}/r_{i}^{\prime}$ , where $r^{\prime}$ and $r_{i}^{\prime}$ are Hölder conjugates of $r$ and $r_{i}$ , respectively. Combining Theorem 17 with Corollary 19, we have

[TABLE]

Notice that $C(r)=r^{d}D(r)$ , where $C(r)$ is given in (21) and $D(r)=(1+s/r)\cdots(1+sd/r)$ . Thus,

[TABLE]

Using the identities $1/(1-r)=1-r^{\prime}$ and $\lambda_{i}/(1-r_{i})=\lambda_{i}-r^{\prime}$ , we have

[TABLE]

The last identity follows from $1/r_{i}=1-\lambda_{i}/r^{\prime}$ . Using (24) and (23), the RHS of (22) can be written as

[TABLE]

This concludes the proof. ∎
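The identities used in the proof follow directly from the definition of the Hölder conjugate. A short verification, using only $r^{\prime}=r/(r-1)$ and $\lambda_{i}r_{i}^{\prime}=r^{\prime}$ (the second line applies the first identity to $r_{i}$ in place of $r$):

```latex
\begin{align*}
1-r' &= 1-\frac{r}{r-1} = \frac{-1}{r-1} = \frac{1}{1-r},\\
\frac{\lambda_i}{1-r_i} &= \lambda_i\,(1-r_i') = \lambda_i-\lambda_i r_i' = \lambda_i-r',\\
\frac{1}{r_i} &= 1-\frac{1}{r_i'} = 1-\frac{\lambda_i}{r'}.
\end{align*}
```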

Having Theorems 16 and 20 at hand, we are ready to prove Theorems 2 and 3.

3.2.1 Proof of Theorem 2

Put Theorems 16 and 20 together. Then it suffices to find $c$ such that the following inequality

[TABLE]

holds for all $\lambda=(\lambda_{1},\cdots,\lambda_{n})\in[0,1]^{n}$ such that $\sum_{i=1}^{n}\lambda_{i}=1$ . Hence, we can set

[TABLE]

where the infimum runs over all $\lambda=(\lambda_{1},\cdots,\lambda_{n})\in[0,1]^{n}$ such that $\sum_{i=1}^{n}\lambda_{i}=1$. For fixed $r$, both $A(\lambda)$ and $g_{k}(\lambda)$ are sums of one-dimensional convex functions of the form $(1+x)\log(1+x)$. Furthermore, both $A(\lambda)$ and $g_{k}(\lambda)$ are permutation invariant. Hence, the minimum is achieved at $\lambda=(1/n,\cdots,1/n)$. This yields the numerical value of $c$ in Theorem 2.
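The symmetry argument rests on a general fact: a permutation-invariant sum of one-dimensional convex functions on the probability simplex is minimized at the uniform point, by Jensen's inequality. The sketch below illustrates this with a stand-in function; `objective` is a hypothetical example built from $(1+x)\log(1+x)$ and is not the paper's $A(\lambda)$ or $g_{k}(\lambda)$, whose exact expressions are not reproduced here.

```python
import math
import random


def phi(x: float) -> float:
    """One-dimensional convex building block (1 + x) * log(1 + x)."""
    return (1.0 + x) * math.log(1.0 + x)


def objective(lam: list[float]) -> float:
    """Hypothetical permutation-invariant convex function on the simplex."""
    return sum(phi(t) for t in lam)


def random_simplex_point(n: int, rng: random.Random) -> list[float]:
    """Sample a point of the probability simplex by normalizing exponentials."""
    w = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(w)
    return [t / total for t in w]


n = 4
rng = random.Random(0)
uniform_value = objective([1.0 / n] * n)
# Smallest value seen over random simplex points; never below the uniform value.
sampled_min = min(objective(random_simplex_point(n, rng)) for _ in range(2000))
print(uniform_value, sampled_min)
```

Random sampling is only a sanity check here; the inequality itself holds for every point of the simplex by convexity.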

3.2.2 Proof of Theorem 3

The following lemma in [33] serves us in the proof of Theorem 3.

Lemma 21 ([33]).

Let $c>0$ . Let $L,F:[0,c]\to[0,\infty)$ be twice differentiable on $(0,c]$ , continuous on $[0,c]$ , such that $L(0)=F(0)=0$ and $L^{\prime}(c)=F^{\prime}(c)=0$ . Let us also assume that $F(x)>0$ for $x>0$ , that $F$ is strictly increasing, and that $F^{\prime}$ is strictly decreasing. Then $\frac{L^{\prime\prime}}{F^{\prime\prime}}$ increasing on $(0,c)$ implies that $\frac{L}{F}$ is increasing on $(0,c)$ as well. In particular,

[TABLE]
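A concrete pair of functions helps to see what the lemma asserts. The sketch below takes $c=1/2$, $F$ equal to the binary entropy $H(x,1-x)$ and the illustrative choice $L(x)=x(1-x)$; this $L$ is an assumption made for the example and is unrelated to the functions $g_{k}$ of the paper. Both functions vanish at $0$, both derivatives vanish at $1/2$, and $L^{\prime\prime}/F^{\prime\prime}=2x(1-x)$ is increasing on $(0,1/2)$, so the lemma predicts that $L/F$ is increasing there.

```python
import math


def F(x: float) -> float:
    """Binary entropy H(x, 1 - x) in nats, with F(0) = 0."""
    if x == 0.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def L(x: float) -> float:
    """Illustrative choice with L(0) = 0 and L'(1/2) = 0."""
    return x * (1.0 - x)


def second_derivative_ratio(x: float) -> float:
    """L''(x) / F''(x) = (-2) / (-1 / (x (1 - x))) = 2 x (1 - x)."""
    return 2.0 * x * (1.0 - x)


c = 0.5
grid = [c * k / 200 for k in range(1, 201)]  # grid points in (0, c]
hypothesis = [second_derivative_ratio(x) for x in grid]
conclusion = [L(x) / F(x) for x in grid]
print(conclusion[0], conclusion[-1])
```

On this grid both `hypothesis` and `conclusion` are nondecreasing, in agreement with the lemma.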

Proof of Theorem 3.

Apply Theorems 16 and 20 with $n=2$ . Then it suffices to find $\alpha$ such that for all $\lambda\in[0,1]$ we have

[TABLE]

where

[TABLE]

We can set

[TABLE]

We will show that the optimal value is achieved at $\lambda=1/2$ . Since the function is symmetric about $\lambda=1/2$ , it suffices to show that

[TABLE]

is increasing on $[0,1/2]$ . It has been shown in [27] that $-A(\lambda)/H(\lambda)$ is increasing on $[0,1/2]$ . We will show that for each $k=1,\cdots,n$ the function $-g_{k}(\lambda)/H(\lambda)$ is also increasing on $[0,1/2]$ . One can check that $-g_{k}(\lambda)$ and $H(\lambda)$ satisfy the conditions in Lemma 21. Hence, it suffices to show that $-g_{k}^{\prime\prime}(\lambda)/H^{\prime\prime}(\lambda)$ is increasing on $[0,1/2]$ . Elementary calculation yields that

[TABLE]

Define $x=\frac{\lambda}{|r^{\prime}|}$ and $y=\frac{1-\lambda}{|r^{\prime}|}=\frac{1}{|r^{\prime}|}-x$ . Then one can check that

[TABLE]

Hence, we have

[TABLE]

where

[TABLE]

Since $s,r^{\prime}<0$ , it suffices to show that $W(x)$ is increasing on $[0,\frac{1}{2|r^{\prime}|}]$ . We rewrite $W$ as follows

[TABLE]

where

[TABLE]

We will show that both $W_{1}(x)$ and $W_{2}(x)$ are increasing on $[0,\frac{1}{2|r^{\prime}|}]$ .

Now let us focus on $W_{1}$ . Since $y=\frac{1}{|r^{\prime}|}-x$ , one can check that

[TABLE]

Let us denote

[TABLE]

The condition $r>-sd$ implies that $a,b\geq 0$ . With these notations, we have

[TABLE]

The last identity follows from

[TABLE]

Since $a,b\geq 0$ and $x\in[0,\frac{1}{2|r^{\prime}|}]$ , it suffices to show that

[TABLE]

Using (28) and (29), we have

[TABLE]

Then the desired statement follows from the conditions $s>-1/d$ and $r>-sd$. We conclude that $W_{1}$ is increasing on $[0,\frac{1}{2|r^{\prime}|}]$.

It remains to show that $W_{2}(x)$ is increasing on $[0,\frac{1}{2|r^{\prime}|}]$. Recalling the definition of $W_{2}(x)$ in (27), one can check that

[TABLE]

where $a$ and $b$ are defined in (28) and (29), and

[TABLE]

Since

[TABLE]

it suffices to show that $T(x)\geq 0$ for $x\in[0,\frac{1}{2|r^{\prime}|}]$. Using the identity

[TABLE]

one can check that

[TABLE]

where

[TABLE]

Notice that $U^{\prime}(x)\equiv 0$ , which implies that $U(x)$ is a constant. Since $a,b\geq 0$ , we have

[TABLE]

Hence, $T^{\prime}(x)\leq 0$ , i.e., $T(x)$ is decreasing. Therefore, since $a=b$ when $x=\frac{1}{2|r^{\prime}|}$ , we have

[TABLE]

It suffices to have

[TABLE]

which is equivalent to

[TABLE]

This finishes the proof that every $-g_{k}(\lambda)/H(\lambda)$ is also increasing on $[0,1/2]$ . Then the numerical value of $\alpha$ in Theorem 3 follows from setting $\lambda=1/2$ in (25). ∎

Remark 22.

Our optimization argument heavily relies on the fact that $-A(\lambda)/H(\lambda)$ and $-g_{k}(\lambda)/H(\lambda)$ are monotonically increasing for $\lambda\in[0,1/2]$ . As observed in [27], the monotonicity of $-A(\lambda)/H(\lambda)$ does not depend on the value of $r$ . Numerical examples show that $-g_{k}(\lambda)/H(\lambda)$ , even the whole quantity in (26), is not monotone when $r$ is small. This is one of the reasons for the restriction $r>r_{0}$ .

Remark 23.

Note that the condition $r>-sd$ of Theorem 18 can be rewritten as

[TABLE]

We do not know whether Theorem 3 holds when

[TABLE]

4 An entropic characterization of $s$ -concave densities

Let $X$ and $Y$ be real-valued random variables (possibly dependent) with the identical density $f$ . Cover and Zhang [20] proved that

$$h(X+Y)\leq h(2X)$$

holds for every coupling of $X$ and $Y$ if and only if $f$ is log-concave. This yields an entropic characterization of one-dimensional log-concave densities. We will extend Cover and Zhang’s result to Rényi entropies of random vectors with $s$-concave densities (defined in (5)), which include log-concave densities as a special case. This was previously proved in [28] when $f$ is continuous.

Firstly, we introduce some classical variations of convexity and concavity which will be needed in our proof.

Definition 24.

Let $\lambda\in(0,1)$ be fixed. A function $f\colon\mathbb{R}^{d}\to\mathbb{R}$ with convex support is called almost $\lambda$ -convex if the following inequality

[TABLE]

holds for almost every pair $x,y$ in the domain of $f$ . We say that $f$ is $\lambda$ -convex if the above inequality holds for every pair $x,y$ in the domain of $f$ . Particularly, for $\lambda=1/2$ , it is usually called mid-convex or Jensen convex. We say that $f$ is convex if $f$ is $\lambda$ -convex for any $\lambda\in(0,1)$ .

One can define almost $\lambda$-concavity, $\lambda$-concavity and concavity by reversing inequality (30). Adamek [1, Theorem 1] showed that an almost $\lambda$-convex function is identical to a $\lambda$-convex function except on a set of Lebesgue measure 0. (To apply the theorem there, one can take the ideals $\mathcal{I}_{1}$ and $\mathcal{I}_{2}$ as the families of sets with Lebesgue measure 0 in $\mathbb{R}^{d}$ and $\mathbb{R}^{2d}$, respectively.) In general, $\lambda$-convexity is not equivalent to convexity, as it is not a strong enough notion to imply continuity, at least not in a logical framework that accepts the axiom of choice. Indeed, counterexamples can be constructed using a Hamel basis for $\mathbb{R}$ as a vector space over $\mathbb{Q}$. However, in the case that $f$ is Lebesgue measurable, a classical result of Blumberg [6] and Sierpinski [40] (see also [18] for a more general setting) shows that $\lambda$-convexity implies continuity, and thus convexity.

Theorem 25.

Let $s>-1/d$ and define $r=1+s$. Let $f$ be a probability density on $\mathbb{R}^{d}$. The following statements are equivalent.

1. The density $f$ is $s$-concave.

2. For any $\lambda\in(0,1)$, we have $h_{r}(\lambda X+(1-\lambda)Y)\leq h_{r}(X)$ for any random vectors $X$ and $Y$ with the identical density $f$.

3. We have $h_{r}\left(\frac{X+Y}{2}\right)\leq h_{r}(X)$ for any random vectors $X$ and $Y$ with the identical density $f$.

Proof.

We only prove the statement for $s>0$ , or equivalently $r>1$ . The proof for $-1/d<s<0$ , or equivalently $1-1/d<r<1$ , is similar and sketched below.

$1\Longrightarrow 2$ : The proof is taken from [28]. We include it for completeness. Let $g$ be the density of $\lambda X+(1-\lambda)Y$ . Then we have

[TABLE]

This is equivalent to the desired statement. Identity (31) follows from the assumption that $X$ and $Y$ have the same distribution. In inequality (32), we use the concavity of $f^{r-1}$ and the fact that $\frac{1}{1-r}\log x$ is decreasing when $r>1$. Inequality (33) follows from Hölder’s inequality and the fact that $\frac{1}{1-r}\log x$ is decreasing when $r>1$. For $1-1/d<r<1$, the statement follows from the same argument in conjunction with the convexity of $f^{r-1}$, the converse of Hölder’s inequality and the fact that $\frac{1}{1-r}\log x$ is increasing when $0<r<1$.
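For $r>1$, the three steps described above can be sketched at the level of the integrals $\int f^{r}$ and $\int g^{r}$. This is a reconstruction from the verbal description (equality of the marginals, concavity of $f^{r-1}$, Hölder's inequality with exponents $r/(r-1)$ and $r$), not a verbatim copy of the paper's display, and the arrangement of the steps may differ from the original.

```latex
\begin{align*}
\int f^{r}\,dx
 &= \lambda\,\mathbb{E} f^{r-1}(X) + (1-\lambda)\,\mathbb{E} f^{r-1}(Y)
   && \text{($X$ and $Y$ have density $f$)}\\
 &\leq \mathbb{E} f^{r-1}\bigl(\lambda X + (1-\lambda)Y\bigr)
   = \int f^{r-1} g \,dx
   && \text{(concavity of $f^{r-1}$)}\\
 &\leq \Bigl(\int f^{r}\,dx\Bigr)^{\frac{r-1}{r}} \Bigl(\int g^{r}\,dx\Bigr)^{\frac{1}{r}}
   && \text{(H\"older's inequality)}.
\end{align*}
```

Dividing by $\left(\int f^{r}\right)^{(r-1)/r}$ gives $\int f^{r}\leq\int g^{r}$, and since $\frac{1}{1-r}\log x$ is decreasing for $r>1$, this yields $h_{r}(\lambda X+(1-\lambda)Y)\leq h_{r}(X)$.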

$2\Longrightarrow 3$ : Obvious by taking $\lambda=\frac{1}{2}$ .

$3\Longrightarrow 1$: We will prove the statement by contradiction. We first show an example borrowed from Cover and Zhang [20] to illustrate the “mass transferring” argument used in our proof. Consider the density $f(x)=3/2$ on the intervals $(0,1/3)$ and $(2/3,1)$. It is clear that $f$ is not $(r-1)$-concave. The joint distribution of $(X,Y)$ with $Y\equiv X$ is supported on the diagonal line $y=x$. The Radon-Nikodym derivative $g$ with respect to the one-dimensional Lebesgue measure on the line $y=x$ exists and is shown in Figure 1. We move some “mass” from the diagonal line $y=x$ to the lines $y=x-2/3$ and $y=x+2/3$. The new Radon-Nikodym derivative $\hat{g}$ is shown in Figure 2. Let $(\hat{X},\hat{Y})$ be a pair of random variables whose joint distribution possesses this new Radon-Nikodym derivative. It is easy to see that $\hat{X}$ and $\hat{Y}$ still have the same density $f$. But $\hat{X}+\hat{Y}$ is uniformly distributed on $(0,2)$, and thus $h_{r}(\hat{X}+\hat{Y})=\log 2$. One can check that $h_{r}(2X)=\log(4/3)$, which is strictly smaller.
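The two entropy values in this example can be verified directly from the definition $h_{r}(X)=\frac{1}{1-r}\log\int f^{r}$. The following is a minimal sketch for piecewise-constant densities; the function name `renyi_entropy_piecewise` is illustrative, and the uniform law of $\hat{X}+\hat{Y}$ on $(0,2)$ is taken from the text rather than derived from the coupling.

```python
import math


def renyi_entropy_piecewise(pieces: list[tuple[float, float]], r: float) -> float:
    """Renyi entropy of order r for a piecewise-constant density.

    `pieces` lists (height, length) pairs; the density equals `height`
    on a set of Lebesgue measure `length` and vanishes elsewhere.
    """
    if r <= 0 or r == 1:
        raise ValueError("r must be positive and different from 1")
    mass = sum(h * ell for h, ell in pieces)
    if not math.isclose(mass, 1.0):
        raise ValueError("pieces do not integrate to 1")
    integral = sum(h ** r * ell for h, ell in pieces)
    return math.log(integral) / (1.0 - r)


# 2X has density 3/4 on (0, 2/3) and on (4/3, 2).
two_x = [(0.75, 2.0 / 3.0), (0.75, 2.0 / 3.0)]
# The text states that X_hat + Y_hat is uniform on (0, 2).
uniform_sum = [(0.5, 2.0)]

for r in (0.5, 2.0, 3.0):
    print(r, renyi_entropy_piecewise(two_x, r), renyi_entropy_piecewise(uniform_sum, r))
```

Both densities are uniform on their supports, so the values do not depend on $r$: they equal the logarithm of the support length, $\log(4/3)$ and $\log 2$ respectively.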

Now we turn to the general case. Suppose that $f$ is not $(r-1)$ -concave, i.e., $f^{r-1}$ is not concave (for $r>1$ ). We claim that there exists a set $A\subseteq\mathbb{R}^{2d}$ of positive Lebesgue measure on $\mathbb{R}^{2d}$ such that the inequality

[TABLE]

holds for all $(x,y)\in A$ . Otherwise, the converse of (34) holds for almost every pair $(x,y)$ , and thus $f^{r-1}$ is an almost mid-concave function (i.e., 1/2-concave). By Theorem 1 in [1], $f^{r-1}$ is identical to a mid-concave function except on a set of Lebesgue measure 0. Without changing the distribution, we can modify $f$ such that $f^{r-1}$ is mid-concave. Using the equivalence of mid-concavity and concavity (under the Lebesgue measurability), after modification, $f^{r-1}$ is concave, i.e., $f$ is $(r-1)$ -concave. This contradicts our assumption. Hence, there exists such a set $A$ with positive Lebesgue measure on $\mathbb{R}^{2d}$ . Then there exists $y$ such that (34) holds for a set of $x$ with positive Lebesgue measure on $\mathbb{R}^{d}$ . We rephrase this statement in a form suitable for our purpose. There is $x_{0}\neq 0$ such that the set

[TABLE]

has positive Lebesgue measure on $\mathbb{R}^{d}$ . For $\epsilon>0$ , we denote by $\Lambda(\epsilon)$ a ball of radius $\epsilon$ whose intersection with $\Lambda$ has positive Lebesgue measure on $\mathbb{R}^{d}$ . Consider $(X,Y)$ such that $X\equiv Y$ , where $X$ and $Y$ have the identical density $f$ . Let $g(x,y)$ be the Radon-Nikodym derivative of $(X,Y)$ with respect to the $d$ -dimensional Lebesgue measure on the “diagonal line” $y=x$ . Now we build a new density $\hat{g}$ by translating a small amount of “mass” from “diagonal points” $(x-x_{0},x-x_{0})$ and $(x+x_{0},x+x_{0})$ to “off-diagonal points” $(x-x_{0},x+x_{0})$ and $(x+x_{0},x-x_{0})$ . To be more precise, we define the new joint density $\hat{g}$ as

[TABLE]

where $\delta>0$ and ${\bf 1}_{S}$ is the indicator function of the set $S$ . The function $\hat{g}$ is supported on the “diagonal line” $y=x$ and “off-diagonal segments” $\{(x-x_{0},x+x_{0}):x\in\Lambda(\epsilon)\}$ and $\{(x+x_{0},x-x_{0}):x\in\Lambda(\epsilon)\}$ , which are disjoint for sufficiently small $\epsilon>0$ . (This is similar to Figure 2). When $\delta>0$ is small enough, $\hat{g}(x,y)$ is non-negative everywhere. Furthermore, our construction preserves the “total mass”. Hence, the function $\hat{g}(x,y)$ is indeed a probability density with respect to the $d$ -dimensional Lebesgue measure on the “diagonal line” and two “off-diagonal segments”. Let $(\hat{X},\hat{Y})$ be a pair with the joint density $\hat{g}(x,y)$ . The marginals $\hat{X}$ and $\hat{Y}$ have the same distribution as that of $X$ , since the “positive mass” on “off-diagonal points” complements the “mass deficit” on “diagonal points” when we project in the $x$ and $y$ directions. We claim that $\frac{\hat{X}+\hat{Y}}{2}$ has larger entropy than $\hat{X}$ . One can check that the density of $\frac{\hat{X}+\hat{Y}}{2}$ is

[TABLE]

Let $\Omega$ denote the union of $\Lambda(\epsilon)$ , $\Lambda(\epsilon)+x_{0}$ and $\Lambda(\epsilon)-x_{0}$ . Then we have

[TABLE]

Since $x_{0}\neq 0$ , for $\epsilon>0$ small enough, $\Omega$ is the union of disjoint translates of $\Lambda(\epsilon)$ . When $\delta>0$ is sufficiently small, we have

[TABLE]

where inequality (37) follows from the observation that for $x\in\Lambda(\epsilon)\subset\Lambda$ (see (35)) the derivative of the integrand at $\delta=0$ is

[TABLE]

Since $r>1$ , (36) together with (38) implies that

[TABLE]

This contradicts our assumption. Hence, $f$ has to be $(r-1)$-concave. For $1-1/d<r<1$, we redefine the set $\Lambda$ by reversing inequality (35), and inequality (37) will also be reversed. We arrive at the same conclusion. ∎

Remark 26.

The implication $1\Longrightarrow 2$ is an immediate consequence of Theorem 3.36 in [30]. The theorem there draws heavily on the ideas of [42], a related study that derives the Schur convexity of Rényi entropies under the assumption of exchangeability and $s$-concavity of the random variables, generalizing Yu’s results in [43] on the entropies of sums of i.i.d. log-concave random variables. Although we state Theorem 25 for two random vectors, the argument also works for more than two random vectors. Hence, it implies the seemingly stronger Theorem 4.

As an immediate consequence of Theorem 25, we have the following reverse Rényi EPI for random vectors with the same distribution.

Corollary 27.

Let $s>-1/d$ and let $r=1+s$ . Let $X$ and $Y$ be (possibly dependent) random vectors in $\mathbb{R}^{d}$ with the same density $f$ being $s$ -concave. Then we have

[TABLE]
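One way to see how a bound on $X+Y$ follows from Theorem 25 is through the scaling rule $h_{r}(aX)=h_{r}(X)+d\log|a|$ for $a\neq 0$. The sketch below is derived here from statement 3 of Theorem 25 rather than copied from the corollary, so the exact form of the displayed inequality is an assumption; $N_{r}(X)=e^{2h_{r}(X)/d}$ denotes the Rényi entropy power.

```latex
\begin{align*}
h_r(X+Y) &= h_r\!\left(\tfrac{X+Y}{2}\right) + d\log 2
          \leq h_r(X) + d\log 2 = h_r(2X),\\
N_r(X+Y) &= e^{2h_r(X+Y)/d} \leq e^{2\log 2}\, e^{2h_r(X)/d} = 4\,N_r(X).
\end{align*}
```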

Bibliography

[1] M. Adamek. Almost $\lambda$-convex and almost Wright-convex functions. Mathematica Slovaca, 53(1):67–73, 2003.
[2] K. Ball, P. Nayar and T. Tkocz. A reverse entropy power inequality for log-concave random vectors. Studia Math., 235(1):17–30, 2016.
[3] A. R. Barron. Entropy and the central limit theorem. Ann. Probab., 14:336–342, 1986.
[4] W. Beckner. Inequalities in Fourier analysis. Ann. of Math. (2), 102(1):159–182, 1975.
[5] R. N. Bhattacharya and R. Ranga Rao. Normal approximation and asymptotic expansions. John Wiley & Sons, Inc., 1976. Also: Soc. for Industrial and Appl. Math., Philadelphia, 2010.
[6] H. Blumberg. On convex functions. Trans. Amer. Math. Soc., 20:40–44, 1919.
[7] S. G. Bobkov, G. P. Chistyakov and F. Götze. Rényi divergence and the central limit theorem. Ann. Probab., 47(1):270–323, 2019.
[8] S. Bobkov and M. Madiman. Reverse Brunn-Minkowski and reverse entropy power inequalities for convex measures. J. Funct. Anal., 262:3309–3339, 2012.
