An Entropy Power Inequality for Discrete Random Variables
Ehsan Nekouei, Mikael Skoglund, Karl Henrik Johansson

School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden. {nekouei,skoglund,kallej}@kth.se
This work is supported by the Knut and Alice Wallenberg Foundation, the Swedish Foundation for Strategic Research and the Swedish Research Council.
Abstract
Let $N_d[X] = \frac{1}{2\pi e} e^{2H[X]}$ denote the entropy power of the discrete random variable $X$, where $H[X]$ denotes the discrete entropy of $X$. In this paper, we show that for two independent discrete random variables $X$ and $Y$, the entropy power inequality $N_d[X] + N_d[Y] \le 2 N_d[X+Y]$ holds and can be tight. The basic idea behind the proof is to perturb the discrete random variables using suitably designed continuous random variables. Then, the continuous entropy power inequality is applied to the sum of the perturbed random variables and the resulting lower bound is optimized.
Index Terms:
Discrete entropy power inequality.
I Introduction
The continuous entropy power inequality [1], [2], [3] asserts that for two independent absolutely continuous random variables (rvs) $X$ and $Y$, the following inequality holds
$$N_c[X] + N_c[Y] \le N_c[X+Y], \tag{1}$$
where $N_c[X] = \frac{1}{2\pi e} e^{2h[X]}$ and $h[X]$ denote the continuous entropy power and the differential entropy of $X$, respectively. In the information theory literature, substantial efforts have been dedicated to obtaining an analogue of (1) for discrete rvs. In general, the discrete counterpart of (1), where the differential entropy is replaced by the discrete entropy, does not hold for discrete rvs. Classes of discrete rvs which satisfy the discrete version of (1) have been studied in the literature. Let $B(n, p)$ denote a binomial distribution with $n$ trials and success probability $p$. Harremoës and Vignat [4] showed that the discrete version of (1) holds for two binomial rvs distributed according to $B(n, p)$ and $B(m, p)$ with $p = \frac{1}{2}$. Sharma et al. [5] proved that this result also holds for other success probabilities $p$ when $n$ and $m$ are sufficiently large.
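A minimal worked example, not taken from the paper, of why the discrete counterpart of (1) can fail. It only assumes that the discrete entropy power is proportional to $e^{2H}$, as stated in the abstract; the constants $a$ and $b$ are arbitrary and chosen for illustration.

```latex
% X = a and Y = b with probability one, so X + Y = a + b with probability one.
\[
H[X] = H[Y] = H[X+Y] = 0
\quad\Longrightarrow\quad
e^{2H[X]} + e^{2H[Y]} = 2 \;>\; 1 = e^{2H[X+Y]} .
\]
```

The common normalizing constant cancels on both sides, so the violation does not depend on how the entropy power is scaled.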
The authors of [6] showed that the discrete version of (1) holds for two independent and uniformly distributed rvs. A variant of the entropy power inequality for ultra log-concave discrete rvs has been derived in [7] using Rényi’s thinning operation. It is worth mentioning that lower bounds on the entropy of a sum of independent discrete rvs have been investigated extensively in the literature. The interested reader is referred to [8], [9] and references therein for more information on this line of research.
In this paper, we derive a discrete entropy power inequality, which is analogous to the continuous entropy power inequality and holds for the sum of two arbitrarily distributed, independent discrete rvs. More specifically, it is shown that for two independent discrete rvs $X$ and $Y$, we have
$$N_d[X] + N_d[Y] \le 2 N_d[X+Y]$$
regardless of their distributions, where $N_d[\cdot]$ and $H[\cdot]$ denote the discrete entropy power and the discrete entropy, respectively.
I-A Notation and Organization of The Paper
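Let $X$ denote a generic continuous random variable taking values on $\mathbb{R}$. The differential entropy of $X$ and its (continuous) entropy power are defined as
$$h[X] = -\int_{\mathbb{R}} f_X(x) \log f_X(x)\, dx, \qquad N_c[X] = \frac{1}{2\pi e} e^{2h[X]},$$
where $f_X$ denotes the probability density function (pdf) of $X$. For a generic discrete random variable $U$, its discrete entropy and entropy power are defined as
$$H[U] = -\sum_{u} \Pr(U = u) \log \Pr(U = u), \qquad N_d[U] = \frac{1}{2\pi e} e^{2H[U]}.$$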
The rest of this paper is organized as follows. The next section presents our main result along with the key steps of its proof. Detailed proofs of the steps are presented in Section III.
II The Main Result
The following theorem establishes an entropy power inequality for the sum of two independent discrete rvs.
Theorem 1
Consider two independent discrete rvs $X$ and $Y$. Then, we have
$$N_d[X] + N_d[Y] \le 2 N_d[X+Y]. \tag{2}$$
Moreover, equality is achieved when the “effective” support sets of $X$ and $Y$ are singletons.
Theorem 1 establishes an upper bound on the sum of the entropy powers of two independent discrete rvs. According to this result, the sum of the entropy powers of two independent discrete rvs never exceeds twice the entropy power of their sum. The inequality is tight when each rv takes a single value from its support set with probability one. Note that the difference between the two sides of (2) becomes small when the probability mass function of each rv is highly concentrated around one element of its support set.
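The statement of Theorem 1 can be sanity-checked numerically. The following is a minimal sketch, assuming the discrete entropy power is $\frac{1}{2\pi e} e^{2H}$ with entropy in nats, as in the abstract. The helper names (`entropy_power`, `pmf_of_sum`, `random_pmf`, `epi_gap`) and the random integer-valued test distributions are hypothetical choices made for illustration, not part of the paper.

```python
import math
import random
from collections import defaultdict

def entropy(pmf):
    """Discrete entropy in nats of a pmf given as {value: probability}."""
    return -sum(p * math.log(p) for p in pmf.values() if p > 0)

def entropy_power(pmf):
    """Discrete entropy power exp(2 H) / (2 pi e)."""
    return math.exp(2 * entropy(pmf)) / (2 * math.pi * math.e)

def pmf_of_sum(pmf_x, pmf_y):
    """pmf of X + Y for independent X and Y (discrete convolution)."""
    out = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)

def random_pmf(rng, size):
    """Hypothetical helper: a random pmf on the integers 0..size-1."""
    weights = [rng.random() for _ in range(size)]
    total = sum(weights)
    return {k: w / total for k, w in enumerate(weights)}

def epi_gap(pmf_x, pmf_y):
    """Right-hand side minus left-hand side of the inequality in Theorem 1."""
    n_sum = entropy_power(pmf_of_sum(pmf_x, pmf_y))
    return 2 * n_sum - (entropy_power(pmf_x) + entropy_power(pmf_y))

rng = random.Random(0)
gaps = []
for _ in range(1000):
    pmf_x = random_pmf(rng, rng.randint(1, 6))
    pmf_y = random_pmf(rng, rng.randint(1, 6))
    gaps.append(epi_gap(pmf_x, pmf_y))

# Two point masses: the case in which the theorem says equality holds.
point_mass_gap = epi_gap({3: 1.0}, {-1: 1.0})
```

In this sketch every entry of `gaps` should be non-negative up to floating-point error, and `point_mass_gap` should vanish, matching the tightness claim for singleton supports.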
II-A Proof of Theorem 1
The proof of Theorem 1 relies on (i) perturbing the discrete rvs by carefully chosen continuous rvs, (ii) applying the continuous entropy power inequality to the sum of the perturbed rvs, and (iii) optimizing the lower bound obtained in step (ii). In this subsection, Theorem 1 is proved using four key lemmas.
Let $U$ denote a discrete rv taking values in $\mathcal{U} \subset \mathbb{R}$ and $d_U$ denote the minimum spacing between its atoms, i.e., $d_U = \inf_{u, u' \in \mathcal{U},\, u \ne u'} |u - u'|$. Also, let $Z$ denote a real-valued rv, independent of $U$, with $|Z| \le d_U/2$ almost surely (a.s.). We assume that $Z$ is absolutely continuous with respect to the Lebesgue measure on the real line and has finite differential entropy.
The following lemma derives an expression for the differential entropy of $U + Z$. Its proof is presented in Subsection III-A.
Lemma 1
The differential entropy of $U + Z$ can be written as
$$h[U + Z] = H[U] + h[Z],$$
where $h[\cdot]$ and $H[\cdot]$ denote the differential entropy and the discrete entropy, respectively.
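Lemma 1 can be illustrated numerically. The sketch below is not from the paper: the atoms, their probabilities, the triangular perturbation density, and the grid step are all assumptions chosen for the example. It perturbs a three-atom discrete rv with a continuous rv whose support is narrower than the minimum spacing, and compares a Riemann-sum estimate of the differential entropy of the perturbed rv with the discrete entropy plus the differential entropy of the perturbation.

```python
import numpy as np

def discrete_entropy(probs):
    """Discrete entropy in nats."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def differential_entropy(density, step):
    """Riemann-sum approximation of -integral(f log f) on a uniform grid."""
    f = density[density > 0]
    return float(-np.sum(f * np.log(f)) * step)

def triangular_pdf(z, half_width):
    """Assumed perturbation density: a triangle on [-half_width, half_width]."""
    return np.maximum(half_width - np.abs(z), 0.0) / half_width**2

atoms = np.array([0.0, 1.0, 2.5])          # illustrative support of the discrete rv
probs = np.array([0.2, 0.5, 0.3])
spacing = np.min(np.diff(np.sort(atoms)))  # minimum spacing between atoms
half_width = 0.4 * spacing                 # perturbation support narrower than the spacing

step = 1e-4
z = np.arange(atoms.min() - 1.0, atoms.max() + 1.0, step) + step / 2

# Density of (discrete rv + perturbation): a mixture of shifted copies.
mixture = sum(p * triangular_pdf(z - u, half_width) for u, p in zip(atoms, probs))

lhs = differential_entropy(mixture, step)                      # entropy of the perturbed rv
h_z = differential_entropy(triangular_pdf(z, half_width), step)
rhs = discrete_entropy(probs) + h_z                            # discrete entropy + h of perturbation
```

Because the shifted copies of the perturbation density do not overlap, `lhs` and `rhs` should agree up to discretization error. Widening `half_width` beyond half the spacing makes the copies overlap, and the two quantities then generally differ.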
Let $X$ and $Y$ denote independent discrete rvs, and let $X + Y$ denote their sum. Let $d_X$, $d_Y$ and $d_{X+Y}$ denote the minimum spacing of $X$, $Y$ and $X + Y$, respectively. The next lemma derives an upper bound on the minimum spacing of $X + Y$. The proof of this result is straightforward and is omitted.
Lemma 2
We have $d_{X+Y} \le \min\{d_X, d_Y\}$.
According to this lemma, the minimum spacing between the atoms of $X + Y$ is not larger than those of $X$ and $Y$.
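A small sketch of the minimum-spacing claim, using integer supports to avoid floating-point issues. The supports and the helper names `min_spacing` and `sum_support` are hypothetical and chosen only for illustration.

```python
from itertools import product

def min_spacing(support):
    """Smallest gap between distinct atoms; infinity if there is a single atom."""
    pts = sorted(set(support))
    return min((b - a for a, b in zip(pts, pts[1:])), default=float("inf"))

def sum_support(support_x, support_y):
    """Support of the sum of two independent rvs with the given supports."""
    return {x + y for x, y in product(support_x, support_y)}

support_x = [0, 3, 5]
support_y = [0, 2, 9]

d_x = min_spacing(support_x)
d_y = min_spacing(support_y)
d_sum = min_spacing(sum_support(support_x, support_y))
```

Here the sum has atoms at 2 and 3, so its minimum spacing is strictly smaller than that of either summand, which shows that the bound need not hold with equality.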
Let $Z_1$ and $Z_2$ be independent and identically distributed (iid) absolutely continuous rvs which are independent of $X$ and $Y$, and take values in $[-d_{X+Y}/4, d_{X+Y}/4]$. Let $f$ denote the common probability density function (pdf) of $Z_1$ and $Z_2$ (with respect to the Lebesgue measure on the real line) and assume it has finite differential entropy. Consider the rvs $X + Z_1$ and $Y + Z_2$, which are obtained by perturbing $X$ and $Y$ using $Z_1$ and $Z_2$. From Lemmas 1 and 2, we have
$$h[X + Z_1] = H[X] + h[Z_1], \qquad h[Y + Z_2] = H[Y] + h[Z_2]. \tag{4}$$
Moreover, using Lemma 1 and the fact that $|Z_1 + Z_2| \le d_{X+Y}/2$ a.s., we have
$$h[X + Y + Z_1 + Z_2] = H[X + Y] + h[Z_1 + Z_2]. \tag{5}$$
The equalities (4) and (5) are used to establish an inequality on the entropy power of $X + Y$ in Lemma 3. This lemma is proved in Subsection III-B by applying the continuous entropy power inequality to the sum of the perturbed rvs $X + Z_1$ and $Y + Z_2$.
Lemma 3
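Let $\mathcal{F}$ denote the set of pdfs that are defined on $[-d_{X+Y}/4, d_{X+Y}/4]$ and have finite differential entropies. Then, we have
$$N_d[X] + N_d[Y] \le N_d[X+Y] \inf_{f \in \mathcal{F}} \frac{N_c[Z_1 + Z_2]}{N_c[Z_1]},$$
where $Z_1$ and $Z_2$ are two independent absolutely continuous rvs with pdf $f$.
The next lemma characterizes the lower bound in Lemma 3. The proof of this lemma is relegated to Subsection III-C.
Lemma 4
$$\inf_{f \in \mathcal{F}} \frac{N_c[Z_1 + Z_2]}{N_c[Z_1]} = 2.$$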
The proof of Theorem 1 follows from Lemmas 3 and 4.
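The optimization over perturbation densities can be illustrated numerically. The sketch below assumes that the quantity being minimized is the ratio of the continuous entropy power of the sum of two iid perturbations to the entropy power of one of them, over densities on a bounded interval. The interval $[-1, 1]$, the grid step, and the value of `sigma` are illustrative choices, not taken from the paper.

```python
import numpy as np

def differential_entropy(density, step):
    """Riemann-sum approximation of -integral(f log f) on a uniform grid."""
    f = density[density > 0]
    return float(-np.sum(f * np.log(f)) * step)

def entropy_power_ratio(density, step):
    """Entropy power of Z1 + Z2 divided by that of Z1, for iid Z1, Z2."""
    density = density / (np.sum(density) * step)        # normalize on the grid
    density_sum = np.convolve(density, density) * step  # density of Z1 + Z2
    h_single = differential_entropy(density, step)
    h_sum = differential_entropy(density_sum, step)
    return float(np.exp(2 * (h_sum - h_single)))

half_width = 1.0   # illustrative support [-1, 1]
step = 1e-3
z = np.arange(-half_width, half_width + step / 2, step)

# Uniform perturbation: the sum is triangular and the ratio is about e.
ratio_uniform = entropy_power_ratio(np.ones_like(z), step)

# Narrow Gaussian truncated to the interval: truncation is negligible here.
sigma = 0.05
ratio_gaussian = entropy_power_ratio(np.exp(-z**2 / (2 * sigma**2)), step)
```

In this sketch `ratio_uniform` comes out near $e \approx 2.718$, while `ratio_gaussian` is close to 2, the smallest value the continuous entropy power inequality allows for iid summands. This is consistent with using truncated Gaussians with small variance to approach the infimum.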
III Proofs of Lemmas
III-A Proof of Lemma 1
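Let $f_Z$ denote the pdf of $Z$. Then, the pdf of $U + Z$ can be written as $f_{U+Z}(x) = \sum_{u} \Pr(U = u)\, f_Z(x - u)$. The assumption $|Z| \le d_U/2$ a.s. implies that the size of the support set of $Z$ does not exceed the minimum spacing of $U$. This observation implies that the pdf of $U + Z$ is composed of non-overlapping components. Using the definition of the differential entropy, we have
$$\begin{aligned}
h[U + Z] &= -\int f_{U+Z}(x) \log f_{U+Z}(x)\, dx \\
&\overset{(a)}{=} -\sum_{u} \int \Pr(U = u) f_Z(x - u) \log\big(\Pr(U = u) f_Z(x - u)\big)\, dx \\
&= -\sum_{u} \Pr(U = u) \log \Pr(U = u) - \sum_{u} \Pr(U = u) \int f_Z(x - u) \log f_Z(x - u)\, dx \\
&\overset{(b)}{=} H[U] + h[Z],
\end{aligned}$$
where (a) follows from the fact that the components of the pdf of $U + Z$ are non-overlapping and (b) from the fact that the differential entropy is shift-invariant.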
III-B Proof of Lemma 3
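Using the entropy power inequality for continuous rvs [1], we have
$$\begin{aligned}
N_c[X + Z_1 + Y + Z_2] &\ge N_c[X + Z_1] + N_c[Y + Z_2] \\
\overset{(a)}{\Longrightarrow} \quad e^{2H[X+Y]} N_c[Z_1 + Z_2] &\ge e^{2H[X]} N_c[Z_1] + e^{2H[Y]} N_c[Z_2] \\
&\overset{(b)}{=} N_c[Z_1] \left(e^{2H[X]} + e^{2H[Y]}\right),
\end{aligned}$$
where (a) follows from the entropy equalities for the perturbed rvs derived in Subsection II-A, including (5), and (b) follows from the fact that $Z_1$ and $Z_2$ are identically distributed. Hence, we have
$$N_d[X] + N_d[Y] \le N_d[X+Y]\, \frac{N_c[Z_1 + Z_2]}{N_c[Z_1]}. \tag{6}$$
Inequality (6) holds for any pdf $f$ defined on $[-d_{X+Y}/4, d_{X+Y}/4]$ with a finite differential entropy. Thus, we have
$$N_d[X] + N_d[Y] \le N_d[X+Y] \inf_{f \in \mathcal{F}} \frac{N_c[Z_1 + Z_2]}{N_c[Z_1]}.$$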
III-C Proof of Lemma 4
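Using the entropy power inequality for continuous rvs, we have
$$N_c[Z_1 + Z_2] \ge N_c[Z_1] + N_c[Z_2] = 2 N_c[Z_1]$$
for all independent and identically distributed rvs $Z_1$ and $Z_2$ with the common pdf in $\mathcal{F}$. Thus, we have
$$\inf_{f \in \mathcal{F}} \frac{N_c[Z_1 + Z_2]}{N_c[Z_1]} \ge 2.$$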
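To show the other direction, let $\phi_\sigma$ denote the pdf of a Gaussian rv with zero mean and variance $\sigma^2$. Let $f_\sigma$ denote the pdf obtained by truncating $\phi_\sigma$ outside $[-d_{X+Y}/4, d_{X+Y}/4]$, i.e.,
$$f_\sigma(z) = \begin{cases} \phi_\sigma(z)/c_\sigma, & |z| \le d_{X+Y}/4, \\ 0, & \text{otherwise,} \end{cases}$$
where $c_\sigma = \int_{-d_{X+Y}/4}^{d_{X+Y}/4} \phi_\sigma(z)\, dz$ is the normalizing factor. Let $\tilde{Z}_1$ and $\tilde{Z}_2$ be two independent rvs distributed according to $f_\sigma$. Then, we have
$$\inf_{f \in \mathcal{F}} \frac{N_c[Z_1 + Z_2]}{N_c[Z_1]} \overset{(a)}{\le} \frac{N_c[\tilde{Z}_1 + \tilde{Z}_2]}{N_c[\tilde{Z}_1]} \overset{(b)}{\le} \frac{\mathrm{Var}[\tilde{Z}_1 + \tilde{Z}_2]}{N_c[\tilde{Z}_1]} = \frac{2\, \mathrm{Var}[\tilde{Z}_1]}{N_c[\tilde{Z}_1]},$$
where (a) follows from the fact that $f_\sigma$ belongs to $\mathcal{F}$ and (b) follows from the entropy maximizing property of Gaussian distributions. The variance of $\tilde{Z}_1$ can be upper bounded as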
[TABLE]
Moreover, the differential entropy of the truncated Gaussian rv can be written as
[TABLE]
Using the variance bound and the entropy expression above, we have
[TABLE]
Note that and . The term can be upper bounded as
[TABLE]
where follows from the fact that for [10]. Thus, we have . The term can be written as
[TABLE]
which implies that . Thus, we have .
For a given , we can find small enough such that and . Thus, we have
[TABLE]
for sufficiently small. The desired result follows from the fact that is arbitrary.
References
- [1] C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, July 1948.
- [2] A. Stam, “Some inequalities satisfied by the quantities of information of Fisher and Shannon,” Information and Control, vol. 2, no. 2, pp. 101–112, 1959.
- [3] N. Blachman, “The convolution inequality for entropy powers,” IEEE Transactions on Information Theory, vol. 11, no. 2, pp. 267–271, April 1965.
- [4] P. Harremoës and C. Vignat, “An entropy power inequality for the binomial family,” Journal of Inequalities in Pure & Applied Mathematics, vol. 4, no. 5, pp. 1–6, 2003.
- [5] N. Sharma, S. Das, and S. Muthukrishnan, “Entropy power inequality for a family of discrete random variables,” in IEEE International Symposium on Information Theory Proceedings, July 2011, pp. 1945–1949.
- [6] J. O. Woo and M. Madiman, “A discrete entropy power inequality for uniform distributions,” in IEEE International Symposium on Information Theory, June 2015, pp. 1625–1629.
- [7] O. Johnson and Y. Yu, “Monotonicity, thinning, and discrete versions of the entropy power inequality,” IEEE Transactions on Information Theory, vol. 56, no. 11, pp. 5387–5395, Nov. 2010.
- [8] S. Haghighatshoar, E. Abbe, and I. E. Telatar, “A new entropy power inequality for integer-valued random variables,” IEEE Transactions on Information Theory, vol. 60, no. 7, pp. 3787–3796, July 2014.
