This paper establishes the existence and uniqueness of the Augustin center and bounds for channels with convex constraints, introduces related capacities and radii, and derives sphere packing bounds for various memoryless channels.
Contribution
It introduces Augustin-Legendre capacity, center, and radius, proving their equivalence to Renyi-Gallager entities, and derives sphere packing bounds with polynomial prefactors for specific channel families.
Findings
01 Existence of a unique Augustin center for channels with convex constraints.
02 Equivalence of Augustin-Legendre and Renyi-Gallager capacities and centers.
03 Sphere packing bounds with polynomial prefactors for certain memoryless channels.
Abstract
For any channel with a convex constraint set and finite Augustin capacity, existence of a unique Augustin center and associated Erven-Harremoes bound are established. Augustin-Legendre capacity, center, and radius are introduced and proved to be equal to the corresponding Renyi-Gallager entities. Sphere packing bounds with polynomial prefactors are derived for codes on two families of channels: (possibly non-stationary) memoryless channels with multiple additive cost constraints and stationary memoryless channels with convex constraints on the empirical distribution of the input codewords.
I Introduction
Augustin [2], [3] derived the sphere packing bound
for product channels without assuming stationarity.
Assuming that the order 1/2 Renyi capacities of the component channels are O(ln n),
we derived the sphere packing bound for product channels with a prefactor that is polynomial
in the block length n [12, Theorem LABEL:B-thm:productexponent].
In this manuscript, we derive analogous results for two families of memoryless channels.
As we have done for the product channels in [12], we first derive a non-asymptotic
outer bound for codes on a given memoryless channel, then we derive our asymptotic result using
this bound.
In [3, Chapter VII], Augustin pursued an analysis similar to ours and derived the
sphere packing bound for memoryless channels with cost constraints [3, §36].
In addition, Augustin established the connection between the exponent of Gallager’s inner bound for the cost
constrained channels [8, Thm 8] and the sphere packing exponent [3, §35].
Our results surpass Augustin’s results in two ways:
•
Augustin assumes the cost function to be bounded.
[Footnote 1: The issue here is not a matter of rescaling: certain conclusions of
Augustin’s analysis are not correct when cost functions are not bounded.]
This hypothesis excludes certain important and interesting cases such as the Gaussian
channels.
Hence, Augustin’s analysis in [3] does not imply the sphere packing bounds
derived by Shannon [15] and Ebert [6].
We do not assume the cost function to be bounded. Thus, Theorem 1 establishes
the sphere packing bound for a wider class of channels including the Gaussian channels with
multiple antennas.
[Footnote 2: Shannon’s approximation error terms in [15] are considerably
better than ours. But his derivation relies heavily on the geometry of the output space.
Our derivation, on the other hand, is oblivious to it.]
It is even possible to handle certain fading scenarios and additional per antenna power
constraints.
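•
The best asymptotic bound implied by Augustin’s non-asymptotic bound
[3, Thm 36.6] is of the form
$P_{e}^{av(n)}\geq O\left(\frac{1}{e^{\sqrt{n}}}\right)e^{-E_{sp}\left(\ln\frac{M_n}{L_n}-O(\sqrt{n}),\,W_{[1,n]},\,\varrho_n\right)}.$
In Theorem 1 we replace
$O\left(\frac{1}{e^{\sqrt{n}}}\right)$ by $O\left(\frac{1}{n^{\tau}}\right)$ and $O(\sqrt{n})$ by [math].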
For stationary memoryless channels with finite input sets,
the sphere packing bound is well-known [4, Ch. 10], [5].
For such a channel, one first chooses the most populous constant composition sub-code and then
derives the sphere packing bound for the code using the sphere packing bound for the constant
composition sub-code.
[Footnote 3: Haroutunian [9] was the first one to give a
complete proof of the sphere packing bound for constant composition codes.]
Recently, Altug and Wagner [1] sharpened the prefactor of the bound for channels with
finite output sets.
This technique, however, fails when the input set of the channel is infinite.
We show that a sphere packing bound similar to Theorem 1 holds
for codes on stationary memoryless channels with convex constraints on the empirical distribution
of the input codewords.
In the rest of this section, we describe our model and notation
and state our main asymptotic result.
In Section II, we introduce and analyze Augustin information, mean,
capacity, and center as purely measure theoretic concepts.
The role of these concepts in our analysis is analogous to the role of corresponding
Renyi concepts in [11], [12].
In Section III, we investigate the cost constrained Augustin
capacity more closely and introduce the concepts of Augustin-Legendre information
and Renyi-Gallager information, together with the associated means, capacities,
centers, and radii.
Our main aim in Section III is to express the cost constrained
Augustin capacity and center in terms of Augustin-Legendre capacity and center.
In Section IV, we derive non-asymptotic outer bounds for
codes on two families of memoryless channels.
I-A Model and Notation
For any set X, P(X) is the set of all probability mass functions
that are non-zero only on finitely many members of X;
M+(X) is the set of all non-zero mass functions with the same property.
For any measurable space (Y,Y),
P(Y) is the set of all probability measures
and M+(Y) is the set of all finite measures.
For any μ,q∈M+(Y),
μ≤q iff μ(E)≤q(E) ∀E∈Y.
Similarly, for any $\mu,q\in\Re^{\ell}$, $\mu\leq q$ iff $\mu_{i}\leq q_{i}$ $\forall i\in\{1,\ldots,\ell\}$.
For any $\mu,q\in\Re^{\ell}$,
$\mu\cdot q\triangleq\sum_{i=1}^{\ell}\mu_{i}q_{i}$.
For any $\ell\in\mathbb{Z}_{+}$, $\mathbb{1}\in\Re^{\ell}$ is the vector
all of whose entries are one.
For any $S\subset\Re^{\ell}$
we denote the interior of S by intS.
For any set S in a vector space
we denote the convex hull of S by chS.
A channel W is a function from the input set X to the set of all probability
measures on the output space (Y,Y).
A channel W:X→P(Y) is a product channel for a finite index set T
iff there exist channels Wt:Xt→P(Yt) for all t∈T
satisfying W(x)=∏t∈T⊗Wt(xt)
for all x∈X
where
[TABLE]
A product channel is stationary iff all Wt’s are identical.
If X⊂∏t∈T⊗Xt then W is a
memoryless channel.
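An (M,L) channel code on W:X→P(Y)
is an ordered pair (Ψ,Θ) composed of an encoding function $\Psi:\mathcal{M}\to\mathcal{X}$
and a decoding function $\Theta:\mathcal{Y}\to\widehat{\mathcal{M}}$, where $\mathcal{M}\triangleq\{1,2,\ldots,M\}$,
$\widehat{\mathcal{M}}\triangleq\{\mathcal{L}:\mathcal{L}\subset\mathcal{M}\text{ and }|\mathcal{L}|=L\}$,
and Θ is measurable as a function on
the measurable space (Y,Y).
[Footnote 4: Recall that for any encoder Ψ a deterministic MAP
decoder obtains the minimum Peav among all, possibly non-deterministic, decoders.]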
Given an (M,L) channel code (Ψ,Θ) on W:X→P(Y), the average error probability $P_{e}^{av}$ and
the conditional error probability $P_{e}^{m}$ for $m\in\mathcal{M}$
are given by
[TABLE]
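The display that defined these two quantities did not survive extraction. As a minimal sketch, assume the usual list-decoding convention: the conditional error probability of message m is the probability that m is absent from the decoded list when Ψ(m) is sent, and the average error probability is the arithmetic mean over messages. The binary symmetric channel, the repetition code, and every function name below are hypothetical illustrations rather than objects from the paper.

```python
from itertools import product

def bsc_prob(x, y, delta):
    """Probability that a memoryless BSC with crossover delta maps word x to word y."""
    prob = 1.0
    for xt, yt in zip(x, y):
        prob *= delta if xt != yt else 1.0 - delta
    return prob

def error_probabilities(encoder, decoder, n, delta):
    """Return (average error probability, list of conditional error probabilities).

    encoder: dict mapping each message to its length-n input word (tuple of bits)
    decoder: function mapping an output word to the decoded list (a set of messages)
    """
    conditional = []
    for m, x in encoder.items():
        # Sum the channel probability of every output word whose list misses m.
        pe_m = sum(bsc_prob(x, y, delta)
                   for y in product((0, 1), repeat=n)
                   if m not in decoder(y))
        conditional.append(pe_m)
    return sum(conditional) / len(conditional), conditional

# Hypothetical (M, L) = (2, 1) code: length-3 repetition code, majority decoding.
encoder = {1: (0, 0, 0), 2: (1, 1, 1)}
decoder = lambda y: {2} if sum(y) >= 2 else {1}
pe_av, pe_m = error_probabilities(encoder, decoder, n=3, delta=0.1)
```

For this toy code both conditional error probabilities equal 3δ²(1−δ)+δ³, so the average coincides with each of them.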
A cost function ρ is a function from the input set to $\Re_{\geq0}^{\ell}$ for
some $\ell\in\mathbb{Z}_{+}$.
We assume without loss of generality that
[TABLE]
[Footnote 5: Augustin [3, §33] has the following additional
hypothesis: $\bigvee_{x\in\mathcal{X}}\rho(x)\leq\mathbb{1}$.]
Let Γρ be the set of feasible cost constraints for P(X):
[TABLE]
Then Γρ is a convex set with non-empty interior.
A cost function ρ for a product channel W is said to be additive iff
there exists a ρt:Xt→ℜ≥0ℓ for each t∈T such that
[TABLE]
An encoding function Ψ, hence the corresponding code, is said to satisfy
the cost constraint ϱ iff ∨m∈Mρ(Ψ(m))≤ϱ.
A code on a product channel W:∏t∈T⊗Xt→P(Y)
is said to satisfy an empirical distribution constraint A⊂P(X1)
iff the empirical distribution, i.e. type or composition, of Ψ(m) is in A
for all m∈M.
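The two kinds of constraints described above can be checked mechanically for a finite codebook. The sketch below is only an illustration: the helper names, the quadratic cost, and the toy codebook are assumptions introduced here, not objects from the paper.

```python
from collections import Counter

def additive_cost(codeword, rho):
    """Additive cost of a codeword: the sum of the per-letter cost vectors rho(x_t)."""
    ell = len(rho(codeword[0]))
    total = [0.0] * ell
    for x in codeword:
        for i, c in enumerate(rho(x)):
            total[i] += c
    return total

def satisfies_cost(codebook, rho, budget):
    """True iff every codeword's additive cost is componentwise at most the budget."""
    return all(all(c <= b for c, b in zip(additive_cost(cw, rho), budget))
               for cw in codebook)

def empirical_distribution(codeword):
    """Type (composition) of a codeword: letter -> relative frequency."""
    n = len(codeword)
    return {x: k / n for x, k in Counter(codeword).items()}

# Hypothetical example: a single quadratic (power-like) cost, so ell = 1.
rho = lambda x: (x ** 2,)
codebook = [(1, -1, 0, 2), (0, 0, 1, 1)]
feasible = satisfies_cost(codebook, rho, (6.0,))
```

An empirical distribution constraint A would then be tested by checking that `empirical_distribution(cw)` lies in A for every codeword `cw`.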
I-B Main Result
Assumption 1.
{(Wt,ρt,ϱt)}t∈Z+ is an ordered sequence of
channels with associated cost functions and cost constraints satisfying the following
condition: ∃n0∈Z+,K∈ℜ+ s.t.
[TABLE]
for all n≥n0
where ρ[1,n](x[1,n])=∑t=1nρt(xt).
Theorem 1.
Let {(Wt,ρt,ϱt)}t∈Z+ be a sequence
satisfying Assumption 1,
α0,α1 be orders satisfying 0<α0<α1<1
and ε∈ℜ≥0.
Then for any sequence of codes {(Ψt,Θt)}t∈Z+
on the product channels {W[1,n]}n∈Z+ satisfying
[TABLE]
there exists a τ∈ℜ+ and an n1≥n0 such that
[TABLE]
where $E_{sp}(R,W,\varrho)=\sup_{\alpha\in(0,1)}\frac{1-\alpha}{\alpha}\left(C_{\alpha,W,\varrho}-R\right)$.
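To make the sphere packing exponent concrete, the following sketch evaluates the supremum over α∈(0,1) of ((1−α)/α)(C_α−R) on a grid. It is an illustration under stated assumptions, not the paper's setting: the channel is a single binary symmetric channel without cost constraints, for which the order α capacity has the closed form used below, and all function names are hypothetical.

```python
import numpy as np

def renyi_capacity_bsc(alpha, delta):
    """Order-alpha capacity (nats) of a BSC with crossover delta, for alpha in (0, 1)."""
    return np.log(2.0) + np.log(delta**alpha + (1.0 - delta)**alpha) / (alpha - 1.0)

def sphere_packing_exponent(rate, delta, grid=200001):
    """Grid approximation of sup over alpha in (0, 1) of (1-alpha)/alpha * (C_alpha - rate)."""
    alpha = np.linspace(0.0, 1.0, grid)[1:-1]          # stay inside the open interval
    values = (1.0 - alpha) / alpha * (renyi_capacity_bsc(alpha, delta) - rate)
    # The objective tends to 0 as alpha -> 1, so the supremum is never negative.
    return float(max(values.max(), 0.0))

esp = sphere_packing_exponent(rate=0.2, delta=0.1)
```

For the binary symmetric channel the result can be compared with the classical closed form: if R = ln 2 − h(τ) for some τ between δ and 1/2, where h is the binary entropy, the exponent equals the binary Kullback-Leibler divergence between τ and δ.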
Theorem 1 follows from Lemma 12 and Lemma 13,
through an analysis similar to the one in [12, §LABEL:B-sec:proof:thm:productexponent].
An asymptotic result similar to Theorem 1 for codes on stationary memoryless channels with
convex empirical distribution constraints can be proved using Lemma 12 and
the bound given in equation (10).
II The Augustin Information and Capacity
For any α∈ℜ+ and w,q∈M+(Y), the order α
Renyi divergence is
[TABLE]
where ν is any measure s.t. w≺ν and q≺ν.
If Dα(w∥q)<∞ then the order α tilted probability measure $v_{\alpha}^{w,q}$ is
[TABLE]
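Both displays above were lost in extraction. For finite output sets the standard definitions, as in [7], can be sketched as follows: the order α Renyi divergence is ln(Σ_y w(y)^α q(y)^{1−α})/(α−1) for α≠1 and the Kullback-Leibler divergence for α=1, and the tilted measure is proportional to w^α q^{1−α}. The function names are hypothetical, and both arguments are assumed to be probability mass functions with full support.

```python
import numpy as np

def renyi_divergence(w, q, alpha):
    """Order-alpha Renyi divergence (nats) between pmfs w and q with full support."""
    w, q = np.asarray(w, float), np.asarray(q, float)
    if np.isclose(alpha, 1.0):
        return float(np.sum(w * np.log(w / q)))        # Kullback-Leibler case
    return float(np.log(np.sum(w**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

def tilted_pmf(w, q, alpha):
    """Order-alpha tilted pmf, proportional to w^alpha * q^(1 - alpha)."""
    w, q = np.asarray(w, float), np.asarray(q, float)
    v = w**alpha * q**(1.0 - alpha)
    return v / v.sum()

w, q, alpha = [0.7, 0.2, 0.1], [0.2, 0.5, 0.3], 0.5
v = tilted_pmf(w, q, alpha)
# Identity linking the tilted pmf to the divergence:
# (1 - alpha) D_alpha(w||q) = alpha D_1(v||w) + (1 - alpha) D_1(v||q)
lhs = (1.0 - alpha) * renyi_divergence(w, q, alpha)
rhs = alpha * renyi_divergence(v, w, 1.0) + (1.0 - alpha) * renyi_divergence(v, q, 1.0)
```

The identity checked at the end follows by direct substitution of the tilted pmf into the two Kullback-Leibler divergences.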
II-A The Augustin Information and Mean
Definition 1.
For any α∈ℜ+, W:X→P(Y), and p∈P(X), the order α Augustin information for the prior p is
[TABLE]
where $D_{\alpha}(W\|q|p)\triangleq\sum_{x\in\mathcal{X}}p(x)D_{\alpha}(W(x)\|q)$.
Whenever it exists, the uniqueness of qα,p∈P(Y) satisfying
Iα(p;W)=Dα(W∥qα,p∣p)
follows from the strict convexity of Dα(w∥q) in q,
i.e. [7, Thm 12].
Such a qα,p is called the order α Augustin mean for the prior p.
If ∣Y∣<∞ then P(Y) is compact and the existence of
qα,p follows from the lower semicontinuity of
Dα(w∥q) in q, i.e. [11, Lem LABEL:A-lem:divergencelsc], and
the extreme value theorem [10, Ch. 3, §12.2].
Lemma 1 asserts the existence of a unique qα,p for arbitrary channels
and describes qα,p via the identities it has to
satisfy.
Part (a) is well known;
part (b) is due to Augustin [3, 34.2].
[Footnote 6: [3, 34.2]
claims eq. (4) for $q^{g}_{1,p}$ instead of $q^{g}_{\alpha,p}$.
We could not confirm the correctness of Augustin’s proof of [3, 34.2], see [13].]
A generalization of Lemma 1 for all α∈ℜ+ is proved in [13].
Definition 2.
For any α∈ℜ+, W:X→P(Y), and p∈P(X),
•
Tα,p(⋅):{q∈M+(Y):Dα(W∥q∣p)<∞}→P(Y) is
[TABLE]
Furthermore,
$T_{\alpha,p}^{j+1}(q)\triangleq T_{\alpha,p}(T_{\alpha,p}^{j}(q))$
for $j\in\mathbb{Z}_{+}$.
Furthermore, if a q∈P(Y) satisfying q1,p≺q
is a fixed point of Tα,p(⋅) then q=qα,p.
(c)
If α∈(0,1], W is a product channel for a finite index set T,
and p is of the form ∏t∈T⊗pt
for pt∈P(Xt) then
[TABLE]
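The fixed-point characterization of the Augustin mean suggests a simple numerical procedure for finite channels. The sketch below assumes that T_{α,p}(q) is the p-average of the order α tilted measures of W(x) and q; the display defining the operator was lost in extraction, so this form is an assumption. It iterates the operator starting from q_{1,p}=Σ_x p(x)W(x) for an order in (0,1). The channel, the prior, the iteration count, and the function names are hypothetical.

```python
import numpy as np

def conditional_renyi(p, W, q, alpha):
    """D_alpha(W || q | p) = sum_x p(x) D_alpha(W(x) || q), for alpha in (0, 1)."""
    d = np.log((W**alpha * q**(1.0 - alpha)).sum(axis=1)) / (alpha - 1.0)
    return float(p @ d)

def augustin_operator(p, W, q, alpha):
    """Assumed form of T_{alpha,p}: p-average of the tilted rows W(x)^alpha q^(1-alpha)."""
    tilted = W**alpha * q**(1.0 - alpha)
    tilted /= tilted.sum(axis=1, keepdims=True)        # normalize each row to a pmf
    return p @ tilted

def augustin_mean(p, W, alpha, iterations=500):
    """Iterate the operator from q_{1,p}; returns an approximate fixed point."""
    q = p @ W
    for _ in range(iterations):
        q = augustin_operator(p, W, q, alpha)
    return q

# Hypothetical binary-input, binary-output channel; rows of W are the W(x).
W = np.array([[0.9, 0.1], [0.3, 0.7]])
p = np.array([0.4, 0.6])
alpha = 0.5
q_star = augustin_mean(p, W, alpha)
info = conditional_renyi(p, W, q_star, alpha)          # approximates I_alpha(p;W)
```

For a product channel with a product prior, each iterate of this procedure factorizes across the components, which is consistent with the product form asserted in part (c).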
II-B The Constrained Augustin Capacity and Center
Definition 3.
For any α∈ℜ+, W:X→P(Y), and
A⊂P(X),
the order α Augustin capacity of W for constraint set A is
[TABLE]
Using the definition of Iα(p;W) we get
[TABLE]
Proofs of the propositions presented in this subsection can be found in [13].
They are very similar to the proofs of the corresponding claims in
[11, §LABEL:A-sec:capacity, §LABEL:A-sec:center, §LABEL:A-sec:constrainedcapacity]
for Renyi capacity;
we invoke Lemma 1 instead of [11, Lem LABEL:A-lem:information:def].
Lemma 2.
For any W:X→P(Y) and A⊂P(X),
(a)
$C_{\alpha,W,A}:(0,1]\to[0,\infty]$
is increasing and continuous.
(b)
$\frac{1-\alpha}{\alpha}C_{\alpha,W,A}:(0,1)\to[0,\infty]$
is decreasing and continuous.
(c)
$\exists\alpha\in(0,1)$ s.t. $C_{\alpha,W,A}<\infty$ iff
$C_{\phi,W,A}<\infty$ $\forall\phi\in(0,1)$.
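Parts (a) and (b) can be observed numerically on a simple example. The sketch below uses a binary symmetric channel without constraints, whose order α capacity has a closed form; the choice of channel, the crossover probability, and the function name are assumptions made for illustration only.

```python
import numpy as np

def capacity_bsc(alpha, delta):
    """Order-alpha capacity (nats) of a BSC with crossover delta, for alpha in (0, 1)."""
    return np.log(2.0) + np.log(delta**alpha + (1.0 - delta)**alpha) / (alpha - 1.0)

alphas = np.linspace(0.05, 0.95, 19)
cap = capacity_bsc(alphas, 0.1)                 # part (a): increasing in alpha
scaled = (1.0 - alphas) / alphas * cap          # part (b): decreasing in alpha
is_increasing = bool(np.all(np.diff(cap) > 0))
is_decreasing = bool(np.all(np.diff(scaled) < 0))
```

The opposite monotonicity of the two curves is what makes the supremum in the sphere packing exponent a genuine trade-off between the order and the rate gap.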
Theorem 2.
∀α∈(0,1],W:X→P(Y), and
convex A⊂P(X),
[TABLE]
If Cα,W,A<∞ then
∃!qα,W,A∈P(Y),
called the order α Augustin center of W for the constraint set A,
such that
[TABLE]
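If $\lim_{j\to\infty}I_{\alpha}(p^{(j)};W)=C_{\alpha,W,A}<\infty$
for a $\{p^{(j)}\}_{j\in\mathbb{Z}_{+}}\subset A$ then
$\{q_{\alpha,p^{(j)}}\}_{j\in\mathbb{Z}_{+}}$
is a Cauchy sequence for the total variation metric on P(Y)
and qα,W,A is its unique limit point.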
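Reading the lost display of Theorem 2 as a minimax identity, sup over p of inf over q of D_α(W∥q∣p) equal to inf over q of sup over p (an assumption, since the equation did not survive extraction), one can compare the two sides numerically. The sketch below does this by brute-force grid search for a binary-input, binary-output channel with A=P(X); the Z-channel, the grid size, and the function names are hypothetical.

```python
import numpy as np

def renyi_rows(W, q, alpha):
    """Vector of D_alpha(W(x) || q) over the input letters x, for alpha in (0, 1)."""
    return np.log((W**alpha * q**(1.0 - alpha)).sum(axis=1)) / (alpha - 1.0)

def maximin_and_minimax(W, alpha, grid=801):
    """Grid approximations of sup_p inf_q and inf_q sup_p of D_alpha(W || q | p)."""
    ts = np.linspace(0.0, 1.0, grid)[1:-1]
    # D[i, x] = D_alpha(W(x) || q_i) for the output pmf q_i = (t_i, 1 - t_i)
    D = np.array([renyi_rows(W, np.array([t, 1.0 - t]), alpha) for t in ts])
    P = np.stack([ts, 1.0 - ts], axis=1)        # priors p_j = (t_j, 1 - t_j)
    cond = P @ D.T                              # cond[j, i] = D_alpha(W || q_i | p_j)
    maximin = float(cond.min(axis=1).max())
    # D_alpha(W || q | p) is linear in p, so the sup over p sits at a point mass.
    minimax = float(D.max(axis=1).min())
    return maximin, minimax

W = np.array([[1.0, 0.0], [0.3, 0.7]])          # hypothetical Z-channel
lower, upper = maximin_and_minimax(W, alpha=0.5)
```

Up to the grid resolution the two values agree, and the output pmf attaining the minimum on the minimax side approximates the Augustin center.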
Lemma 1 and Theorem 2
imply for all α∈(0,1], p∈A that
[TABLE]
Using Lemma 1 and Theorem 2 we can prove the following
Erven-Harremoes bound for Augustin capacity.
Lemma 3.
For any α∈(0,1],W:X→P(Y), and
convex A⊂P(X) s.t.
Cα,W,A<∞, and q∈P(Y)
[TABLE]
Erven-Harremoes bound, the continuity of Cα,W,A in α, and
Pinsker’s inequality imply the continuity of qα,W,A in α for the
total variation topology on P(Y).
Lemma 4.
For any η∈(0,1],W:X→P(Y),
convex A⊂P(X)
s.t. Cη,W,A<∞,
and α, ϕ satisfying 0<α<ϕ≤η,
[TABLE]
Furthermore, qα,W,A:(0,η]→P(Y)
is continuous in α for the total variation topology on P(Y).
Lemma 5.
For any α∈(0,1], product channel W for a finite index set T,
convex sets At⊂P(Xt) for each t∈T, and
$A=\mathrm{ch}\{\bigotimes_{t\in T}p_{t}:p_{t}\in A_{t}\ \forall t\in T\}$
[TABLE]
Furthermore, if Cα,W,A<∞ then
qα,W,A=∏t∈T⊗qα,Wt,At.
III The Cost Constrained Augustin Capacity
With a slight abuse of notation we define the cost constrained Augustin capacity as
[TABLE]
where A(ϱ)≜{p∈P(X):∑xp(x)ρ(x)≤ϱ}.
Note that Theorem 2 and Lemmas 3 and 4
hold for Cα,W,ϱ because A(ϱ) is a convex set.
We denote the corresponding Augustin center by qα,W,ϱ.
Lemma 6.
For any α∈(0,1], W:X→P(Y), ρ:X→ℜ≥0ℓ,
(a)
Cα,W,ϱ:Γρ→[0,∞] is increasing and concave in ϱ.
It is either infinite ∀ϱ∈intΓρ
or finite and continuous on intΓρ.
(b)
If Cα,W,ϱ<∞ for a ϱ∈intΓρ then
∃λα,W,ϱ∈ℜ≥0ℓ s.t.
[TABLE]
The set of all such λα,W,ϱ’s for an α is convex and compact.
Lemma 7.
For any α∈(0,1], product channel W for a finite index set T, additive
cost function ρ:X→ℜ≥0ℓ satisfying
ρ(x)=∑t∈Tρt(xt) for some
ρt:Xt→ℜ≥0ℓ
and ϱ∈Γρ
[TABLE]
If ∃{ϱt}t∈T s.t.
Cα,W,ϱ=∑t∈TCα,Wt,ϱt
and Cα,W,ϱ<∞
then qα,W,ϱ=∏t∈T⊗qα,Wt,ϱt.
Since Augustin capacity is concave in the cost constraint by Lemma 6-(a),
$C_{\alpha,W,\varrho}=\sum_{t\in T}C_{\alpha,W_{t},\frac{\varrho}{|T|}}$
whenever W is stationary and ρt=ρ1 for all t∈T.
Alternatively, if Γρt’s are closed and
Cα,Wt,ϱ’s are upper semicontinuous functions of ϱ
on Γρt’s then we can use the extreme value theorem for the upper
semicontinuous functions to establish the
existence of a {ϱt}t∈T s.t.
Cα,W,ϱ=∑t∈TCα,Wt,ϱt.
However, such an existence assertion does not hold in general.
III-A The A-L Information, Capacity, Center, and Radius
This subsection is a generalization of parts of [4, Ch. 8],
which is confined to the ∣X∣∨∣Y∣<∞, α=1, and
ℓ=1 case.
For any α∈ℜ+, W:X→P(Y),
cost function ρ:X→ℜ≥0ℓ,
λ∈ℜ≥0ℓ, and p∈P(X), the order α Augustin-Legendre (A-L) information for prior p and
Lagrange multiplier λ is
[TABLE]
We call Iαλ(p;W) A-L information because of the convex
conjugate pair fα,p:ℜ≥0ℓ→(−∞,∞]
and fα,p∗:ℜ≤0ℓ→ℜ:
[TABLE]
Thus one can write Cα,W,ϱ in terms of Iαλ(p;W) as
[TABLE]
Iαλ(p;W) is convex, decreasing and continuous in λ.
Furthermore, by Lemma 1 for α∈(0,1] we have:
[TABLE]
For any α∈(0,1], W:X→P(Y),
ρ:X→ℜ≥0ℓ,
and λ∈ℜ≥0ℓ,
the A-L capacity $C_{\alpha,W}^{\lambda}$
and
the A-L radius $S_{\alpha,W}^{\lambda}$
are given by
[TABLE]
Using the definition of Iαλ(p;W), Iα(p;W) and Sα,Wλ we get
[TABLE]
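Lemma 8.
For any α∈(0,1], W:X→P(Y),
ρ:X→ℜ≥0ℓ,
(a)
$C_{\alpha,W}^{\lambda}$ is convex, decreasing and lower semicontinuous in λ
on $\Re_{\geq0}^{\ell}$ and continuous in λ on
$\{\lambda:\exists\epsilon>0\text{ s.t. }C_{\alpha,W}^{\lambda-\epsilon\mathbb{1}}<\infty\}$.
(b)
$C_{\alpha,W,\varrho}\leq\inf_{\lambda\geq0}C_{\alpha,W}^{\lambda}+\lambda\cdot\varrho$
for all $\varrho\in\Gamma_{\rho}$.
(c)
$C_{\alpha,W,\varrho}=\inf_{\lambda\geq0}C_{\alpha,W}^{\lambda}+\lambda\cdot\varrho$
if either $|\mathcal{X}|<\infty$ or $\varrho\in\mathrm{int}\,\Gamma_{\rho}$.
(d)
If $\exists\varrho\in\mathrm{int}\,\Gamma_{\rho}$ s.t. $C_{\alpha,W,\varrho}<\infty$
then $\forall\varrho\in\mathrm{int}\,\Gamma_{\rho}$ $\exists\lambda\in\Re_{\geq0}^{\ell}$
s.t. $C_{\alpha,W,\varrho}=C_{\alpha,W}^{\lambda}+\lambda\cdot\varrho$.
(e)
If $C_{\alpha,W,\varrho}=C_{\alpha,W}^{\lambda}+\lambda\cdot\varrho<\infty$
for a $(\varrho,\lambda)\in\Gamma_{\rho}\times\Re_{\geq0}^{\ell}$, and
$\lim_{j\to\infty}I_{\alpha}(p^{(j)};W)=C_{\alpha,W,\varrho}$
for a $\{p^{(j)}\}_{j\in\mathbb{Z}_{+}}\subset A(\varrho)$ then
$\lim_{j\to\infty}I_{\alpha}^{\lambda}(p^{(j)};W)=C_{\alpha,W}^{\lambda}$.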
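The relation between the cost constrained capacity and the A-L capacity in parts (b) and (c) can be illustrated numerically in the simplest case α=1, where the Augustin information is the mutual information. The sketch below assumes the Lagrangian form I(p;W)−λ·E_p[ρ] for the A-L information, which is an assumption because the defining display was lost in extraction. It uses a binary symmetric channel with cost ρ(0)=0, ρ(1)=1; the grids and the function names are hypothetical.

```python
import numpy as np

def binary_entropy(s):
    """Binary entropy in nats, for s strictly inside (0, 1)."""
    return -s * np.log(s) - (1.0 - s) * np.log(1.0 - s)

def mutual_information_bsc(p1, delta):
    """Order-1 information of a BSC when the input equals 1 with probability p1."""
    s = p1 * (1.0 - delta) + (1.0 - p1) * delta
    return binary_entropy(s) - binary_entropy(delta)

def primal_and_dual(delta, budget, p_grid=2001, lam_grid=1501, lam_max=3.0):
    """Return (constrained capacity, inf over lambda of A-L capacity + lambda * budget)."""
    p1 = np.linspace(0.0, 1.0, p_grid)
    info = mutual_information_bsc(p1, delta)
    # With rho(0) = 0 and rho(1) = 1 the expected cost of the prior is p1 itself.
    primal = float(info[p1 <= budget + 1e-12].max())
    lams = np.linspace(0.0, lam_max, lam_grid)
    # Assumed A-L capacity: sup over p of I(p;W) - lambda * E_p[rho].
    al_capacity = (info[None, :] - lams[:, None] * p1[None, :]).max(axis=1)
    dual = float((al_capacity + lams * budget).min())
    return primal, dual

primal, dual = primal_and_dual(delta=0.1, budget=0.2)
```

The dual value is never below the primal one, as in part (b), and here the two agree up to the grid resolution, as part (c) predicts for a finite input set.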
If ∃λ∈ℜ≥0 s.t. Cα,Wλ<∞ then
Cα,W,ϱ<∞∀ϱ∈Γρ
by Lemma 8-(a).
However, the converse claim is not true.
There are cases for which Cα,W,ϱ is finite for all ϱ∈Γρ, yet
Cα,Wλ is infinite for λ small
enough.
[Footnote 7: In [3, §33-§35], Augustin considers bounded ρ’s
of the form ρ:X→[0,1]ℓ. In that case, it is easy to see that
if ∃ϱ∈intΓρ s.t. Cα,W,ϱ<∞ then
$\sup_{\varrho\in\Gamma_{\rho}}C_{\alpha,W,\varrho}=C_{\alpha,W,\mathbb{1}}<\infty$ and
Cα,Wλ<∞ for all λ∈ℜ≥0ℓ.]
The equality given in (c) might not hold
if ϱ∈Γρ∖intΓρ and ∣X∣=∞.
Theorem 3.
∀α∈(0,1], W:X→P(Y), ρ:X→ℜ≥0ℓ,
λ∈ℜ≥0ℓ,
[TABLE]
If Cα,Wλ<∞ then ∃!qα,Wλ∈P(Y),
called the order α A-L center of W for the Lagrange multiplier λ,
such that
[TABLE]
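If $\lim_{j\to\infty}I_{\alpha}^{\lambda}(p^{(j)};W)=C_{\alpha,W}^{\lambda}<\infty$
for a $\{p^{(j)}\}_{j\in\mathbb{Z}_{+}}\subset\mathcal{P}(\mathcal{X})$ then the corresponding
$\{q_{\alpha,p^{(j)}}\}_{j\in\mathbb{Z}_{+}}$ is a Cauchy
sequence for the total variation metric on P(Y) and qα,Wλ is its unique
limit point.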
Lemma 9.
If α∈(0,1], W:X→P(Y), ρ:X→ℜ≥0ℓ,
ϱ∈Γρ s.t. Cα,W,ϱ<∞ and λ∈ℜ≥0ℓ s.t.
Cα,W,ϱ=Cα,Wλ+λ⋅ϱ then qα,W,ϱ=qα,Wλ.
Lemma 10.
∀α∈(0,1], product channel W for finite index set T,
and ρ satisfying
ρ(x)=∑t∈Tρt(xt) for some
ρt:Xt→ℜ≥0ℓ,
[TABLE]
If Cα,Wλ<∞ then qα,Wλ=∏t∈T⊗qα,Wtλ.
Recall that the product structure assertion for qα,W,ϱ in Lemma 7,
was qualified by the existence of a {ϱt}t∈T satisfying
∑t∈TCα,Wt,ϱt=Cα,W,ϱ<∞.
In Lemma 10, on the other hand, the product structure assertion for qα,Wλ
is qualified only by Cα,Wλ<∞.
III-B The R-G Information, Mean, Capacity, and Center
For any α∈ℜ+∖{1}, W:X→P(Y),
cost function ρ:X→ℜ≥0ℓ,
λ∈ℜ≥0ℓ, and p∈P(X), the order α Renyi-Gallager (R-G) information for prior p and
Lagrange multiplier λ is
[TABLE]
The order α R-G capacity for Lagrange multiplier λ is
[TABLE]
Using the definition of Iαgλ(p;W) and Cα,Wgλ we get
[TABLE]
Using the concavity of log function and Jensen’s inequality one can show that
Iαλ(p;W)≥Iαgλ(p;W) for
α∈(0,1) and
Iαλ(p;W)≤Iαgλ(p;W) for
α∈(1,∞).
On the other hand, one can show by substitution that
∀q∈P(Y) and α∈ℜ+∖{1},
[TABLE]
where qα,pgλ is the R-G mean given in terms of μα,pλ as follows,
[TABLE]
For $\lambda=0\cdot\mathbb{1}$, R-G information and mean are equal to the corresponding
Renyi information and mean analyzed in [11].
Following a similar analysis one can show that a minimax theorem similar
to [11, Thm LABEL:A-thm:minimax] holds for R-G quantities:
For any
w=w1⊗⋯⊗wn,
q=q1⊗⋯⊗qn,
κ≥3, α∈(0,1), if
q(E)≤(\nicefrac116n)e−D1(vαw,q∥q)−α3gκ
for E∈Y and
gκ≜(∑t=1nEvαw,q[lndqtdwt∼−Evαw,q[lndqtdwt∼]κ])κ1
then
w(Y∖E)≥(\nicefrac116n)e−D1(vαw,q∥w)−(1−α)3gκ.
Lemma 11 is in the spirit of [16, Thm 5], but instead of
Chebyshev’s inequality, it relies on the Berry-Esseen theorem via [12, Lem LABEL:B-lem:berryesseenN].
Our sphere packing bounds are expressed in terms of the averaged Augustin capacity
and averaged sphere packing exponent: for all ϵ∈(0,1) and R∈ℜ+:
[TABLE]
[Footnote 8: Note $C_{\alpha,W,\varrho}^{\epsilon}=C_{\alpha,W,A(\varrho)}^{\epsilon}$ and
$E_{sp}^{\epsilon}(R,W,\varrho)=E_{sp}^{\epsilon}(R,W,A(\varrho))$.]
Lemma 12.
For any α∈(0,1], W:X→P(Y), A⊂P(X) s.t.
$C_{1/2,W,A}\in\Re_{+}$, ϕ∈(0,1),
R∈[Cϕ,W,A,C1,W,A) and ϵ∈(0,ϕ).
Then 0≤Espϵ(R,W,A)−Esp(R,W,A)≤ϕ−ϵϵϕR.
The proof of Lemma 12 is identical to that of
[12, Lem LABEL:B-lem:avspherepacking].
Lemma 13.
For any product channel W for the index set {1,…,n},
cost function ρ satisfying ρ(x)=∑t∈Tρt(xt)
for ρt:Xt→ℜ≥0ℓ,
ϱ∈intΓρ, and integers M, L satisfying
[TABLE]
for a κ≥3, an α0∈(0,1), an ϵ1∈(0,1)
and an ϵ2∈(0,1)
satisfying ϵ1(n−1)(1−α0)(1−ϵ1)≥1,
any (M,L) channel code (Ψ,Θ) on W
satisfying ∨m∈Mρ(Ψ(m))≤ϱ satisfies
[TABLE]
Proof Sketch.
Since ϱ∈intΓρ,
∀α∈(0,1)∃λα,W,ϱ∈ℜ≥0ℓ s.t.
Cα,W,ϱ=Cα,Wλα,W,ϱ+λα,W,ϱ⋅ϱ
by Lemma 8-(d).
Then qα,W,ϱ=qα,Wλα,W,ϱ by Lemma 9.
Furthermore,
qα,Wλα,W,ϱ=∏t⊗qα,Wtλα,W,ϱ
by Lemma 10.
Then qα,Wtλα,W,ϱ:(0,1)→P(Yt) is continuous
in α for the total variation topology on P(Yt)
because qα,W,ϱ is by Lemma 4.
Then q⋅,Wtλ⋅,ϱ is a transition probability from
((0,1),B((0,1))) to (Yt,Yt).
We define qα,Wtϵ as the Yt marginal of
the probability measure uα,ϵ∘q⋅,Wtλ⋅,ϱ
where uα,ϵ is the uniform probability distribution on (α−ϵα,α+ϵ(1−α)):
[TABLE]
Let Ψt(m) be the Yt marginal of Ψ(m)
and qα,t, qα, vαm be
[TABLE]
By [11, Lem LABEL:A-lem:divergenceQ-(LABEL:A-divergenceQ-RM,LABEL:A-divergenceQ-Qconvexity)],
Lemma 10 and lnτ≤τ−1 we have
[TABLE]
Using Lemma 9, [11, Lem LABEL:A-lem:divergenceQ-(LABEL:A-divergenceQ-order)],
[7, Prop 2], Theorem 2, ρ(Ψ(m))≤ϱ
and the definition of Cα,W,Aϵ we get
[TABLE]
Let (Ψt(m))∼ be the component of Ψt(m)
that is absolutely continuous in qα,t.
Furthermore, let
ξα,tm and ξαm be
[TABLE]
Then using
[12, Lem LABEL:B-lem:MomentBound],
[11, Lem LABEL:A-lem:divergenceQ-(LABEL:A-divergenceQ-RM,LABEL:A-divergenceQ-order)],
[7, Prop 2],
and
Theorem 2 we get
[TABLE]
Then using the definition of γ we get
[TABLE]
On the other hand, ∀m∈M,α∈(0,1) by the definition of vαm
[TABLE]
Thus we can bound D1(vαm∥qα) using the non-negativity of the Renyi divergence, i.e. [7, Thm 8],
and equation (7) as
D1(vαm∥qα)≤1−ϵ2nϵ2+Cα,W,ϱϵ1. Hence,
[TABLE]
Since D1(vαm∥qα) is continuous in α by
[12, Lem LABEL:B-lem:tilting],
the intermediate value theorem [14, 4.23] implies that
∀m∈M ∃αm∈(α0,1) s.t.
[TABLE]
Lemma 13 follows from Lemma 11 through a pigeonhole argument
similar to the one invoked in [12, eq (LABEL:B-eq:augustinM-9)-(LABEL:B-eq:augustinM-10)].
∎
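If W is stationary and memoryless, Lemma 13 can be proved
∀ϱ∈Γρ
by setting $q_{\alpha,W_{t}}^{\epsilon}=\int u_{\alpha,\epsilon}\circ q_{\phi,W_{t},\frac{\varrho}{n}}\,d\phi$.
Furthermore, the bound given in (10) can be obtained for codes satisfying a convex empirical
distribution constraint A⊂P(X1) by setting
$q_{\alpha,W_{t}}^{\epsilon}=\int u_{\alpha,\epsilon}\circ q_{\phi,W_{t},A}\,d\phi$
and $q_{\alpha,t}=(1-\epsilon_{2})q_{\alpha,W_{t}}^{\epsilon_{1}}+\epsilon_{2}q_{\frac{1}{2},W_{t},B_{A}}$
where $B_{A}\triangleq\mathcal{P}(\{x\in\mathcal{X}_{1}:\exists p\in A\text{ s.t. }p(x)>0\})$.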
[TABLE]
Acknowledgment
The author would like to thank Fatma Nakiboğlu and Mehmet Nakiboğlu for their hospitality;
this work simply would not have been possible without it.
Bibliography
[1] Y. Altuğ and A. B. Wagner. Refinement of the sphere-packing bound: Asymmetric channels. IEEE Trans. on Information Theory, 60(3):1592–1614, March 2014.
[2] U. Augustin. Error estimates for low rate codes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 14(1):61–88, 1969.
[4] I. Csiszár and J. Körner. Information theory: coding theorems for discrete memoryless systems. Cambridge University Press, 2011.
[5] M. Dalai. Classical and classical-quantum sphere packing bounds: Rényi vs Kullback and Leibler. In International Zurich Seminar on Communications, pages 198–202, 2016.
[6] P. M. Ebert. Error Bounds For Parallel Communication Channels. 1966. (https://hdl.handle.net/1721.1/4295).
[7] T. van Erven and P. Harremoes. Renyi divergence and Kullback-Leibler divergence. IEEE Trans. on Information Theory, 60(7):3797–3820, July 2014.
[8] R. G. Gallager. A simple derivation of the coding theorem and some applications. IEEE Trans. on Information Theory, 11(1):3–18, Jan. 1965.