The Method of Types for the AWGN Channel
Sergey Tridenski, Anelia Somekh-Baruch

TL;DR
This paper gives alternative derivations of converse bounds on the error and correct-decoding exponents of the power-constrained AWGN channel, using the method of types over finite alphabets whose sizes grow with the block length.
Contribution
The paper introduces alternative derivations for error and correct-decoding bounds using the method of types, with the number of types kept sub-exponential in the block length n.
Findings
A sphere-packing upper bound on the optimal block error exponent is derived using the method of types.
A lower bound on the optimal correct-decoding exponent is also derived using similar techniques.
The method uses finite alphabets with sizes dependent on block length n.
Abstract
For the discrete-time AWGN channel with a power constraint, we give an alternative derivation for the sphere-packing upper bound on the optimal block error exponent and an alternative derivation for the analogous lower bound on the optimal correct-decoding exponent. The derivations use the method of types with finite alphabets of sizes depending on the block length n and with the number of types sub-exponential in n.
Funding: Israel Science Foundation (ISF)
1. Introduction
We study reliability of the discrete-time additive white Gaussian noise (AWGN) channel with a power constraint imposed on blocks of its inputs. Consider the capacity of this channel, found by Shannon:
$$C = \frac{1}{2}\log\left(1 + \frac{s^2}{\sigma^2}\right), \qquad (1)$$
where $\sigma^2$ is the channel noise variance and $s^2$ is the power constraint. This capacity corresponds to the maximum of the mutual information $I(p_X, w)$ over $p_X$, under the power constraint on $p_X$, where w stands for the channel transition probability density function (PDF) and $p_X$ is the channel input PDF. Let us briefly recall the technicalities [1] of how expression (1) is obtained from the mutual information:
Here and , the operator denotes the expectation, and D is the Kullback–Leibler divergence between two probability densities. The maximum in (2) is attained by the Gaussian PDF with zero mean and variance , which simultaneously gives and brings the divergence to zero, which is its lowest possible value [1] [Equation (8.57)]. In this paper we consider the optimal exponents in the block error/correct-decoding probability of the AWGN channel. We propose explanations, similar to (2), both for Shannon’s sphere-packing converse bound on the optimal error exponent [2] [Equations (3), (4) and (11)] (see also [3] [Equation (47)] with [4] [Equations (7.5.34) and (7.5.32)]) and for Oohama’s converse bound on the optimal correct-decoding exponent [5] [Equation (4)].
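As a small numerical companion to the capacity expression (1), the sketch below evaluates the formula and checks that a zero-mean Gaussian input whose variance equals the power constraint attains it, through the difference of differential entropies h(Y) − h(Y|X). The function names and the parameter names `s2` (power constraint) and `sigma2` (noise variance) are illustrative assumptions, not notation taken from the paper.

```python
import math

def gaussian_entropy(variance, base=2.0):
    """Differential entropy of a Gaussian with the given variance, in units of log `base`."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance) / math.log(base)

def capacity(s2, sigma2, base=2.0):
    """AWGN capacity 0.5*log(1 + s2/sigma2); s2 = power constraint, sigma2 = noise variance."""
    return 0.5 * math.log(1.0 + s2 / sigma2) / math.log(base)

def gaussian_input_mutual_information(s2, sigma2, base=2.0):
    """I(X;Y) = h(Y) - h(Y|X) for X ~ N(0, s2) and Y = X + Z with Z ~ N(0, sigma2)."""
    return gaussian_entropy(s2 + sigma2, base) - gaussian_entropy(sigma2, base)

if __name__ == "__main__":
    # With equal signal power and noise variance, the capacity is half a bit per channel use.
    print(capacity(1.0, 1.0))
    print(gaussian_input_mutual_information(3.0, 1.0), capacity(3.0, 1.0))
```

The agreement of the two printed values in the last line reflects the fact, recalled above, that the Gaussian input PDF attains the maximum.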
In the case of discrete memoryless channels, the mutual information enters into the expressions for correct-decoding and error exponents through the method of types [6,7]. For the moment, without any particular meaning attached to it, let us rewrite the sphere-packing constant-composition exponent [6] [Equation (5.19)] with the PDF’s:
where denotes the Gaussian density with variance , which maximizes (2), and is the information rate. When is Gaussian, the minimum (3) allows an explicit solution by the method of Lagrange multipliers. The minimizing solution of (3) is Gaussian [8] [Equation (65)], Ref. [9] [Equation (327)], and we obtain that (3) is the same as Shannon’s converse bound on the error exponent [2] [Equations (3), (4) and (11)], Refs. [8,9] in the limit of a large block length:
Then it turns out that of (3) and the y-marginal PDF of the product [8] [Equation (63)] play the same roles in the derivation of the converse bound as w and , respectively, in the maximization (2).
In this paper, in order to derive expressions similar to (3), we extend the method of types [1] [Chapter 11.1], Ref. [10] to include countable alphabets consisting of uniformly spaced real numbers, with the help of power constraints on types. The countable alphabets depend on the block length n and the number of types satisfying the power constraints is kept sub-exponential in n. The latter idea is inspired by a different subject—of “runs” in a binary sequence. If we treat every “run” of ones or zeros in a binary sequence as a separate symbol from the countable alphabet of run-lengths, then the number of different empirical distributions of such symbols in a binary sequence of length n is equivalent to the number of different partitions of the integer n into the sum of positive integers, which is [11] [Equation (5.1.2)]. Thus, it is sub-exponential, and the method of types can be extended to that case. In our present case, however, the types are empirical distributions of uniformly quantized real numbers in quantized versions of real channel input and output vectors of length n. The quantized versions serve only for classification of channel input and output vectors and not for the communication itself. The uniform quantization step is different for the quantized versions of channel inputs and outputs, and in both cases it is chosen to be a decreasing function of n.
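The sub-exponential growth of the number of integer partitions, which motivates the construction above, can be checked numerically. The sketch below counts partitions exactly by dynamic programming and compares the count with the classical Hardy–Ramanujan asymptotic estimate; that this estimate is the expression cited from [11] is an assumption, and the function names are hypothetical.

```python
import math

def partition_count(n):
    """Number of partitions of n into positive integers (order ignored)."""
    ways = [1] + [0] * n  # ways[t] = partitions of t using the parts considered so far
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]

def hardy_ramanujan(n):
    """Asymptotic estimate exp(pi*sqrt(2n/3)) / (4*n*sqrt(3))."""
    return math.exp(math.pi * math.sqrt(2.0 * n / 3.0)) / (4.0 * n * math.sqrt(3.0))

def exponent_rate(n):
    """(1/n) * ln p(n); this tends to 0 when p(n) is sub-exponential in n."""
    return math.log(partition_count(n)) / n

if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, partition_count(n), round(exponent_rate(n), 4))
```

Because the estimate grows like the exponential of a multiple of the square root of n, the normalized exponent `exponent_rate(n)` decays roughly as the inverse square root of n.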
Via expressions similar to (2), the proposed derivations demonstrate that, in order to achieve the converse bounds on the correct-decoding and error exponents, it is necessary for the types of the quantized versions of codewords to converge to the Gaussian distribution in the characteristic function (CF), or, equivalently, in cumulative distribution function (CDF).
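To make the notion of the type of a quantized codeword concrete, here is a minimal sketch that rounds a real vector to a uniform lattice and computes the empirical distribution of the resulting lattice indices. The nearest-point rounding rule, the fixed step value, and all function names are assumptions made for illustration; in the paper the step is a decreasing function of n.

```python
from collections import Counter
from fractions import Fraction

def quantize(vector, step):
    """Round each component to the nearest point of the lattice {i*step}; return the indices i."""
    return [round(v / step) for v in vector]

def empirical_type(indices):
    """Type with denominator n: maps each lattice index to its exact relative frequency."""
    n = len(indices)
    return {i: Fraction(c, n) for i, c in Counter(indices).items()}

def type_second_moment(type_, step):
    """Power of a type: sum over letters x = i*step of P(x) * x^2."""
    return sum(float(p) * (i * step) ** 2 for i, p in type_.items())

if __name__ == "__main__":
    x = [0.1, -0.3, 0.26, 0.9, -0.88, 0.0, 0.3, 0.24]
    idx = quantize(x, 0.25)
    P = empirical_type(idx)
    print(idx, P, type_second_moment(P, 0.25))
    # Permuting the coordinates leaves the type unchanged.
    print(empirical_type(list(reversed(idx))) == P)
```

The permutation invariance shown in the last line is the property that the ensemble construction in the proof of Lemma 10 relies on.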
The contributions of the current paper are twofold: we successfully apply the method of types to derive converse bounds on both the error exponent [12] and the correct-decoding exponent [13] of the AWGN channel. This underscores the advantage of the method of types.
In Sections 2 and 3, we describe the communication system and introduce other definitions. In Section 4 we present the main results of the paper, which consist of two theorems and a proposition. Section 5 provides an extension of the method of types. The results of Section 5 are then applied in all the sections that follow. In Section 6 we prove a converse lemma that is then used to derive both the error and correct-decoding exponents in Sections 7 and 8, respectively. Section 9 connects PDFs and types.
Notation
Countable alphabets consisting of real numbers are denoted by , . The set of types with denominator n over is denoted by . Capital ‘ ’ denotes probability mass functions, which are types , , , . The type class and the support of a type are denoted by and , respectively. The expectation with respect to a probability distribution is denoted by . Small ‘p’ denotes probability density functions , , , . Thin letters x, y represent real values, while thick letters , represent real vectors. Capital letters X, Y represent random variables; boldface represents a random vector of length n. The conditional type class of given is denoted by . The quantized versions of variables are denoted by a superscript ‘q’: , , . Small w stands for a conditional PDF, and stands for a discrete positive measure which does not necessarily add up to 1. All information-theoretic quantities such as joint and conditional entropies , , the mutual information , , , the Kullback–Leibler divergence , , and the information rate R are defined with respect to the logarithm to a base , denoted as . It is assumed that . The natural logarithm is denoted as ln. The cardinality of a discrete set is denoted by , while the volume of a continuous region is denoted by . The complementary set of a set A is denoted by . Logical “or” and “and” are represented by the symbols ∨ and ∧, respectively. Gaussian distributions are denoted by , while stands for the discrete uniform distribution. In the Appendix B, represents the rounded down version of the PDF .
2. Communication System
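We consider communication over the discrete-time additive white Gaussian noise channel with real channel inputs $x \in \mathbb{R}$ and channel outputs $y \in \mathbb{R}$, noise variance $\sigma^2$, and a transition probability density
$$w(y \,|\, x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(y - x)^2}{2\sigma^2}\right\}.$$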
Communication is performed by blocks of n channel inputs. Let denote a nominal information rate. Each block is used for transmission of one out of M messages, where , for some logarithm base . The encoder is a deterministic function , which converts a message into a transmitted block, such that
where , for all . The set of all the codewords , , constitutes a codebook . Each codeword in satisfies the power constraint:
The decoder is another deterministic function , which converts the received block of n channel outputs into an estimated message or, possibly, to a special error symbol ‘0’:
where each set is either an open region or the empty set, and the regions are disjoint: for . Observe that the maximum-likelihood decoder with open decision regions , defined for as
is a special case of (6). Note that the formal definition of includes the undesirable possibility of for .
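The communication system of this section can be sketched in a few lines. The code below builds a hypothetical random codebook whose codewords meet a block power constraint of the assumed form ‖x‖² ≤ n·s2, and decodes by minimum Euclidean distance, which coincides with maximum-likelihood decoding for Gaussian noise. Ties are mapped to the error symbol 0, mirroring the use of open decision regions. The names `make_codebook`, `ml_decode`, and `s2` are illustrative assumptions.

```python
import math
import random

def make_codebook(num_messages, n, s2, seed=0):
    """Hypothetical random codebook: Gaussian draws rescaled so that ||x||^2 = n*s2."""
    rng = random.Random(seed)
    book = []
    for _ in range(num_messages):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scale = math.sqrt(n * s2 / sum(v * v for v in x))
        book.append([scale * v for v in x])
    return book

def satisfies_power_constraint(codeword, s2, tol=1e-9):
    """Check ||x||^2 <= n*s2, with a small relative tolerance for rounding errors."""
    return sum(v * v for v in codeword) <= len(codeword) * s2 * (1.0 + tol)

def ml_decode(y, codebook):
    """Minimum-distance decoding; returns a message in 1..M, or 0 (error symbol) on a tie."""
    dists = [sum((a - b) ** 2 for a, b in zip(y, x)) for x in codebook]
    best = min(dists)
    winners = [m for m, d in enumerate(dists, start=1) if d == best]
    return winners[0] if len(winners) == 1 else 0

if __name__ == "__main__":
    book = make_codebook(num_messages=4, n=16, s2=1.0)
    rng = random.Random(1)
    received = [v + rng.gauss(0.0, 0.3) for v in book[2]]  # message 3 plus noise
    print(ml_decode(received, book))
```

Returning 0 on ties is one way to represent points that lie on the boundary between decision regions and therefore belong to none of the open sets.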
3. Definitions
For each n, we define two discrete countable alphabets and as one-dimensional lattices:
For each n, we define also a discrete positive measure (not necessarily a distribution), which will approximate the channel w:
Denoting by a class of functions continuous on an open subset , we define
The set (11), defined for a given n, will be used only in the derivation of the correct-decoding exponent, while the following set of Lipschitz continuous functions will be used only in the derivation of the error exponent:
Note that is a convex set and also each function is bounded and cannot exceed .
With a parameter , we define the following Gaussian probability density functions [8,9]:
The first property of the following lemma shows that is the y-marginal PDF of the product .
Lemma 1 (Properties of (13)–(18)). The following properties hold:
and for any two jointly distributed random variables , such that , , and , it holds that
Here (17) combined with (14) corresponds to [8] [Equation (64)], (15) and (20) can be found in [8] [Equation (65)], while (14), (21) correspond respectively to [9] [Equations (302) and (328)].
Proof of Lemma 1. The first property (19) can be verified using (14), (15), (17). Then (20) can be obtained from (15), (17), (19). Property (21) follows by (17) and (19). It can be verified from (14) that is a positive monotonically decreasing function of , such that . Then we get (22) and (23). From (21) and (22) we see that for all , which gives (24). Equality (25) can be obtained using (21). Then, using (22) and (23), we obtain (26). □
The following expressions will describe our results for the error and correct-decoding exponents:
The following identity can be obtained using (13), (15), (16), (18) and (20):
We note also that , as , which can be verified using the properties (15), (17) and (21). It can be verified that the expression inside the supremum of (27) is equivalent to the expression for the Gaussian random-coding error exponent of Gallager before the maximization over [4] [Equations (7.4.24) and (7.4.28)]. Therefore, with the supremum over , the expression (27) coincides with the converse sphere-packing bound of Shannon (4).
4. Main Results
In this section we present two theorems and a proposition. The two theorems give converse bounds on the optimal error exponent and correct-decoding exponent of the AWGN channel, respectively. The bounds are asymptotic in the limit of a large block length n, and are given by the expressions (27) and (28), respectively. The proofs leading to these expressions are also presented in this section, while some of their technical details are encapsulated in lemmas that are proved in the remaining sections of the paper. In the course of the two proofs leading to (27) and (28) we obtain expressions analogous to (2), which, in exactly the same manner as the maximization of (2), allow us to draw conclusions about the asymptotically optimal codeword types achieving the converse bounds. This is made precise in the remark below, after the proof of the second theorem. The section concludes with a proposition that brings together the bounds of the two theorems in a parametric form.
The proof of the first theorem relies on Lemmas 13 and 19, which appear in Section 7 and Section 9, respectively.
Theorem 1 (Error exponent). Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Then
where is defined by (27), decoder functions g are defined by (6), and codebooks satisfy (5).
Proof. Starting from Lemma 13, we can write the following sequence of inequalities:
where:
(a) holds for any by Lemma 13 with . Note also that in (31) denotes the Kullback–Leibler divergence between the probability distribution and the positive measure defined in (10), which is not a probability distribution but only approximates the channel w.
(b) follows by Lemma 19 for the alphabet parameters and .
(c) holds for all with the possible exception of the single point on the R-axis where (32) may transition between a finite value and . To verify the equality, let us compare the infimum in (32) and the supremum over in (33) as functions of , for a given . First, it can be verified that the supremum of (33) is the closure of the lower convex envelope of the infimum of (32). Second, it can be checked by the definition of convexity that the infimum of (32) itself is a convex (∪) function of R. Then they coincide for all values of R, except possibly for the single point where they both jump to . This property carries over to the external ‘lim sup max’ as well.
(d) follows because by (24) and (26) the function satisfies the conditions under the infimum of (33).
(e) holds as equality inside the supremum over , separately for each . In (34) by we denote the corresponding marginal PDF of the product and use the definitions (29). Then (e) follows by the definitions of and in (13) and (16), and by their properties (15) and (20).
(f) follows by the non-negativity of the divergence, and by the condition under the maximum of (34), since for .
In conclusion, according to (c) we obtain that the inequality between (30) and (35), as functions of R, holds for all , except possibly for the single point , where the jump to in (35) occurs. Therefore, taking the limit as , we obtain that (30) is upper-bounded for all by
which is the same as (27). □
The second theorem relies on Lemmas 17 and 20, which appear in Section 8 and Section 9, respectively.
Theorem 2 (Correct-decoding exponent). Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Then
where is defined by (28), decoder functions g are defined by (6), and codebooks satisfy (5).
Proof. Starting from Lemma 17, for each we can choose a different parameter , such that there is equality between (74) and (28). Then by (80) we obtain
With the choice , the first term in the minimum can be lower-bounded as follows:
where:
(a) follows by Lemma 20 with .
(b) holds for , because for any such , and because , where is the Gaussian PDF defined in (16).
(c) holds as an identity inside the infimum by the definitions (13), (16) and (29), and properties (15) and (20).
(d) holds if and , because then by (26) and (11) the function satisfies the conditions under the infimum and achieves the infimum.
(e) follows by the condition under the minimum of (36) since for .
In conclusion, since (37) is the lower bound for any and , we obtain
□
Remark 1. Observe that neither the inequality (f) in the proof of Theorem 1 nor the inequality (e) in the proof of Theorem 2 can be met with equality unless . Furthermore, neither the inequality (f) in the proof of Theorem 1 nor the inequality (b) in the proof of Theorem 2 can be met with equality unless , where is the y-marginal PDF of . This is similar to (2). Accordingly, since is Gaussian, while is a convolution of with the Gaussian PDF , the type must converge to the Gaussian distribution with zero mean and variance in CF (it follows because the expression for the characteristic function of the zero-mean Gaussian distribution also has a Gaussian form) and CDF in order to achieve the exponents of Theorems 1 and 2. In both proofs the type represents the histograms of codewords, i.e., the empirical distributions of their quantized versions.
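The characteristic-function argument in Remark 1 can be written out in generic form. In the worked equation below, X stands for a random variable distributed according to the input distribution, Z for an independent zero-mean Gaussian variable, and u > v > 0 for two variances; these symbols are placeholders chosen for illustration and are not the paper's notation.

```latex
\begin{align*}
Y = X + Z,\quad Z \sim \mathcal{N}(0, v),\quad Z \text{ independent of } X
  &\;\Longrightarrow\; \varphi_Y(t) = \varphi_X(t)\, e^{-v t^2/2},\\
Y \sim \mathcal{N}(0, u)
  &\;\Longrightarrow\; \varphi_X(t) = \frac{e^{-u t^2/2}}{e^{-v t^2/2}} = e^{-(u - v)\, t^2/2}.
\end{align*}
```

The right-hand side is the characteristic function of a zero-mean Gaussian with variance u − v, so X must be Gaussian. When the equalities hold only in the limit, pointwise convergence of the characteristic functions gives convergence in distribution by Lévy's continuity theorem, which for a Gaussian limit means convergence of the CDF at every point.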
The functions and given by (27) and (28) can be expressed parametrically [4,5,8] as follows:
Proposition 1 (Parametric representations of and ). For every there exists a unique , such that
For every there exists a unique , such that
The correct-decoding exponent representation of (38) is equivalent to [5] [Equation (22)] and appears in [14] [Equations (25) and (26)], while the error exponent representation of (39) is equivalent to [4] [Equations (7.4.30) and (7.4.31)] and appears in [8] [Equations (70) and (71)] and [9] [Equations (329) and (330)]. Here we present an alternative proof of Proposition 1 in the vein of the proofs of Theorems 1 and 2.
Proof. Let us denote . Then for we can write a sandwich proof:
where denotes the set of all bivariate non-degenerate Gaussian PDFs. Here (a) follows similarly to the inequality (b) in Theorem 2; (b) is an identity; (c) follows because is Gaussian and achieves the infimum; and (d) is a lower bound on the supremum at . Finally, since the RHS of (41) is further lower-bounded by the infimum (40), we conclude that .
For , let us additionally define . Then
Here (a) follows due to the inequality under the first infimum; (b) is an identity; (c) follows because is Gaussian and achieves the infimum; (d) follows because ; and (e) is a lower bound on the supremum at . Since the RHS of (43) is lower-bounded by the infimum (42), we obtain . From using (17) and (21) we obtain . Hence for every the parameter is unique. □
5. Method of Types
In this section we extend the method of types [1] to include the countable alphabets of uniformly spaced reals (9) by using power constraints on the types. The results of this section are then used throughout the rest of the paper. They allow us to establish converse bounds in terms of types in Sections 6, 7 and 8, and are used in Appendix B, which is dedicated to the proof of Lemma 18 of Section 9, connecting PDFs and types.
5.1. Alphabet Size
Consider all the types satisfying the power constraint . Let denote the subset of the alphabet used by these types. In particular, every letter must satisfy , while by the definition of a type we have . This gives . Then is finite and by (7) we obtain
Lemma 2 (Alphabet size).
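The counting argument of this subsection can be illustrated numerically. A letter used by a type with denominator n has probability at least 1/n, so under a power constraint s2 on the type its square cannot exceed n·s2; the number of lattice points in that range is then finite. The sketch below counts them. The step form n^(−alpha), the parameter names, and the closed-form bound are assumptions for illustration and need not match the exact statement of the lemma.

```python
import math

def max_letter_magnitude(n, s2):
    """A letter x with P(x) >= 1/n under the constraint sum P(x) x^2 <= s2 obeys x^2 <= n*s2."""
    return math.sqrt(n * s2)

def alphabet_size(n, s2, step):
    """Number of lattice points i*step with |i*step| <= sqrt(n*s2)."""
    return 2 * math.floor(max_letter_magnitude(n, s2) / step) + 1

def alphabet_size_bound(n, s2, alpha):
    """Closed-form upper bound 2*sqrt(s2)*n^(1/2 + alpha) + 1, assuming step = n^(-alpha)."""
    return 2.0 * math.sqrt(s2) * n ** (0.5 + alpha) + 1.0

if __name__ == "__main__":
    n, s2, alpha = 100, 2.0, 0.25
    step = n ** (-alpha)
    print(alphabet_size(n, s2, step), alphabet_size_bound(n, s2, alpha))
```

Under these assumptions the usable alphabet grows only polynomially in n, which is what keeps the later type-counting bounds manageable.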
5.2. Size of a Type Class
For let us define
Lemma 3 (Support of a joint type). Let be a joint type, such that and . Then
The proof is given in Appendix A.
Lemma 4 (Support of a type). Let and be types, such that and . Then
The proof for is given in Appendix A.
For , the parameters , replace, respectively, , .
Lemma 5 (Size of a type class). Let be a joint type, such that and . Then
where , , and .
Proof. Observe that the standard type-size bounds (see, e.g., [6] [Lemma 2.3], Ref. [1] [Equation (11.16)]) can be rewritten as
Here can be replaced with its upper bound of Lemma 3. This gives (44). The remaining bounds of (45) and (46) are obtained similarly using Lemma 4. □
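The standard type-class size bounds invoked in this proof can be verified on a small example. The sketch below computes the exact size of a type class as a multinomial coefficient and compares it with the bounds (n+1)^(−k)·exp(nH) and exp(nH), where k is the support size of the type and H is its entropy in nats. Function names are hypothetical.

```python
import math

def type_class_size(counts):
    """Multinomial coefficient n! / prod(c_i!): number of sequences with these symbol counts."""
    n = sum(counts)
    size = math.factorial(n)
    for c in counts:
        size //= math.factorial(c)
    return size

def entropy_nats(counts):
    """Entropy (natural logarithm) of the type with the given symbol counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def type_class_bounds(counts):
    """Return (lower, upper) = ((n+1)^(-k) * e^{nH}, e^{nH}), with k the support size."""
    n = sum(counts)
    k = sum(1 for c in counts if c > 0)
    upper = math.exp(n * entropy_nats(counts))
    return upper / (n + 1) ** k, upper

if __name__ == "__main__":
    counts = [3, 2, 1]
    lower, upper = type_class_bounds(counts)
    print(lower, type_class_size(counts), upper)
```

Using the support size k in place of the full alphabet size is valid because letters with zero count play no role in the type class; this is the substitution that makes the bound usable over a countable alphabet.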
Since it holds for any that , and similarly for , as a corollary of the previous lemma we also obtain
Lemma 6 (Size of a conditional type class). Let be a joint type, such that and . Then for and respectively
where , , and are defined as in Lemma 5.
5.3. Number of Types
Let be the set of all the types satisfying the power constraint . Then its cardinality can be upper-bounded as follows:
where (a) follows by the definition of preceding Lemma 2, (b) follows by [1] [Equation (11.6)], and (c) follows by Lemma 2. This bound is sub-exponential in n for . This can be also further improved and made sub-exponential in n for all using Lemma 4, as follows.
Lemma 7 (Number of types).
where and .
Proof. Denoting and , we can upper-bound as follows
Substituting for k and ℓ their upper bounds of Lemma 2 (with ) and Lemma 4, we obtain (51). □
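To see what "sub-exponential" means for the simple bound at the start of this subsection, the sketch below evaluates the normalized exponent (1/n)·ln of (n+1)^K, where K is an assumed alphabet-size bound of the form 2·sqrt(s2)·n^(1/2+alpha) + 1 corresponding to a quantization step n^(−alpha). Both this form and the function names are illustrative assumptions. The exact number of types with denominator n over k letters is included for comparison.

```python
import math

def log_types_bound_rate(n, s2, alpha):
    """(1/n) * ln((n+1)^K) with K = 2*sqrt(s2)*n^(1/2 + alpha) + 1 (assumed alphabet bound)."""
    k = 2.0 * math.sqrt(s2) * n ** (0.5 + alpha) + 1.0
    return k * math.log(n + 1.0) / n

def exact_number_of_types(n, k):
    """Exact count of types with denominator n on k letters: C(n + k - 1, k - 1)."""
    return math.comb(n + k - 1, k - 1)

if __name__ == "__main__":
    for n in (10**4, 10**8, 10**12):
        print(n, log_types_bound_rate(n, 1.0, 0.25), log_types_bound_rate(n, 1.0, 0.75))
```

Under these assumptions the rate decays with n when alpha is below 1/2 and grows when alpha is above 1/2, which is why the simple bound alone does not cover every step exponent and a refinement based on the support size is needed.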
Similarly, let denote the set of all the joint types , such that and . Then its cardinality can be bounded as follows.
Lemma 8 (Number of joint types).
where and .
Proof. Denoting and , we repeat the steps of (52) and use the bounds of Lemma 2 and Lemma 3 to obtain (53). □
6. Converse Lemma
In this section we prove a converse lemma (Lemma 10), which is then used both for the error exponent in Section 7 and for the correct-decoding exponent in Section 8.
In order to determine exponents in channel probabilities, it is convenient to first bound the exponent in the channel probability density. Let be a vector of n channel inputs and let be its quantized version, with components
Similarly, let be a vector of n channel outputs and let be its quantized version, with for all . Then we have the following
Lemma 9 (PDF exponent). Let and be two channel input and output vectors, with their respective quantized versions , such that . Then
Proof. The exponent can be equivalently rewritten as
Defining , we observe that
The second term on the RHS is bounded as:
where (a) follows because , (b) follows by Jensen’s inequality for the concave (∩) function , and (c) follows by the condition of the lemma. The third term is bounded as
Since the exponent with the quantized versions , in turn, can also be rewritten similarly to (55), the result of the lemma follows by (55)–(58). □
The following lemma will be used both for the upper bound on the error exponent and for the lower bound on the correct-decoding exponent.
Lemma 10 (Conditional probability of correct decoding). Let be a joint type, such that , , and , and let be a codebook, such that the quantized versions (54) of its codewords , , are all of the type , that is:
Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Let . Then
where , and , as , depending only on α, β, , , , and .
Proof. First, from the single code we create an ensemble of codes, where each member code has the same probability of error/correct-decoding as the original code . Then we upper bound the ensemble average probability of correct decoding.
Considering the codebook as an matrix, we permute its n columns. This produces a set of codebooks: , . The quantized versions of all the codewords of each codebook belong to the same type class . In accordance with , we permute also the n coordinates of each of the decision regions in the definition (6) of the decoder g, obtaining open sets and creating in this way an ensemble of codes , .
Let denote the random channel-input and channel-output vectors, respectively, when using the code with an index . Let and denote their respective quantized versions. Since the additive channel noise is i.i.d., permutation of components does not change the distribution of the noise vector , and we obtain
Suppose that one of the codes , , is used for communication with probability , chosen independently of the sent message J and of the channel noise. Let be the random variable denoting the index of this code. Then, using (59) and (60) we obtain
In what follows, we upper bound the RHS of (61) with an added condition that :
The total number of codes in the ensemble can be rewritten as
Given , the total number of all the codewords in the ensemble such that their quantized versions belong to the same conditional type class (counted as distinct if the codewords belong to different ensemble member codes or represent different messages) is given by
Let denote the number of the codewords in a codebook such that their quantized versions belong to . Given that , the channel output vector falls into a hypercube region of :
For any such that and any open region , by Lemma 9 we obtain
Then, since all the codes and messages are equiprobable, the conditional probability of the code with the index ℓ is upper-bounded as
For , let be the indices of all the codewords in the codebook with their quantized versions in . Given that indeed the codebook has been used for communication, similarly to (65), by (64) the conditional probability of correct decoding can be upper-bounded as
where the second inequality follows because the decision regions are disjoint. Summing up over all the codes, we finally obtain:
where (a) follows by (65) and (66), (b) follows by (62) and (63), and (c) follows by (45) of Lemma 5 and (48) of Lemma 6. □
In the next two sections, we derive converse bounds on the error and correct-decoding exponents in terms of types.
7. Error Exponent
The end result of this section is given by Lemma 13 and represents a converse bound on the error exponent by the method of types.
Lemma 11 (Error exponent of mono-composition codebooks). Let be a type, such that , and let be a codebook, such that the quantized versions (54) of its codewords , , are all of the type , that is:
Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Then for any parameter
where , and , as , depending only on α, β, , , and .
Proof. For a joint type with the marginal type , such that , we have also
Then with for any , we obtain
where (a) follows by Lemma 9, and (b) follows by (49) of Lemma 6 and (10). This gives
Now we are ready to apply Lemma 10:
where (a) follows by (69) and Lemma 10, and (b) holds for . □
Lemma 12 (Type constraint). For any there exists , such that for any and any codeword , satisfying the power constraint (5), the quantized version of that codeword, defined by (54), satisfies the power constraint (5) within ϵ, that is with replaced by .
The proof is the same as (56)–(58).
Lemma 13 (Error exponent). Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Then for any and there exists , such that for any
where , as , depending only on the parameters α, β, , , and .
Proof. For a type let us define . Then for any n greater than of Lemma 12, there exists at least one type such that
where the second inequality follows by Lemma 7 applied with . Then we can use such a type for a bound:
where (a) follows by (71), and (b) holds for the random variable , independent of the channel noise, and the channel input/output random vectors with the decoder
where are the indices of the codewords in with their quantized versions in .
It follows now from (72) that the LHS of (70) can be upper-bounded by (67) of Lemma 11 with . Substituting then (71) in place of we obtain a stricter condition under the minimum of (67), leading to an upper bound with a condition and to (70). □
8. Correct-Decoding Exponent
The end result of this section is Lemma 17, which is a converse bound on the correct-decoding exponent by the method of types.
Lemma 14 (Joint type constraint). For any and , there exists , such that for any and any pair of vectors satisfying
the respective quantized versions and , defined as in (54), satisfy
The proof is the same as (56)–(58).
We use a Chernoff bound for the probability of an event when the method of types cannot be applied:
Lemma 15 (Chernoff bound). Let , , be n independent random variables. Then for and :
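As an illustration of the kind of bound meant here, the sketch below implements the textbook Chernoff bound for the normalized sum of squares of n independent zero-mean Gaussian variables of variance sigma2: the probability that (1/n)·ΣZ_i² is at least a·sigma2 is at most exp(−n(a − 1 − ln a)/2) for a > 1. Whether this matches the exact form and parameters of Lemma 15 is an assumption; the threshold parameter `a` and the function names are hypothetical.

```python
import math

def chernoff_exponent(a):
    """Exponent (a - 1 - ln a)/2 in Pr{(1/n) sum Z_i^2 >= a*sigma2} <= exp(-n * exponent), a > 1."""
    if a <= 1.0:
        raise ValueError("the bound is stated for a > 1")
    return 0.5 * (a - 1.0 - math.log(a))

def chernoff_bound(n, a):
    """Chernoff upper bound on the tail probability for block length n and threshold ratio a."""
    return math.exp(-n * chernoff_exponent(a))

def exact_tail_n2(a):
    """Exact tail for n = 2: the sum of two squared N(0, sigma2) variables is exponentially
    distributed with mean 2*sigma2, so Pr{(1/2) sum Z_i^2 >= a*sigma2} = exp(-a)."""
    return math.exp(-a)

if __name__ == "__main__":
    for a in (1.5, 3.0, 10.0):
        print(a, exact_tail_n2(a), chernoff_bound(2, a))
```

The comparison for n = 2 is included only because the exact tail has a closed form there; the bound itself holds for every n.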
Lemma 16 (Correct-decoding exponent of mono-composition codebooks). Let be a type, such that , and let be a codebook, such that the quantized versions (54) of its codewords , , are all of the type , that is:
Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Then for any and there exists , such that for any
where , and , as , depending only on α, β, , , and .
Proof. We consider probabilities of two disjoint events:
For the first term in the maximum, we obtain:
where (a) holds with for all of Lemma 14, (b) holds because , and in (c) we use the notation for the set of all the joint types satisfying both and . By the same steps as in (68), we further obtain
while Lemma 10 gives
Now by Lemma 8 for the number of joint types and by (77)–(79), we obtain
where denotes the expression (75). Applying Lemma 15 to the second term in the maximum of (76) we obtain (73)–(75). □
Lemma 17 (Correct-decoding exponent). Let be a random variable, independent of the channel noise, and let be the random channel-input and channel-output vectors, respectively. Then for any and there exists , such that for any
where and are as defined in (74) and (75), respectively, and , as , depending only on the parameters α, β, , , and .
Proof. Similarly as in [7] [Lemma 5]:
where:
(a) holds for of Lemma 12;
(b) follows by Lemma 7 with , while is a maximizer of (83);
(c) holds for the channel input/output random vectors and a code , such that with for , and with for , where are the indices of the codewords in the original codebook with their quantized versions in .
Since all the codewords of have their quantized versions in , we can apply Lemma 16 with for the RHS of (83) to obtain (80) and (81). □
9. PDF to Type
Lemmas 19 and 20 of this section relate minima over types to minima over PDFs. Lemma 18, stated next, has a laborious proof and is required only in the proof of Lemma 19, which is used for Theorem 1.
Lemma 18 (Quantization of PDF). Let be an alphabet defined as in (9), (7) with . Let be a type and , , be a collection of functions from (12), such that , , and . Then for any alphabet defined as in (9), (7) with , there exists a joint type with the marginal type , such that
where , , and , as , and depends only on the parameters α, β, , , , and (through (12)).
The proof is given in Appendix B.
Lemma 19 (PDF to type). Let and be two alphabets defined as in (9), (7) with and . Then for any , , and there exists , such that for any and for any type with :
where , as , and depends only on the parameters α, β, , , and .
Proof. For a type with a collection of such that and , we can find also an upper bound on . For example, using the Cauchy–Schwarz inequality:
Then by Lemma 18 there exists a joint type with the marginal type , such that simultaneously the three inequalities (84)–(86) are satisfied and it also follows by (10) and (85) that
Then the sum of (84) and (88) gives
while the difference of (84) and (86) gives
Note that all in the above relations are independent of the joint type and the functions . Therefore by (89), (90) and (85) we conclude, that given any for n sufficiently large for every type with the prerequisites of this lemma and every collection of that satisfy the conditions under the infimum on the LHS of (87) there exists a joint type such that simultaneously
and (89) holds with a uniform , i.e., independent of and . It follows that such satisfies also the conditions under the minimum on the RHS of (87) and results in the objective function of (87) satisfying (89) with the uniform . Then the minimum itself, which can only possibly be taken over a greater variety of , satisfies the inequality (87). □
Lemma 20 (Type to PDF). For any and there exists , such that for any and for any type :
where , as , and depends only on the parameters β and .
Proof. Observe first that any collection of conditional PDFs that satisfies the conditions under the infimum of (91) has finite differential entropies and well-defined quantities and . For any conditional type from the LHS of (91) we can define a set of histogram-like conditional PDFs:
which are step functions of for each . Then , and , as defined in (11). Analogously to (56)–(58), it can be obtained that
Then also . Then the lemma follows. □
We use Lemmas 19 and 20 in Section 4 in the derivation of Theorems 1 and 2, respectively.
References
- [1] Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 2006.
- [2] Shannon, C.E. Probability of Error for Optimal Codes in a Gaussian Channel. Bell Syst. Tech. J. 1959, 38, 611–656. doi:10.1002/j.1538-7305.1959.tb03905.x
- [3] Ebert, P.M. Error Bounds For Parallel Communication Channels; Technical Report 448; Research Laboratory of Electronics at Massachusetts Institute of Technology: Cambridge, MA, USA, 1966.
- [4] Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968.
- [5] Oohama, Y. The Optimal Exponent Function for the Additive White Gaussian Noise Channel at Rates above the Capacity. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017.
- [6] Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Academic Press: Cambridge, MA, USA, 1981.
- [7] Dueck, G.; Körner, J. Reliability Function of a Discrete Memoryless Channel at Rates above Capacity. IEEE Trans. Inf. Theory 1979, 25, 82–85. doi:10.1109/TIT.1979.1056003
- [8] Nakiboğlu, B. The Sphere Packing Bound for Memoryless Channels. Probl. Inf. Transm. 2020, 56, 201–244. doi:10.1134/S0032946020030011
