A Generalization of the DMC
Sergey Tridenski, Anelia Somekh-Baruch

TL;DR
This paper generalizes the discrete memoryless channel by using a uniform distribution over output sequences and derives key performance metrics.
Contribution
The paper introduces a new channel model and derives error and decoding exponents for a random ensemble of such channels.
Findings
An achievable error exponent is derived for the generalized channel model.
The optimal correct-decoding exponent is determined along with its converse.
The channel ensemble capacity is obtained as a corollary.
Abstract
We consider a generalization of the discrete memoryless channel, in which the channel probability distribution is replaced by a uniform distribution over clouds of channel output sequences. For a random ensemble of such channels, we derive an achievable error exponent, as well as its converse together with the optimal correct-decoding exponent, all as functions of information rate. As a corollary of these results, we obtain the channel ensemble capacity.
Figure 1
Figure 2
Funding: Israel Science Foundation (ISF)
Taxonomy
Topics: Wireless Communication Security Techniques · Error Correcting Code Techniques · Cooperative Communication and Network Coding
1. Introduction
We consider the basic information-theoretic scenario of point-to-point communication. The standard go-to model for such a scenario is the discrete memoryless channel (DMC). With this model, the communication performance is characterized by the channel capacity, together with the error and the correct-decoding exponents, as functions of the information rate. In order to be characterized by these quantities, the communication is usually performed with a codebook of $2^{nR}$ blocks of n channel input symbols, conveying equiprobable messages, where R is the rate in bits.
In this paper, we slightly deviate from the standard DMC model. In our set-up, the DMC itself reappears as a limiting case. Consider first the following communication scheme. Let K be some positive real parameter in addition to the rate R, and suppose that there has been an exponentially large number $2^{n(R+K)}$ of block transmissions through a DMC. Each transmitted block is a codeword of length n, chosen each time with uniform probability from the same codebook of size $2^{nR}$. This corresponds to a significant amount of transmitted information of $nR\,2^{n(R+K)}$ bits. By the end of these transmissions, each of the codewords has been used approximately $2^{nK}$ times, resulting in $2^{nK}$ not necessarily distinct channel output vectors, forming an unordered “cloud”. The parameter K therefore represents an exponential size of the cloud of channel output vectors generated by a single codeword. Suppose that, at the end of the transmissions, the unordered outcome clouds of all the codewords are revealed to the decoder. For small K, when most of the output vectors in the clouds are distinct, this “revelation” would be approximately equivalent to a noiseless transmission of the same $nR\,2^{n(R+K)}$ bits of information. For higher K, however, the description of the clouds will require an exponentially smaller number of noiseless bits compared to $nR\,2^{n(R+K)}$.
Note that, given the received channel output blocks with time indices , as well as the knowledge of the clouds, the optimal decoder for any given output block with an index j (optimal in the sense that it minimizes the probability of error for the block with the index j) chooses the codeword with the maximal number of replicas of this block in its cloud. This decoder is optimal regardless of the message probabilities or the transition probabilities of the DMC that created the clouds. Moreover, the same decoder, which relies on the clouds and is oblivious of the transition probabilities in the channel that created the clouds, remains optimal whether or not the channel is memoryless or time-invariant within each block, as long as it is memoryless and time-invariant by blocks.
As an alternative communication scheme, consider contemporaneous block transmissions to receivers through physically distinct channels, modeled as stochastically independent and identical DMCs. Each transmitted block is a codeword of length n, chosen independently of others with uniform probability from the same common codebook of size . Suppose that, with noiseless feedback, all the received channel output vectors become associated with the respective sent codewords on the side of the transmitters and then, with cooperation between the transmitters, this information becomes available (published) to the receivers in the form of unordered outcome clouds of the average size of channel output vectors associated with the codewords, as in the previous scheme. This can be seen as a joint estimation of physically distinct but stochastically identical channels. As soon as its received vector is published, a receiver can start decoding. Our current work shows that the smaller the clouds, the lower the average probability of error and the higher the capacity of the resulting channel.
Given the clouds, the receiver effectively sees a different channel, one that chooses its output vector with uniform probability from the cloud of the sent codeword. This channel can be described by a model different from the DMC. In this model, we assume that the messages are equiprobable and each cloud contains exactly vectors. The clouds are generated randomly i.i.d. with a channel-generating distribution, independently for each codeword in a codebook. This is similar to constant composition clouds used for superposition coding [1] through a noiseless channel. The capacity and the relevant probability exponents of this scheme can be given in the average sense, for the ensemble of random channels. As the exponential size of the clouds K tends to infinity, the random channel ensemble converges to a single channel with the transition probabilities of the channel-generating distribution, which is a DMC in our case [2,3,4].
In this paper, we complete our work [5]. We give a rigorous proof of the random-coding error exponent of [5] (Theorem 1) and add a converse bound on the error exponent. We also verify that the correct-decoding exponent converse of [5] (Theorem 2) is achievable.
The paper is organized as follows. In Section 2, we start introducing our notation and define the channel model. In Section 3, we derive an achievable error exponent for the random channel ensemble. In Section 4 and Section 5, we provide converse results. We derive an upper bound on the optimal error exponent (in Section 4) and the optimal correct-decoding exponent (in Section 4 and Section 7) of the random channel ensemble. In Section 6, we obtain the channel ensemble capacity as a corollary of the previous sections.
2. Channel Model
Let and be letters from finite channel input and output alphabets, respectively. Let denote transition probabilities of a channel-generating distribution. The channel is generated for a given codebook of blocks of a length n of letters from . Let be such a codebook, consisting of codewords , , where R is a positive real number representing a communication rate.
Given this codebook and another positive real number K, a channel instance is generated with the distribution W, as follows. For each one of the messages m, an exponentially large number of sequences is generated randomly given the corresponding codeword , where each sequence is generated independently of others. Each letter , , of each such sequence is generated i.i.d. according to W given the corresponding letter of . In this way, the set of clouds of ’s of size for each m forms one channel instance.
We assume that the messages , represented by the codewords , are equiprobable. Given that a particular message is sent through the channel, the stochastic channel action now amounts to choosing exactly one of all the not necessarily unique vectors , corresponding to the sent message, with the uniform probability . We assume that the decoder, receiving the channel output vector , knows not only the codebook, but also the channel instance, i.e., all the clouds comprising the corresponding ’s.
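The generation of one channel instance and the subsequent channel action can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the binary symmetric channel `W`, the tiny block length, the value of `K`, and all function names are hypothetical choices made here, and the cloud size is taken as $e^{nK}$ rounded to an integer, in line with the natural-logarithm convention used below.

```python
import math
import random

def generate_channel_instance(codebook, W, cloud_size, rng):
    """For each codeword, draw `cloud_size` output vectors i.i.d. through W."""
    clouds = []
    for x in codebook:
        cloud = []
        for _ in range(cloud_size):
            # Each output letter is drawn from the row of W selected by the input letter.
            y = tuple(rng.choices(range(len(W[a])), weights=W[a])[0] for a in x)
            cloud.append(y)
        clouds.append(cloud)
    return clouds

def channel_action(clouds, m, rng):
    """Channel action: pick one vector uniformly from the cloud of message m."""
    return rng.choice(clouds[m])

rng = random.Random(0)
n, K = 8, 0.35                                  # hypothetical toy parameters
cloud_size = max(1, round(math.exp(n * K)))     # assumed cloud size e^{nK}, rounded
W = [[0.9, 0.1], [0.1, 0.9]]                    # hypothetical channel-generating BSC
codebook = [tuple(rng.randrange(2) for _ in range(n)) for _ in range(4)]
clouds = generate_channel_instance(codebook, W, cloud_size, rng)
y = channel_action(clouds, 2, rng)              # message index 2 is sent
```

Note that the clouds are lists rather than sets, so repeated vectors are kept; the number of replicas is what the decoder relies on.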
A cloud can have more than one replica of the received vector . The maximum-likelihood (ML) decoder makes an error with non-vanishing probability , if there exists an incorrect message with the same or a higher number of replicas of in its cloud, compared to the sent message itself. Otherwise, there is no error.
Let with indices and be all the cloud vectors, not necessarily distinct. Then, the encoder is a function , mapping the messages to the clouds, which is , . The ML decoder without loss of optimality is a deterministic function , given by
where the minimum is taken over the indices m in the set.
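As a rough illustration of this decoding rule, the sketch below counts the replicas of the received vector in every cloud and returns the smallest message index among those with the maximal count. The function name `ml_decode` and the toy clouds are hypothetical, and the reading of "the set" as the set of messages with the maximal replica count is an assumption made here.

```python
def ml_decode(clouds, y):
    """Return the smallest index among the messages whose cloud holds the most replicas of y."""
    counts = [cloud.count(y) for cloud in clouds]   # replicas of y per message
    best = max(counts)
    return min(m for m, c in enumerate(counts) if c == best)

# Hypothetical toy clouds for three messages, block length 2.
toy_clouds = [
    [(0, 0), (0, 1), (0, 0)],
    [(0, 0), (1, 1), (1, 1)],
    [(0, 1), (0, 1), (1, 0)],
]
decoded = [ml_decode(toy_clouds, y) for y in [(0, 0), (0, 1), (1, 1), (1, 0)]]
# A tie between messages 0 and 1 is resolved in favor of the smaller index.
tie = ml_decode([[(0, 0), (0, 1)], [(0, 0), (1, 1)]], (0, 0))
```

With a deterministic tie-break of this kind, a tie is resolved correctly only when the sent message happens to have the smallest index among the tied messages, which is why ties contribute a non-vanishing probability of error.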
3. Achievable Error Exponent
Suppose the codebook is generated i.i.d. according to a distribution over with probabilities . Let denote the average error probability of the maximum-likelihood decoder, averaged over all possible messages, codebooks, and channel instances:
where represents the random received vector, while I is the random sent message, are randomly generated codewords, are the random cloud vectors, and the expectation is taken according to the independent and identical joint distribution . Let denote probabilities in an auxiliary distribution over and let us define the following:
where is the Kullback–Leibler divergence averaged over T, the expectation is with respect to the joint distribution , and . All the logarithms here and below are to base e. In what follows, we usually suppress the superscripts in and . Then, we can show the following:
Theorem 1 (Random-coding error exponent).
where is defined in (3).
We prove this theorem by separately deriving matching lower and upper bounds. For the lower bound, for , let us further define
where and are defined by (2). Our lower bound is given by Lemma 1, together with Lemmas 3 and 4 below.
Lemma 1 (Lower bound).
where is defined in (5)–(8).
In the proof of Lemma 1, we use the following auxiliary lemma:
Lemma 2 (Super-exponential bounds). Let , , 2,…, be i.i.d. Bernoulli( ) random variables. Then, for any ,
where is a function of that satisfies as .
Proof. The result follows straightforwardly from Markov’s inequality for the random variable and , respectively, as well as the inequality . □
Proof of Lemma 1. We will use to establish (9).
Let be the sent codeword and be the received vector. The cloud of can contain more than one vector . The maximum-likelihood decoder makes an error with non-vanishing probability , if there exists an incorrect codeword (not necessarily distinct from , but representing a different message and having therefore an independently generated cloud) with the same or a higher number of vectors in its cloud, compared to the sent codeword itself. Otherwise, there is no error.
Consider an event where and have a joint empirical distribution (type with denominator n) , i.e., , where T is a distribution on and V is a conditional distribution on given a letter from . The exponent of the probability of this event (probability of type class in [6]) is given by
where the term vanishes uniformly w.r.t. , as .
Consider now the competing codewords. The exponent of the probability of an event, in which appears somewhere in the clouds corresponding to the incorrect codewords, is given by
where is uniform w.r.t. T. To observe this, consider a possibly different (from V) conditional type of some w.r.t. . The exponent of the probability of an event, in which a certain incorrect codeword belongs to the conditional type given , is given by
where is uniform w.r.t. . The exponent of the probability of an event, that a certain in the cloud of equals , is given by . The exponent of the probability of an event, that in the cloud of of the type the vector appears at least once, is given by
where the term , vanishing as , depends on K. In particular, as a lower bound on the exponent, (14) follows trivially without from the union bound on the probability. Meanwhile, to confirm (14) as an upper bound on the exponent, denoting and , we can write similarly to [3] [Equation (14)]:
where is a function of K. Adding together (13) and (14), we obtain the exponent of the probability of an event, that a certain incorrect codeword is of the conditional type w.r.t. , and appears at least once in its cloud:
where is uniform w.r.t. . Finally, the exponent of the probability of an event, where there exists in the codebook at least one incorrect codeword of the conditional type w.r.t. and appears at least once in its cloud, is given by
where uniformly w.r.t. as and may depend on K and R, which yields (12).
Suppose that . In this case, the exponent of the conditional probability of error, given that the received vector and the sent codeword belong to the joint type, can be lower bounded by (12), and the exponent of the (unconditional) probability of error due to all such cases is lower-bounded by
Consider now the opposite case when . For this case, recall that the exponent of the probability of an event, in which there exists at least one incorrect codeword of the conditional type w.r.t. , is given by . Suppose now that the conditional type is such that . For this case, we use Lemma 2, with . Using (11) for the correct cloud and (10) for the competing clouds, the probability of the event that the cloud of an incorrect codeword of the type has at least as many occurrences of the vector , compared to the correct codeword of the type V, can be upper-bounded uniformly by
That is, it tends to zero super-exponentially fast with n. The remaining types with allow us to write a lower bound on the exponent of the (unconditional) probability of error due to all the cases , as
Together, (17) and (19) cover all cases and the minimum between the two gives the lower bound on the error exponent.
Observe that the objective function of (17) can also be used in (19), because in (19), the set over which the minimization is performed satisfies . Furthermore, for the lower bound, we can simply extend the minimization set in (17) and (19) from types to arbitrary distributions and . Therefore, omitting , in the limit of a large n, we can replace the minimum of the bounds (17) and (19) with (5). □
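The step in the proof of Lemma 1 concerning the event that a given vector appears at least once in a cloud can be checked numerically. The sketch below is not taken from the paper: it assumes a cloud of $M$ independently generated vectors, each equal to the given vector with probability $p$, so that the probability of at least one occurrence is $1-(1-p)^M$. It compares this with the union bound $\min\{1, Mp\}$ and with the standard lower bound $(1-e^{-1})\min\{1, Mp\}$, and then evaluates the resulting exponent for $p = e^{-na}$ and $M = e^{nK}$, where the symbol $a$ is a hypothetical name for the exponent of a single match.

```python
import math

def prob_at_least_once(p, M):
    """1 - (1 - p)**M, computed stably for tiny p and huge M."""
    return -math.expm1(M * math.log1p(-p))

def at_least_once_exponent(n, a, K):
    """-(1/n) ln P(at least one match) for p = e^{-n a} and M = e^{n K}."""
    p = math.exp(-n * a)
    M = math.exp(n * K)
    return -math.log(prob_at_least_once(p, M)) / n

def bounds_hold(p, M):
    """Check (1 - 1/e) * min(1, M p) <= 1 - (1 - p)^M <= min(1, M p)."""
    exact = prob_at_least_once(p, M)
    upper = min(1.0, M * p)
    lower = (1.0 - math.exp(-1.0)) * upper
    return lower <= exact <= upper * (1.0 + 1e-12)   # small slack for rounding

checks = [bounds_hold(p, M) for p, M in [(1e-3, 10), (1e-3, 1000), (1e-6, 10**7), (0.5, 3)]]
small_cloud = at_least_once_exponent(200, 0.5, 0.2)  # a > K: exponent close to a - K
large_cloud = at_least_once_exponent(200, 0.1, 0.2)  # a < K: exponent close to 0
```

Since the two bounds differ only by a constant factor, they give the same exponent $\max\{0, a-K\}$, which is the kind of positive-part behavior the union-bound argument in the proof relies on.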
To complete the lower bound given by Lemma 1, we establish the next two lemmas.
Lemma 3 (Epsilon equals zero). The expression defined in (5)–(8) satisfies
Proof. Observe first that both (6) and (7) are convex (∪) functions of . This can be verified directly by the definition of convexity, using the property that is convex (∪) and is linear in the pair . Furthermore, by continuity of and , it follows that (6) and (7) are lower semi-continuous functions of . Observe next from (6) and (7) that at least one of them is necessarily finite at , i.e., . Suppose that . Then, is finite for and by the lower semi-continuity of the convex function . Then, we also obtain (20). Consider the opposite case . Then, (6) at is a minimization of a continuous function of over a closed non-empty set. Let be the distribution , achieving the minimum in (6) at . Then, necessarily (otherwise with there has to be ). Then, is finite for and by the lower semi-continuity of the convex function . Then, we once again obtain (20). □
Lemma 4 (Identity).
where the LHS and the RHS are defined by (5)–(8) and (3), respectively.
Proof. For , we can conveniently rewrite the minimum (5) between (6) and (7) in the following unified manner:
where in the objective function we used also the property that . Now, it is straightforward to verify that in (22) the conditional distribution without loss of optimality can be replaced with V. To this end, suppose that some joint distributions and satisfy the condition under the minimum of (22).
If , then, since also , we cannot increase the objective function of (22) by using in place of .
If , then we cannot increase the objective function of (22) by using in place of .
It follows that (3) is a lower bound on minimum (22). Finally, since (3) is also an upper bound on (22), we conclude that there is equality between (3) and (22). □
Combining (21), (20), and (9), we have that the RHS of (4) is a lower bound. It remains to show that it is also an upper bound.
Lemma 5 (Upper bound).
where is defined in (3).
In the proof of Lemma 5 we use the following auxiliary lemma:
Lemma 6 (Two competing clouds). Let and be two statistically independent binomial random variables with the parameters and , where is a constant. Then,
where depends on q and as satisfies .
The proof is given in Appendix A. In the above lemma, N and can describe the random numbers of replicas of in an incorrect cloud and the correct cloud, respectively.
Proof of Lemma 5. For the upper bound it is enough to consider the exponent of the probability of the event that the transmitted and the received blocks and have a joint type , while in the codebook there exists at least one incorrect codeword of the same conditional type V w.r.t. , and appears at least once in its cloud. As in the proof of Lemma 1, this exponent is given by
The additional exponent of the conditional probability of error given this event is , as follows immediately by Lemma 6, used with and with , or . In the limit of a large n, we can omit and by continuity minimize (25) over all distributions , to obtain the RHS of (23). □
This completes the proof of Theorem 1. An alternative representation of the error exponent of Theorem 1 is given by
Lemma 7 (Dual form).
where is defined in (3) and
Proof. Observe first that the minimum (3) can be lower-bounded as
Observe further that the lower bound (29) is the lower convex envelope of (28) as a function of . Indeed, the minimum (28) is a non-increasing function of R, and therefore it cannot have lower supporting lines with slopes greater than 0. It also cannot have lower supporting lines with negative slopes below , as it decreases with the slope exactly in the region of negative or small positive values of R. Note that the objective function of the minimum (29) is continuous in in the closed region of . Let be the minimizing distribution of the minimum in (29) for a given . For this distribution, there exists a real such that the expression in the square brackets of (29) is zero. Therefore, there is equality between (29) and (28) at . And this is achieved for each , which corresponds to lower supporting lines of slopes between 0 and .
Finally, we observe that there is in fact equality between (28) and (29) for all R, since (28) is a convex (∪) function of R and, therefore, it coincides with its lower convex envelope. Indeed, using the property , the objective function of the minimization (28) can be rewritten as a maximum of three terms:
Then, this objective function is convex (∪) in the triple , verified as a maximum of convex (∪) functions of . In particular, the convexity of in follows by the log-sum inequality [6]. By the definition of convexity, it is then verified that the minimum (28) itself is a convex (∪) function of R.
So far, we have shown that (28) and (29) are equal. Consider now the minimum of (29) with any :
By the same reasoning as before, there is equality also between (30) and (31). Putting together (31) and (29) and denoting , we can rewrite (28) as
where the minimizing solution is
□
4. A Converse Theorem for the Error and Correct-Decoding Exponents
Let denote the average error probability of the maximum-likelihood decoder for a given codebook of block length n, averaged over all messages and channel instances:
where represents the random received vector, I is the random sent message, represent random codewords equal to the codewords of the given codebook , while are the random cloud vectors and the expectation is taken according to the independent and identical conditional distribution W, generating the clouds given . Let denote the mutual information of a pair of random variables with the joint distribution , and let us define:
where P and U are such that . Then, we can show the following.
Theorem 2 (Converse bounds).
where (36) holds for all and (37) holds a.e.: except possibly for such where there is a transition (a jump) from to a finite value of (35) as a monotonically non-increasing function of R.
Let denote the conditional average error probability of the maximum-likelihood decoder for a codebook , given that the joint type of the sent and the received blocks is . Theorem 2 is a corollary of the following upper bound on the corresponding conditional probability of correct decoding:
Lemma 8. For any constant composition codebook and any ,
where the term , vanishing uniformly w.r.t. as , depends on ϵ but does not depend on the choice of .
Proof. Suppose we are given a constant composition codebook , where all codewords are of the same type with empirical probabilities . Looking at the codebook as a matrix of letters from , of size , we construct a whole ensemble of block codes by permuting the columns of the matrix. Observe that the total number of code permutations in the ensemble is given by
where is the entropy of the empirical distribution P, and denotes the number of same-symbol permutations in the type P, i.e., the symbol permutations that do not change a codeword that is a member of the type.
Suppose that, for each code in the ensemble, a separate independent channel instance is generated. And suppose that, for every transmission, one code in the ensemble (known to the decoder with its own channel instance) is chosen randomly with uniform probability over permutations. Consider an event where the sent codeword, chosen with uniform probability over the code permutations and the messages, together with the received vector have a joint type , such that . Since the channel-generating distribution is memoryless, this will result in the same conditional average probability of correct decoding given , when averaged over all messages and channel instances, as itself. In what follows, we will derive an upper bound on this probability.
Let be the received vector of the type T. Consider the conditional type class of codewords with the empirical distribution V given the vector . Observe that the total number of all codewords in the ensemble belonging to this conditional type class (counted as distinct if corresponding to different code permutations or messages) is given by
where is the average entropy of the conditional distribution V given T, i.e., .
Let us fix two small numbers and consider separately two cases. Suppose, first, that . In this case, the probability of an event in which the cloud of any in the ensemble contains less than or more than vectors by Lemma 2 uniformly tends to zero super-exponentially fast with n. Denote the complementary highly probable event as . Let k be an index of a code in the ensemble. Let denote the number of codewords from the conditional type class in the code of index k. Then, given the conditions that the received vector is , whereby the sent codeword belongs to , and , we observe that the conditional probability of the code k is upper-bounded by . Furthermore, given that indeed the code k is used for communication, the conditional probability of correct decoding is upper-bounded by . Summing over all codes, we can write
Consider now the second case when . In this case, the probability of an event in which the cloud of any in the ensemble contains more than occurrences of the vector by (10) of Lemma 2 uniformly tends to zero super-exponentially fast. Let us denote this rare event as . In fact, among the codewords , those with clouds containing become rare. However, the probability of an event where, in the ensemble, there are less than codewords from having at least one vector in their cloud uniformly tends to zero super-exponentially fast. This, in turn, can be verified similarly to (11) of Lemma 2, using (39). Let us denote this rare event as . Let us denote the complementary (to the union of the events and ) and highly-probable event as .
Let denote the number of such codewords in the code k that both belong to the conditional type class and have at least one in their respective cloud. Then, given the intersection of events that the received vector is and that the sent codeword belongs to and , we obtain that the conditional probability of the code k is upper-bounded by . Given that the code k is used for communication, the conditional probability of correct decoding is upper-bounded by . Repeating the steps leading to (40), we obtain (41) once again. □
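The counting step at the start of the proof of Lemma 8 rests on a standard fact from the method of types: the $n!$ symbol permutations of a length-$n$ sequence of a given type split into the size of the type class times the number of same-symbol permutations, and the type class size grows as $e^{nH(P)}$ up to a polynomial factor. The following sketch checks this numerically; the function names and the example composition are hypothetical, and the polynomial bound used is the textbook one from [6], not a formula of this paper.

```python
import math

def type_class_size(counts):
    """Number of distinct sequences with the given symbol counts: n! / prod(c!)."""
    size = math.factorial(sum(counts))
    for c in counts:
        size //= math.factorial(c)      # each division is exact
    return size

def same_symbol_permutations(counts):
    """Permutations of positions that leave a sequence of this type unchanged."""
    return math.prod(math.factorial(c) for c in counts)

def entropy_nats(counts):
    """Entropy (base e) of the empirical distribution given by the counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

counts = (6, 3, 3)                      # hypothetical composition, n = 12
n = sum(counts)
size = type_class_size(counts)
rate = math.log(size) / n               # (1/n) ln |type class|
H = entropy_nats(counts)
slack = len(counts) * math.log(n + 1) / n
```

Here `size * same_symbol_permutations(counts)` equals `math.factorial(n)`, and `rate` lies between `H - slack` and `H`, so the gap vanishes as the block length grows.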
Proof of Theorem 2. First, we verify the bound on the correct-decoding exponent (36). It is enough to consider constant composition codes, because they can asymptotically achieve the same exponent of the correct-decoding probability as the best block codes, as is shown in the beginning of [7] [Lemma 5] using a suboptimal encoder–decoder pair.
Thus, let be a constant composition codebook of a type P. Consider an event where the sent codeword together with the received vector have a joint type . The exponent of the probability of such an event is given by .
We then add the lower bound on the exponent of the conditional probability of correct decoding given of Lemma 8 in the following form:
Minimizing the resulting expression over all distributions , discarding , and taking , we obtain (34).
Next, we establish the bound on the error exponent (37). Here, it also suffices to consider constant composition codebooks , because there is only a polynomial number of different types in a general codebook of block length n.
Turning (38) into a lower bound on , we can obtain the following upper bound on the error exponent of :
Here, (43) follows directly from Lemma 8 and the fact that the exponent of is . In (44), we extend the inner minimization from conditional types to arbitrary distributions U with the help of an additional in the minimization condition. In (45), we extend the outer maximization to arbitrary distributions P, and as a result, the maximum cannot decrease.
In the limit of a large n, the vanishing term in (45) disappears and we are left with . In order to replace with zero, observe that both the objective function and the expression in the minimization condition of (45) are convex (∪) functions of U. It follows that the inner minimum of (45) is a convex (∪) function of . Therefore, (45) itself, as a maximum of convex functions of , is convex (∪) in . We conclude that by continuity of a convex function, the maximum in (45) tends to (35) as , with a possible exception when (45) jumps to exactly at , which corresponds to the jump to of (35) as a convex (∪) function of R exactly at R. □
5. Alternative Representation of the Converse Bounds
In this section, we develop alternative expressions for the converse bounds of Theorem 2. Using the properties that and , the expression (34) for the lower bound of Theorem 2 can be written also as , where
and and are defined in (2). An alternative expression for (46) is given by
Lemma 9 (Alternative representation—correct-decoding exponent).
where is defined by (46) and is defined as in (27).
Proof. We can rewrite (46) as a minimum of two terms:
Solution of each one of the terms is similar to the method of Lemma 7 and gives (47). □
The expression (35) for the upper bound of Theorem 2 can be written alternatively as
Lemma 10 (Alternative representation—upper bound on the error exponent).
where and are defined in (35) and (27), respectively.
The proof is given in Appendix A. Examples of this bound together with the achievable error exponent as a lower bound are given in Figure 1. Note the discontinuities (jumps to ) in the upper bounds. From expression (A2), an alternative to (48) that appears in the proof of Lemma 10, it can be verified similarly to Lemma 7 that the discontinuity (jump to ) in (48) occurs at
For , this gives , so that there is no jump for .
6. The Capacity of the Channel Ensemble
Let us define the capacity of the channel ensemble generated with W, denoted as , as the supremum of rates R, for which there exists a sequence of codebooks of size with as , where is defined as in (33). Comparing (1) with (33), we conclude that
It follows that if the achievable error exponent (4) is positive, then there exists a sequence of codebooks of size such that drops to zero exponentially fast as . If, on the other hand, the lower bound (36) on the minimal correct-decoding exponent is positive, then for any sequence of codebooks of size , the probability of correct decoding tends to zero exponentially fast as , so that tends to 1. Then, must correspond to a point on the R-axis, at which both the maximal achievable error exponent and the lower bound on the minimal correct-decoding exponent of the channel ensemble are equal to zero. By the results of the previous sections, it turns out that there is only one such point. Examples are shown in Figure 2. We find the point in the following theorem.
Theorem 3 (Ensemble capacity).
where is the Shannon capacity of the DMC W, and with .
Proof. The maximal achievable error exponent, provided by Theorem 1, is
where is given by (3). The lower bound on the minimal correct-decoding exponent, given by Theorem 2, can be written as
where is given by (46). Since if for all , both expressions (3) and (46), as functions of R, meet zero at the same point, which is . This gives
where the last equality follows because with , . □
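Theorem 3 involves the Shannon capacity of the DMC W. For readers who wish to evaluate that ingredient numerically, the sketch below uses the standard Blahut–Arimoto iteration, which is a swapped-in textbook method and not a procedure from this paper. It computes only the Shannon capacity of W in nats (matching the base-e logarithms used here), not the ensemble capacity itself; the function names and the example channel are hypothetical.

```python
import math

def output_distribution(P, W):
    """Output marginal q(y) = sum_x P(x) W(y|x)."""
    return [sum(P[x] * W[x][y] for x in range(len(P))) for y in range(len(W[0]))]

def mutual_information(P, W):
    """I(P, W) in nats, skipping zero-probability terms."""
    q = output_distribution(P, W)
    return sum(P[x] * W[x][y] * math.log(W[x][y] / q[y])
               for x in range(len(P)) for y in range(len(q))
               if P[x] > 0 and W[x][y] > 0)

def blahut_arimoto(W, iters=500):
    """Iteratively re-weight the input distribution; returns (capacity estimate, input distribution)."""
    P = [1.0 / len(W)] * len(W)
    for _ in range(iters):
        q = output_distribution(P, W)
        # Divergence of each channel row from the current output marginal.
        d = [sum(w * math.log(w / q[y]) for y, w in enumerate(row) if w > 0) for row in W]
        weights = [P[x] * math.exp(d[x]) for x in range(len(W))]
        total = sum(weights)
        P = [v / total for v in weights]
    return mutual_information(P, W), P

eps = 0.1                                        # hypothetical crossover probability
bsc = [[1 - eps, eps], [eps, 1 - eps]]
capacity, P_opt = blahut_arimoto(bsc)
closed_form = math.log(2) + eps * math.log(eps) + (1 - eps) * math.log(1 - eps)
```

For the binary symmetric channel the iteration stays at the uniform input distribution, and the estimate agrees with the closed form $\ln 2 - h(\epsilon)$.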
7. The Optimal Correct-Decoding Exponent
In fact, the lower bound (36) is achievable. As in Section 3, suppose the codebook is generated i.i.d. according to a distribution over with probabilities , and let denote the average error probability of the maximum-likelihood decoder, averaged over all possible messages, codebooks, and channel instances.
Lemma 11 (Achievable correct-decoding exponent).
where is defined in (46).
Proof. Consider the following suboptimal decoder. The decoder works with a single anticipated joint type of the sent codeword and the received vector . If the type of is not T, the decoder declares an error. Otherwise, in case the type of the received block is indeed T, the decoder looks for the indices of the codewords with the conditional type V w.r.t. , with at least one replica of in their clouds, and chooses one of these indices as its estimate of the transmitted message. The choice is made randomly with uniform probability, regardless of the actual number of replicas of in each cloud. If there are no codewords of the conditional type V w.r.t. with at least one in their cloud, then the decoder declares an error again.
Let denote the random number of incorrect codewords of the conditional type V w.r.t. , with at least one replica of in their clouds, in the codebook. Then, the conditional probability of the correct decoding, given that the joint type of the received and the transmitted blocks is indeed , is given by
where the inequality follows from Jensen’s inequality and the expectation is w.r.t. the randomness of both the incorrect codewords and their clouds. Note that the exponent of can be expressed as R minus (15) with . The RHS of (50) then results in the following upper bound on the exponent of the conditional probability of correct decoding:
Adding to this the exponent of the joint type , we obtain (46). □
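The suboptimal decoder used in this proof can be sketched as follows. This is an illustrative reading of the description rather than the authors' implementation: the joint type is represented by the counts of letter pairs, a returned `None` stands for a declared error, and the names `joint_type` and `single_type_decoder`, as well as the toy codebook and clouds, are hypothetical.

```python
import random
from collections import Counter

def joint_type(x, y):
    """Empirical joint counts of the letter pairs (x_i, y_i)."""
    return Counter(zip(x, y))

def single_type_decoder(codebook, clouds, y, target_type, rng):
    """Choose uniformly among codewords that have the anticipated joint type
    with y and at least one replica of y in their cloud; None means error."""
    candidates = [m for m, (x, cloud) in enumerate(zip(codebook, clouds))
                  if joint_type(x, y) == target_type and y in cloud]
    if not candidates:
        return None
    return rng.choice(candidates)

rng = random.Random(1)
codebook = [(0, 0, 1, 1), (0, 1, 0, 1), (1, 1, 0, 0)]
y = (0, 0, 1, 0)
clouds = [[y, (0, 0, 1, 1)], [y], [(1, 1, 0, 0)]]
target = joint_type(codebook[0], y)              # anticipated joint type
estimate = single_type_decoder(codebook, clouds, y, target, rng)
mismatch = single_type_decoder(codebook, clouds, (1, 1, 1, 1), target, rng)
```

Matching the full joint type already implies that the received vector has the anticipated marginal type, so no separate check of the type of the received vector is needed in this sketch. The number of replicas is deliberately ignored, as in the proof.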
Now, since , by (36) of Theorem 2 and Lemma 11, we have the following.
Theorem 4 (Optimal correct-decoding exponent).
where is defined in (34) and is defined as in Section 4. This exponent is achievable by random coding.
- 1. Somekh-Baruch, A. On Achievable Rates and Error Exponents for Channels with Mismatched Decoding. IEEE Trans. Inf. Theory 2015, 61, 727–740. doi:10.1109/TIT.2014.2385699
- 2. Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968.
- 3. Gallager, R.G. The Random Coding Bound is Tight for the Average Code. IEEE Trans. Inf. Theory 1973, 19, 244–246. doi:10.1109/TIT.1973.1054971
- 4. Arimoto, S. On the Converse to the Coding Theorem for Discrete Memoryless Channels. IEEE Trans. Inf. Theory 1973, 19, 357–359. doi:10.1109/TIT.1973.1055007
- 5. Tridenski, S.; Somekh-Baruch, A. A Generalization of the DMC. In Proceedings of the 2020 IEEE Information Theory Workshop (ITW), Riva del Garda, Italy, 11–15 April 2021.
- 6. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 1991.
- 7. Dueck, G.; Körner, J. Reliability Function of a Discrete Memoryless Channel at Rates above Capacity. IEEE Trans. Inf. Theory 1979, 25, 82–85. doi:10.1109/TIT.1979.1056003
- 8. Feller, W. An Introduction to Probability Theory and Its Applications, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 1968; Volume 1.
