Rate and Nearly-Lossless State over the Gilbert–Elliott Channel
Amos Lapidoth, Ligong Wang

TL;DR
This paper calculates the capacity of the Gilbert–Elliott channel when the state sequence is revealed to the encoder and needs to be nearly losslessly transmitted to the receiver.
Contribution
The paper shows that the channel capacity is independent of the timing of state information availability at the encoder.
Findings
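The capacity of the Gilbert–Elliott channel is determined for the setting in which the state sequence is revealed to the encoder and must be conveyed to the receiver, along with the message, with a vanishing symbol error rate.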
A Block-Markov coding scheme with backward decoding achieves the calculated capacity.
The capacity remains the same regardless of the state information timing at the encoder.
Abstract
The capacity of the Gilbert–Elliott channel is calculated for a setting in which the state sequence is revealed to the encoder and is, along with the transmitted message, to be conveyed to the receiver with a vanishing symbol error rate. Said capacity does not depend on whether the state sequence is provided to the encoder strictly causally, causally, or noncausally. It can be achieved using a Block-Markov coding scheme with backward decoding.
- —Swiss National Science Foundation (SNSF)
1. Introduction
The Gilbert–Elliott channel [1,2,3,4] is a binary-input, binary-output finite-state channel [5,6,7,8], which was proposed as a simple model for digital communications with bursty errors. Its state sequence $\{S_i\}$ is a stationary two-state Markov process whose Time-i state $S_i$ takes values in the set $\{0,1\}$. When the Time-i state $S_i$ is 0, the Time-i channel output $Y_i$ is the result of feeding the Time-i input $X_i$ through a Binary Symmetric Channel (BSC) of crossover probability $p_0$; when $S_i$ is 1, the output $Y_i$ is the result of feeding $X_i$ through a BSC of crossover probability $p_1$. Alternatively, we can describe the output sequence $\{Y_i\}$ as being the componentwise mod-2 addition of the input sequence $\{X_i\}$ with a noise sequence $\{Z_i\}$, where the latter is a binary two-state Hidden Markov process.
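The channel description above can be sketched as a short simulation. This is a minimal illustration rather than anything taken from the paper: the function name and the parameter names `alpha` (probability of moving from state 0 to state 1), `beta` (probability of moving from state 1 to state 0), `p0`, and `p1` (the two crossover probabilities) are assumptions introduced here, and `alpha + beta` is assumed to be positive.

```python
import random


def simulate_gilbert_elliott(x, alpha, beta, p0, p1, seed=0):
    """Pass the bit list x through a simulated Gilbert-Elliott channel.

    alpha = Pr[next state is 1 | current state is 0] (assumed name)
    beta  = Pr[next state is 0 | current state is 1] (assumed name)
    p0, p1 = BSC crossover probabilities in states 0 and 1
    Returns the tuple (states, noise, outputs).
    """
    rng = random.Random(seed)
    # Draw the first state from the stationary distribution so that the
    # state process is stationary from the start.
    pi1 = alpha / (alpha + beta)
    state = 1 if rng.random() < pi1 else 0
    states, noise, outputs = [], [], []
    for bit in x:
        crossover = p1 if state == 1 else p0
        z = 1 if rng.random() < crossover else 0  # noise bit for this use
        states.append(state)
        noise.append(z)
        outputs.append(bit ^ z)  # output = input XOR noise (mod-2 addition)
        # The state evolves independently of the channel inputs.
        if state == 0:
            state = 1 if rng.random() < alpha else 0
        else:
            state = 0 if rng.random() < beta else 1
    return states, noise, outputs
```

With `p0 = 0` and `p1 = 1` the noise sequence coincides with the state sequence, which gives a quick sanity check of the mod-2 description.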
The capacity of the Gilbert–Elliott channel is achieved by having the input sequence comprise Independent and Identically Distributed (IID) random bits. It is given, in bits, by $1 - H(\{Z_i\})$, where $H(\{Z_i\})$ denotes the entropy rate of the noise sequence $\{Z_i\}$ [3]. If the state sequence is revealed to the decoder, then capacity is still achieved by the above input distribution, but it now equals the weighted average of the capacities of the BSCs, namely, $(1-\pi_1)\bigl(1 - H_b(p_0)\bigr) + \pi_1\bigl(1 - H_b(p_1)\bigr)$. Here, $\pi_1$ denotes the probability that the state $S_i$ is 1, and $H_b(\cdot)$ is the binary entropy function. Throughout this paper, logarithms are to base 2, and information is measured in bits.
Here, we study the case where the state sequence is revealed not to the decoder but to the encoder. We do not, however, seek the Shannon capacity but rather the “rate-and-nearly-lossless-state” capacity, where the decoder wishes to recover not only the transmitted message but also the state sequence. We thus require that the state sequence be conveyed to the decoder with a vanishing symbol error rate. This requirement is weaker than the requirement that the probability of the receiver correctly recovering the entire state sequence tends to one. Neither do we consider a general (nonzero) distortion constraint, as studied for memoryless channels in [9,10].
We solve for this capacity and show that it does not depend on whether the state sequence is provided to the encoder strictly causally, causally, or noncausally. Moreover, it can be achieved using a Block-Markov coding scheme with backward decoding.
2. The Gilbert–Elliott Channel
The Gilbert–Elliott channel has binary input, output, and state alphabets: $\mathcal{X} = \mathcal{Y} = \mathcal{S} = \{0,1\}$. The evolution of the state is unaffected by the channel inputs: irrespective of the channel inputs, it forms a stationary time-homogeneous Markov chain of kernel
and stationary distribution
by which we mean that
Above and throughout, we use to denote , and we use to denote . We denote the entropy rate of the state sequence . It is given explicitly by [11]
Given , the channel from X to Y is a BSC with crossover probability , whereas, given , it is a BSC with crossover probability . Here, are arbitrary known constants. Thus, if we define
and
then we can express the behavior of the channel given the state as
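For concreteness, the objects of this section can be written out under one possible parametrization. The symbols $\alpha$, $\beta$, $\pi_0$, $\pi_1$, $p_0$, $p_1$ below are assumptions made for illustration and need not match the notation of the original equations; the entropy-rate formula is the standard one for a stationary two-state Markov chain.

```latex
% Assumed notation: \alpha = \Pr[S_{i+1}=1 \mid S_i=0], \beta = \Pr[S_{i+1}=0 \mid S_i=1],
% with \alpha + \beta > 0; p_s is the crossover probability in state s.
\begin{align*}
\text{kernel:}\quad
  & \begin{pmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{pmatrix}, \\
\text{stationary distribution:}\quad
  & \pi_0 = \frac{\beta}{\alpha+\beta}, \qquad \pi_1 = \frac{\alpha}{\alpha+\beta}, \\
\text{entropy rate of the state:}\quad
  & H(\{S_i\}) = \pi_0\, H_b(\alpha) + \pi_1\, H_b(\beta), \\
\text{channel given the state:}\quad
  & \Pr[Y_i = y \mid X_i = x,\, S_i = s] =
    \begin{cases} 1 - p_s, & y = x, \\ p_s, & y \neq x. \end{cases}
\end{align*}
```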
3. The Rate-and-Nearly-Lossless-State Capacity
When discussing rate-R blocklength-n communications over the Gilbert–Elliott channel, we consider the message set with messages, one of which is to be conveyed to the receiver. The latter observes the output sequence and attempts to recover the transmitted message m and the state sequence . The decoder is thus specified by a function
As to the encoder, its structure depends on the manner in which the state information is revealed to it. In the strictly causal setting, the Time-i channel input may depend, not only on the transmitted message, but also on the past states; it is therefore denoted . The encoder is thus specified by n functions
with being . In the causal case, the Time-i channel input is denoted , and the encoder is specified by n functions
with being . Finally, in the noncausal case, the Time-i channel input is denoted , and the encoder is specified by one function
with being the i-th component of .
We refer to an encoder/decoder pair as a “coding scheme”. The probability of error associated with a given coding scheme and a given message m is calculated when the scheme’s encoder is used to transmit Message m, and the scheme’s decoder is used by the receiver. The average probability of error associated with a coding scheme is the arithmetic average of the probabilities of error associated with the different messages. It can be expressed as
$$\Pr[\hat{M} \neq M] \tag{12}$$
when the transmitted message M is drawn equiprobably from the message set; here $\hat{M}$ denotes the decoder's guess of M.
The symbol error rate, or (expected) Hamming distortion, in reconstructing the state sequence is
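$$\frac{1}{n} \sum_{i=1}^{n} \Pr[\hat{S}_i \neq S_i], \tag{13}$$
again computed when the transmitted message M is drawn equiprobably from the message set; here $\hat{S}_i$ denotes the decoder's reconstruction of the Time-i state $S_i$.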
In all cases, we say that a rate R is achievable if there exists a sequence of coding schemes indexed by the blocklength for which the average probability of error and the symbol error rate both tend to zero. The capacity is defined as the supremum of the achievable rates, with the understanding that, if no positive rate is achievable, then capacity is zero.
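To see why a vanishing symbol error rate is a weaker requirement than recovering the entire state sequence, consider the following toy sketch. It evaluates the Hamming distortion of a single realization (not the expectation used in the definition), and the function names are hypothetical.

```python
def symbol_error_rate(states, estimates):
    """Fraction of positions in which the estimate differs from the state."""
    if len(states) != len(estimates):
        raise ValueError("sequences must have equal length")
    return sum(s != e for s, e in zip(states, estimates)) / len(states)


def sequence_recovered(states, estimates):
    """True only if the whole sequence is reconstructed without error."""
    return list(states) == list(estimates)


# A reconstruction that is wrong in exactly one position: its symbol
# error rate is 1/n, which shrinks as n grows, yet the sequence as a
# whole is never recovered exactly.
for n in (10, 100, 1000):
    s = [0] * n
    s_hat = [1] + [0] * (n - 1)
    print(n, symbol_error_rate(s, s_hat), sequence_recovered(s, s_hat))
```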
Our main result is the following theorem:
Theorem 1. Irrespective of whether the state information is provided to the encoder strictly causally, causally, or noncausally, if
is positive, then it equals the rate-and-nearly-lossless-state capacity of the Gilbert–Elliott channel; else, said capacity is zero.
The proof is provided in the next section. Expression (14) can be interpreted as the result of subtracting the optimal average description length of the state sequence from the capacity of the channel when the receiver is cognizant of the state.
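Following this interpretation, the capacity can be evaluated numerically as the decoder-side-information capacity minus the entropy rate of the state, that is, $\pi_0\bigl(1 - H_b(p_0)\bigr) + \pi_1\bigl(1 - H_b(p_1)\bigr) - H(\{S_i\})$ in assumed notation. The sketch below is an illustration under that reading: `alpha` and `beta` (state transition probabilities from 0 to 1 and from 1 to 0, with `alpha + beta > 0`), `p0`, `p1`, and all function names are assumptions introduced here.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def state_entropy_rate(alpha, beta):
    """Entropy rate of a stationary two-state Markov chain."""
    pi0 = beta / (alpha + beta)
    pi1 = alpha / (alpha + beta)
    return pi0 * binary_entropy(alpha) + pi1 * binary_entropy(beta)


def decoder_state_capacity(alpha, beta, p0, p1):
    """Weighted average of the two BSC capacities (state known at receiver)."""
    pi0 = beta / (alpha + beta)
    pi1 = alpha / (alpha + beta)
    return pi0 * (1 - binary_entropy(p0)) + pi1 * (1 - binary_entropy(p1))


def rate_and_state_capacity(alpha, beta, p0, p1):
    """Decoder-state capacity minus state entropy rate, clipped at zero."""
    value = decoder_state_capacity(alpha, beta, p0, p1) - state_entropy_rate(alpha, beta)
    return max(0.0, value)
```

For a slowly varying state (small `alpha` and `beta`) the entropy-rate penalty is small; for IID equiprobable states it is a full bit per channel use, and the clipped value is zero.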
4. Proof of Theorem 1
4.1. Converse
We prove the converse part of the theorem under the assumption that the state sequence is provided to the encoder noncausally. Let M be uniform over the message set. We have
Here, (15) holds because is a function of and hence also of ; (17) because is binary and because conditioning reduces entropy; and (18) because is concave and by recalling the definition of in (13). Note that tends to zero when tends to zero.
Since the decoder needs to recover M with high probability, Fano’s inequality implies the existence of some sequence tending to zero with the blocklength such that
From these two inequalities, we obtain
Note that, above, we only used the chain rule and the fact that $I(M; S^n) = 0$ (i.e., that M and the state sequence $S^n$ are independent). The three terms in the last line can be bounded or simplified as follows:
The converse now follows from (26), (27), (28), and (32).
4.2. Direct Part
For the direct part of the theorem, we assume that the state sequence is revealed to the encoder strictly causally and consider a Block-Markov coding scheme with backward decoding. Consider b blocks, each of k channel uses. Label the length-k typical state sequences . For each block, randomly generate a codebook with independent codewords (with to be specified later), each having k independent Bernoulli components. In the first block, use the entire codebook to send a message of bits. As to the transmission in Block i for , first check whether or not the state sequence in the previous block, namely, , was typical. If it was, set to be the label that is assigned to it; else, set . Use the generated codebook to send both and , the latter consisting of information bits, so that the total number of bits matches the size of the codebook.
After completing the transmission in all b blocks, add one extra block to transmit the label of the state sequence in Block b. To this end, the transmitter, ignoring any information about the states in this extra block, uses the Gilbert–Elliott channel to send this label (which we view as a message) and nothing else. The length of the extra block may be larger than k, but, as long as the Shannon capacity of our channel is positive, this will not affect the overall rate when we choose b to be very large. We will address this caveat shortly. However, first, we discuss the decoding.
To decode, we begin with the last block to decode the label it carries and thus recover the state sequence in Block b. This task will be accomplished successfully provided that the last extra block is sufficiently long. With the state information at hand, we then decode both the message and the label sent in Block b. Both can be decoded correctly with high probability (as k grows large) provided that
We continue this procedure backwards: By the time we get to decoding Block i, we will have already reliably recovered the state sequence of Block i from the label decoded in Block $i+1$, so we can use it as decoder-side information. The overall information rate, when b tends to infinity, approaches
which can indeed be made arbitrarily close to (14).
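The rate bookkeeping of the Block-Markov scheme can be sketched as follows. This is a rough accounting under stated assumptions, not the paper's exact expressions: each codebook is taken to carry `k * codebook_rate` bits, each state label is taken to cost about `k * (state_entropy_rate + eps)` bits (the usual size of a typical set), the extra block of length `k_extra` carries no message bits, and all names are hypothetical.

```python
def block_markov_rate(b, k, k_extra, codebook_rate, state_entropy_rate, eps):
    """Message bits per channel use over b blocks plus one extra block.

    Block 1 spends its whole codebook on message bits; each of blocks
    2..b spends part of its codebook on the label of the previous
    block's state sequence.
    """
    label_bits = k * (state_entropy_rate + eps)
    first_block_bits = k * codebook_rate
    later_block_bits = k * codebook_rate - label_bits
    if later_block_bits < 0:
        raise ValueError("codebook too small to carry the state label")
    message_bits = first_block_bits + (b - 1) * later_block_bits
    total_channel_uses = b * k + k_extra
    return message_bits / total_channel_uses


# As b grows, the rate approaches codebook_rate - state_entropy_rate - eps,
# and the cost of the extra block is amortized away.
for b in (1, 10, 1000, 10**6):
    print(b, block_markov_rate(b, k=1000, k_extra=5000,
                               codebook_rate=0.8,
                               state_entropy_rate=0.3, eps=0.01))
```

Choosing the codebook rate just below the capacity with decoder-side state information, and `eps` small, brings the limiting value close to that capacity minus the state entropy rate.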
We have now established that, provided that the codebook size is appropriately chosen, as k grows large, the probability of the decoder correctly decoding both the message and the state sequences of the first b blocks tends to one. It only remains to note that this also means that the symbol error rate is small. Indeed, recovering the state sequences of the first b blocks correctly with high probability guarantees that the symbol error rate in the first b blocks is close to zero, whereas the influence of the extra block (whose symbol error rate will typically be high, since we do not attempt to decode the states there) becomes negligible when we choose b to be large.
We now return to the caveat regarding the Shannon capacity and the extra block. We need to show that the Shannon capacity is positive whenever (14) is positive. (If (14) is not positive, there is no need for a direct part.) Recall that the Shannon capacity of the Gilbert–Elliott channel without any state information is $1 - H(\{Z_i\})$, with $\{Z_i\}$ denoting the noise random variables, which form a Hidden Markov process, and $H(\{Z_i\})$ their entropy rate. This Shannon capacity is positive unless $H(\{Z_i\}) = 1$ bit. Our concern regarding the caveat is thus only when $H(\{Z_i\})$ is 1.
In Appendix A, we show the following:
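Proposition 1. If the entropy rate $H(\{Z_i\})$ of the noise is 1 bit, then
- the noise variables $\{Z_i\}$ are IID Bernoulli(1/2), and
- either the states $\{S_i\}$ are IID or both crossover probabilities equal 1/2.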
If the noise variables $\{Z_i\}$ are IID Bernoulli(1/2) and the states $\{S_i\}$ are IID, then (14) cannot be positive because, in this case, it equals
If both crossover probabilities equal 1/2, then (14) is also (trivially) not positive. Hence, indeed, the Shannon capacity is positive whenever (14) is positive.
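One way to verify the IID case is the short entropy calculation below. It is a sketch in assumed notation ($\pi_s$ for the probability of state $s$, $p_s$ for the crossover probability in state $s$, $Z_1$ and $S_1$ for a single noise bit and state) and may differ in form from the expression used in the paper. It relies on $H(Z_1) = 1$ for a Bernoulli(1/2) noise bit, on $H(Z_1 \mid S_1) = \pi_0 H_b(p_0) + \pi_1 H_b(p_1)$, and on the entropy rate of IID states being $H(S_1)$.

```latex
% Assumes IID states and IID Bernoulli(1/2) noise.
\begin{align*}
1 - \pi_0 H_b(p_0) - \pi_1 H_b(p_1) - H(\{S_i\})
  &= H(Z_1) - H(Z_1 \mid S_1) - H(S_1) \\
  &= H(Z_1) - H(S_1, Z_1) \\
  &= -H(S_1 \mid Z_1) \\
  &\le 0.
\end{align*}
```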
References
- 1. Gilbert, E.N. Capacity of a burst-noise channel. Bell Syst. Tech. J. 1960, 39, 1253–1265. doi:10.1002/j.1538-7305.1960.tb03959.x
- 2. Elliott, E.O. Estimates of error rates for codes on burst-noise channels. Bell Syst. Tech. J. 1963, 42, 1977–1997. doi:10.1002/j.1538-7305.1963.tb00955.x
- 3. Mushkin, M.; Bar-David, I. Capacity and coding for the Gilbert-Elliott channels. IEEE Trans. Inf. Theory 1989, 35, 1277–1290. doi:10.1109/18.45284
- 4. Han, Y.; Guillén i Fàbregas, A. Fixed-memory capacity bounds for the Gilbert-Elliott channel. In Proceedings of the 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 7–12 July 2024; pp. 155–159.
- 5. Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968.
- 6. Goldsmith, A.; Varaiya, P. Capacity, mutual information, and coding for finite-state Markov channels. IEEE Trans. Inf. Theory 1996, 42, 868–886. doi:10.1109/18.490551
- 7. Permuter, H.H.; Weissman, T.; Goldsmith, A.J. Finite state channels with time-invariant deterministic feedback. IEEE Trans. Inf. Theory 2006, 55, 644–662. doi:10.1109/TIT.2008.2009849
- 8. Shrader, B.; Permuter, H. Feedback capacity of the compound channel. IEEE Trans. Inf. Theory 2009, 55, 3629–3644. doi:10.1109/TIT.2009.2023727
