Rate and Nearly-Lossless State over the Gilbert–Elliott Channel

Amos Lapidoth; Ligong Wang

PMC · DOI: 10.3390/e27050494 · May 2, 2025

Open Access

TL;DR

This paper calculates the capacity of the Gilbert–Elliott channel when the state sequence is revealed to the encoder and needs to be nearly losslessly transmitted to the receiver.

Contribution

The paper shows that the channel capacity is independent of the timing of state information availability at the encoder.

Findings

01

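The capacity of the Gilbert–Elliott channel is determined for the setting in which the state sequence is revealed to the encoder and must be conveyed, along with the message, to the receiver with a vanishing symbol error rate.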

02

A Block-Markov coding scheme with backward decoding achieves the calculated capacity.

03

The capacity remains the same regardless of the state information timing at the encoder.

Abstract

The capacity of the Gilbert–Elliott channel is calculated for a setting in which the state sequence is revealed to the encoder and is, along with the transmitted message, to be conveyed to the receiver with a vanishing symbol error rate. Said capacity does not depend on whether the state sequence is provided to the encoder strictly causally, causally, or noncausally. It can be achieved using a Block-Markov coding scheme with backward decoding.

Funding

Swiss National Science Foundation (SNSF)

Keywords

causal; Gilbert–Elliott channel; noncausal; rate-and-state capacity; state information; strictly causal

Taxonomy

Topics: Wireless Communication Security Techniques · Error Correcting Code Techniques · Cellular Automata and Applications

Full text

1. Introduction

The Gilbert–Elliott channel [1,2,3,4] is a binary-input, binary-output finite-state channel [5,6,7,8], which was proposed as a simple model for digital communications with bursty errors. Its state sequence is a stationary two-state Markov process whose Time-i state $[eqn]$ takes values in the set $[eqn]$ . When the Time-i state $[eqn]$ is 0, the Time-i channel output $[eqn]$ is the result of feeding the Time-i input $[eqn]$ through a Binary Symmetric Channel (BSC) of crossover probability $[eqn]$ ; when $[eqn]$ is 1, the output $[eqn]$ is the result of feeding $[eqn]$ through a BSC of crossover probability $[eqn]$ . Alternatively, we can describe the output sequence as being the componentwise mod-2 addition of the input sequence with a noise sequence $[eqn]$ , where the latter is a binary two-state Hidden Markov process.
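The channel description above can be illustrated with a short simulation. This is a minimal sketch, not code from the paper: the function name and the parameter names `q01`, `q10` (state transition probabilities) and `p0`, `p1` (crossover probabilities in States 0 and 1) are hypothetical labels chosen for this example, and the initial state is drawn from the stationary distribution under the assumption that `q01 + q10 > 0`.

```python
import random


def simulate_gilbert_elliott(x, q01, q10, p0, p1, seed=0):
    """Pass the bit sequence x through a simulated Gilbert-Elliott channel.

    q01 = P(next state is 1 | current state is 0)  (hypothetical name)
    q10 = P(next state is 0 | current state is 1)  (hypothetical name)
    p0, p1 = BSC crossover probabilities in States 0 and 1
    Returns (states, noise, outputs), each a list as long as x.
    """
    if q01 + q10 <= 0:
        raise ValueError("assumes q01 + q10 > 0 so a stationary law exists")
    rng = random.Random(seed)
    # Draw the initial state from the stationary distribution.
    pi1 = q01 / (q01 + q10)
    s = 1 if rng.random() < pi1 else 0
    states, noise, outputs = [], [], []
    for xi in x:
        # The noise bit is Bernoulli with the crossover probability of the state.
        crossover = p1 if s == 1 else p0
        z = 1 if rng.random() < crossover else 0
        states.append(s)
        noise.append(z)
        outputs.append(xi ^ z)  # mod-2 addition of input and noise
        # The state evolves independently of the channel input.
        if s == 0:
            s = 1 if rng.random() < q01 else 0
        else:
            s = 0 if rng.random() < q10 else 1
    return states, noise, outputs


if __name__ == "__main__":
    bits = [random.Random(1).randint(0, 1) for _ in range(20)]
    s, z, y = simulate_gilbert_elliott(bits, 0.05, 0.2, 0.01, 0.3)
    print("states :", s)
    print("noise  :", z)
    print("outputs:", y)
```

With small `q01` and `q10` the state changes rarely, so the errors cluster in runs while the state is 1, which is the bursty behavior the model is meant to capture.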

The capacity of the Gilbert–Elliott channel is achieved by having the input sequence comprise Independent and Identically Distributed (IID) random bits. It is given, in bits, by $[eqn]$ , where $[eqn]$ denotes the entropy rate of the noise $[eqn]$ [3]. If the state sequence is revealed to the decoder, then capacity is still achieved by the above input distribution, but it now equals the weighted average of the capacities of the BSCs, namely, $[eqn]$ . Here, $[eqn]$ denotes the probability that $[eqn]$ is 1, and $[eqn]$ is the binary entropy function. Throughout this paper, logarithms are to base 2, and information is measured in bits.
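As a numerical illustration of the decoder-side-state formula described above (the weighted average of the two BSC capacities), the following sketch evaluates it directly. The names `binary_entropy`, `capacity_state_at_decoder`, `pi1`, `p0`, and `p1` are hypothetical labels for this example; `pi1` stands for the stationary probability that the state is 1.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def capacity_state_at_decoder(pi1, p0, p1):
    """Weighted average of the capacities of the two BSCs, in bits per use.

    pi1 is the probability that the state is 1; p0 and p1 are the
    crossover probabilities in States 0 and 1.
    """
    return (1.0 - pi1) * (1.0 - binary_entropy(p0)) + pi1 * (
        1.0 - binary_entropy(p1)
    )


if __name__ == "__main__":
    # A noiseless good state and a useless bad state, each half the time.
    print(capacity_state_at_decoder(0.5, 0.0, 0.5))
```

In the example call, half of the channel uses are noiseless and the other half carry no information, so the weighted average is one half bit per channel use.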

Here, we study the case where the state sequence is revealed not to the decoder but to the encoder. We do not, however, seek the Shannon capacity but rather the “rate-and-nearly-lossless-state” capacity, where the decoder wishes to recover not only the transmitted message but also the state sequence. We thus require that the state sequence be conveyed to the decoder with a vanishing symbol error rate. This requirement is weaker than the requirement that the probability of the receiver correctly recovering the entire state sequence tends to one. Neither do we consider a general (nonzero) distortion constraint, as studied for memoryless channels in [9,10].

We solve for this capacity and show that it does not depend on whether the state sequence is provided to the encoder strictly causally, causally, or noncausally. Moreover, it can be achieved using a Block-Markov coding scheme with backward decoding.

2. The Gilbert–Elliott Channel

The Gilbert–Elliott channel has binary input, output, and state alphabets: $[eqn]$ . The evolution of the state is unaffected by the channel inputs: irrespective of the channel inputs, it forms a stationary time-homogeneous Markov chain of kernel

[eqn]

and stationary distribution

[eqn]

by which we mean that

[eqn]

Above and throughout, we use $[eqn]$ to denote $[eqn]$ , and we use $[eqn]$ to denote $[eqn]$ . We denote the entropy rate of the state sequence $[eqn]$ . It is given explicitly by [11]

[eqn]
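For a stationary two-state Markov chain, the entropy rate is the stationary average of the binary entropies of the two transition probabilities, a standard fact. The sketch below computes it; the parameter names `q01` and `q10` (probabilities of leaving State 0 and State 1) are assumptions made for this example and need not match the paper's notation for the kernel.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def stationary_distribution(q01, q10):
    """Return (pi0, pi1) for the chain with P(0->1) = q01 and P(1->0) = q10."""
    if q01 + q10 <= 0:
        raise ValueError("assumes q01 + q10 > 0 so the stationary law is unique")
    return q10 / (q01 + q10), q01 / (q01 + q10)


def state_entropy_rate(q01, q10):
    """Entropy rate of the stationary two-state Markov chain, in bits."""
    pi0, pi1 = stationary_distribution(q01, q10)
    return pi0 * binary_entropy(q01) + pi1 * binary_entropy(q10)


if __name__ == "__main__":
    print(stationary_distribution(0.05, 0.2))
    print(state_entropy_rate(0.05, 0.2))
```

A chain that rarely changes state has a small entropy rate, so its state sequence is cheap to describe; a chain with `q01 = q10 = 0.5` is IID uniform and costs one bit per symbol.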

Given $[eqn]$ , the channel from X to Y is a BSC with crossover probability $[eqn]$ , whereas, given $[eqn]$ , it is a BSC with crossover probability $[eqn]$ . Here, $[eqn]$ are arbitrary known constants. Thus, if we define

[eqn]

and

[eqn]

then we can express the behavior of the channel given the state as

[eqn]

3. The Rate-and-Nearly-Lossless-State Capacity

When discussing rate-R blocklength-n communications over the Gilbert–Elliott channel, we consider the message set $[eqn]$ with $[eqn]$ messages, one of which is to be conveyed to the receiver. The latter observes the output sequence $[eqn]$ and attempts to recover the transmitted message m and the state sequence $[eqn]$ . The decoder is thus specified by a function

[eqn]

As to the encoder, its structure depends on the manner in which the state information is revealed to it. In the strictly causal setting, the Time-i channel input may depend, not only on the transmitted message, but also on the past states; it is therefore denoted $[eqn]$ . The encoder is thus specified by n functions

[eqn]

with $[eqn]$ being $[eqn]$ . In the causal case, the Time-i channel input is denoted $[eqn]$ , and the encoder is specified by n functions

[eqn]

with $[eqn]$ being $[eqn]$ . Finally, in the noncausal case, the Time-i channel input is denoted $[eqn]$ , and the encoder is specified by one function

[eqn]

with $[eqn]$ being the i-th component of $[eqn]$ .

We refer to an encoder/decoder pair as a “coding scheme”. The probability of error associated with a given coding scheme and a given message m is $[eqn]$ calculated when the scheme’s encoder is used to transmit Message m, and the scheme’s decoder is used by the receiver. The average probability of error associated with a coding scheme is the arithmetic average of the probabilities of error associated with the different messages. It can be expressed as

[eqn]

when the transmitted message M is drawn equiprobably from $[eqn]$ .

The symbol error rate, or (expected) Hamming distortion, in reconstructing the state sequence is

[eqn]

again computed when the transmitted message M is drawn equiprobably from $[eqn]$ .
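The difference between a small symbol error rate and recovery of the entire state sequence can be made concrete on a single realization. The sketch below computes the empirical fraction of mismatched state symbols, which is the quantity whose expectation the definition above constrains; the function names are hypothetical.

```python
def symbol_error_rate(true_states, estimated_states):
    """Fraction of positions in which the reconstructed state is wrong."""
    if len(true_states) != len(estimated_states):
        raise ValueError("sequences must have the same length")
    if len(true_states) == 0:
        raise ValueError("sequences must be nonempty")
    mismatches = sum(
        1 for s, s_hat in zip(true_states, estimated_states) if s != s_hat
    )
    return mismatches / len(true_states)


def sequence_recovered(true_states, estimated_states):
    """True only if every state symbol is reconstructed correctly."""
    return list(true_states) == list(estimated_states)


if __name__ == "__main__":
    n = 1000
    truth = [0] * n
    guess = [0] * n
    guess[17] = 1  # a single wrong symbol
    print(symbol_error_rate(truth, guess))
    print(sequence_recovered(truth, guess))
```

A reconstruction with one wrong symbol in a long block has a symbol error rate close to zero, yet the sequence as a whole is not recovered. This is why the vanishing-symbol-error-rate requirement is the weaker of the two.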

In all cases, we say that a rate R is achievable if there exists a sequence of coding schemes indexed by the blocklength for which $[eqn]$ and $[eqn]$ both tend to zero. The capacity is defined as the supremum of the achievable rates, with the understanding that, if no positive rate is achievable, then capacity is zero.

Our main result is the following theorem:

Theorem 1. Irrespective of whether the state information is provided to the encoder strictly causally, causally, or noncausally, if

[eqn]

is positive, then it equals the rate-and-nearly-lossless-state capacity of the Gilbert–Elliott channel; else, said capacity is zero.

The proof is provided in the next section. Expression (14) can be interpreted as the result of subtracting the optimal average description length of the state sequence from the capacity of the channel when the receiver is cognizant of the state.
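Taking the interpretation above at face value, Expression (14) can be evaluated as the decoder-side-state capacity minus the entropy rate of the state sequence, with the capacity set to zero when the difference is not positive. The sketch below does this under that reading; since the displayed expression is not reproduced in this text, the exact form used here is an assumption, as are the parameter names `q01`, `q10`, `p0`, and `p1`.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def rate_and_state_capacity(q01, q10, p0, p1):
    """Assumed reading of (14): state-aware capacity minus state entropy rate.

    q01 = P(0 -> 1), q10 = P(1 -> 0) for the state chain (hypothetical names);
    p0, p1 = crossover probabilities in States 0 and 1.
    Returns the value clipped at zero, in bits per channel use.
    """
    if q01 + q10 <= 0:
        raise ValueError("assumes q01 + q10 > 0")
    pi0 = q10 / (q01 + q10)
    pi1 = q01 / (q01 + q10)
    # Capacity when the receiver is cognizant of the state.
    state_aware = pi0 * (1.0 - binary_entropy(p0)) + pi1 * (
        1.0 - binary_entropy(p1)
    )
    # Optimal average description length of the state sequence.
    description = pi0 * binary_entropy(q01) + pi1 * binary_entropy(q10)
    return max(state_aware - description, 0.0)


if __name__ == "__main__":
    print(rate_and_state_capacity(0.01, 0.01, 0.0, 0.0))
    print(rate_and_state_capacity(0.5, 0.5, 0.0, 0.0))
```

In the second example call the state sequence is IID uniform, so describing it consumes a full bit per channel use and nothing is left for the message, even though both BSCs are noiseless.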

4. Proof of Theorem 1

4.1. Converse

We prove the converse part of the theorem under the assumption that the state sequence is provided to the encoder noncausally. Let M be uniform over $[eqn]$ . We have

[eqn]

Here, (15) holds because $[eqn]$ is a function of $[eqn]$ and hence also of $[eqn]$ ; (17) because $[eqn]$ is binary and because conditioning reduces entropy; and (18) because $[eqn]$ is concave and by recalling the definition of $[eqn]$ in (13). Note that $[eqn]$ tends to zero when $[eqn]$ tends to zero.

Since the decoder needs to recover M with high probability, Fano’s inequality implies the existence of some sequence $[eqn]$ tending to zero with the blocklength such that

[eqn]

From these two inequalities, we obtain

[eqn]

Note that, above, we only used the chain rule and the fact that $[eqn]$ (i.e., that M and $[eqn]$ are independent). The three terms in the last line can be bounded or simplified as follows:

[eqn]

The converse now follows from (26), (27), (28), and (32).

4.2. Direct Part

For the direct part of the theorem, we assume that the state sequence is revealed to the encoder strictly causally and consider a Block-Markov coding scheme with backward decoding. Consider b blocks, each of k channel uses. Label the length-k typical state sequences $[eqn]$ . For each block, randomly generate a codebook with $[eqn]$ independent codewords (with $[eqn]$ to be specified later), each having k independent Bernoulli $[eqn]$ components. In the first block, use the entire codebook to send a message of $[eqn]$ bits. As to the transmission in Block i for $[eqn]$ , first check whether or not the state sequence in the previous block, namely, $[eqn]$ , was typical. If it was, set $[eqn]$ to be the label that is assigned to it; else, set $[eqn]$ . Use the generated codebook to send both $[eqn]$ and $[eqn]$ , the latter consisting of $[eqn]$ information bits, so that the total number of bits matches the size of the codebook.

After completing the transmission in all b blocks, add one extra block to transmit $[eqn]$ . To this end, the transmitter—ignoring any information about the states in this extra block—uses the Gilbert–Elliott channel to send $[eqn]$ (which we view as a message) and nothing else. The length of the extra block may be larger than k, but—as long as the Shannon capacity of our channel is positive—this will not affect the overall rate when we choose b to be very large. We will address this caveat shortly. However, first, we discuss the decoding.

To decode, we begin with the last block to decode $[eqn]$ and thus recover $[eqn]$ . This task will be accomplished successfully provided that the last extra block is sufficiently long. With the state information $[eqn]$ at hand, we then decode both $[eqn]$ and $[eqn]$ from Block b. Both can be decoded correctly with high probability (as k grows large) provided that

[eqn]

We continue this procedure backwards: by the time we get to decoding Block i, we will have already reliably recovered $[eqn]$ in Block $[eqn]$ , so we can use it as decoder-side information. The overall information rate, as b tends to infinity, approaches

[eqn]

which can indeed be made arbitrarily close to (14).
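The rate accounting of the Block-Markov scheme can be sketched numerically. The bookkeeping below is an assumption-laden illustration rather than the paper's exact expressions: `codebook_rate` stands for the bits per channel use carried by each block's codebook, `label_rate` for the bits per channel use spent on the label of the previous block's state sequence (slightly above the state entropy rate), and `extra_len` for the length of the extra block, which carries no message bits.

```python
def overall_rate(num_blocks, k, codebook_rate, label_rate, extra_len):
    """Message bits per channel use over num_blocks blocks plus an extra block.

    Block 1 spends its whole codebook on message bits. Every later block
    spends label_rate * k bits on the previous block's state label and
    the remainder on fresh message bits. The extra block only carries
    the label of the last block's state sequence.
    """
    if num_blocks < 1 or k < 1 or extra_len < 0:
        raise ValueError("invalid block structure")
    if label_rate > codebook_rate:
        raise ValueError("codebook too small to carry the state label")
    first_block_bits = k * codebook_rate
    later_block_bits = (num_blocks - 1) * k * (codebook_rate - label_rate)
    total_channel_uses = num_blocks * k + extra_len
    return (first_block_bits + later_block_bits) / total_channel_uses


if __name__ == "__main__":
    for b in (1, 10, 100, 10_000):
        print(b, overall_rate(b, k=1000, codebook_rate=0.8,
                              label_rate=0.3, extra_len=5000))
```

As the number of blocks grows, both the bonus from the first block and the overhead of the extra block are amortized, and the rate approaches `codebook_rate - label_rate`.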

We have now established that, provided that $[eqn]$ is appropriately chosen, as k grows large, the probability of the decoder correctly decoding both the message and $[eqn]$ tends to one. It only remains to note that this also means that the symbol error rate $[eqn]$ is small. Indeed, having $[eqn]$ with high probability guarantees that the symbol error rate in the first b blocks is close to zero, whereas the influence of the extra block (whose symbol error rate will typically be high, since we do not attempt to decode the states there) becomes negligible when we choose b to be large.

We now return to the caveat regarding the Shannon capacity and the extra block. We need to show that the Shannon capacity is positive whenever (14) is positive. (If (14) is not positive, there is no need for a direct part.) Recall that the Shannon capacity of the Gilbert–Elliott channel without any state information is $[eqn]$ , with $[eqn]$ denoting the noise random variables, which form a Hidden Markov process. This Shannon capacity is positive unless $[eqn]$ bit. Our concern regarding the caveat is thus only when $[eqn]$ is 1.
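The entropy rate of the hidden Markov noise has no simple closed form, but it can be estimated numerically. The sketch below simulates the noise and evaluates the normalized negative log-probability of the realization with the forward recursion; for an ergodic state chain this converges to the entropy rate by the Shannon–McMillan–Breiman theorem. This is a swapped-in numerical technique, not a method from the paper, and the parameter names are hypothetical.

```python
import random
from math import log2


def estimate_noise_entropy_rate(q01, q10, p0, p1, n=20000, seed=0):
    """Estimate the entropy rate (bits) of the Gilbert-Elliott noise process.

    q01 = P(0 -> 1), q10 = P(1 -> 0) for the state chain; p0, p1 are the
    probabilities that the noise bit is 1 in States 0 and 1.
    Returns -(1/n) log2 P(z_1, ..., z_n) for one simulated realization.
    """
    if q01 + q10 <= 0:
        raise ValueError("assumes q01 + q10 > 0")
    rng = random.Random(seed)
    pi1 = q01 / (q01 + q10)
    flip = [p0, p1]
    trans = [[1.0 - q01, q01], [q10, 1.0 - q10]]
    s = 1 if rng.random() < pi1 else 0
    # belief[j] = P(current state = j | all past noise bits)
    belief = [1.0 - pi1, pi1]
    total = 0.0
    for _ in range(n):
        z = 1 if rng.random() < flip[s] else 0
        like = [flip[j] if z == 1 else 1.0 - flip[j] for j in (0, 1)]
        joint = [belief[j] * like[j] for j in (0, 1)]
        pz = joint[0] + joint[1]  # P(z | past noise bits)
        total -= log2(pz)
        post = [joint[j] / pz for j in (0, 1)]
        # Predict the next state from the posterior on the current one.
        belief = [post[0] * trans[0][j] + post[1] * trans[1][j] for j in (0, 1)]
        # Advance the true state.
        if s == 0:
            s = 1 if rng.random() < q01 else 0
        else:
            s = 0 if rng.random() < q10 else 1
    return total / n


if __name__ == "__main__":
    h_noise = estimate_noise_entropy_rate(0.05, 0.2, 0.01, 0.3)
    print("estimated noise entropy rate:", h_noise)
    print("estimated Shannon capacity  :", 1.0 - h_noise)
```

When both crossover probabilities equal one half, every conditional probability in the recursion is one half and the estimate is one bit, which is the degenerate case in which the Shannon capacity vanishes.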

In Appendix A, we show the following:

Proposition 1. If $[eqn]$ , then

$[eqn]$ are IID Bernoulli $[eqn]$ , and

either $[eqn]$ are IID or $[eqn]$ .

If $[eqn]$ are IID Bernoulli $[eqn]$ and $[eqn]$ are IID, then (14) cannot be positive because, in this case, it equals

[eqn]

If $[eqn]$ , then (14) is also (trivially) not positive. Hence, indeed, the Shannon capacity is positive whenever (14) is positive.

Bibliography

1. Gilbert, E.N. Capacity of a burst-noise channel. Bell Syst. Tech. J. 1960, 39, 1253–1265. doi:10.1002/j.1538-7305.1960.tb03959.x
2. Elliott, E.O. Estimates of error rates for codes on burst-noise channels. Bell Syst. Tech. J. 1963, 42, 1977–1997. doi:10.1002/j.1538-7305.1963.tb00955.x
3. Mushkin, M.; Bar-David, I. Capacity and coding for the Gilbert-Elliott channels. IEEE Trans. Inf. Theory 1989, 35, 1277–1290. doi:10.1109/18.45284
4. Han, Y.; Guillén i Fàbregas, A. Fixed-memory capacity bounds for the Gilbert-Elliott channel. In Proceedings of the 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 7–12 July 2024; pp. 155–159.
5. Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968.
6. Goldsmith, A.; Varaiya, P. Capacity, mutual information, and coding for finite-state Markov channels. IEEE Trans. Inf. Theory 1996, 42, 868–886. doi:10.1109/18.490551
7. Permuter, H.H.; Weissman, T.; Goldsmith, A.J. Finite state channels with time-invariant deterministic feedback. IEEE Trans. Inf. Theory 2006, 55, 644–662. doi:10.1109/TIT.2008.2009849
8. Shrader, B.; Permuter, H. Feedback capacity of the compound channel. IEEE Trans. Inf. Theory 2009, 55, 3629–3644. doi:10.1109/TIT.2009.2023727