The Arbitrarily Varying Channel with Colored Gaussian Noise
Uzi Pereg, Yossef Steinberg

Uzi Pereg (1) and Yossef Steinberg (2)
(1) Institute for Communications Engineering, Technical University of Munich
(2) Department of Electrical Engineering, Technion
Email: [email protected], [email protected]
Abstract
We address the arbitrarily varying channel (AVC) with colored Gaussian noise. The work consists of three parts. First, we study the general discrete AVC with fixed parameters, where the channel depends on two state sequences, one arbitrary and the other fixed and known. This model can be viewed as a combination of the AVC and the time-varying channel. We determine both the deterministic code capacity and the random code capacity. Super-additivity is demonstrated, showing that the deterministic code capacity can be strictly larger than the weighted sum of the parametric capacities.
In the second part, we consider the arbitrarily varying Gaussian product channel (AVGPC). Hughes and Narayan characterized the random code capacity through min-max optimization leading to a “double” water filling solution. Here, we establish the deterministic code capacity and also discuss the game-theoretic meaning and the connection between double water filling and Nash equilibrium. As in the case of the standard Gaussian AVC, the deterministic code capacity is discontinuous in the input constraint, and depends on which of the input or state constraint is higher. As opposed to Shannon’s classic water filling solution, it is observed that deterministic coding using independent scalar codes is suboptimal for the AVGPC.
Finally, we establish the capacity of the AVC with colored Gaussian noise, where double water filling is performed in the frequency domain. The analysis relies on our preceding results, on the AVC with fixed parameters and the AVGPC.
Index Terms:
Arbitrarily varying channel, water filling, colored Gaussian noise, time varying channel, Gaussian product channel, deterministic code, random code.
This work was supported by the Israel Science Foundation (grant No. 1285/16).
I Introduction
A channel with colored Gaussian noise was first studied by Shannon [94], who introduced the optimal water filling power allocation. This channel is the spectral counterpart of the Gaussian product channel (see e.g. [27, Section 9.5]). Those results led to useful algorithms for DSL and OFDM systems, and were generalized to multiple-input multiple-output (MIMO) wireless communication systems as well (see e.g. [99, 38, 12, 11, 93, 41]). Furthermore, for some networks, water filling is performed in multiple stages [26, 111, 113, 114, 71, 105]. A limit formula for the capacity of the general time-varying channel (TVC) is given in [102] (see also [29, 47, 3, 33, 10, 76, 87, 112]). Another relevant setting is that of a finite-state channel, where the state evolves as a Markov chain [110, 74, 14, 73, 46, 100, 98]. In practice, there is often uncertainty regarding channel statistics, due to a variety of causes such as fading in wireless communication [95, 92, 1, 80, 42, 25, 59, 57], memory faults in storage [68, 51, 69, 66], malicious attacks on identification systems [45, 62], and cyber-physical warfare [97, 72, 104]. The arbitrarily varying channel (AVC) is an appropriate model to describe such a situation [16, 73].
Blackwell et al. [16] determined the random code capacity of the general AVC, i.e. the capacity achieved with shared randomness between the encoder and the decoder. It was also demonstrated in [16] that the random code capacity is not necessarily achievable using deterministic codes. A well-known result by Ahlswede [5] is the dichotomy property of the AVC, i.e. the deterministic code capacity, also referred to as ‘capacity’, either equals the random code capacity or is zero. Subsequently, Ericson [37] and Csiszár and Narayan [30] established a simple single-letter condition, namely non-symmetrizability, which is both necessary and sufficient for the capacity to be positive. Schaefer et al. [91] demonstrated the super-additivity phenomenon, i.e. the capacity of a product of orthogonal AVCs can be strictly larger than the sum of the capacities of the components. Csiszár and Narayan [31, 30] also considered the AVC when input and state constraints are imposed on the user and the jammer, respectively, due to their power limitations. Not only does the constrained setting pose serious analytical difficulties, but, as shown in [30], the constraints also have a significant effect on the behavior of the capacity. Specifically, it is shown in [30] that dichotomy in the sense of [5] no longer holds when state constraints are imposed on the jammer. That is, the deterministic code capacity of the general AVC can be lower than the random code capacity, and yet non-zero.
The Gaussian AVC is specified by the relation $\mathbf{Y}=\mathbf{X}+\mathbf{S}+\mathbf{Z}$, where $\mathbf{X}$ and $\mathbf{Y}$ are the input and output sequences, respectively; $\mathbf{S}$ is a state sequence of unknown joint distribution, not necessarily independent nor stationary; and the noise sequence $\mathbf{Z}$ is i.i.d. $\mathcal{N}(0,\sigma^{2})$. The state sequence can be thought of as if generated by an adversary, or a jammer, who randomizes the channel states arbitrarily in an attempt to disrupt communication. It is also possible for $\mathbf{S}$ to be a deterministic unknown state sequence. It is assumed that the user and the jammer have power limitations, and are subject to input and state constraints, $\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}\leq\Omega$ and $\frac{1}{n}\sum_{i=1}^{n}S_{i}^{2}\leq\Lambda$, respectively, where $n$ is the transmission length. In [60], Hughes and Narayan showed that the random code capacity is given by $\mathsf{C}^{\star}_{1}=\frac{1}{2}\log\left(1+\frac{\Omega}{\sigma^{2}+\Lambda}\right)$. Subsequently, Csiszár and Narayan [32] showed that the deterministic code capacity is given by
$$\mathsf{C}_{1}=\begin{cases}\mathsf{C}^{\star}_{1} & \text{if } \Lambda<\Omega,\\ 0 & \text{if } \Lambda\geq\Omega.\end{cases}$$
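As a quick numerical illustration of the two capacity expressions for the scalar Gaussian AVC, the following Python sketch evaluates them. The function names are hypothetical, logarithms are taken to base 2 (an assumption, giving bits per transmission), and the threshold rule assumes that the deterministic code capacity equals the random code capacity when $\Lambda<\Omega$ and is zero otherwise.

```python
import math


def random_code_capacity(omega, lam, sigma2):
    """Random code capacity 0.5 * log2(1 + omega / (sigma2 + lam)), in bits."""
    return 0.5 * math.log2(1.0 + omega / (sigma2 + lam))


def deterministic_code_capacity(omega, lam, sigma2):
    """Assumed threshold rule: equal to the random code capacity iff lam < omega."""
    if lam < omega:
        return random_code_capacity(omega, lam, sigma2)
    return 0.0
```

For instance, with $\Omega=3$, $\Lambda=1$ and $\sigma^{2}=1$ both functions return the same positive value, whereas swapping the two constraints drives the deterministic code capacity to zero while the random code capacity stays positive.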
It is noted in [32] that this result is not a straightforward consequence of the elegant Elimination Technique [5], used by Ahlswede to establish dichotomy for the AVC without constraints. Hosseinigoki and Kosut [57] determined the capacity in multiple side information scenarios for the Gaussian AVC with fast fading. Hughes and Narayan [61] determined the random code capacity of the arbitrarily varying Gaussian product channel (AVGPC), and showed that it is obtained as a “double” water filling solution to a min-max optimization problem, maximizing over the input power allocation and minimizing over the state power allocation. In the solution, the jammer performs water filling first, attempting to whiten the overall noise as much as possible, and then the user performs water filling taking into account the total interference power, contributed by both the channel noise and the jamming signal [61]. The Gaussian AVC is also considered in [4, 101, 70, 88, 90, 56, 59].
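The two-stage procedure described above can be sketched in Python as follows. This is a minimal illustration, not the formulation of [61]: the function names, the list-based interface, the use of total (rather than per-symbol) power budgets `omega` and `lam`, and base-2 logarithms are all assumptions, and the noise variances are assumed to be strictly positive.

```python
import math


def water_fill(floors, budget):
    """Return powers p_j = max(level - floors[j], 0) that add up to `budget`."""
    ordered = sorted(floors)
    level = ordered[0]
    partial = 0.0
    # Keep the largest number k of lowest floors whose common level exceeds them.
    for k, floor in enumerate(ordered, start=1):
        partial += floor
        candidate = (budget + partial) / k
        if candidate > floor:
            level = candidate
    return [max(level - f, 0.0) for f in floors]


def double_water_filling(noise_vars, omega, lam):
    """Jammer water-fills over the noise first; the user then water-fills
    over the total interference (noise plus jamming power)."""
    jam = water_fill(noise_vars, lam)
    interference = [v + s for v, s in zip(noise_vars, jam)]
    user = water_fill(interference, omega)
    rate = sum(0.5 * math.log2(1.0 + p / q) for p, q in zip(user, interference))
    return user, jam, rate
```

With noise variances `[1, 2, 5]`, `lam = 3` and `omega = 4`, the jammer raises the two quietest components to a common interference level of 3, and the user then spends its power only on those two components, leaving the noisiest one unused.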
Extensive research has been conducted on other AVC models as well, of which we name a few. Recently, the arbitrarily varying wiretap channel has been extensively studied, as e.g. in [77, 17, 9, 18, 19, 78, 48, 2], including input and state constraints in [13, 64, 40]. The capacity region of the arbitrarily varying multiple access channel (MAC) with and without constraints is characterized in [85, 63, 7, 8]; capacity bounds for the arbitrarily varying broadcast channel are derived in [63, 52]; and for the arbitrarily varying relay channel in [83, 81]. Additional results on arbitrarily varying multi-user channels and constraints are derived e.g. in [108, 24, 50, 106, 84, 65]. Transmission of an arbitrarily varying Wyner-Ziv source over a Gel’fand-Pinsker channel is considered in [109, 107], and related problems were recently presented in [24, 22, 21]. Various Gaussian AVC networks are studied e.g. in [89, 49, 23, 54, 55, 82, 83, 85, 58].
In this paper, we address the AVC with colored Gaussian noise. The body of this manuscript consists of three parts, of which the first and the second can also be viewed as milestones on our path to the main result. First, we study the general discrete AVC with fixed parameters. This model is a combination of the TVC and the AVC, as the channel depends on two state sequences, one arbitrary and the other fixed. We determine both the deterministic code capacity and the random code capacity. Deterministic code super-additivity is demonstrated, showing that the capacity can be strictly larger than the weighted sum of the parametric capacities. In the second part of this paper, we establish the deterministic code capacity of the AVGPC, where there is white Gaussian noise and no parameters. We also give observations and discuss the game-theoretic interpretation of Hughes and Narayan’s random code characterization [61], and the connection between the double water filling solution and the idea of Nash equilibrium in game theory. We further examine the connection between the AVGPC and the product MAC [26, 71] (without a state), pointing out the similarities and differences between the models, results, and interpretation. As in the case of the standard Gaussian AVC, the deterministic code capacity is discontinuous in the input constraint, and depends on which of the input or state constraint is higher. As opposed to Shannon’s classic water filling solution [94], it is observed that deterministic coding using independent scalar codes is suboptimal for the AVGPC. Finally, we establish the capacity of the AVC with colored Gaussian noise, where double water filling is performed in the frequency domain.
While the results on the AVC with fixed parameters and on the AVGPC stand in their own right, they also play a key role in our proof of the main capacity theorem for the AVC with colored Gaussian noise. In the random code analysis for the AVC with fixed parameters, we modify Ahlswede’s Robustification Technique (RT) [6]. Essentially, the RT uses a reliable code for the compound channel to construct a random code for the AVC applying random permutations to the codeword symbols. A straightforward application of Ahlswede’s RT does not work here, since the user cannot apply permutations to the parameter sequence. Hence, we give a modified RT which is restricted to permutations that do not affect the parameter sequence, i.e. such that the parameter sequence is an eigenvector of all of our permutation matrices. The second part of the paper builds on identifying the symmetrizing jamming strategies and minimal symmetrizability costs for the AVGPC. At last, we use the results on the AVC with fixed parameters and the AVGPC in our proof of the capacity theorem for the AVC with colored Gaussian noise. By orthogonalization of the noise covariance, the AVC with colored Gaussian noise is transformed into an AVC with fixed parameters, which are determined by the spectral representation of the noise covariance matrix. This in turn yields double water-filling optimization in analogy to the AVGPC.
II Channels with Fixed Parameters
In this section we consider the AVC with fixed parameters. The results in this section will be used to analyze the AVC with colored Gaussian noise.
II-A Notation
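We use the following notation. Calligraphic letters $\mathcal{X},\mathcal{S},\mathcal{T},\mathcal{Y},\ldots$ are used for finite sets. Lowercase letters $x,s,t,y,\ldots$ stand for constants and values of random variables, and uppercase letters $X,S,T,Y,\ldots$ stand for random variables. The distribution of a random variable $X$ is specified by a probability mass function (pmf) $P_{X}(x)$ over a finite set $\mathcal{X}$. The set of all pmfs over $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. The set of all probability kernels $P_{X|T}$ is denoted by $\mathcal{P}(\mathcal{X}|\mathcal{T})$. We use $x^{j}=(x_{1},x_{2},\ldots,x_{j})$ to denote a sequence of letters from $\mathcal{X}$. A random sequence $X^{n}$ and its distribution $P_{X^{n}}(x^{n})$ are defined accordingly. For a pair of integers $i$ and $j$, $1\leq i\leq j$, we define the discrete interval $[i:j]=\{i,i+1,\ldots,j\}$.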
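The type $\hat{P}_{x^{n}}$ of a given sequence $x^{n}$ is defined as the empirical distribution $\hat{P}_{x^{n}}(a)=N(a|x^{n})/n$ for $a\in\mathcal{X}$, where $N(a|x^{n})$ is the number of occurrences of the symbol $a$ in the sequence $x^{n}$. A type class is the set of all sequences $x^{n}$ that share a given type. Similarly, define the joint type $\hat{P}_{x^{n},y^{n}}(a,b)=N(a,b|x^{n},y^{n})/n$ for $a\in\mathcal{X}$, $b\in\mathcal{Y}$, where $N(a,b|x^{n},y^{n})$ is the number of occurrences of the symbol pair $(a,b)$ in the sequence $(x_{i},y_{i})_{i=1}^{n}$. Then, a conditional type is defined as $\hat{P}_{x^{n}|y^{n}}(a|b)=N(a,b|x^{n},y^{n})/N(b|y^{n})$. Furthermore, we define the $\delta$-typical set $\mathcal{A}^{\delta}(P_{X})$ with respect to a distribution $P_{X}$ by
$$\mathcal{A}^{\delta}(P_{X})=\left\{x^{n}\in\mathcal{X}^{n}:\ \left|\hat{P}_{x^{n}}(a)-P_{X}(a)\right|\leq\delta\ \text{ if } P_{X}(a)>0,\ \ \hat{P}_{x^{n}}(a)=0\ \text{ if } P_{X}(a)=0,\ \ \forall\,a\in\mathcal{X}\right\}.$$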
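The empirical quantities above can be computed directly from symbol counts. The sketch below is illustrative only: the function names and the dictionary representation of pmfs are assumptions, and the typicality test assumes the common form in which every symbol's empirical frequency is within $\delta$ of its probability and symbols of probability zero never occur.

```python
from collections import Counter


def type_of(seq):
    """Empirical distribution (type) of a sequence, as a dict symbol -> frequency."""
    n = len(seq)
    return {a: c / n for a, c in Counter(seq).items()}


def joint_type(xs, ys):
    """Joint type of two equal-length sequences, keyed by symbol pairs (a, b)."""
    n = len(xs)
    return {ab: c / n for ab, c in Counter(zip(xs, ys)).items()}


def conditional_type(xs, ys):
    """Conditional type N(a, b) / N(b), keyed by (a, b)."""
    pair_counts = Counter(zip(xs, ys))
    y_counts = Counter(ys)
    return {(a, b): c / y_counts[b] for (a, b), c in pair_counts.items()}


def is_typical(seq, pmf, delta):
    """Assumed typicality test: |type(a) - pmf(a)| <= delta for every symbol a,
    and no symbol with pmf(a) == 0 appears in the sequence."""
    emp = type_of(seq)
    for a in set(pmf) | set(emp):
        p = pmf.get(a, 0.0)
        q = emp.get(a, 0.0)
        if p == 0.0 and q > 0.0:
            return False
        if abs(q - p) > delta:
            return False
    return True
```

Only symbols that actually occur appear as keys in the returned dictionaries, so a missing key should be read as frequency zero.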
The distribution of a real random variable $Z$ is represented by a cumulative distribution function (cdf) $F_{Z}(z)=\Pr(Z\leq z)$ over the real line, or alternatively, the probability density function (pdf) $f_{Z}(z)$, when it exists. The notation $\mathbf{x}=(x_{1},x_{2},\ldots,x_{n})$ is used when it is understood from the context that the length of the sequence is $n$, and the $\ell^{2}$-norm of $\mathbf{x}$ is denoted by $\lVert\mathbf{x}\rVert$. The trace of a matrix $A$ is denoted by $\mathrm{tr}(A)$.
II-B Channel Description
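A state-dependent discrete memoryless channel (DMC) with parameters $(\mathcal{X}\times\mathcal{S}\times\mathcal{T},W_{Y|X,S,T},\mathcal{Y})$ consists of a finite input alphabet $\mathcal{X}$, state alphabet $\mathcal{S}$, parameter alphabet $\mathcal{T}$, output alphabet $\mathcal{Y}$, and a conditional pmf $W_{Y|X,S,T}$ over $\mathcal{Y}$. The channel is without feedback, and it is memoryless when conditioned on the state and parameter sequences, i.e.
$$W_{Y^{n}|X^{n},S^{n},T^{n}}(y^{n}|x^{n},s^{n},t^{n})=\prod_{i=1}^{n}W_{Y|X,S,T}(y_{i}|x_{i},s_{i},t_{i}).$$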
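The conditional memorylessness above means that the likelihood of an output sequence is a product of per-letter channel probabilities. The sketch below illustrates this; the dictionary layout `channel[(x, s, t)][y]` and the toy binary channel, in which the parameter selects the crossover probability, are hypothetical choices made only for the example.

```python
def sequence_likelihood(channel, ys, xs, ss, ts):
    """Product of per-letter probabilities channel[(x, s, t)][y] over the block."""
    if not (len(ys) == len(xs) == len(ss) == len(ts)):
        raise ValueError("all sequences must have the same length")
    prob = 1.0
    for y, x, s, t in zip(ys, xs, ss, ts):
        prob *= channel[(x, s, t)].get(y, 0.0)
    return prob


def parametric_bsc(eps_by_param):
    """Toy channel: the output is x XOR s, flipped with probability eps_by_param[t]."""
    channel = {}
    for t, eps in eps_by_param.items():
        for x in (0, 1):
            for s in (0, 1):
                clean = x ^ s
                channel[(x, s, t)] = {clean: 1.0 - eps, 1 - clean: eps}
    return channel
```

For example, with crossover probabilities 0.1 and 0.25 for the parameter values 0 and 1, a noise-free reception over the parameter sequence (0, 1) has likelihood 0.9 × 0.75.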
The AVC with fixed parameters is a DMC where the parameter sequence is fixed, while the state sequence has an unknown distribution, not necessarily independent nor stationary. That is, the parameter sequence is given by
[TABLE]
where is a given sequence of letters from , known to the encoder, decoder, and jammer. The state sequence , in contrast, has an unknown joint pmf over . In particular, could give mass $1$ to some state sequence . The AVC with fixed parameters is denoted by , where is short notation for the sequence .
The compound channel with fixed parameters is used as a tool in the analysis. Different models of compound channels are described in the literature [29]. Here, the compound channel with fixed parameters is a DMC where the state has a conditional product distribution that is not known exactly, but rather belongs to a family of conditional distributions , with . That is,
[TABLE]
with an unknown conditional pmf . We note that this differs from the classical definition of the compound channel, as in [29], where the state is fixed throughout the transmission.
Remark 1.
Note that the special case of a channel , with a constant parameter for , reduces to the standard state-dependent DMC. Thereby, the AVC with a constant parameter can be regarded as the traditional AVC, as introduced by Blackwell et al. [16]. On the other hand, the special case of a channel , which does not depend on the state , reduces to a TVC [102].
Remark 2.
The AVC with colored Gaussian noise does not fit the description above. Nevertheless, the fixed parameters model is a crucial tool for our final goal, i.e. to determine the capacity of the AVC with colored Gaussian noise.
II-C Coding
We introduce some preliminary definitions.
Definition 1 (Code).
A code for the AVC with fixed parameters consists of the following: a message set , where is assumed to be an integer, an encoding function , and a decoding function .
Given a message and a parameter sequence , the encoder transmits the codeword . The decoder receives the channel output , and finds an estimate of the message . We denote the code by .
We proceed now to coding schemes when using stochastic-encoder stochastic-decoder pairs with common randomness.
Definition 2 (Random code).
A random code for the AVC with fixed parameters consists of a collection of codes , along with a probability distribution over the code collection . We denote such a code by .
II-D Input and State Constraints
Next, we consider an input constraint and a state constraint, imposed on the encoder and the jammer, respectively. We note that the constraint specifications are known to both the user and the jammer in this model. Let , , and be some given bounded functions, and define
[TABLE]
Let and . Below, we specify the input constraint and state constraint , corresponding to the functions and , respectively. It is assumed that for some and , .
As the parameter sequence is fixed and known to the encoder, the decoder and the jammer, the input and state constraints below are specified for a particular sequence. Given an input constraint , the encoding function needs to satisfy
[TABLE]
That is, the input sequence satisfies with probability $1$.
Moving to the state constraint , we have different definitions for the AVC and for the compound channel. The compound channel has a constraint on average, where the state sequence satisfies , while the AVC has an almost-sure constraint, with probability (w.p.) $1$. Explicitly, we say that a compound channel is under a state constraint if , where
[TABLE]
This includes the case of a deterministic unknown state sequence, i.e. when gives probability $1$ to a particular with .
II-E Capacity Under Constraints
We move to the definition of an achievable rate and the capacity of the AVC with fixed parameters under input and state constraints. Codes over the AVC with fixed parameters are defined as in Definition 1, with the additional constraint (8) on the codebook.
Define the conditional probability of error of a code given a state sequence by
[TABLE]
Definition 3 (Achievable rate and capacity under constraints).
A code is called a code for the AVC with fixed parameters under input constraint and state constraint , when (8) is satisfied and
[TABLE]
or, equivalently, for all with .
We say that a rate is achievable under constraints if for every and sufficiently large , there exists a code for the AVC with fixed parameters under input constraint and state constraint . The operational capacity is defined as the supremum of achievable rates, and it is denoted by . We use the term ‘capacity’ referring to this operational meaning, and in some places we call it the deterministic code capacity in order to emphasize that achievability is measured with respect to deterministic codes.
Analogously to the deterministic case, a random code satisfies the requirements
[TABLE]
The capacity achieved by random codes is then denoted by $\mathbb{C}^{\star}(\mathcal{W})$, and it is referred to as the random code capacity.
The definitions above are naturally extended to the compound channel with fixed parameters, under input constraint and state constraint , by limiting the requirements (8), (12) and (13) to conditionally memoryless state distributions . The respective deterministic code capacity $\mathbb{C}(\mathcal{W}^{\mathcal{Q}})$ and random code capacity $\mathbb{C}^{\star}(\mathcal{W}^{\mathcal{Q}})$ are defined accordingly.
III Main Results – Channels with Fixed Parameters
In this section, we establish the random code capacity of the AVC with fixed parameters. To this end, we first give an auxiliary result on the compound channel.
III-A The Compound Channel with Fixed Parameters
We begin with the capacity theorem for the compound channel . This is an auxiliary result, obtained by a simple extension of [29, Exercise 6.8]. A similar result appears in [74] as well. Given a parameter sequence of a fixed length, define
[TABLE]
with , where is the type of the parameter sequence .
Lemma 1.
The capacity of the compound channel with fixed parameters, under input constraint and state constraint , is given by
[TABLE]
and it is identical to the random code capacity, i.e. $\mathbb{C}^{\star}(\mathcal{W}^{\mathcal{Q}})=\mathbb{C}(\mathcal{W}^{\mathcal{Q}})$.
The proof of Lemma 1 is given in Appendix A.
III-B The AVC with Fixed Parameters – Random Code Capacity
We determine the random code capacity of the AVC with fixed parameters, , under input constraint and state constraint . The random code derivation is based on our result on the compound channel with fixed parameters and a variation of Ahlswede’s Robustification Technique (RT). Define
[TABLE]
We begin with a lemma, based on Ahlswede’s RT [6] (see also [82, Lemma 9]). We modify it here to include the parameter sequence and the constraint on the family of conditional state distributions .
Lemma 2 (Modified RT).
Let be a given function. If, for some fixed , and for all , with ,
[TABLE]
then,
[TABLE]
where is the set of all -tuple permutations such that , and .
Originally, Ahlswede’s RT is stated so that (17) holds for any , without state constraint (see [6]), and without conditioning on the parameter sequence . We give the proof of Lemma 2 in Appendix B. Next, we give our random code capacity theorem.
Theorem 3.
The random code capacity of the AVC with fixed parameters, under input constraint and state constraint , is given by
[TABLE]
The proof of Theorem 3 is given in Appendix C. The proof is based on our extension of Ahlswede’s RT above. Essentially, we use a reliable code for the compound channel to construct a random code for the AVC by applying random permutations to the codeword symbols. However, here, we only use permutations that do not affect the parameter sequence . The result above plays a central role in the proof of the capacity theorem in Section V, where the AVC with colored Gaussian noise is considered.
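To make the restriction on the permutations concrete, the following Python sketch samples a permutation that leaves a given parameter sequence unchanged by shuffling positions only within each group of equal parameter values. It is an illustration of the idea rather than the construction used in the proof; the function names and the index-list representation of a permutation are assumptions.

```python
import random


def parameter_preserving_permutation(theta, rng=random):
    """Sample an index list pi with theta[pi[i]] == theta[i] for every i,
    by shuffling positions only within each class of equal parameter values."""
    positions = {}
    for i, t in enumerate(theta):
        positions.setdefault(t, []).append(i)
    pi = list(range(len(theta)))
    for idx in positions.values():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        for src, dst in zip(idx, shuffled):
            pi[src] = dst
    return pi


def apply_permutation(pi, seq):
    """Reorder a sequence so that position i holds seq[pi[i]]."""
    return [seq[j] for j in pi]
```

Applying such a permutation to a codeword reorders its symbols, while applying it to the parameter sequence returns the same sequence. Because each class is shuffled independently and uniformly, the result is uniform over all permutations with this property.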
We also give an equivalent formulation in terms of the random code capacity of the traditional AVC. As mentioned in Remark 1, the case of an AVC with a constant parameter reduces to the traditional AVC under input and state constraints. For this channel, Csiszár and Narayan [31] showed that the random code capacity is given by
[TABLE]
where the last equality is due to the minimax theorem [96]. Then, define
[TABLE]
Lemma 4.
[TABLE]
The proof of Lemma 4 is given in Appendix D. Theorem 3 and Lemma 4 yield the following consequence.
Corollary 5.
The random code capacity of the AVC with fixed parameters, under input constraint and state constraint , is given by
[TABLE]
The corollary will also be useful in our analysis of the AVC with colored Gaussian noise.
III-C The AVC with Fixed Parameters – Deterministic Code Capacity
We move to the deterministic code capacity of the AVC with fixed parameters, , under input constraint and state constraint .
III-C1 Capacity Theorem
Before we state the capacity theorem, we give a few definitions. We begin with symmetrizability of a channel without parameters.
Definition 4 (see [30]).
A state-dependent DMC is said to be symmetrizable if for some conditional distribution ,
[TABLE]
Equivalently, the channel is symmetric, i.e. , for all and . We say that such a symmetrizes .
Intuitively, symmetrizability identifies a poor channel, where the jammer can impinge the communication scheme by randomizing the state sequence according to , for some codeword . Suppose that the transmitted codeword is . The codeword can be thought of as an impostor sent by the jammer. Now, since the “average channel” is symmetric with respect to and , the two codewords appear to the receiver as equally likely. Indeed, by [37], if the AVC without parameters and free of constraints is symmetrizable, then its capacity is zero.
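The symmetrizability condition can be checked numerically for small alphabets. The sketch below assumes the standard form of the condition from [30] for a channel without parameters, namely that $\sum_{s}W(y|x_{1},s)J(s|x_{2})$ is symmetric in $(x_{1},x_{2})$ for every output $y$. The function name, the dictionary layout, and the noiseless XOR channel used as an example are hypothetical.

```python
def symmetrizes(W, J, inputs, states, outputs, tol=1e-12):
    """Check sum_s W(y|x1,s) J(s|x2) == sum_s W(y|x2,s) J(s|x1) for all x1, x2, y.

    W[(x, s)] is a dict y -> probability; J[x] is a dict s -> probability.
    """
    for x1 in inputs:
        for x2 in inputs:
            for y in outputs:
                lhs = sum(W[(x1, s)].get(y, 0.0) * J[x2].get(s, 0.0) for s in states)
                rhs = sum(W[(x2, s)].get(y, 0.0) * J[x1].get(s, 0.0) for s in states)
                if abs(lhs - rhs) > tol:
                    return False
    return True


# Toy example: noiseless binary channel whose output is x XOR s.
xor_channel = {(x, s): {x ^ s: 1.0} for x in (0, 1) for s in (0, 1)}
copy_jammer = {0: {0: 1.0}, 1: {1: 1.0}}   # the state imitates an input symbol
idle_jammer = {0: {0: 1.0}, 1: {0: 1.0}}   # the state is always 0
```

In this toy example the jammer that imitates a codeword symbol symmetrizes the channel, since the receiver sees the XOR of two input-like symbols and cannot tell which one came from the user, whereas the idle jammer does not.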
We will assume that either the channels are all symmetrizable, or the number of non-symmetrizable channels grows linearly with . That is,
[TABLE]
The asymptotic notation means that there exist and such that for all . An intuitive explanation for this assumption is given in Remark 3 below. Next, we define a symmetrizability cost and threshold for the AVC with fixed parameters. For every and with
[TABLE]
define the minimal symmetrizability cost by
[TABLE]
where the minimization is over the conditional distributions that symmetrize , for (see Definition 4). We use the convention that a minimum value over an empty set is $+\infty$. Note that the last equality in (27) holds since is defined as the type of the parameter sequence , hence averaging over time is the same as averaging according to . In addition, define the symmetrizability threshold
[TABLE]
Intuitively, is the minimal average state cost which the jammer has to pay to symmetrize the channel at each time instance, for a given conditional input distribution . If this minimal state cost violates the state constraint , then the jammer is prohibited from symmetrizing the channel. Indeed, we will show that if there exists an input distribution with and for large , then the deterministic code capacity is positive. The symmetrizability threshold is the worst symmetrizability cost from the jammer’s perspective.
Our capacity result is stated below. Let
[TABLE]
with , where is the type of the parameter sequence with a fixed length .
Theorem 6.
Assume that for sufficiently large and that (25) holds. The capacity of an AVC with fixed parameters, under input constraint and state constraint , is given by
[TABLE]
In particular, if the channels , , are non-symmetrizable, then $\mathbb{C}(\mathcal{W})=\mathbb{C}^{\star}(\mathcal{W})=\liminf\limits_{n\rightarrow\infty}\mathsf{C}_{n}^{\star}(\mathcal{W})$. That is, the deterministic code capacity coincides with the random code capacity.
The proof of Theorem 6 is given in Appendix G. The theorem will also play a central role in the proof of the capacity theorem in Section V.
Remark 3.
Observe that the second part of the theorem implies that for the case where there are no constraints, i.e. and , non-symmetrizability is a sufficient condition for positive capacity. Specifically, according to the definition of , in (27)-(28), if some of the channels are non-symmetrizable, then the symmetrizability threshold is $+\infty$, hence the capacity is positive. Intuitively, if the number of such channels is constant, i.e. for all , it seems that this assignment of does not make sense, since the user cannot achieve positive rates by coding over a negligible fraction of the block. Yet, our assumption in (25) excludes this scenario. In particular, if is non-zero, then we assume that grows linearly in , in which case positive rates can be achieved by coding over the part of the block that lies within . Furthermore, without constraints, we may replace the linear growth assumption with a poly-logarithmic one, i.e. , with . Indeed, based on Ahlswede’s elimination technique [5], the random code capacity can be achieved with a code collection of polynomial size, . Therefore, without state constraints, the random element can be reliably sent to the receiver over the sub-block , at rate , which tends to zero as , hence the decrease in the overall rate is negligible as well. We deduce that if , then the deterministic code capacity of the AVC with fixed parameters without constraints is the same as the random code capacity, i.e.
[TABLE]
Remark 4.
Even in the case where there are no parameters, the boundary case where is an open problem. For the traditional AVC, however, it is conjectured in [30] that the capacity is zero in this case. Similarly, we conjecture that the capacity of the AVC with fixed parameters is given by for all values of , provided that (25) holds. There are special cases where we know that this holds, given in the corollary below. The corollary is based on the remark following Theorem 3 in [30].
Corollary 7.
Let be an AVC with fixed parameters such that all channels , , are symmetrizable. If the minimum in (27) is attained by a $0$-$1$ law, for every and with , then
[TABLE]
The proof of Corollary 7 is given in Appendix H. In particular, we note that the $0$-$1$ law condition in Corollary 7 holds when the output is a deterministic function of , , and . As opposed to Theorem 6, the statement in Corollary 7 holds for all values of .
III-C2 Decoding Rule
We specify the decoding rule and state the corresponding properties, which are used in the analysis. To specify the decoding rule, we define the decoding sets , for , such that iff .
Definition 5 (Decoder).
Given the codebook , declare that if there exists with such that the following hold.
1) For that is distributed according to the joint type , we have that
[TABLE]
2) For every such that for some with ,
[TABLE]
where , we have that
[TABLE]
We note that in Definition 5, the variables are dummy random variables, distributed according to the joint type of , where is a “tested” codeword, is a competing codeword, is a “tested” state sequence, is a competing state sequence, and is the received sequence. None of the sequences are random here. We may have that the conditional type differs from the actual channel . Therefore, the divergences and mutual informations in Definition 5 could be positive.
For the definition above to be proper, the decoding sets need to be disjoint, as stated in the following lemma.
Lemma 8 (Decoding Disambiguity).
Suppose that in each codebook, all codewords have the same conditional type, i.e. for all . Assume (25) holds, that for some , , , , , and also
[TABLE]
Then, for sufficiently small ,
[TABLE]
The proof of Lemma 8 is given in Appendix E.
III-C3 Codebook Generation
We now extend Csiszár and Narayan’s lemma for the codebook generation [30].
Lemma 9 (Codebook Generation).
For every , sufficiently large , rate and conditional type , there exists a set of codewords of conditional type , such that for every and with , and every joint type with , the following hold.
[TABLE]
[TABLE]
and
[TABLE]
The proof of Lemma 9 is given in Appendix F.
III-D Super-Additivity
We also give an equivalent formulation with a sum over . Here, as opposed to the previous section, the formula cannot be expressed in terms of the capacities of the constant-parameter AVCs . Considering the AVC without constraints, Schaefer et al. [91] showed that the capacity of any product AVC that is composed of a symmetrizable channel and a non-symmetrizable channel is larger than the sum of the individual capacities (see Theorem 6 in [91]). Similarly, we give an example at the end of this section where the capacity of the AVC with fixed parameters is larger than the weighted sum of the capacities of the constant-parameter AVCs . This phenomenon can be viewed as an instance of the super-additivity property in [91].
We begin with constant-parameter definitions, i.e. for a fixed . For every input distribution with , define the constant-parameter minimal symmetrizability cost by
[TABLE]
where the minimization is over the distributions that symmetrize , where is fixed (see Definition 4). Then, we can write the minimal symmetrizability cost defined in (27) as
[TABLE]
Let
[TABLE]
We note that based on Csiszár and Narayan’s result in [30], the capacity of the constant-parameter AVC is given by with .
Lemma 10*.*
[TABLE]
The proof of Lemma 10 is given in Appendix I. Theorem 6, Corollary 7, and Lemma 10 yield the following consequence.
Corollary 11*.*
The deterministic code capacity of the AVC with fixed parameters, under input constraint and state constraint , is given by
[TABLE]
Furthermore, if the minimum in (41) is attained by a [math]- law, for every with , and for all , then
[TABLE]
for all values of .
The corollary will also be useful in our analysis of the AVC with colored Gaussian noise.
Example 1*.*
Consider the arbitrarily varying binary symmetric channel (BSC) with fixed parameters,
[TABLE]
with , where , for , . Consider a parameter sequence with an empirical distribution , say and for . Suppose that the user and the jammer are subject to input constraint and state constraint , respectively, with Hamming weight cost functions, i.e. and .
For the constant-parameter AVC, we have by Definition 4 that is symmetrized by any symmetric distribution, i.e. with . Denoting , we have that
[TABLE]
Based on the analysis by Csiszár and Narayan [30, Example 1], the capacity of the constant-parameter AVC under input constraint and state constraint is given by
[TABLE]
where is the binary entropy function and .
Suppose that
[TABLE]
For those values, we have that
[TABLE]
Thus, by Corollary 11, the capacity is given by
[TABLE]
with , and . In contrast, using two separate codes for and independently, the rate achieved is
[TABLE]
This can be viewed as an instance of the more general phenomenon of super-additivity, that holds for any product AVC which is composed of a symmetrizable AVC and a non-symmetrizable AVC [91, Theorem 6].
III-E Example: Channel with Fadings
To illustrate our results, we give another example.
Example 2*.*
Consider an arbitrarily varying fading channel,
[TABLE]
with a Gaussian noise sequence that is i.i.d. , where is a sequence of fixed fading coefficients. Recently, Hosseinigoki and Kosut [57] considered this channel with a random memoryless sequence of fading coefficients. Here, in contrast, we assume that the fading coefficients are fixed, and belong to a finite set . Intuitively, the jammer would like to confuse the decoder by sending a state sequence that simulates the sequence . Indeed, as seen below, the deterministic code capacity is positive only if there exists an input distribution such that , in which case the jammer cannot simulate without violating the state constraint.
Although we previously assumed that the alphabets are finite, our results can be extended to the continuous case as well, using standard discretization techniques [15, 5] [36, Section 3.4.1]. By Theorem 3, the random code capacity is given by
[TABLE]
Then, we show that
[TABLE]
with expectation over , where is the type of the sequence .
As for the deterministic code capacity, we show that the minimum in (27) is attained by a [math]- law that gives probability to , hence we can determine the capacity using Corollary 7. We show that the minimal symmetrizability cost is given by
[TABLE]
and deduce that the capacity of the AVC with fixed fading coefficients is given by
[TABLE]
with
[TABLE]
The derivation is given in Appendix J. We note that the last expression has the same form as the capacity formula established by Hosseinigoki and Kosut [57] for a random memoryless sequence of fading coefficients.
Next, we extend the result above to continuous fading coefficients, where . First, we observe that the formulas above can also be written as
[TABLE]
and
[TABLE]
This follows from the same considerations as in the proofs of Lemma 4 and Lemma 10. Now, if the fading coefficients are continuous, then one may perform the discretization procedure in [36, Section 3.4.1]. Hence, the deterministic and random code capacities in the continuous case are also given by the limit inferior of the formulas (61) and (62), respectively.
IV The Arbitrarily Varying Gaussian Product Channel
From this point on, we consider Gaussian AVCs, without parameters. In this section, we consider the Gaussian product channel. Our results on the AVC with colored Gaussian noise, in the next section, are based on the capacity theorems of the AVC with fixed parameters, in the previous section, and on the analysis in the current section.
IV-A Channel Description
The state-dependent Gaussian product channel consists of a set of parallel channels,
[TABLE]
where is the channel index, is the dimension (number of channels), and is a Gaussian vector with zero mean and covariance matrix . Let , and denote the input, state and noise sequences associated with the th channel, respectively, where is the time index, and let , and . The corresponding output of the product channel is the vector sequence .
The Gaussian arbitrarily varying product channel (AVGPC) is a state-dependent Gaussian product channel with state sequences of unknown distribution, not necessarily independent nor stationary. That is, , where is an unknown joint cumulative distribution function (cdf) over . In particular, could give probability mass to a particular sequence of state vectors . The channel is subject to input constraint and state constraint ,
[TABLE]
IV-B Coding
We introduce preliminary definitions for the AVGPC.
Definition 6* (Code).*
A code for the AVGPC consists of the following: a message set , where it is assumed throughout that is an integer, a sequence of encoding functions , for , such that
[TABLE]
and a decoding function . Given a message , the encoder transmits , for . The codeword is then given by . The decoder receives the channel outputs , and finds an estimate of the message . We denote the code by .
Define the conditional probability of error of a code given the sequence by
[TABLE]
where , with
[TABLE]
A code is called a code for the AVGPC if
[TABLE]
We say that a rate is achievable if for every and sufficiently large , there exists a code for the AVGPC. The operational capacity is defined as the supremum of all achievable rates, and it is denoted by . We use the term ‘capacity’ referring to this operational meaning, and in some places we call it the deterministic code capacity to emphasize that achievability is measured with respect to deterministic codes.
We proceed now to coding schemes when using stochastic-encoder stochastic-decoder pairs with common randomness.
Definition 7* (Random code).*
A random code for the AVGPC consists of a collection of codes , along with a pmf over the code collection . We denote such a code by . Analogously to the deterministic case, a random code for the AVGPC satisfies
[TABLE]
The capacity achieved by random codes is denoted by $\mathbb{C}^{\star}(K_{Z})$, and it is referred to as the random code capacity.
IV-C Related Work
Consider the AVGPC with parallel Gaussian channels, where the covariance matrix of the additive noise is
[TABLE]
i.e. are independent and . Denote the random code capacity of the AVGPC with parallel channels by $\mathbb{C}^{\star}(\Sigma)$. Hughes and Narayan [61] have shown that the solution for the random code capacity is given by “double” water filling, where the jammer performs water filling first, attempting to whiten the overall noise as much as possible, and then the user performs water filling taking into account the total noise power, which is contributed by both the channel and the jammer. The formal definitions are given below. Let
[TABLE]
with , where is chosen to satisfy
[TABLE]
Next, let
[TABLE]
where is chosen to satisfy
[TABLE]
We can now define Hughes and Narayan’s capacity formula [61],
[TABLE]
Theorem 12* (see [61]).*
The random code capacity of the AVGPC is given by
[TABLE]
IV-D Observations on The Water Filling Game
We give further observations on the results by Hughes and Narayan [61], which will be useful in the sequel.
IV-D1 Game Theoretic Interpretation
By [61, Theorem 3], the random code capacity is the solution of the following optimization problem,
[TABLE]
where the minimization is over the simplex , and the maximization is over the simplex .
The optimization problem is thus interpreted as a two-player zero-sum simultaneous game, played by the user and the jammer, where and are the respective action sets. The payoff function is defined such that, given a profile ,
[TABLE]
We have defined a game with pure strategies, i.e. the players’ actions are deterministic. In the communication model, the optimal coding and jamming scheme are random in general, yet the capacity can be achieved with deterministic power allocations, as in the game.
The optimal power allocation has a water filling analogy (see e.g. [27, Section 9.4]), where the jammer pours water of volume into a vessel, and then the encoder pours more water of volume . The shape of the bottom of the vessel is determined by the noise variances $\sigma_{1}^{2},\ldots,\sigma_{d}^{2}$. The jammer brings the water level to , and then the encoder brings the water level to . Water filling for the AVGPC is illustrated in Figure 1, for , , , . The light-shaded “fluid” is the jammer’s water filling and the dark-shaded “fluid” is the transmitter’s. The resulting “water levels” are and . Then, substituting into (72) and (74) yields the power allocations for the jammer and for the transmitter.
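The two-stage procedure can be sketched numerically. The following is a minimal illustration, assuming the usual water filling form: the jammer's allocation is the positive part of its water level minus the noise variance, and the user's allocation is the positive part of its water level minus the combined noise and jamming power. The function names and the toy values (noise variances 1, 2, 5, jamming power 3, input power 4) are our own choices for illustration and are not the parameters of Figure 1. Noise variances are assumed strictly positive.

```python
import math

def water_level(floor, volume, iters=200):
    """Return the level nu with sum_j max(nu - floor[j], 0) == volume (bisection)."""
    lo, hi = min(floor), max(floor) + volume
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(max(mid - f, 0.0) for f in floor) > volume:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def double_water_filling(sigma2, jam_power, user_power):
    """Jammer fills first; the user then fills on top of noise plus jamming."""
    alpha = water_level(sigma2, jam_power)        # jammer's water level
    N = [max(alpha - s, 0.0) for s in sigma2]     # jammer's power allocation
    total = [n + s for n, s in zip(N, sigma2)]    # combined interference per channel
    beta = water_level(total, user_power)         # user's water level
    P = [max(beta - t, 0.0) for t in total]       # user's power allocation
    # Resulting sum rate in bits, treating the jamming as Gaussian noise.
    rate = sum(0.5 * math.log2(1.0 + p / t) for p, t in zip(P, total))
    return alpha, beta, N, P, rate

# Toy example with hypothetical values.
alpha, beta, N, P, rate = double_water_filling([1.0, 2.0, 5.0], 3.0, 4.0)
```

For these toy values, the jammer's level settles at about 3 and the user's at about 5, so the noisiest channel receives no power from either player.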
One can easily prove the following properties of the random code capacity characterization.
Lemma 13*.*
The quantities defined by (72)-(76) satisfy
[TABLE]
For completeness, the proof of Lemma 13 is given in Appendix K. Based on the water filling analogy of the power allocation above, part 1 of Lemma 13 is natural, since is interpreted as the water level after the jammer pours his share, and is interpreted as the water level after the user pours additional water after that (see Figure 1). Part 3 and part 4 are not surprising either since, as can be seen in Figure 1, the variance of the combined interference is and the variance of the channel output is .
Observe that an equivalent statement of part 2 is the following. If the user discards a channel, i.e. assigns to the th channel, then the jammer does not invest power in this channel either, i.e. . This claim is also intuitive, and from a game theoretic perspective, it is an aspect of the jammer’s rationality, as explained below. As mentioned above, the optimization problem is interpreted as a two-player zero-sum simultaneous game between the user and the jammer. The value of such a game is attained by a pair of strategies which forms a Nash equilibrium [103] (see also [79][75, Theorem 3.1.4]). That is, if the user and the jammer were to agree to use the power allocation strategies and , then neither player could profit by deviating from his original strategy, provided that the other player respects the agreement. Now, suppose that for some , and . Then, the jammer is wasting energy, and can surely profit from diverting this energy to some other channel with . Thus, such a strategy profile is irrational and cannot be a Nash equilibrium.
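The equilibrium property can be checked numerically on a small instance. The sketch below assumes that the payoff is the sum of per-channel Gaussian rates with the jamming treated as noise, and uses hypothetical values (noise variances 1, 2, 5, jamming power 3, input power 4) for which the double water filling allocations can be computed by hand. It samples random unilateral deviations of each player and verifies that none of them is profitable. All names and values are ours, chosen for illustration only.

```python
import math
import random

SIGMA2 = [1.0, 2.0, 5.0]   # hypothetical noise variances
LAMBDA, OMEGA = 3.0, 4.0   # hypothetical jammer and user power budgets
N_STAR = [2.0, 1.0, 0.0]   # jammer's water filling for these values
P_STAR = [2.0, 2.0, 0.0]   # user's water filling on top of noise plus jamming

def payoff(P, N):
    """Sum of per-channel rates in bits, treating the jamming as Gaussian noise."""
    return sum(0.5 * math.log2(1.0 + p / (n + s)) for p, n, s in zip(P, N, SIGMA2))

def random_allocation(budget, rng):
    """Random point of the simplex {a >= 0, sum(a) = budget}."""
    w = [rng.expovariate(1.0) for _ in SIGMA2]
    return [budget * x / sum(w) for x in w]

def is_saddle_point(trials=2000, seed=0, tol=1e-12):
    rng = random.Random(seed)
    value = payoff(P_STAR, N_STAR)
    for _ in range(trials):
        # A unilateral deviation of the user must not raise the payoff ...
        if payoff(random_allocation(OMEGA, rng), N_STAR) > value + tol:
            return False
        # ... and a unilateral deviation of the jammer must not lower it.
        if payoff(P_STAR, random_allocation(LAMBDA, rng)) < value - tol:
            return False
    return True
```

In particular, a jammer who moves power onto the third channel, which the user has discarded, e.g. `payoff(P_STAR, [1.5, 1.0, 0.5])`, only increases the user's payoff, in line with the rationality argument above.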
For a general AVC, a coding scheme which assumes that the jammer is using his optimal strategy would typically fail. The code needs to be robust against any state sequence that satisfies the state constraint. For example, consider a scalar Gaussian AVC [60], specified by , under input constraint and state constraint , where the noise sequence is i.i.d. . Suppose that the receiver is using joint typicality decoding for a Gaussian channel , where is i.i.d. (see [27, Section 9.1]), corresponding to the optimal jamming strategy. Then, the jammer can cause the decoder to fail by selecting a state sequence such that , for instance. As a result, there is a high probability that the square norm of the output sequence is below , for small , in which case the decoder cannot establish joint typicality and declares an error. The same principle holds in our problem. The user cannot assume that the jammer is using his optimal power allocation, and a reliable code must be robust against any power allocation of the jammer.
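The failure mode of a mismatched decoder can be illustrated by a small simulation. The sketch below assumes, for illustration, that the jammer simply stays silent (an all-zero state sequence, which is one choice that satisfies any power constraint), and that the codeword symbols behave like i.i.d. Gaussian samples of variance equal to the input constraint. The decoder expects an output power close to the sum of the input power, the jamming power and the noise variance, so the observed power falls below the typicality threshold. All names and numerical values are hypothetical.

```python
import random

def output_power(omega, sigma2, state, rng):
    """Empirical power (1/n)||Y||^2 of Y = X + S + Z for a given state sequence."""
    total = 0.0
    for s in state:
        x = rng.gauss(0.0, omega ** 0.5)    # Gaussian stand-in for a codeword symbol
        z = rng.gauss(0.0, sigma2 ** 0.5)   # channel noise
        total += (x + s + z) ** 2
    return total / len(state)

def fraction_below(threshold, omega, sigma2, state, trials=50, seed=1):
    """Fraction of trials in which the output power is below the threshold."""
    rng = random.Random(seed)
    hits = sum(output_power(omega, sigma2, state, rng) < threshold
               for _ in range(trials))
    return hits / trials
```

For instance, with unit input power, unit jamming budget, unit noise variance and a slack of 0.2, the threshold is 2.8, and `fraction_below(2.8, 1.0, 1.0, [0.0] * 2000)` is close to one: a decoder tuned to the jammer's optimal strategy would declare an error in almost every trial.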
IV-D2 Multiple Access Channel Analogy
Water filling in two (or more) stages appears in other settings in the literature, e.g. [26, 71, 111, 113]. Consider a Gaussian product multiple access channel (MAC), where , , under the input constraints and . This can be viewed as a different variation of the AVGPC where a second transmitter replaces the jammer. By [26], a corner point of the capacity region can be achieved by applying water filling to the total power in the first step, and then to the power of User 2 in the second step. Specifically, by [26, Section III.B.], the optimal power allocations and , for Encoder 1 and Encoder 2, respectively, which achieve a corner point of the capacity region, satisfy
[TABLE]
such that . Following part 3 of Lemma 13, it can be seen that the strategy above is equivalent to (72)-(75). The total power allocation in (83) seems natural in order to maximize the sum rate. However, our presentation in (72)-(75) is intuitive for the Gaussian product MAC as well. Indeed, using successive cancellation decoding, the receiver estimates the transmission of User 1 while treating the transmission of User 2 as noise, and then subtracts the estimated sequence from the received sequence to decode the transmission of User 2. Hence, decoding for User 1 is analogous to the decoder in our problem. Nevertheless, in the next section, we show that the deterministic code capacity in our adversarial problem has a different behavior.
Another water filling game is described by Lai and El Gamal in [71], who considered the flat fading MAC with selfish users, where the fading coefficients are continuous random variables, distributed according to . Suppose that the users are subject to average input constraints, and . As shown in [71], a maximum sum-rate point on the capacity region boundary is achieved if the users perform water filling treating each other’s transmission as noise. It is further shown that opportunistic communication is optimal, where User 1 only transmits if his water level times fading coefficient is at least as high as that of User 2, and vice versa. That is, the power allocations of the users are given by
[TABLE]
where and are chosen such that and . This threshold operation resembles the result in the next section, on the deterministic code capacity of the AVGPC, except that the phase transition of the AVGPC depends only on the “water volumes” and (see Subsection IV-F).
IV-E Results
We give our result on the AVGPC with parallel Gaussian channels, where the covariance matrix of the additive noise is , i.e. are independent and . The deterministic code capacity of the AVGPC with parallel channels is denoted by .
We establish the capacity of the AVGPC. Based on Csiszár and Narayan’s result in [30], the deterministic code capacity of an AVC under input and state constraints is given in terms of channel symmetrizability and the minimal state cost for the jammer to symmetrize the channel (see also [73] [82, Definition 5 and Theorem 5]). By [30, Definition 2], an AVGPC is symmetrized by a conditional pdf if
[TABLE]
where . In particular, observe that (86) holds for , where is the Dirac delta function. In other words, the channel is symmetrized by a distribution which gives probability to . For the AVGPC, the minimal state cost for the jammer to symmetrize the channel, for an input distribution , is given by
[TABLE]
where the minimization is over all conditional pdfs that symmetrize the channel, that is, satisfy (86). The following lemma states that the minimal state cost for symmetrizability is the same as the input power. The lemma will be used in the achievability proof of the capacity theorem.
Lemma 14*.*
For a zero mean Gaussian vector ,
[TABLE]
The proof of Lemma 14 is given in Appendix L. The proof builds on our observation that (86) holds if and only if . This in turn leads to the conclusion that the minimum in (87) is attained by . Moving to the capacity theorem, define
[TABLE]
Theorem 15*.*
The deterministic code capacity of the AVGPC is given by
[TABLE]
The proof of Theorem 15 is given in Appendix M. Considering the scalar case, Csiszár and Narayan showed the direct part by providing a coding scheme for the Gaussian AVC [32]. While the receiver in their coding scheme uses simple minimum-distance decoding, the analysis is fairly complicated. Here, on the other hand, we treat the AVGPC using a much simpler approach. To prove the direct part, we consider the optimization problem based on the capacity formula of the general AVC under input and state constraints, which is given in terms of symmetrizing state distributions. We use Lemma 14 to show that if , then the transmitter’s water filling strategy in (74) guarantees that . Intuitively, this means that the jammer cannot symmetrize the channel without violating the state constraint. In this scenario, the random code capacity can be achieved with deterministic codes as well.
IV-F Discussion
We give a couple of remarks on our result in Theorem 15. As in the case of the Gaussian scalar AVC [32], the capacity is discontinuous in the input constraint, and has a phase transition behavior, depending on whether or . We give an intuitive explanation below. For the classic Gaussian AVC, reliable communication requires the power of the transmitted signal to be higher than the power of the jamming signal, otherwise the jammer can confuse the receiver by making the state sequence “look like” the input sequence [32]. At first glance, one might have expected that the input power of the th channel also needs to be higher than the jamming power , in order for the output to be useful. This is not the case. Since the decoder has the vector of outputs , even if looks like , the receiver could still gain information from as the other outputs may “break the symmetry”.
Based on Shannon’s classic water filling result [94], the capacity of the Gaussian product channel, , , can be achieved by combining independent encoder-decoder pairs, where the th pair is associated with a capacity achieving code for the scalar Gaussian channel under input constraint . However, based on Csiszár and Narayan’s result on the scalar Gaussian AVC [32], the capacity of the th AVC, , is zero under input constraint and state constraint for . This means that, in contrast to Shannon’s Gaussian product channel [94], using independent encoder-decoder pairs over the AVGPC is suboptimal in general. This can be viewed as a constrained version of the super-additivity phenomenon in [91].
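The gap between joint coding and independent scalar coding can be seen on a toy instance. The sketch below is only an illustration under stated assumptions: the per-channel powers are fixed to a hand-computed double water filling allocation for hypothetical noise variances 1, 2, 5, jamming power 3 and input power 4; a scalar AVC is taken to contribute a positive rate only when its input power strictly exceeds its jamming power; and the boundary case of equal total powers is treated as zero rate, which is our convention here. The function names are ours.

```python
import math

sigma2 = [1.0, 2.0, 5.0]   # hypothetical noise variances
N = [2.0, 1.0, 0.0]        # jammer's per-channel powers (total 3)
P = [2.0, 2.0, 0.0]        # user's per-channel powers (total 4)

def gaussian_rate(p, interference):
    """Rate in bits of a scalar Gaussian channel with the given interference power."""
    return 0.5 * math.log2(1.0 + p / interference)

def joint_rate(P, N, sigma2):
    """One code across all channels: positive only if total input power exceeds total jamming power."""
    if sum(P) <= sum(N):
        return 0.0   # assumption: the boundary case is treated as zero
    return sum(gaussian_rate(p, n + s) for p, n, s in zip(P, N, sigma2))

def separate_rate(P, N, sigma2):
    """Independent scalar codes: channel j contributes only if P[j] > N[j]."""
    return sum(gaussian_rate(p, n + s)
               for p, n, s in zip(P, N, sigma2) if p > n)

joint = joint_rate(P, N, sigma2)
separate = separate_rate(P, N, sigma2)
```

Here the first channel has equal input and jamming powers, so an independent scalar code on that channel achieves nothing, whereas the joint code still benefits from it. With this particular power split the separate scheme attains half of the joint rate.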
V Main Results – AVC with Colored Gaussian Noise
We consider an AVC with colored Gaussian noise, i.e.
[TABLE]
where is a zero mean stationary Gaussian process, with power spectral density . Assume that the power spectral density is bounded and integrable. We denote the random code capacity and the deterministic code capacity of this channel by $\mathbb{C}^{\star}(\Psi_{Z})$ and , respectively.
We show that the optimal power allocations of the user and the jammer are given by “double” water filling in the frequency domain. Define
[TABLE]
where is chosen to satisfy
[TABLE]
Next, define
[TABLE]
where is chosen to satisfy
[TABLE]
Now, let
[TABLE]
Theorem 16*.*
The random code capacity of the AVC with colored Gaussian noise is given by
[TABLE]
and the deterministic code capacity is given by
[TABLE]
The proof of Theorem 16 is given in Appendix N, combining our previous results on the AVC with fixed parameters and the AVGPC. Despite the common belief that the characterization for a channel with colored Gaussian noise easily follows from the results for the product channel setting, the analysis is more involved. While standard orthogonalization transforms the channel into an equivalent one with statistically independent noise instances, the noise in the transformed channel is not necessarily white. As the noise variance may change over time, we observe that the transformed channel is in fact an AVC with fixed parameters which represent the sequence of noise variances. Using Corollary 5 and Corollary 11, we obtain deterministic and random capacity formulas that are analogous to those of the AVGPC, and use Toeplitz matrix properties to express the formulas as integrals in the frequency domain.
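The passage from covariance matrices to frequency-domain integrals can be illustrated with a standard Toeplitz property of the Szegő type: the normalized log-determinant of the covariance matrix approaches the integral of the log spectral density. Whether this is exactly the property invoked in Appendix N is not stated in this section, so the following is only a sketch of the general idea. It uses a hypothetical first-order autoregressive noise process with unit innovation variance, and assumes frequencies normalized to one period of unit length.

```python
import math

def ar1_covariance(n, rho):
    """n x n Toeplitz covariance of a unit-innovation AR(1) process."""
    return [[rho ** abs(i - j) / (1.0 - rho ** 2) for j in range(n)]
            for i in range(n)]

def log_det_cholesky(K):
    """log det of a symmetric positive definite matrix via Cholesky factorization."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))

def ar1_psd(f, rho):
    """Power spectral density of the same process, for f in [-1/2, 1/2]."""
    return 1.0 / (1.0 - 2.0 * rho * math.cos(2.0 * math.pi * f) + rho ** 2)

def mean_log_psd(rho, grid=4096):
    """Midpoint Riemann sum of the integral of log PSD over one period."""
    return sum(math.log(ar1_psd(-0.5 + (k + 0.5) / grid, rho))
               for k in range(grid)) / grid

n, rho = 64, 0.5
per_sample_log_det = log_det_cholesky(ar1_covariance(n, rho)) / n
spectral_integral = mean_log_psd(rho)
```

For this process the spectral integral is zero, and the per-sample log-determinant decays to zero at rate 1/n, so the two quantities agree in the limit.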
The optimal power allocation has a water filling analogy in the frequency domain (see e.g. [27, Section 9.5]), where the jammer pours water of volume on top of the power spectral density , and then the encoder pours more water of volume . The jammer brings the water level to , and then the encoder brings the water level to . The process is illustrated in Figure 2.
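A discretized version of the frequency-domain procedure can be sketched as follows. This is a numerical illustration rather than the exact formulas of Theorem 16: we assume frequencies normalized to the interval from -1/2 to 1/2, approximate integrals by midpoint Riemann sums, and assume a strictly positive power spectral density. The function names, the grid size and the example spectra are our own choices.

```python
import math

def water_level(floor, volume, df, iters=200):
    """Level nu with sum_k max(nu - floor[k], 0) * df == volume (bisection)."""
    lo, hi = min(floor), max(floor) + volume / (df * len(floor))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(max(mid - v, 0.0) for v in floor) * df > volume:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def spectral_double_water_filling(psd, jam_power, user_power, grid=2048):
    """Jammer fills on top of the PSD; the user then fills on top of both."""
    df = 1.0 / grid
    freqs = [-0.5 + (k + 0.5) * df for k in range(grid)]
    floor = [psd(f) for f in freqs]                 # sampled noise PSD
    alpha = water_level(floor, jam_power, df)       # jammer's water level
    jam = [max(alpha - v, 0.0) for v in floor]      # jammer's spectrum
    total = [a + b for a, b in zip(jam, floor)]     # noise plus jamming
    beta = water_level(total, user_power, df)       # user's water level
    user = [max(beta - t, 0.0) for t in total]      # user's spectrum
    # Rate in bits per channel use, as a Riemann sum over frequency.
    rate = sum(0.5 * math.log2(1.0 + p / t) for p, t in zip(user, total)) * df
    return alpha, beta, rate

# Sanity check with white noise: the problem reduces to the scalar case.
alpha, beta, rate = spectral_double_water_filling(lambda f: 1.0, 1.0, 2.0)
```

For the white-noise check, both players spread their power evenly, and the rate reduces to the scalar Gaussian expression with signal power 2 and total interference power 2, i.e. half a bit.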
Appendix A Proof of Theorem 1
Consider the compound channel with fixed parameters under input constraint and state constraint .
A-A Achievability Proof
To show achievability, we construct a code based on conditional typicality decoding with respect to a channel state type, which is “close” to one of the state distributions in .
Denote the type of the parameter sequence by . Define a set of conditional state types,
[TABLE]
with , and
[TABLE]
where is arbitrarily small. In words, is the set of conditional types , given a parameter sequence , such that the joint type is -close to , for some conditional state distribution in . We note that the sets and could be disjoint, since is not limited to conditional empirical distributions. Nevertheless, for a fixed and sufficiently large , every can be approximated by some . Indeed, for sufficiently large , there exists a joint type such that , hence and . Now, a code is constructed as follows.
Codebook Generation: Fix such that , where
[TABLE]
Generate independent sequences at random, , for .
Encoding: To send a message , if , transmit . Otherwise, transmit an idle sequence with .
Decoding: Find a unique for which there exists such that , where
[TABLE]
If there is none, or more than one such , declare an error. We note that using the set of types instead of the original set of state distributions simplifies the analysis, since is not necessarily finite nor countable.
Analysis of Probability of Error: Assume without loss of generality that the user sent . By the union of events bound, we have that , where
[TABLE]
The first term tends to zero exponentially by the law of large numbers and Chernoff’s bound (see e.g. [67, Theorem 1.2]). Now, suppose that the event occurs. Then, for sufficiently small , we have that , since . Hence, is the channel input.
Next, we claim that the second error event implies that , where is the actual state distribution chosen by the jammer. Assume to the contrary that holds, but . For sufficiently large , there exists a conditional type that approximates in the sense that for all and , hence
[TABLE]
for all , , (see (100)-(102)). To show -typicality with respect to , we observe that
[TABLE]
where the first inequality is due to the triangle inequality, and the second inequality follows from (104) and the assumption that . It follows that , and does not hold. Thus,
[TABLE]
This tends to zero exponentially as by the law of large numbers and Chernoff’s bound (see e.g. [67, Theorem 1.2]).
Moving to the third error event, as the number of type classes in is bounded by , we have that
[TABLE]
For every , is independent of , hence
[TABLE]
Let . Then, with . By Lemmas 2.6-2.7 in [29],
[TABLE]
where as . Therefore, by (107)(109),
[TABLE]
with as , where the last inequality is due to [29, Lemma 2.13]. The RHS of (110) tends to zero exponentially as , provided that . The probability of error, averaged over the class of codebooks, exponentially decays to zero as . Therefore, there must exist a deterministic code, for a sufficiently large . This completes the proof of the direct part.
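The analysis of the third error event relies on the fact that the number of type classes grows only polynomially with the blocklength, so that a union bound over types does not affect the exponent. The sketch below checks the basic unconditional version of this fact by brute force: the number of types of length-n sequences over an alphabet of size k equals a binomial coefficient, and is at most (n+1)^k. The conditional version used above follows the same pattern, with a larger exponent. The function name is ours.

```python
from itertools import product
from math import comb

def count_types(alphabet_size, n):
    """Number of distinct empirical distributions of length-n sequences."""
    types = set()
    for seq in product(range(alphabet_size), repeat=n):
        # A type is identified by the vector of symbol counts.
        types.add(tuple(seq.count(a) for a in range(alphabet_size)))
    return len(types)

k, n = 3, 5
num_types = count_types(k, n)
```

For a ternary alphabet and blocklength 5, there are 243 sequences but only 21 types, well below the polynomial bound of 216.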
A-B Converse Proof
Since the deterministic code capacity is always bounded by the random code capacity, we consider a sequence of random codes, where as . Then, let be the channel input sequence, and be the corresponding output sequence, where is the random element shared between the encoder and the decoder. For every , we have by Fano’s inequality that , hence
[TABLE]
where as . The third equality holds since is a deterministic function of , and the last equality since $(M,\gamma) -\!\circ\!- (X^{n},T^{n}) -\!\circ\!- Y^{n}$ form a Markov chain. It follows that
[TABLE]
for all , with , , , where the random variable is uniformly distributed over , and as . Observe that the random variable is distributed according to
[TABLE]
where is the number of occurrences of the symbol in the sequence . Since $K -\!\circ\!- (T,X) -\!\circ\!- Y$ form a Markov chain, we have that
[TABLE]
∎
Appendix B Proof of Lemma 2
We state the proof of our modified version of Ahlswede’s RT [6]. The proof follows the lines of [6, Subsection IV-B], which we modify here to include a constraint on the family of state distributions and the parameter sequence . Let such that . Denote the conditional type of given by . Observe that (see (9)), since .
Given a permutation ,
[TABLE]
where the first equality holds since is a bijection, the second equality holds since for every , and the last equality holds due to the product form of the conditional distribution . Hence, taking ,
[TABLE]
and by (17),
[TABLE]
Thus,
[TABLE]
As the expression in the square brackets is identical for all sequences of conditional type , we have that
[TABLE]
The second sum is the probability of the conditional type class of , hence
[TABLE]
by [27, Theorem 11.1.4]. The proof follows from (119) and (120). ∎
Appendix C Proof of Theorem 3
Consider the AVC with fixed parameters under input constraint and state constraint .
C-A Achievability Proof
To prove the random code capacity theorem for the AVC with fixed parameters, we use our result on the compound channel along with our modified Robustification Technique (RT), i.e. Lemma 2.
Let $R<\mathsf{C}^{\star}$. At first, we consider the compound channel under input constraint , with . According to Lemma 1, for some and sufficiently large , there exists a code for the compound channel with fixed parameters such that
[TABLE]
and
[TABLE]
for all product state distributions , with .
Therefore, by Lemma 2, taking and , we have that for a sufficiently large ,
[TABLE]
for all with , where the sum is over the set of all -tuple permutations such that .
On the other hand, for every ,
[TABLE]
where is obtained by plugging in (11a); in we substitute instead of ; and holds because the channel is memoryless. Since for every , it follows that
[TABLE]
Then, consider the random code , specified by
[TABLE]
with a uniform distribution for . As the input cost is additive (see (6)), the permutation does not affect the costs of the codewords, hence the random code satisfies the input constraint . From (125), we see that , for all with . Therefore, together with (123), we have that the probability of error of the random code is bounded by , for every . It follows that is a random code for the AVC with fixed parameters under input constraint and state constraint . ∎
C-B Converse Proof
Assume to the contrary that there exists an achievable rate pair
[TABLE]
using random codes over the AVC under input constraint and state constraint , where is arbitrarily small. That is, for every and sufficiently large , there exists a random code for the AVC , such that , and
[TABLE]
for all and . In particular, for distributions that give mass to some sequence with , we have that .
Consider using the random code over the compound channel with fixed parameters under input constraint . Let be a given state distribution. Then, define a sequence of conditionally independent random variables . Letting , the probability of error is bounded by
[TABLE]
The first sum is bounded by (128), and the second term vanishes by the law of large numbers, since . It follows that the random code achieves a rate as in (127) over the compound channel with fixed parameters under input constraint , for an arbitrarily small , in contradiction to Lemma 1. We deduce that the assumption is false, and $\mathbb{C}^{\star}(\mathcal{W})\leq\mathsf{C}(\mathcal{W}^{\mathcal{Q}})\big|_{\mathcal{Q}=\overline{\mathcal{P}}_{\Lambda}(\mathcal{S}|\theta^{\infty})}=\mathsf{C}_{n}^{\star}(\mathcal{W})$. ∎
Appendix D Proof of Lemma 4
To prove that $\mathsf{R}_{n}^{\star}(\mathcal{W})=\mathsf{C}_{n}^{\star}(\mathcal{W})$, we begin with the property in the lemma below.
Lemma 17.
Let , , , be the parameters that achieve the saddle point in (21), i.e.
[TABLE]
Then, for every such that , we have that and .
Proof of Lemma 17.
For every , let denote input and state distributions such that , for , . Now, suppose that , and define
[TABLE]
Then, and for , . Furthermore, since the mutual information is concave-$\cap$ in the input distribution and convex-$\cup$ in the state distribution, we have that
[TABLE]
Therefore, the saddle point distributions must satisfy and , hence and . ∎
Next, it can be inferred from Lemma 17 that
[TABLE]
where is the type of the parameter sequence . The second equality follows from the definition of $\mathsf{C}_{t}^{\star}(\omega_{t},\lambda_{t})$ in (20), using the minimax theorem [96] to switch the order of the minimum and maximum. In the third line, we eliminate the slack variables and replacing and , respectively. The last equality holds by the definition of $\mathsf{C}_{n}^{\star}(\mathcal{W})$ in (16). ∎
Appendix E Proof of Lemma 8
Consider the AVC with fixed parameters under input constraint and state constraint . Let be a sequence of fixed parameters for a given blocklength. Recall that is a random variable that is distributed as the type of . We extend the proof in [30]. First, we give an auxiliary lemma, which we also used in [85].
Lemma 18 (see [30], [85, Lemma 11]).
For every pair of conditional state distributions and such that
[TABLE]
there exists such that
[TABLE]
Proof of Lemma 18.
Assume to the contrary that the LHS in (135) is zero, and define
[TABLE]
Using the symmetry between and , we have that
[TABLE]
Since we have assumed that for all , it follows that
[TABLE]
for all , and . In other words, symmetrizes the channel for all . Therefore, by the definition of in (27), we have that
[TABLE]
in contradiction to (134). The equality above holds because is distributed as the type of the parameter sequence , hence averaging over time is the same as averaging according to . It follows that the LHS of (135) must be positive. This completes the proof of the auxiliary Lemma. ∎
We move to the main part of the proof. To show that (37) holds for sufficiently small , assume to the contrary that there exists such that is in . By the assumption in the lemma, the codewords have the same conditional type. In particular, .
By Condition 1) of the decoding rule,
[TABLE]
and by Condition 2) of the decoding rule,
[TABLE]
where are distributed according to the joint type of , , , , and . Adding (140) and (141) yields
[TABLE]
That is, . Therefore, by the log-sum inequality (see e.g. [27, Theorem 2.7.1]),
[TABLE]
where . Then, by Pinsker’s inequality (see e.g. [29, Problem 3.18]),
[TABLE]
where is a constant. By the same arguments, (34) implies that
[TABLE]
where . Now, observe that inserting the sum over into the absolute value maintains the inequality, by the triangle inequality. Furthermore, since , for , , we have that
[TABLE]
Equivalently, the above can be expressed as
[TABLE]
Now, we show that the state distributions and satisfy the conditions of Lemma 18. Indeed,
[TABLE]
where the last inequality is due to (36). Thus, there exists such that (135) holds with and , which contradicts (147), if is sufficiently small such that . ∎
Appendix F Proof of Lemma 9
Let , , be statistically independent sequences, uniformly distributed over the conditional type class . Fix and , and consider a joint type , such that . We intend to show that satisfy each of the desired properties with double exponential high probability , , implying that there exists a deterministic codebook that satisfies (38)-(40) simultaneously. We begin with the following large deviations result by Csiszár and Narayan [30].
Lemma 19 (see [30, Lemma A1]).
Let , and consider a sequence of random vectors , and functions , for . If
[TABLE]
then
[TABLE]
To show that (38) holds, consider the indicator
[TABLE]
By standard type class considerations (see e.g. [67, Theorem 1.3]), we have that
[TABLE]
where the last inequality holds since .
Next, we use Lemma 19, and plug
[TABLE]
For sufficiently large , we have that . Hence, by Lemma 19,
[TABLE]
By the symmetry between and in the derivation above, the double exponential decay of the probability in (154) implies that there exists a codebook that satisfies (38).
Similarly, to show (39), we replace the indicator of the type in (151) by an indicator of the type , and rewrite (152) with , to obtain
[TABLE]
where is arbitrarily small. If and , then choosing , we have that
[TABLE]
hence,
[TABLE]
It remains to show that (40) holds. Assume that
[TABLE]
Let denote the set of indices such that , provided that their number does not exceed ; else, let . Also, let
[TABLE]
Then, choosing in (155) yields
[TABLE]
Therefore, instead of bounding the set of messages, it is sufficient to consider the sum . Furthermore, by standard type class considerations (see e.g. [67, Theorem 1.3]), we have that
[TABLE]
where the last inequality is due to (158). Thus, by Lemma 19,
[TABLE]
as we have assumed that . Equations (160) and (162) imply that the property in (40) holds with double exponential probability , where . ∎
Appendix G Proof of Theorem 6
G-A Achievability Proof
Suppose that for sufficiently large . Let be chosen later, and let be a conditional type over , for which , , and , with
[TABLE]
As explained below, we may assume without loss of generality that for some that does not depend on , we have that for all . Indeed, following our assumption in (25), the asymptotic capacity formula does not change when we remove parameter values such that . Hence, coding can be limited to the rest of the block with negligible rate decrease, thus removing those parameters from consideration. Then, choose to be sufficiently small such that Lemma 8 guarantees that the decoder in Definition 5 is well defined. Now, Lemma 9 assures that there is a codebook of conditional type that satisfies (38)-(40). Consider the following coding scheme.
Encoding: To send , transmit .
Decoding: Find a unique message such that belongs to , as in Definition 5. If there is none, declare an error. Lemma 8 guarantees that there cannot be two messages for which this holds.
Analysis of Probability of Error: Fix with , let denote the conditional type of given , and let denote the transmitted message. Consider the error events
[TABLE]
and
[TABLE]
where are dummy random variables, which are distributed as the joint type of . By the union of events bound,
[TABLE]
where the conditioning on and is omitted for convenience of notation. Based on Lemma 9, the probabilities of the events and tend to zero as , by (39) and (40), respectively.
Now, suppose that Condition 1) of the decoding rule is violated. Observe that the event implies that
[TABLE]
Then, by standard large deviations considerations (see e.g. [27, pp. 362–364]),
[TABLE]
which tends to zero as , for sufficiently small , with .
Moving to Condition 2) of the decoding rule, let denote the set of joint types such that
[TABLE]
Then, by standard type class considerations (see e.g. [67, Theorem 1.3]),
[TABLE]
for every given . Hence, by (38),
[TABLE]
To further bound , consider the following cases. Suppose that . Then, given , we have that
[TABLE]
By (173), it then follows that
[TABLE]
Returning to (175), we note that since the number of types is polynomial in , the cardinality of the set of types can be bounded by , for sufficiently large . Hence, by (175) and (177), we have that , which tends to zero as , for .
Otherwise, if , then given ,
[TABLE]
Thus,
[TABLE]
Hence, by (175) we have that
[TABLE]
For , we have by (172) that is arbitrarily close to some , where
[TABLE]
if is sufficiently small. In which case,
[TABLE]
where is arbitrarily small. Therefore, provided that
[TABLE]
we have that tends to zero as . ∎
G-B Converse Proof
We will use the following lemma, based on the observations of Ericson [37].
Lemma 20.
Consider the AVC with fixed parameters free of state constraints, and let be a deterministic code. Suppose that the channels are symmetrizable for all , and let , , be a set of conditional state distributions that satisfy (4). If , then
[TABLE]
where .
For completeness, we give the proof below.
Proof of Lemma 20.
Denote the codebook size by , and the codewords by .
Under the conditions of the lemma,
[TABLE]
where we have defined for short notation. By switching between the summation indices and , we obtain
[TABLE]
Now, as the channel is memoryless,
[TABLE]
where the second equality is due to (4). Therefore,
[TABLE]
Assuming the sum rate is positive, we have that , hence . ∎
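The symmetry argument can be illustrated on a toy channel. The sketch below assumes a noiseless binary adder channel Y = X + S, which is symmetrized by the law that sets S = X, and a jammer that uses a uniformly chosen codeword as the state sequence. For any deterministic decoder, the outputs for the index pairs (m, j) and (j, m) coincide, so at most one of the two is decoded correctly, and the average error is at least (M-1)/(2M), which is at least 1/4 for M ≥ 2. The channel, the codebook, and the decoding rule are hypothetical and are not taken from the paper.

```python
import itertools
import random


def jammer_error(n=4, num_messages=4, seed=1):
    """Exact average error when the jammer replays a random codeword as the state."""
    rng = random.Random(seed)
    all_words = list(itertools.product((0, 1), repeat=n))
    codebook = rng.sample(all_words, num_messages)  # distinct binary codewords

    def decode(y):
        # Arbitrary deterministic rule (the bound holds for any such rule):
        # choose the codeword x minimizing sum |y_i - 2 x_i|, ties broken by index.
        def score(x):
            return sum(abs(yi - 2 * xi) for yi, xi in zip(y, x))
        return min(range(num_messages), key=lambda m: score(codebook[m]))

    errors = 0
    for m, j in itertools.product(range(num_messages), repeat=2):
        # Message m is sent; the jammer uses codeword j as the state sequence.
        y = tuple(a + b for a, b in zip(codebook[m], codebook[j]))
        errors += int(decode(y) != m)
    return errors / num_messages ** 2
```

Because the computation enumerates all message and jammer index pairs, the returned value is the exact average error for this toy setup rather than a Monte Carlo estimate.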
Now, we are in position to prove the converse part of Theorem 6. Consider a sequence of deterministic codes over the AVC with fixed parameters under input constraint and state constraint , where as . In particular, the conditional probability of error given a state sequence is bounded by
[TABLE]
Let be the channel input sequence, and let be the corresponding output.
Consider using the same code over the compound channel with fixed parameters, i.e. where the jammer selects a state sequence at random according to a product distribution, , under the average state constraint . Here, there is no state constraint with probability , as the jammer may select a sequence with . Yet, the probability of error is bounded by
[TABLE]
The first sum is bounded by (190), and the second term vanishes by the law of large numbers, since . It follows that the code sequence of the constrained AVC achieves the same rate over the compound channel . As in Appendix A, Fano’s inequality implies that for every jamming strategy ,
[TABLE]
with , , , where is uniformly distributed over . Hence, is distributed according to the type of the parameter sequence (see (113)).
Returning to the original AVC, suppose that . It remains to show that implies that . If the channel is non-symmetrizable for some , then , and there is nothing to show. Hence, consider the case where are symmetrizable for all . Assume to the contrary that and . Hence, there exist conditional state distributions that symmetrize , such that
[TABLE]
Now, consider the following jamming strategy. First, the jammer selects a codeword from the codebook uniformly at random. Then, the jammer selects a sequence at random, according to the conditional distribution
[TABLE]
Finally, if , the jammer chooses the state sequence to be . Otherwise, the jammer chooses to be some sequence of zero cost. Such a jamming strategy satisfies the state constraint with probability .
To contradict our assumption that , we first show that . Observe that for every ,
[TABLE]
Since is distributed as , we obtain
[TABLE]
Thus, by Chebyshev’s inequality we have that for sufficiently large ,
[TABLE]
where is arbitrarily small. Now, on the one hand, the probability of error is bounded by
[TABLE]
where is as defined in (185). On the other hand, the sequence can be thought of as the state sequence of an AVC without a state constraint, hence, by Lemma 20,
[TABLE]
Thus, by (198)-(199), the probability of error is bounded by . As this cannot be the case for a code with vanishing probability of error, we deduce that the assumption is false, i.e. implies that .
If , then for all with , and a positive rate cannot be achieved. This completes the converse proof. ∎
Appendix H Proof of Corollary 7
Assume that the AVC with fixed parameters satisfies the conditions of Corollary 7. Looking into the converse proof above, the following addition suffices. We show that for every code as in the converse proof above, implies that . Since there is only a polynomial number of types, we may consider to be the conditional type of given , for all (see [29, Problem 6.19]).
Suppose that , assume to the contrary that , and let be distributions that achieve the minimum in (27), i.e.
[TABLE]
Based on the condition of the corollary, we may assume that is a 0-1 law, i.e.
[TABLE]
for some deterministic function .
Recall that we have defined , in the converse proof, where is a uniformly distributed variable over . Thus, by (200),
[TABLE]
Now, consider the following jamming strategy. First, the jammer selects a codeword from the codebook uniformly at random. Then, given , the jammer chooses the state sequence . Observe that
[TABLE]
where the last equality is due to (202). Thus, the state sequence satisfies the state constraint. Now, observe that the jamming strategy is equivalent to as in (185). Thus, by Lemma 20, we have that , hence a positive rate cannot be achieved. ∎
Appendix I Proof of Lemma 10
Suppose that . The proof is similar to that of Lemma 4. We begin with the property in the lemma below.
Lemma 21.
Let , , , , be the parameters that achieve the saddle point in (43), i.e.
[TABLE]
Then, for every such that , we have that , , and .
Proof of Lemma 21.
For every , let denote input and state distributions such that , , for , . Now, suppose that , and define
[TABLE]
Then, , , and for , . Furthermore, since the mutual information is concave-$\cap$ in the input distribution and convex-$\cup$ in the state distribution, we have that
[TABLE]
Therefore, the saddle point distributions must satisfy and , hence , , and . ∎
Next, it can be inferred from Lemma 21 that
[TABLE]
where is the type of the parameter sequence . The second equality follows from the definition of in (44), using the minimax theorem [96] to switch between the order of the minimum and maximum. In the third line, we eliminate the slack variables , , and , replacing , , and , respectively. The last equality holds by the definition of in (29). ∎
Appendix J Analysis of Example 2
Consider the fading AVC in Example 2. To show the direct part with random codes, set the conditional input distribution given in (21). Then, for every ,
[TABLE]
where we have denoted . The last inequality holds since Gaussian noise is known to be the worst additive noise under variance constraint [34, Lemma II.2]. The direct part follows. As for the converse part, consider a jamming scheme where the state is drawn according to the conditional distribution given . Then, the proof follows from Shannon’s classic result on the Gaussian channel with .
We move to the deterministic code capacity. By Definition 4, the constant-parameter channel is symmetrized by a conditional pdf if
[TABLE]
where . Equivalently, the constant-parameter channel is symmetrized by if
[TABLE]
for all . By substituting in the LHS, and in the RHS, we have
[TABLE]
For every , define the random variable . We note that the RHS is the convolution of the pdfs of the random variables and , while the LHS is the convolution of the pdfs of the random variables and . This is not surprising since the channel output is a sum of independent random variables, and thus the pdf of is a convolution of pdfs. It follows that , and by plugging instead of , we have that symmetrizes the constant-parameter channel if and only if
[TABLE]
Then, the corresponding state cost satisfies
[TABLE]
where the second equality follows by the integral substitution of . Observe that the bracketed integral can be expressed as
[TABLE]
Thus, by (213),
[TABLE]
Note that the last inequality holds for any which symmetrizes the channel, and in particular for , where is the Dirac delta function. In addition, since gives probability to , we have that (215) holds with equality for , and thus,
[TABLE]
with . Hence,
[TABLE]
Having shown that the minimum in (27) is attained by a 0-1 law, we have by Corollary 7 that the capacity of the fading AVC is , with
[TABLE]
To show the direct part, we only need to consider the case where . Then, set the conditional input distribution given in (218). As in the direct part with random codes,
[TABLE]
with , since Gaussian noise is the worst additive noise under variance constraint [34, Lemma II.2]. The direct part follows. As for the converse part, for the conditional distribution given , we have that
[TABLE]
with , since the Gaussian distribution maximizes the differential entropy. The proof follows. ∎
Appendix K Proof of Lemma 13
Part 1
Since , there must be some such that , thus . If , then it follows that , hence
[TABLE]
Otherwise, , thus by the assumption , we have that
[TABLE]
Part 2
Assume to the contrary that and . The assumption implies that , in contradiction to part 1 of the Lemma. Hence, the assumption is false, and implies that .
Part 3 and Part 4
By the definition of in (72), we have that for all . Thus,
[TABLE]
where the last equality is due to part 1. Part 4 immediately follows. ∎
Appendix L Proof of Lemma 14
Let be a zero mean random vector with the covariance matrix . Observe that by (86), the AVGPC is symmetrized by a conditional pdf if
[TABLE]
for all . By substituting in the LHS, and in the RHS, this is equivalent to
[TABLE]
For every , define the random vector . We note that the RHS is the convolution of the pdfs of the random vectors and , while the LHS is the convolution of the pdfs of the random vectors and . This is not surprising since the channel output is a sum of independent random vectors, and thus the pdf of is a convolution of pdfs. It follows that , and by plugging instead of , we have that symmetrizes the AVGPC if and only if
[TABLE]
Then, the corresponding state cost satisfies
[TABLE]
where the second equality follows by the integral substitution of . Observe that the bracketed integral can be expressed as
[TABLE]
Thus, by (227),
[TABLE]
Note that the last inequality holds for any which symmetrizes the channel. Now, observe that (226) holds for , where is the Dirac delta function, hence symmetrizes the channel. In addition, since gives probability to , we have that (229) holds with equality for , and thus, . ∎
Appendix M Proof of Theorem 15
Consider the AVGPC under input constraint and state constraint .
Achievability Proof
Assume that . We show that $\mathbb{C}(\Sigma)\geq\mathsf{C}(\Sigma)=\mathsf{C}^{\star}(\Sigma)$. By [28, Theorem 3], if there exists an input distribution such that , then the capacity is given by
[TABLE]
where and .
Consider the input distribution of a Gaussian vector , where the covariance matrix is given by . By Lemma 14, we have that
[TABLE]
Having assumed that , it follows that , hence (230) applies. Then, setting yields
[TABLE]
where the second inequality holds as are independent and since conditioning reduces entropy, and the last inequality holds since Gaussian noise is known to be the worst additive noise under variance constraint [34, Lemma II.2].
From this point, we use the considerations given in [61]. To prove the direct part, it remains to show that the assignment of , for , is optimal in the RHS of (234), where are as defined in (72)-(73). An assignment of is optimal if and only if it satisfies the KKT optimality conditions [20, Section 5.5.3],
[TABLE]
for , where is a Lagrange multiplier.
We claim that the conditions are met by
[TABLE]
Condition (235) is met by the definition of , , in (72)-(73). Let be a given channel index. We consider the following cases. Suppose that . Then, Condition (237) is clearly satisfied. Now, if , then Condition (236) is satisfied since by part 1 of Lemma 13. Otherwise, , and then
[TABLE]
where the last inequality holds since only if . Thus, Condition (236) is satisfied.
Next, suppose that , hence . By part 2 of Lemma 13, this implies that , i.e. . Thus,
[TABLE]
and thus Condition (236) is satisfied with equality, and Condition (237) is satisfied as well.
As the KKT conditions are satisfied under (238), we deduce that the assignment of , , minimizes the RHS of (234). Together with (234), this implies that $\mathbb{C}(\Sigma)\geq\mathsf{C}^{\star}(\Sigma)$ for .
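The KKT argument can be checked numerically. The sketch below assumes the standard double water filling form, with jammer allocation N_j = [β − σ_j²]_+ and encoder allocation P_j = [α − (N_j + σ_j²)]_+ for water levels α > β > 0, and the objective Σ_j ½ log(1 + P_j/(N_j + σ_j²)) minimized over the jammer allocation. The function name `kkt_check` and the candidate multiplier ½(1/β − 1/α) are illustrative assumptions, since the displayed conditions are not reproduced here.

```python
import numpy as np


def kkt_check(noise_var, alpha, beta):
    """Return (P, N, theta, slack) for water levels alpha > beta > 0."""
    noise_var = np.asarray(noise_var, dtype=float)
    N = np.maximum(beta - noise_var, 0.0)         # jammer water filling
    P = np.maximum(alpha - (N + noise_var), 0.0)  # encoder water filling
    floor = N + noise_var                         # effective noise level per channel
    # Negative partial derivative of 0.5*log(1 + P_j/(N_j + sigma_j^2)) in N_j.
    neg_grad = P / (2.0 * floor * (floor + P))
    theta = 0.5 * (1.0 / beta - 1.0 / alpha)      # candidate Lagrange multiplier
    slack = theta - neg_grad                      # KKT: >= 0, and = 0 where N_j > 0
    return P, N, theta, slack
```

For example, with noise variances (0.5, 1, 2, 4), α = 3, and β = 1.5, the jammer is active on the first two channels only, and the slack vanishes exactly there while remaining nonnegative elsewhere.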
Converse Proof
We use a technique similar to that in [32] (see also [37, 16]). In general, the deterministic code capacity is bounded by the random code capacity, hence $\mathbb{C}(\Sigma)\leq\mathbb{C}^{\star}(\Sigma)=\mathsf{C}^{\star}(\Sigma)$, by Theorem 12. It remains to show that if , then the capacity is zero. Suppose that , and assume to the contrary that there exists an achievable rate . Then, there exists a sequence of codes for the AVGPC such that as , where the size of the message set is at least , i.e. .
Consider a jammer who chooses the state sequence from the codebook uniformly at random, i.e. , where is uniformly distributed over . This choice meets the state constraint, since the square norm of the state sequence is . The average probability of error is then bounded by
[TABLE]
where , and
[TABLE]
By interchanging the summation variables and , we now have that
[TABLE]
Next, observe that for , , and thus the probability of error is lower bounded by
[TABLE]
where the last inequality holds since . Hence, the assumption is false and a positive rate cannot be achieved when . This completes the proof of the converse part. ∎
Appendix N Proof of Theorem 16
Consider the AVC with colored Gaussian noise. First, we show that the problem can be transformed into that of an AVC with fixed parameters. Then, we derive a limit expression for the random code capacity, and prove the capacity characterization in Theorem 16 using the Toeplitz matrix properties in the auxiliary lemma below. To derive the deterministic code capacity, we use similar symmetrizability and optimization arguments as in our proofs for the Gaussian product channel.
Lemma 22 ([35, Section 2.3]; see also [43, 53], [39, Section 8.5]).
Let be the power spectral density of a zero mean stationary process . Assume that is bounded and integrable, for some , and denote the auto-correlation function by
[TABLE]
with . For a sequence of length , let denote the eigenvalues of the covariance matrix , where for . Then, for every real, monotone non-increasing, and bounded function ,
[TABLE]
if the integral exists.
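The statement of Lemma 22 can be illustrated numerically. The sketch below assumes an example autocorrelation r(k) = ρ^{|k|}, whose power spectral density is (1 − ρ²)/(1 − 2ρ cos ω + ρ²), and the test function G(x) = max(c − x, 0), which is bounded and non-increasing. It compares the eigenvalue average of the n × n covariance matrix with the normalized integral of G over the spectrum. The process, the constant c, and the function name `toeplitz_average` are assumptions for illustration only.

```python
import numpy as np


def toeplitz_average(rho=0.5, n=256, level=1.5, grid=20000):
    """Eigenvalue average of G versus (1/2pi) * integral of G(psd)."""
    # Toeplitz covariance with entries r(i - j) = rho^|i - j| (assumed example).
    idx = np.arange(n)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])
    eigvals = np.linalg.eigvalsh(cov)

    # Power spectral density matching r(k) = rho^|k|, on a midpoint grid of [-pi, pi].
    omega = -np.pi + 2 * np.pi * (np.arange(grid) + 0.5) / grid
    psd = (1 - rho**2) / (1 - 2 * rho * np.cos(omega) + rho**2)

    def G(x):
        # Bounded, non-increasing test function.
        return np.maximum(level - x, 0.0)

    lhs = G(eigvals).mean()  # (1/n) * sum_i G(lambda_i)
    rhs = G(psd).mean()      # midpoint rule for (1/2pi) * integral of G(psd(omega))
    return lhs, rhs
```

Increasing `n` should bring the two values closer together, in line with the limit in the lemma.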
N-A Transformation to AVC with Fixed Parameters
Let denote the covariance matrix of the noise sequence . Consider the eigendecomposition of the covariance matrix , and denote the eigenvector and eigenvalue matrices by and , respectively, i.e.
[TABLE]
We claim that the capacity of the AVC with colored Gaussian noise is the same as the capacity of the following AVC,
[TABLE]
where , , and . Since is a unitary matrix, i.e. , the input and state constraints remain the same, as , and similarly, . Furthermore, the noise covariance matrix is now
[TABLE]
This transformation can be thought of as a linear system, which is not time invariant. Hence, the noise of the transformed channel is a Gaussian process, but it is non-stationary. Thereby, the input-output relation above specifies a time varying channel, . From an operational perspective, if there exists a code for the original AVC with colored Gaussian noise, then the code , given by and , is a code for the transformed AVC in (248). Similarly, if there exists a code for the transformed AVC, then the code , given by and , is a code for the original AVC. Thus, the original AVC and the transformed AVC have the same operational capacity.
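The two facts used in this transformation, that an orthogonal rotation preserves the squared norm of a sequence and that it diagonalizes the noise covariance, can be checked with a short sketch. The example covariance with entries ρ^{|i−j|}, the function name `whiten_by_rotation`, and the convention of rotating by the transpose of the eigenvector matrix are assumptions made for illustration.

```python
import numpy as np


def whiten_by_rotation(n=8, rho=0.6, seed=0):
    """Rotate a sequence and a covariance matrix by the eigenvector matrix."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])  # example noise covariance
    eigvals, Q = np.linalg.eigh(cov)                   # cov = Q diag(eigvals) Q^T

    x = rng.standard_normal(n)  # stand-in for an input (or state) sequence
    x_rot = Q.T @ x             # rotated sequence
    cov_rot = Q.T @ cov @ Q     # covariance of the rotated noise Q^T Z
    return x, x_rot, cov_rot, eigvals
```

Since the norm is unchanged, input and state constraints on the squared norm carry over to the rotated sequences, while the rotated noise has uncorrelated components with variances given by the eigenvalues.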
Therefore, we can assume without loss of generality that the noise sequence has independent components , . Assume, at first, that for , with some set of finite size, which does not grow with , and that , where is arbitrarily small. Hence, observe that the channel in (248) is equivalent to a channel with fixed parameters, specified by
[TABLE]
with the parameter sequence . It is left to determine the random code capacity and deterministic code capacity of the Gaussian AVC with fixed parameters in (250). Although we previously assumed in Sections II and III that the input, state, and output alphabets are finite, our results can be extended to the continuous case as well, using standard discretization techniques [15, 5] [36, Section 3.4.1].
Now, consider the double water filling allocation,
[TABLE]
for , where and are chosen to satisfy and , respectively. Define
[TABLE]
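A minimal sketch of a double water filling computation follows. It assumes the standard form in which the jammer first pours its power over the noise variances, N_j = [β − σ_j²]_+, and the encoder then pours its power over the resulting levels, P_j = [α − (N_j + σ_j²)]_+, with the water levels found by bisection. The function names, the use of total (unnormalized) power budgets, and the rate expression in bits are assumptions, not the paper's notation.

```python
import numpy as np


def water_level(floors, budget, iters=200):
    """Level L with sum(max(L - floors, 0)) == budget, found by bisection."""
    lo, hi = float(np.min(floors)), float(np.max(floors)) + budget
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.maximum(mid - floors, 0.0).sum() > budget:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def double_water_filling(noise_var, input_power, jammer_power):
    """Return (P, N, alpha, beta, rate) for the assumed double water filling."""
    noise_var = np.asarray(noise_var, dtype=float)
    beta = water_level(noise_var, jammer_power)      # jammer fills first
    N = np.maximum(beta - noise_var, 0.0)
    alpha = water_level(noise_var + N, input_power)  # encoder fills on top
    P = np.maximum(alpha - (noise_var + N), 0.0)
    # Sum of per-channel Gaussian rates, in bits, treating the jamming as noise.
    rate = 0.5 * np.sum(np.log2(1.0 + P / (noise_var + N)))
    return P, N, alpha, beta, rate
```

For instance, `double_water_filling([0.5, 1.0, 2.0, 4.0], 4.0, 1.5)` yields water levels β = 1.5 and α = 3: the jammer raises the two quietest channels to a common level, and the encoder leaves the noisiest channel unused.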
N-B Random Code Capacity
Now that we have shown that the problem reduces to that of an AVC with fixed parameters, we have by Corollary 5 that the random code capacity is given by
[TABLE]
where $\mathsf{C}_{\sigma}^{\star}(P,N)$ is the random code capacity of the traditional AVC under input constraint and state constraint . Hughes and Narayan [60] showed that the random code capacity of such a channel, where the noise sequence is i.i.d. , is given by
[TABLE]
Hence, for the AVC with colored Gaussian noise,
[TABLE]
Next, observe that this is the same min-max optimization as for the AVGPC in (78), due to [61], with , , . Therefore, by Theorem 12 [61] and (256),
[TABLE]
Given a bounded power spectral density , define a function by
[TABLE]
and observe that
[TABLE]
As is non-increasing and bounded by , we have by Lemma 22 that
[TABLE]
Observing that the function defined in (258) is also continuous, while is bounded and integrable, it follows that the integral exists [86, Theorem 6.11]. Plugging (258) into the RHS of (260), we obtain
[TABLE]
where and satisfy (93) and (95), respectively. Since the covariance matrix of the stationary noise process is Toeplitz (see e.g. [43]), the density of eigenvalues on the real line tends to the power spectral density [44]. Given that the power spectral density is bounded and integrable, we have that the sequence of eigenvalues is summable [43, Theorem 4.2], and thus, bounded as well. Hence, we can remove the assumption that the set of noise variances has finite cardinality, by quantization of the variances. The random code characterization now follows from (257) and (261).
N-C Deterministic Code Capacity
Moving to the deterministic code capacity, observe that for a constant-parameter Gaussian AVC, where the noise sequence is i.i.d. , we have that , by Lemma 14, taking . Therefore, for the Gaussian AVC with a parameter sequence ,
[TABLE]
where the first equality holds by the definition of in (28) and by (42). It can further be seen from the proof of Lemma 14 in Appendix L that the Gaussian channel is symmetrized by a distribution that gives probability to , and that the minimum in the formula of in (41) is attained with this distribution.
Therefore, by Corollary 11, the capacity of the AVC with colored Gaussian noise is given by the limit inferior of
[TABLE]
Consider the direct part. Suppose that , hence (see (262)), and set for . This choice of parameters satisfies the optimization constraints in (263), as , and also . Therefore,
[TABLE]
where the last inequality holds since Gaussian noise is known to be the worst additive noise under variance constraint [34, Lemma II.2]. Next, observe that this is the same minimization as in (234), in the proof of the direct part for the AVGPC, with , , (see proof of Theorem 15 in Appendix M). Therefore, the minimum is attained with , and the RHS of (257) is achievable with deterministic codes as well, provided that .
The converse part is straightforward. Since the deterministic code capacity is always bounded by the random code capacity, we have that $\mathbb{C}(\Psi_{Z})\leq\mathbb{C}^{\star}(\Psi_{Z})=\mathsf{C}^{\star}(\Psi_{Z})$. If , then by (262), hence by the second part of Corollary 11. ∎
References
- [1] A. Abdul Salam, R. Sheriff, S. Al-Araji, K. Mezher, and Q. Nasir. Novel approach for modeling wireless fading channels using a finite state Markov chain. ETRI J., 39(5):718–728, October 2017.
- [2] A. Ahlswede, I. Althöfer, C. Deppe, and U. Tamm. Probabilistic methods and distributed information. Springer, 2019.
- [3] R. Ahlswede. The weak capacity of averaged channels. J. Prob. Theory and Related Areas, 11(1):61–73, 1968.
- [4] R. Ahlswede. The capacity of a channel with arbitrarily varying additive Gaussian channel probability functions. In Trans. 6th Prague Conf. Inform. Theory, Statist. Decision Func., Random Processes, Prague, Czech Republic, Sep 1971.
- [5] R. Ahlswede. Elimination of correlation in random codes for arbitrarily varying channels. Z. Wahrscheinlichkeitstheorie Verw. Gebiete, 44(2):159–175, Jun 1978.
- [6] R. Ahlswede. Arbitrarily varying channels with states sequence known to the sender. IEEE Trans. Inform. Theory, 32(5):621–629, Sep 1986.
- [7] R. Ahlswede and N. Cai. Arbitrarily varying multiple-access channels. Universität Bielefeld, 1996.
- [8] R. Ahlswede and N. Cai. Arbitrarily varying multiple-access channels. I. Ericson's symmetrizability is adequate, Gubner's conjecture is true. IEEE Trans. Inform. Theory, 45(2):742–749, Mar 1999. ISSN 0018-9448.
