Feedback Capacity of the Continuous-Time ARMA(1,1) Gaussian Channel
Jun Su, Guangyue Han, Shlomo Shamai (Shitz)

TL;DR
This paper derives a closed-form expression for the feedback capacity of the continuous-time ARMA(1,1) Gaussian channel, revealing conditions under which feedback increases capacity and challenging existing bounds and conjectures.
Contribution
The paper provides the first explicit formula for feedback capacity of the continuous-time ARMA(1,1) Gaussian channel, showing feedback may not always increase capacity.
Findings
In one parameter regime, the feedback capacity equals the unique positive root of an explicit equation.
Feedback may not increase the capacity of continuous-time Gaussian channels with colored noise.
Disproves analogues of the half-bit bound and Cover's 2P conjecture in the continuous-time setting.
Abstract
We consider the continuous-time ARMA(1,1) Gaussian channel and derive its feedback capacity in closed form. More specifically, the channel is given by , where the channel input satisfies average power constraint and the noise is a first-order autoregressive moving average (ARMA(1,1)) Gaussian process satisfying where and is a white Gaussian process with unit double-sided spectral density. We show that the feedback capacity of this channel is equal to the unique positive root of the equation when and is equal to …
Feedback Capacity of OU-Colored AWGN Channels
Jun Su, Guangyue Han, Shlomo Shamai (Shitz)
The University of Hong Kong; The University of Hong Kong; Technion-Israel Institute of Technology
Abstract
We derive an explicit feedback capacity formula for the OU-Colored AWGN channel. Among other things, this result shows that, at least in some cases, the continuous-time Schalkwijk-Kailath coding scheme achieves the feedback capacity of such a channel, and that feedback may not increase the capacity of a continuous-time ACGN channel even if the noise process is colored.
1 Introduction
We start with the following continuous-time additive white Gaussian noise (AWGN) channel
[TABLE]
where the channel noise is a white Gaussian process with unit double-sided spectral density, is the channel input and is the channel output. Since can be regarded as the derivative of the standard Brownian motion in the generalized sense [1, 2], or equivalently, the integral of , the AWGN channel as in (1) can be alternatively characterized by
[TABLE]
where is the channel input and is the channel output. Unlike white Gaussian noise, which is a generalized stochastic process in the sense of Schwartz’s distribution [3], Brownian motion is an ordinary stochastic process that has been extensively studied in stochastic calculus. Evidently, the two formulations as in (1) and (2) allow us to examine an AWGN channel from different perspectives; in particular, the use of Brownian motion equips us with a wide range of established tools and techniques in stochastic calculus (see, e.g., [4, 5] and references therein).
This paper is concerned with the following continuous-time additive colored Gaussian noise (ACGN) channel
[TABLE]
where the channel noise is a (possibly colored and generalized) stationary Gaussian process. Evidently, AWGN channels are a degenerate case of ACGN channels. As above, the ACGN channel (3) can be alternatively characterized by
[TABLE]
where is the (generalized) integral of . Following [4], the treatment of ACGN channels in this work is mainly based on the formulation in (4).
For any and , an code for the ACGN channel (4) consists of the following:
- A message index independent of and uniformly distributed over .
- For the non-feedback case, an encoding function , yielding codewords ; for the feedback case, an encoding function , yielding codewords . For both cases, the classical average power constraint is satisfied:
[TABLE]
- A decoding functional .
Here we remark that for the feedback case, it follows from the pathwise continuity of that , and therefore the channel output is in fact the unique solution to the following stochastic functional differential equation:
[TABLE]
The error probability for the code as above is defined as
[TABLE]
A rate is achievable if there exists a sequence of codes with . The channel capacity is defined as the supremum of all achievable rates, denoted by for the non-feedback case and for the feedback case.
The literature on ACGN channels is vast, so below we survey only the results most relevant to this work. It has been shown by Huang and Johnson [6, 7] that can be achieved by a Gaussian input. For a special family of ACGN channels, Hitsuda [8] applied a canonical representation method to derive a fundamental formula for the channel mutual information (see Lemma 3.2); based on this result, Ihara [9] showed that can be achieved by a Gaussian input with an additive feedback term. As in the discrete-time case, the property that feedback can at most double the capacity of an ACGN channel, i.e., , is established by examining a discrete-time approximation of (see [10, 11, 12]). Employing a Hilbert space approach [13, 14], Baker [15, 16] derived a theoretical formula for , which, however, is somewhat difficult to evaluate. When it comes to effective computation of or , to the best of our knowledge, there are only a few results featuring an “explicit” and “computable” formula, detailed below. Here, we remark that Baker, Ihara and Hitsuda have studied the capacity of some families of ACGN channels, yet under different types of power constraints (see [16, 15, 17, 8]).
- 1.
For the ACGN channel formulated as in (3), when is an ordinary stationary Gaussian process with rational spectrum, can be determined by the water-filling method (see, e.g., [18, 14, 19, 17]). More specifically,
[TABLE]
where is the spectral density function (SDF) of the noise process and the water level is a constant determined by
[TABLE]
- 2.
For the AWGN channel as in (1) or (2), it is a classical result that and feedback does not increase the channel capacity, that is to say, (see, e.g., [4, 5, 20]). Moreover, can be achieved by a linear feedback coding scheme that maximizes the channel mutual information and minimizes the filtering error simultaneously [21, 22, 23].
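As a numerical sanity check on the two results above, the following minimal sketch runs water-filling on a truncated frequency grid. It assumes the normalization C = (1/4π)∫ log max(θ/S(ω), 1) dω with power constraint (1/2π)∫ max(θ − S(ω), 0) dω = P, which may differ from the paper's convention by constant factors; the function name `water_filling_capacity`, the truncation `bandwidth`, and the grid size are hypothetical choices made here. For white noise with unit spectral density, the computed value should approach the classical AWGN capacity P/2 (in nats per unit time) as the band grows.

```python
import numpy as np

def water_filling_capacity(noise_sdf, power, bandwidth, n_grid=20001):
    """Water-filling capacity (nats per unit time) on [-bandwidth, bandwidth].

    Assumed normalization (an assumption, not taken from the paper):
        (1/(2*pi)) * integral of max(theta - S(w), 0) dw = power
        C = (1/(4*pi)) * integral of log(max(theta / S(w), 1)) dw
    """
    w = np.linspace(-bandwidth, bandwidth, n_grid)
    dw = w[1] - w[0]
    s = noise_sdf(w)

    def used_power(theta):
        # Power spent when the water level is theta (Riemann sum).
        return np.sum(np.maximum(theta - s, 0.0)) * dw / (2.0 * np.pi)

    # Bisection on the water level: used_power is nondecreasing in theta,
    # and the upper bracket below always spends at least `power`.
    lo = float(s.min())
    hi = float(s.max()) + np.pi * power / bandwidth + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if used_power(mid) < power:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    capacity = np.sum(np.log(np.maximum(theta / s, 1.0))) * dw / (4.0 * np.pi)
    return capacity, theta

# White noise with unit spectral density, P = 1, wide (truncated) band.
cap_white, level_white = water_filling_capacity(lambda w: np.ones_like(w), 1.0, 1000.0)
```

With a finite band the white-noise value stays slightly below P/2; passing a non-constant `noise_sdf` gives the colored-noise case under the same assumed normalization.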
In this paper, we will focus our attention on a special family of ACGN channels, which is characterized as
[TABLE]
where . Note that the channel above can be alternatively characterized by
[TABLE]
where, as before, is a white Gaussian process, and is a stationary Ornstein-Uhlenbeck (OU) process, arguably the simplest nontrivial continuous-time stationary Gaussian process. Evidently, when , the equation (7) boils down to (1), and when , the channel input, after going through an AWGN channel, will be further corrupted by an OU noise. For this reason, we will henceforth refer to the channel (6) as an OU-Colored AWGN channel.
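For reference, a standard parametrization of a stationary OU process is recalled below. The rate a > 0 and the symbols U, R_U, S_U are illustrative notation chosen here and need not match the paper's parameters; the spectral density uses the convention S(ω) = ∫ R(τ) e^{-iωτ} dτ.

```latex
dU(t) = -a\,U(t)\,dt + dB(t), \qquad
U(0) \sim \mathcal{N}\!\left(0, \tfrac{1}{2a}\right) \ \text{independent of } B,
\\[4pt]
R_U(\tau) = \mathbb{E}\big[U(t+\tau)\,U(t)\big] = \frac{e^{-a|\tau|}}{2a},
\qquad
S_U(\omega) = \int_{-\infty}^{\infty} R_U(\tau)\, e^{-i\omega\tau}\, d\tau = \frac{1}{a^{2}+\omega^{2}}.
```

The spectral density is rational and decays like ω^{-2}, which is why adding such a component to white noise leaves the high-frequency noise level unchanged.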
The main contribution of this work is an explicit characterization of the feedback capacity of an OU-Colored AWGN channel. Before this work, no “explicit” and “computable” formula was known for any nontrivial stationary ACGN channel (3). Throughout the remainder of this paper, the notations and will be reserved for the OU-Colored AWGN channel (6).
We will first derive a lower bound on , which turns out to be tight for some cases. To achieve this, we will examine the following ACGN channel
[TABLE]
where is a Volterra kernel function on for any . Here we emphasize that the channel (8) may not correspond to a stationary ACGN channel as in (3). However, it can be shown that is equivalent to the Brownian motion [24], which renders the channel (8) more amenable to in-depth mathematical analysis, as evidenced by relevant results in the literature (see, e.g., [8, 9, 25]).
More specifically, let be the message process, and let denote the mutual information rate between and under the so-called continuous-time Schalkwijk-Kailath (SK) coding scheme. We will show (Theorem 4.2) that
[TABLE]
where is the limit of the unique solution to an ordinary differential equation, and moreover, one of the real roots of a third-order polynomial. It turns out that an OU-colored AWGN channel can be regarded as a special case of (8), and therefore can help provide a lower bound on .
With the aforementioned lower bound, we are ready to derive an explicit expression of . More specifically, by examining a discrete-time approximation of the channel (6), we prove (Theorem 5.1) that for the case , is upper bounded by , which means ; for the other cases, we show . As a byproduct, this result shows that feedback may not increase the capacity of a continuous-time ACGN channel even if the noise process is colored. By contrast, for a discrete-time ACGN channel, feedback does not increase the capacity if and only if the noise spectrum is white (see [26, Corollary 4.3]).
The remainder of the paper is organized as follows. In Section 2, we review the necessary notation and terminology. We review the coding theorem for the feedback capacity and introduce the continuous-time SK coding scheme in Section 3. Section 4 provides an asymptotic characterization of for a subclass of ACGN channels, which represents a lower bound on . In Section 5, we derive an explicit formula for .
2 Notation and Terminology
We use to denote the underlying probability space, and to denote the expectation with respect to the probability measure . As is typical in the theory of stochastic calculus, we assume the probability space is equipped with a filtration , which satisfies the usual conditions [27] and is rich enough to accommodate a standard Brownian motion. Throughout the paper, we will mostly use uppercase letters (e.g., , ) to denote random variables, and their lowercase counterparts (e.g., , ) to denote their realizations.
Let denote the space of all continuous functions over , and let be the space of all functions in that have continuous derivatives on . For any , let denote the space of all continuous functions over . Let be random variables defined on the probability space , which will be used to illustrate most of the notions and facts in this section (note that the same notations may have different connotations in other sections). Note that in this paper, a random variable can be real-valued with a probability density function, or path-valued (more precisely, - or -valued).
For any two path-valued random variables and , we use and to denote the probability distributions on induced by and , respectively, and the product distribution of and ; moreover, we will use to denote their joint probability distribution on . Besides, we use to denote the -field generated by .
For any two probability measures and , we write to mean they are equivalent, namely, is absolutely continuous with respect to and vice versa. By Hitsuda [24], if a Gaussian process is equivalent to a given Brownian motion, then there exists a (possibly different) Brownian motion such that can be uniquely represented by
[TABLE]
where is a Volterra kernel function on for any , i.e., if and for any . Conversely, for a given Brownian motion , if has a representation in the form (9), then is equivalent to . Note that, for any , there exists a Volterra kernel function , referred to as the resolvent kernel of , such that
[TABLE]
for any (see [28, Chapter 2]). Therefore, the Brownian motion can be also uniquely determined in terms of as
[TABLE]
The mutual information between two real-valued random variables is defined as
[TABLE]
where denote the probability density functions of , respectively, and their joint probability density function. More generally, for two -valued random variables , we define
[TABLE]
where denotes the Radon-Nikodym derivative of with respect to .
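As a standard worked instance of the density-based definition (not taken from the paper), let (X, Y) be nondegenerate jointly Gaussian random variables with variances σ_X², σ_Y² and correlation coefficient ρ ∈ (−1, 1); these symbols are introduced here only for illustration, and h denotes differential entropy.

```latex
\begin{aligned}
I(X;Y) &= \mathbb{E}\!\left[\log\frac{f_{XY}(X,Y)}{f_X(X)\,f_Y(Y)}\right]
        = h(X) + h(Y) - h(X,Y) \\
       &= \tfrac12\log\!\big(2\pi e\,\sigma_X^2\big) + \tfrac12\log\!\big(2\pi e\,\sigma_Y^2\big)
          - \tfrac12\log\!\big((2\pi e)^2\,\sigma_X^2\,\sigma_Y^2\,(1-\rho^2)\big) \\
       &= -\tfrac12\log\!\big(1-\rho^{2}\big).
\end{aligned}
```

For example, if Y = X + N with X ~ N(0, P) and N ~ N(0, σ²) independent, then ρ² = P/(P + σ²) and the formula reduces to the familiar ½ log(1 + P/σ²).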
The notion of mutual information can be further extended to generalized random processes. We describe this extension only briefly and refer the reader to [13] for a more comprehensive exposition.
The mutual information between two generalized random processes and is defined as
[TABLE]
where the supremum is over all possible and all possible testing functions and , and we have defined
[TABLE]
[TABLE]
It can be verified that the general definition of mutual information as in (14) includes (12) and (13) as special cases; moreover, when one of and , say, , is a random variable, the general definition boils down to
[TABLE]
where the supremum is over all possible and all possible testing functions .
3 Continuous-Time SK Coding
In this section, we shall examine the continuous-time ACGN channel (8). Throughout this section, let .
The celebrated channel coding theorem by Shannon [20] states, roughly speaking, that for a discrete memoryless channel, the capacity can be written as a supremum of the mutual information between the channel input and output. This classical result has been widely extended and generalized to various channel models. Not surprisingly, under some mild assumptions, similar results hold for the non-feedback and feedback capacity of our channel. We will present the coding theorem for the feedback capacity below, while that for the non-feedback capacity can be found in Section 5.2.
For the purpose of presenting a coding theorem for the feedback capacity, instead of transmitting a message index , a random variable taking values from a finite alphabet, we will transmit a message process , a real-valued random process. Then, compared to (5), the associated stochastic functional differential equation will take the following form:
[TABLE]
where we have set . Following [4], we consider the so-called -block feedback capacity
[TABLE]
where the supremum is taken over all pairs satisfying the following constraint
[TABLE]
Now, we define
[TABLE]
provided the limit exists, and furthermore define
[TABLE]
where the supremum is taken for all pairs satisfying the constraint
[TABLE]
Then, the aforementioned coding theorem for the feedback capacity is stated below.
Theorem 3.1 ([29, Theorem 1]).
Assume that
[TABLE]
If and is a continuity point of , then the rate is achievable. Conversely, if a rate is achievable, then .
The following lemma generalizes the classical I-CMMSE relationship in [5, 30].
Lemma 3.2 ([12, Theorem 1]).
Suppose . Then, we have
[TABLE]
where is a random process defined by
[TABLE]
and is the resolvent kernel of in and .
When it comes to the -block feedback capacity of the channel (8), we remark that the so-called additive feedback coding scheme can achieve (see, e.g., [31, 9]). This coding scheme is formulated as follows. Consider the additive feedback coding scheme with , where represents the feedback term, causally dependent on the output , and is appropriately chosen such that the stochastic functional differential equation
[TABLE]
admits a unique solution. Obviously, if there is no feedback, (17) becomes
[TABLE]
Slightly extending the result [4, Theorem 6.2.3], we can prove the following lemma in the same manner.
Lemma 3.3.
Suppose that
[TABLE]
Then, for any , we have
[TABLE]
Note that (18) means that for the channel (8) under this scheme, additive feedback does not provide the receiver with any new information. However, feedback can be used as a means to save transmission energy, since, for a fixed message , we can lower by appropriately choosing . This observation suggests an effective way to design a coding scheme to maximize for the channel (8) in which satisfies (15). Indeed, Ihara proved the following result, for which a more direct proof is provided in Appendix A.
Theorem 3.4 ([9, Theorem 3], reformulated).
For the continuous-time ACGN channel (8) under the constraint (15), of can be achieved by a Gaussian pair of the following form
[TABLE]
where
[TABLE]
Moreover, , and so the pair characterizes an additive feedback coding scheme of the form (17) where .
The essence of the above theorem is that we can restrict our attention to coding schemes of the form (19). In the spirit of the classical Schalkwijk-Kailath (SK) coding scheme, we formulate in our notation the continuous-time version of the celebrated SK coding scheme in the form of
[TABLE]
satisfying
[TABLE]
where is a standard Gaussian random variable and is some function.
In general, the above continuous-time SK coding scheme can be invalid in the sense that may not exist. However, in Sections 4 and 5, we will show that the continuous-time SK coding scheme is valid for a subclass of ACGN channels (8) and is also optimal for some special families of ACGN channels.
4 Mutual Information Rate
In this section, we narrow our attention to the special family of ACGN channels (8) in which the resolvent kernel of can be written as
[TABLE]
where and .
We first prove a lemma characterizing the asymptotics of the solution to the following ordinary differential equation (ODE)
[TABLE]
where satisfying and for two constants .
Lemma 4.1.
For every , the ODE (22) admits a unique solution . Moreover, exists, which is one of the real roots of the following cubic equation:
[TABLE]
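The qualitative content of this lemma, a solution that exists for all time and settles at a real root of a cubic, can be illustrated on a toy problem. The sketch below is only an analogy: it integrates the autonomous ODE g'(t) = −p(g(t)) for a hypothetical cubic p with constant coefficients, whereas the ODE (22) has time-dependent coefficients and is not reproduced here. The function name `integrate_to_limit`, the cubic, and the step size are all assumptions made for illustration.

```python
import numpy as np

def integrate_to_limit(coeffs, g0, t_end=20.0, dt=1e-3):
    """Explicit Euler for the toy ODE g'(t) = -p(g(t)), where p is the
    polynomial with coefficients `coeffs` (highest degree first).
    Returns the value of the numerical solution at time t_end."""
    g = float(g0)
    for _ in range(int(round(t_end / dt))):
        g -= dt * np.polyval(coeffs, g)
    return g

# Hypothetical cubic p(g) = g^3 + g - 2 = (g - 1)(g^2 + g + 2):
# exactly one real root, located at g = 1.
coeffs = [1.0, 0.0, 1.0, -2.0]
limit = integrate_to_limit(coeffs, g0=0.0)

# Real roots of the cubic, for comparison with the long-time value.
real_roots = [r.real for r in np.roots(coeffs) if abs(r.imag) < 1e-9]
```

Here the long-time value of the numerical solution coincides with the single real root. With three real roots the limit would depend on the initial condition, which mirrors the case analysis in Appendix B.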
Equipped with Lemma 4.1, we can prove the following theorem.
Theorem 4.2.
Assume the resolvent kernel of in (8) can be written in the form (21) with
[TABLE]
where . Then, we have
[TABLE]
where and is the solution of the ODE (22) with and . Moreover, is one of the real roots of the following cubic equation
[TABLE]
Proof.
We shall employ a continuous-time SK coding scheme . Let be a function defined by
[TABLE]
where the function is defined to be a solution of the following Abel equation of the first kind:
[TABLE]
It then follows from (23) and Lemma 4.1 that exists (denoted by ) and is one of the real roots of the cubic equation (25).
Next, we shall prove that the continuous-time SK coding scheme defined by (26) and (20) is valid, that is, for any
[TABLE]
Indeed, since satisfies (27), it holds that for all
[TABLE]
Multiplying both sides of (29) by , we obtain
[TABLE]
and
[TABLE]
Therefore, (29) leads to
[TABLE]
which is equivalent to
[TABLE]
where
[TABLE]
Therefore, noting the initial condition , it holds that
[TABLE]
By [32, Theorem 12.2], we can readily establish
[TABLE]
which, together with (32), immediately implies (28), as desired.
Now we are ready to prove (24). From Lemma 3.2 and (33), it follows that for a fixed ,
[TABLE]
Thus, we have
[TABLE]
where (a) follows from (31) and (32), (b) follows from Lemma 4.1. Thus, (24) is established and then the proof is complete. ∎
Remark 4.3.
The proof of Theorem 4.2 shows that is uniquely determined by , rather than by the choice of .
To illustrate the application of the above theorem, we give the following two examples.
Example 4.4.
When , the channel (8) boils down to the AWGN channel (2). Clearly, one can choose and , yielding , which is widely known as the capacity of (2).
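For the AWGN special case, the classical capacity value P/2 (in nats per unit time) can be recovered by a short calculation with an SK-type scheme. The derivation below is a sketch in notation chosen here: Θ, its estimate, the error variance γ, and the gain A are illustrative names, and the power constraint is imposed pointwise in time, which in particular satisfies the average constraint.

```latex
% Illustrative notation: \Theta \sim \mathcal{N}(0,1) is the message,
% \hat{\Theta}(t) = \mathbb{E}[\Theta \mid Y_0^t],
% \gamma(t) = \mathbb{E}[(\Theta - \hat{\Theta}(t))^2], and A(t) is a deterministic gain.
\begin{aligned}
dY(t) &= \underbrace{A(t)\big(\Theta - \hat{\Theta}(t)\big)}_{X(t)}\,dt + dB(t), \qquad \gamma(0) = 1,\\
\dot{\gamma}(t) &= -A(t)^2\,\gamma(t)^2 && \text{(Kalman-Bucy filter for a constant state)},\\
\mathbb{E}\big[X(t)^2\big] &= A(t)^2\,\gamma(t) = P
  \;\Longrightarrow\; \dot{\gamma}(t) = -P\,\gamma(t), \quad \gamma(t) = e^{-Pt},\\
\frac{1}{T}\, I\big(\Theta; Y_0^T\big) &= \frac{1}{2T}\log\frac{\gamma(0)}{\gamma(T)} = \frac{P}{2}.
\end{aligned}
```

The estimation error decays exponentially at rate P, and the mutual information rate equals half of that decay rate.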
Example 4.5.
When , it turns out that the channel (8) boils down to
[TABLE]
It can be verified that , where is a non-zero constant. Thus, we have , yielding that is the unique positive root of the cubic equation . This recovers Proposition 1 in [12].
To conclude this section, although Theorem 4.2 provides a lower bound on the feedback capacity of a subclass of ACGN channels, this lower bound is somewhat implicit. In Section 5, we find more detailed answers by narrowing our attention to a special class of channel models.
5 Capacity of OU-Colored AWGN Channels
In this section, we focus on the following OU-Colored AWGN channel
[TABLE]
where
[TABLE]
The following theorem is our main result in which we derive an explicit formula for .
Theorem 5.1.
The feedback capacity is determined in the following two cases:
- (1)
if or , then ;
- (2)
if , then is the unique positive root of the third-order polynomial
[TABLE]
Before the proof, we introduce two auxiliary random processes , by
[TABLE]
respectively. Let be a Gaussian random variable defined by
[TABLE]
Note that solves the stochastic differential equation
[TABLE]
Thus, we obtain
[TABLE]
Moreover, it holds that
[TABLE]
5.1 Proof of the Converse Part (Upper Bound)
In this subsection, we prove the converse part of Theorem 5.1, which relies on some existing results on the feedback capacity of discrete-time ARMA(1,1) Gaussian channels under the average power constraint (see detailed definitions in [33]). For such channels, Yang et al. [34, Theorem 7] derived a relatively explicit formula for feedback capacity under the assumption that stationary inputs can achieve feedback capacity, which has been confirmed by Kim in the proof of [26, Theorem 3.1]. Thus, feedback capacity for the ARMA(1,1) noise channels is known, as reformulated below.
Theorem 5.2 ([34], [26]). (Footnote 1: Theorem 5.2 has been stated and proved in [26, Theorem 5.3]. However, a recent paper [35] pointed out that the proof of a key result [26, Corollary 4.4] is incorrect, and as a consequence, the proof of Theorem 5.3 in [26] is invalid.)
Suppose the noise process is an ARMA(1,1) Gaussian process satisfying
[TABLE]
where is a white Gaussian process with zero mean and unit variance. Then, under the average power constraint
[TABLE]
the feedback capacity of additive Gaussian channel is given by
[TABLE]
where is the unique positive root of the fourth-order polynomial
[TABLE]
Remark 5.3.
Yang et al. and Kim only gave the result for ; the case can be readily proved by converting it into the case ; the case can be easily established via a perturbation argument.
Then, we can derive an upper bound for the -block feedback capacity in the following lemma.
Lemma 5.4.
For any , the -block feedback capacity of the OU-Colored AWGN channel (34) is upper bounded by
[TABLE]
where is the unique positive root of polynomial (35).
Proof.
By Theorem 3.4, we can prove (39) by considering any Gaussian pair of the form (19) in which satisfies the constraint (15). Thus, WLOG, the message process is assumed to be Gaussian such that . If there is no feedback, the channel output is given by
[TABLE]
The channel input is assumed in the form . Then, the channel output is given by
[TABLE]
Moreover, it is known [9] that there exists a Volterra kernel on such that . The remainder of the proof is divided into three steps. In Steps 1 and 2, we assume that the following condition is satisfied:
- (C.0)
The Volterra kernel is continuous on the set .
Step 1. In this step, we shall introduce a sequence of ARMA(1,1) Gaussian channels constructed from the OU-Colored AWGN channel (34) by using a discrete-time approximation method.
For any , we consider a partition of satisfying for all , where . Define by
[TABLE]
respectively, where . Then, it is shown that is an ARMA(1,1) Gaussian process satisfying
[TABLE]
which, however, is not stationary. It turns out that we can modify (41) to guarantee stationarity. Specifically, we redefine as follows:
[TABLE]
where . It is straightforward to verify that is a stationary ARMA(1,1) process of the following form
[TABLE]
Furthermore, we define and as follows:
[TABLE]
where and are defined by
[TABLE]
respectively. Note that (45) and (44) correspond to -block discrete-time ARMA(1,1) Gaussian channels with feedback and without feedback, respectively.
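To illustrate why sampling an OU component on a uniform grid produces a first-order autoregression, the sketch below uses the exact one-step transition of the hypothetical model dU = −aU dt + dB. The rate `a`, the step `delta`, the function names, and the model normalization are assumptions made here and are not the paper's definitions in (40)-(45). Only the autoregressive part is covered; the moving-average part contributed by the white component is not modeled.

```python
import numpy as np

def ou_ar1_coefficients(a, delta):
    """Exact sampling of dU = -a*U dt + dB at step delta:
    U[k+1] = phi * U[k] + sigma * xi[k], with xi[k] i.i.d. N(0, 1)."""
    phi = np.exp(-a * delta)
    sigma = np.sqrt((1.0 - np.exp(-2.0 * a * delta)) / (2.0 * a))
    return phi, sigma

def simulate_sampled_ou(a, delta, n_steps, rng):
    """Stationary sample path on the grid k*delta, k = 0..n_steps."""
    phi, sigma = ou_ar1_coefficients(a, delta)
    u = np.empty(n_steps + 1)
    # Draw U[0] from the stationary law N(0, 1/(2a)).
    u[0] = rng.normal(0.0, np.sqrt(1.0 / (2.0 * a)))
    xi = rng.standard_normal(n_steps)
    for k in range(n_steps):
        u[k + 1] = phi * u[k] + sigma * xi[k]
    return u

phi, sigma = ou_ar1_coefficients(a=1.0, delta=0.01)
rng = np.random.default_rng(0)
path = simulate_sampled_ou(a=1.0, delta=0.01, n_steps=200_000, rng=rng)
```

Because the initial value is drawn from the stationary law, the sampled sequence is a stationary AR(1) process whose variance sigma²/(1 − phi²) equals 1/(2a) for every step size. This parallels the modification made above to obtain a stationary discrete-time noise process.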
Step 2. This step will be devoted to approximating and by feedback capacities of the sequence of ARMA(1,1) Gaussian channels (45).
We have the following chain of inequalities:
[TABLE]
where is some function (to be specified later) dependent on with the property , where denotes the -block feedback capacity [33] of the channel (45) under the constraint that the average power of the channel input is bounded by (see [26]) and denotes the feedback capacity. Now, with (a)-(f) verified (proofs can be found in Appendix C), (39) immediately follows from (46) and Theorem 3.4.
Step 3. We will prove that the continuity assumption (C.0) can be dropped. Indeed, there exists a sequence of Volterra kernels satisfying (C.0) and
[TABLE]
Set
[TABLE]
Then, we have
[TABLE]
where we have used the fact
[TABLE]
Note that (c)-(f) in Step 2 hold true for any continuous Volterra kernel function . Thus, replacing in (46) by in the derivation of (c)-(f), we obtain
[TABLE]
which, together with (47), establishes the same inequality (46). ∎
The following corollary is an immediate consequence of Lemma 5.4.
Corollary 5.5.
It holds that
[TABLE]
Proof.
For any input satisfying (16), there exists a function with such that
[TABLE]
for all . By the definition of , we obtain
[TABLE]
Thus, (48) immediately follows from (39) and the continuity of and on . ∎
Proof of the Converse Part.
The converse part immediately follows from Theorem 3.1, Lemma 5.4 and Corollary 5.5. ∎
5.2 Proof of the Achievability Part (Lower Bound)
We will first prove the case (2) in Theorem 5.1, which relies on Theorem 4.2.
Case (2).
Note that can be regarded as the solution of the following stochastic differential equation
[TABLE]
Set . Then, the covariance function of is
[TABLE]
which is continuous at . By [36, Theorem 7.15], it holds that . It then follows from (9)-(11) that there exists a standard Brownian motion on such that
[TABLE]
where the Volterra kernel function for any .
We now evaluate the resolvent kernel and prove that fulfills all the conditions in Theorem 4.2. It follows from (36) and (37) that
[TABLE]
here is a Volterra kernel function satisfying for any , where the Volterra kernel function for any . Since the resolvent kernel of is calculated by
[TABLE]
[TABLE]
where is the Volterra kernel function satisfying for any . Therefore, by (50), we have
[TABLE]
where
[TABLE]
Then, since is the resolvent kernel of , it holds that
[TABLE]
By [37, Lemma 6.2.6], the innovation process defined by
[TABLE]
is a standard Brownian motion. The one-dimensional Kalman-Bucy filter [37, Theorem 6.2.8] is applied to estimate from the observation equations (51) to yield the following estimate:
[TABLE]
Substituting (52) and (54) into (53), we obtain (49) by a series of elementary calculations. Specifically, is calculated by
[TABLE]
It is easy to see that satisfies all the conditions in Theorem 4.2. Then, the corresponding are given by
[TABLE]
and
[TABLE]
respectively. By Theorem 4.2, we have , where is one of the real roots of the following cubic equation
[TABLE]
It is not difficult to see that the equation (55) has a unique positive root for all . Then, substituting into (55), we are able to prove that is the unique positive root of the third-order polynomial (35), which implies
[TABLE]
This, together with Corollary 5.5 and Theorem 3.1, immediately yields
[TABLE]
as desired. ∎
Remark 5.6.
Note that, for a fixed , actually depends on , so we rewrite it as . It follows from (56) that if , where is the feedback capacity of the AWGN channel (2). In other words, “coloring” may increase capacity.
However, the condition that or in Case (1) may invalidate the uniqueness of the real root of the cubic equation (55). As a result, it is challenging to determine explicitly, despite the fact that it must be one of the real roots of the polynomial (55). Nevertheless, all real roots of this polynomial must be in . As a result, , which suggests that the continuous-time SK coding scheme fails to achieve the capacity in Case (1).
Next, to prove the achievability of Case (1), let us turn our attention to the OU-Colored AWGN channel (7) in the generalized sense. Let
[TABLE]
where we have defined that , . Then, in the most rigorous terms, the channel (7) should be interpreted as
[TABLE]
where is the space of test functions over , i.e., all infinitely differentiable real functions with bounded support. Now, for any , let , and define
[TABLE]
where the supremum is taken over all positive integers and all test functions . Then, we consider the so-called -block non-feedback capacity
[TABLE]
where the supremum is taken over all independent of and satisfying the average power constraint
[TABLE]
Furthermore, we define
[TABLE]
provided the limit exists, and define
[TABLE]
where the supremum is taken over all independent of and satisfying
[TABLE]
Then, we present the aforementioned coding theorem for below.
Theorem 5.7 ([29, Theorem 1]).
Assume that
[TABLE]
If and is a continuity point of , then the rate is achievable. Conversely, if a rate is achievable, then .
Then, the proof of achievability of Case (1) will use the following corollary, which gives the explicit formula for .
Corollary 5.8.
It holds that
[TABLE]
for all or .
The proof relies on the following result, whose proof is very similar to that of [12, Lemma 4] and is thus omitted.
Lemma 5.9.
 is a generalized stationary Gaussian process with spectral density function
[TABLE]
where .
Proof of Corollary 5.8.
Note that it follows from Lemma 5.4 and that the condition (57) is fulfilled. Now, we claim that
[TABLE]
which, together with Theorem 5.7, immediately implies (58). To prove (59), it suffices to show that
[TABLE]
since follows from Corollary 5.5 and . For each , define a function
[TABLE]
Consider a series of zero-mean stationary Gaussian inputs with spectral density functions , respectively. Since both and are rational, by [14, Theorem 10.3.1], we have
[TABLE]
Note that and is a strictly decreasing (resp., increasing) function on (resp., ). Thus, by the monotone convergence theorem, we deduce that
[TABLE]
Next, we consider a series of zero-mean stationary Gaussian inputs with spectral density functions
[TABLE]
Similarly, for a fixed , the above argument yields
[TABLE]
Then, (60) is immediately derived by letting . The proof is then complete. ∎
Now we can prove the achievability of Case (1).
Case (1).
The achievability part follows immediately from together with Corollary 5.8. ∎
Appendix A Proof of Theorem 3.4
It is known that is achieved by transmitting a Gaussian message process with an additive feedback term given by (17). Thus, it suffices to consider any additive feedback coding scheme with , where is Gaussian and satisfies (15). Note that can be written as
[TABLE]
where is an -Volterra kernel. Let
[TABLE]
Substituting (61) into (62) and by (10), the stochastic equation
[TABLE]
has the unique solution , which implies is uniquely determined by and thus whenever . Now, let . It then follows from Lemma 3.3 that
[TABLE]
and
[TABLE]
where follows from Lemma 3.3 and holds true since is measurable. The proof is then complete.
Appendix B Proof of Lemma 4.1
We first prove the existence and uniqueness of the solution . Let denote the polynomial (in ): . Since are continuous, [38, Theorem (7.6)] gives rise to a unique nonextendible solution , which is either defined for all or blows up at some . In fact, the domain of extends to infinity, since the solution cannot blow up on a finite interval. Indeed, by way of contradiction, suppose that there exists such that
[TABLE]
Then, it follows from the continuity of at that there exists such that for all . However, by (22), it holds that for , which contradicts (63), as desired.
Next, we shall prove the “moreover” part. To achieve this, let
[TABLE]
Since
[TABLE]
we have that . Next, we deal with the following three cases:
- (I)
The cubic has one real root and two non-real complex conjugate roots ;
- (II)
The cubic has three distinct real roots ;
- (III)
The cubic has a simple root and a double root .
We shall prove, case by case, that the solution converges to some real root as . For , let .
Case (I). Let be a sufficiently small constant. It then follows immediately from the continuity of the roots of a polynomial [39, Theorem B] and (64) that there exists such that for any , admits the unique real root satisfying
[TABLE]
It then remains to show that there exists such that
[TABLE]
Indeed, it then immediately follows from (65) and (66) that , as desired.
Note that by ODE (22), we have
[TABLE]
Clearly, if , then (66) holds true with . WLOG, we assume in the following that and since the proof is similar if and . We now claim that there exists such that
[TABLE]
To see this, by way of contradiction, we suppose the opposite is true, that is,
[TABLE]
It then follows from (67) that for all
[TABLE]
which, together with (65), implies that
[TABLE]
Hence, both and exist, which implies . Then, by the ODE (22), we have , which contradicts (68). Consequently, (66) immediately follows from (67) with , as desired.
Case (II). The proof of this case is largely similar to that of Case (I), except that may converge to the middle root as . Indeed, let so that for . Then, there exists so that the polynomial admits three real roots satisfying for all . Consider seven disjoint subintervals of : and . On the one hand, the same argument as in Case (I) yields if or if . On the other hand, if , then there are only two subcases for , i.e., either for all or for some . The latter subcase can be handled as before. For the former subcase, we have , as desired.
Case (III). WLOG, we assume that . Let be given such that for . As in Case (II), it suffices to consider the subcase . By (67), for has two subcases, i.e., either for all or for some . The former subcase leads to and the latter subcase of converges to . The proof is then complete.
Appendix C Proofs of (a)-(f)
We shall first give the proofs of (a), (c) and (e) as follows.
Proof of (a).
The equality (a) follows from Lemma 3.3. ∎
Proof of (c).
It is easy to show that is a linear combination of , and vice versa, which implies (c). ∎
Proof of (e).
From the stationarity of the ARMA(1,1) process (43), it follows that is super-additive [26]:
[TABLE]
As a consequence, for any , which implies (e). ∎
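The passage from super-additivity to the limit is presumably an instance of Fekete's lemma. As a reminder, the lemma is stated here for a generic real sequence; the symbol $a_n$ is a placeholder and not the paper's notation, and the discrete index is an assumption about the setting:

```latex
% Fekete's lemma, super-additive form, for a generic real sequence (a_n).
\[
a_{m+n} \;\ge\; a_m + a_n \quad \text{for all } m, n \ge 1
\qquad\Longrightarrow\qquad
\lim_{n\to\infty} \frac{a_n}{n} \;=\; \sup_{n \ge 1} \frac{a_n}{n} \;\in\; (-\infty, +\infty].
\]
% In particular, for every fixed k >= 1:
\[
\frac{a_k}{k} \;\le\; \lim_{n\to\infty} \frac{a_n}{n}.
\]
```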
We are now in a position to give the proofs of (b) and (d).
Proof of (b).
Let for . Define and as follows:
[TABLE]
where and . We further define an approximation process of as
[TABLE]
Let . Then, we have
[TABLE]
Hence, defined in (42) can be equivalently written as
[TABLE]
It follows from (36) and (38) that can similarly be expressed as:
[TABLE]
Therefore, for any , we have
[TABLE]
Hence, we can readily prove that converges in distribution to . By the lower semi-continuity of mutual information [13], we obtain
[TABLE]
This, together with and , implies (b). ∎
Proof of (d).
Recall that we have constructed an -block discrete-time ARMA(1,1) Gaussian channel with feedback
[TABLE]
The energy and average power for such a channel can be computed as
[TABLE]
Define a Volterra kernel by
[TABLE]
and a random process by
[TABLE]
respectively. By the assumption (C.0), we have that , where denotes the usual norm on . Thus, it is clear from (40) and (70) that
[TABLE]
Furthermore, set and respectively. It then follows from (69) and (42) that
[TABLE]
and
[TABLE]
for all . Therefore, we have
[TABLE]
where (a) follows from the general Lebesgue dominated convergence theorem, and where in (b) we have used the result derived from the assumption (C.0) that
[TABLE]
and another result derived from (73) and (74) that
[TABLE]
Thus, we conclude that
[TABLE]
which follows from (72), (75) and
[TABLE]
Now, by Hölder’s inequality, we have
[TABLE]
which, together with (76), implies that there exists an error function such that
[TABLE]
and
[TABLE]
Then, (d) immediately follows from the definition of -block capacity. ∎
Note that defined by (42) satisfies
[TABLE]
where and .
Proof of (f).
In the following, we deal with the case only, since the case can be proved in a parallel manner.
First of all, it is clear that
[TABLE]
Next, we complete the proof by considering the following three cases:
Case 1: . For any arbitrarily small there exists a sufficiently large such that for , and . Thus, by Theorem 5.2, we obtain , where is the unique positive root of the following polynomial
[TABLE]
By the continuity of the roots of a polynomial [39, Theorem B], we infer that . Moreover, by elementary calculus, is differentiable in over and exists. Since
[TABLE]
exists, which is denoted by . Then, it holds that
[TABLE]
for large enough. Now, substituting (78) into (77) and letting , we establish the equation
[TABLE]
Thus, we have
[TABLE]
Letting , we conclude . Thus, we complete the proof of (f) in this case.
Case 2: . By Theorem 5.2 again, the polynomial (77) in Case 1 becomes
[TABLE]
Similarly, we can also obtain that , as desired.
Case 3: . In this case, the OU-Colored AWGN channel (34) reduces to a white Gaussian channel. Indeed, arguing as above, we can readily show that , which is our desired result. ∎
References
- [1] L. Koralov and Y. G. Sinai, Theory of Probability and Random Processes. Springer Science & Business Media, 2007.
- [2] I. M. Gel’fand and N. Y. Vilenkin, Generalized Functions: Applications of Harmonic Analysis, vol. 4. Academic Press, 2014.
- [3] N. Obata, White Noise Calculus and Fock Space. Springer, 2006.
- [4] S. Ihara, Information Theory for Continuous Systems, vol. 2. World Scientific, 1993.
- [5] T. Kadota, M. Zakai, and J. Ziv, “Mutual information of the white Gaussian channel with and without feedback,” IEEE Transactions on Information Theory, vol. 17, no. 4, pp. 368–371, 1971.
- [6] R. Huang and R. Johnson, “Information capacity of time-continuous channels,” IRE Transactions on Information Theory, vol. 8, no. 5, pp. 191–198, 1962.
- [7] R. Huang and R. Johnson, “Information transmission with time-continuous random processes,” IEEE Transactions on Information Theory, vol. 9, no. 2, pp. 84–94, 1963.
- [8] M. Hitsuda and S. Ihara, “Gaussian channels and the optimal coding,” Journal of Multivariate Analysis, vol. 5, no. 1, pp. 106–118, 1975.
