This paper characterizes input-degradedness and input-equivalence between channels, providing necessary and sufficient conditions, and explores the topological and continuity properties of these relationships in the space of channels.
Contribution
It introduces new characterizations of input-degradedness, including a Blackwell-Sherman-Stein-like theorem, and analyzes the topologies and continuity of channel parameters under input-equivalence.
Findings
01
A necessary and sufficient condition for input-degradedness.
02
Any decoder that is good for an input-degraded channel W is also good for the channel W′ from which W is degraded.
03
Topological properties and continuity of channel parameters under input-equivalence.
Abstract
A channel W is said to be input-degraded from another channel W′ if W can be simulated from W′ by randomization at the input. We provide a necessary and sufficient condition for a channel to be input-degraded from another one. We show that any decoder that is good for W is also good for W′. We provide two characterizations for input-degradedness, one of which is similar to the Blackwell-Sherman-Stein theorem. We say that two channels are input-equivalent if they are input-degraded from each other. We study the topologies that can be constructed on the space of input-equivalent channels, and we investigate their properties. Moreover, we study the continuity of several channel parameters and operations under these topologies.
Full text
On the Input-Degradedness and Input-Equivalence Between Channels
I Introduction
The ordering of communication channels was first introduced by Shannon in [1]. A channel W′ is said to contain another channel W if W can be simulated from W′ by randomization at the input and the output using a shared randomness between the transmitter and the receiver. Shannon showed that the existence of an (n,M,ϵ) code for W implies the existence of an (n,M,ϵ) code for W′.
Another ordering that has been well studied is the degradedness between channels. A channel W is said to be degraded from another channel W′ if W can be simulated from W′ by randomization at the output, or more precisely, if W can be obtained from W′ by composing it with another channel. It is easy to see that degradedness is a special case of Shannon’s ordering. One can trace the roots of the notion of degradedness to the seminal work of Blackwell in the 1950s on comparing statistical experiments [2]. Note that in Shannon’s ordering, the input and output alphabets need not be the same, whereas in the degradedness definition, we have to assume that W and W′ share the same input alphabet X, although they can have different output alphabets.
It is well known that if W is degraded from W′, then for any fixed code C⊂Xn, the probability of error of the ML decoder for C when it is used for W′ is at least as good as the probability of error of the ML decoder for C when it is used for W.
In this paper, we introduce another special case of the Shannon ordering that we call input-degradedness. A channel W is said to be input-degraded from another channel W′ if W can be simulated from W′ by randomization at the input. Note that W and W′ must have the same output alphabet, but they can have different input alphabets. We say that two channels are input-equivalent if they are input-degraded from each other.
One motivation to study the input-degradedness ordering is the following: let W be a fixed channel with input alphabet X and output alphabet Y. Assume that after some effort, an engineer came up with a good encoder/decoder pair for W in the sense that the probability of error is small. Assume also that the designed decoder is particularly desirable for some reason (e.g., it has a low computational complexity) so that we would like to use it for other channels if possible. What are the channels W′ for which the designed decoder also performs well in the sense that there exists a code having a low probability of error under the same decoder? We will show that a sufficient condition for the decoder to perform well for W′ is the input-degradedness of W with respect to W′.
In [3] and [4], we constructed topologies for the space of equivalent channels and studied the continuity of various channel parameters and operations under these topologies. In this paper, we show that many of the results in [3] and [4] can be replicated (with some variation) for the space of input-equivalent channels.
In Section II, we introduce the preliminaries for this paper. In Section III, we introduce and study the input-degradedness ordering. Various operational implications and characterizations of input-degradedness are provided in Section IV. The quotient topology of the space of input-equivalent channels with fixed input and output alphabets is studied in Section V. The space of input-equivalent channels with fixed output alphabet and arbitrary but finite input alphabet is defined in Section VI. A topology on this space is said to be natural if it induces the quotient topology on the subspaces of input-equivalent channels with fixed input alphabet. In Section VI, we investigate the properties of natural topologies. The finest natural topology, which we call the strong topology, is studied in Section VII. The similarity metric on the space of input-equivalent channels is introduced in Section VIII. We study the continuity of various channel parameters and operations under the strong and similarity topologies in Section IX. Finally, we show that the Borel σ-algebra is the same for all Hausdorff natural topologies.
II Preliminaries
We assume that the reader is familiar with the basic concepts of general topology. The main concepts and theorems that we need can be found in the preliminaries section of [3].
II-A Measure theoretic notations
The set of probability measures on a measurable space (M,Σ) is denoted as P(M,Σ). For every P1,P2∈P(M,Σ), the total variation distance between P1 and P2 is defined as:
\[ \|P_1-P_2\|_{TV}=\sup_{A\in\Sigma}|P_1(A)-P_2(A)|. \]
Let P be a probability measure on (M,Σ), and let f:M→M′ be a measurable mapping from (M,Σ) to another measurable space (M′,Σ′). The push-forward probability measure of P by f is the probability measure f#P on (M′,Σ′) defined as (f#P)(A′)=P(f−1(A′)) for every A′∈Σ′. If A is a subset of P(M,Σ), we define its push-forward by f as f#(A)={f#P:P∈A}.
We denote the product of two measurable spaces (M1,Σ1) and (M2,Σ2) as (M1×M2,Σ1⊗Σ2). If P1∈P(M1,Σ1) and P2∈P(M2,Σ2), we denote the product of P1 and P2 as P1×P2. Let A1 and A2 be two subsets of P(M1,Σ1) and P(M2,Σ2) respectively. We define the tensor product of A1 and A2 as follows:
\[ A_1\otimes A_2=\{P_1\times P_2:\;P_1\in A_1,\;P_2\in A_2\}. \]
If X is a finite set, we denote the set of probability distributions on X as ΔX. We always endow ΔX with the total variation distance and its induced topology.
II-B The space of channels from X to Y
Let DMCX,Y be the set of all channels having X as input alphabet and Y as output alphabet. For every W,W′∈DMCX,Y, define the distance between W and W′ as:
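\[ d_{\mathcal{X},\mathcal{Y}}(W,W')=\frac{1}{2}\max_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}|W'(y|x)-W(y|x)|. \]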
Throughout this paper, we always associate the space DMCX,Y with the metric distance dX,Y and the metric topology TX,Y induced by it. It is easy to see that TX,Y is the same as the topology inherited from the Euclidean topology of RX×Y by relativization. It is also easy to see that the metric space DMCX,Y is compact and path-connected (see [3]).
For every W∈DMCX,Y and every V∈DMCY,Z, define the composition V∘W∈DMCX,Z as
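\[ (V\circ W)(z|x)=\sum_{y\in\mathcal{Y}}V(z|y)W(y|x),\quad\forall x\in\mathcal{X},\;\forall z\in\mathcal{Z}. \]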
For every mapping f:X→Y, define the deterministic channel Df∈DMCX,Y as
\[ D_f(y|x)=\begin{cases}1&\text{if }y=f(x),\\0&\text{otherwise.}\end{cases} \]
It is easy to see that if f:X→Y and g:Y→Z, then Dg∘Df=Dg∘f.
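The composition rule and the identity Dg∘Df=Dg∘f can be checked numerically on small alphabets. The following is only a minimal sketch: it assumes a channel is stored as a list of rows with W[x][y]=W(y∣x), and the helper names `compose` and `deterministic`, as well as the maps `f` and `g`, are hypothetical choices made for illustration.

```python
from fractions import Fraction as F

def compose(V, W):
    """Return V∘W, where W[x][y] = W(y|x) and V[y][z] = V(z|y)."""
    n_y, n_z = len(V), len(V[0])
    return [[sum(W_x[y] * V[y][z] for y in range(n_y)) for z in range(n_z)]
            for W_x in W]

def deterministic(f, n_out):
    """Return D_f, where f[x] is the image of input x among n_out outputs."""
    return [[F(1) if y == f[x] else F(0) for y in range(n_out)]
            for x in range(len(f))]

f = [0, 1, 1]                     # hypothetical f: {0,1,2} -> {0,1}
g = [2, 0]                        # hypothetical g: {0,1} -> {0,1,2}
g_after_f = [g[f[x]] for x in range(len(f))]

lhs = compose(deterministic(g, 3), deterministic(f, 2))   # D_g ∘ D_f
rhs = deterministic(g_after_f, 3)                          # D_{g∘f}
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two matrices can be compared with `==` without rounding concerns.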
II-C Convex-extreme points
Let X be a finite set. For every A⊂ΔX, let co(A) be the convex hull of A. We say that p∈A is convex-extreme if it is an extreme point of co(A), i.e., for every p1,…,pn∈co(A) and every λ1,…,λn>0 satisfying \(\sum_{i=1}^{n}\lambda_i=1\) and \(\sum_{i=1}^{n}\lambda_i p_i=p\), we have p1=…=pn=p. It is easy to see that if A is finite, then the convex-extreme points of A coincide with the extreme points of co(A). We denote the set of convex-extreme points of A as CE(A).
II-D The Hausdorff metric
Let (M,d) be a metric space. Let K(M) be the set of compact subsets of M. The Hausdorff metric on K(M) is defined as:
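\[ d_H(K_1,K_2)=\max\Big\{\sup_{x_1\in K_1}\inf_{x_2\in K_2}d(x_1,x_2),\;\sup_{x_2\in K_2}\inf_{x_1\in K_1}d(x_1,x_2)\Big\}. \]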
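For finite sets, which are always compact, the Hausdorff metric reduces to a max-min computation. The sketch below assumes the usual two-sided definition (the larger of the two directed distances) and uses the total variation distance on a finite alphabet, computed as half the L1 distance; the function names `tv` and `hausdorff` and the sets `K1`, `K2` are hypothetical.

```python
from fractions import Fraction as F

def tv(p, q):
    """Total variation distance between two distributions on a finite set."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

def hausdorff(K1, K2, d):
    """Hausdorff distance between two finite sets under the metric d."""
    def directed(A, B):
        # Largest distance from a point of A to its nearest point in B.
        return max(min(d(a, b) for b in B) for a in A)
    return max(directed(K1, K2), directed(K2, K1))

K1 = [(F(1), F(0)), (F(1, 2), F(1, 2))]
K2 = [(F(1), F(0)), (F(0), F(1))]
print(hausdorff(K1, K2, tv))
```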
II-E Quotient topology
Let (T,U) be a topological space and let R be an equivalence relation on T. The quotient topology on T/R is the finest topology that makes the projection mapping ProjR onto the equivalence classes continuous. It is given by
\[ \mathcal{U}/R=\big\{\hat{U}\subset T/R:\;\operatorname{Proj}_R^{-1}(\hat{U})\in\mathcal{U}\big\}. \]
Lemma 1.
Let f:T→S be a continuous mapping from (T,U) to (S,V). If f(x)=f(x′) for every x,x′∈T satisfying xRx′, then we can define a transcendent mapping f̄:T/R→S such that f̄(x̂)=f(x′) for any x′∈x̂. The mapping f̄ is well defined on T/R. Moreover, f̄ is a continuous mapping from (T/R,U/R) to (S,V).
Let (T,U) and (S,V) be two topological spaces and let R be an equivalence relation on T. Consider the equivalence relation R′ on T×S defined as (x1,y1)R′(x2,y2) if and only if x1Rx2 and y1=y2. A natural question to ask is whether the canonical bijection between ((T/R)×S,(U/R)⊗V) and ((T×S)/R′,(U⊗V)/R′) is a homeomorphism. It turns out that this is not the case in general. The following theorem, which is widely used in algebraic topology, provides a sufficient condition:
Theorem 1.
[5] If (S,V) is locally compact and Hausdorff, then the canonical bijection between ((T/R)×S,(U/R)⊗V) and ((T×S)/R′,(U⊗V)/R′) is a homeomorphism.
Corollary 1.
[4] Let (T,U) and (S,V) be two topological spaces, and let R_T and R_S be two equivalence relations on T and S respectively. Define the equivalence relation R on T×S as (x1,y1)R(x2,y2) if and only if x1 R_T x2 and y1 R_S y2. If (S,V) and (T/R_T,U/R_T) are locally compact and Hausdorff, then the canonical bijection between ((T/R_T)×(S/R_S),(U/R_T)⊗(V/R_S)) and ((T×S)/R,(U⊗V)/R) is a homeomorphism.
III Input-degradedness and input-equivalence
Let X,X′ and Y be three finite sets. Let W∈DMCX,Y and W′∈DMCX′,Y. We say that W is input-degraded from W′ if there exists a channel V′∈DMCX,X′ such that W=W′∘V′. The channels W and W′ are said to be input-equivalent if each one is input-degraded from the other.
Let W∈DMCX,Y be a fixed channel with input alphabet X and output alphabet Y. For every x∈X, define Wx∈ΔY as:
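\[ W_x(y)=W(y|x),\quad\forall y\in\mathcal{Y}. \]
Proposition 1.
Let X′,X and Y be three finite sets. \(W\in\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}\) is input-degraded from \(W'\in\operatorname{DMC}_{\mathcal{X}',\mathcal{Y}}\) if and only if \(\operatorname{co}(\{W_x:\;x\in\mathcal{X}\})\subset\operatorname{co}(\{W'_{x'}:\;x'\in\mathcal{X}'\})\).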
Proof.
Assume that W is input-degraded from W′. There exists V′∈DMCX,X′ such that W=W′∘V′. For every x∈X and y∈Y, we have:
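\[ W(y|x)=(W'\circ V')(y|x)=\sum_{x'\in\mathcal{X}'}W'(y|x')V'(x'|x)=\sum_{x'\in\mathcal{X}'}V'(x'|x)W'_{x'}(y). \]
Therefore, \(W_x=\sum_{x'\in\mathcal{X}'}V'(x'|x)W'_{x'}\), which means that \(W_x\in\operatorname{co}(\{W'_{x'}:\;x'\in\mathcal{X}'\})\) for every \(x\in\mathcal{X}\), hence \(\operatorname{co}(\{W_x:\;x\in\mathcal{X}\})\subset\operatorname{co}(\{W'_{x'}:\;x'\in\mathcal{X}'\})\).
Conversely, assume that \(\operatorname{co}(\{W_x:\;x\in\mathcal{X}\})\subset\operatorname{co}(\{W'_{x'}:\;x'\in\mathcal{X}'\})\) and let \(x\in\mathcal{X}\). Since \(W_x\in\operatorname{co}(\{W'_{x'}:\;x'\in\mathcal{X}'\})\), there exists a set of numbers \(\alpha_{x,x'}\geq 0\) satisfying \(\sum_{x'\in\mathcal{X}'}\alpha_{x,x'}=1\) such that \(W_x=\sum_{x'\in\mathcal{X}'}\alpha_{x,x'}W'_{x'}\). Define \(V'\in\operatorname{DMC}_{\mathcal{X},\mathcal{X}'}\) as \(V'(x'|x)=\alpha_{x,x'}\) for every \(x\in\mathcal{X}\) and every \(x'\in\mathcal{X}'\). We have W=W′∘V′ and so W is input-degraded from W′.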
∎
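Proposition 1 becomes very concrete when ∣Y∣=2: each row (a,1−a) can be identified with the number a, so the convex hull of the rows is the interval between the smallest and the largest value of a. The sketch below covers only this binary-output special case; it tests the hull inclusion and, when it holds, builds a channel V′ with W=W′∘V′ exactly as in the converse part of the proof. The function names and the example channels are hypothetical.

```python
from fractions import Fraction as F

def input_degrading_channel(W, Wp):
    """Binary-output case: return V' with W = W'∘V', or None if none exists.

    W[x][y] = W(y|x). Each row (a, 1 - a) is identified with a, so the
    convex hull of the rows is the interval [min a, max a].
    """
    a = [row[0] for row in W]
    ap = [row[0] for row in Wp]
    lo, hi = min(ap), max(ap)
    if min(a) < lo or max(a) > hi:
        return None                      # hull of W is not inside hull of W'
    x_lo, x_hi = ap.index(lo), ap.index(hi)
    V = []
    for a_x in a:
        # Weight t on the input of W' reaching lo, 1 - t on the one reaching hi.
        t = F(1) if hi == lo else (hi - a_x) / (hi - lo)
        row = [F(0)] * len(Wp)
        row[x_lo] += t
        row[x_hi] += 1 - t
        V.append(row)
    return V

def compose(Wp, V):
    """(W'∘V')(y|x) = sum over x' of W'(y|x') V'(x'|x)."""
    return [[sum(V_x[xp] * Wp[xp][y] for xp in range(len(Wp)))
             for y in range(len(Wp[0]))] for V_x in V]

W_prime = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]
W = [[F(7, 10), F(3, 10)], [F(1, 2), F(1, 2)], [F(1, 5), F(4, 5)]]

V = input_degrading_channel(W, W_prime)
print(V is not None and compose(W_prime, V) == W)
print(input_degrading_channel(W_prime, W))
```

For larger output alphabets the inclusion test is a linear feasibility problem in the coefficients αx,x′, which this sketch does not attempt to solve.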
For every channel W∈DMCX,Y, we define the input-equivalence characteristic of W, or simply the characteristic of W, as CE(W):=CE({Wx:x∈X}). The input-rank of W∈DMCX,Y is the size of its characteristic: irank(W)=∣CE(W)∣.
Proposition 2.
Let X′,X and Y be three finite sets. W∈DMCX,Y is input-equivalent to W′∈DMCX′,Y if and only if CE(W)=CE(W′).
Proof.
It follows from Proposition 1 that W is input-equivalent to W′ if and only if co({Wx:x∈X})=co({Wx′′:x′∈X′}), which happens if and only if CE(W)=CE(co({Wx:x∈X}))=CE(co({Wx′′:x′∈X′}))=CE(W′).
∎
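As an illustration of Proposition 2, consider again the special case ∣Y∣=2, where the characteristic consists of the rows attaining the smallest and the largest value of W(0∣x). The following sketch is restricted to that case; the names `characteristic`, `input_rank` and `input_equivalent`, and the three example channels, are hypothetical.

```python
from fractions import Fraction as F

def characteristic(W):
    """CE(W) for a binary-output channel, as a set of rows (a, 1 - a)."""
    a = [row[0] for row in W]
    return {(min(a), 1 - min(a)), (max(a), 1 - max(a))}

def input_rank(W):
    """irank(W) = |CE(W)|, which is 1 or 2 when the output is binary."""
    return len(characteristic(W))

def input_equivalent(W, W_prime):
    """Proposition 2: input-equivalence holds iff the characteristics agree."""
    return characteristic(W) == characteristic(W_prime)

W = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]
W_prime = [[F(1, 10), F(9, 10)], [F(1, 2), F(1, 2)], [F(9, 10), F(1, 10)]]
constant = [[F(1, 3), F(2, 3)], [F(1, 3), F(2, 3)]]

print(input_equivalent(W, W_prime), input_rank(W_prime), input_rank(constant))
```

Here the three-input channel is input-equivalent to the two-input one because its middle row lies inside the hull of the other two.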
IV Operational implications of input-degradedness
IV-A Operational implication in terms of decoders
Let Y be a finite set. An (n,M)-decoder on Y is a mapping \(D:\mathcal{Y}^n\to\mathcal{M}\), where \(|\mathcal{M}|=M\). The set \(\mathcal{M}\) is the message set of D, n is the blocklength of D, M is the size of D and \(\frac{1}{n}\log|\mathcal{M}|\) is the rate of D (measured in nats).
Let W∈DMCX,Y be a channel with input alphabet X and output alphabet Y, and let D:Yn→M be a decoder on Y. A maximum-likelihood (ML) encoder for D when it is used for W is any encoder E:M→Xn satisfying
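\[ \sum_{y_1^n\in D^{-1}(m)}\prod_{i=1}^{n}W(y_i|E_i(m))=\max_{x_1^n\in\mathcal{X}^n}\sum_{y_1^n\in D^{-1}(m)}\prod_{i=1}^{n}W(y_i|x_i),\quad\forall m\in\mathcal{M}, \]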
where (E1(m),…,En(m))=E(m)∈Xn.
It is easy to see that a maximum-likelihood encoder has the best probability of error among all encoders (assuming that the decoder D is used). The probability of error of D under ML-encoding for W is given by:
[TABLE]
Proposition 3.
Let X′,X and Y be three finite sets. If W∈DMCX,Y is input-degraded from W′∈DMCX′,Y, then Pe,D(W′)≤Pe,D(W) for every decoder D on Y. Moreover, if W and W′ are input-equivalent, then Pe,D(W)=Pe,D(W′) for every decoder D on Y.
Proof.
Assume that W∈DMCX,Y is input-degraded from W′∈DMCX′,Y. Let V′∈DMCX,X′ be such that W=W′∘V′.
Fix an (n,M) decoder D on Y and let M be its message set. We have:
[TABLE]
Therefore Pe,D(W′)≤Pe,D(W).
If W and W′ are input-degraded from each other, then Pe,D(W′)≤Pe,D(W) and Pe,D(W)≤Pe,D(W′), hence Pe,D(W′)=Pe,D(W).
∎
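Proposition 3 can be illustrated for blocklength n=1. The sketch below assumes that messages are equiprobable and that the error is averaged over messages, which is an assumption about the convention rather than something fixed by the text above. The decoder is stored as a list with D[y] equal to the decoded message, and all names and example channels are hypothetical. The channel `W` is input-degraded from `W_prime` because every row of `W` is a convex combination of the rows of `W_prime`.

```python
from fractions import Fraction as F

def error_under_ml_encoding(W, D, n_messages):
    """Average error of a blocklength-1 decoder D under ML encoding.

    W[x][y] = W(y|x) and D[y] is the message decoded from output y.
    Each message m is sent with the input x maximizing P[D(Y) = m | x].
    """
    total = F(0)
    for m in range(n_messages):
        total += max(sum(row[y] for y in range(len(D)) if D[y] == m)
                     for row in W)
    return 1 - total / n_messages

W_prime = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]
W = [[F(7, 10), F(3, 10)], [F(1, 2), F(1, 2)], [F(1, 5), F(4, 5)]]
D = [0, 1]                         # decode output y as message y

pe_prime = error_under_ml_encoding(W_prime, D, 2)
pe = error_under_ml_encoding(W, D, 2)
print(pe_prime, pe, pe_prime <= pe)
```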
IV-B A characterization of input-degradedness
Let W∈DMCX,Y and let U be a finite set. For every p∈ΔU and every D∈DMCY,U, define
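\[ P_c(p,W,D)=\sup_{E\in\operatorname{DMC}_{\mathcal{U},\mathcal{X}}}\sum_{u\in\mathcal{U},\,x\in\mathcal{X},\,y\in\mathcal{Y}}p(u)E(x|u)W(y|x)D(u|y). \]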
Pc(p,W,D) can be interpreted as follows: let U be a random variable in U distributed as p. Assume that U was encoded using the random encoder E∈DMCU,X to get X∈X. Send X through the channel W and let Y∈Y be the output. Apply the random decoder D∈DMCY,U on Y to get an estimate U^ of U. We have:
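\[ \mathbb{P}[\hat{U}=U]=\sum_{u\in\mathcal{U},\,x\in\mathcal{X},\,y\in\mathcal{Y}}p(u)E(x|u)W(y|x)D(u|y). \]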
Therefore, Pc(p,W,D) is the optimal probability of successfully estimating U by the fixed decoder D among all random encoders E∈DMCU,X. Note that the optimal encoder can always be chosen to be deterministic.
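Because a deterministic encoder is optimal, Pc(p,W,D) can be evaluated by choosing, for each u, the input x that maximizes the probability that the decoder outputs u. The following sketch assumes the row-list representation W[x][y]=W(y∣x) and D[y][u]=D(u∣y); the function name `p_correct` and the example data are hypothetical.

```python
from fractions import Fraction as F

def p_correct(p, W, D):
    """P_c(p, W, D) = sum_u p(u) * max_x sum_y W(y|x) D(u|y)."""
    return sum(
        p[u] * max(sum(W_x[y] * D[y][u] for y in range(len(D))) for W_x in W)
        for u in range(len(p))
    )

p = [F(1, 2), F(1, 2)]
D = [[F(1), F(0)], [F(0), F(1)]]   # identity decoder: estimate u = y
W_prime = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]
W = [[F(7, 10), F(3, 10)], [F(1, 2), F(1, 2)], [F(1, 5), F(4, 5)]]

print(p_correct(p, W, D), p_correct(p, W_prime, D))
```

In this example every row of `W` lies in the convex hull of the rows of `W_prime`, and the computed values are ordered accordingly.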
Theorem 2.
A channel W∈DMCX,Y is input-degraded from another channel W′∈DMCX′,Y if and only if Pc(p,W,D)≤Pc(p,W′,D) for every p∈ΔU, every D∈DMCY,U and every finite set U.
Proof.
Assume that W is input-degraded from W′. There exists V′∈DMCX,X′ such that W=W′∘V′. For every finite set U, every p∈ΔU and every D∈DMCY,U, we have:
[TABLE]
Conversely, assume that Pc(p,W,D)≤Pc(p,W′,D) for every p∈ΔU, every D∈DMCY,U and every finite set U.
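Let x0 be any symbol that does not belong to X and let U=X∪{x0}. For every n≥1, define pn∈ΔU as follows:
\[ p_n(u)=\begin{cases}\dfrac{1}{(n+1)|\mathcal{X}|}&\text{if }u\in\mathcal{X},\\[2mm]\dfrac{n}{n+1}&\text{if }u=x_0.\end{cases} \]
pn was chosen in such a way that \(\frac{p_n(x_0)}{p_n(x)}=n|\mathcal{X}|\) for every x∈X. This is going to be useful later. Define the channel W0∈DMCU,Y as follows: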
[TABLE]
Fix the encoder E∈DMCU,X as follows:
[TABLE]
For every D∈DMCY,U, we have:
[TABLE]
Therefore,
[TABLE]
hence
[TABLE]
or equivalently
[TABLE]
Note that the sets DMCY,U and DMCU,X′ are compact and convex. On the other hand, since the function \(\sum_{u\in\mathcal{U},\,y\in\mathcal{Y}}p_n(u)\big(W_0(y|u)-(W'\circ E')(y|u)\big)D(u|y)\) is affine in both D∈DMCY,U and E′∈DMCU,X′, it is continuous, concave in D and convex in E′. Therefore, we can apply the minimax theorem [6] to exchange the max and the min in Equation (1). We obtain:
[TABLE]
Therefore, there exists En′∈DMCU,X′ such that
[TABLE]
where (a) follows from the fact that \(\sum_{u\in\mathcal{U},\,y\in\mathcal{Y}}p_n(u)\big(W_0(y|u)-(W'\circ E'_n)(y|u)\big)D(u|y)\) is maximized when D is chosen to be deterministic in such a way that for every \(y\in\mathcal{Y}\), \(D(u_y|y)=1\) for any \(u_y\in\mathcal{U}\) satisfying \(p_n(u_y)\big(W_0(y|u_y)-(W'\circ E'_n)(y|u_y)\big)=\max_{u\in\mathcal{U}}\big\{p_n(u)\big(W_0(y|u)-(W'\circ E'_n)(y|u)\big)\big\}\). We conclude that
[TABLE]
Assume there exists y∈Y and u~∈U such that
[TABLE]
In this case, we have
[TABLE]
which is a contradiction. Therefore, for every y∈Y and every x∈X, we have
[TABLE]
which implies that
[TABLE]
Since the space DMCU,X′ is compact, there exists a converging subsequence (Enk′)k≥0 of (En′)n≥1. Let E′ be the limit of (Enk′)k≥0. For every x∈X and every y∈Y, we have:
[TABLE]
which means that W(y∣x)=(W′∘E′)(y∣x). Define V′∈DMCX,X′ as V′(x′∣x)=E′(x′∣x) for every x∈X and every x′∈X′. For every x∈X and every y∈Y, we have:
[TABLE]
Therefore, W=W′∘V′. We conclude that W is input-degraded from W′.
∎
IV-C A characterization in terms of randomized games
A randomized game is a 5-tuple G=(Z,X,Y,l,W) such that X,Y and Z are finite sets, l is a mapping from Z×Y to R, and W∈DMCX,Y. The mapping l is called the payoff function of the game G, and the channel W is called the randomizer of G. During the game, a player sees a symbol z∈Z and decides on a symbol x∈X. A symbol y∈Y is then generated at random according to the conditional probability distribution W(y∣x) and the player gets the payoff l(z,y).
A strategy for the game G is a channel S∈DMCZ,X. For every z∈Z, the payoff gained by the strategy S for z in the game G is given by:
\[ \$(z,S,\mathcal{G})=\sum_{x\in\mathcal{X}}S(x|z)\sum_{y\in\mathcal{Y}}W(y|x)\,l(z,y). \]
The payoff vector gained by the strategy S in the game G is given by:
\[ \vec{\$}(S,\mathcal{G})=\big(\$(z,S,\mathcal{G})\big)_{z\in\mathcal{Z}}\in\mathbb{R}^{\mathcal{Z}}. \]
It is easy to see that for every α∈[0,1] and every S1,S2∈DMCZ,X, we have
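\[ \vec{\$}\big(\alpha S_1+(1-\alpha)S_2,\mathcal{G}\big)=\alpha\,\vec{\$}(S_1,\mathcal{G})+(1-\alpha)\,\vec{\$}(S_2,\mathcal{G}). \]
The achievable payoff region for the game G is given by:
\[ \$_{\operatorname{ach}}(\mathcal{G})=\big\{\vec{\$}(S,\mathcal{G}):\;S\in\operatorname{DMC}_{\mathcal{Z},\mathcal{X}}\big\}\subset\mathbb{R}^{\mathcal{Z}}. \]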
The average payoff for the strategy S∈DMCZ,X for the game G is given by:
[TABLE]
The optimal average payoff for the game G is given by
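\[ \$_{\operatorname{opt}}(\mathcal{G})=\sup_{S\in\operatorname{DMC}_{\mathcal{Z},\mathcal{X}}}\frac{1}{|\mathcal{Z}|}\sum_{z\in\mathcal{Z}}\$(z,S,\mathcal{G}). \]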
Note that we can always find an optimal strategy that is deterministic.
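The claim that a deterministic strategy is always optimal can be checked by brute force on a small game. The sketch below assumes that the average payoff weights every z∈Z uniformly by 1/∣Z∣; the payoff table `l`, the randomizer `W` and all function names are hypothetical. It compares an exhaustive search over deterministic strategies with the value obtained by picking the best x separately for each z.

```python
from fractions import Fraction as F
from itertools import product

def payoff(z, S, l, W):
    """Payoff of strategy S for symbol z: sum_x S(x|z) sum_y W(y|x) l(z,y)."""
    return sum(S[z][x] * sum(W[x][y] * l[z][y] for y in range(len(l[z])))
               for x in range(len(W)))

def average_payoff(S, l, W):
    """Uniform average of the payoffs over z (an assumed convention)."""
    return sum(payoff(z, S, l, W) for z in range(len(l))) / len(l)

def optimal_average_payoff(l, W):
    """Best x chosen independently for every z."""
    return sum(max(sum(W_x[y] * l_z[y] for y in range(len(l_z))) for W_x in W)
               for l_z in l) / len(l)

def brute_force_optimum(l, W):
    """Maximum of the average payoff over all deterministic strategies."""
    n_z, n_x = len(l), len(W)
    best = None
    for choice in product(range(n_x), repeat=n_z):
        S = [[F(1) if x == choice[z] else F(0) for x in range(n_x)]
             for z in range(n_z)]
        value = average_payoff(S, l, W)
        best = value if best is None else max(best, value)
    return best

l = [[F(1), F(0)], [F(0), F(2)]]   # l[z][y]
W = [[F(7, 10), F(3, 10)], [F(1, 2), F(1, 2)], [F(1, 5), F(4, 5)]]
print(optimal_average_payoff(l, W), brute_force_optimum(l, W))
```

Since the average payoff is affine in S, its maximum over the convex set of strategies is attained at an extreme point, which is why the two values coincide.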
The following theorem provides a characterization of input-degradedness that is similar to the famous Blackwell-Sherman-Stein theorem [2], [7], [8].
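Theorem 3.
Let X,X′ and Y be three finite sets. Let W∈DMCX,Y and W′∈DMCX′,Y. The following conditions are equivalent:
(a)
W is input-degraded from W′.
(b)
For every finite set Z and every payoff function l:Z×Y→R, we have
\[ \$_{\operatorname{ach}}(\mathcal{Z},\mathcal{X},\mathcal{Y},l,W)\subset\$_{\operatorname{ach}}(\mathcal{Z},\mathcal{X}',\mathcal{Y},l,W'). \]
(c)
For every finite set Z and every payoff function l:Z×Y→R, we have
\[ \$_{\operatorname{opt}}(\mathcal{Z},\mathcal{X},\mathcal{Y},l,W)\leq\$_{\operatorname{opt}}(\mathcal{Z},\mathcal{X}',\mathcal{Y},l,W'). \]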
Proof.
Assume that (a) is true. There exists V′∈DMCX,X′ such that W=W′∘V′. Fix a finite set Z and a payoff function l:Z×Y→R. Define G=(Z,X,Y,l,W) and G′=(Z,X′,Y,l,W′).
Fix \(\vec{v}=(v_z)_{z\in\mathcal{Z}}\in\$_{\operatorname{ach}}(\mathcal{G})\). There exists \(S\in\operatorname{DMC}_{\mathcal{Z},\mathcal{X}}\) such that \((v_z)_{z\in\mathcal{Z}}=\vec{v}=\big(\$(z,S,\mathcal{G})\big)_{z\in\mathcal{Z}}\). Let \(S'=V'\circ S\). For every \(z\in\mathcal{Z}\), we have:
[TABLE]
Therefore, \(\vec{v}=\vec{\$}(S',\mathcal{G}')\in\$_{\operatorname{ach}}(\mathcal{G}')\). Since this is true for every \(\vec{v}\in\$_{\operatorname{ach}}(\mathcal{G})\), we have \(\$_{\operatorname{ach}}(\mathcal{G})\subset\$_{\operatorname{ach}}(\mathcal{G}')\). We conclude that (a) implies (b).
Now assume that (b) is true. Fix a finite set Z and a payoff function l:Z×Y→R. Define G=(Z,X,Y,l,W) and G′=(Z,X′,Y,l,W′). We have \(\$_{\operatorname{ach}}(\mathcal{G})\subset\$_{\operatorname{ach}}(\mathcal{G}')\). Therefore,
[TABLE]
where (∗) follows from the fact that \(\$_{\operatorname{ach}}(\mathcal{G})\subset\$_{\operatorname{ach}}(\mathcal{G}')\). This shows that (b) implies (c).
Now assume that (c) is true. Fix a finite set U, p∈ΔU and D∈DMCY,U. Define the payoff function l:U×Y→R as l(u,y)=∣U∣p(u)D(u∣y). Define the randomized games G=(U,X,Y,l,W) and G′=(U,X′,Y,l,W′). We have:
[TABLE]
Similarly, we can show that \(P_c(p,W',D)=\$_{\operatorname{opt}}(\mathcal{G}')\). Since we assumed that (c) is true, we have \(\$_{\operatorname{opt}}(\mathcal{G})\leq\$_{\operatorname{opt}}(\mathcal{G}')\). Therefore, for every finite set \(\mathcal{U}\), every \(p\in\Delta_{\mathcal{U}}\) and every \(D\in\operatorname{DMC}_{\mathcal{Y},\mathcal{U}}\), we have \(P_c(p,W,D)\leq P_c(p,W',D)\). Theorem 2 now implies that W is input-degraded from W′, hence (c) implies (a). We conclude that (a), (b) and (c) are equivalent.
∎
V Space of input-equivalent channels from X to Y
V-A The DMCX,Y(i) space
Let X and Y be two finite sets. Define the equivalence relation RX,Y(i) on DMCX,Y as follows:
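\[ W\,R_{\mathcal{X},\mathcal{Y}}^{(i)}\,W'\;\Leftrightarrow\;W\text{ is input-equivalent to }W'. \]
Definition 1.
The space of input-equivalent channels with input alphabet X and output alphabet Y is the quotient of the space of channels from X to Y by the input-equivalence relation:
\[ \operatorname{DMC}_{\mathcal{X},\mathcal{Y}}^{(i)}=\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}/R_{\mathcal{X},\mathcal{Y}}^{(i)}. \]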
We define the topology TX,Y(i) on DMCX,Y(i) as the quotient topology TX,Y/RX,Y(i).
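Due to Proposition 2, we can define the input-equivalence characteristic of \(\hat{W}\in\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}^{(i)}\) as \(\operatorname{CE}(\hat{W}):=\operatorname{CE}(W')\) for any \(W'\in\hat{W}\). Define \(\operatorname{co}(\hat{W}):=\operatorname{co}(\operatorname{CE}(\hat{W}))\). It is easy to see that \(\operatorname{co}(\hat{W})=\operatorname{co}(\{W'_x:\;x\in\mathcal{X}\})\) for any \(W'\in\hat{W}\).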
Let A and B be two sets. A coupling of A and B is a subset R of A×B such that
\[ \{a\in A:\;\exists b\in B,\;(a,b)\in R\}=A, \]
and
\[ \{b\in B:\;\exists a\in A,\;(a,b)\in R\}=B. \]
We denote the set of couplings of A and B as R(A,B).
We define the similarity distance on DMCX,Y(i) as follows:
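\[ d_{\mathcal{X},\mathcal{Y}}^{(i)}(\hat{W}_1,\hat{W}_2)=\inf_{R\in\mathcal{R}(\operatorname{co}(\hat{W}_1),\operatorname{co}(\hat{W}_2))}\;\sup_{(P_1,P_2)\in R}\|P_1-P_2\|_{TV}. \]
Proposition 4.
\((\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}^{(i)},d_{\mathcal{X},\mathcal{Y}}^{(i)})\) is a metric space.
Proof.
We will show that \(d_{\mathcal{X},\mathcal{Y}}^{(i)}(\hat{W}_1,\hat{W}_2)=d_H\big(\operatorname{co}(\hat{W}_1),\operatorname{co}(\hat{W}_2)\big)\), where \(d_H\) is the Hausdorff metric on \(\mathcal{K}(\Delta_{\mathcal{Y}})\) corresponding to the total variation distance on \(\Delta_{\mathcal{Y}}\). Define \(K_1=\operatorname{co}(\hat{W}_1)\) and \(K_2=\operatorname{co}(\hat{W}_2)\), and let \(R\in\mathcal{R}(K_1,K_2)\). For every \((P_1,P_2)\in R\), we have: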
[TABLE]
Therefore,
[TABLE]
Similarly,
[TABLE]
Hence,
[TABLE]
We conclude that
[TABLE]
Let P1∈K1. Since K2 is compact, there exists P~2(P1)∈K2 such that
[TABLE]
Similarly, for every \(P_2\in K_2\), there exists \(\tilde{P}_1(P_2)\in K_1\) such that \(\|P_2-\tilde{P}_1(P_2)\|_{TV}=\inf_{P_1\in K_1}\|P_1-P_2\|_{TV}\). Define the coupling \(R_0\in\mathcal{R}(K_1,K_2)\) as
[TABLE]
We have:
[TABLE]
We conclude that \(d_{\mathcal{X},\mathcal{Y}}^{(i)}(\hat{W}_1,\hat{W}_2)=d_H(K_1,K_2)=d_H\big(\operatorname{co}(\hat{W}_1),\operatorname{co}(\hat{W}_2)\big)\), hence \(d_{\mathcal{X},\mathcal{Y}}^{(i)}\) is a metric.
∎
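The identification of the similarity distance with a Hausdorff distance between convex hulls gives a closed form when ∣Y∣=2. In that case each hull is an interval of values a=W(0∣x), the total variation distance between (a,1−a) and (b,1−b) is ∣a−b∣, and the Hausdorff distance between two intervals is the larger of the two endpoint gaps. The sketch below is restricted to this special case. The function `channel_distance` assumes that the distance on DMCX,Y is the maximum over inputs of the total variation distance between rows, and all names and channels are hypothetical.

```python
from fractions import Fraction as F

def similarity_distance_binary(W1, W2):
    """Hausdorff distance between the interval hulls of two binary-output channels."""
    a1 = [row[0] for row in W1]
    a2 = [row[0] for row in W2]
    return max(abs(min(a1) - min(a2)), abs(max(a1) - max(a2)))

def channel_distance(W1, W2):
    """Assumed form of the channel metric: max over inputs of TV between rows."""
    return max(sum(abs(p - q) for p, q in zip(r1, r2)) / 2
               for r1, r2 in zip(W1, W2))

W1 = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]
W2 = [[F(1, 10), F(9, 10)], [F(9, 10), F(1, 10)]]   # rows of W1 swapped
W3 = [[F(7, 10), F(3, 10)], [F(1, 5), F(4, 5)]]

print(similarity_distance_binary(W1, W2), channel_distance(W1, W2))
print(similarity_distance_binary(W1, W3), channel_distance(W1, W3))
```

Swapping the rows of a channel leaves its equivalence class unchanged, so the similarity distance is zero even though the two channels are far apart as elements of DMCX,Y.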
Proposition 5.
Let W,W′∈DMCX,Y and let W^ and W^′ be the RX,Y(i)-equivalence classes of W and W′ respectively. We have dX,Y(i)(W^,W^′)≤dX,Y(W,W′).
Proof.
Define R0⊂co(W^)×co(W^′) as follows:
[TABLE]
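Clearly, \(R_0\) is a coupling of \(\operatorname{co}(\hat{W})\) and \(\operatorname{co}(\hat{W}')\). For every \((P_1,P_2)\in R_0\), there exists \((\lambda_x)_{x\in\mathcal{X}}\in[0,1]^{\mathcal{X}}\) such that \(\sum_{x\in\mathcal{X}}\lambda_x=1\), \(P_1=\sum_{x\in\mathcal{X}}\lambda_x W_x\) and \(P_2=\sum_{x\in\mathcal{X}}\lambda_x W'_x\). We have: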
[TABLE]
Therefore,
[TABLE]
∎
Theorem 4.
The topology induced by dX,Y(i) on DMCX,Y(i) is the same as the quotient topology TX,Y(i). Moreover, (DMCX,Y(i),dX,Y(i)) is compact and path-connected.
Proof.
Since (DMCX,Y,dX,Y) is compact and path-connected, the quotient space (DMCX,Y(i),TX,Y(i)) is compact and path-connected.
Define the mapping Proj:DMCX,Y→DMCX,Y(i) as Proj(W)=W^, where W^ is the RX,Y(i)-equivalence class of W. Proposition 5 implies that Proj is a continuous mapping from (DMCX,Y,dX,Y) to (DMCX,Y(i),dX,Y(i)). Since Proj(W) depends only on W^, Lemma 1 implies that the transcendent mapping of Proj defined on the quotient space (DMCX,Y(i),TX,Y(i)) is continuous. But the transcendent mapping of Proj is nothing but the identity on DMCX,Y(i). Therefore, the identity mapping id on DMCX,Y(i) is a continuous mapping from (DMCX,Y(i),TX,Y(i)) to (DMCX,Y(i),dX,Y(i)).
For every subset U of DMCX,Y(i) we have:
•
If U is open in (DMCX,Y(i),dX,Y(i)), then U=id−1(U) is open in (DMCX,Y(i),TX,Y(i)).
•
If U is open in (DMCX,Y(i),TX,Y(i)), then its complement Uc is closed in (DMCX,Y(i),TX,Y(i)) which is compact, hence Uc is compact in (DMCX,Y(i),TX,Y(i)). This shows that Uc=id(Uc) is a compact subset of (DMCX,Y(i),dX,Y(i)). But (DMCX,Y(i),dX,Y(i)) is a metric space, so Uc is closed in (DMCX,Y(i),dX,Y(i)). Therefore, U is open in (DMCX,Y(i),dX,Y(i)).
We conclude that (DMCX,Y(i),TX,Y(i)) and (DMCX,Y(i),dX,Y(i)) have the same open sets. Therefore, the topology induced by dX,Y(i) on DMCX,Y(i) is the same as the quotient topology TX,Y(i). Now since (DMCX,Y(i),TX,Y(i)) is compact and path-connected, (DMCX,Y(i),dX,Y(i)) is compact and path-connected as well.
∎
In the rest of this paper, we always associate DMCX,Y(i) with the similarity metric dX,Y(i) and the quotient topology TX,Y(i).
V-B Canonical embedding and canonical identification
Let X1,X2 and Y be three finite sets such that ∣X1∣≤∣X2∣. We will show that there is a canonical embedding from DMCX1,Y(i) to DMCX2,Y(i). In other words, there exists an explicitly constructable compact subset A of DMCX2,Y(i) such that A is homeomorphic to DMCX1,Y(i). A and the homeomorphism depend only on X1,X2 and Y (this is why we say that they are canonical). Moreover, we can show that A depends only on ∣X1∣, X2 and Y.
Lemma 2.
For every W∈DMCX1,Y and every surjection f from X2 to X1, W is input-equivalent to W∘Df.
Proof.
Clearly W∘Df is input-degraded from W. Now let f′ be any mapping from X1 to X2 such that f(f′(x1))=x1 for every x1∈X1. We have W=W∘(Df∘Df′)=(W∘Df)∘Df′, and so W is also input-degraded from W∘Df.
∎
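Lemma 2 can be seen directly in the row representation: composing W with Df simply relabels and repeats the rows of W, so the set of rows, and therefore the convex hull, is unchanged when f is surjective. The sketch below assumes W[x][y]=W(y∣x) and stores f as a list; the helper `precompose`, the map `f` and its right inverse are hypothetical names introduced for illustration.

```python
from fractions import Fraction as F

def precompose(W, f):
    """Rows of W∘D_f: the new input x2 behaves like input f[x2] of W."""
    return [list(W[f[x2]]) for x2 in range(len(f))]

W = [[F(9, 10), F(1, 10)], [F(1, 10), F(9, 10)]]
f = [0, 1, 1, 0]                   # surjection from {0,1,2,3} onto {0,1}

# A right inverse f' with f(f'(x1)) = x1, as used in the proof.
f_right_inverse = [f.index(x1) for x1 in range(len(W))]

W_f = precompose(W, f)             # W∘D_f, a channel with four inputs
recovered = precompose(W_f, f_right_inverse)   # (W∘D_f)∘D_f'
print({tuple(r) for r in W_f} == {tuple(r) for r in W}, recovered == W)
```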
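Corollary 2.
For every W,W′∈DMCX1,Y and every two surjections f,g from X2 to X1, we have:
\[ W\,R_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\,W'\;\Leftrightarrow\;(W\circ D_f)\,R_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\,(W'\circ D_g). \]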
Proof.
Since W is input-equivalent to W∘Df and W′ is input-equivalent to W′∘Dg, then W is input-equivalent to W′ if and only if W∘Df is input-equivalent to W′∘Dg.
∎
For every W∈DMCX1,Y, we denote the RX1,Y(i)-equivalence class of W as W^, and for every W∈DMCX2,Y, we denote the RX2,Y(i)-equivalence class of W as W~.
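Proposition 6.
Let X1,X2 and Y be three finite sets such that ∣X1∣≤∣X2∣. Let f:X2→X1 be any fixed surjection from X2 to X1. Define the mapping \(F:\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\to\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\) as \(F(\hat{W})=\widetilde{W'\circ D_f}=\operatorname{Proj}_2(W'\circ D_f)\), where \(W'\in\hat{W}\) and \(\operatorname{Proj}_2\) is the projection onto the \(R_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\)-equivalence classes. We have: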
•
F is well defined, i.e., \(F(\hat{W})\) does not depend on \(W'\in\hat{W}\).
•
F is a homeomorphism from \(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\) to \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\subset\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\).
•
F does not depend on the surjection f. It depends only on X1, X2 and Y, hence it is canonical.
•
\(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\) depends only on ∣X1∣, X2 and Y.
•
For every W′∈W^ and every W′′∈F(W^), W′ is input-equivalent to W′′.
Proof.
Corollary 2 implies that Proj2(W∘Df)=Proj2(W′∘Df) if and only if WRX1,Y(i)W′. Therefore, Proj2(W′∘Df) does not depend on W′∈W^, hence F is well defined. Corollary 2 also shows that Proj2(W′∘Df) does not depend on the particular choice of the surjection f, hence it is canonical (i.e., it depends only on X1,X2 and Y).
On the other hand, the mapping W→W∘Df is a continuous mapping from DMCX1,Y to DMCX2,Y, and Proj2 is continuous. Therefore, the mapping W→Proj2(W∘Df) is a continuous mapping from DMCX1,Y to DMCX2,Y(i). Now since Proj2(W∘Df) depends only on the RX1,Y(i)-equivalence class W^ of W, Lemma 1 implies that the transcendent mapping of W→Proj2(W∘Df) that is defined on DMCX1,Y(i) is continuous. Therefore, F is a continuous mapping from (DMCX1,Y(i),TX1,Y(i)) to (DMCX2,Y(i),TX2,Y(i)). Moreover, we can see from Corollary 2 that F is an injection.
For every closed subset B of DMCX1,Y(i), B is compact since DMCX1,Y(i) is compact, hence F(B) is compact because F is continuous. This implies that F(B) is closed in DMCX2,Y(i) since DMCX2,Y(i) is Hausdorff (as it is metrizable). Therefore, F is a closed mapping.
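Now since F is an injection that is both continuous and closed, F is a homeomorphism between \(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\) and \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\subset\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\).
We would like now to show that \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\) depends only on \(|\mathcal{X}_1|\), \(\mathcal{X}_2\) and \(\mathcal{Y}\). Let \(\mathcal{X}_1'\) be a finite set such that \(|\mathcal{X}_1|=|\mathcal{X}_1'|\). For every \(W\in\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}\), let \(\overline{W}\in\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\) be the \(R_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\)-equivalence class of W.
Let \(g:\mathcal{X}_1\to\mathcal{X}_1'\) be a fixed bijection from \(\mathcal{X}_1\) to \(\mathcal{X}_1'\) and let \(f'=g\circ f\). Define \(F':\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\to\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\) as \(F'(\overline{W})=\widetilde{W'\circ D_{f'}}=\operatorname{Proj}_2(W'\circ D_{f'})\), where \(W'\in\overline{W}\). As above, F′ is well defined, and it is a homeomorphism from \(\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\) to \(F'\big(\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\big)\). We want to show that \(F'\big(\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\big)=F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\). For every \(\overline{W}\in\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\), let \(W'\in\overline{W}\). We have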
[TABLE]
Since this is true for every \(\overline{W}\in\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\), we deduce that \(F'\big(\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\big)\subset F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\). By exchanging the roles of \(\mathcal{X}_1\) and \(\mathcal{X}_1'\) and using the fact that \(f=g^{-1}\circ f'\), we get \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\subset F'\big(\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\big)\). We conclude that \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)=F'\big(\operatorname{DMC}_{\mathcal{X}_1',\mathcal{Y}}^{(i)}\big)\), which means that \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)\) depends only on \(|\mathcal{X}_1|\), \(\mathcal{X}_2\) and \(\mathcal{Y}\).
For every \(W'\in\hat{W}\) and every \(W''\in F(\hat{W})=\widetilde{W'\circ D_f}\), W′′ is input-equivalent to \(W'\circ D_f\) and \(W'\circ D_f\) is input-equivalent to W′ (by Lemma 2), hence W′′ is input-equivalent to W′.
∎
Corollary 3.
If ∣X1∣=∣X2∣, there exists a canonical homeomorphism from DMCX1,Y(i) to DMCX2,Y(i) depending only on X1,X2 and Y.
Proof.
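Let f be a bijection from X2 to X1. Define the mapping \(F:\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\to\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\) as
\(F(\hat{W})=\widetilde{W'\circ D_f}=\operatorname{Proj}_2(W'\circ D_f)\), where \(W'\in\hat{W}\) and \(\operatorname{Proj}_2:\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}\to\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\) is the projection onto the \(R_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\)-equivalence classes.
Also, define the mapping \(F':\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\to\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\) as
\(F'(\tilde{V})=\widehat{V'\circ D_{f^{-1}}}=\operatorname{Proj}_1(V'\circ D_{f^{-1}})\), where \(V'\in\tilde{V}\) and \(\operatorname{Proj}_1:\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}\to\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\) is the projection onto the \(R_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\)-equivalence classes.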
Proposition 6 shows that F and F′ are well defined.
For every W∈DMCX1,Y, we have:
[TABLE]
where (a) follows from the fact that \(W\in\hat{W}\) and (b) follows from the fact that \(W\circ D_f\in\widetilde{W\circ D_f}\).
We can similarly show that \(F(F'(\tilde{V}))=\tilde{V}\) for every \(\tilde{V}\in\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\). Therefore, both F and F′ are bijections. Proposition 6 now implies that F is a homeomorphism from \(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\) to \(F\big(\operatorname{DMC}_{\mathcal{X}_1,\mathcal{Y}}^{(i)}\big)=\operatorname{DMC}_{\mathcal{X}_2,\mathcal{Y}}^{(i)}\). Moreover, F depends only on \(\mathcal{X}_1\), \(\mathcal{X}_2\) and \(\mathcal{Y}\).
∎
Corollary 3 allows us to identify DMCX,Y(i) with DMC[n],Y(i) through the canonical homeomorphism, where n=∣X∣ and [n]={1,…,n}. Moreover, for every 1≤n≤m, Proposition 6 allows us to identify DMC[n],Y(i) with the canonical subspace of DMC[m],Y(i) that is homeomorphic to DMC[n],Y(i). In the rest of this paper, we consider that DMC[n],Y(i) is a compact subspace of DMC[m],Y(i).
Intuitively, DMC[n],Y(i) has a “lower dimension” compared to DMC[m],Y(i). So one expects that the interior of DMC[n],Y(i) in (DMC[m],Y(i),T[m],Y(i)) is empty if m>n. The following proposition shows that this intuition is accurate when ∣Y∣≥3.
Proposition 7.
We have:
•
If ∣Y∣=1, then DMC[n],Y(i)=DMC[1],Y(i) for every n≥1.
•
If ∣Y∣=2, then DMC[n],Y(i)=DMC[2],Y(i) for every n≥2.
•
If ∣Y∣≥3, then for every 1≤n<m, the interior of DMC[n],Y(i) in (DMC[m],Y(i),T[m],Y(i)) is empty.
The previous section showed that if we are interested in input-equivalent channels, it is sufficient to study the spaces DMC[n],Y and DMC[n],Y(i) for every n≥1, where [n]={1,…,n}. Define the space
\[ \operatorname{DMC}_{\ast,\mathcal{Y}}=\coprod_{n\geq 1}\operatorname{DMC}_{[n],\mathcal{Y}}, \]
where ∐ is the disjoint union symbol. The subscript ∗ indicates that the input alphabets of the considered channels are arbitrary but finite. We define the equivalence relation R∗,Y(i) on DMC∗,Y as follows:
$$W\,R_{\ast,\mathcal{Y}}^{(i)}\,W'\;\Leftrightarrow\;W\text{ is input-equivalent to }W'.$$
Definition 2.
The space of input-equivalent channels with output alphabet Y is the quotient of the space of channels with output alphabet Y by the input-equivalence relation:
$$\mathrm{DMC}_{\ast,\mathcal{Y}}^{(i)}=\mathrm{DMC}_{\ast,\mathcal{Y}}/R_{\ast,\mathcal{Y}}^{(i)}.$$
Clearly, DMC[n],Y/R∗,Y(i) can be canonically identified with DMC[n],Y/R[n],Y(i)=DMC[n],Y(i). Therefore, we can write
$$\mathrm{DMC}_{\ast,\mathcal{Y}}^{(i)}=\bigcup_{n\geq 1}\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}.$$
We define the input-rank of W^∈DMC∗,Y(i) as the size of its characteristic: irank(W^)=∣CE(W^)∣. Due to Proposition 2, we have
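$$\operatorname{irank}(\hat{W})=\min\big\{n\geq 1:\;\hat{W}\in\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\big\}.$$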
A subset A of DMC∗,Y(i) is said to be rank-bounded if there exists n≥1 such that A⊂DMC[n],Y(i).
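The input-rank can be made concrete for small output alphabets. The following is a minimal sketch, assuming that the characteristic CE(W) is the set of extreme points of the convex hull of the rows W(·|x) (as suggested by the term "convex-extreme points" used in this paper) and restricting to ∣Y∣≤3 so that a planar convex hull suffices. The function name `input_rank` and the matrix representation `W[x][y] = W(y|x)` are illustrative choices, not notation from the paper.

```python
from fractions import Fraction

def _cross(o, a, b):
    """Cross product of vectors OA and OB; > 0 means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _half_hull(points):
    """One half of Andrew's monotone chain; collinear points are discarded."""
    hull = []
    for p in points:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

def input_rank(W):
    """Number of convex-extreme rows of a row-stochastic matrix with at most 3 columns.

    W[x][y] is the probability of output y given input x. Entries should be
    exact rationals (ints, Fractions or strings like "1/3") so that the
    collinearity tests are exact.
    """
    rows = {tuple(Fraction(p) for p in row) for row in W}
    width = len(next(iter(rows)))
    if width > 3:
        raise ValueError("this sketch only handles output alphabets of size <= 3")
    if width == 1:
        return 1  # the simplex is a single point
    # A distribution on at most 3 symbols is determined by its first two coordinates.
    pts = sorted({(r[0], r[1]) for r in rows})
    if len(pts) <= 2:
        return len(pts)
    lower = _half_hull(pts)
    upper = _half_hull(reversed(pts))
    return len(lower) + len(upper) - 2  # the two endpoints are shared by both halves
```

For a binary output alphabet the rows are collinear, so the function never returns more than 2, in line with the second item of Proposition 7.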
VI-A Natural topologies on DMC∗,Y(i)
Since DMC∗,Y(i) is the quotient of DMC∗,Y and since DMC∗,Y was not given any topology, there is no “standard topology” on DMC∗,Y(i). However, there are many properties that one may require from any “reasonable” topology on DMC∗,Y(i). In this paper, we focus on one particular requirement that we consider the most basic property required from any “acceptable” topology on DMC∗,Y(i):
Definition 3.
A topology T on DMC∗,Y(i) is said to be natural if it induces the quotient topology T[n],Y(i) on DMC[n],Y(i) for every n≥1.
We consider such a topology natural because the quotient topology T[n],Y(i) is the “standard” and “most natural” topology on DMC[n],Y(i). Therefore, we do not want to induce any non-standard topology on DMC[n],Y(i) by relativization.
Proposition 8.
Every natural topology is σ-compact, separable and path-connected.
Proof.
Since DMC∗,Y(i) is the countable union of compact and separable subspaces (namely {DMC[n],Y(i)}n≥1), DMC∗,Y(i) is σ-compact and separable as well.
On the other hand, since ⋂n≥1 DMC[n],Y(i)=DMC[1],Y(i)≠∅ and since DMC[n],Y(i) is path-connected for every n≥1, the union DMC∗,Y(i)=⋃n≥1 DMC[n],Y(i) is path-connected: any two points of the union can be joined by a path that passes through a point of DMC[1],Y(i).
∎
Proposition 7 implies that if ∣Y∣=1, then DMC∗,Y(i)=DMC[1],Y(i), and so the only natural topology on DMC∗,Y(i) is T[1],Y(i). Similarly, if ∣Y∣=2, then DMC∗,Y(i)=DMC[2],Y(i), and the only natural topology on DMC∗,Y(i) is T[2],Y(i). In the rest of this section, we investigate the properties of natural topologies when ∣Y∣≥3.
Proposition 9.
If ∣Y∣≥3 and T is a natural topology, then every non-empty open set is rank-unbounded.
Proof.
Assume to the contrary that there exists a non-empty open set U∈T such that U⊂DMC[n],Y(i) for some n≥1. U∩DMC[n+1],Y(i) is open in DMC[n+1],Y(i) because T is natural. On the other hand, U∩DMC[n+1],Y(i)⊂U⊂DMC[n],Y(i). Proposition 7 now implies that U∩DMC[n+1],Y(i)=∅. Therefore,
$$U=U\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\subset U\cap\mathrm{DMC}_{[n+1],\mathcal{Y}}^{(i)}=\emptyset,$$
which is a contradiction.
∎
Corollary 4.
If ∣Y∣≥3 and T is a natural topology, then for every n≥1, the interior of DMC[n],Y(i) in (DMC∗,Y(i),T) is empty.
Proposition 10.
If ∣Y∣≥3 and T is a Hausdorff natural topology, then (DMC∗,Y(i),T) is not a Baire space.
Proof.
Fix n≥1. Since T is natural, DMC[n],Y(i) is a compact subset of (DMC∗,Y(i),T). But T is Hausdorff, so DMC[n],Y(i) is a closed subset of (DMC∗,Y(i),T). Therefore, DMC∗,Y(i)∖DMC[n],Y(i) is open.
On the other hand, Corollary 4 shows that the interior of DMC[n],Y(i) in (DMC∗,Y(i),T) is empty. Therefore, DMC∗,Y(i)∖DMC[n],Y(i) is dense in (DMC∗,Y(i),T).
Now since
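$$\bigcap_{n\geq 1}\Big(\mathrm{DMC}_{\ast,\mathcal{Y}}^{(i)}\setminus\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\Big)=\mathrm{DMC}_{\ast,\mathcal{Y}}^{(i)}\setminus\bigcup_{n\geq 1}\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}=\emptyset,$$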
and since DMC∗,Y(i)∖DMC[n],Y(i) is open and dense in (DMC∗,Y(i),T) for every n≥1, we conclude that (DMC∗,Y(i),T) is not a Baire space.
∎
Corollary 5.
If ∣Y∣≥3, no natural topology on DMC∗,Y(i) can be completely metrizable.
Proof.
The corollary follows from Proposition 10 and the fact that every completely metrizable topology is both Hausdorff and Baire.
∎
Proposition 11.
If ∣Y∣≥3 and T is a Hausdorff natural topology, then (DMC∗,Y(i),T) is not locally compact anywhere, i.e., for every W^∈DMC∗,Y(i), there is no compact neighborhood of W^ in (DMC∗,Y(i),T).
Proof.
Assume to the contrary that there exists a compact neighborhood K of W^. There exists an open set U such that W^∈U⊂K.
Since K is compact and Hausdorff, it is a Baire space. Moreover, since U is an open subset of K, U is also a Baire space.
Fix n≥1. Since the interior of DMC[n],Y(i) in (DMC∗,Y(i),T) is empty, the interior of U∩DMC[n],Y(i) in U is also empty. Therefore, U∖DMC[n],Y(i) is dense in U. On the other hand, since T is natural, DMC[n],Y(i) is compact which implies that it is closed because T is Hausdorff. Therefore, U∖DMC[n],Y(i) is open in U. Now since
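$$\bigcap_{n\geq 1}\Big(U\setminus\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\Big)=U\setminus\bigcup_{n\geq 1}\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}=\emptyset,$$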
and since U∖DMC[n],Y(i) is open and dense in U for every n≥1, U is not Baire, which is a contradiction. Therefore, there is no compact neighborhood of W^ in (DMC∗,Y(i),T).
∎
VII Strong topology on DMC∗,Y(i)
The first natural topology that we study is the strong topology Ts,∗,Y(i) on DMC∗,Y(i), which is the finest natural topology.
Since the spaces {DMC[n],Y}n≥1 are disjoint and since there is no a priori way to (topologically) compare channels in DMC[n],Y with channels in DMC[n′],Y for n≠n′, the “most natural” topology that we can define on DMC∗,Y is the disjoint union topology Ts,∗,Y:=⨁n≥1 T[n],Y. Clearly, the space (DMC∗,Y,Ts,∗,Y) is disconnected. Moreover, Ts,∗,Y is metrizable because it is the disjoint union of metrizable spaces. It is also σ-compact because it is the union of countably many compact spaces.
We added the subscript s to emphasize the fact that Ts,∗,Y is a strong topology (remember that the disjoint union topology is the finest topology that makes the canonical injections continuous).
Definition 4.
We define the strong topology Ts,∗,Y(i) on DMC∗,Y(i) as the quotient topology Ts,∗,Y/R∗,Y(i).
We call the open and closed sets of (DMC∗,Y(i),Ts,∗,Y(i)) strongly open and strongly closed sets, respectively.
Let Proj:DMC∗,Y→DMC∗,Y(i) be the projection onto the R∗,Y(i)-equivalence classes, and for every n≥1 let Projn:DMC[n],Y→DMC[n],Y(i) be the projection onto the R[n],Y(i)-equivalence classes. Due to the identifications that we made in Section VI, we have Proj(W)=Projn(W) for every W∈DMC[n],Y. Therefore, for every U⊂DMC∗,Y(i), we have
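$$\mathrm{Proj}^{-1}(U)=\coprod_{n\geq 1}\mathrm{Proj}_n^{-1}\big(U\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\big).$$
Hence,
$$\begin{aligned}
U\in\mathcal{T}_{s,\ast,\mathcal{Y}}^{(i)}
&\overset{(a)}{\Leftrightarrow}\mathrm{Proj}^{-1}(U)\in\mathcal{T}_{s,\ast,\mathcal{Y}}\\
&\overset{(b)}{\Leftrightarrow}\mathrm{Proj}_n^{-1}\big(U\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\big)\in\mathcal{T}_{[n],\mathcal{Y}},\;\forall n\geq 1\\
&\overset{(c)}{\Leftrightarrow}U\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\in\mathcal{T}_{[n],\mathcal{Y}}^{(i)},\;\forall n\geq 1,
\end{aligned}$$
where (a) and (c) follow from the properties of the quotient topology, and (b) follows from the properties of the disjoint union topology.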
We conclude that U⊂DMC∗,Y(i) is strongly open in DMC∗,Y(i) if and only if U∩DMC[n],Y(i) is open in DMC[n],Y(i) for every n≥1. This shows that the topology on DMC[n],Y(i) that is inherited from (DMC∗,Y(i),Ts,∗,Y(i)) is exactly T[n],Y(i). Therefore, Ts,∗,Y(i) is a natural topology. On the other hand, if T is an arbitrary natural topology and U∈T, then U∩DMC[n],Y(i) is open in DMC[n],Y(i) for every n≥1, so U∈Ts,∗,Y(i). We conclude that Ts,∗,Y(i) is the finest natural topology.
We can also characterize the strongly closed subsets of DMC∗,Y(i) in terms of the closed sets of the DMC[n],Y(i) spaces:
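$$A\text{ is strongly closed}\;\Leftrightarrow\;A\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\text{ is closed in }\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)},\;\forall n\geq 1.$$
Since DMC[n],Y(i) is metrizable for every n≥1, it is also normal. We can use this fact to prove that the strong topology on DMC∗,Y(i) is normal:
Lemma 3.
(DMC∗,Y(i),Ts,∗,Y(i)) is normal.
The following theorem shows that the strong topology satisfies many desirable properties.
Theorem 5.
(DMC∗,Y(i),Ts,∗,Y(i)) is a compactly generated, sequential and T4 space.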
Proof.
Since (DMC∗,Y,Ts,∗,Y) is metrizable, it is sequential. Therefore, (DMC∗,Y(i),Ts,∗,Y(i)), which is the quotient of a sequential space, is sequential.
Let us now show that DMC∗,Y(i) is T4. Fix W^∈DMC∗,Y(i). For every n≥1, the set {W^}∩DMC[n],Y(i) is either {W^} or ∅, depending on whether W^∈DMC[n],Y(i) or not. Since DMC[n],Y(i) is metrizable, it is T1 and so singletons are closed in DMC[n],Y(i). We conclude that in all cases, {W^}∩DMC[n],Y(i) is closed in DMC[n],Y(i) for every n≥1. Therefore, {W^} is strongly closed in DMC∗,Y(i). This shows that (DMC∗,Y(i),Ts,∗,Y(i)) is T1. On the other hand, Lemma 3 shows that (DMC∗,Y(i),Ts,∗,Y(i)) is normal. This means that (DMC∗,Y(i),Ts,∗,Y(i)) is T4, which implies that it is Hausdorff.
Now since (DMC∗,Y,Ts,∗,Y) is metrizable, it is compactly generated. On the other hand, the quotient space (DMC∗,Y(i),Ts,∗,Y(i)) was shown to be Hausdorff. Since a Hausdorff quotient of a compactly generated space is compactly generated, we conclude that (DMC∗,Y(i),Ts,∗,Y(i)) is compactly generated.
∎
Corollary 6.
If ∣Y∣≥3, (DMC∗,Y(i),Ts,∗,Y(i)) is not locally compact anywhere.
Proof.
Since Ts,∗,Y(i) is a natural Hausdorff topology, Proposition 11 implies that Ts,∗,Y(i) is not locally compact anywhere.
∎
As in the case of the space of equivalent channels [3], the space (DMC∗,Y(i),Ts,∗,Y(i)) fails to be first-countable (and hence it is not metrizable) when ∣Y∣≥3. This is one manifestation of the strength of the topology Ts,∗,Y(i). In order to show that (DMC∗,Y(i),Ts,∗,Y(i)) is not first-countable, we need to characterize the converging sequences in (DMC∗,Y(i),Ts,∗,Y(i)).
A sequence (W^n)n≥1 in DMC∗,Y(i) is said to be rank-bounded if irank(W^n) is bounded. (W^n)n≥1 is rank-unbounded if it is not rank-bounded.
The following proposition shows that no rank-unbounded sequence converges in (DMC∗,Y(i),Ts,∗,Y(i)).
Proposition 12.
A sequence (W^n)n≥0 converges in (DMC∗,Y(i),Ts,∗,Y(i)) if and only if there exists m≥1 such that W^n∈DMC[m],Y(i) for every n≥0, and (W^n)n≥0 converges in (DMC[m],Y(i),T[m],Y(i)).
Proof.
Assume that a sequence (W^n)n≥0 in DMC∗,Y(i) is rank-unbounded. This cannot happen unless ∣Y∣≥3. In order to show that (W^n)n≥0 does not converge, it is sufficient to show that there exists a subsequence of (W^n)n≥0 which does not converge.
Let (W^nk)k≥0 be any subsequence of (W^n)n≥0 where the input-rank strictly increases, i.e., irank(W^nk)<irank(W^nk′) for every 0≤k<k′. We will show that (W^nk)k≥0 does not converge.
Assume to the contrary that (W^nk)k≥0 converges to W^∈DMC∗,Y(i). Define the set
$$A=\{\hat{W}_{n_k}:\;k\geq 0\}\setminus\{\hat{W}\}.$$
For every m≥1, the set A∩DMC[m],Y(i) contains finitely many points. This means that A∩DMC[m],Y(i) is a finite union of singletons (which are closed in DMC[m],Y(i)), hence A∩DMC[m],Y(i) is closed in DMC[m],Y(i) for every m≥1. Therefore A is closed in (DMC∗,Y(i),Ts,∗,Y(i)).
Now define U=DMC∗,Y(i)∖A. Since A is strongly closed, U is strongly open. Moreover, U contains W^, so U is a neighborhood of W^. Therefore, there exists k0≥0 such that W^nk∈U for every k≥k0. Now since the input-rank of (W^nk)k≥0 strictly increases, we can find k≥k0 such that irank(W^nk)>irank(W^). This means that W^nk≠W^ and so W^nk∈A. Therefore, W^nk∉U, which is a contradiction.
We conclude that every converging sequence in (DMC∗,Y(i),Ts,∗,Y(i)) must be rank-bounded.
Now let (W^n)n≥0 be a rank-bounded sequence in DMC∗,Y(i), i.e., there exists m≥1 such that W^n∈DMC[m],Y(i) for every n≥0. If (W^n)n≥0 converges in (DMC∗,Y(i),Ts,∗,Y(i)) then it converges in DMC[m],Y(i) since DMC[m],Y(i) is strongly closed.
Conversely, assume that (W^n)n≥0 converges in (DMC[m],Y(i),T[m],Y(i)) to W^∈DMC[m],Y(i). Let O be any neighborhood of W^ in (DMC∗,Y(i),Ts,∗,Y(i)). There exists a strongly open set U such that W^∈U⊂O. Since U∩DMC[m],Y(i) is open in (DMC[m],Y(i),T[m],Y(i)), there exists n0>0 such that W^n∈U∩DMC[m],Y(i) for every n≥n0. This implies that W^n∈O for every n≥n0. Therefore (W^n)n≥0 converges to W^ in (DMC∗,Y(i),Ts,∗,Y(i)).
∎
Corollary 7.
If ∣Y∣≥3, (DMC∗,Y(i),Ts,∗,Y(i)) is not first-countable anywhere, i.e., for every W^∈DMC∗,Y(i), there is no countable neighborhood basis of W^.
Proof.
Fix W^∈DMC∗,Y(i) and assume to the contrary that W^ admits a countable neighborhood basis {On}n≥1 in (DMC∗,Y(i),Ts,∗,Y(i)). For every n≥1, let Un′ be a strongly open set such that W^∈Un′⊂On. Define Un=⋂1≤i≤n Ui′. Un is strongly open because it is the intersection of finitely many strongly open sets. Moreover, Un⊂Om for every n≥m.
For every n≥1, Proposition 9 implies that Un (which is non-empty and strongly open) is rank-unbounded, so it cannot be contained in DMC[n],Y(i). Hence there exists W^n∈Un such that W^n∉DMC[n],Y(i).
Since W^n∉DMC[n],Y(i), we have irank(W^n)>n for every n≥1. Therefore, (W^n)n≥1 is rank-unbounded. Proposition 12 implies that (W^n)n≥1 does not converge in (DMC∗,Y(i),Ts,∗,Y(i)).
Now let O be a neighborhood of W^ in (DMC∗,Y(i),Ts,∗,Y(i)). Since {On}n≥1 is a neighborhood basis for W^, there exists n0≥1 such that On0⊂O. For every n≥n0, we have W^n∈Un⊂On0. This means that (W^n)n≥1 converges to W^ in (DMC∗,Y(i),Ts,∗,Y(i)) which is a contradiction. Therefore, W^ does not admit a countable neighborhood basis in (DMC∗,Y(i),Ts,∗,Y(i)).
∎
VII-A Compact subspaces of (DMC∗,Y(i),Ts,∗,Y(i))
It is well known that a subset of R is compact if and only if it is closed and bounded. The following proposition shows that a similar statement holds for (DMC∗,Y(i),Ts,∗,Y(i)).
Proposition 13.
A subspace of (DMC∗,Y(i),Ts,∗,Y(i)) is compact if and only if it is rank-bounded and strongly closed.
Proof.
If ∣Y∣=1, DMC∗,Y(i)=DMC[1],Y(i) consists of only one point, hence all subsets of DMC∗,Y(i) are rank-bounded, compact and strongly closed.
If ∣Y∣=2, DMC∗,Y(i)=DMC[2],Y(i) and Ts,∗,Y(i)=T[2],Y(i), hence all subsets of DMC∗,Y(i) are rank-bounded. But DMC[2],Y(i) is compact and Hausdorff. Therefore, a subset of DMC∗,Y(i) is compact if and only if it is closed in T[2],Y(i)=Ts,∗,Y(i).
Assume now that ∣Y∣≥3.
Let A be a subspace of (DMC∗,Y(i),Ts,∗,Y(i)). If A is rank-bounded and strongly closed, then there exists n≥1 such that A⊂DMC[n],Y(i). Since A is strongly closed, then A=A∩DMC[n],Y(i) is closed in DMC[n],Y(i) which is compact. Therefore, A is compact.
Now let A be a compact subspace of (DMC∗,Y(i),Ts,∗,Y(i)). Since (DMC∗,Y(i),Ts,∗,Y(i)) is Hausdorff, A is strongly closed. It remains to show that A is rank-bounded.
Assume to the contrary that A is rank-unbounded. We can construct a sequence (W^n)n≥0 in A where the input-rank is strictly increasing, i.e., irank(W^n)<irank(W^n′) for every 0≤n<n′. Since the input-rank of (W^n)n≥0 is strictly increasing, every subsequence of (W^n)n≥0 is rank-unbounded. Proposition 12 implies that no subsequence of (W^n)n≥0 converges in (DMC∗,Y(i),Ts,∗,Y(i)). On the other hand, we have:
• A is countably compact because it is compact.
• Since A is strongly closed and since (DMC∗,Y(i),Ts,∗,Y(i)) is a sequential space, A is sequential.
• A is Hausdorff because (DMC∗,Y(i),Ts,∗,Y(i)) is Hausdorff.
Now since every countably compact sequential Hausdorff space is sequentially compact [9], A must be sequentially compact. Therefore, (W^n)n≥0 has a converging subsequence which is a contradiction. We conclude that A must be rank-bounded.
∎
VIII The similarity metric on the space of input-equivalent channels
We define the similarity metric on DMC∗,Y(i) as follows:
[TABLE]
Let T∗,Y(i) be the metric topology on DMC∗,Y(i) that is induced by d∗,Y(i). We call T∗,Y(i) the similarity topology on DMC∗,Y(i).
Clearly, T∗,Y(i) is natural because the restriction of d∗,Y(i) on DMC[n],Y(i) is exactly d[n],Y(i), and the topology induced by d[n],Y(i) is T[n],Y(i) (Theorem 4).
IX Continuity of channel parameters and operations
IX-A Channel parameters
The capacity of a channel W∈DMCX,Y is denoted as C(W).
An (n,M)-encoder on the alphabet X is a mapping E:ℳ→Xn such that ∣ℳ∣=M. The set ℳ is the message set of E, n is the blocklength of E, M is the size of E, and (1/n) log M is the rate of E (measured in nats). The error probability of the ML decoder for the encoder E when it is used for a channel W∈DMCX,Y is given by:
[TABLE]
where (E1(m),…,En(m))=E(m).
The optimal error probability of (n,M)-encoders for a channel W is given by:
[TABLE]
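As an illustration of these two quantities, here is a brute-force sketch. It assumes the standard expression for the ML error probability with equiprobable messages, namely one minus the average over messages of the probability of correct ML decoding, which equals 1 − (1/M) Σ over output sequences of the largest codeword likelihood. The function names and the matrix convention `W[x][y] = W(y|x)` are illustrative, and the exhaustive search over encoders is only feasible for tiny parameters.

```python
from itertools import product

def ml_error_probability(W, codewords):
    """Error probability of ML decoding for equiprobable messages.

    W[x][y] is the probability of output y given input x, and codewords is a
    list of equal-length tuples of input symbols, one tuple per message.
    """
    n = len(codewords[0])
    num_outputs = len(W[0])
    success = 0.0
    for y in product(range(num_outputs), repeat=n):
        best = 0.0
        for cw in codewords:
            likelihood = 1.0
            for x_i, y_i in zip(cw, y):
                likelihood *= W[x_i][y_i]
            best = max(best, likelihood)  # the ML decoder picks the most likely message
        success += best
    return 1.0 - success / len(codewords)

def optimal_error_probability(W, n, M):
    """Minimum ML error probability over all encoders with M messages and blocklength n."""
    words = list(product(range(len(W)), repeat=n))
    return min(ml_error_probability(W, enc) for enc in product(words, repeat=M))
```

For a binary symmetric channel with crossover probability 0.1, the identity encoder with n=1 and M=2 gives an error probability of 0.1, and the length-3 repetition code gives 0.028.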
Since input-degradedness is a particular case of the Shannon ordering [1], we can easily see that if W and W′ are input-equivalent, then C(W)=C(W′) and Pe,n,M(W)=Pe,n,M(W′) for every n≥1 and every M≥1. Therefore, for every W^∈DMC∗,Y(i), we can define C(W^):=C(W′) for any W′∈W^. We can define Pe,n,M(W^) similarly. Moreover, due to Proposition 3, we can also define Pe,D(W^) for any decoder D on the output alphabet Y.
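The invariance of the capacity can be checked numerically on a small example. The sketch below uses the Blahut-Arimoto iteration, a standard numerical method that is not part of this paper, to estimate the capacity in nats. It compares a binary symmetric channel with the channel obtained by adding a third input whose output distribution is the average of the two existing rows; each channel can be simulated from the other by randomizing the input, so the two are input-equivalent. The function name `capacity_nats` and the iteration count are arbitrary choices.

```python
import math

def capacity_nats(W, iterations=500):
    """Blahut-Arimoto estimate of the capacity (in nats) of W[x][y] = W(y|x)."""
    n_in, n_out = len(W), len(W[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution

    def output_dist(p):
        return [sum(p[x] * W[x][y] for x in range(n_in)) for y in range(n_out)]

    def divergences(p, q):
        # d[x] = D(W(.|x) || q); only needed for inputs that still carry mass
        return [
            sum(W[x][y] * math.log(W[x][y] / q[y]) for y in range(n_out) if W[x][y] > 0)
            if p[x] > 0 else 0.0
            for x in range(n_in)
        ]

    for _ in range(iterations):
        d = divergences(p, output_dist(p))
        weights = [p[x] * math.exp(d[x]) for x in range(n_in)]
        total = sum(weights)
        p = [w / total for w in weights]

    d = divergences(p, output_dist(p))
    # Mutual information of the final input distribution, a lower bound on the capacity.
    return sum(p[x] * d[x] for x in range(n_in))

bsc = [[0.9, 0.1], [0.1, 0.9]]
extended = bsc + [[0.5, 0.5]]  # third row is the average of the first two
```

Calling `capacity_nats(bsc)` and `capacity_nats(extended)` should return the same value up to numerical precision, namely log 2 minus the binary entropy of 0.1 in nats.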
Proposition 14.
Let X and Y be two finite sets. We have:
• C:DMCX,Y(i)→R+ is continuous on (DMCX,Y(i),TX,Y(i)).
• For every n≥1 and every M≥1, the mapping Pe,n,M:DMCX,Y(i)→[0,1] is continuous on (DMCX,Y(i),TX,Y(i)).
• For every decoder D on Y, the mapping Pe,D:DMCX,Y(i)→[0,1] is continuous on (DMCX,Y(i),TX,Y(i)).
Proof.
Since C:DMCX,Y→R+ is continuous, and since C(W) depends only on the RX,Y(i)-equivalence class of W, Lemma 1 implies that C:DMCX,Y(i)→R+ is continuous on (DMCX,Y(i),TX,Y(i)). We can show the continuity of Pe,n,M and Pe,D on (DMCX,Y(i),TX,Y(i)) similarly.
∎
The following lemma provides a way to check whether a mapping defined on (DMC∗,Y(i),Ts,∗,Y(i)) is continuous:
Lemma 4.
Let (S,V) be an arbitrary topological space. A mapping f:DMC∗,Y(i)→S is continuous on (DMC∗,Y(i),Ts,∗,Y(i)) if and only if it is continuous on (DMC[n],Y(i),T[n],Y(i)) for every n≥1.
Proof.
[TABLE]
∎
Proposition 15.
Let Y be a finite set. We have:
• C:DMC∗,Y(i)→R+ is continuous on (DMC∗,Y(i),Ts,∗,Y(i)).
• For every n≥1 and every M≥1, the mapping Pe,n,M:DMC∗,Y(i)→[0,1] is continuous on (DMC∗,Y(i),Ts,∗,Y(i)).
• For every decoder D on Y, the mapping Pe,D:DMC∗,Y(i)→[0,1] is continuous on (DMC∗,Y(i),Ts,∗,Y(i)).
Proof.
The proposition follows from Proposition 14 and Lemma 4.
∎
IX-B Channel operations
For every two channels W1∈DMCX1,Y1 and W2∈DMCX2,Y2, define the channel sum W1⊕W2∈DMCX1∐X2,Y1∐Y2 of W1 and W2 as:
$$(W_1\oplus W_2)(y,i|x,j)=\begin{cases}W_i(y|x)&\text{if }i=j,\\0&\text{otherwise,}\end{cases}$$
where X1∐X2=(X1×{1})∪(X2×{2}) is the disjoint union of X1 and X2. W1⊕W2 arises when the transmitter has two channels W1 and W2 at his disposal and he can use exactly one of them at each channel use.
We define the channel product W1⊗W2∈DMCX1×X2,Y1×Y2 of W1 and W2 as:
$$(W_1\otimes W_2)(y_1,y_2|x_1,x_2)=W_1(y_1|x_1)\,W_2(y_2|x_2).$$
W1⊗W2 arises when the transmitter has two channels W1 and W2 at his disposal and he uses both of them at each channel use. Channel sums and products were first introduced by Shannon in [10].
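In matrix form, and assuming the usual definitions due to Shannon, the channel sum is a block-diagonal matrix and the channel product is a Kronecker product. The following sketch represents a channel as a row-stochastic matrix `W[x][y] = W(y|x)`; the function names and the index conventions (inputs and outputs of the second channel are listed after those of the first in the sum, and pairs are flattened in row-major order in the product) are illustrative assumptions.

```python
from fractions import Fraction

def channel_sum(W1, W2):
    """Block-diagonal matrix: each input uses exactly one of the two channels."""
    out1, out2 = len(W1[0]), len(W2[0])
    top = [list(row) + [0] * out2 for row in W1]
    bottom = [[0] * out1 + list(row) for row in W2]
    return top + bottom

def channel_product(W1, W2):
    """Kronecker product: both channels are used, independently, at each use.

    Input (x1, x2) is row x1 * len(W2) + x2 and output (y1, y2) is column
    y1 * len(W2[0]) + y2.
    """
    return [[a * b for a in row1 for b in row2] for row1 in W1 for row2 in W2]

W1 = [[Fraction(9, 10), Fraction(1, 10)], [Fraction(1, 10), Fraction(9, 10)]]
W2 = [[Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]]
S = channel_sum(W1, W2)      # 3 inputs, 5 outputs
P = channel_product(W1, W2)  # 2 inputs, 6 outputs
```

Both results are again row-stochastic, which is a quick sanity check that they are valid channels.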
Channel sums and products can be “quotiented” by the input-equivalence relation. We just need to realize that the input-equivalence class of the resulting channel depends only on the input-equivalence classes of the channels that were used in the operation. Let us illustrate this in the case of channel sums:
Let W1,W1′∈DMCX1,Y1 and W2,W2′∈DMCX2,Y2 and assume that W1 is input-degraded from W1′ and W2 is input-degraded from W2′. There exist V1′∈DMCX1,X1 and V2′∈DMCX2,X2 such that W1=W1′∘V1′ and W2=W2′∘V2′. It is easy to see that W1⊕W2=(W1′⊕W2′)∘(V1′⊕V2′), which shows that W1⊕W2 is input-degraded from W1′⊕W2′.
Therefore, if W1 is input-equivalent to W1′ and W2 is input-equivalent to W2′, then W1⊕W2 is input-equivalent to W1′⊕W2′. This allows us to define the channel sum W^1⊕W2 for every W^1∈DMCX1,Y1(i) and every W2∈DMCX2,Y2(i) as the RX1∐X2,Y1∐Y2(i)-equivalence class of W1′⊕W2′, for any W1′∈W^1 and any W2′∈W2. This class belongs to DMCX1∐X2,Y1∐Y2(i) and does not depend on the choice of representatives. We can define the product on the quotient spaces similarly.
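The identity W1⊕W2=(W1′⊕W2′)∘(V1′⊕V2′) can be verified on a small instance. This is a minimal sketch assuming that channels are stored as row-stochastic matrices `W[x][y] = W(y|x)`, so that the composition W′∘V (apply V first, then W′) is the matrix product of V with W′, and that the channel sum is block-diagonal. The helper names and the numerical values are made up for illustration.

```python
from fractions import Fraction as F

def compose(V, W):
    """Matrix of W∘V: first V maps x to z, then W maps z to y."""
    return [[sum(v[z] * W[z][y] for z in range(len(W))) for y in range(len(W[0]))]
            for v in V]

def channel_sum(A, B):
    """Block-diagonal channel sum of two row-stochastic matrices."""
    return ([list(r) + [0] * len(B[0]) for r in A]
            + [[0] * len(A[0]) + list(r) for r in B])

# Hypothetical channels W1', W2' and input randomizations V1', V2'.
W1p = [[F(1), F(0)], [F(0), F(1)]]
W2p = [[F(1, 3), F(2, 3)], [F(2, 3), F(1, 3)]]
V1p = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]
V2p = [[F(1), F(0)], [F(1, 2), F(1, 2)]]

W1 = compose(V1p, W1p)  # W1 is input-degraded from W1'
W2 = compose(V2p, W2p)  # W2 is input-degraded from W2'

lhs = channel_sum(W1, W2)
rhs = compose(channel_sum(V1p, V2p), channel_sum(W1p, W2p))
```

Exact rational arithmetic is used so that `lhs` and `rhs` can be compared with `==` rather than with a tolerance.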
Proposition 16.
We have:
• The mapping (W^1,W2)→W^1⊕W2 from DMCX1,Y1(i)×DMCX2,Y2(i) to DMCX1∐X2,Y1∐Y2(i) is continuous.
• The mapping (W^1,W2)→W^1⊗W2 from DMCX1,Y1(i)×DMCX2,Y2(i) to DMCX1×X2,Y1×Y2(i) is continuous.
Proof.
We only prove the continuity of the channel sum because the proof for the channel product is similar.
Let Proj:DMCX1∐X2,Y1∐Y2→DMCX1∐X2,Y1∐Y2(i) be the projection onto the RX1∐X2,Y1∐Y2(i)-equivalence classes. Define the mapping f:DMCX1,Y1×DMCX2,Y2→DMCX1∐X2,Y1∐Y2(i) as f(W1,W2)=Proj(W1⊕W2). Clearly, f is continuous.
Now define the equivalence relation R on DMCX1,Y1×DMCX2,Y2 as:
[TABLE]
The discussion before the proposition shows that f(W1,W2)=Proj(W1⊕W2) depends only on the R-equivalence class of (W1,W2). Lemma 1 now shows that the transcendent map of f defined on (DMCX1,Y1×DMCX2,Y2)/R is continuous.
Notice that (DMCX1,Y1×DMCX2,Y2)/R can be identified with DMCX1,Y1(i)×DMCX2,Y2(i). Therefore, we can define f on DMCX1,Y1(i)×DMCX2,Y2(i) through this identification. Moreover, since DMCX1,Y1 and DMCX2,Y2(i) are locally compact and Hausdorff, Corollary 1 implies that the canonical bijection between (DMCX1,Y1×DMCX2,Y2)/R and DMCX1,Y1(i)×DMCX2,Y2(i) is a homeomorphism.
Now since the mapping f on DMCX1,Y1(i)×DMCX2,Y2(i) is just the channel sum, we conclude that the mapping (W^1,W2)→W^1⊕W2 from DMCX1,Y1(i)×DMCX2,Y2(i) to DMCX1∐X2,Y1∐Y2(i) is continuous.
∎
Proposition 17.
Assume that all spaces of input-equivalent channels are endowed with the strong topology. We have:
• The mapping (W^1,W2)→W^1⊕W2 from DMC∗,Y1(i)×DMCX2,Y2(i) to DMC∗,Y1∐Y2(i) is continuous.
• The mapping (W^1,W2)→W^1⊗W2 from DMC∗,Y1(i)×DMCX2,Y2(i) to DMC∗,Y1×Y2(i) is continuous.
Proof.
We only prove the continuity of the channel sum because the proof of the continuity of the channel product is similar.
Due to the distributivity of the product with respect to disjoint unions, we have:
[TABLE]
and
[TABLE]
Therefore, the space DMC∗,Y1×DMCX2,Y2 is the topological disjoint union of the spaces (DMC[n],Y1×DMCX2,Y2)n≥1.
For every n≥1, let Projn be the projection onto the R[n]∐X2,Y1∐Y2(i)-equivalence classes and let in be the canonical injection from DMC[n]∐X2,Y1∐Y2(i) to DMC∗,Y1∐Y2(i).
Define the mapping f:DMC∗,Y1×DMCX2,Y2→DMC∗,Y1∐Y2(i) as
[TABLE]
where n is the unique integer satisfying W1∈DMC[n],Y1. W^1 and W2 are the R[n],Y1(i) and RX2,Y2(i)-equivalence classes of W1 and W2 respectively.
Clearly, the mapping f is continuous on DMC[n],Y1×DMCX2,Y2 for every n≥1. Therefore, f is continuous on (DMC∗,Y1×DMCX2,Y2,Ts,∗,Y1⊗TX2,Y2).
Let R be the equivalence relation defined on DMC∗,Y1×DMCX2,Y2 as follows: (W1,W2)R(W1′,W2′) if and only if W1R∗,Y1(i)W1′ and W2RX2,Y2(i)W2′.
Since f(W1,W2) depends only on the R-equivalence class of (W1,W2), Lemma 1 implies that the transcendent mapping of f is continuous on (DMC∗,Y1×DMCX2,Y2)/R.
Since (DMC∗,Y1,Ts,∗,Y1) and DMCX2,Y2(i)=DMCX2,Y2/RX2,Y2(i) are Hausdorff and locally compact, Corollary 1 implies that the canonical bijection from DMC∗,Y1(i)×DMCX2,Y2(i) to (DMC∗,Y1×DMCX2,Y2)/R is a homeomorphism. We conclude that the channel sum is continuous on (DMC∗,Y1(i)×DMCX2,Y2(i),Ts,∗,Y1(i)⊗TX2,Y2(i)).
∎
The reader might be wondering why the channel sum and the channel product were not shown to be continuous on the whole space DMC∗,Y1(i)×DMC∗,Y2(i) instead of the smaller space DMC∗,Y1(i)×DMCX2,Y2(i). The reason is that we cannot apply Corollary 1 to DMC∗,Y1×DMC∗,Y2 and DMC∗,Y1(i)×DMC∗,Y2(i), since neither DMC∗,Y1(i) nor DMC∗,Y2(i) is locally compact when ∣Y1∣,∣Y2∣≥3 (under the strong topology).
As in the case of the space of equivalent channels [4], one potential method to show the continuity of the channel sum on (DMC∗,Y1(i)×DMC∗,Y2(i),Ts,∗,Y1(i)⊗Ts,∗,Y2(i)) is as follows: let R be the equivalence relation on DMC∗,Y1×DMC∗,Y2 defined as (W1,W2)R(W1′,W2′) if and only if W1R∗,Y1(i)W1′ and W2R∗,Y2(i)W2′. We can identify (DMC∗,Y1×DMC∗,Y2)/R with DMC∗,Y1(i)×DMC∗,Y2(i) through the canonical bijection. Using Lemma 1, it is easy to see that the mapping (W^1,W2)→W^1⊕W2 is continuous from (DMC∗,Y1(i)×DMC∗,Y2(i),(Ts,∗,Y1⊗Ts,∗,Y2)/R) to (DMC∗,Y1∐Y2(i),Ts,∗,Y1∐Y2(i)).
It was shown in [11] that the topology (Ts,∗,Y1⊗Ts,∗,Y2)/R is homeomorphic to κ(Ts,∗,Y1(i)⊗Ts,∗,Y2(i)) through the canonical bijection, where κ(Ts,∗,Y1(i)⊗Ts,∗,Y2(i)) is the coarsest topology that is both compactly generated and finer than Ts,∗,Y1(i)⊗Ts,∗,Y2(i). Therefore, the mapping (W^1,W2)→W^1⊕W2 is continuous on (DMC∗,Y1(i)×DMC∗,Y2(i),κ(Ts,∗,Y1(i)⊗Ts,∗,Y2(i))). This means that if Ts,∗,Y1(i)⊗Ts,∗,Y2(i) is compactly generated, we will have Ts,∗,Y1(i)⊗Ts,∗,Y2(i)=κ(Ts,∗,Y1(i)⊗Ts,∗,Y2(i)) and so the channel sum will be continuous on (DMC∗,Y1(i)×DMC∗,Y2(i),Ts,∗,Y1(i)⊗Ts,∗,Y2(i)). Note that although Ts,∗,Y1(i) and Ts,∗,Y2(i) are compactly generated, their product Ts,∗,Y1(i)⊗Ts,∗,Y2(i) might not be compactly generated.
Proposition 18.
Let Y1 and Y2 be two finite sets. Let W^1∈DMC∗,Y1(i) and W2∈DMC∗,Y2(i). We have:
[TABLE]
where ϕ1# and ϕ2# are the push-forwards by the canonical injections from Y1 and Y2 to Y1∐Y2 respectively. On the other hand,
Let T be a Hausdorff natural topology on DMC∗,Y(i). Since Ts,∗,Y(i) is the finest natural topology, we have T⊂Ts,∗,Y(i). Therefore, B(T)⊂B(Ts,∗,Y(i)), where B(T) and B(Ts,∗,Y(i)) are the Borel σ-algebras of T and Ts,∗,Y(i) respectively.
On the other hand, for every U∈Ts,∗,Y(i) and every n≥1, we have U∩DMC[n],Y(i)∈T[n],Y(i). But T is a natural topology, so there must exist Un∈T such that Un∩DMC[n],Y(i)=U∩DMC[n],Y(i). Since Un∈T, we have Un∈B(T). Moreover, DMC[n],Y(i) is T-closed (because it is compact and T is Hausdorff). Therefore, DMC[n],Y(i)∈B(T). This implies that U∩DMC[n],Y(i)=Un∩DMC[n],Y(i)∈B(T), hence
$$U=\bigcup_{n\geq 1}\big(U\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\big)\in\mathcal{B}(\mathcal{T}).$$
Since this is true for every U∈Ts,∗,Y(i), we have Ts,∗,Y(i)⊂B(T) which implies that B(Ts,∗,Y(i))⊂B(T). We conclude that all Hausdorff natural topologies on DMC∗,Y(i) have the same σ-algebra. This σ-algebra deserves to be called the natural Borel σ-algebra on DMC∗,Y(i).
Note that for every n≥1, the inclusion mapping in:DMC[n],Y(i)→DMC∗,Y(i) is continuous from (DMC[n],Y(i),T[n],Y(i)) to (DMC∗,Y(i),Ts,∗,Y(i)), hence it is measurable. Therefore, for every B∈B(Ts,∗,Y(i)), we have in−1(B)=B∩DMC[n],Y(i)∈B(T[n],Y(i)). In the following, we show a converse for this statement.
Fix n≥1 and let U∈T[n],Y(i). There exists U′∈Ts,∗,Y(i) such that U=U′∩DMC[n],Y(i). Since U′ and DMC[n],Y(i) are respectively open and closed in the topology Ts,∗,Y(i), they are both in its Borel σ-algebra. Therefore, U=U′∩DMC[n],Y(i)∈B(Ts,∗,Y(i)) for every U∈T[n],Y(i). This means that T[n],Y(i)⊂B(Ts,∗,Y(i)) and B(T[n],Y(i))⊂B(Ts,∗,Y(i)) for every n≥1.
Assume now that A⊂DMC∗,Y(i) satisfies A∩DMC[n],Y(i)∈B(T[n],Y(i)) for every n≥1. This implies that A∩DMC[n],Y(i)∈B(Ts,∗,Y(i)) for every n≥1, hence
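$$A=\bigcup_{n\geq 1}\big(A\cap\mathrm{DMC}_{[n],\mathcal{Y}}^{(i)}\big)\in\mathcal{B}\big(\mathcal{T}_{s,\ast,\mathcal{Y}}^{(i)}\big).$$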
We conclude that a subset A of DMC∗,Y(i) is in the natural Borel σ-algebra if and only if A∩DMC[n],Y(i)∈B(T[n],Y(i)) for every n≥1.
XI Conclusion
Since T∗,Y(i) is a natural topology, it is not completely metrizable because of Corollary 5. Therefore, the metric space (DMC∗,Y(i),d∗,Y(i)) is not complete. An interesting question to ask is: what does the completion of (DMC∗,Y(i),d∗,Y(i)) represent? Does it represent the space of all input-equivalent channels with output alphabet Y and arbitrary input alphabet (with arbitrary cardinality)?
Many other interesting questions remain open: Are all natural topologies Hausdorff? Can we find more topological properties that are common for all natural topologies? Is there a coarsest natural topology? Is there a natural topology that is coarser than the similarity one?
The continuity of the channel parameters C, Pe,n,M and Pe,D on T∗,Y(i) is an open problem. Also, the continuity of the channel sum and the channel product on the whole product space (DMC∗,Y1(i)×DMC∗,Y2(i),Ts,∗,Y1(i)⊗Ts,∗,Y2(i)) remains an open problem. As we explained in Section IX-B, it is sufficient to prove that the product topology Ts,∗,Y1(i)⊗Ts,∗,Y2(i) is compactly generated.
In [12], Raginsky introduced the Shannon deficiency. We can define the input-deficiency similarly. Like the Shannon deficiency, the input deficiency compares a particular channel with the input-equivalence class of another channel. The input deficiency is not a metric distance between input-equivalence classes of channels.
Acknowledgment
I would like to thank Emre Telatar for helpful discussions. I am also grateful to Maxim Raginsky for informing me about the work of Blackwell on statistical experiments.
If ∣Y∣=1, then ΔY contains only one point and so ∣CE(W)∣=1 for every W∈DMC[n],Y and every n≥1. Therefore, DMC[n],Y(i)=DMC[1],Y(i) for every n≥1.
If ∣Y∣=2, then ΔY is a one dimensional segment. Therefore, there are at most two convex-extreme points for any finite subset of ΔY. This means that ∣CE(W)∣≤2 for every W∈DMC[n],Y and every n≥2. Therefore, DMC[n],Y(i)=DMC[2],Y(i) for every n≥2.
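The binary-output case can be made explicit. The sketch below assumes that a channel is stored as a row-stochastic matrix `W[x][y] = W(y|x)` with two columns, and that "W is input-degraded from W′" means W=W′∘V for some channel V, which in matrix form is the product of V with W′. It keeps only the two extreme rows of W and builds the randomization V that recovers every other row as a mixture of them. The function names are illustrative.

```python
from fractions import Fraction

def reduce_binary_output(W):
    """Return (W_small, V) with at most two rows in W_small and W = V * W_small.

    W is an n x 2 row-stochastic matrix with exact rational entries.
    """
    p = [Fraction(row[0]) for row in W]
    lo, hi = min(p), max(p)
    if lo == hi:
        # All rows coincide: a single input suffices.
        return [[lo, 1 - lo]], [[Fraction(1)] for _ in p]
    W_small = [[lo, 1 - lo], [hi, 1 - hi]]
    # Each row of W is the mixture of the two extreme rows with these weights.
    V = [[(hi - q) / (hi - lo), (q - lo) / (hi - lo)] for q in p]
    return W_small, V

def compose(V, W_small):
    """Matrix product V * W_small, i.e. the channel W_small used after V."""
    return [[sum(v[z] * W_small[z][y] for z in range(len(W_small)))
             for y in range(len(W_small[0]))] for v in V]

W = [[Fraction(1, 10), Fraction(9, 10)],
     [Fraction(3, 10), Fraction(7, 10)],
     [Fraction(9, 10), Fraction(1, 10)]]
W_small, V = reduce_binary_output(W)
```

Here `compose(V, W_small)` reproduces `W`, so W is input-degraded from the two-input channel `W_small`. Since the rows of `W_small` are themselves rows of W, the converse also holds and the two channels are input-equivalent.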
Now assume that ∣Y∣≥3. Let U^ be an arbitrary non-empty open subset of (DMC[m],Y(i),T[m],Y(i)) and let Proj be the projection onto the R[m],Y(i)-equivalence classes. Proj−1(U^) is open in the metric space (DMC[m],Y,d[m],Y). Let W^∈U^ and define r=irank(W^). Let P1,…,Pr∈ΔY be such that CE(W^)={P1,…,Pr}. Define the channel W∈DMC[m],Y as follows:
[TABLE]
Clearly CE(W)=CE(W^) and so W∈W^ which implies that W∈Proj−1(U^). Since Proj−1(U^) is open in the metric space (DMC[m],Y,d[m],Y), there exists ϵ>0 such that Proj−1(U^) contains the open ball of center W and radius ϵ.
We will show that there exists W′∈DMC[m],Y such that irank(W′)=m>n and d[m],Y(W,W′)<ϵ. If r=irank(W)=m, take W′=W.
Assume that r=irank(W)<m. Since ∣Y∣≥3, the dimension of ΔY is at least 2. Therefore, we can find Pr+1∈ΔY such that ∥Pr−Pr+1∥TV<ϵ and CE({P1,…,Pr+1})={P1,…,Pr+1}. By repeating this procedure m−r times, we obtain Pr+1,…,Pm∈ΔY such that ∥Pr−Pi∥TV<ϵ for every r+1≤i≤m, and CE({P1,…,Pm})={P1,…,Pm}. Define the channel W′∈DMC[m],Y as:
[TABLE]
We have CE(W′)=CE({P1,…,Pm})={P1,…,Pm}. Therefore, irank(W′)=m. Moreover,
[TABLE]
This means that W′∈Proj−1(U^) and W′ is not input-equivalent to any channel in DMC[n],Y (see Proposition 2). Therefore, Proj(W′)∈U^ and Proj(W′)∉DMC[n],Y(i). This shows that no non-empty open subset of DMC[m],Y(i) is contained in DMC[n],Y(i). We conclude that the interior of DMC[n],Y(i) in DMC[m],Y(i) is empty.
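The key geometric fact used above, that for ∣Y∣≥3 one can fit arbitrarily many convex-extreme points inside an arbitrarily small ball of ΔY, can be illustrated numerically. The sketch below is a simplified special case and not the construction of the proof: it places m points on a small circle around the uniform distribution on three symbols. It assumes that the total variation distance is half the L1 distance. Strict convexity is checked on consecutive triples, which is sufficient here only because the points are generated in angular order around the center.

```python
import math

def nearby_extreme_points(m, eps):
    """m distributions on 3 symbols on a circle of Euclidean radius eps around uniform."""
    # Orthonormal basis of the plane {v : v0 + v1 + v2 = 0}.
    u = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
    v = (1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6))
    points = []
    for k in range(m):
        t = 2 * math.pi * k / m
        points.append(tuple(1 / 3 + eps * (math.cos(t) * u[i] + math.sin(t) * v[i])
                            for i in range(3)))
    return points

def total_variation(p, q):
    """Half of the L1 distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def in_strictly_convex_position(points):
    """True if every consecutive triple (first two coordinates) turns left."""
    n = len(points)
    for k in range(n):
        o, a, b = points[k], points[(k + 1) % n], points[(k + 2) % n]
        cross = (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
        if cross <= 0:
            return False
    return True

points = nearby_extreme_points(5, 0.01)
```

A channel whose five rows are these points has five distinct rows in strictly convex position, all within total variation distance 0.01 of the uniform distribution, so none of them can be written as a mixture of the others.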
Define DMC[0],Y(i)=∅, which is strongly closed in DMC∗,Y(i).
Let A and B be two disjoint strongly closed subsets of DMC∗,Y(i). For every n≥0, let An=A∩DMC[n],Y(i) and Bn=B∩DMC[n],Y(i). Since A and B are strongly closed in DMC∗,Y(i), An and Bn are closed in DMC[n],Y(i). Moreover, An∩Bn⊂A∩B=∅.
Construct the sequences (Un)n≥0,(Un′)n≥0,(Kn)n≥0 and (Kn′)n≥0 recursively as follows:
U0=U0′=K0=K0′=∅⊂DMC[0],Y(i). Since A0=B0=∅, we have A0⊂U0⊂K0 and B0⊂U0′⊂K0′. Moreover, U0 and U0′ are open in DMC[0],Y(i), K0 and K0′ are closed in DMC[0],Y(i), and K0∩K0′=∅.
Now let n≥1 and assume that we constructed (Uj)0≤j<n,(Uj′)0≤j<n,(Kj)0≤j<n and (Kj′)0≤j<n such that for every 0≤j<n, we have Aj⊂Uj⊂Kj⊂DMC[j],Y(i), Bj⊂Uj′⊂Kj′⊂DMC[j],Y(i), Uj and Uj′ are open in DMC[j],Y(i), Kj and Kj′ are closed in DMC[j],Y(i), and Kj∩Kj′=∅. Moreover, assume that Kj⊂Uj+1 and Kj′⊂Uj+1′ for every 0≤j<n−1.
Let Cn=An∪Kn−1 and Dn=Bn∪Kn−1′. Since Kn−1 and Kn−1′ are closed in DMC[n−1],Y(i) and since DMC[n−1],Y(i) is closed in DMC[n],Y(i), we can see that Kn−1 and Kn−1′ are closed in DMC[n],Y(i). Therefore, Cn and Dn are closed in DMC[n],Y(i). Moreover, we have
[TABLE]
where (a) follows from the fact that An∩Bn=Kn−1∩Kn−1′=∅ and the fact that Kn−1⊂DMC[n−1],Y(i) and Kn−1′⊂DMC[n−1],Y(i).
Since DMC[n],Y(i) is normal (because it is metrizable), and since Cn and Dn are closed disjoint subsets of DMC[n],Y(i), there exist two sets Un,Un′⊂DMC[n],Y(i) that are open in DMC[n],Y(i) and two sets Kn,Kn′⊂DMC[n],Y(i) that are closed in DMC[n],Y(i) such that Cn⊂Un⊂Kn, Dn⊂Un′⊂Kn′ and Kn∩Kn′=∅. Clearly, An⊂Un⊂Kn⊂DMC[n],Y(i), Bn⊂Un′⊂Kn′⊂DMC[n],Y(i), Kn−1⊂Un and Kn−1′⊂Un′. This concludes the recursive construction.
Now define U=⋃n≥0 Un=⋃n≥1 Un and U′=⋃n≥0 Un′=⋃n≥1 Un′. Since An⊂Un for every n≥1, we have
[TABLE]
Moreover, for every n≥1 we have
[TABLE]
where (a) follows from the fact that Uj⊂Kj⊂Uj+1 for every j≥0, which means that the sequence (Uj)j≥1 is increasing.
For every j≥n, we have DMC[n],Y(i)⊂DMC[j],Y(i) and Uj is open in DMC[j],Y(i), hence Uj∩DMC[n],Y(i) is open in DMC[n],Y(i). Therefore, U∩DMC[n],Y(i)=⋃j≥n (Uj∩DMC[n],Y(i)) is open in DMC[n],Y(i). Since this is true for every n≥1, we conclude that U is strongly open in DMC∗,Y(i).
We can show similarly that B⊂U′ and that U′ is strongly open in DMC∗,Y(i). Finally, we have
[TABLE]
where (a) follows from the fact that for every n≥1 and every n′≥1, we have
[TABLE]
because (Un)n≥1 and (Un′)n≥1 are increasing. We conclude that (DMC∗,Y(i),Ts,∗,Y(i)) is normal.
Fix W^1,W^1′∈DMC∗,Y1(i) and W2,W2′∈DMC∗,Y2(i). Let R1∈R(co(W^1),co(W^1′)) and R2∈R(co(W2),co(W2′)). Fix 0≤λ≤1, (P1,P1′)∈R1 and (P2,P2′)∈R2. Let P=(1−λ)ϕ1#P1+λϕ2#P2 and P′=(1−λ)ϕ1#P1′+λϕ2#P2′, where ϕ1# and ϕ2# are the push-forwards by the canonical injections from Y1 and Y2 to Y1∐Y2 respectively. We have:
It is easy to see that R is a coupling of co(W^1⊕W2) and co(W^1′⊕W2′). We have:
[TABLE]
where (a) follows from (2). Since this is true for every R1∈R(co(Ŵ1),co(Ŵ1′)) and every R2∈R(co(Ŵ2),co(Ŵ2′)), we conclude that
[TABLE]
This shows that the mapping (Ŵ1,Ŵ2)→Ŵ1⊕Ŵ2 from DMC∗,Y1(i)×DMC∗,Y2(i) to DMC∗,Y1∐Y2(i) is continuous in the similarity topology.
Fix again R1∈R(co(Ŵ1),co(Ŵ1′)) and R2∈R(co(Ŵ2),co(Ŵ2′)). Let λ1,…,λk≥0 be such that ∑_{i=1}^{k} λi=1. Let (P1,1,P1,1′),…,(P1,k,P1,k′)∈R1 and (P2,1,P2,1′),…,(P2,k,P2,k′)∈R2. Define P=∑_{i=1}^{k} λi P1,i×P2,i and P′=∑_{i=1}^{k} λi P1,i′×P2,i′. We have:
[TABLE]
where (a) follows from [4, App. B]. Proposition 18 shows that
[TABLE]
and
[TABLE]
Define R⊂co(Ŵ1⊗Ŵ2)×co(Ŵ1′⊗Ŵ2′) as follows:
[TABLE]
It is easy to see that R is a coupling of co(Ŵ1⊗Ŵ2) and co(Ŵ1′⊗Ŵ2′). We have:
[TABLE]
where (a) follows from (3). Since this is true for every R1∈R(co(Ŵ1),co(Ŵ1′)) and every R2∈R(co(Ŵ2),co(Ŵ2′)), we conclude that
[TABLE]
This shows that the mapping (Ŵ1,Ŵ2)→Ŵ1⊗Ŵ2 from DMC∗,Y1(i)×DMC∗,Y2(i) to DMC∗,Y1×Y2(i) is continuous in the similarity topology.
Bibliography
[1] C. Shannon, “A note on a partial ordering for communication channels,” Inform. Contr., vol. 1, pp. 390–397, 1958.
[2] D. Blackwell, “Comparison of experiments,” in Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, 1951, pp. 93–102.
[3] R. Nasser, “Topological structures on DMC spaces,” arXiv:1701.04467, Jan 2017.
[4] ——, “Continuity of channel parameters and operations under various DMC topologies,” arXiv:1701.04466, Jan 2017.
[5] R. Engelking, General Topology, ser. Monografie Matematyczne. PWN, 1977.
[6] D. Du and P. Pardalos, Minimax and Applications, ser. Nonconvex Optimization and Its Applications. Springer US, 2013.
[7] S. Sherman, “On a theorem of Hardy, Littlewood, Polya, and Blackwell,” Proceedings of the National Academy of Sciences of the United States of America, vol. 37, no. 12, pp. 826–831, 1951.
[8] C. Stein, “Notes on a seminar on theoretical statistics. I. Comparison of experiments,” Report, University of Chicago, 1951.