This paper characterizes the Shannon ordering of communication channels, showing a specific structural condition involving convex-product channels, and explores topologies and continuity properties on the space of Shannon-equivalent channels.
Contribution
It provides a new characterization of Shannon ordering using skew-composition and convex-product channels, extending the Blackwell-Sherman-Stein theorem.
Findings
01
A channel contains another iff it is a skew-composition with a convex-product channel.
02
Introduces the strong topology and BRM metric on Shannon-equivalent channels.
03
Studies continuity of channel parameters under the strong topology.
Abstract
The ordering of communication channels was first introduced by Shannon. In this paper, we aim to find a characterization of the Shannon ordering. We show that W′ contains W if and only if W is the skew-composition of W′ with a convex-product channel. This fact is used to derive a characterization of the Shannon ordering that is similar to the Blackwell-Sherman-Stein theorem. Two channels are said to be Shannon-equivalent if each one is contained in the other. We investigate the topologies that can be constructed on the space of Shannon-equivalent channels. We introduce the strong topology and the BRM metric on this space. Finally, we study the continuity of a few channel parameters and operations under the strong topology.
A Characterization of the Shannon Ordering of Communication Channels
I Introduction
The ordering of communication channels was first introduced by Shannon in [1]. A channel W′ is said to contain another channel W if W can be simulated from W′ by randomization at the input and the output using a shared randomness between the transmitter and the receiver. Shannon showed that the existence of an (n,M,ϵ) code for W implies the existence of an (n,M,ϵ) code for W′.
Another ordering that has been well studied is the degradedness between channels. A channel W is said to be degraded from another channel W′ if W can be simulated from W′ by randomization at the output, or more precisely, if W can be obtained from W′ by composing it with another channel. It is easy to see that degradedness is a special case of Shannon’s ordering. One can trace the roots of the notion of degradedness to the seminal work of Blackwell in the 1950’s about comparing statistical experiments [2]. Note that in the Shannon ordering, the input and output alphabets need not be the same, whereas in the degradedness definition, we have to assume that W and W′ share the same input alphabet X but they can have different output alphabets. A characterization of degradedness is given by the famous Blackwell-Sherman-Stein (BSS) theorem [2], [3], [4].
In [5], we introduced the input-degradedness ordering of communication channels. A channel W is said to be input-degraded from another channel W′ if W can be simulated from W′ by randomization at the input. Note that W and W′ must have the same output alphabet, but they can have different input alphabets. In [5], we provided two characterizations of input-degradedness, one of which is similar to the BSS theorem. The main purpose of this paper is to find a characterization of the Shannon ordering that is similar to the BSS theorem.
In [6], Raginsky introduced the Shannon deficiency which compares a particular channel with the Shannon-equivalence class of another channel. The Shannon deficiency is not a metric that compares two Shannon-equivalence classes of channels.
In [7] and [8], we constructed topologies for the space of equivalent channels and studied the continuity of various channel parameters and operations under these topologies. In this paper, we show that some of the results in [7] and [8] can be replicated (with some variation) for the space of Shannon-equivalent channels.
II Preliminaries
We assume that the reader is familiar with the basic concepts of general topology. The main concepts and theorems that we need can be found in the preliminaries section of [7].
II-A Set-theoretic notations
For every integer n>0, we denote the set {1,…,n} as [n].
Let (Ai)i∈I be a collection of arbitrary sets indexed by I. The disjoint union of (Ai)i∈I is defined as $\coprod_{i\in I}A_i=\bigcup_{i\in I}(A_i\times\{i\})$. For every i∈I, the ith canonical injection is the mapping $\phi_i:A_i\to\coprod_{j\in I}A_j$ defined as ϕi(xi)=(xi,i). If no confusion can arise, we can identify Ai with Ai×{i} through the canonical injection. Therefore, we can see Ai as a subset of $\coprod_{j\in I}A_j$ for every i∈I.
Let R be an equivalence relation on T. For every x∈T, the set x^={y∈T:xRy} is the R-equivalence class of x. The collection of R-equivalence classes, which we denote as T/R, forms a partition of T, and it is called the quotient space of T by R. The mapping ProjR:T→T/R defined as ProjR(x)=x^ for every x∈T is the projection mapping onto T/R.
II-B Measure theoretic notations
The set of probability measures on a measurable space (M,Σ) is denoted as P(M,Σ). For every P1,P2∈P(M,Σ), the total variation distance between P1 and P2 is defined as:
$$\|P_1-P_2\|_{TV}=\sup_{A\in\Sigma}|P_1(A)-P_2(A)|.$$
If X is a finite set, we denote the set of probability distributions on X as ΔX. We always endow ΔX with the total variation distance and its induced topology.
II-C Quotient topology
Let (T,U) be a topological space and let R be an equivalence relation on T. The quotient topology on T/R is the finest topology that makes the projection mapping ProjR onto the equivalence classes continuous. It is given by
$$\mathcal{U}/R=\big\{\hat{U}\subset T/R:\ \operatorname{Proj}_R^{-1}(\hat{U})\in\mathcal{U}\big\}.$$
Lemma 1.
Let f:T→S be a continuous mapping from (T,U) to (S,V). If f(x)=f(x′) for every x,x′∈T satisfying xRx′, then we can define a transcendent mapping f:T/R→S such that f(x^)=f(x′) for any x′∈x^. f is well defined on T/R. Moreover, f is a continuous mapping from (T/R,U/R) to (S,V).
Let (T,U) and (S,V) be two topological spaces and let R be an equivalence relation on T. Consider the equivalence relation R′ on T×S defined as (x1,y1)R′(x2,y2) if and only if x1Rx2 and y1=y2. A natural question to ask is whether the canonical bijection between $\big((T/R)\times S,(\mathcal{U}/R)\otimes\mathcal{V}\big)$ and $\big((T\times S)/R^{\prime},(\mathcal{U}\otimes\mathcal{V})/R^{\prime}\big)$ is a homeomorphism. It turns out that this is not the case in general. The following theorem, which is widely used in algebraic topology, provides a sufficient condition:
Theorem 1.
[9] If (S,V) is locally compact and Hausdorff, then the canonical bijection between $\big((T/R)\times S,(\mathcal{U}/R)\otimes\mathcal{V}\big)$ and $\big((T\times S)/R^{\prime},(\mathcal{U}\otimes\mathcal{V})/R^{\prime}\big)$ is a homeomorphism.
Corollary 1.
[8] Let (T,U) and (S,V) be two topological spaces, and let RT and RS be two equivalence relations on T and S respectively. Define the equivalence relation R on T×S as (x1,y1)R(x2,y2) if and only if x1RTx2 and y1RSy2. If (S,V) and (T/RT,U/RT) are locally compact and Hausdorff, then the canonical bijection between $\big((T/R_{T})\times(S/R_{S}),(\mathcal{U}/R_{T})\otimes(\mathcal{V}/R_{S})\big)$ and $\big((T\times S)/R,(\mathcal{U}\otimes\mathcal{V})/R\big)$ is a homeomorphism.
II-D The space of channels from X to Y
Let DMCX,Y be the set of all channels having X as input alphabet and Y as output alphabet. For every W,W′∈DMCX,Y, define the distance between W and W′ as:
[TABLE]
Throughout this paper, we always associate the space DMCX,Y with the metric distance dX,Y and the metric topology TX,Y induced by it. It is easy to see that TX,Y is the same as the topology inherited from the Euclidean topology of RX×Y by relativization. It is also easy to see that the metric space DMCX,Y is compact and path-connected (see [7]).
For every W∈DMCX,Y and every V∈DMCY,Z, define the composition V∘W∈DMCX,Z as
$$(V\circ W)(z|x)=\sum_{y\in\mathcal{Y}}V(z|y)\,W(y|x).$$
For every mapping f:X→Y, define the deterministic channel Df∈DMCX,Y as
$$D_f(y|x)=\begin{cases}1&\text{if }y=f(x),\\0&\text{otherwise}.\end{cases}$$
It is easy to see that if f:X→Y and g:Y→Z, then Dg∘Df=Dg∘f.
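As a quick numerical illustration of the composition rule and of the identity Dg∘Df=Dg∘f, the following Python sketch stores a channel as a row-stochastic list of lists, with `W[x][y]` standing for W(y∣x). The helper names `compose` and `deterministic`, the integer labelling of the alphabets, and the particular maps `f` and `g` are illustrative assumptions, not notation from the paper.

```python
def compose(V, W):
    """Return V∘W, assuming (V∘W)(z|x) = sum_y V(z|y) W(y|x).

    W[x][y] = W(y|x) is applied first, V[y][z] = V(z|y) second.
    """
    n_y, n_z = len(V), len(V[0])
    return [[sum(W_x[y] * V[y][z] for y in range(n_y)) for z in range(n_z)]
            for W_x in W]


def deterministic(f, n_in, n_out):
    """Return D_f as an n_in x n_out matrix with D_f(y|x) = 1 iff y = f(x)."""
    return [[1.0 if f(x) == y else 0.0 for y in range(n_out)]
            for x in range(n_in)]


# Hypothetical maps on small alphabets labelled 0, 1, 2.
f = lambda x: x % 2      # f: {0,1,2} -> {0,1}
g = lambda y: 2 * y      # g: {0,1}   -> {0,1,2}

D_f = deterministic(f, 3, 2)
D_g = deterministic(g, 2, 3)
D_gf = deterministic(lambda x: g(f(x)), 3, 3)

# D_g ∘ D_f should coincide with D_{g∘f}.
identity_holds = compose(D_g, D_f) == D_gf
```

Because every entry is 0 or 1, the comparison is exact and no floating-point tolerance is needed here.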
II-E Channel parameters
The capacity of a channel W∈DMCX,Y is denoted as C(W).
An (n,M)-encoder on the alphabet X is a mapping $\mathcal{E}:\mathcal{M}\to\mathcal{X}^n$ such that $|\mathcal{M}|=M$. The set $\mathcal{M}$ is the message set of $\mathcal{E}$, n is the blocklength of $\mathcal{E}$, M is the size of $\mathcal{E}$, and $\frac{1}{n}\log M$ is the rate of $\mathcal{E}$ (measured in nats). The error probability of the ML decoder for the encoder $\mathcal{E}$ when it is used for a channel W∈DMCX,Y is given by:
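$$P_{e,\mathcal{E}}(W)=1-\frac{1}{M}\sum_{y_1^n\in\mathcal{Y}^n}\max_{m\in\mathcal{M}}\prod_{i=1}^{n}W\big(y_i\,\big|\,\mathcal{E}_i(m)\big),$$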
where (E1(m),…,En(m))=E(m).
The optimal error probability of (n,M)-encoders for a channel W is given by:
$$P_{e,n,M}(W)=\min_{\mathcal{E}\text{ is an }(n,M)\text{-encoder}}P_{e,\mathcal{E}}(W).$$
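To make the ML error probability concrete, here is a small Python sketch. It assumes equiprobable messages and the standard expression 1 − (1/M) Σ over output sequences of the largest codeword likelihood; the function name `ml_error_probability` and the representation of an encoder as a list of codeword tuples are hypothetical choices made for this illustration.

```python
from itertools import product
from math import prod


def ml_error_probability(encoder, W):
    """Error probability of ML decoding for equiprobable messages.

    encoder: list of codewords, one tuple of input symbols per message.
    W:       channel as a row-stochastic matrix, W[x][y] = W(y|x).
    """
    n = len(encoder[0])          # blocklength
    n_y = len(W[0])              # output alphabet size
    # For each output sequence, the ML decoder picks the most likely codeword.
    correct = sum(
        max(prod(W[c[i]][y[i]] for i in range(n)) for c in encoder)
        for y in product(range(n_y), repeat=n)
    )
    return 1.0 - correct / len(encoder)


# Length-3 repetition code over a binary symmetric channel with crossover 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
repetition = [(0, 0, 0), (1, 1, 1)]
p_err = ml_error_probability(repetition, bsc)
```

For this example the decoder errs exactly when at least two of the three symbols are flipped, so `p_err` should be close to 3·0.01·0.9 + 0.001 = 0.028. The enumeration over all output sequences is exponential in n and is only meant for toy cases.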
II-F Channel operations
For every W1∈DMCX1,Y1 and W2∈DMCX2,Y2, define the channel sum W1⊕W2∈DMCX1∐X2,Y1∐Y2 of W1 and W2 as:
$$(W_1\oplus W_2)(y,i|x,j)=\begin{cases}W_i(y|x)&\text{if }i=j,\\0&\text{otherwise},\end{cases}$$
where X1∐X2=(X1×{1})∪(X2×{2}) is the disjoint union of X1 and X2. W1⊕W2 arises when the transmitter has two channels W1 and W2 at his disposal and he can use exactly one of them at each channel use.
We define the channel product W1⊗W2∈DMCX1×X2,Y1×Y2 of W1 and W2 as:
$$(W_1\otimes W_2)(y_1,y_2|x_1,x_2)=W_1(y_1|x_1)\,W_2(y_2|x_2).$$
W1⊗W2 arises when the transmitter has two channels W1 and W2 at his disposal and he uses both of them at each channel use. Channel sums and products were first introduced by Shannon in [10].
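The two operations can be sketched in a few lines of Python, assuming channels are stored as row-stochastic matrices `W[x][y]`. Under that assumption the channel sum is a block-diagonal matrix and the channel product is a Kronecker product. The function names and the flattening of pairs (x1,x2) into a single index are choices made for this illustration only.

```python
def channel_sum(W1, W2):
    """Block-diagonal matrix: inputs of W1 only reach outputs of W1, same for W2."""
    ny1, ny2 = len(W1[0]), len(W2[0])
    top = [list(row) + [0.0] * ny2 for row in W1]
    bottom = [[0.0] * ny1 + list(row) for row in W2]
    return top + bottom


def channel_product(W1, W2):
    """Kronecker product: row index x1*|X2| + x2, column index y1*|Y2| + y2."""
    return [[a * b for a in r1 for b in r2] for r1 in W1 for r2 in W2]


bsc = [[0.9, 0.1], [0.1, 0.9]]          # binary symmetric channel
noiseless = [[1.0, 0.0], [0.0, 1.0]]    # perfect binary channel

W_sum = channel_sum(bsc, noiseless)      # 4 inputs, 4 outputs
W_prod = channel_product(bsc, noiseless) # 4 inputs, 4 outputs

# Both results should again be row-stochastic.
rows_ok = all(abs(sum(row) - 1.0) < 1e-12 for row in W_sum + W_prod)
```

In the sum, the transmitter's choice of input symbol also selects which channel is used; in the product, one symbol is sent through each channel at every use.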
III Shannon ordering and Shannon-equivalence
Let X,X′,Y and Y′ be four finite sets. Let W∈DMCX,Y and W′∈DMCX′,Y′. We say that W′ contains W if there exist n pairs of channels (Ri,Ti)1≤i≤n and a probability distribution α∈Δ[n] such that Ri∈DMCX,X′ and Ti∈DMCY′,Y for every 1≤i≤n, and $W=\sum_{i=1}^{n}\alpha(i)\,T_i\circ W'\circ R_i$, i.e.,
$$W(y|x)=\sum_{i=1}^{n}\alpha(i)\sum_{x'\in\mathcal{X}',\,y'\in\mathcal{Y}'}T_i(y|y')\,W'(y'|x')\,R_i(x'|x).$$
The channels W and W′ are said to be Shannon-equivalent if each one contains the other.
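The definition of containment can be read as a recipe for simulating W from W′: draw an index i from α (the shared randomness), pre-process the input with Ri, use W′, and post-process the output with Ti. The Python sketch below computes the resulting mixture for a toy example; the names `compose` and `simulate` and the chosen channels are assumptions made for illustration.

```python
def compose(V, W):
    """Return V∘W with (V∘W)(z|x) = sum_y V(z|y) W(y|x); W is applied first."""
    n_y, n_z = len(V), len(V[0])
    return [[sum(W_x[y] * V[y][z] for y in range(n_y)) for z in range(n_z)]
            for W_x in W]


def simulate(W_prime, alpha, pairs):
    """Return sum_i alpha[i] * (T_i ∘ W' ∘ R_i) for pairs = [(R_i, T_i), ...]."""
    n_x = len(pairs[0][0])        # input alphabet size of W
    n_y = len(pairs[0][1][0])     # output alphabet size of W
    W = [[0.0] * n_y for _ in range(n_x)]
    for a, (R, T) in zip(alpha, pairs):
        term = compose(T, compose(W_prime, R))
        for x in range(n_x):
            for y in range(n_y):
                W[x][y] += a * term[x][y]
    return W


identity = [[1.0, 0.0], [0.0, 1.0]]
flip = [[0.0, 1.0], [1.0, 0.0]]
bsc = [[0.9, 0.1], [0.1, 0.9]]

# With probability 1/2 the input is flipped before transmission and the
# output is left untouched, so the receiver's observation becomes useless.
W = simulate(bsc, [0.5, 0.5], [(identity, identity), (flip, identity)])
```

In this example every entry of `W` is (up to rounding) 1/2, so the binary symmetric channel contains the completely noisy binary channel, as expected.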
A channel V∈DMCX×Y′,X′×Y is said to be a convex-product channel if it is the convex combination of the products of channels in DMCX,X′ with channels in DMCY′,Y. More precisely, V∈DMCX×Y′,X′×Y is a convex-product channel if there exist n pairs of channels (Ri,Ti)1≤i≤n and a probability distribution α∈Δ[n] such that Ri∈DMCX,X′ and Ti∈DMCY′,Y for every 1≤i≤n, and
$$V(x',y|x,y')=\sum_{i=1}^{n}\alpha(i)\,R_i(x'|x)\,T_i(y|y').$$
We denote the set of convex-product channels from X×Y′ to X′×Y as CPCX×Y′,X′×Y.
Proposition 1.
The space CPCX×Y′,X′×Y is a compact and convex subset of DMCX×Y′,X′×Y.
Proof.
Define the set of product channels
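$$\operatorname{PC}_{\mathcal{X}\times\mathcal{Y}',\mathcal{X}'\times\mathcal{Y}}=\big\{R\otimes T:\ R\in\operatorname{DMC}_{\mathcal{X},\mathcal{X}'},\ T\in\operatorname{DMC}_{\mathcal{Y}',\mathcal{Y}}\big\}.$$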
Clearly, CPCX×Y′,X′×Y is the convex hull of PCX×Y′,X′×Y and so CPCX×Y′,X′×Y is convex. Now since PCX×Y′,X′×Y can be seen as a subset of RX×Y′×X′×Y, it follows from the Carathéodory theorem that every channel V in CPCX×Y′,X′×Y can be written as a convex combination of at most
$$n=|\mathcal{X}|\cdot|\mathcal{Y}'|\cdot|\mathcal{X}'|\cdot|\mathcal{Y}|+1$$
product channels in PCX×Y′,X′×Y. Define the mapping
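$$f:\Delta_{[n]}\times\big(\operatorname{DMC}_{\mathcal{X},\mathcal{X}'}\times\operatorname{DMC}_{\mathcal{Y}',\mathcal{Y}}\big)^n\to\operatorname{DMC}_{\mathcal{X}\times\mathcal{Y}',\mathcal{X}'\times\mathcal{Y}}$$
as
$$f\big(\alpha,(R_i,T_i)_{1\leq i\leq n}\big)=\sum_{i=1}^{n}\alpha(i)\,R_i\otimes T_i.$$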
Since Δ[n], DMCX,X′ and DMCY′,Y are compact, the space Δ[n]×(DMCX,X′×DMCY′,Y)n is compact. Moreover, since f is continuous, it follows that
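$$\operatorname{CPC}_{\mathcal{X}\times\mathcal{Y}',\mathcal{X}'\times\mathcal{Y}}=f\Big(\Delta_{[n]}\times\big(\operatorname{DMC}_{\mathcal{X},\mathcal{X}'}\times\operatorname{DMC}_{\mathcal{Y}',\mathcal{Y}}\big)^n\Big)$$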
is compact.
∎
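Let X,X′,X′′,Y,Y′ and Y′′ be finite sets. For every V∈CPCX×Y′,X′×Y and every V′∈DMCX′×Y′′,X′′×Y′, define the skew-composition V∘sV′∈DMCX×Y′′,X′′×Y of V′ with V as follows:
$$(V\circ_s V')(x'',y|x,y'')=\sum_{x'\in\mathcal{X}',\,y'\in\mathcal{Y}'}V(x',y|x,y')\,V'(x'',y'|x',y'')\tag{1}$$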
for every x′′∈X′′, y∈Y, x∈X and y′′∈Y′′. It may not be immediately clear from (1) that V∘sV′ is a valid channel in DMCX×Y′′,X′′×Y. In the following, we show that V∘sV′∈DMCX×Y′′,X′′×Y.
Let n≥1, α∈Δ[n], (Ri,Ti)1≤i≤n be such that Ri∈DMCX,X′ and Ti∈DMCY′,Y for every 1≤i≤n, and
$$V=\sum_{i=1}^{n}\alpha(i)\,R_i\otimes T_i.$$
For every (x,y′′)∈X×Y′′, we have
[TABLE]
Therefore, V∘sV′∈DMCX×Y′′,X′′×Y. Note that if V∈DMCX×Y′,X′×Y and V∉CPCX×Y′,X′×Y, then the skew-composition of V′ with V as defined in Equation (1) does not always yield a valid channel in DMCX×Y′′,X′′×Y.
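The last remark can be checked numerically. The Python sketch below works in the special case where X′′ and Y′′ are singletons, reading the skew-composition as (V∘sW′)(y∣x)=Σ V(x′,y∣x,y′)W′(y′∣x′), summed over x′ and y′. The channel `V_bad` is a hypothetical example chosen for this illustration: it is a valid channel from X×Y′ to X′×Y, but its X′-output copies the Y′-input, so it cannot be written as a mixture of products Ri⊗Ti.

```python
def skew_compose(V, W_prime):
    """(V o_s W')(y|x) = sum_{x',y'} V(x',y|x,y') W'(y'|x').

    V is indexed as V[x][y_prime][x_prime][y]; W_prime[x_prime][y_prime].
    """
    n_x, n_yp, n_xp, n_y = len(V), len(V[0]), len(V[0][0]), len(V[0][0][0])
    return [[sum(V[x][yp][xp][y] * W_prime[xp][yp]
                 for xp in range(n_xp) for yp in range(n_yp))
             for y in range(n_y)] for x in range(n_x)]


n = 2
# V_bad(x', y | x, y') = 1 if x' == y' and y == 0: the symbol fed INTO W'
# depends on the symbol coming OUT of W', which no product channel allows.
V_bad = [[[[1.0 if (xp == yp and y == 0) else 0.0 for y in range(n)]
           for xp in range(n)] for yp in range(n)] for x in range(n)]

W_prime = [[0.9, 0.1], [0.1, 0.9]]
result = skew_compose(V_bad, W_prime)
row_sums = [sum(row) for row in result]
```

Each row of `result` sums to W′(0∣0)+W′(1∣1)=1.8 rather than 1, so the output is not a channel. This is only one counterexample under the stated reading of Equation (1); it does not characterize which non-convex-product channels fail.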
Lemma 2.
If V∈CPCX×Y′,X′×Y and V′∈CPCX′×Y′′,X′′×Y′, then V∘sV′∈CPCX×Y′′,X′′×Y.
Proof.
Let n≥1, α∈Δ[n], (Ri,Ti)1≤i≤n be such that Ri∈DMCX,X′ and Ti∈DMCY′,Y for every 1≤i≤n, and
$$V=\sum_{i=1}^{n}\alpha(i)\,R_i\otimes T_i.$$
Let n′≥1, α′∈Δ[n′], (Rj′,Tj′)1≤j≤n′ be such that Rj′∈DMCX′,X′′ and Tj′∈DMCY′′,Y′ for every 1≤j≤n′, and
$$V'=\sum_{j=1}^{n'}\alpha'(j)\,R'_j\otimes T'_j.$$
We have
[TABLE]
Therefore, V∘sV′∈CPCX×Y′′,X′′×Y.
∎
For every W′∈DMCX′,Y′ and every V∈CPCX×Y′,X′×Y, we define the skew-composition V∘sW′∈DMCX,Y of W′ with V as follows:
$$(V\circ_s W')(y|x)=\sum_{x'\in\mathcal{X}',\,y'\in\mathcal{Y}'}V(x',y|x,y')\,W'(y'|x').\tag{2}$$
Note that Equation (2) can be seen as a particular case of Equation (1) if we let X′′=Y′′={0} (i.e., a singleton) and we identify DMCX′,Y′ with DMCX′×Y′′,X′′×Y′.
The following lemma is trivial:
Lemma 3.
Let W∈DMCX,Y and W′∈DMCX′,Y′. W′ contains W if and only if there exists V∈CPCX×Y′,X′×Y such that W=V∘sW′.
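Lemma 3 can be sanity-checked on a toy example: building the convex-product channel V from (α,(Ri,Ti)) and skew-composing it with W′ should give the same matrix as the mixture Σ α(i) Ti∘W′∘Ri. The Python sketch below does this; all function names, the index order `V[x][y_prime][x_prime][y]`, and the example channels are assumptions made for illustration.

```python
def compose(V, W):
    """Return V∘W with (V∘W)(z|x) = sum_y V(z|y) W(y|x); W is applied first."""
    return [[sum(W_x[y] * V[y][z] for y in range(len(V)))
             for z in range(len(V[0]))] for W_x in W]


def convex_product(alpha, pairs):
    """V[x][yp][xp][y] = sum_i alpha[i] * R_i(xp|x) * T_i(y|yp)."""
    n_x, n_xp = len(pairs[0][0]), len(pairs[0][0][0])
    n_yp, n_y = len(pairs[0][1]), len(pairs[0][1][0])
    return [[[[sum(a * R[x][xp] * T[yp][y] for a, (R, T) in zip(alpha, pairs))
               for y in range(n_y)] for xp in range(n_xp)]
             for yp in range(n_yp)] for x in range(n_x)]


def skew_compose(V, W_prime):
    """(V o_s W')(y|x) = sum_{x',y'} V(x',y|x,y') W'(y'|x')."""
    n_x, n_yp, n_xp, n_y = len(V), len(V[0]), len(V[0][0]), len(V[0][0][0])
    return [[sum(V[x][yp][xp][y] * W_prime[xp][yp]
                 for xp in range(n_xp) for yp in range(n_yp))
             for y in range(n_y)] for x in range(n_x)]


def mixture(W_prime, alpha, pairs):
    """sum_i alpha[i] * (T_i ∘ W' ∘ R_i), computed term by term."""
    terms = [compose(T, compose(W_prime, R)) for R, T in pairs]
    return [[sum(a * t[x][y] for a, t in zip(alpha, terms))
             for y in range(len(terms[0][0]))] for x in range(len(terms[0]))]


identity = [[1.0, 0.0], [0.0, 1.0]]
flip = [[0.0, 1.0], [1.0, 0.0]]
W_prime = [[0.9, 0.1], [0.2, 0.8]]
alpha = [0.25, 0.75]
pairs = [(identity, identity), (flip, flip)]

lhs = skew_compose(convex_product(alpha, pairs), W_prime)
rhs = mixture(W_prime, alpha, pairs)
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
```

For this example both sides should be close to [[0.825, 0.175], [0.125, 0.875]], and `max_gap` should be at the level of floating-point rounding.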
IV A characterization of the Shannon ordering
A blind randomized in the middle (BRM) game is a 6-tuple G=(U,X,Y,V,l,W) such that U,X,Y and V are finite sets, l is a mapping from U×V to R, and W∈DMCX,Y. The mapping l is called the payoff function of the BRM game G, and the channel W is called the randomizer of G. The BRM game consists of two players that we call Alice and Bob. The BRM game takes place in two stages:
• Alice chooses a symbol u∈U and writes her choice on a piece of paper. Bob chooses two functions f:U→X and g:Y→V, and writes a description of f and g on a piece of paper. At this stage, no player has knowledge of the choice of the other player.
• Alice and Bob simultaneously reveal their papers. They compute x=f(u)∈X and then randomly generate a symbol y∈Y according to the conditional probability distribution W(y∣x). Finally, v=g(y) is computed and then Alice pays Bob an amount of money that is equal to l(u,v). (If l(u,v)<0, then Bob pays Alice an amount of money that is equal to −l(u,v).)
A strategy (for Bob) in the BRM game G is a 4-tuple S=(n,α,f,g) satisfying:
• n≥1 is a strictly positive integer.
• α∈Δ[n].
• f=(fi)1≤i≤n∈(XU)n, where XU is the set of functions from U to X.
• g=(gi)1≤i≤n∈(VY)n.
We denote n and α as nS and αS respectively. For every 1≤i≤n=nS, we denote fi and gi as fi,S and gi,S respectively. The set of strategies is denoted as SU,X,Y,V.
Bob implements the strategy S as follows: he randomly picks an index i∈{1,…,nS} according to the distribution αS, and then commits to the choice (fi,S,gi,S).
For every u∈U, the payoff gained by the strategy S for u in the BRM game G is given by:
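$$\$(u,S,\mathcal{G})=\sum_{i=1}^{n_S}\alpha_S(i)\sum_{y\in\mathcal{Y}}W\big(y\,\big|\,f_{i,S}(u)\big)\,l\big(u,g_{i,S}(y)\big).$$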
The payoff vector gained by the strategy S in the game G is given by:
$$\vec{\$}(S,\mathcal{G})=\big(\$(u,S,\mathcal{G})\big)_{u\in\mathcal{U}}\in\mathbb{R}^{\mathcal{U}}.$$
The achievable payoff region for the game G is given by:
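$$\$_{\operatorname{ach}}(\mathcal{G})=\big\{\vec{\$}(S,\mathcal{G}):\ S\in\mathcal{S}_{\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V}}\big\}\subset\mathbb{R}^{\mathcal{U}}.$$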
The average payoff for the strategy S∈SU,X,Y,V in the game G is given by:
$$\hat{\$}(S,\mathcal{G})=\frac{1}{|\mathcal{U}|}\sum_{u\in\mathcal{U}}\$(u,S,\mathcal{G}).$$
$\hat{\$}(S,\mathcal{G})$ is the expected gain of Bob assuming that Alice chooses $u\in\mathcal{U}$ uniformly at random.
The optimal average payoff for the game G is given by
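$$\$_{\operatorname{opt}}(\mathcal{G})=\sup_{S\in\mathcal{S}_{\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V}}}\hat{\$}(S,\mathcal{G}).$$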
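The payoff quantities of a BRM game can be computed directly from the description of the game. The Python sketch below assumes that the payoff for u is the expectation of l(u,g(y)) when y is drawn from W(·∣f(u)) and the pair (f,g) is drawn according to α, and that the average payoff is the uniform average over u. Since the average payoff is then linear in α, the optimum over strategies is attained by a single pair (f,g), which the sketch finds by brute force. Function names and the tuple encoding of f and g are hypothetical.

```python
from itertools import product


def payoff(u, alpha, fs, gs, W, l):
    """Expected amount Bob receives when Alice plays u.

    fs[i][u] is f_i(u) in X, gs[i][y] is g_i(y) in V, W[x][y] = W(y|x).
    """
    n_y = len(W[0])
    return sum(a * sum(W[f[u]][y] * l[u][g[y]] for y in range(n_y))
               for a, f, g in zip(alpha, fs, gs))


def average_payoff(alpha, fs, gs, W, l):
    """Average of payoff(u, ...) over u chosen uniformly at random."""
    n_u = len(l)
    return sum(payoff(u, alpha, fs, gs, W, l) for u in range(n_u)) / n_u


def optimal_average_payoff(W, l):
    """Brute-force maximum over deterministic pairs (f, g)."""
    n_u, n_v = len(l), len(l[0])
    n_x, n_y = len(W), len(W[0])
    return max(average_payoff([1.0], [f], [g], W, l)
               for f in product(range(n_x), repeat=n_u)
               for g in product(range(n_v), repeat=n_y))


# Normalized and positive payoff: Bob is rewarded when v equals u.
l = [[0.5, 0.0], [0.0, 0.5]]
bsc = [[0.9, 0.1], [0.1, 0.9]]
useless = [[0.5, 0.5], [0.5, 0.5]]

opt_bsc = optimal_average_payoff(bsc, l)
opt_useless = optimal_average_payoff(useless, l)
```

Here `opt_bsc` should be close to 0.45 and `opt_useless` close to 0.25. The ordering of the two values is consistent with the fact that the binary symmetric channel contains the completely noisy channel. The search is exponential in the alphabet sizes and is meant for toy games only.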
For every S∈SU,X,Y,V, we associate the convex-product channel VS∈CPCU×Y,X×V defined as
$$V_S(x,v|u,y)=\sum_{i=1}^{n_S}\alpha_S(i)\,D_{f_{i,S}}(x|u)\,D_{g_{i,S}}(v|y).$$
For every u∈U, we have
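$$\$(u,S,\mathcal{G})=\sum_{x\in\mathcal{X},\,y\in\mathcal{Y},\,v\in\mathcal{V}}V_S(x,v|u,y)\,W(y|x)\,l(u,v)=\sum_{v\in\mathcal{V}}(V_S\circ_s W)(v|u)\,l(u,v).\tag{3}$$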
Lemma 4.
For every V∈CPCU×Y,X×V, there exists S∈SU,X,Y,V such that V=VS.
Proof.
Let n≥1, α∈Δ[n], (Ri,Ti)1≤i≤n be such that Ri∈DMCU,X and Ti∈DMCY,V for every 1≤i≤n, and
$$V=\sum_{i=1}^{n}\alpha(i)\,R_i\otimes T_i.\tag{4}$$
Since every channel can be written as a convex combination of deterministic channels [1], we can rewrite (4) as a convex combination of products of deterministic channels. Therefore, there exists S∈SU,X,Y,V such that V=VS.
∎
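The step used in this proof, writing a channel as a convex combination of deterministic channels, can be carried out by a simple greedy procedure. The Python sketch below is one such procedure (not necessarily the construction of [1]): at each round it picks, for every input, the output with the largest remaining mass, and peels off the largest common weight. It uses exact rational arithmetic, and the function name and the tuple encoding of deterministic maps are assumptions made for illustration.

```python
from fractions import Fraction


def deterministic_decomposition(W):
    """Return [(weight, f), ...] with W(y|x) = sum of weight over terms with f[x] == y.

    W must be row-stochastic with exact rational entries (e.g. Fractions),
    so that all rows keep the same remaining mass after every round.
    """
    rest = [[Fraction(p) for p in row] for row in W]
    terms = []
    while any(p > 0 for p in rest[0]):
        # For each input x, pick the output carrying the most remaining mass.
        f = tuple(max(range(len(row)), key=row.__getitem__) for row in rest)
        # Largest weight that can be removed from every row simultaneously.
        w = min(rest[x][f[x]] for x in range(len(rest)))
        for x in range(len(rest)):
            rest[x][f[x]] -= w
        terms.append((w, f))
    return terms


W = [[Fraction(9, 10), Fraction(1, 10)],
     [Fraction(1, 5), Fraction(4, 5)]]
terms = deterministic_decomposition(W)

# Rebuild W from the decomposition to check it.
rebuilt = [[sum(w for w, f in terms if f[x] == y) for y in range(2)]
           for x in range(2)]
```

Every round sets at least one entry of the remainder to zero, so the loop stops after finitely many rounds. For the example above the procedure yields three deterministic maps with weights 4/5, 1/10 and 1/10.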
Equation (3) and Lemma 4 imply that $\$_{\operatorname{ach}}(\mathcal{G})$ is the image of $\operatorname{CPC}_{\mathcal{U}\times\mathcal{Y},\mathcal{X}\times\mathcal{V}}$ by a linear function. Since $\operatorname{CPC}_{\mathcal{U}\times\mathcal{Y},\mathcal{X}\times\mathcal{V}}$ is convex and compact (Proposition 1), $\$_{\operatorname{ach}}(\mathcal{G})$ is convex and compact as well.
Let U and V be two finite sets and let l:U×V→R be a payoff function. We say that l is normalized and positive if l(u,v)≥0 for every u∈U and every v∈V, and
$$\sum_{u\in\mathcal{U},\,v\in\mathcal{V}}l(u,v)=1.$$
In other words, l is normalized and positive if l∈ΔU×V.
The following theorem provides a characterization of the Shannon ordering of communication channels that is similar to the BSS theorem.
Theorem 2.
Let X,X′,Y and Y′ be four finite sets. Let W∈DMCX,Y and W′∈DMCX′,Y′. The following conditions are equivalent:
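(a) W′ contains W.
(b) For every two finite sets U and V, and every payoff function l:U×V→R, we have
$$\$_{\operatorname{ach}}(\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V},l,W)\subset\$_{\operatorname{ach}}(\mathcal{U},\mathcal{X}',\mathcal{Y}',\mathcal{V},l,W').$$
(c) For every two finite sets U and V, and every payoff function l:U×V→R, we have
$$\$_{\operatorname{opt}}(\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V},l,W)\leq\$_{\operatorname{opt}}(\mathcal{U},\mathcal{X}',\mathcal{Y}',\mathcal{V},l,W').$$
(d) For every two finite sets U and V, and every normalized and positive payoff function l∈ΔU×V, we have
$$\$_{\operatorname{ach}}(\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V},l,W)\subset\$_{\operatorname{ach}}(\mathcal{U},\mathcal{X}',\mathcal{Y}',\mathcal{V},l,W').$$
(e) For every two finite sets U and V, and every normalized and positive payoff function l∈ΔU×V, we have
$$\$_{\operatorname{opt}}(\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V},l,W)\leq\$_{\operatorname{opt}}(\mathcal{U},\mathcal{X}',\mathcal{Y}',\mathcal{V},l,W').$$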
Proof.
Assume that (a) is true. Lemma 3 implies that there exists V∈CPCX×Y′,X′×Y such that W=V∘sW′. Let U and V be two finite sets, and let l:U×V→R be a payoff function. Define G=(U,X,Y,V,l,W) and G′=(U,X′,Y′,V,l,W′).
Fix $\vec{v}\in\$_{\operatorname{ach}}(\mathcal{G})$. There exists $S\in\mathcal{S}_{\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V}}$ such that $\vec{v}=\vec{\$}(S,\mathcal{G})=\big(\$(u,S,\mathcal{G})\big)_{u\in\mathcal{U}}$. From Equation (3) we have:
[TABLE]
Lemma 2 implies that VS∘sV∈CPCU×Y′,X′×V and Lemma 4 implies that there exists S′∈SU,X′,Y′,V such that VS′=VS∘sV. Therefore,
[TABLE]
where (∗) follows from Equation (3). This shows that $\vec{v}=\big(\$(u,S^{\prime},\mathcal{G}^{\prime})\big)_{u\in\mathcal{U}}$, hence $\$_{\operatorname{ach}}(\mathcal{G})\subset\$_{\operatorname{ach}}(\mathcal{G}^{\prime})$. Therefore, (a) implies (b).
Now assume that (b) is true. Let U and V be two finite sets, and let l:U×V→R be a payoff function. Define G=(U,X,Y,V,l,W) and G′=(U,X′,Y′,V,l,W′). We have $\$_{\operatorname{ach}}(\mathcal{G})\subset\$_{\operatorname{ach}}(\mathcal{G}^{\prime})$. Therefore,
[TABLE]
where (∗∗) follows from the fact that $\$_{\operatorname{ach}}(\mathcal{G})\subset\$_{\operatorname{ach}}(\mathcal{G}^{\prime})$. This shows that (b) implies (c). We can show similarly that (d) implies (e).
Trivially, (b) implies (d), and (c) implies (e).
Now assume that (e) is true. For every normalized and positive payoff function l∈ΔX×Y, define the BRM games G=(X,X,Y,Y,l,W) and G′=(X,X′,Y′,Y,l,W′). We have $\$_{\operatorname{opt}}(\mathcal{G})\leq\$_{\operatorname{opt}}(\mathcal{G}^{\prime})$.
Fix a strategy S∈SX,X,Y,Y satisfying nS=1, f1,S(x)=x for all x∈X and g1,S(y)=y for all y∈Y. Clearly αS(1)=1, hence
Moreover, since ΔX×Y and CPCX×Y′,X′×Y are compact (see Proposition 1), the sup and the inf are attainable. Therefore, we can write:
[TABLE]
Since the function $\sum_{x\in\mathcal{X},\,y\in\mathcal{Y}}\big(W(y|x)-(V\circ_{s}W^{\prime})(y|x)\big)l(x,y)$ is affine in both l∈ΔX×Y and V∈CPCX×Y′,X′×Y, it is continuous, concave in l and convex in V. On the other hand, the sets ΔX×Y and CPCX×Y′,X′×Y are compact and convex (see Proposition 1). Therefore, we can apply the minimax theorem [11] to exchange the max and the min in Equation (5). We obtain:
[TABLE]
Therefore, there exists V∈CPCX×Y′,X′×Y such that
[TABLE]
where (††) follows from the fact that $\sum_{x\in\mathcal{X},\,y\in\mathcal{Y}}\big(W(y|x)-(V\circ_{s}W^{\prime})(y|x)\big)l(x,y)$ is maximized when we choose l∈ΔX×Y in such a way that l(x0,y0)=1 for any (x0,y0)∈X×Y satisfying
[TABLE]
We conclude that for every (x,y)∈X×Y, we have
[TABLE]
Now since $\sum_{y\in\mathcal{Y}}W(y|x)=\sum_{y\in\mathcal{Y}}(V\circ_{s}W^{\prime})(y|x)$ for every x∈X, we must have W(y∣x)=(V∘sW′)(y∣x) for every (x,y)∈X×Y. Therefore, W=V∘sW′. Lemma 3 now implies that W′ contains W, hence (e) implies (a). We conclude that the conditions (a), (b), (c), (d) and (e) are equivalent.
∎
V Space of Shannon-equivalent channels from X to Y
V-A The DMCX,Y(s) space
Let X and Y be two finite sets. Define the equivalence relation RX,Y(s) on DMCX,Y as follows:
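$$W\,R_{\mathcal{X},\mathcal{Y}}^{(s)}\,W^{\prime}\;\;\Leftrightarrow\;\;W\text{ is Shannon-equivalent to }W^{\prime}.$$
Definition 1.
The space of Shannon-equivalent channels with input alphabet X and output alphabet Y is the quotient of the space of channels from X to Y by the Shannon-equivalence relation:
$$\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}^{(s)}=\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}/R_{\mathcal{X},\mathcal{Y}}^{(s)}.$$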
We define the topology TX,Y(s) on DMCX,Y(s) as the quotient topology TX,Y/RX,Y(s).
Notation 1.
Let (U,X,Y,V,l,W) be a BRM game. Since U,X,Y and V are implicitly determined by l and W, we may simply write $\$_{\operatorname{opt}}(l,W)$ to denote $\$_{\operatorname{opt}}(\mathcal{U},\mathcal{X},\mathcal{Y},\mathcal{V},l,W)$.
Let W,W′∈DMCX,Y. Theorem 2 shows that W′ contains W if and only if $\$_{\operatorname{opt}}(l,W)\leq\$_{\operatorname{opt}}(l,W^{\prime})$ for every $l\in\Delta_{\mathcal{U}\times\mathcal{V}}$ and every two finite sets $\mathcal{U}$ and $\mathcal{V}$. Therefore, $W R_{\mathcal{X},\mathcal{Y}}^{(s)} W^{\prime}$ if and only if $\$_{\operatorname{opt}}(l,W)=\$_{\operatorname{opt}}(l,W^{\prime})$ for every $l\in\Delta_{\mathcal{U}\times\mathcal{V}}$ and every two finite sets $\mathcal{U}$ and $\mathcal{V}$. This shows that $\$_{\operatorname{opt}}(l,W)$ only depends on the $R_{\mathcal{X},\mathcal{Y}}^{(s)}$-equivalence class of W. Therefore, if $\hat{W}\in\operatorname{DMC}_{\mathcal{X},\mathcal{Y}}^{(s)}$, we can define $\$_{\operatorname{opt}}(l,\hat{W}):=\$_{\operatorname{opt}}(l,W^{\prime})$ for any $W^{\prime}\in\hat{W}$.
Define the BRM metric dX,Y(s) on DMCX,Y(s) as follows:
[TABLE]
Proposition 2.
Let W1,W2∈DMCX,Y and let W^1 and W^2 be the RX,Y(s)-equivalence classes of W1 and W2 respectively. We have dX,Y(s)(W^1,W^2)≤dX,Y(W1,W2).
Theorem 3.
The topology induced by dX,Y(s) on DMCX,Y(s) is the same as the quotient topology TX,Y(s). Moreover, (DMCX,Y(s),dX,Y(s)) is compact and path-connected.
Proof.
Since (DMCX,Y,dX,Y) is compact and path-connected, the quotient space (DMCX,Y(s),TX,Y(s)) is compact and path-connected.
Define the mapping Proj:DMCX,Y→DMCX,Y(s) as Proj(W)=W^, where W^ is the RX,Y(s)-equivalence class of W. Proposition 2 implies that Proj is a continuous mapping from (DMCX,Y,dX,Y) to (DMCX,Y(s),dX,Y(s)). Since Proj(W) depends only on W^, Lemma 1 implies that the transcendent mapping of Proj defined on the quotient space (DMCX,Y(s),TX,Y(s)) is continuous. But the transcendent mapping of Proj is nothing but the identity on DMCX,Y(s). Therefore, the identity mapping id on DMCX,Y(s) is a continuous mapping from (DMCX,Y(s),TX,Y(s)) to (DMCX,Y(s),dX,Y(s)).
For every subset U of DMCX,Y(s) we have:
• If U is open in (DMCX,Y(s),dX,Y(s)), then U=id−1(U) is open in (DMCX,Y(s),TX,Y(s)).
• If U is open in (DMCX,Y(s),TX,Y(s)), then its complement Uc is closed in (DMCX,Y(s),TX,Y(s)) which is compact, hence Uc is compact in (DMCX,Y(s),TX,Y(s)). This shows that Uc=id(Uc) is a compact subset of (DMCX,Y(s),dX,Y(s)). But (DMCX,Y(s),dX,Y(s)) is a metric space, so Uc is closed in (DMCX,Y(s),dX,Y(s)). Therefore, U is open in (DMCX,Y(s),dX,Y(s)).
We conclude that (DMCX,Y(s),TX,Y(s)) and (DMCX,Y(s),dX,Y(s)) have the same open sets. Therefore, the topology induced by dX,Y(s) on DMCX,Y(s) is the same as the quotient topology TX,Y(s). Now since (DMCX,Y(s),TX,Y(s)) is compact and path-connected, (DMCX,Y(s),dX,Y(s)) is compact and path-connected as well.
∎
Throughout this paper, we always associate DMCX,Y(s) with the BRM metric dX,Y(s) and the quotient topology TX,Y(s).
V-B Canonical embedding and canonical identification
Let X1,X2,Y1 and Y2 be four finite sets such that ∣X1∣≤∣X2∣ and ∣Y1∣≤∣Y2∣. We will show that there is a canonical embedding from DMCX1,Y1(s) to DMCX2,Y2(s). In other words, there exists an explicitly constructable compact subset A of DMCX2,Y2(s) such that A is homeomorphic to DMCX1,Y1(s). A and the homeomorphism depend only on X1,X2,Y1 and Y2 (this is why we say that they are canonical). Moreover, we can show that A depends only on ∣X1∣, ∣Y1∣, X2 and Y2.
Lemma 5.
For every W∈DMCX1,Y1, every surjection f from X2 to X1, and every injection g from Y1 to Y2, the channel W is Shannon-equivalent to Dg∘W∘Df.
Proof.
Clearly W contains Dg∘W∘Df. Now let f′ be any mapping from X1 to X2 such that f(f′(x1))=x1 for every x1∈X1, and let g′ be any mapping from Y2 to Y1 such that g′(g(y1))=y1 for every y1∈Y1. We have
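$$D_{g^{\prime}}\circ(D_g\circ W\circ D_f)\circ D_{f^{\prime}}=D_{g^{\prime}\circ g}\circ W\circ D_{f\circ f^{\prime}}=W,$$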
and so Dg∘W∘Df also contains W. Therefore, W and Dg∘W∘Df are Shannon-equivalent.
∎
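The two directions of this proof can be replayed numerically: embed W into larger alphabets with Dg∘W∘Df, then recover it with a right inverse f′ of f and a left inverse g′ of g. The Python sketch below does so for small alphabets; the helper names and the specific maps are assumptions chosen for illustration.

```python
def compose(V, W):
    """Return V∘W with (V∘W)(z|x) = sum_y V(z|y) W(y|x); W is applied first."""
    return [[sum(W_x[y] * V[y][z] for y in range(len(V)))
             for z in range(len(V[0]))] for W_x in W]


def deterministic(f, n_in, n_out):
    """Return D_f as an n_in x n_out matrix with D_f(y|x) = 1 iff y = f(x)."""
    return [[1.0 if f(x) == y else 0.0 for y in range(n_out)]
            for x in range(n_in)]


W = [[0.9, 0.1], [0.3, 0.7]]       # channel from X1 = {0,1} to Y1 = {0,1}

f = lambda x2: min(x2, 1)          # surjection X2 = {0,1,2} -> X1
g = lambda y1: y1                  # injection  Y1 -> Y2 = {0,1,2}
f_right_inv = lambda x1: x1        # f(f'(x1)) = x1
g_left_inv = lambda y2: min(y2, 1) # g'(g(y1)) = y1

# W_big = D_g ∘ W ∘ D_f is a channel from X2 to Y2 contained in W.
W_big = compose(deterministic(g, 2, 3), compose(W, deterministic(f, 3, 2)))

# W_back = D_g' ∘ W_big ∘ D_f' should give W back, so W_big also contains W.
W_back = compose(deterministic(g_left_inv, 3, 2),
                 compose(W_big, deterministic(f_right_inv, 2, 3)))

recovered = all(abs(a - b) < 1e-12
                for ra, rb in zip(W_back, W) for a, b in zip(ra, rb))
```

In this example `W_big` duplicates the second input of W and adds an output symbol that is never produced, which changes the alphabets but not the Shannon-equivalence class.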
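Corollary 2.
For every W,W′∈DMCX1,Y1, every two surjections f,f′ from X2 to X1, and every two injections g,g′ from Y1 to Y2, we have:
$$W\,R_{\mathcal{X}_1,\mathcal{Y}_1}^{(s)}\,W^{\prime}\;\;\Leftrightarrow\;\;(D_g\circ W\circ D_f)\,R_{\mathcal{X}_2,\mathcal{Y}_2}^{(s)}\,(D_{g^{\prime}}\circ W^{\prime}\circ D_{f^{\prime}}).$$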
Proof.
Since W is Shannon-equivalent to Dg∘W∘Df and W′ is Shannon-equivalent to Dg′∘W′∘Df′, then W is Shannon-equivalent to W′ if and only if Dg∘W∘Df is Shannon-equivalent to Dg′∘W′∘Df′.
∎
For every W∈DMCX1,Y1, we denote the RX1,Y1(s)-equivalence class of W as W^, and for every W∈DMCX2,Y2, we denote the RX2,Y2(s)-equivalence class of W as W~.
Proposition 3.
Let X1,X2,Y1 and Y2 be four finite sets such that ∣X1∣≤∣X2∣ and ∣Y1∣≤∣Y2∣. Let f:X2→X1 be any fixed surjection from X2 to X1, and let g:Y1→Y2 be any fixed injection from Y1 to Y2. Define the mapping F:DMCX1,Y1(s)→DMCX2,Y2(s) as
$F(\hat{W})=\widetilde{D_g\circ W^{\prime}\circ D_f}=\operatorname{Proj}_2(D_g\circ W^{\prime}\circ D_f)$, where W′∈W^, $\widetilde{D_g\circ W^{\prime}\circ D_f}$ is the RX2,Y2(s)-equivalence class of Dg∘W′∘Df, and Proj2 is the projection onto the RX2,Y2(s)-equivalence classes. We have:
• F is well defined, i.e., F(W^) does not depend on W′∈W^.
• F is a homeomorphism between DMCX1,Y1(s) and $F\big(\operatorname{DMC}_{\mathcal{X}_{1},\mathcal{Y}_{1}}^{(s)}\big)\subset\operatorname{DMC}_{\mathcal{X}_{2},\mathcal{Y}_{2}}^{(s)}$.
• F does not depend on the surjection f nor on the injection g. It depends only on X1, X2, Y1 and Y2, hence it is canonical.
• $F\big(\operatorname{DMC}_{\mathcal{X}_{1},\mathcal{Y}_{1}}^{(s)}\big)$ depends only on ∣X1∣, ∣Y1∣, X2 and Y2.
• For every W′∈W^ and every W′′∈F(W^), W′ is Shannon-equivalent to W′′.
Corollary 3.
If ∣X1∣=∣X2∣ and ∣Y1∣=∣Y2∣, there exists a canonical homeomorphism from DMCX1,Y1(s) to DMCX2,Y2(s) depending only on X1,Y1,X2 and Y2.
Proof.
Let f be a bijection from X2 to X1, and let g be a bijection from Y1 to Y2. Define the mapping F:DMCX1,Y1(s)→DMCX2,Y2(s) as
$F(\hat{W})=\widetilde{D_g\circ W^{\prime}\circ D_f}=\operatorname{Proj}_2(D_g\circ W^{\prime}\circ D_f)$, where W′∈W^ and Proj2:DMCX2,Y2→DMCX2,Y2(s) is the projection onto the RX2,Y2(s)-equivalence classes.
Also, define the mapping F′:DMCX2,Y2(s)→DMCX1,Y1(s) as
[TABLE]
where V′∈V~ and Proj1:DMCX1,Y1→DMCX1,Y1(s) is the projection onto the RX1,Y1(s)-equivalence classes.
Proposition 3 shows that F and F′ are well defined.
For every W∈DMCX1,Y1, we have:
[TABLE]
where (a) follows from the fact that W∈W^ and (b) follows from the fact that $D_g\circ W\circ D_f\in\widetilde{D_g\circ W\circ D_f}$.
We can similarly show that F(F′(V~))=V~ for every V~∈DMCX2,Y2(s). Therefore, both F and F′ are bijections. Proposition 3 now implies that F is a homeomorphism from DMCX1,Y1(s) to $F\big(\operatorname{DMC}_{\mathcal{X}_{1},\mathcal{Y}_{1}}^{(s)}\big)=\operatorname{DMC}_{\mathcal{X}_{2},\mathcal{Y}_{2}}^{(s)}$. Moreover, F depends only on X1,Y1,X2 and Y2.
∎
Corollary 3 allows us to identify DMCX,Y(s) with DMC[n],[m](s) through the canonical homeomorphism, where n=∣X∣, m=∣Y∣, [n]={1,…,n} and [m]={1,…,m}. Moreover, for every 1≤n≤n′ and 1≤m≤m′, Proposition 3 allows us to identify DMC[n],[m](s) with the canonical subspace of DMC[n′],[m′](s) that is homeomorphic to DMC[n],[m](s). In the rest of this paper, we consider that DMC[n],[m](s) is a compact subspace of DMC[n′],[m′](s).
Conjecture 1.
For every 1≤n<m, the interior of DMC[n],[n](s) in DMC[m],[m](s) is empty.
VI Space of Shannon-equivalent channels
The previous section showed that if we are interested in Shannon-equivalent channels, it is sufficient to study the spaces DMC[n],[m] and DMC[n],[m](s) for every n,m≥1. Define the space
$$\operatorname{DMC}_{\ast,\ast}=\coprod_{n\geq 1,\,m\geq 1}\operatorname{DMC}_{[n],[m]}.$$
The subscripts ∗ indicate that the input and output alphabets of the considered channels are arbitrary but finite. We define the equivalence relation R∗,∗(s) on DMC∗,∗ as follows:
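$$W\,R_{\ast,\ast}^{(s)}\,W^{\prime}\;\;\Leftrightarrow\;\;W\text{ is Shannon-equivalent to }W^{\prime}.$$
Definition 2.
The space of Shannon-equivalent channels is the quotient of the space of channels by the Shannon-equivalence relation:
$$\operatorname{DMC}_{\ast,\ast}^{(s)}=\operatorname{DMC}_{\ast,\ast}/R_{\ast,\ast}^{(s)}.$$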
Clearly, DMC[n],[m]/R∗,∗(s) can be canonically identified with DMC[n],[m]/R[n],[m](s)=DMC[n],[m](s) for every n,m≥1. Therefore, we can write
[TABLE]
Note that (a) follows from the fact that DMC[n],[m](s)⊂DMC[k],[k](s) (see Section V-B), where k=max{n,m}.
We define the Shannon-rank of W^∈DMC∗,∗(s) as:
[TABLE]
Clearly,
[TABLE]
A subset A of DMC∗,∗(s) is said to be rank-bounded if there exists n≥1 such that A⊂DMC[n],[n](s).
VI-A Natural topologies on DMC∗,∗(s)
Since DMC∗,∗(s) is the quotient of DMC∗,∗ and since DMC∗,∗ was not given any topology, there is no “standard topology” on DMC∗,∗(s). However, there are many properties that one may require from any “reasonable” topology on DMC∗,∗(s). In this paper, we focus on one particular requirement that we consider the most basic property required from any “acceptable” topology on DMC∗,∗(s):
Definition 3.
A topology T on DMC∗,∗(s) is said to be natural if it induces the quotient topology T[n],[m](s) on DMC[n],[m](s) for every n,m≥1.
The reason why we consider such topology as natural is because the quotient topology T[n],[m](s) is the “standard” and “most natural” topology on DMC[n],[m](s). Therefore, we do not want to induce any non-standard topology on DMC[n],[m](s) by relativization.
Proposition 4.
Every natural topology is σ-compact, separable and path-connected.
Proof.
Since DMC∗,∗(s) is the countable union of compact and separable subspaces (namely {DMC[n],[n](s)}n≥1), DMC∗,∗(s) is σ-compact and separable.
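On the other hand, since $\bigcap_{n\geq 1}\operatorname{DMC}_{[n],[n]}^{(s)}=\operatorname{DMC}_{[1],[1]}^{(s)}\neq\emptyset$ and since DMC[n],[n](s) is path-connected for every n≥1, the union $\operatorname{DMC}_{\ast,\ast}^{(s)}=\bigcup_{n\geq 1}\operatorname{DMC}_{[n],[n]}^{(s)}$ is path-connected.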
∎
Remark 1.
It is possible to show that if Conjecture 1 is true, then for every natural topology T on DMC∗,∗(s), we have:
• Every non-empty open set is rank-unbounded.
• For every n≥1, the interior of DMC[n],[n](s) in (DMC∗,∗(s),T) is empty.
• If T is Hausdorff, then
– (DMC∗,∗(s),T) is not a Baire space, hence no natural topology can be completely metrized.
– (DMC∗,∗(s),T) is not locally compact anywhere.
VII Strong topology on DMC∗,∗(s)
Since the spaces {DMC[n],[m]}n,m≥1 are disjoint and since there is no a priori way to (topologically) compare channels in DMC[n],[m] with channels in DMC[n′],[m′] for (n,m)≠(n′,m′), the “most natural” topology that we can define on DMC∗,∗ is the disjoint union topology $\mathcal{T}_{s,\ast,\ast}:=\bigoplus_{n,m\geq 1}\mathcal{T}_{[n],[m]}$. Clearly, the space (DMC∗,∗,Ts,∗,∗) is disconnected. Moreover, Ts,∗,∗ is metrizable because it is the disjoint union of metrizable spaces. It is also σ-compact because it is the union of countably many compact spaces.
We added the subscript s to emphasize the fact that Ts,∗,∗ is a strong topology (remember that the disjoint union topology is the finest topology that makes the canonical injections continuous).
Definition 4.
We define the strong topology Ts,∗,∗(s) on DMC∗,∗(s) as the quotient topology Ts,∗,∗/R∗,∗(s).
We refer to the open and closed sets of (DMC∗,∗(s),Ts,∗,∗(s)) as strongly open and strongly closed sets, respectively.
Let Proj:DMC∗,∗→DMC∗,∗(s) be the projection onto the R∗,∗(s)-equivalence classes, and for every n,m≥1 let Projn,m:DMC[n],[m]→DMC[n],[m](s) be the projection onto the R[n],[m](s)-equivalence classes. Due to the identifications that we made in Section VI, we have Proj(W)=Projn,m(W) for every W∈DMC[n],[m]. Therefore, for every U⊂DMC∗,∗(s), we have
[TABLE]
Hence,
[TABLE]
where (a) and (c) follow from the properties of the quotient topology, and (b) follows from the properties of the disjoint union topology.
We conclude that U⊂DMC∗,∗(s) is strongly open in DMC∗,∗(s) if and only if U∩DMC[n],[m](s) is open in DMC[n],[m](s) for every n,m≥1. This shows that the topology on DMC[n],[m](s) that is inherited from (DMC∗,∗(s),Ts,∗,∗(s)) is exactly T[n],[m](s). Therefore, Ts,∗,∗(s) is a natural topology. On the other hand, if T is an arbitrary natural topology and U∈T, then U∩DMC[n],[m](s) is open in DMC[n],[m](s) for every n,m≥1, so U∈Ts,∗,∗(s). We conclude that Ts,∗,∗(s) is the finest natural topology.
We can also characterize the strongly closed subsets of DMC∗,∗(s) in terms of the closed sets of the DMC[n],[m](s) spaces:
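$$\begin{aligned}
U \text{ is strongly closed} &\iff \mathrm{DMC}_{\ast,\ast}^{(s)}\setminus U \text{ is strongly open}\\
&\iff \big(\mathrm{DMC}_{\ast,\ast}^{(s)}\setminus U\big)\cap \mathrm{DMC}_{[n],[m]}^{(s)} \text{ is open in } \mathrm{DMC}_{[n],[m]}^{(s)},\quad \forall n,m\geq 1\\
&\iff \mathrm{DMC}_{[n],[m]}^{(s)}\setminus\big(U\cap \mathrm{DMC}_{[n],[m]}^{(s)}\big) \text{ is open in } \mathrm{DMC}_{[n],[m]}^{(s)},\quad \forall n,m\geq 1\\
&\iff U\cap \mathrm{DMC}_{[n],[m]}^{(s)} \text{ is closed in } \mathrm{DMC}_{[n],[m]}^{(s)},\quad \forall n,m\geq 1.
\end{aligned}$$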
Lemma 6.
For every subset U of DMC∗,∗(s), we have:
• U is strongly open if and only if U∩DMC[n],[n](s) is open in DMC[n],[n](s) for every n≥1.
• U is strongly closed if and only if U∩DMC[n],[n](s) is closed in DMC[n],[n](s) for every n≥1.
Proof.
If U is strongly open then U∩DMC[n],[m](s) is open in DMC[n],[m](s) for every n,m≥1. This implies that U∩DMC[n],[n](s) is open in DMC[n],[n](s) for every n≥1.
Conversely, assume that U∩DMC[n],[n](s) is open in DMC[n],[n](s) for every n≥1. Fix n,m≥1 and let k=max{n,m}. We have DMC[n],[m](s)⊂DMC[k],[k](s). Since U∩DMC[k],[k](s) is open in DMC[k],[k](s), the set U∩DMC[n],[m](s)=(U∩DMC[k],[k](s))∩DMC[n],[m](s) is open in DMC[n],[m](s). Therefore, U∩DMC[n],[m](s) is open in DMC[n],[m](s) for every n,m≥1, which implies that U is strongly open.
We can similarly show that U is strongly closed if and only if U∩DMC[n],[n](s) is closed in DMC[n],[n](s) for every n≥1.
∎
Since DMC[n],[n](s) is metrizable for every n≥1, it is also normal. We can use this fact to prove that the strong topology on DMC∗,∗(s) is normal:
Lemma 7.
(DMC∗,∗(s),Ts,∗,∗(s)) is normal.
The following theorem shows that the strong topology satisfies many desirable properties.
Theorem 4.
(DMC∗,∗(s),Ts,∗,∗(s)) is a compactly generated, sequential and T4 space.
Proof.
Since (DMC∗,∗,Ts,∗,∗) is metrizable, it is sequential. Therefore, (DMC∗,∗(s),Ts,∗,∗(s)), which is the quotient of a sequential space, is sequential.
Let us now show that DMC∗,∗(s) is T4. Fix Ŵ∈DMC∗,∗(s). For every n≥1, the set {Ŵ}∩DMC[n],[n](s) is either {Ŵ} or ∅ depending on whether Ŵ∈DMC[n],[n](s) or not. Since DMC[n],[n](s) is metrizable, it is T1 and so singletons are closed in DMC[n],[n](s). We conclude that in all cases, {Ŵ}∩DMC[n],[n](s) is closed in DMC[n],[n](s) for every n≥1. Therefore, {Ŵ} is strongly closed in DMC∗,∗(s). This shows that (DMC∗,∗(s),Ts,∗,∗(s)) is T1. On the other hand, Lemma 7 shows that (DMC∗,∗(s),Ts,∗,∗(s)) is normal. This means that (DMC∗,∗(s),Ts,∗,∗(s)) is T4, which implies that it is Hausdorff.
Now since (DMC∗,∗,Ts,∗,∗) is metrizable, it is compactly generated. On the other hand, the quotient space (DMC∗,∗(s),Ts,∗,∗(s)) was shown to be Hausdorff. Since a Hausdorff quotient of a compactly generated space is compactly generated, we conclude that (DMC∗,∗(s),Ts,∗,∗(s)) is compactly generated.
∎
Remark 2.
It is possible to show that if Conjecture 1 is true, then we have:
• Ts,∗,∗(s) is not first-countable anywhere.
• A subset of DMC∗,∗(s) is compact in Ts,∗,∗(s) if and only if it is rank-bounded and strongly closed.
VIII The BRM metric on the space of Shannon-equivalent channels
We define the BRM metric on DMC∗,∗(s) as follows:
[TABLE]
Let T∗,∗(s) be the metric topology on DMC∗,∗(s) that is induced by d∗,∗(s). We call T∗,∗(s) the BRM topology on DMC∗,∗(s).
Clearly, T∗,∗(s) is natural because the restriction of d∗,∗(s) to DMC[n],[m](s) is exactly d[n],[m](s), and the topology induced by d[n],[m](s) is T[n],[m](s) (Theorem 3).
IX Continuity of channel parameters and operations in the strong topology
IX-A Channel parameters
For every W∈DMC∗,∗, C(W) depends only on the Shannon-equivalence class of W [1]. Therefore, for every Ŵ∈DMC∗,∗(s), we can define C(Ŵ):=C(W′) for any W′∈Ŵ. We can define Pe,n,M(Ŵ) similarly.
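The invariance of C under Shannon-equivalence can be checked numerically on a small example. The following is a minimal sketch, not part of the original development: it estimates the capacity with the Blahut-Arimoto iteration (a standard algorithm, used here as an assumption about how one would compute C in practice) and compares a binary symmetric channel with the channel obtained by duplicating one of its inputs. The two channels contain each other, since each is obtained from the other by a deterministic encoder. The function names `mutual_information_bits` and `capacity_bits` are hypothetical.

```python
import math

def mutual_information_bits(p, W):
    """I(p; W) in bits for an input distribution p and a row-stochastic matrix W[x][y]."""
    n, m = len(W), len(W[0])
    # Output distribution induced by p.
    q = [sum(p[x] * W[x][y] for x in range(n)) for y in range(m)]
    return sum(
        p[x] * W[x][y] * math.log2(W[x][y] / q[y])
        for x in range(n) for y in range(m)
        if p[x] > 0 and W[x][y] > 0
    )

def capacity_bits(W, iters=500):
    """Blahut-Arimoto estimate of max_p I(p; W), in bits."""
    n, m = len(W), len(W[0])
    p = [1.0 / n] * n  # start from the uniform input distribution
    for _ in range(iters):
        q = [sum(p[x] * W[x][y] for x in range(n)) for y in range(m)]
        # d[x] is the divergence D(W(.|x) || q) in bits.
        d = [sum(W[x][y] * math.log2(W[x][y] / q[y]) for y in range(m) if W[x][y] > 0)
             for x in range(n)]
        weights = [p[x] * 2.0 ** d[x] for x in range(n)]
        total = sum(weights)
        p = [w / total for w in weights]
    return mutual_information_bits(p, W)

# A binary symmetric channel and a copy with a duplicated first input row.
bsc = [[0.9, 0.1], [0.1, 0.9]]
bsc_dup = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]]
```

Calling `capacity_bits(bsc)` and `capacity_bits(bsc_dup)` should return the same value up to numerical precision, which is consistent with C being well defined on equivalence classes.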
Proposition 5.
Let X and Y be two finite sets. We have:
• C:DMCX,Y(s)→R+ is continuous on (DMCX,Y(s),TX,Y(s)).
• For every n≥1 and every M≥1, the mapping Pe,n,M:DMCX,Y(s)→[0,1] is continuous on (DMCX,Y(s),TX,Y(s)).
Proof.
Since C:DMCX,Y→R+ is continuous, and since C(W) depends only on the RX,Y(s)-equivalence class of W, Lemma 1 implies that C:DMCX,Y(s)→R+ is continuous on (DMCX,Y(s),TX,Y(s)). We can show the continuity of Pe,n,M on (DMCX,Y(s),TX,Y(s)) similarly.
∎
The following lemma provides a way to check whether a mapping defined on (DMC∗,∗(s),Ts,∗,∗(s)) is continuous:
Lemma 8.
Let (S,V) be an arbitrary topological space. A mapping f:DMC∗,∗(s)→S is continuous on (DMC∗,∗(s),Ts,∗,∗(s)) if and only if it is continuous on (DMC[n],[n](s),T[n],[n](s)) for every n≥1.
Proof.
[TABLE]
∎
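The equivalence in Lemma 8 can be unpacked as a chain that uses only the definition of continuity and Lemma 6. The following is a sketch of one possible argument, where $f_n$ denotes the restriction of $f$ to $\mathrm{DMC}_{[n],[n]}^{(s)}$ (a notation introduced here only for illustration):

```latex
\begin{align*}
f \text{ is continuous on } \big(\mathrm{DMC}_{\ast,\ast}^{(s)},\mathcal{T}_{s,\ast,\ast}^{(s)}\big)
&\iff f^{-1}(V)\in\mathcal{T}_{s,\ast,\ast}^{(s)}, \quad \forall V\in\mathcal{V}\\
&\iff f^{-1}(V)\cap \mathrm{DMC}_{[n],[n]}^{(s)}\in\mathcal{T}_{[n],[n]}^{(s)}, \quad \forall n\geq 1,\ \forall V\in\mathcal{V} \quad \text{(Lemma 6)}\\
&\iff f_n^{-1}(V)\in\mathcal{T}_{[n],[n]}^{(s)}, \quad \forall n\geq 1,\ \forall V\in\mathcal{V}\\
&\iff f_n \text{ is continuous on } \big(\mathrm{DMC}_{[n],[n]}^{(s)},\mathcal{T}_{[n],[n]}^{(s)}\big), \quad \forall n\geq 1.
\end{align*}
```

The third step only uses the identity $f_n^{-1}(V)=f^{-1}(V)\cap \mathrm{DMC}_{[n],[n]}^{(s)}$, which holds for any restriction.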
Proposition 6.
We have:
• C:DMC∗,∗(s)→R+ is continuous on (DMC∗,∗(s),Ts,∗,∗(s)).
• For every n≥1 and every M≥1, the mapping Pe,n,M:DMC∗,∗(s)→[0,1] is continuous on (DMC∗,∗(s),Ts,∗,∗(s)).
Proof.
The proposition follows from Proposition 5 and Lemma 8.
∎
IX-B Channel operations
Channel sums and products can be “quotiented” by the Shannon-equivalence relation. We just need to realize that the Shannon-equivalence class of the resulting channel depends only on the Shannon-equivalence classes of the channels that were used in the operation [1].
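For concreteness, the two operations can be written down on transition matrices. The following is a minimal sketch under the usual conventions, which are assumed here rather than taken from this section: the sum W1⊕W2 acts as W1 on inputs from X1 and as W2 on inputs from X2 (a block-diagonal matrix), and the product W1⊗W2 uses both channels in parallel (a Kronecker product). The function names `channel_sum` and `channel_product` are hypothetical.

```python
def channel_sum(W1, W2):
    """Block-diagonal matrix: inputs X1 ⊔ X2, outputs Y1 ⊔ Y2."""
    m1, m2 = len(W1[0]), len(W2[0])
    top = [list(row) + [0.0] * m2 for row in W1]      # inputs from X1 only reach Y1
    bottom = [[0.0] * m1 + list(row) for row in W2]   # inputs from X2 only reach Y2
    return top + bottom

def channel_product(W1, W2):
    """Kronecker product: rows indexed by (x1, x2), columns by (y1, y2)."""
    return [[a * b for a in row1 for b in row2] for row1 in W1 for row2 in W2]

# A binary symmetric channel and a binary erasure channel as test inputs.
bsc = [[0.9, 0.1], [0.1, 0.9]]
bec = [[0.8, 0.2, 0.0], [0.0, 0.2, 0.8]]

sum_channel = channel_sum(bsc, bec)          # 4 inputs, 5 outputs
product_channel = channel_product(bsc, bec)  # 4 inputs, 6 outputs
```

Both results are again row-stochastic, so they are valid channels on the disjoint-union and Cartesian-product alphabets respectively.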
Proposition 7.
We have:
• The mapping (Ŵ1,Ŵ2)→Ŵ1⊕Ŵ2 from DMCX1,Y1(s)×DMCX2,Y2(s) to DMCX1∐X2,Y1∐Y2(s) is continuous.
• The mapping (Ŵ1,Ŵ2)→Ŵ1⊗Ŵ2 from DMCX1,Y1(s)×DMCX2,Y2(s) to DMCX1×X2,Y1×Y2(s) is continuous.
Proof.
We only prove the continuity of the channel sum because the proof for the channel product is similar.
Let Proj:DMCX1∐X2,Y1∐Y2→DMCX1∐X2,Y1∐Y2(s) be the projection onto the RX1∐X2,Y1∐Y2(s)-equivalence classes. Define the mapping f:DMCX1,Y1×DMCX2,Y2→DMCX1∐X2,Y1∐Y2(s) as f(W1,W2)=Proj(W1⊕W2). Clearly, f is continuous.
Now define the equivalence relation R on DMCX1,Y1×DMCX2,Y2 as:
(W1,W2)R(W1′,W2′) if and only if W1RX1,Y1(s)W1′ and W2RX2,Y2(s)W2′.
The discussion before the proposition shows that f(W1,W2)=Proj(W1⊕W2) depends only on the R-equivalence class of (W1,W2). Lemma 1 now shows that the transcendent map of f defined on (DMCX1,Y1×DMCX2,Y2)/R is continuous.
Notice that (DMCX1,Y1×DMCX2,Y2)/R can be identified with DMCX1,Y1(s)×DMCX2,Y2(s). Therefore, we can define f on DMCX1,Y1(s)×DMCX2,Y2(s) through this identification. Moreover, since DMCX1,Y1 and DMCX2,Y2(s) are locally compact and Hausdorff, Corollary 1 implies that the canonical bijection between (DMCX1,Y1×DMCX2,Y2)/R and DMCX1,Y1(s)×DMCX2,Y2(s) is a homeomorphism.
Now since the mapping f on DMCX1,Y1(s)×DMCX2,Y2(s) is just the channel sum, we conclude that the mapping (Ŵ1,Ŵ2)→Ŵ1⊕Ŵ2 from DMCX1,Y1(s)×DMCX2,Y2(s) to DMCX1∐X2,Y1∐Y2(s) is continuous.
∎
Proposition 8.
Assume that the space DMC∗,∗(s) is endowed with the strong topology. We have:
• The mapping (Ŵ1,Ŵ2)→Ŵ1⊕Ŵ2 from DMC∗,∗(s)×DMCX2,Y2(s) to DMC∗,∗(s) is continuous.
• The mapping (Ŵ1,Ŵ2)→Ŵ1⊗Ŵ2 from DMC∗,∗(s)×DMCX2,Y2(s) to DMC∗,∗(s) is continuous.
Proof.
We only prove the continuity of the channel sum because the proof of the continuity of the channel product is similar.
Due to the distributivity of the product with respect to disjoint unions, we have:
[TABLE]
and
[TABLE]
Therefore, the space DMC∗,∗×DMCX2,Y2 is the topological disjoint union of the spaces (DMC[n],[m]×DMCX2,Y2)n,m≥1.
For every n,m≥1, let Projn,m be the projection onto the R[n]∐X2,[m]∐Y2(s)-equivalence classes and let in,m be the canonical injection from DMC[n]∐X2,[m]∐Y2(s) to DMC∗,∗(s).
Define the mapping f:DMC∗,∗×DMCX2,Y2→DMC∗,∗(s) as
[TABLE]
where n and m are the unique integers satisfying W1∈DMC[n],[m], and Ŵ1 and Ŵ2 are the R[n],[m](s) and RX2,Y2(s)-equivalence classes of W1 and W2 respectively.
Clearly, the mapping f is continuous on DMC[n],[m]×DMCX2,Y2 for every n,m≥1. Therefore, f is continuous on (DMC∗,∗×DMCX2,Y2,Ts,∗,∗⊗TX2,Y2).
Let R be the equivalence relation defined on DMC∗,∗×DMCX2,Y2 as follows: (W1,W2)R(W1′,W2′) if and only if W1R∗,∗(s)W1′ and W2RX2,Y2(s)W2′.
Since f(W1,W2) depends only on the R-equivalence class of (W1,W2), Lemma 1 implies that the transcendent mapping of f is continuous on (DMC∗,∗×DMCX2,Y2)/R.
Since (DMC∗,∗,Ts,∗,∗) and DMCX2,Y2(s)=DMCX2,Y2/RX2,Y2(s) are Hausdorff and locally compact, Corollary 1 implies that the canonical bijection from DMC∗,∗(s)×DMCX2,Y2(s) to (DMC∗,∗×DMCX2,Y2)/R is a homeomorphism. We conclude that the channel sum is continuous on (DMC∗,∗(s)×DMCX2,Y2(s),Ts,∗,∗(s)⊗TX2,Y2(s)).
∎
The reader might be wondering why the channel sum and the channel product were not shown to be continuous on the whole space DMC∗,∗(s)×DMC∗,∗(s) instead of the smaller space DMC∗,∗(s)×DMCX2,Y2(s). The reason is that we cannot apply Corollary 1 to DMC∗,∗×DMC∗,∗ and DMC∗,∗(s)×DMC∗,∗(s), since we do not know whether (DMC∗,∗(s),Ts,∗,∗(s)) is locally compact or not. Moreover, as we stated in Remark 1, if Conjecture 1 is true then (DMC∗,∗(s),Ts,∗,∗(s)) is not locally compact.
As in the case of the space of equivalent channels [8], one potential method to show the continuity of the channel sum on (DMC∗,∗(s)×DMC∗,∗(s),Ts,∗,∗(s)⊗Ts,∗,∗(s)) is as follows: let R be the equivalence relation on DMC∗,∗×DMC∗,∗ defined as (W1,W2)R(W1′,W2′) if and only if W1R∗,∗(s)W1′ and W2R∗,∗(s)W2′. We can identify (DMC∗,∗×DMC∗,∗)/R with DMC∗,∗(s)×DMC∗,∗(s) through the canonical bijection. Using Lemma 1, it is easy to see that the mapping (Ŵ1,Ŵ2)→Ŵ1⊕Ŵ2 is continuous from (DMC∗,∗(s)×DMC∗,∗(s),(Ts,∗,∗⊗Ts,∗,∗)/R) to (DMC∗,∗(s),Ts,∗,∗(s)).
It was shown in [12] that the topology (Ts,∗,∗⊗Ts,∗,∗)/R coincides with κ(Ts,∗,∗(s)⊗Ts,∗,∗(s)) through the canonical bijection, where κ(Ts,∗,∗(s)⊗Ts,∗,∗(s)) is the coarsest topology that is both compactly generated and finer than Ts,∗,∗(s)⊗Ts,∗,∗(s). Therefore, the mapping (Ŵ1,Ŵ2)→Ŵ1⊕Ŵ2 is continuous on (DMC∗,∗(s)×DMC∗,∗(s),κ(Ts,∗,∗(s)⊗Ts,∗,∗(s))). This means that if Ts,∗,∗(s)⊗Ts,∗,∗(s) is compactly generated, we will have Ts,∗,∗(s)⊗Ts,∗,∗(s)=κ(Ts,∗,∗(s)⊗Ts,∗,∗(s)) and so the channel sum will be continuous on (DMC∗,∗(s)×DMC∗,∗(s),Ts,∗,∗(s)⊗Ts,∗,∗(s)). Note that although Ts,∗,∗(s) is compactly generated, the product Ts,∗,∗(s)⊗Ts,∗,∗(s) of this topology with itself might not be compactly generated.
X Discussion and open problems
The following continuity-related problems remain open:
•
The continuity of the channel parameters C and Pe,n,M in the BRM topology T∗,∗(s).
•
The continuity of the channel sum and the channel product on the whole product space (DMC∗,∗(s)×DMC∗,∗(s),Ts,∗,∗(s)⊗Ts,∗,∗(s)). As we explained in Section IX-B, it is sufficient to prove that the product topology Ts,∗,∗(s)⊗Ts,∗,∗(s) is compactly generated.
•
The continuity of the channel sum and the channel product in the BRM topology.
Acknowledgment
I would like to thank Emre Telatar for helpful discussions. I am also grateful to Maxim Raginsky for informing me about the work of Blackwell on statistical experiments.
Fix n,m≥1 and let l∈Δ[n]×[m]. Define G1=([n],X,Y,[m],l,W1) and G2=([n],X,Y,[m],l,W2). For every S∈S[n],X,Y,[m], we have:
[TABLE]
where (a) follows from the fact that l(u,gi,S(y))≤1 (because l∈Δ[n]×[m]). Therefore,
[TABLE]
hence
[TABLE]
We can show similarly that $opt(G2)−$opt(G1)≤dX,Y(W1,W2). Therefore, |$opt(G1)−$opt(G2)|≤dX,Y(W1,W2).
Corollary 2 implies that Proj2(Dg∘W∘Df)=Proj2(Dg∘W′∘Df) if and only if WRX1,Y1(s)W′. Therefore, Proj2(Dg∘W′∘Df) does not depend on W′∈Ŵ, hence F is well defined. Corollary 2 also shows that Proj2(Dg∘W′∘Df) does not depend on the particular choice of the surjection f or the injection g, hence it is canonical (i.e., it depends only on X1,X2,Y1 and Y2).
On the other hand, the mapping W→Dg∘W∘Df is a continuous mapping from DMCX1,Y1 to DMCX2,Y2, and Proj2 is continuous. Therefore, the mapping W→Proj2(Dg∘W∘Df) is a continuous mapping from DMCX1,Y1 to DMCX2,Y2(s). Now since Proj2(Dg∘W∘Df) depends only on the RX1,Y1(s)-equivalence class Ŵ of W, Lemma 1 implies that the transcendent mapping of W→Proj2(Dg∘W∘Df) that is defined on DMCX1,Y1(s) is continuous. Therefore, F is a continuous mapping from (DMCX1,Y1(s),TX1,Y1(s)) to (DMCX2,Y2(s),TX2,Y2(s)). Moreover, we can see from Corollary 2 that F is an injection.
For every closed subset B of DMCX1,Y1(s), B is compact since DMCX1,Y1(s) is compact, hence F(B) is compact because F is continuous. This implies that F(B) is closed in DMCX2,Y2(s) since DMCX2,Y2(s) is Hausdorff (as it is metrizable). Therefore, F is a closed mapping.
Now since F is an injection that is both continuous and closed, F is a homeomorphism between DMCX1,Y1(s) and F(DMCX1,Y1(s))⊂DMCX2,Y2(s).
We would now like to show that F(DMCX1,Y1(s)) depends only on ∣X1∣, ∣Y1∣, X2 and Y2. Let X1′ and Y1′ be two finite sets such that ∣X1∣=∣X1′∣ and ∣Y1∣=∣Y1′∣. For every W∈DMCX1′,Y1′, let W̄∈DMCX1′,Y1′(s) be the RX1′,Y1′(s)-equivalence class of W.
Let f′:X1→X1′ be a fixed bijection from X1 to X1′ and let f′′=f′∘f. Also, let g′:Y1′→Y1 be a fixed bijection from Y1′ to Y1 and let g′′=g∘g′. Define F′:DMCX1′,Y1′(s)→DMCX2,Y2(s) as F′(W̄)=Proj2(Dg′′∘W′∘Df′′), where W′∈W̄. As above, F′ is well defined, and it is a homeomorphism from DMCX1′,Y1′(s) to F′(DMCX1′,Y1′(s)). We want to show that F′(DMCX1′,Y1′(s))=F(DMCX1,Y1(s)). For every W̄∈DMCX1′,Y1′(s), let W′∈W̄. We have
[TABLE]
Since this is true for every equivalence class in DMCX1′,Y1′(s), we deduce that F′(DMCX1′,Y1′(s))⊂F(DMCX1,Y1(s)). By exchanging the roles of (X1,Y1) and (X1′,Y1′) and using the fact that f=f′−1∘f′′ and g=g′′∘g′−1, we get F(DMCX1,Y1(s))⊂F′(DMCX1′,Y1′(s)). We conclude that F(DMCX1,Y1(s))=F′(DMCX1′,Y1′(s)), which means that F(DMCX1,Y1(s)) depends only on ∣X1∣, ∣Y1∣, X2 and Y2.
Finally, for every W′∈Ŵ and every W′′∈F(Ŵ)=Proj2(Dg∘W′∘Df), W′′ is Shannon-equivalent to Dg∘W′∘Df and Dg∘W′∘Df is Shannon-equivalent to W′ (by Lemma 5), hence W′′ is Shannon-equivalent to W′.
Define DMC[0],[0](s)=∅, which is strongly closed in DMC∗,∗(s).
Let A and B be two disjoint strongly closed subsets of DMC∗,∗(s). For every n≥0, let An=A∩DMC[n],[n](s) and Bn=B∩DMC[n],[n](s). Since A and B are strongly closed in DMC∗,∗(s), An and Bn are closed in DMC[n],[n](s). Moreover, An∩Bn⊂A∩B=∅.
Construct the sequences (Un)n≥0,(Un′)n≥0,(Kn)n≥0 and (Kn′)n≥0 recursively as follows:
U0=U0′=K0=K0′=∅⊂DMC[0],[0](s). Since A0=B0=∅, we have A0⊂U0⊂K0 and B0⊂U0′⊂K0′. Moreover, U0 and U0′ are open in DMC[0],[0](s), K0 and K0′ are closed in DMC[0],[0](s), and K0∩K0′=∅.
Now let n≥1 and assume that we constructed (Uj)0≤j<n,(Uj′)0≤j<n,(Kj)0≤j<n and (Kj′)0≤j<n such that for every 0≤j<n, we have Aj⊂Uj⊂Kj⊂DMC[j],[j](s), Bj⊂Uj′⊂Kj′⊂DMC[j],[j](s), Uj and Uj′ are open in DMC[j],[j](s), Kj and Kj′ are closed in DMC[j],[j](s), and Kj∩Kj′=∅. Moreover, assume that Kj⊂Uj+1 and Kj′⊂Uj+1′ for every 0≤j<n−1.
Let Cn=An∪Kn−1 and Dn=Bn∪Kn−1′. Since Kn−1 and Kn−1′ are closed in DMC[n−1],[n−1](s) and since DMC[n−1],[n−1](s) is closed in DMC[n],[n](s), we can see that Kn−1 and Kn−1′ are closed in DMC[n],[n](s). Therefore, Cn and Dn are closed in DMC[n],[n](s). Moreover, we have
[TABLE]
where (a) follows from the fact that An∩Bn=Kn−1∩Kn−1′=∅ and the fact that Kn−1⊂DMC[n−1],[n−1](s) and Kn−1′⊂DMC[n−1],[n−1](s).
Since DMC[n],[n](s) is normal (because it is metrizable), and since Cn and Dn are closed disjoint subsets of DMC[n],[n](s), there exist two sets Un,Un′⊂DMC[n],[n](s) that are open in DMC[n],[n](s) and two sets Kn,Kn′⊂DMC[n],[n](s) that are closed in DMC[n],[n](s) such that Cn⊂Un⊂Kn, Dn⊂Un′⊂Kn′ and Kn∩Kn′=∅. Clearly, An⊂Un⊂Kn⊂DMC[n],[n](s), Bn⊂Un′⊂Kn′⊂DMC[n],[n](s), Kn−1⊂Un and Kn−1′⊂Un′. This concludes the recursive construction.
Now define U=⋃n≥0 Un=⋃n≥1 Un and U′=⋃n≥0 Un′=⋃n≥1 Un′. Since An⊂Un for every n≥1, we have
[TABLE]
Moreover, for every n≥1 we have
[TABLE]
where (a) follows from the fact that Uj⊂Kj⊂Uj+1 for every j≥0, which means that the sequence (Uj)j≥1 is increasing.
For every j≥n, we have DMC[n],[n](s)⊂DMC[j],[j](s) and Uj is open in DMC[j],[j](s), hence Uj∩DMC[n],[n](s) is open in DMC[n],[n](s). Therefore, U∩DMC[n],[n](s)=⋃j≥n (Uj∩DMC[n],[n](s)) is open in DMC[n],[n](s). Since this is true for every n≥1, we conclude that U is strongly open in DMC∗,∗(s).
We can show similarly that B⊂U′ and that U′ is strongly open in DMC∗,∗(s). Finally, we have
[TABLE]
where (a) follows from the fact that for every n≥1 and every n′≥1, we have
[TABLE]
because (Un)n≥1 and (Un′)n≥1 are increasing. We conclude that (DMC∗,∗(s),Ts,∗,∗(s)) is normal.
References
[1] C. Shannon, “A note on a partial ordering for communication channels,” Inform. Contr., vol. 1, pp. 390–397, 1958.
[2] D. Blackwell, “Comparison of experiments,” in Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, 1951, pp. 93–102.
[3] S. Sherman, “On a theorem of Hardy, Littlewood, Polya, and Blackwell,” Proceedings of the National Academy of Sciences of the United States of America, vol. 37, no. 12, pp. 826–831, 1951.
[4] C. Stein, “Notes on a seminar on theoretical statistics. I. Comparison of experiments,” Report, University of Chicago, 1951.
[5] R. Nasser, “On the input-degradedness and input-equivalence between channels,” Tech. Rep., 2017. [Online]. Available: http://infoscience.epfl.ch/record/225283
[6] M. Raginsky, “Shannon meets Blackwell and Le Cam: Channels, codes, and statistical experiments,” in 2011 IEEE International Symposium on Information Theory Proceedings, July 2011, pp. 1220–1224.
[7] R. Nasser, “Topological structures on DMC spaces,” arXiv:1701.04467, Jan 2017.
[8] R. Nasser, “Continuity of channel parameters and operations under various DMC topologies,” arXiv:1701.04466, Jan 2017.