Polar Codes for Arbitrary Classical-Quantum Channels and Arbitrary cq-MACs
Rajai Nasser, Joseph M. Renes

TL;DR
This paper extends polarization theory to arbitrary classical-quantum channels and cq-MACs, enabling efficient polar code construction with provably fast error decay.
Contribution
It proves polarization for arbitrary cq-channels using Abelian groups and constructs polar codes for these channels and cq-MACs with efficient encoding and decoding.
Findings
Polarization to deterministic homomorphism channels is achieved.
Encoder complexity is O(N log N).
Error probability decays faster than 2^{-N^{β}} for any β<1/2.
Abstract
We prove polarization theorems for arbitrary classical-quantum (cq) channels. The input alphabet is endowed with an arbitrary Abelian group operation and an Arıkan-style transformation is applied using this operation. It is shown that as the number of polarization steps becomes large, the synthetic cq-channels polarize to deterministic homomorphism channels which project their input to a quotient group of the input alphabet. This result is used to construct polar codes for arbitrary cq-channels and arbitrary classical-quantum multiple access channels (cq-MAC). The encoder can be implemented in $O(N\log N)$ operations, where $N$ is the blocklength of the code. A quantum successive cancellation decoder for the constructed codes is proposed. It is shown that the probability of error of this decoder decays faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$.
I Introduction
Polar coding is the first efficient coding technique that was shown to achieve the capacity of symmetric binary-input channels [1]. The code construction relies on a phenomenon called polarization: starting from a collection of independent copies of a given binary-input channel, one can recursively apply a polarization transformation on those channels and obtain synthetic channels that become extreme (i.e., either almost useless or almost perfect channels) as the number of polarization steps becomes large. This suggests sending information through the channels which are almost perfect, while sending frozen symbols through the almost useless channels. Since the total capacity is conserved by the applied transformations, we can reliably communicate using this method at a rate that is close to the capacity. Arıkan proposed a successive cancellation decoder for the constructed polar code and showed that both the encoder and the decoder can be implemented in $O(N\log N)$ operations, where $N$ is the blocklength. The probability of error of the successive cancellation decoder was shown to decay faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$ [2].
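The polarization phenomenon described above can be observed numerically in the simplest classical setting. The following sketch is an illustration only and is not part of the construction of this paper: it assumes a classical binary erasure channel (BEC) with erasure probability `eps`, for which one step of Arıkan's transformation is known to produce two BECs with erasure probabilities $2\epsilon-\epsilon^{2}$ and $\epsilon^{2}$. The names `polarize_bec` and `summarize` and the threshold value are hypothetical choices.

```python
def polarize_bec(eps, steps):
    """Erasure probabilities of the 2**steps synthetic channels of a BEC(eps)."""
    probs = [eps]
    for _ in range(steps):
        # Each channel splits into a worse one (2e - e^2) and a better one (e^2).
        probs = [z for e in probs for z in (2 * e - e * e, e * e)]
    return probs


def summarize(eps, steps, threshold=1e-3):
    """Fractions of almost perfect / almost useless channels, and the mean erasure."""
    probs = polarize_bec(eps, steps)
    n = len(probs)
    good = sum(p < threshold for p in probs) / n      # almost perfect channels
    bad = sum(p > 1 - threshold for p in probs) / n   # almost useless channels
    mean = sum(probs) / n                             # conserved by the transformation
    return good, bad, mean


if __name__ == "__main__":
    for steps in (4, 8, 12, 16):
        good, bad, mean = summarize(0.4, steps)
        print(f"n={steps:2d}  good={good:.3f}  bad={bad:.3f}  mean erasure={mean:.6f}")
```

The mean erasure probability stays equal to `eps` at every step, which is the BEC version of the conservation of total capacity, while the fraction of channels that are neither almost perfect nor almost useless shrinks as the number of steps grows.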
Since Arıkan’s polarization transformation for binary-input channels uses the XOR operation, the straightforward generalization of Arıkan’s construction to arbitrary discrete memoryless channels is to replace the XOR operation with a binary operation on the input alphabet. It was shown that polarization happens for a wide family of binary operations: addition modulo $q$ (where $q$ is prime) [3], addition modulo $2^{r}$ [4], arbitrary Abelian group operations [5] and arbitrary quasigroup operations [6]. This allowed the construction of polar codes for arbitrary discrete memoryless channels since any set can be endowed with an Abelian group operation. Note that in the case where the input alphabet size is not prime, the polarization may not be a two-level polarization to useless and perfect channels as in the binary-input case. We may have multilevel polarization where it is possible for the synthetic channels to converge to intermediate channels which are neither almost useless nor almost perfect. However, the polarized intermediate channels are “easy” in the sense that it is easy to reliably communicate information through them at a rate that is near their symmetric capacity. A complete characterization of binary operations which are polarizing was given in [7] and [8].
Polarization was also shown to happen in the multiple access setting. Polar codes were constructed for two-user MACs with inputs in $\mathbb{F}_{q}$ in [9], for $m$-user binary-input MACs in [10], and for arbitrary MACs in [6, 8].
Wilde and Guha constructed polar codes for binary-input classical-quantum channels in [11]. They showed that using the same polarization transformation of Arıkan yields polarization of the synthetic cq-channels to almost useless and almost perfect channels. Wilde and Guha proposed a quantum successive cancellation decoder and showed that its probability of error decays faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$. In [12], Hirche et al. constructed polar codes for binary-input cq-MACs by combining the polarization results of [11] with the monotone chain rule method of [13].
In this paper, we construct polar codes for arbitrary cq-channels and arbitrary cq-MACs by using arbitrary Abelian group operations on the input alphabets. The polarization transformation that we use is similar to the one in [5]. Since we are proving a quantum version of the results in [3] and [5], many ideas of those two papers were adopted and adapted to the quantum setting. However, some inequalities that were used in [3] and [5] do not have quantum analogues. Therefore, other inequalities that serve the same purpose have to be shown for cq-channels.
In Section II we give useful definitions and basic results that we will use later. The polarization transformation is described in Section III. Two-level polarization is shown in Section IV for cq-channels having input in $\mathbb{F}_{q}$. In Section V, we prove multilevel polarization for arbitrary cq-channels using an arbitrary Abelian group operation on the input alphabet. We show that the synthetic cq-channels converge to deterministic homomorphism channels which project their input onto a quotient group of the input alphabet. The rate of polarization is discussed in Section VI. Polar codes are constructed and studied in Section VII. As in all polar coding schemes, the encoder can be implemented in $O(N\log N)$ operations, where $N$ is the blocklength of the polar code. We prove that the probability of error of the quantum successive cancellation decoder decays faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$, but we do not have an efficient implementation of the decoder. Finally, we discuss polarization of arbitrary cq-MACs in Section VIII. We show that while cq-MAC polar codes may not achieve the whole symmetric capacity region, they always achieve points on the dominant face. We show that the whole symmetric capacity region can be achieved by combining our cq-channel polarization result with the rate-splitting method of [9] or with the monotone chain rule method of [13].
II Preliminaries
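A classical-quantum (cq) channel $W:x\in G\longrightarrow\rho_{x}\in\mathcal{DM}(k)$ takes a classical input $x\in G$ and has a quantum output $\rho_{x}\in\mathcal{DM}(k)$, where $\mathcal{DM}(k)$ is the space of density matrices of dimension $k<\infty$. We assume that the input alphabet $G$ is finite but its size $q=|G|$ can be arbitrary.
If the input to the cq-channel $W$ is uniformly distributed, we can describe the state of the joint input-output system as the state $\rho^{XB}$ defined as:
$$\rho^{XB}=\frac{1}{q}\sum_{x\in G}|x\rangle\langle x|\otimes\rho_{x}.$$
A very important quantity associated with $W$ is the symmetric Holevo information defined as:
$$I(W)=I(X;B)_{\rho}=H(\rho^{X})+H(\rho^{B})-H(\rho^{XB}),$$
where $\rho^{X}$ and $\rho^{B}$ are the marginals of $\rho^{XB}$, and $H(\rho)$ is the von Neumann entropy of the density matrix $\rho$:
$$H(\rho)=-\mathrm{Tr}(\rho\log\rho),$$
and $\log$ is the natural logarithm operator. It is easy to show that
$$I(W)=H\Big(\frac{1}{q}\sum_{x\in G}\rho_{x}\Big)-\frac{1}{q}\sum_{x\in G}H(\rho_{x}).$$
The quantity $I(W)$ is the capacity for transmitting classical information over the channel $W$ when the prior input distribution is restricted to be uniform in $G$. We have $0\leq I(W)\leq\log q$.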
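As a small numerical illustration of the symmetric Holevo information, the sketch below evaluates the entropy of the average output state minus the average entropy of the output states, in nats. To stay within the Python standard library it is restricted, as an assumption of this example only, to qubit outputs described by real symmetric $2\times2$ density matrices. The helper names (`eigenvalues_2x2`, `pure_state`, `symmetric_holevo_information`) are hypothetical.

```python
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]


def von_neumann_entropy(rho):
    """H(rho) = -Tr(rho log rho) in nats, with the convention 0 log 0 = 0."""
    return -sum(ev * math.log(ev) for ev in eigenvalues_2x2(rho) if ev > 1e-15)


def pure_state(theta):
    """Density matrix of the real qubit state cos(theta)|0> + sin(theta)|1>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def symmetric_holevo_information(states):
    """Entropy of the average output state minus the average output entropy."""
    q = len(states)
    avg = [[sum(r[i][j] for r in states) / q for j in range(2)] for i in range(2)]
    return von_neumann_entropy(avg) - sum(von_neumann_entropy(r) for r in states) / q


if __name__ == "__main__":
    # Orthogonal outputs: the input is perfectly recoverable, so I(W) = log 2.
    print(symmetric_holevo_information([pure_state(0.0), pure_state(math.pi / 2)]))
    # Non-orthogonal outputs: strictly between 0 and log 2.
    print(symmetric_holevo_information([pure_state(0.0), pure_state(math.pi / 4)]))
```

For two pure states with overlap $c$, the average state has eigenvalues $(1\pm|c|)/2$, so the second value can be checked by hand against the binary entropy of $(1+\cos(\pi/4))/2$.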
Besides $I(W)$, we will need another parameter that measures the reliability of the channel $W$. For the binary-input case, the fidelity between the two output states was used as a measure of reliability in [11]. In our case, we have $q$ output states, so we will consider the average pairwise fidelity between them (similarly to the average Bhattacharyya distance defined in [3]):
$$F(W)=\frac{1}{q(q-1)}\sum_{\substack{x,x'\in G\\ x\neq x'}}F(\rho_{x},\rho_{x'}),$$
where $F(\rho,\sigma)=\|\sqrt{\rho}\sqrt{\sigma}\|_{1}$, and $\|A\|_{1}$ is the nuclear norm of the matrix $A$:
$$\|A\|_{1}=\mathrm{Tr}\sqrt{A^{\dagger}A}.$$
Clearly, $0\leq F(W)\leq 1$. We adopt the convention $F(W)=0$ if $q=1$.
It was shown in [14] that the probability of error $\mathbb{P}_{e}(W)$ of the optimal decoder of $W$ can be upper bounded in terms of $F(W)$. This shows that if $F(W)$ is small then $\mathbb{P}_{e}(W)$ is also small and so $W$ is reliable. Intuitively, this is true because a small $F(W)$ means that all the pairwise fidelities are small, which implies that all the output states are easily distinguishable from each other, which in turn should allow reliable decoding.
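The intuition that a small average pairwise fidelity corresponds to easily distinguishable outputs can be checked on toy examples. The sketch below makes several assumptions that are not taken from the text: the outputs are pure states with real amplitudes, the fidelity of two pure states is the non-squared overlap $|\langle\psi|\phi\rangle|$, and the average runs over all ordered pairs of distinct inputs. The function names are hypothetical.

```python
import math
from itertools import permutations


def pure_state_fidelity(psi, phi):
    """|<psi|phi>| for real unit vectors (assumed non-squared convention)."""
    return abs(sum(a * b for a, b in zip(psi, phi)))


def average_pairwise_fidelity(states):
    """Average of the fidelity over ordered pairs of distinct inputs."""
    q = len(states)
    if q == 1:
        return 0.0  # assumed convention for a single-letter alphabet
    pairs = list(permutations(range(q), 2))  # ordered pairs (x, x') with x != x'
    return sum(pure_state_fidelity(states[x], states[y]) for x, y in pairs) / len(pairs)


# Three "trine" states of a qubit, 60 degrees apart in amplitude space.
trine = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(3)]

if __name__ == "__main__":
    print(average_pairwise_fidelity([(1.0, 0.0), (0.0, 1.0)]))  # orthogonal outputs
    print(average_pairwise_fidelity(trine))                      # overlapping outputs
```

Orthogonal outputs give an average fidelity of zero (perfectly distinguishable), while the trine states all overlap by one half, so no measurement can identify the input without error.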
The following proposition provides three inequalities that relate $I(W)$ and $F(W)$.
Proposition 1.
We have:
- (i)
.
- (ii)
.
- (iii)
.
Proof.
See Appendix A. ∎
In the above proposition, the first inequality implies that if $I(W)$ is close to 0 then $F(W)$ is close to 1. The same inequality also implies that if $F(W)$ is close to 0 then $I(W)$ is close to $\log q$. The second inequality implies that if $I(W)$ is close to $\log q$ then $F(W)$ is close to 0. The third inequality implies that if $F(W)$ is close to $1$ then $I(W)$ is close to 0.
II-A Non-commutative union bound
Sen proved in [15] the following “non-commutative union bound”:
$$1-\mathrm{Tr}\big(\Pi_{n}\cdots\Pi_{1}\,\rho\,\Pi_{1}\cdots\Pi_{n}\big)\leq 2\sqrt{\sum_{i=1}^{n}\mathrm{Tr}\big((I-\Pi_{i})\rho\big)},\qquad(1)$$
where $\rho$ is a density matrix and $\Pi_{1},\ldots,\Pi_{n}$ are projection operators. This inequality was used in [11] to upper bound the probability of error of the quantum successive cancellation decoder of the polar code constructed for a binary-input cq-channel. This was possible because the measurements used in [11] are projective. In this paper, the quantum successive cancellation decoder that we propose uses general POVM measurements. Therefore, we cannot use the inequality (1).
We provide a “non-commutative union bound” that is looser than (1) by a multiplicative factor of , but it is more general so that it can be applied to general POVMs.
Lemma 1.
Let be positive operators satisfying . We have:
[TABLE]
Proof.
See Appendix B. ∎
III Polarization process
Since any set can be endowed with an Abelian group operation, we may assume that one such operation on $G$ is fixed. We will denote this Abelian group operation additively.
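Let $W:x\in G\longrightarrow\rho_{x}\in\mathcal{DM}(k)$ be a cq-channel. Define the channels $W^{-}$ and $W^{+}$ as:
$$W^{-}:\;u_{1}\in G\;\longrightarrow\;\rho^{-}_{u_{1}}=\frac{1}{q}\sum_{u_{2}\in G}\rho_{u_{1}+u_{2}}\otimes\rho_{u_{2}},$$
and
$$W^{+}:\;u_{2}\in G\;\longrightarrow\;\rho^{+}_{u_{2}}=\frac{1}{q}\sum_{u_{1}\in G}\rho_{u_{1}+u_{2}}\otimes\rho_{u_{2}}\otimes|u_{1}\rangle\langle u_{1}|.$$
Moreover for every $n\geq 1$ and every $s=(s_{1},\ldots,s_{n})\in\{-,+\}^{n}$, define $W^{s}=\big(\cdots\big((W^{s_{1}})^{s_{2}}\big)\cdots\big)^{s_{n}}$.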
Remark 1.
$W^{-}$ and $W^{+}$ can be constructed as follows:
- Two independent and uniform random variables $U_{1}$ and $U_{2}$ are generated in $G$.
- $X_{1}=U_{1}+U_{2}$ and $X_{2}=U_{2}$ are computed.
- $X_{1}$ is sent through one copy of the channel $W$. Let $B_{1}$ be the quantum system describing the output.
- $X_{2}$ is sent through another copy of the channel $W$ (independent from the one that was used for $X_{1}$). Let $B_{2}$ be the quantum system describing the output.
It can be easily seen that the channels $U_{1}\rightarrow B_{1}B_{2}$ and $U_{2}\rightarrow B_{1}B_{2}U_{1}$ simulate $W^{-}$ and $W^{+}$ respectively.
We have:
[TABLE]
This shows that the total symmetric Holevo information is conserved. Moreover,
[TABLE]
and
[TABLE]
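The conservation of the total symmetric information under one polarization step can be checked numerically in the classical special case, where all output states commute and the Holevo information reduces to the Shannon mutual information. The sketch below is an illustration under stated assumptions, not the paper's derivation: the group is $\mathbb{Z}_{q}$ under addition modulo $q$, the transform is assumed to be $X_{1}=U_{1}+U_{2}$, $X_{2}=U_{2}$, and a channel is represented as a list of dictionaries mapping outputs to probabilities. All function names are hypothetical.

```python
import math


def mutual_information(channel, q):
    """I(X;Y) in nats for a uniform input; channel[x] maps output -> probability."""
    outputs = set()
    for x in range(q):
        outputs.update(channel[x])
    total = 0.0
    for y in outputs:
        p_y = sum(channel[x].get(y, 0.0) for x in range(q)) / q
        for x in range(q):
            p = channel[x].get(y, 0.0)
            if p > 0.0:
                total += (p / q) * math.log(p / p_y)
    return total


def minus_channel(channel, q):
    """U1 -> (Y1, Y2): U2 is uniform and unknown to the receiver."""
    out = []
    for u1 in range(q):
        dist = {}
        for u2 in range(q):
            for y1, p1 in channel[(u1 + u2) % q].items():
                for y2, p2 in channel[u2].items():
                    dist[(y1, y2)] = dist.get((y1, y2), 0.0) + p1 * p2 / q
        out.append(dist)
    return out


def plus_channel(channel, q):
    """U2 -> (Y1, Y2, U1): U1 is uniform and revealed to the receiver."""
    out = []
    for u2 in range(q):
        dist = {}
        for u1 in range(q):
            for y1, p1 in channel[(u1 + u2) % q].items():
                for y2, p2 in channel[u2].items():
                    dist[(y1, y2, u1)] = p1 * p2 / q
        out.append(dist)
    return out


def symmetric_channel(q, p_correct):
    """q-ary symmetric channel: correct symbol w.p. p_correct, else uniform error."""
    p_err = (1.0 - p_correct) / (q - 1)
    return [{y: (p_correct if y == x else p_err) for y in range(q)} for x in range(q)]


if __name__ == "__main__":
    for q in (3, 4):
        w = symmetric_channel(q, 0.8)
        i0 = mutual_information(w, q)
        im = mutual_information(minus_channel(w, q), q)
        ip = mutual_information(plus_channel(w, q), q)
        print(f"q={q}: I(W-)={im:.6f}  I(W)={i0:.6f}  I(W+)={ip:.6f}  sum/2={(im + ip) / 2:.6f}")
```

The check works for composite $q$ as well as prime $q$, because the conservation only relies on the map $(u_{1},u_{2})\mapsto(u_{1}+u_{2},u_{2})$ being a bijection of $G\times G$.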
Let us now study the reliability of the channel $W$ and how it is affected after one step of polarization. But first let us define the quantity $F_{d}(W)$ for every $d\in G$:
$$F_{d}(W)=\frac{1}{q}\sum_{x\in G}F(\rho_{x},\rho_{x+d}).$$
Clearly, and . Note that
[TABLE]
Define . Clearly, .
Proposition 2.
For every , we have:
- •
.
- •
.
Proof.
See Appendix C. ∎
Corollary 1.
We have:
- •
.
- •
.
- •
$F(W^{+})\leq\min\Big\{F(W),\;(q-1)^{2}F(W)^{2}\Big\}$.
- •
**
Proof.
First equation:
[TABLE]
Second equation:
[TABLE]
First part of third equation:
[TABLE]
Second part of third equation:
[TABLE]
First inequality of the fourth equation:
[TABLE]
Second inequality of the fourth equation:
[TABLE]
∎
The following lemma is very useful to prove polarization results.
Lemma 2 [5].
Let be a sequence of independent and uniformly distributed -valued random variables. Suppose and are two processes adapted to the process satisfying:*
- (1)
.
- (2)
* converges almost surely to a random variable .*
- (3)
.
- (4)
* when .*
- (5)
There exists a function (depending only on ) satisfying such that for all , if then .
- (6)
There exists a function (depending only on ) satisfying such that for all , if then .
Then exists almost surely. Moreover, we have and with probability 1.
IV Polarization for $G=\mathbb{F}_{q}$
In this section, we focus on the particular case where $G=\mathbb{F}_{q}$ and $q$ is prime. The main result of this section is the following theorem.
Theorem 1.
Let be a cq-channel with input in . For every , we have:
[TABLE]
Moreover, for every , we have:
[TABLE]
Proof.
Let be a sequence of independent and uniformly distributed -valued random variables. Define the cq-channel-valued process as follows:
- •
.
- •
for every .
Let and . Let us check the conditions of Lemma 2. Conditions (1) and (3) follow from the properties of and . Condition (4) is satisfied because of Corollary 1.
We have . This shows that is a bounded martingale and so it converges almost surely. This shows that condition (2) is satisfied.
Condition (5) follows from the following inequality:
[TABLE]
where (a) is from Proposition 1. By choosing , we can see that condition (5) is satisfied.
In order to show condition (6), we need to prove that if is close to then is close to 0. Let be such that . We have:
[TABLE]
Therefore, for every we have and so
[TABLE]
Assume that is high enough so that
[TABLE]
Now let be such that . Define and let . We have:
[TABLE]
where (a) follows from the fact that is a metric distance [16]. (a), (b) and (c) are true because is a decreasing function on and we assumed Equation (4). We deduce that
[TABLE]
By combining Equation (5) and inequality (iii) of Proposition 1, we get condition (6) of Lemma 2. Therefore, all the conditions of Lemma 2 are satisfied. We conclude that converges almost surely to a random variable . This proves Equation (2).
From Corollary 1 we can deduce that and . Therefore, we can apply the same techniques that were used to prove [17, Theorem 3.5] in order to get Equation (3). ∎
Theorem 1 can be used to construct polar codes for any cq-channel whose input alphabet size is prime. The polar code construction, encoder and decoder are similar to the ones described in [11]. The main idea is to send information only through synthetic cq-channels for which the symmetric Holevo information is close to $\log q$ and for which the average pairwise fidelity is less than $2^{-N^{\beta}}$, where $N=2^{n}$ is the blocklength of the polar code and $\beta<\frac{1}{2}$. We send frozen symbols that are known to the receiver through the remaining synthetic cq-channels. A quantum successive cancellation decoder that is similar to the one in [11] is applied. The probability of error can be shown to decay faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$. We postpone the precise description and the study of the polar code till Section VII where we construct polar codes in the more general case where $G$ is an arbitrary Abelian group.
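V Polarization for arbitrary $G$
In this section, $G$ is an arbitrary Abelian group. For every cq-channel $W:x\in G\longrightarrow\rho_{x}$ and for every subgroup $H$ of $G$, define the channel $W[H]$ as follows:
$$W[H]:\;A\in G/H\;\longrightarrow\;\rho_{A}=\frac{1}{|H|}\sum_{x\in A}\rho_{x}.$$
$W[H]$ can be simulated as follows: if a coset $A\in G/H$ is chosen as input, a random variable $X$ is chosen uniformly from $A$ and then sent through the channel $W$.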
It is easy to see that if , then .
The main result of this section is the following theorem.
Theorem 2.
Let be a cq-channel. For every , we have:
[TABLE]
Theorem 2 can be interpreted as follows: As the number of polarization steps becomes large, the synthetic cq-channels polarize to homomorphism channels projecting their input onto a quotient group of $G$. The inequality $\big|I(W^{s}[H_{s}])-\log|G/H_{s}|\big|<\delta$ means that from the output of $W^{s}$, one can determine with high probability the coset of $H_{s}$ to which the input belongs. The inequality $\big|I(W^{s})-\log|G/H_{s}|\big|<\delta$ means that there is almost no other information about the input that can be determined from the output of $W^{s}$.
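To make the notion of a deterministic homomorphism channel concrete, the sketch below builds the projection of a cyclic group $\mathbb{Z}_{n}$ onto a quotient $\mathbb{Z}_{n}/H$ and checks that it respects the group operation. The choice of $\mathbb{Z}_{6}$ with $H=\{0,3\}$ is an illustrative assumption, and the function names are hypothetical. A channel that outputs exactly the coset of its input conveys $\log|G/H|$ nats and nothing more.

```python
import math


def coset(x, n, subgroup):
    """The coset x + H in Z_n, represented as a sorted tuple of its elements."""
    return tuple(sorted((x + h) % n for h in subgroup))


def quotient(n, subgroup):
    """All distinct cosets of H in Z_n."""
    return sorted({coset(x, n, subgroup) for x in range(n)})


def is_homomorphism(n, subgroup):
    """Check that x -> x + H is compatible with addition of coset representatives."""
    return all(
        coset(x + y, n, subgroup)
        == coset(coset(x, n, subgroup)[0] + coset(y, n, subgroup)[0], n, subgroup)
        for x in range(n)
        for y in range(n)
    )


if __name__ == "__main__":
    n, H = 6, (0, 3)
    cosets = quotient(n, H)
    print("cosets:", cosets)
    print("projection is a homomorphism:", is_homomorphism(n, H))
    print("information carried (nats):", math.log(len(cosets)))
```

With $H=\{0\}$ the projection is the identity (a perfect channel) and with $H=G$ it is constant (a useless channel), so the two-level polarization of the prime case appears as the two extreme choices of subgroup.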
In order to prove Theorem 2 we need several definitions and lemmas. Let be a sequence of independent and uniformly distributed -valued random variables. Define the cq-channel-valued process as follows:
- •
.
- •
for every .
Lemma 3.
For every subgroup of , the process is a sub-martingale.
Proof.
It is sufficient to show that . Let and be as in Remark 1. We have:
[TABLE]
∎
Let be two subgroups of . For every coset of , let be the set of cosets of which are subsets of . Define the channel as follows:
[TABLE]
can be simulated as follows: if a coset is chosen as input, a random variable is chosen uniformly from and then sent through the channel .
Define the following:
- •
.
- •
.
The following lemma relates to .
Lemma 4.
.
Proof.
Let . We have and . Therefore,
[TABLE]
where (a) follows from the fact that conditioning on , the state of the input-output system becomes and so the mutual information between and becomes exactly . ∎
The following lemma relates to .
Lemma 5.
For every , we have:
- (1)
.
- (2)
There exists depending only on such that if is maximal in (i.e., is prime) and if , then
[TABLE]
Proof.
See Appendix D. ∎
Lemma 6.
For every two subgroups of where is maximal in (i.e., is prime), the process converges almost surely to a random variable and the process converges almost surely to a random variable .
Proof.
Let and . We will show that and satisfy the conditions of Lemma 2, where is replaced with . Conditions (1) and (3) are obviously satisfied. Condition (4) is also satisfied because of Proposition 2.
Since and since and are sub-martingales by Lemma 3, we conclude that converges almost surely. Therefore, condition (2) is satisfied.
To see that condition (5) is satisfied, assume that is close to zero, then the first inequality of Lemma 5 implies that is close to zero for every . The first inequality of Proposition 1 then shows that is close to , for every . Lemma 4 now implies that is close to .
To see that condition (6) is satisfied, assume that is close to 1, then the second inequality of Lemma 5 implies that is close to 1 for every . The third inequality of Proposition 1 then shows that is close to zero, for every . Lemma 4 now implies that is close to zero.
We conclude that converges almost surely to a random variable taking values in and converges almost surely to a random variable taking values in . ∎
Lemma 7.
Let . If for all , then
[TABLE]
Proof.
We may assume without loss of generality that and . Define , and for every , let .
For every $i$, we have $\displaystyle 1-F_{d_{i}}(W)=\frac{1}{q}\sum_{x\in G}\big(1-F(\rho_{x},\rho_{x+d_{i}})\big)$. Therefore, for every $x\in G$, we have $1-F(\rho_{x},\rho_{x+d_{i}})\leq q\big(1-F_{d_{i}}(W)\big)$ and so $F(\rho_{x},\rho_{x+d_{i}})\geq 1-q\big(1-F_{d_{i}}(W)\big)$. Therefore,
[TABLE]
where (a) follows from the fact that is a metric distance [16]. (a) and (b) are true because is a decreasing function on and we assumed that for every . We conclude that
[TABLE]
∎
Lemma 8.
Let be such that and let be the subgroup generated by . We have:
- •
If for every maximal subgroup of .
- •
If for every maximal subgroup of , then
[TABLE]
Proof.
Let be a maximal subgroup of . Since , then we must have and . Therefore,
[TABLE]
Now let be the maximal subgroups of . For every , let be such that and . It was shown in [5] that , which means that there are such that . Moreover, can be chosen so that .
Since for all , Lemma 7 implies that
[TABLE]
where (a) and (b) are true because is decreasing on and because we assumed that for all . ∎
Proposition 3.
For every , the process converges almost surely to a random variable . Moreover, the random set is almost surely a subgroup of .
Proof.
Let be such that . Let be the subgroup generated by . Lemma 6 shows that for every maximal subgroup of , the process converges almost surely to a random variable taking values in .
Take a sample of the process for which converges to either 0 or 1 for every maximal subgroup of . We have:
- •
If there exists a maximal subgroup of for which converges to 0, then the first point of Lemma 8 implies that converges to 0 as well.
- •
If converges to 1 for all maximal subgroups of , then the second point of Lemma 8 implies that converges to 1 as well.
We conclude that for every , the process converges almost surely to a random variable . (Note that for , we have for all .)
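Now take a sample of the process $\{W_{n}\}_{n\geq 0}$ for which $\{F_{d}(W_{n})\}_{n\geq 0}$ converges to either 0 or 1 for every $d\in G$. If $d_{1},d_{2}\in G$ are such that $\{F_{d_{1}}(W_{n})\}_{n\geq 0}$ and $\{F_{d_{2}}(W_{n})\}_{n\geq 0}$ converge to 1, then Lemma 7 implies that $\{F_{d_{1}+d_{2}}(W_{n})\}_{n\geq 0}$ converges to 1 as well. We conclude that the set $\big\{d\in G:\;\{F_{d}(W_{n})\}_{n\geq 0}\;\text{converges to}\;1\big\}$ is a subgroup of $G$. ∎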
Corollary 2.
For every , we have
[TABLE]
Lemma 9.
For every $\delta>0$, there exists $\epsilon>0$ depending only on $\delta$ and $q$ such that for every cq-channel $W$, if there exists a subgroup $H$ of $G$ satisfying $F_{d}(W)>1-\epsilon$ for all $d\in H$ and $F_{d}(W)<\epsilon$ for all $d\notin H$, then $\big|I(W)-\log|G/H|\big|<\delta$ and $\big|I(W[H])-\log|G/H|\big|<\delta$.
Proof.
If , then and so $\big|I(W[G])-\log|G/G|\big|=0<\delta$. On the other hand, since , we have for every . Therefore, . The third inequality of Proposition 1 now implies for some function (depending only on and ) which satisfies .
Now assume that . We have
[TABLE]
where (a) follows from the first inequality of Lemma 5. The first inequality of Proposition 1 implies that for some function (depending only on and ) which satisfies .
On the other hand, we have . Assume that , where is given by Lemma 5. For every , we have
[TABLE]
This means that is close to 1 as well. The third inequality of Proposition 1 now implies that for some function (depending only on and ) which satisfies . We conclude that
[TABLE]
where (a) follows from Lemma 4. We conclude that
[TABLE]
If we define , we get $\big|I(W)-\log|G/H|\big|<\delta_{q}(\epsilon)$ and $\big|I(W[H])-\log|G/H|\big|<\delta_{q}(\epsilon)$ in all cases. Moreover, .
This concludes the proof of the lemma. ∎
The proof of Theorem 2 now follows immediately from Corollary 2 and Lemma 9.
VI Rate of polarization
In order to derive the rate of polarization (i.e., how fast the synthetic cq-channels polarize), we need the following two lemmas.
Lemma 10.
For every subgroup of , we have:
- •
.
- •
.
Proof.
See Appendix E. ∎
Lemma 11.
For any and any , we have
[TABLE]
Proof.
The lemma is trivial if , so let us assume that . Let be a sequence of subgroups of satisfying:
- •
.
- •
is maximal in for every .
Let be the process defined in the previous section. Lemma 6 implies that converges almost surely to a random variable . On the other hand, we have
[TABLE]
This shows that the process converges almost surely to a random variable satisfying
[TABLE]
Due to the relations between the quantities and in Proposition 1, we can see that converges to 0 whenever converges to , and there is a number such that whenever converges to a number in other than . Therefore, we can say that almost surely, we have:
[TABLE]
Now from Lemma 10, we have and . By applying exactly the same techniques that were used to prove [17, Theorem 3.5] we get:
[TABLE]
By examining the explicit expression of this probability we get the lemma. ∎
Theorem 3.
The polarization of is almost surely fast:
[TABLE]
for any and any .
Proof.
For every subgroup of , define:
[TABLE]
[TABLE]
and
[TABLE]
If $\displaystyle s\in E_{1}/\Big(\bigcup_{H\;\text{subgroup of}\;G}E_{H}\Big)$ then . Therefore,
[TABLE]
and . By Theorem 2 and Lemma 11 we have:
[TABLE]
∎
VII Polar code construction
Let be an arbitrary cq-channel.
Choose and , and let be an integer such that
[TABLE]
where
[TABLE]
Such an integer exists due to Theorem 3. For every choose a subgroup of as follows:
- •
If , define . We clearly have .
- •
If , choose a subgroup $H_{s}$ of $G$ such that , $\big|I(W^{s})-\log|G/H_{s}|\big|<\frac{\delta}{2}$ and $\big|I(W^{s}[H_{s}])-\log|G/H_{s}|\big|<\frac{\delta}{2}$.
Now for every $s\in\{-,+\}^{n}$, let $f_{s}:G/H_{s}\rightarrow G$ be a frozen mapping (in the sense that the receiver knows $f_{s}$) such that $f_{s}(A)\in A$ for all $A\in G/H_{s}$. We call such a mapping a section mapping of $G/H_{s}$. Let $\tilde{U}^{s}$ be a random coset chosen uniformly in $G/H_{s}$ and we let $U^{s}=f_{s}(\tilde{U}^{s})$. Note that if the receiver can determine $\tilde{U}^{s}$ accurately, then he can also determine $U^{s}$ since he knows $f_{s}$.
If $H_{s}\neq\{0\}$, we have some freedom in the choice of the section mapping $f_{s}$. We will analyze the performance of polar codes averaged over all possible section mappings, i.e., we assume that $f_{s}$ is chosen uniformly from all the possible section mappings of $G/H_{s}$. We can easily see that the induced distributions of $\big\{U^{s}:\;s\in\{-,+\}^{n}\big\}$ are independent and uniform in $G$. Note that for every $s\in\{-,+\}^{n}$, the receiver has to determine $\tilde{U}^{s}$ in order to successfully determine $U^{s}$.
VII-A Encoder
We associate the set with the strict total order defined as if and only if for some and for all .
For every , every and every , define recursively on as follows:
- •
if and .
- •
if , and .
- •
if , and .
For every , we write as and as .
Let be a set of independent copies of the channel . should not be confused with : is a copy of the channel and is a synthetic cq-channel obtained from as before.
Let $(U^{s})_{s\in S_{n}}$ be the sequence of independent random variables that were defined before. For every , and , define $U_{s'}^{s''}=\mathcal{E}_{s'}^{s''}\big((U^{s})_{s\in S_{n}}\big)$. We have:
- •
if and .
- •
if , and .
- •
if , and .
For every , let . It is easy to see that are independent and uniformly distributed in .
For every , we send through the channel . Let be the system describing the output of the channel , and let . We can prove by backward induction on that the channel $U_{s'}^{s''}\rightarrow\big(\{B_{s}\}_{s\text{ has }s'\text{ as prefix}},\{U_{s'}^{r}\}_{r<s''}\big)$ is equivalent to the channel for every , and . In particular, the channel $U^{s}\rightarrow\big(B,\{U^{r}\}_{r<s}\big)$ is equivalent to the channel $W^{s}$ for every $s\in S_{n}$.
Note that the encoding algorithm described above has a complexity of $O(N\log N)$, where $N=2^{n}$ is the blocklength of the polar code.
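The $O(N\log N)$ complexity can be seen from a recursive butterfly implementation. The sketch below is a minimal illustration assuming the group $\mathbb{Z}_{q}$ under addition modulo $q$ and the basic step $(u_{1},u_{2})\mapsto(u_{1}+u_{2},u_{2})$. The index ordering used here (pairing adjacent symbols at each level) is one common convention and is not claimed to match the ordering of the recursive encoder maps defined above. The function name `polar_encode` is hypothetical; the second return value counts group additions so that the complexity can be inspected directly.

```python
def polar_encode(u, q):
    """Return (codeword, number_of_group_additions) for the input u over Z_q."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    if n == 1:
        return list(u), 0
    # One polarization step on adjacent pairs: (u1, u2) -> (u1 + u2, u2).
    combined = [(u[2 * i] + u[2 * i + 1]) % q for i in range(n // 2)]
    passed = [u[2 * i + 1] for i in range(n // 2)]
    # Recurse on both halves; each level costs n/2 additions in total.
    left, cost_left = polar_encode(combined, q)
    right, cost_right = polar_encode(passed, q)
    return left + right, n // 2 + cost_left + cost_right


if __name__ == "__main__":
    codeword, additions = polar_encode([1, 2, 3, 4], 5)
    print(codeword, additions)
    _, additions = polar_encode([0] * 1024, 5)
    print("N=1024 additions:", additions)  # (N/2) * log2(N)
```

The addition count satisfies $C(N)=N/2+2C(N/2)$ with $C(1)=0$, hence $C(N)=\frac{N}{2}\log_{2}N$.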
VII-B Quantum successive cancellation decoder
Before describing the decoder, let us fix a few useful notations.
For every , define and . For every , define the following:
- •
For every , let .
- •
For every , let .
- •
Define . This means that if for every , then the receiver sees the state at the output.
It is easy to see that for every , we have , where
[TABLE]
and
[TABLE]
Moreover, we have , where
[TABLE]
and
[TABLE]
Lemma 12.
For every , there exists a POVM such that the POVM defined as
[TABLE]
satisfies
[TABLE]
Proof.
See Appendix F. ∎
For every , every and every , define the POVM as:
[TABLE]
Now we are ready to describe the quantum successive cancellation decoder. We will decode successively by respecting the order on . At the stage , we would have decoded and obtained an estimate of it, so we apply the POVM on the output system and we let be the measurement result. We assume that the POVM measurement is designed so that if was the state of the system before the measurement, and if the output occurs, then the post-measurement state is .
The whole procedure is equivalent to applying the POVM defined as:
[TABLE]
where are the elements of ordered according to the order relation .
It is easy to see that for every , and .
VII-C Performance of polar codes
For every , let be the set of section mappings of . We have:
[TABLE]
It is easy to see that . Define
[TABLE]
For every and every , define $f(\tilde{u})=\big(f_{s}(\tilde{u}^{s})\big)_{s\in S_{n}}\in G^{S_{n}}$.
The probability of error of the quantum successive cancellation decoder for a particular choice of is given by:
[TABLE]
where is uniformly distributed in .
The probability of error averaged over all the choices of is:
[TABLE]
where is uniformly distributed in , and $U=(U^{s})_{s\in S_{n}}=F(\tilde{U})=\big(F_{s}(\tilde{U}^{s})\big)_{s\in S_{n}}$. It is easy to see that are independent and uniformly distributed in . We have:
[TABLE]
where (a) follows from the “non-commutative union bound” of Lemma 1. (b) follows from the concavity of the square root. (c) follows from the fact that , which implies that and , which in turn implies that . (d) follows from the fact that depends only on and . (e) follows from the fact that for every and every , we have:
[TABLE]
On the other hand, we have:
[TABLE]
Therefore,
[TABLE]
where (a) follows from Lemma 12.
The above upper bound was calculated on average over a random choice of the frozen section mappings. Therefore, there is at least one choice of the frozen section mappings for which the upper bound of the probability of error still holds.
It remains to study the rate of the constructed polar code. The rate at which we are communicating is . On the other hand, we have $\big|I(W^{s})-\log|G/H_{s}|\big|<\frac{\delta}{2}$ for all . Now since we have , we conclude that:
[TABLE]
where .
We have thus proven the following theorem, which is the main result of this paper:
Theorem 4.
Let $W$ be an arbitrary cq-channel, where the input alphabet is endowed with an Abelian group operation. For every $\delta>0$ and every $0<\beta<\frac{1}{2}$, there exists a polar code of blocklength $N=2^{n}$ based on the group operation which has a rate $R>I(W)-\delta$ and an encoder algorithm of complexity $O(N\log N)$. Moreover, the probability of error of the quantum successive cancellation decoder is less than $2^{-N^{\beta}}$.
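To get a feeling for how quickly a bound of the form $2^{-N^{\beta}}$ with $N=2^{n}$ and $\beta<\frac{1}{2}$ decays, the short sketch below evaluates its base-2 logarithm, $-2^{n\beta}$, for a few blocklengths. It is plain arithmetic on the formula and says nothing about how large $n$ must be for the bound to hold for a given channel. The function name is hypothetical.

```python
import math


def log2_error_bound(n, beta):
    """log2 of 2^{-N^beta} with N = 2^n, i.e. -(2^(n * beta))."""
    if not 0 < beta < 0.5:
        raise ValueError("beta must lie strictly between 0 and 1/2")
    return -(2.0 ** (n * beta))


if __name__ == "__main__":
    for n in (10, 16, 20):
        for beta in (0.25, 0.4, 0.49):
            print(f"n={n:2d}  beta={beta:.2f}  log2(bound)={log2_error_bound(n, beta):10.2f}")
```

For example, with $n=20$ and $\beta=0.4$ the exponent is $N^{\beta}=2^{8}=256$, so the bound is $2^{-256}$.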
VIII Polar codes for arbitrary classical-quantum MACs
An $m$-user classical-quantum multiple access channel (cq-MAC)
[TABLE]
takes classical inputs $x_{1}\in G_{1},\ldots,x_{m}\in G_{m}$ from the $m$ users and produces a quantum output $\rho_{x_{1},\ldots,x_{m}}$. We assume that the input alphabets $G_{1},\ldots,G_{m}$ are finite but their sizes can be arbitrary.
The achievable rate-region is described by a collection of inequalities [18]:
[TABLE]
where , , , and the mutual information is computed according to the following state:
[TABLE]
for some independent probability distributions on for .
We are interested in the case where the probability distributions of are uniform in respectively. We define the symmetric capacity region of as
[TABLE]
where is computed according to
[TABLE]
The set $\big\{(R_{1},\ldots,R_{m})\in\mathcal{J}(W):\;R_{1}+\ldots+R_{m}=I(W)\big\}$ is called the dominant face of $\mathcal{J}(W)$, where $I(W)$ is the symmetric sum-capacity of $W$.
For every , we fix an Abelian group operation on and we denote it additively. It is possible to construct cq-MAC codes which achieve the rates in the region using one of the following two methods:
- •
By using the monotone chain rule method of Arıkan [13] and applying a polarization transformation using the Abelian group operation for each user.
- •
By using the rate-splitting method described in [9] and applying a polarization transformation using the Abelian group operation for each user.
By using the cq-channel polarization results of this paper and a similar analysis as in [13], [9] and [12], we can show that both methods yield cq-MAC codes that achieve the whole region $\mathcal{J}(W)$ and for which the probability of error of the quantum successive cancellation decoder decays faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$, where $N$ is the blocklength of the code.
However, one may hesitate to call the codes obtained using these methods cq-MAC polar codes because they are not based on the polarization of cq-MACs. These methods are hybrid schemes which combine cq-channel polarization (not cq-MAC polarization) with other techniques. Moreover, the code construction for these methods is more complicated than that of cq-MAC polar codes. In the rest of this section, we describe how cq-MAC polar codes are constructed.
We define the cq-MACs and as follows:
[TABLE]
[TABLE]
where
[TABLE]
and
[TABLE]
Note that the cq-MAC can be seen as a cq-channel with input in . Moreover, and when seen as cq-channels can be obtained from the cq-channel by applying the polarization transformation which uses the Abelian group operation of the product group . Therefore, the cq-channel polarization results of the previous sections can be applied to . In particular, we have:
- $I(W^{-})+I(W^{+})=2I(W)$. This shows that the symmetric sum-capacity is conserved by the polarization transformation and that for every , the region contains points on the dominant face of .
- For every subgroup of , we have by Lemma 3. Therefore, for every , we have
[TABLE]
where,
[TABLE]
Equation (6) shows that although the symmetric sum-capacity is conserved by polarization, the highest achievable individual rates can decrease. In other words, polarization can induce a loss in the symmetric capacity region.
- Theorem 3 implies that
[TABLE]
In other words, as the number of polarization steps becomes large, the synthetic cq-MACs become close to deterministic homomorphism channels which project the input onto some quotient group of the product group .
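The first two properties in the list above can be checked numerically in the classical special case, since a classical MAC is a cq-MAC whose output states commute. The following is a minimal sketch under stated assumptions: two binary users, so the input group is $\mathbb{Z}_2\times\mathbb{Z}_2$ and componentwise addition becomes a bitwise XOR on the flattened index `x = 2*x1 + x2`. The function names, the random test channel, and the convention that the "minus" channel sees `u` through the pair of inputs `(u + v, v)` are illustrative choices, not taken from the paper.

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in bits for a joint pmf stored as a 2-D array joint[a, b]."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    prod = pa * pb
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))

def sum_capacity(chan):
    """Symmetric sum-capacity: I(X1 X2; Y) with uniform inputs."""
    n_in = chan.shape[0]
    return mutual_information(chan.reshape(n_in, -1) / n_in)

def user1_rate(chan):
    """I(X1; Y, X2) with uniform inputs; the input index is x = 2*x1 + x2."""
    joint = chan.reshape(2, 2, -1) / 4.0        # joint[x1, x2, y]
    return mutual_information(joint.reshape(2, -1))

def polarize(chan):
    """One polarization step over Z_2 x Z_2 (group addition = XOR of indices).

    plus[v, y1, y2, u]  = W(y1 | u + v) W(y2 | v) / |G|
    minus[u, y1, y2]    = sum over v of plus[v, y1, y2, u]
    """
    n_in = chan.shape[0]
    out = chan.reshape(n_in, -1)
    n_out = out.shape[1]
    plus = np.zeros((n_in, n_out, n_out, n_in))
    for v in range(n_in):
        for u in range(n_in):
            plus[v, :, :, u] = np.outer(out[u ^ v], out[v]) / n_in
    minus = np.moveaxis(plus.sum(axis=0), -1, 0)
    return minus, plus

# Hypothetical test channel: 4 input pairs, 3 output symbols, random rows.
rng = np.random.default_rng(0)
W = rng.dirichlet(np.ones(3), size=4)
W_minus, W_plus = polarize(W)

print("sum-capacity:", sum_capacity(W_minus) + sum_capacity(W_plus),
      "vs", 2 * sum_capacity(W))
print("user-1 rate: ", user1_rate(W_minus) + user1_rate(W_plus),
      "vs", 2 * user1_rate(W))
```

In this classical setting the two sum-capacities add up to twice the original one by the chain rule, while the user-1 rates can only add up to at most twice the original, which is one way to see the possible loss in the symmetric capacity region.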
One can employ the properties of subgroups of product groups to show that the polarized cq-MAC is an “easy” cq-MAC in a sense similar to the way easy MACs were defined in [8]. This allows the construction of cq-MAC polar codes for which the probability of error of the quantum successive cancellation decoder decays faster than $2^{-N^{\beta}}$ for any $\beta<\frac{1}{2}$, where $N$ is the blocklength of the code. The region of rates that are achievable by cq-MAC polar codes is given by:
[TABLE]
where
[TABLE]
The cq-MAC polar codes can be compared to the two cq-MAC coding methods that were described at the beginning of this section:
- The cq-MAC polar codes have the advantage that the code construction is simpler.
- The other two coding methods have the advantage that they always achieve the whole symmetric capacity region $\mathcal{J}(W)$, which may not be the case for cq-MAC polar codes in general.
IX Conclusion
We have shown that using a polarization transformation that is based on an Abelian group operation on the input alphabet yields multi-level polarization for arbitrary classical-quantum channels in a similar way as in the case of classical channels. This result made it possible to construct polar codes for arbitrary cq-channels and arbitrary cq-MACs.
One weakness of the results presented here is that the proposed quantum successive cancellation decoder does not seem to have an efficient implementation. This was also the case for the polar codes that were constructed for binary-input cq-channels in [11]. Finding an efficient decoder for the polar codes remains an open problem.
If we define cq-polarizing binary operations as those which can polarize an arbitrary cq-channel to “easy” cq-channels in a sense similar to the definition of classical polarizing binary operations [8], then this paper has shown that Abelian group operations are cq-polarizing. Therefore, being an Abelian group operation is a sufficient condition to be cq-polarizing. On the other hand, from the results of [8] we can deduce that being uniformity-preserving and having a right-inverse that is strongly ergodic are necessary conditions because classical channels are a particular case of cq-channels. Finding a necessary and sufficient condition for a binary operation to be cq-polarizing remains an open problem. Trying to prove a quantum version of the results in [8] by using a similar approach may not be successful because the proof of the sufficient condition in [8] relies heavily on the entropy of the input conditioned on a particular output symbol, and this does not have an analogue in the case of cq-channels.
We have shown that cq-MAC polarization can induce a loss in the symmetric capacity region. A necessary and sufficient condition for in the case of classical MACs was given in [19]. Generalizing the results of [19] to cq-MACs is an open problem. We note that the condition in [19] was given in terms of the Fourier transform of the probability distribution of one input conditioned on the output and on the other input. Since this conditional probability does not have an analogue in the case of cq-MACs, generalizing the results of [19] to cq-MACs might be challenging and a completely different approach might be needed.
Appendix A Proof of Proposition 1
In [20, Prop. 1], it was shown that for every , we have:
[TABLE]
By taking , we obtain:
[TABLE]
where (a) follows from the fact that .
In order to prove the second inequality, define the channel as follows:
[TABLE]
The two additional systems and can be interpreted as additional side information about the input which is provided to the receiver. Note that if are traced out, we recover the channel .
Let . We have:
[TABLE]
where (a) follows from the fact that given , the conditional probability distribution of is uniform in . (b) follows from the fact that the distribution of is uniform in the set
[TABLE]
(c) is true because conditioning on and then tracing out gives the state which just represents with uniform input, where is the binary-input cq-channel defined as and . In other words, the channel is obtained from by restricting the input to .
Now since is a binary-input cq-channel, we have from [11, Prop. 1] that
[TABLE]
Therefore,
[TABLE]
where the last inequality follows from the concavity of the function .
It remains to show the last inequality of Proposition 1. Define the following:
- •
.
- •
, where is an optimal POVM that decodes with the lowest probability of error.
We have:
- •
.
- •
.
From [16, Sec 9.2.3], we have
[TABLE]
where is the trace distance between and . We have:
[TABLE]
where (a) follows from the concavity of the fidelity.
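The relation between fidelity and trace distance quoted from [16, Sec 9.2.3] is presumably the standard pair of inequalities $1-F(\rho,\sigma)\leq D(\rho,\sigma)\leq\sqrt{1-F(\rho,\sigma)^{2}}$ for the non-squared fidelity $F(\rho,\sigma)=\|\sqrt{\rho}\sqrt{\sigma}\|_{1}$. A minimal numerical sanity check, assuming that convention and using randomly generated density matrices (the helper names below are hypothetical), could look as follows.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a Hermitian positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)            # remove tiny negative round-off
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho, sigma):
    """Non-squared fidelity: trace norm of sqrt(rho) sqrt(sigma)."""
    s = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(s.sum())

def trace_distance(rho, sigma):
    """Half the trace norm of rho - sigma."""
    return 0.5 * float(np.abs(np.linalg.eigvalsh(rho - sigma)).sum())

def random_density_matrix(dim, rng):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
bounds_hold = []
for _ in range(100):
    rho = random_density_matrix(3, rng)
    sigma = random_density_matrix(3, rng)
    f = fidelity(rho, sigma)
    d = trace_distance(rho, sigma)
    upper = np.sqrt(max(0.0, 1.0 - f * f))
    bounds_hold.append(1.0 - f <= d + 1e-9 and d <= upper + 1e-9)

print("both bounds hold on all samples:", all(bounds_hold))
```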
Now let be the probability of correct guess of the optimal decoder . We have:
[TABLE]
Therefore,
[TABLE]
where (a) follows from the fact that a random guess gives a probability of correct guess .
On the other hand, we know that . Therefore,
[TABLE]
where (b) follows from the fact that .
By combining (7), (8) and (9), we get:
[TABLE]
Thus,
[TABLE]
which implies that
[TABLE]
where (a) follows from [21, Prop 4.3] and the operational interpretation of the conditional min-entropy of a cq-state in terms of the guessing probability [22]. Therefore,
[TABLE]
Appendix B Proof of Lemma 1
Let . We have:
[TABLE]
where (a) follows from the fact that for every , and the fact that if and , then . (b) follows from the fact that for every , , and the fact that if are two positive operators with , then . (c) follows from the fact that for every positive operator (see [23]). (d) follows from the concavity of the square root.
Appendix C Proof of Proposition 2
Lemma 13.
Let and be two positive semi-definite matrices. We have:
[TABLE]
(Footnote 1: The proof of Lemma 13 is due to Martin Argerami, who kindly answered our question on Math Stack Exchange. In an earlier version of this paper, we used a weaker inequality which we proved using Weyl’s inequality [24] that relates the eigenvalues of with those of and .)
Proof.
Let us first assume that and are invertible. Since the mapping is monotonically decreasing [25], we have . Moreover, since the square root is operator monotone [25], we have . Similarly, . Therefore,
[TABLE]
where (a) follows from the fact that if and , then .
Now let and be two arbitrary positive semi-definite matrices. We have:
[TABLE]
∎
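The proof above relies on two operator-monotonicity facts from [25]: the square root is operator monotone, and a mapping (presumably matrix inversion on invertible positive matrices) is monotonically decreasing. The sketch below checks both numerically on random matrices $A\leq B$. The helper names and the construction of the test matrices are assumptions made for illustration only.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a Hermitian positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * np.sqrt(w)) @ v.conj().T

def min_eig(a):
    """Smallest eigenvalue of the Hermitian part of a."""
    herm = 0.5 * (a + a.conj().T)
    return float(np.linalg.eigvalsh(herm).min())

def random_psd(dim, rng, shift=0.0):
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return m @ m.conj().T + shift * np.eye(dim)

rng = np.random.default_rng(1)
monotone_ok = []
for _ in range(50):
    A = random_psd(4, rng, shift=0.1)    # strictly positive, hence invertible
    B = A + random_psd(4, rng)           # B - A is PSD, so A <= B
    sqrt_gap = psd_sqrt(B) - psd_sqrt(A)               # should be PSD
    inv_gap = np.linalg.inv(A) - np.linalg.inv(B)      # should be PSD
    monotone_ok.append(min_eig(sqrt_gap) >= -1e-9 and min_eig(inv_gap) >= -1e-9)

print("sqrt is monotone and inverse is decreasing on all samples:",
      all(monotone_ok))
```

Such a check does not replace the proof; it only guards against a sign or direction slip when applying the two facts.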
Lemma 14.
Let and be density matrices of the same dimension. Let and be probability distributions on and respectively. We have:
[TABLE]
Proof.
It is sufficient to show the lemma for the case where :
[TABLE]
where (a) follows from Lemma 13. ∎
Now we are ready to prove Proposition 2:
[TABLE]
[TABLE]
where (a) follows from the joint concavity of the fidelity.
[TABLE]
where (a) follows from Lemma 14.
Appendix D Proof of Lemma 5
[TABLE]
where (a) follows from Lemma 14, and (b) follows from the fact that and the fact that {: , and } if and only if {, and }.
Now let us show the second inequality of Lemma 5. Assume that is maximal in and let be such that and . Since $1-F_{d}(W)=\frac{1}{q}\sum_{x\in G}\big(1-F(\rho_{x},\rho_{x+d})\big)$, we have $F(\rho_{x},\rho_{x+d})\geq 1-q(1-F_{d}(W))=1-q\big(1-F_{\max}^{M|H}(W)\big)$ for every .
For every , we have:
[TABLE]
where (a) follows from the fact that (see [16]), where is the trace distance between and , and (b) follows from the fact that (see [16]).
Now let be such that . Since is prime, we can write for some . We have:
[TABLE]
where (a) follows from the fact that is a metric [16]. Note that since is a decreasing function on , (a), (b) and (c) become true if we assume that $1-\sqrt{1-\Big(1-q\big(1-F_{\max}^{M|H}(W)\big)\Big)^{2}}\geq\cos\left(\frac{\pi}{2(q-1)}\right)$. In other words, we can take
[TABLE]
We conclude that
[TABLE]
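The chaining step in this proof uses a metric on quantum states taken from [16]. A natural reading, given the cosine appearing in the final condition, is that this metric is the angle $A(\rho,\sigma)=\arccos F(\rho,\sigma)$, which satisfies the triangle inequality. The following sketch tests that inequality on random states. It assumes the non-squared fidelity convention, and all helper names are hypothetical.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a Hermitian positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho, sigma):
    """Non-squared fidelity: trace norm of sqrt(rho) sqrt(sigma)."""
    s = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(s.sum())

def angle(rho, sigma):
    """Angle between two states; clipping guards against round-off above 1."""
    return float(np.arccos(np.clip(fidelity(rho, sigma), 0.0, 1.0)))

def random_density_matrix(dim, rng):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(2)
triangle_ok = []
for _ in range(100):
    rho = random_density_matrix(3, rng)
    sigma = random_density_matrix(3, rng)
    tau = random_density_matrix(3, rng)
    triangle_ok.append(
        angle(rho, tau) <= angle(rho, sigma) + angle(sigma, tau) + 1e-9
    )

print("triangle inequality holds on all samples:", all(triangle_ok))
```

Applying the triangle inequality along a chain of states is what lets a bound on the fidelity of neighbouring states be turned into a bound on the fidelity of the end points.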
Appendix E Proof of Lemma 10
Lemma 15.
For every subgroup of , we have:
[TABLE]
Proof.
[TABLE]
where (a) follows from the concavity of the fidelity and from the fact that . ∎
Now we are ready to prove Lemma 10. The lemma is trivial for . Assume that . We have:
[TABLE]
where (a) follows from Lemma 5. (b) follows from Proposition 2. (c) follows from the fact that for every , if then either or , and so . (d) follows from Lemma 15.
On the other hand,
[TABLE]
where (a) follows from Lemma 5, (b) follows from Proposition 2 and (c) follows from Lemma 15.
Appendix F Proof of Lemma 12
It is sufficient to show the following simpler version:
Lemma 16.
If is a cq-channel such that
[TABLE]
where and is an orthonormal basis of the Hilbert space of dimension , then for every , there exists a POVM such that the POVM defined as
[TABLE]
satisfies
[TABLE]
Proof.
For every , define the cq-channel . The optimal decoder for satisfies [14]. Therefore, there exists a POVM satisfying,
[TABLE]
For every , define
[TABLE]
It is easy to see that is a valid POVM. We have:
[TABLE]
∎
- [1] E. Arıkan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, 2009.
- [2] E. Arıkan and E. Telatar, “On the rate of channel polarization,” in IEEE International Symposium on Information Theory (ISIT 2009), 2009.
- [3] E. Şaşoğlu, E. Telatar, and E. Arıkan, “Polarization for arbitrary discrete memoryless channels,” in IEEE Information Theory Workshop (ITW 2009), 2009, pp. 144–148.
- [4] W. Park and A. Barg, “Polar codes for q-ary channels,” IEEE Transactions on Information Theory, vol. 59, no. 2, pp. 955–969, 2013.
- [5] A. G. Sahebi and S. S. Pradhan, “Multilevel channel polarization for arbitrary discrete memoryless channels,” IEEE Transactions on Information Theory, vol. 59, no. 12, pp. 7839–7857, Dec 2013.
- [6] R. Nasser and E. Telatar, “Polar codes for arbitrary DMCs and arbitrary MACs,” IEEE Transactions on Information Theory, vol. 62, no. 6, pp. 2917–2936, June 2016.
- [7] R. Nasser, “Ergodic theory meets polarization. I: An ergodic theory for binary operations,” CoRR, vol. abs/1406.2943, 2014. [Online]. Available: http://arxiv.org/abs/1406.2943
- [8] R. Nasser, “Ergodic theory meets polarization. II: A foundation of polarization theory,” CoRR, vol. abs/1406.2949, 2014. [Online]. Available: http://arxiv.org/abs/1406.2949
