
TL;DR
This paper gives a simplified proof of the fast polarization property of polar codes. The property was first proved for the binary polarizing kernel by Arıkan and Telatar, and the proof was later generalized by Şaşoğlu.
Contribution
The paper gives a simpler proof of the lemma that underlies fast polarization in polar codes. The simpler argument also yields a stronger, almost-sure version of the result.
Findings
Simplified proof of the fast polarization lemma (Lemma 1), obtained via a stronger almost-sure statement (Lemma 2)
Clarifies the theoretical foundation of polar code polarization
Potentially facilitates further research and applications in coding theory
A Simple Proof of Fast Polarization
Ido Tal
Department of Electrical Engineering
Technion – Haifa 32000, Israel
E-mail: [email protected]
Abstract
Fast polarization is an important and useful property of polar codes. It was proved for the binary polarizing kernel by Arıkan and Telatar. The proof was later generalized by Şaşoğlu. We give a simplified proof.
Index Terms:
polar codes, fast polarization
I Introduction
Polar codes are a novel family of error correcting codes, invented by Arıkan [1]. The seminal definitions and assumptions in [1] were soon expanded and generalized. Key to almost all the results involving polar codes is the concept of fast polarization. The essence of fast polarization is the phenomenon stated in the following lemma. The lemma was used implicitly by Korada, Şaşoğlu, and Urbanke [2, proof of Theorem 11], and is a generalization of a result by Arıkan and Telatar [3, Theorem 3]. Its explicit formulation and full proof are due to Şaşoğlu [4, Lemma 5.9].
Lemma 1
Let be an i.i.d. process where is uniformly distributed over . Let be a -valued random process where
[TABLE]
We assume and . Suppose also that converges almost surely to a -valued random variable . Then, for any
[TABLE]
we have
[TABLE]
The lemma is used to prove that the Bhattacharyya parameter associated with a random variable that underwent polarization (for example, a synthesized channel) polarizes to 0 at a rate faster than polynomial [4, Theorem 5.4]. A similar claim holds in the case of polarization of the Bhattacharyya parameter to 1 [5, Theorem 16].
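The phenomenon can be observed numerically. The sketch below is an illustration under assumptions that are not taken from this paper: it uses the well-known Bhattacharyya recursion of the binary erasure channel under Arıkan's binary kernel, where Z becomes Z^2 or 2Z - Z^2 with probability 1/2 each, and it tracks log2(Z) to avoid floating-point underflow. Since 2Z - Z^2 <= 2Z, this process obeys a bound of the form "constant times a power of Z" on each of its two equiprobable branches. The function names, the threshold 2^(-2^(n*beta)) with beta = 0.25, and the trial counts are hypothetical choices made for the illustration.

```python
import math
import random

LN2 = math.log(2.0)

def minus_step(log2_z):
    """Assumed BEC 'minus' branch: Z -> 2Z - Z^2 = Z * (2 - Z), in the log2 domain."""
    one_minus_z = -math.expm1(log2_z * LN2)  # 1 - Z, computed accurately near Z = 1
    # log2(2 - Z) = log1p(1 - Z) / ln 2; clamp so rounding never gives Z > 1.
    return min(0.0, log2_z + math.log1p(one_minus_z) / LN2)

def plus_step(log2_z):
    """Assumed BEC 'plus' branch: Z -> Z^2, i.e. log2 Z doubles."""
    return 2.0 * log2_z

def fraction_fast(z0, n, beta, trials, seed=0):
    """Fraction of simulated paths with Z_n <= 2^(-2^(n * beta))."""
    rng = random.Random(seed)
    threshold = -(2.0 ** (n * beta))  # threshold on log2 Z_n
    start = math.log2(z0)
    fast = 0
    for _ in range(trials):
        log2_z = start
        for _ in range(n):
            # Each branch is taken with probability 1/2.
            if rng.random() < 0.5:
                log2_z = plus_step(log2_z)
            else:
                log2_z = minus_step(log2_z)
        fast += log2_z <= threshold
    return fast / trials

if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, fraction_fast(z0=0.5, n=n, beta=0.25, trials=2000))
```

For the erasure channel the process is a martingale, so it converges to 0 with probability 1 - z0. With z0 = 0.5 the printed fraction should therefore approach one half as n grows.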
The original proof [4, Lemma 5.9] of Lemma 1 is somewhat involved. To summarize, if were equal to , the proof would follow almost directly from the strong law of large numbers. However, for , a sequence of bootstrapping arguments is applied. That is, a current bound on the rate of convergence of to [math] is used to derive a stronger bound, and the process is repeated.
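To see why the case in which the multiplicative constant equals 1 is easy, consider the following sketch. The notation is an assumption made for this illustration: \ell is the number of values each B_i can take, D_t > 0 is the exponent applied when B_{n+1} = t, E is the average of the \log_\ell D_t, and n_0 is a time at which Z_{n_0} is at most some \epsilon < 1.

```latex
% Assumed recursion with constant K = 1 (illustrative notation):
%   Z_{n+1} \le Z_n^{D_t} whenever B_{n+1} = t, with 0 < Z_{n_0} \le \epsilon < 1.
\begin{align*}
-\log_2 Z_n
  &\ge \Bigl(\prod_{i=n_0+1}^{n} D_{B_i}\Bigr)\,\bigl(-\log_2 Z_{n_0}\bigr), \\
\frac{1}{n-n_0}\,\log_\ell \prod_{i=n_0+1}^{n} D_{B_i}
  &= \frac{1}{n-n_0}\sum_{i=n_0+1}^{n} \log_\ell D_{B_i}
  \;\xrightarrow{\text{a.s.}}\; \frac{1}{\ell}\sum_{t=1}^{\ell}\log_\ell D_t = E .
\end{align*}
% Hence, for any 0 < \beta < E, almost surely Z_n \le 2^{-\ell^{n\beta}} for all large n.
```

The first line follows by induction from the recursion, and the second is the strong law of large numbers applied to the i.i.d. terms \log_\ell D_{B_i}. When the constant is larger than 1, the extra factor must be dealt with before this argument applies.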
The main aim of this paper is to give a simpler proof of Lemma 1. Thus, we hopefully give insight into the simple mechanics that are at play. Our simpler proof also leads to a stronger result. That is, we will prove the following, which implies Lemma 1.
Lemma 2
Let , , , and be as in Lemma 1. Then, for ,
[TABLE]
Note that Lemma 2 has an “almost sure flavor” [6, page 69, Equation (2)], while Lemma 1 has an “in probability flavor” [6, page 70, Equation (5)]. We prove Lemma 2 in Section II and show that it implies Lemma 1 in Section III.
II Proof of Lemma 2
Let and be given constants, specified towards the end. We now define three events, denoted , , and .
[TABLE]
Recall that the converge almost surely to . Thus, essentially by definition (see [6, Theorem 4.1.1.]), we have for any fixed that
[TABLE]
Note that event is concerned with the frequency of in the subsequence of i.i.d. random variables , which are uniform over . Thus, by the strong law of large numbers [6, Theorem 5.4.2.] (applied times, each application with respect to the indicators , where ; as before, we use [6, Theorem 4.1.1.]), we have for any fixed and that
[TABLE]
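The strong law at work here can be illustrated with a short simulation. The sketch below draws i.i.d. uniform values, discards a prefix, and reports the largest deviation of any empirical frequency from the uniform probability. The alphabet size 3, the prefix length, the sample sizes, and the function name are hypothetical choices, not values from the paper.

```python
import random
from collections import Counter

def max_frequency_deviation(num_values, skip, n, seed=0):
    """Largest |empirical frequency - 1/num_values| over draws skip+1, ..., n."""
    rng = random.Random(seed)
    # i.i.d. draws, uniform over {1, ..., num_values}
    draws = [rng.randint(1, num_values) for _ in range(n)]
    counts = Counter(draws[skip:])  # only the subsequence after the prefix
    window = n - skip
    return max(abs(counts[t] / window - 1.0 / num_values)
               for t in range(1, num_values + 1))

if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, max_frequency_deviation(num_values=3, skip=100, n=n))
```

The printed deviation should shrink as n grows, which is the behavior the event in question requires from some point onward.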
We deduce that for any there exist such that
[TABLE]
and
[TABLE]
Hence,
[TABLE]
Let us see what the event implies. For fixed , let and be as above. Define the shorthand
[TABLE]
Note that is non-negative, and approaches [math] as approaches [math]. By the definition of the events and , we have that when . Thus, when . Hence, we simplify (1) to
[TABLE]
The above equation is the heart of the proof: we have effectively managed to “make equal ” — the simple case discussed earlier. We have “paid” for this simplification by having the exponents be instead of the original . However, since can be made arbitrarily close to [math], this will not be a problem. Essentially, all that remains is some simple algebra, followed by taking the relevant parameters small/large enough. We do this now.
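The absorption step can be written out as follows. This is a sketch in assumed notation rather than the paper's exact display: K \ge 1 is the multiplicative constant, D_t > 0 is the exponent of the branch taken, \epsilon is a bound with 0 < Z_n \le \epsilon < 1, and \theta is a hypothetical name for the shift in the exponent.

```latex
\begin{align*}
\theta &\triangleq \frac{\log K}{\log(1/\epsilon)} \ge 0,
  \qquad\text{so that}\qquad
  K = \epsilon^{-\theta} \le Z_n^{-\theta}
  \quad\text{whenever } 0 < Z_n \le \epsilon, \\
K \cdot Z_n^{D_t} &\le Z_n^{-\theta} \cdot Z_n^{D_t} = Z_n^{D_t - \theta} .
\end{align*}
% \theta \to 0 as \epsilon \to 0, so the shifted exponents D_t - \theta stay close to D_t.
```

Once the process is known to stay below \epsilon, the constant disappears at the cost of slightly smaller exponents, and the cost vanishes as \epsilon shrinks.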
Let us assume is small enough such that for all and that . Recall also that . Combining (10) with event , we deduce that for all ,
[TABLE]
where the above “” notation is in fact a function of , defined as
[TABLE]
By the definition of event , we have that . We will further assume that . Hence, (11) simplifies to the claim that for all ,
[TABLE]
where
[TABLE]
In light of (3), our goal is to show that for a given and , we can choose , and as above such that . We do this by showing that each of the three sums in (13) can be made smaller than . Recalling that goes to [math] as tends to [math], we deduce that the first sum can be made smaller than by taking small enough. Similarly, we can make the second sum smaller than by taking small enough. For the third sum, we first fix large enough such that (7) holds (note that event is a function of , which is by now fixed). Lastly, we take large enough such that the third sum is smaller than for all , and (8) holds (again, note that event is a function of and , which have been fixed).
Recall that our aim is to prove (3). We deduce from (9), (12), and the above paragraph that for all and ,
[TABLE]
Indeed, we have just proved that for the parameters fixed as above, implies , for . Since we are taking the limit of a non-decreasing sequence, the assertion follows (and the limit exists, since the sequence is bounded).
Since the above inequality holds for all , it must also hold for . Thus, all that remains is to prove that
[TABLE]
Assume to the contrary that there exists and such that
[TABLE]
Clearly, this implies that
[TABLE]
Hence,
[TABLE]
contradicting the fact that the sequence converges almost surely to .
III Proof of Lemma 1
We now explain why Lemma 2 implies Lemma 1. That is, why (3) implies (2). Clearly, (3) implies
[TABLE]
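One way to see an implication of this kind is through event inclusion. The sketch below uses assumed notation: T_n stands for the threshold that Z_n is compared against, and A_{n_0} is a hypothetical name for the event that the bound holds from time n_0 onward.

```latex
% A_{n_0} = \{ Z_n \le T_n \text{ for all } n \ge n_0 \}
% is contained in \{ Z_m \le T_m \} for every m \ge n_0.
\begin{align*}
P[Z_m \le T_m] &\ge P[A_{n_0}] \quad\text{for all } m \ge n_0, \\
\liminf_{m\to\infty} P[Z_m \le T_m] &\ge \lim_{n_0\to\infty} P[A_{n_0}] .
\end{align*}
```

The second line holds because the first is valid for every n_0, and the events A_{n_0} grow with n_0, so their probabilities have a limit.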
Thus, the claim will follow if we prove that
[TABLE]
Assume to the contrary that there exists such that
[TABLE]
The above implies that the cannot converge in probability to [6, page 70, Equation (5)]. This contradicts the fact that almost sure convergence implies convergence in probability [6, Theorem 4.2].
References
- [1] E. Arıkan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inform. Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
- [2] S. B. Korada, E. Şaşoğlu, and R. Urbanke, “Polar codes: Characterization of exponent, bounds, and constructions,” IEEE Trans. Inform. Theory, vol. 56, no. 12, pp. 6253–6264, December 2010.
- [3] E. Arıkan and E. Telatar, “On the rate of channel polarization,” in Proc. IEEE Int’l Symp. Inform. Theory (ISIT’2009), Seoul, South Korea, 2009, pp. 1493–1495.
- [4] E. Şaşoğlu, “Polarization and polar codes,” in Found. and Trends in Commun. and Inform. Theory, vol. 8, no. 4, 2012, pp. 259–381.
- [5] S. B. Korada and R. Urbanke, “Polar codes are optimal for lossy source coding,” IEEE Trans. Inform. Theory, vol. 56, no. 4, pp. 1751–1768, April 2010.
- [6] K. L. Chung, A Course in Probability Theory, 3rd ed. San Diego: Academic Press, 2001.
