A Simple Proof of Fast Polarization

Ido Tal

arXiv:1704.07179·cs.IT·April 25, 2017

TL;DR

This paper provides a simplified proof of the fast polarization property of polar codes, which was previously established for binary kernels and generalized by Sasoglu, enhancing understanding of polar code behavior.

Contribution

The paper introduces a simplified proof of the fast polarization phenomenon in polar codes, making the concept more accessible and easier to understand.

Findings

01 Simplified proof of the fast polarization lemma, in the generalized form that goes beyond the binary $2\times 2$ kernel

02 Strengthens the lemma from an “in probability” statement (Lemma 1) to an “almost sure” statement (Lemma 2)

03 Potentially facilitates further research and applications in coding theory

Abstract

Fast polarization is an important and useful property of polar codes. It was proved for the binary polarizing $2 \times 2$ kernel by Arikan and Telatar. The proof was later generalized by Sasoglu. We give a simplified proof.

Equations

Z_{m+1}\leq K\cdot Z_{m}^{D_{t}}\;,\quad\mbox{whenever $B_{m}=t$}\;.

0<\beta<E\triangleq\frac{1}{\ell}\sum_{t=1}^{\ell}\log_{\ell}D_{t}

\lim_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]=\mathrm{Pr}[Z_{\infty}=0]\;.

\lim_{m_{0}\to\infty}\mathrm{Pr}[\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}]=\mathrm{Pr}[Z_{\infty}=0]\;.

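A:\quad |Z_{m}-Z_{\infty}|\leq\epsilon_{a}\quad\mbox{for all $m\geq m_{a}$}

B:\quad \left|\frac{|\{m_{a}\leq i<m:B_{i}=t\}|}{m-m_{a}}-\frac{1}{\ell}\right|\leq\epsilon_{b}\quad\mbox{for all $m\geq m_{b}$ and $1\leq t\leq\ell$}

C:\quad Z_{\infty}=0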

\lim_{m_{a}\to\infty}\mathrm{Pr}[A]=1\;.

\lim_{m_{b}\to\infty}\mathrm{Pr}[B]=1\;.

\mathrm{Pr}[A]\geq 1-\delta_{a}

\mathrm{Pr}[B]\geq 1-\delta_{b}\;.

\mathrm{Pr}[A\cap B\cap C]\geq\mathrm{Pr}[Z_{\infty}=0]-\delta_{a}-\delta_{b}\;.

\theta\triangleq-\log_{\epsilon_{a}}K\;.

Z_{m+1}\leq Z_{m}^{D_{t}-\theta}\;,\quad\mbox{whenever $m\geq m_{a}$ and $B_{m}=t$}\;.

Z_{m}\leq Z_{m_{a}}^{\prod_{t=1}^{\ell}(D_{t}-\theta)^{(m-m_{a})\cdot(1/\ell\pm\epsilon_{b})}}\;,

\pm\triangleq\begin{cases}+&\mbox{if $D_{t}-\theta\leq 1$,}\\ -&\mbox{otherwise.}\end{cases}

Z_{m}\leq 2^{-\prod_{t=1}^{\ell}(D_{t}-\theta)^{(m-m_{a})\cdot(1/\ell\pm\epsilon_{b})}}=2^{-\ell^{(E-\Delta)m}}\;,

\Delta=\sum_{t=1}^{\ell}\frac{1}{\ell}\log_{\ell}\left(\frac{D_{t}}{D_{t}-\theta}\right)-\sum_{t=1}^{\ell}\pm\epsilon_{b}\log_{\ell}(D_{t}-\theta)+\sum_{t=1}^{\ell}\frac{m_{a}}{m}\left(\frac{1}{\ell}\pm\epsilon_{b}\right)\log_{\ell}(D_{t}-\theta)\;.

\lim_{m_{0}\to\infty}\mathrm{Pr}[\overbrace{\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}}^{D}]\geq\mathrm{Pr}[Z_{\infty}=0]-\delta_{a}-\delta_{b}\;.

\lim_{m_{0}\to\infty}\mathrm{Pr}[\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}]\leq\mathrm{Pr}[Z_{\infty}=0]\;.

\mathrm{Pr}[\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}]>\mathrm{Pr}[Z_{\infty}=0]\;.

\mathrm{Pr}[\lim_{m\to\infty}Z_{m}=0]>\mathrm{Pr}[Z_{\infty}=0]\;.

\mathrm{Pr}[\lim_{m\to\infty}Z_{m}=Z_{\infty}]<1\;,

\liminf_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]\geq\mathrm{Pr}[Z_{\infty}=0]\;.

\limsup_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]\leq\mathrm{Pr}[Z_{\infty}=0]\;.

\limsup_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]>\mathrm{Pr}[Z_{\infty}=0]\;.

Taxonomy

Topics: Error Correcting Code Techniques · Advanced Wireless Communication Techniques · Coding theory and cryptography

Full text

A Simple Proof of Fast Polarization

Ido Tal

Department of Electrical Engineering

Technion – Haifa 32000, Israel

E-mail: [email protected]

Abstract

Fast polarization is an important and useful property of polar codes. It was proved for the binary polarizing $2\times 2$ kernel by Arıkan and Telatar. The proof was later generalized by Şaşoğlu. We give a simplified proof.

Index Terms:

polar codes, fast polarization

I Introduction

Polar codes are a novel family of error correcting codes, invented by Arıkan [1]. The seminal definitions and assumptions in [1] were soon expanded and generalized. Key to almost all the results involving polar codes is the concept of fast polarization. The essence of fast polarization is the phenomenon stated in the following lemma. The lemma was used implicitly by Korada, Şaşoğlu, and Urbanke [2, proof of Theorem 11], and is a generalization of a result by Arıkan and Telatar [3, Theorem 3]. Its explicit formulation and full proof are due to Şaşoğlu [4, Lemma 5.9].

Lemma 1

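Let $B_{0},B_{1},\ldots$ be an i.i.d. process where $B_{0}$ is uniformly distributed over $\{1,2,\ldots,\ell\}$. Let $Z_{0},Z_{1},\ldots$ be a $[0,1]$-valued random process where

$$Z_{m+1}\leq K\cdot Z_{m}^{D_{t}}\;,\quad\mbox{whenever $B_{m}=t$}\;. \tag{1}$$

We assume $K\geq 1$ and $D_{1},D_{2},\ldots,D_{\ell}>0$. Suppose also that $Z_{m}$ converges almost surely to a $\{0,1\}$-valued random variable $Z_{\infty}$. Then, for any

$$0<\beta<E\triangleq\frac{1}{\ell}\sum_{t=1}^{\ell}\log_{\ell}D_{t}\;,$$

we have

$$\lim_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]=\mathrm{Pr}[Z_{\infty}=0]\;. \tag{2}$$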
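As an informal illustration of the lemma (not part of the original proof), the following Python sketch simulates a toy process that meets (1) with equality, clipped so that it stays in $[0,1]$. The function names, the clipping, the 0-based indexing of $t$, and the choice to track $\log_{2}Z_{m}$ are all assumptions made for this example. The lemma only requires the inequality together with almost sure convergence, which the sketch does not verify.

```python
import math
import random

def polarization_exponent(D):
    """E = (1/ell) * sum_t log_ell(D_t), with ell = len(D) >= 2."""
    ell = len(D)
    return sum(math.log(d, ell) for d in D) / ell

def simulate_log2_Z(z0, K, D, m, rng):
    """Return log2(Z_m) for the toy process Z_{i+1} = min(1, K * Z_i^{D_t}).

    The log domain is used because Z_m may become doubly-exponentially
    small and would underflow as a plain float. Requires 0 < z0 <= 1.
    """
    ell = len(D)
    log_z = math.log2(z0)
    log_K = math.log2(K)
    for _ in range(m):
        t = rng.randrange(ell)                  # B_i, uniform over {0, ..., ell-1}
        log_z = min(0.0, log_K + D[t] * log_z)  # clip so that Z stays in [0, 1]
    return log_z

def fraction_below_threshold(z0, K, D, m, beta, trials, seed=0):
    """Empirical estimate of Pr[Z_m <= 2^{-ell^{beta*m}}] over sample paths."""
    rng = random.Random(seed)
    ell = len(D)
    threshold = -(ell ** (beta * m))            # log2 of 2^{-ell^{beta*m}}
    hits = sum(simulate_log2_Z(z0, K, D, m, rng) <= threshold
               for _ in range(trials))
    return hits / trials
```

For instance, `polarization_exponent([1, 2])` evaluates $E$ for exponents $D_{1}=1$, $D_{2}=2$ with $\ell=2$, and `fraction_below_threshold` can then be called with any `beta` below that value. For large `m` the threshold `ell ** (beta * m)` may overflow a float, so the sketch is meant for moderate horizons only.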

The lemma is used to prove that the Bhattacharyya parameter associated with a random variable that underwent polarization (for example, a synthesized channel) polarizes to $0$ at a rate faster than polynomial [4, Theorem 5.4]. A similar claim holds in the case of polarization of the Bhattacharyya parameter to $1$ [5, Theorem 16].

The original proof [4, Lemma 5.9] of Lemma 1 is somewhat involved. To summarize, if $K$ were equal to $1$, the proof would follow almost directly from the strong law of large numbers. However, for $K>1$, a sequence of bootstrapping arguments is applied. That is, a current bound on the rate of convergence of $Z_{m}$ to $0$ is used to derive a stronger bound, and the process is repeated.
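To make the remark about $K=1$ concrete, here is a rough sketch of why the strong law of large numbers suffices in that case. It is a heuristic paraphrase rather than the argument of [4], and it assumes a starting value $0<Z_{0}<1$ for simplicity.

```latex
% Heuristic sketch for K = 1, assuming 0 < Z_0 < 1.
\begin{align*}
Z_{m} &\leq Z_{0}^{\prod_{i=0}^{m-1} D_{B_{i}}}
  && \text{(iterate $Z_{i+1}\leq Z_{i}^{D_{B_{i}}}$, using $D_{B_{i}}>0$)} \\
\log_{\ell}\prod_{i=0}^{m-1} D_{B_{i}}
  &= m\cdot\frac{1}{m}\sum_{i=0}^{m-1}\log_{\ell} D_{B_{i}} \\
\frac{1}{m}\sum_{i=0}^{m-1}\log_{\ell} D_{B_{i}}
  &\xrightarrow[m\to\infty]{\text{a.s.}}
  \frac{1}{\ell}\sum_{t=1}^{\ell}\log_{\ell} D_{t} = E
  && \text{(strong law of large numbers)}
\end{align*}
```

Under these assumptions the exponent of $Z_{0}$ grows like $\ell^{m(E+o(1))}$, so for any $\beta<E$ the bound $Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ eventually holds. The constant factor $\log_{2}(1/Z_{0})$ is absorbed by the gap between $\beta$ and $E$.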

The main aim of this paper is to give a simpler proof of Lemma 1. Thus, we hopefully give insight into the simple mechanics that are at play. Our simpler proof also leads to a stronger result. That is, we will prove the following, which implies Lemma 1.

Lemma 2

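Let $\{B_{m}\}_{m=0}^{\infty}$, $\{Z_{m}\}_{m=0}^{\infty}$, $K$, and $E$ be as in Lemma 1. Then, for $0<\beta<E$,

$$\lim_{m_{0}\to\infty}\mathrm{Pr}[\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}]=\mathrm{Pr}[Z_{\infty}=0]\;. \tag{3}$$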

Note that Lemma 2 has an “almost sure flavor” [6, page 69, Equation (2)], while Lemma 1 has an “in probability flavor” [6, page 70, Equation (5)]. We prove Lemma 2 in Section II and show that it implies Lemma 1 in Section III.

II Proof of Lemma 2

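Let $\epsilon_{a},\epsilon_{b}>0$ and $m_{a}<m_{b}$ be given constants, specified towards the end. We now define three events, denoted $A$, $B$, and $C$.

$$\begin{array}{ll}A: & |Z_{m}-Z_{\infty}|\leq\epsilon_{a}\ \mbox{for all $m\geq m_{a}$}\\[4pt] B: & \left|\dfrac{|\{m_{a}\leq i<m:B_{i}=t\}|}{m-m_{a}}-\dfrac{1}{\ell}\right|\leq\epsilon_{b}\ \mbox{for all $m\geq m_{b}$ and $1\leq t\leq\ell$}\\[4pt] C: & Z_{\infty}=0\end{array}$$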

Recall that the $Z_{m}$ converge almost surely to $Z_{\infty}$. Thus, essentially by definition (see [6, Theorem 4.1.1.]), we have for any fixed $\epsilon_{a}>0$ that

$$\lim_{m_{a}\to\infty}\mathrm{Pr}[A]=1\;.$$

Note that event $B$ is concerned with the frequency of $t$ in the subsequence of i.i.d. random variables $B_{m_{a}},B_{m_{a}+1},\ldots,B_{m-1}$, which are uniform over $\{1,2,\ldots,\ell\}$. Thus, by the strong law of large numbers [6, Theorem 5.4.2.], we have for any fixed $\epsilon_{b}$ and $m_{a}$ that

$$\lim_{m_{b}\to\infty}\mathrm{Pr}[B]=1\;.$$

(The strong law of large numbers is applied $\ell$ times. Each application is with respect to the indicators $B_{i}=t$, where $1\leq t\leq\ell$. As before, we use [6, Theorem 4.1.1.].)
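The role of event $B$ can be explored numerically. The sketch below draws one sample path of the $B_{i}$ and reports the worst deviation of the empirical frequencies from $1/\ell$ over a finite horizon. It assumes that event $B$ bounds $\bigl||\{m_{a}\leq i<m:B_{i}=t\}|/(m-m_{a})-1/\ell\bigr|$ by $\epsilon_{b}$ for all $m\geq m_{b}$ and all $t$, which is the reading suggested by the surrounding text. The function name and the finite horizon `m_max` are hypothetical choices for illustration.

```python
import random

def worst_frequency_deviation(ell, m_a, m_b, m_max, seed=0):
    """Largest |freq_t(m) - 1/ell| over all t and all m in [m_b, m_max].

    freq_t(m) is the fraction of indices i in [m_a, m) with B_i = t.
    Up to the horizon m_max, the sample path lies in event B exactly when
    the returned value is at most eps_b. Requires m_a < m_b <= m_max.
    """
    rng = random.Random(seed)
    counts = [0] * ell
    worst = 0.0
    for m in range(m_a + 1, m_max + 1):
        counts[rng.randrange(ell)] += 1    # draw B_{m-1}, uniform over ell values
        if m >= m_b:
            n = m - m_a                    # number of draws counted so far
            worst = max(worst, max(abs(c / n - 1 / ell) for c in counts))
    return worst
```

Increasing `m_b` with `m_a` fixed tends to shrink the returned deviation, which mirrors the statement that the probability of $B$ tends to $1$ as $m_{b}\to\infty$. A single path is of course only suggestive.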

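We deduce that for any $\delta_{a},\delta_{b}>0$ there exist $m_{a}<m_{b}$ such that

$$\mathrm{Pr}[A]\geq 1-\delta_{a} \tag{7}$$

and

$$\mathrm{Pr}[B]\geq 1-\delta_{b}\;. \tag{8}$$

Hence,

$$\mathrm{Pr}[A\cap B\cap C]\geq\mathrm{Pr}[Z_{\infty}=0]-\delta_{a}-\delta_{b}\;. \tag{9}$$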

Let us see what the event $A\cap B\cap C$ implies. For fixed $0<\epsilon_{a},\epsilon_{b},\delta_{a},\delta_{b}<1$, let $m_{a}$ and $m_{b}$ be as above. Define the shorthand

$$\theta\triangleq-\log_{\epsilon_{a}}K\;.$$

Note that $\theta$ is non-negative, and approaches $0$ as $\epsilon_{a}$ approaches $0$. By the definition of the events $A$ and $C$, we have that $Z_{m}\leq\epsilon_{a}$ when $m\geq m_{a}$. Thus, since $\epsilon_{a}^{-\theta}=K$, we have $K\leq Z_{m}^{-\theta}$ when $m\geq m_{a}$. Hence, we simplify (1) to

$$Z_{m+1}\leq Z_{m}^{D_{t}-\theta}\;,\quad\mbox{whenever $m\geq m_{a}$ and $B_{m}=t$}\;. \tag{10}$$

The above equation is the heart of the proof: we have effectively managed to “make $K$ equal $1$”, which is the simple case discussed earlier. We have “paid” for this simplification by having the exponents be $D_{t}-\theta$ instead of the original $D_{t}$. However, since $\theta$ can be made arbitrarily close to $0$, this will not be a problem. Essentially, all that remains is some simple algebra, followed by taking the relevant parameters small/large enough. We do this now.
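The absorption of $K$ into the exponent can be checked numerically. The following sketch computes $\theta=-\log_{\epsilon_{a}}K$ and tests whether $K\cdot z^{D_{t}}\leq z^{D_{t}-\theta}$ for a given $0<z\leq\epsilon_{a}$. The helper names and the small multiplicative slack used to tolerate floating-point rounding are assumptions of this example only.

```python
import math

def theta(eps_a, K):
    """theta = -log_{eps_a}(K); non-negative when 0 < eps_a < 1 and K >= 1."""
    return -math.log(K) / math.log(eps_a)

def absorbs_constant(z, eps_a, K, D_t):
    """Check K * z^{D_t} <= z^{D_t - theta}, i.e. K <= z^{-theta}.

    This is expected to hold for 0 < z <= eps_a, because eps_a^{-theta} = K
    and x -> x^{-theta} is non-increasing on (0, 1].
    """
    th = theta(eps_a, K)
    lhs = K * z ** D_t
    rhs = z ** (D_t - th)
    return lhs <= rhs * (1 + 1e-12)        # slack for rounding at z == eps_a
```

Equality holds at `z == eps_a`, which is why the slack is needed there. For `z` larger than `eps_a` the inequality can fail, which matches the fact that the simplified recursion is only claimed for $m\geq m_{a}$.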

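Let us assume $\epsilon_{a}$ is small enough such that $D_{t}-\theta>0$ for all $1\leq t\leq\ell$ and that $\epsilon_{b}<1/\ell$. Recall also that $Z_{m_{a}}\in[0,1]$. Combining (10) with event $B$, we deduce that for all $m\geq m_{b}$,

$$Z_{m}\leq Z_{m_{a}}^{\prod_{t=1}^{\ell}(D_{t}-\theta)^{(m-m_{a})\cdot(1/\ell\pm\epsilon_{b})}}\;, \tag{11}$$

where the above “$\pm$” notation is in fact a function of $t$, defined as

$$\pm\triangleq\begin{cases}+&\mbox{if $D_{t}-\theta\leq 1$,}\\ -&\mbox{otherwise.}\end{cases}$$

By the definition of event $A$, we have that $Z_{m_{a}}\leq\epsilon_{a}$. We will further assume that $\epsilon_{a}\leq 1/2$. Hence, (11) simplifies to the claim that for all $m\geq m_{b}$,

$$Z_{m}\leq 2^{-\prod_{t=1}^{\ell}(D_{t}-\theta)^{(m-m_{a})\cdot(1/\ell\pm\epsilon_{b})}}=2^{-\ell^{(E-\Delta)m}}\;, \tag{12}$$

where

$$\Delta=\sum_{t=1}^{\ell}\frac{1}{\ell}\log_{\ell}\left(\frac{D_{t}}{D_{t}-\theta}\right)-\sum_{t=1}^{\ell}\pm\epsilon_{b}\log_{\ell}(D_{t}-\theta)+\sum_{t=1}^{\ell}\frac{m_{a}}{m}\left(\frac{1}{\ell}\pm\epsilon_{b}\right)\log_{\ell}(D_{t}-\theta)\;. \tag{13}$$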
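The identity between the product in the exponent of (12) and $\ell^{(E-\Delta)m}$ follows by taking $\log_{\ell}$ and regrouping terms. As a sanity check, the sketch below evaluates both sides numerically. It assumes the three-sum expression for $\Delta$ described in the text, with the $t$-dependent sign taken as $+$ when $D_{t}-\theta\leq 1$ and $-$ otherwise. All function names are hypothetical.

```python
import math

def pm(d_minus_theta):
    """The t-dependent sign: +1 if D_t - theta <= 1, and -1 otherwise."""
    return 1.0 if d_minus_theta <= 1 else -1.0

def exponent_E(D, ell):
    """E = (1/ell) * sum_t log_ell(D_t)."""
    return sum(math.log(d, ell) for d in D) / ell

def log_exponent(D, ell, th, eps_b, m_a, m):
    """log_ell of the product appearing in the exponent of (12)."""
    return sum((m - m_a) * (1 / ell + pm(d - th) * eps_b) * math.log(d - th, ell)
               for d in D)

def delta(D, ell, th, eps_b, m_a, m):
    """Delta, written as the three sums of (13). Requires D_t - th > 0."""
    first = sum(math.log(d / (d - th), ell) / ell for d in D)
    second = sum(pm(d - th) * eps_b * math.log(d - th, ell) for d in D)
    third = sum((m_a / m) * (1 / ell + pm(d - th) * eps_b) * math.log(d - th, ell)
                for d in D)
    return first - second + third
```

With these definitions, `log_exponent(...)` and `(exponent_E(D, ell) - delta(...)) * m` should agree up to rounding for any admissible parameters. Shrinking `th`, `eps_b`, and the ratio `m_a / m` drives `delta` towards zero, which is the mechanism used in the remainder of the proof.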

In light of (3), our goal is to show that for a given $\beta<E$ and $\delta_{a},\delta_{b}>0$, we can choose $m_{a}<m_{b}$, and $\epsilon_{a},\epsilon_{b}>0$ as above such that $\Delta<E-\beta$. We do this by showing that each of the three sums in (13) can be made smaller than $(E-\beta)/3$. Recalling that $\theta$ goes to $0$ as $\epsilon_{a}$ tends to $0$, we deduce that the first sum can be made smaller than $(E-\beta)/3$ by taking $\epsilon_{a}$ small enough. Similarly, we can make the second sum smaller than $(E-\beta)/3$ by taking $\epsilon_{b}$ small enough. For the third sum, we first fix $m_{a}$ large enough such that (7) holds (note that event $A$ is a function of $\epsilon_{a}$, which is by now fixed). Lastly, we take $m_{b}$ large enough such that the third sum is smaller than $(E-\beta)/3$ for all $m\geq m_{b}$, and (8) holds (again, note that event $B$ is a function of $m_{a}$ and $\epsilon_{b}$, which have been fixed).

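Recall that our aim is to prove (3). We deduce from (9), (12), and the above paragraph that for all $\delta_{a},\delta_{b}>0$ and $0<\beta<E$,

$$\lim_{m_{0}\to\infty}\mathrm{Pr}[\overbrace{\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}}^{D}]\geq\mathrm{Pr}[Z_{\infty}=0]-\delta_{a}-\delta_{b}\;.$$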

Indeed, we have just proved that for the parameters fixed as above, $A\cap B\cap C$ implies $D$ for $m_{0}=m_{b}$. Since the probability of $D$ is non-decreasing in $m_{0}$ and bounded above by $1$, the limit exists and the assertion follows.

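Since the above inequality holds for all $\delta_{a},\delta_{b}>0$, it must also hold for $\delta_{a}=\delta_{b}=0$. Thus, all that remains is to prove that

$$\lim_{m_{0}\to\infty}\mathrm{Pr}[\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}]\leq\mathrm{Pr}[Z_{\infty}=0]\;.$$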

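Assume to the contrary that there exist $0<\beta<E$ and $m_{0}$ such that

$$\mathrm{Pr}[\mbox{$Z_{m}\leq 2^{-\ell^{\beta\cdot m}}$ for all $m\geq m_{0}$}]>\mathrm{Pr}[Z_{\infty}=0]\;.$$

Clearly, this implies that

$$\mathrm{Pr}[\lim_{m\to\infty}Z_{m}=0]>\mathrm{Pr}[Z_{\infty}=0]\;.$$

Hence,

$$\mathrm{Pr}[\lim_{m\to\infty}Z_{m}=Z_{\infty}]<1\;,$$

contradicting the fact that the sequence $Z_{m}$ converges almost surely to $Z_{\infty}$.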

III Proof of Lemma 1

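We now explain why Lemma 2 implies Lemma 1. That is, why (3) implies (2). Clearly, (3) implies

$$\liminf_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]\geq\mathrm{Pr}[Z_{\infty}=0]\;.$$

Thus, the claim will follow if we prove that

$$\limsup_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]\leq\mathrm{Pr}[Z_{\infty}=0]\;.$$

Assume to the contrary that there exists $0<\beta<E$ such that

$$\limsup_{m\to\infty}\mathrm{Pr}[Z_{m}\leq 2^{-\ell^{\beta\cdot m}}]>\mathrm{Pr}[Z_{\infty}=0]\;.$$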

The above implies that the $Z_{m}$ cannot converge in probability to $Z_{\infty}$ [6, page 70, Equation (5)]. This contradicts the fact that almost sure convergence implies convergence in probability [6, Theorem 4.1.2.].

Bibliography

[1] E. Arıkan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inform. Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
[2] S. B. Korada, E. Şaşoğlu, and R. Urbanke, “Polar codes: Characterization of exponent, bounds, and constructions,” IEEE Trans. Inform. Theory, vol. 56, no. 12, pp. 6253–6264, December 2010.
[3] E. Arıkan and E. Telatar, “On the rate of channel polarization,” in Proc. IEEE Int’l Symp. Inform. Theory (ISIT’2009), Seoul, South Korea, 2009, pp. 1493–1495.
[4] E. Şaşoğlu, “Polarization and polar codes,” in Found. and Trends in Commun. and Inform. Theory, vol. 8, no. 4, 2012, pp. 259–381.
[5] S. B. Korada and R. Urbanke, “Polar codes are optimal for lossy source coding,” IEEE Trans. Inform. Theory, vol. 56, no. 4, pp. 1751–1768, April 2010.
[6] K. L. Chung, A Course in Probability Theory, 3rd ed. San Diego: Academic Press, 2001.