Comparison of Polar Decoders with Existing Low-Density Parity-Check and Turbo Decoders

Alexios Balatsoukas-Stimming; Pascal Giard; and Andreas Burg

arXiv:1702.04707·cs.IT·May 10, 2017

TL;DR

This paper compares polar decoders with LDPC and Turbo decoders in terms of error-correction performance and hardware efficiency to identify their practical advantages and guide future research.

Contribution

It provides a comprehensive comparison of polar, LDPC, and Turbo decoders, highlighting their relative strengths and hardware implementation considerations.

Findings

01

Polar decoders can outperform LDPC and Turbo decoders in certain applications.

02

Hardware efficiency varies significantly among the different decoder types.

03

The study identifies specific scenarios where polar codes are most advantageous.

Abstract

Polar codes are a recently proposed family of provably capacity-achieving error-correction codes that have received a lot of attention. While their theoretical properties render them interesting, their practicality compared to other types of codes has not been thoroughly studied. Towards this end, in this paper, we perform a comparison of polar decoders against LDPC and Turbo decoders that are used in existing communications standards. More specifically, we compare both the error-correction performance and the hardware efficiency of the corresponding hardware implementations. This comparison enables us to identify applications where polar codes are superior to existing error-correction coding solutions as well as to determine the most promising research direction in terms of the hardware implementation of polar decoders.

Full text

Telecommunications Circuits Laboratory, EPFL, Switzerland

Email: {alexios.balatsoukas,pascal.giard,andreas.burg}@epfl.ch

I Introduction

Polar codes [1] are a new class of provably capacity-achieving channel codes which have attracted significant attention due to their interesting theoretical properties and their low-complexity encoding and decoding algorithms. The most popular decoding algorithms for polar codes are simple successive-cancellation (SC) decoding [1], successive-cancellation list (SCL) decoding [2], and belief-propagation (BP) decoding [3].
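To make the SC decoding principle concrete, the following is a minimal recursive sketch, not any of the hardware architectures compared in this paper. It assumes the natural (non-bit-reversed) bit ordering, LLRs defined as $\log\frac{P(0)}{P(1)}$, and a min-sum check update; the function names `polar_encode`, `f_minsum`, `g_update`, and `sc_decode` are illustrative choices.

```python
import numpy as np

def polar_encode(u):
    """Polar transform of the bit vector u (length a power of two, natural order)."""
    u = np.asarray(u, dtype=np.int64)
    if len(u) == 1:
        return u.copy()
    half = len(u) // 2
    left = polar_encode(u[:half])
    right = polar_encode(u[half:])
    # Upper half carries the XOR of both sub-codewords, lower half the right one.
    return np.concatenate([left ^ right, right])

def f_minsum(a, b):
    """Min-sum approximation of the check (upper-branch) LLR update."""
    return np.sign(a) * np.sign(b) * np.minimum(np.abs(a), np.abs(b))

def g_update(a, b, partial_sum):
    """Lower-branch LLR update given the re-encoded bits of the upper branch."""
    return b + (1 - 2 * partial_sum) * a

def sc_decode(llr, frozen):
    """Return (u_hat, x_hat): decided input bits and their re-encoded codeword.

    llr    : channel LLRs, positive values favour bit 0
    frozen : boolean array, True where the input bit is frozen to 0
    """
    llr = np.asarray(llr, dtype=float)
    if len(llr) == 1:
        bit = 0 if (frozen[0] or llr[0] >= 0) else 1
        out = np.array([bit], dtype=np.int64)
        return out, out
    half = len(llr) // 2
    a, b = llr[:half], llr[half:]
    u_left, x_left = sc_decode(f_minsum(a, b), frozen[:half])
    u_right, x_right = sc_decode(g_update(a, b, x_left), frozen[half:])
    return (np.concatenate([u_left, u_right]),
            np.concatenate([x_left ^ x_right, x_right]))
```

With BPSK mapping bit 0 to $+1$, noiseless channel LLRs are proportional to `1 - 2 * x`, and the decoder then returns the original input vector for any frozen set that is consistent with it.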

Polar codes are currently under consideration for potential adoption in future 5G standards. A crucial factor in this discussion is the demonstration of polar decoder ASIC implementations that can outperform existing decoders for error-correction codes. A comparison against low-density parity-check (LDPC) decoders and Turbo decoders, both in terms of the error-correction performance and in terms of the hardware efficiency, is of particular interest. We note that other decoder properties, such as the flexibility, the decoding latency, and the energy efficiency, are also of great importance [4], but they are beyond the scope of this paper.

Over the past few years, significant advances have been achieved in the hardware implementation of decoders for polar codes. A brief overview of the most important ones can be found in [5]. Sporadic comparisons of the error-correction performance of polar decoders with that of decoders for other error-correction codes can already be found in the literature. For example, [3] compared an FPGA-based BP polar decoder with an FPGA-based decoder for the Turbo code of the IEEE 802.16e standard. Moreover, [6] compared an FPGA-based SC decoder with an FPGA-based decoder for the LDPC code of the IEEE 802.3an standard. The authors of [7] compared the error-correction performance of SCL decoding with the error-correction performance of the LDPC code used in the IEEE 802.16e standard. Finally, [8] compared SCL decoding with the LDPC codes used in the IEEE 802.11n and IEEE 802.3an standards. However, these comparisons were not systematic and no comparison of the corresponding hardware implementations was made.

Contribution: In this paper, we compare the error-correction performance and hardware efficiency of polar decoders (for the three main decoding algorithms) with those of decoders for the LDPC codes used in the IEEE 802.11ad (WiGig) [9], IEEE 802.11n (Wi-Fi) [10], and IEEE 802.3an ($10$ Gb/s Ethernet) [11] standards, as well as those of decoders for the Turbo code used in the 3GPP LTE [12] standard.

II Comparison Methodology

Most hardware implementations of polar decoders in the literature focus on one of the SC, BP, or SCL decoding algorithms. Thus, our comparison is for polar decoders based on these algorithms. The comparison against LDPC and Turbo decoders has two aspects, as we are interested in both error-correction capabilities and hardware efficiency.

For the error-correction performance comparison of the various polar decoders with that of LDPC and Turbo decoders, floating-point versions of all decoding algorithms are used. This is reasonable because the quantization parameters of the hardware decoders are usually chosen so that the performance loss with respect to the floating-point implementation is negligible. Moreover, for all simulations the encoded codewords obtained from random data are modulated using binary phase-shift keying (BPSK) and are transmitted over an additive white Gaussian noise (AWGN) channel. For almost all decoders for polar and LDPC codes, (scaled or offset) min-sum approximations are used for check node updates. The scaling and/or offset factors are given whenever applicable. Our Turbo decoder uses the max-log approximation. All polar codes are designed using the Monte Carlo based method proposed by Arıkan [1]. In order to speed up our simulations of BP decoding for polar codes, we used the $\mathbf{G}$-matrix-based early termination method of [13], which has negligible impact on the error-correction performance. For the CRC-aided SCL decoders, we use the CRC polynomial $g_{8}(x)=x^{8}+x^{5}+x^{4}+x^{3}+1$.

The hardware comparison is performed by selecting parameters for the polar decoders (e.g., blocklength, list size, number of iterations) that lead to an error-correction performance that is close to that of the competing LDPC or Turbo codes. Unfortunately, power results for polar decoders are scarce in the literature, making a useful power comparison with existing LDPC and Turbo decoders difficult. Thus, only area and decoding time complexity (the inverse of the decoding throughput) are considered for the comparison. We plot these metrics against each other on a double-logarithmic plot where the area and time complexity are on the vertical and horizontal axes, respectively. We note that hardware efficiency is defined as the area per unit of decoding throughput, measured in mm$^{2}$/(bits/s), which is the product of area and time complexity. Thus, on the aforementioned double-logarithmic plots, lines with a slope of $-1$ correspond to iso-hardware-efficiency lines.
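The hardware efficiency metric and the slope of the iso-efficiency lines can be checked with a short sketch. The function names are illustrative and any numbers passed to them are hypothetical; they are not taken from the decoders compared in this paper.

```python
import math

def time_complexity(throughput_bps):
    """Decoding time complexity in s/bit, the inverse of the throughput."""
    return 1.0 / throughput_bps

def hardware_efficiency(area_mm2, throughput_bps):
    """Area per unit throughput in mm^2/(bits/s); smaller values are better."""
    return area_mm2 * time_complexity(throughput_bps)

def area_on_iso_line(efficiency, time_per_bit):
    """Area of a point with the given efficiency at a given time complexity."""
    return efficiency / time_per_bit

def loglog_slope(t1, a1, t2, a2):
    """Slope between two (time, area) points on a double-logarithmic plot."""
    return (math.log10(a2) - math.log10(a1)) / (math.log10(t2) - math.log10(t1))
```

Because a fixed efficiency means that the product of area and time complexity is constant, any two points produced by `area_on_iso_line` for the same efficiency yield a `loglog_slope` of $-1$.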

In order to scale the area of all decoders appropriately, the following assumptions are made. First, all synthesis results are scaled to a $90$ nm CMOS technology using standard Dennard scaling laws [14], so that the area scales as $s^{2}$ and the operating frequency scales as $1/s$ , where $s$ is the technology feature size. Moreover, the area of the SC and SCL decoders scales linearly with the blocklength, and the area of the BP decoders scales as $N\log N$ . The area of the SCL decoders scales linearly with the list size. The decoding latency of the BP decoders scales linearly with the maximum number of iterations. As it is very difficult to predict the frequency scaling with respect to the blocklength and list size parameters, we only use technology scaling for the operating frequency as already explained.
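The area and frequency scaling rules above can be sketched as a small normalization routine. This is an illustration under the stated assumptions only: the `DecoderResult` record and `scale_to_reference` function are hypothetical names, the default targets follow the $90$ nm reference with $N=1024$ and $L=4$, and the iteration-dependent latency scaling of BP decoders is left out.

```python
import math
from dataclasses import dataclass

@dataclass
class DecoderResult:
    algorithm: str        # "SC", "SCL", or "BP"
    tech_nm: float        # technology feature size of the reported result
    area_mm2: float       # reported area
    freq_mhz: float       # reported operating frequency
    blocklength: int      # blocklength N of the reported decoder
    list_size: int = 1    # list size L (only meaningful for SCL)

def scale_to_reference(d, target_nm=90.0, target_n=1024, target_l=4):
    """Return (area_mm2, freq_mhz) scaled to the reference technology and code."""
    s = target_nm / d.tech_nm
    area = d.area_mm2 * s ** 2      # area scales as s^2
    freq = d.freq_mhz / s           # frequency scales as 1/s
    if d.algorithm in ("SC", "SCL"):
        # SC and SCL area scales linearly with the blocklength.
        area *= target_n / d.blocklength
    elif d.algorithm == "BP":
        # BP area scales as N log N.
        area *= (target_n * math.log2(target_n)) / (
            d.blocklength * math.log2(d.blocklength))
    else:
        raise ValueError(f"unknown algorithm: {d.algorithm}")
    if d.algorithm == "SCL":
        # SCL area scales linearly with the list size.
        area *= target_l / d.list_size
    return area, freq
```

For instance, a hypothetical $45$ nm SCL decoder with $N=2048$ and $L=8$ has its area multiplied by $4$ for the technology, by $\frac{1}{2}$ for the blocklength, and by $\frac{1}{2}$ for the list size, while its frequency is halved.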

III Comparison of Polar Codes with LDPC and Turbo Codes

Figure 1 provides a summary of the area and time complexity of ASIC implementations of SC, BP, and SCL polar decoders used as reference for the comparison with the LDPC and Turbo codes. All decoders are scaled to $N=1024$, and the SCL decoders are scaled to $L=4$.

We observe that BP decoders generally provide very high throughputs, although they are matched by some of the most recent fast-SSC-based SC decoders. We note that the fast-SSC decoder of [27] is specialized for a small set of polar codes and that BP decoding provides soft output values, which are required for iterative receivers. Moreover, the BP decoders also generally have the highest area requirements of all decoders. SCL decoders generally have the lowest throughput of all decoders, as well as higher area requirements than SC decoders and similar area requirements to BP decoders. However, SCL decoders provide significantly improved error-correction performance with respect to both SC and BP decoding.

III-A Polar Codes vs. IEEE 802.11ad LDPC Codes

The IEEE 802.11ad standard [9] uses QC-LDPC codes with a blocklength of $N=672$ and code rates $R\in\left\{\frac{1}{2},\frac{5}{8},\frac{3}{4},\frac{13}{16}\right\}$ . We simulated the performance of this LDPC code using a layered offset min-sum decoding algorithm with a maximum of $I=5$ iterations and an offset of $\beta=0.2$ , which are numbers commonly found in the literature. A comparison for the lowest and highest rates $\left(R\in\left\{\frac{1}{2},\frac{13}{16}\right\}\right)$ found in the IEEE 802.11ad standard is provided.
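As an illustration of the offset min-sum approximation, the following sketch computes the extrinsic check-to-variable messages of a single check node. It shows only the check node update, not the layered schedule or the simulation code used for the results; the function name and the example values are assumptions, and the check node is assumed to have at least two incoming messages.

```python
import numpy as np

def offset_min_sum_check_update(incoming, beta):
    """Extrinsic check-to-variable LLRs for one check node.

    incoming : variable-to-check LLRs on the edges of this check node
    beta     : offset subtracted from the magnitude (clipped at zero)
    """
    incoming = np.asarray(incoming, dtype=float)
    signs = np.where(incoming < 0, -1.0, 1.0)
    mags = np.abs(incoming)
    total_sign = np.prod(signs)
    order = np.argsort(mags)
    min1, min2 = mags[order[0]], mags[order[1]]
    # Each edge sees the smallest magnitude among the other edges.
    ext_mag = np.where(np.arange(len(mags)) == order[0], min2, min1)
    ext_mag = np.maximum(ext_mag - beta, 0.0)
    # total_sign * signs[i] equals the product of the signs of all other edges.
    return total_sign * signs * ext_mag
```

Tracking only the two smallest magnitudes is a common way to obtain all extrinsic minima in one pass, since every edge except the one holding the minimum sees the same value.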

Figure 2 shows that SC and BP decoding of an $N=1024$ polar code performs very similarly to the LDPC codes of the IEEE 802.11ad standard. Moreover, SCL decoding of an $N=512$ polar code with $L=2$ and an 8-bit CRC is sufficient to match the error-correction performance of the LDPC code. We note that both the $N=512$ codes used for SCL decoding and the $N=1024$ codes used for SC and BP decoding were designed for an SNR of $1$ dB and $4$ dB for $R=\frac{1}{2}$ and $R=\frac{13}{16}$, respectively.
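The 8-bit CRC that aids SCL decoding uses the polynomial $g_{8}(x)=x^{8}+x^{5}+x^{4}+x^{3}+1$ given in Section II. A minimal bitwise sketch of attaching and checking such a CRC follows; it assumes a plain systematic CRC with zero initial state and no bit reflection, which the paper does not specify, and the function names are illustrative.

```python
# Coefficients of g8(x) = x^8 + x^5 + x^4 + x^3 + 1, from x^8 down to x^0.
G8 = [1, 0, 0, 1, 1, 1, 0, 0, 1]

def poly_mod(bits, poly=G8):
    """Remainder of the binary polynomial `bits` divided by `poly` (MSB first)."""
    r = len(poly) - 1
    work = list(bits)
    for i in range(len(work) - r):
        if work[i]:
            for j, c in enumerate(poly):
                work[i + j] ^= c
    return work[-r:]

def crc_attach(info_bits, poly=G8):
    """Append the CRC bits so that the result is divisible by `poly`."""
    r = len(poly) - 1
    return list(info_bits) + poly_mod(list(info_bits) + [0] * r, poly)

def crc_check(bits, poly=G8):
    """True if the bit sequence (information plus CRC) passes the CRC."""
    return not any(poly_mod(bits, poly))
```

In CRC-aided SCL decoding, a check such as `crc_check` is applied to the candidate paths in the final list, and a path that passes is selected as the decoder output.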

From Figure 3, it can be seen that all BP as well as the best SC polar decoders compete well in terms of area, throughput, and hardware efficiency against LDPC decoders. While the hardware efficiency of SCL decoders is similar to that of IEEE 802.11ad LDPC decoders due to their lower area requirements, most SCL decoders have lower throughput.

III-B Polar Codes vs. IEEE 802.11n LDPC Codes

The IEEE 802.11n standard [10] uses QC-LDPC codes with blocklengths of $N\in\{648,1296,1944\}$ and code rates $R\in\left\{\frac{1}{2},\frac{2}{3},\frac{3}{4},\frac{5}{6}\right\}$ . We simulated the performance of this LDPC code using a layered offset min-sum decoding algorithm with a maximum of $I=12$ iterations and an offset of $\beta=0.5$ , which are numbers commonly found in the literature. We provide a comparison for $N=1944$ and for the lowest rate $\left(R=\frac{1}{2}\right)$ and the highest rate $\left(R=\frac{5}{6}\right)$ found in the IEEE 802.11n standard.

In Figure 4, we observe that a polar code with $N=8192$ under SC decoding has a small loss of $0.5$ dB with respect to the IEEE 802.11n LDPC code with $N=1944$ at a FER of $10^{-5}$ for $R=\frac{1}{2}$, while the error-correction performance for $R=\frac{5}{6}$ is very similar. Moreover, a polar code with $N=1024$ under SCL decoding with $L=8$ and an $8$-bit CRC has practically identical performance to the aforementioned polar code with $N=8192$ under SC decoding for both $R=\frac{1}{2}$ and $R=\frac{5}{6}$. Unfortunately, the polar code with $N=8192$ under BP decoding cannot reach the performance of the IEEE 802.11n LDPC code, even when a maximum of $I=40$ iterations is performed. We note that the polar codes with $N=8192$ used for SC and BP decoding were designed for an SNR of $-1$ dB and $3$ dB for $R=\frac{1}{2}$ and $R=\frac{5}{6}$, respectively, while the polar codes with $N=1024$ used for SCL decoding with $L=8$ were designed for an SNR of [math] dB and $4$ dB for $R=\frac{1}{2}$ and $R=\frac{5}{6}$, respectively.

In Figure 5, we observe that, on average, the SCL decoders have the highest hardware efficiency out of the polar decoders. Both the SC and the BP decoders have significantly higher area requirements when trying to match the FER performance of the IEEE 802.11n LDPC codes. Finally, we observe that, on average, the IEEE 802.11n LDPC decoders have a slightly higher hardware efficiency than the polar decoders.

III-C Polar Codes vs. IEEE 802.3an LDPC Codes

The IEEE 802.3an standard [11] uses a $(6,32)$-regular LDPC code with blocklength $N=2048$ and code design rate $R=\frac{13}{16}$. In our simulations, the LDPC code is decoded using a flooding sum-product decoder with $I=8$ maximum decoding iterations, which is a number that is commonly found in the literature (we note that $4$-$5$ layered iterations provide similar error-correction performance to $8$-$10$ flooding iterations).
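The design rate quoted above follows directly from the node degrees of a regular LDPC code, as the short sketch below illustrates. The function names are illustrative, and the design rate assumes that all parity checks are linearly independent; the actual code rate can be higher when they are not.

```python
from fractions import Fraction

def regular_ldpc_design_rate(dv, dc):
    """Design rate 1 - dv/dc of a (dv, dc)-regular LDPC code."""
    return 1 - Fraction(dv, dc)

def num_check_nodes(n, dv, dc):
    """Number of check nodes implied by counting edges from both sides."""
    edges = n * dv
    if edges % dc != 0:
        raise ValueError("n * dv must be divisible by dc")
    return edges // dc
```

For the $(6,32)$-regular code with $N=2048$, this gives a design rate of $1-\frac{6}{32}=\frac{13}{16}$ and $2048\cdot 6/32=384$ check nodes.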

SCL decoding with $N=1024$, $L=4$, and an $8$-bit CRC already performs better than the IEEE 802.3an LDPC code down to a FER of $10^{-6}$. In Figure 6, we observe that a polar code with $N=4096$ under SC decoding has better error-correction performance than the IEEE 802.3an LDPC code down to a FER of $10^{-6}$. BP decoding with $I=40$ for the same polar code, however, has a small loss of $0.5$ dB with respect to the IEEE 802.3an LDPC code at a FER of $10^{-5}$. We note, however, that the FER curve of the IEEE 802.3an LDPC code has a steeper slope and this code will thus perform better than polar codes at lower FERs. The polar code for $N=1024$ and $R=\frac{13}{16}$ used for SCL decoding was designed for an SNR of $4$ dB, while the polar code for $N=4096$ and $R=\frac{13}{16}$ used for SC and BP decoding was designed for an SNR of $3$ dB.

In Figure 7, we observe that, on average, the polar decoders have lower hardware efficiency than the IEEE 802.3an LDPC decoders. In terms of decoding throughput, only the BP decoders and a few SC decoders can approach the IEEE 802.3an LDPC decoders, albeit with slightly higher area requirements.

III-D Polar Codes vs. 3GPP LTE Turbo Codes

The 3GPP LTE standard [12] defines a baseline Turbo code with rate $R=\frac{1}{3}$ and information bit interleaver block sizes ranging from $K=40$ to $K=6144$ bits. Multiple code rates are supported, both higher and lower than $R=\frac{1}{3}$ , which are obtained by puncturing and parity bit repetition, respectively. We simulated the performance of this Turbo code for the largest supported interleaver length $K=6144$ under max-log decoding with $I=6$ iterations, which is a number that is commonly found in the hardware implementation literature. We note that an interleaver length of $K=6144$ leads to a codeword blocklength $N=12288$ for rate $R=\frac{1}{2}$ and a codeword blocklength of $N=18432$ for rate $R=\frac{1}{3}$ . We provide a comparison for $R=\frac{1}{3}$ and $R=\frac{1}{2}$ .
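The codeword blocklengths quoted above follow from $N=K/R$. The sketch below reproduces this arithmetic; the function name is illustrative, and the computation ignores any trellis termination bits, consistent with the numbers given in the text.

```python
from fractions import Fraction

def codeword_length(k, rate):
    """Codeword blocklength N = K / R for K information bits at code rate R."""
    n = Fraction(k) / Fraction(rate)
    if n.denominator != 1:
        raise ValueError("K / R is not an integer")
    return int(n)
```

Calling `codeword_length(6144, Fraction(1, 3))` and `codeword_length(6144, Fraction(1, 2))` reproduces the blocklengths $N=18432$ and $N=12288$, which is why polar codes with $N=16384$ are a natural point of comparison.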

In Figure 8, we observe that a polar code with $N=16384$ under SC decoding has a small loss of $0.5$ dB with respect to the LTE Turbo code with $K=6144$ at a FER of $10^{-5}$ for both $R=\frac{1}{3}$ and $R=\frac{1}{2}$. A polar code with $N=2048$ under SCL decoding with $L=8$ and an $8$-bit CRC has the same loss of $0.5$ dB with respect to the LTE Turbo code with $K=6144$ at a FER of $10^{-5}$ for both rates. We note, however, that at higher FERs the LTE Turbo code has significantly better performance than the polar codes. The polar codes only reach the performance of the LTE Turbo code at low FERs because the latter exhibits a relatively high error floor. Unfortunately, the polar code with $N=16384$ under BP decoding cannot reach the performance of the LTE Turbo code, even when a maximum of $I=30$ iterations is performed. We note that there exist Turbo codes that can even outperform the LTE Turbo code [71], thus increasing the potential gap in performance between polar codes and Turbo codes.

In Figure 9, we observe that, even though a single SC decoder achieves the best hardware efficiency overall, the SCL decoders have, on average, the best hardware efficiency among the polar decoders. Both the SC and BP decoders have very high area requirements when matching the FER performance of the LTE Turbo codes. We also observe that, on average, the LTE Turbo decoders have a similar hardware efficiency to the polar decoders.

IV Conclusion

In this paper, we compared polar decoders both in terms of error-correction performance and hardware efficiency against LDPC and Turbo decoders for existing communications standards. Comparisons were made for the IEEE 802.11ad [9], IEEE 802.11n [10], IEEE 802.3an [11], and 3GPP LTE [12] communications standards. In most cases, BP and SC decoding are not powerful enough and more complex algorithms, such as SCL decoding, are needed in order to match the error-correction performance of LDPC or Turbo codes. Moreover, we have seen that the polar decoders that can match the error-correction performance of LDPC and Turbo codes usually have lower hardware efficiency than their LDPC and Turbo decoder counterparts. The low hardware efficiency stems mainly from the low throughput that these decoders achieve, and not so much from their area requirements. In conclusion, while significant improvements have been achieved over the past few years in the polar decoding literature, further work is required in order to match and surpass existing channel coding solutions. In particular, the direction of increasing the throughput of SCL decoders seems promising, since SCL decoders have the lowest area requirements and generally the best hardware efficiency out of the polar decoders in all comparisons of this paper.

Acknowledgement

The authors would like to thank Huawei Technologies for financial support.

Bibliography

[1] E. Arıkan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inf. Theory, vol. 55, no. 7, 2009.
[2] I. Tal and A. Vardy, “List decoding of polar codes,” in IEEE Int. Symp. on Inf. Theory (ISIT), 2011.
[3] A. Pamuk, “An FPGA implementation architecture for decoding of polar codes,” in Int. Symp. on Wireless Commun. Syst. (ISWCS), 2011.
[4] F. Kienle, N. Wehn, and H. Meyr, “On complexity, energy- and implementation-efficiency of channel decoders,” IEEE Trans. Commun., vol. 59, no. 12, pp. 3301–3310, Dec. 2011.
[5] P. Giard, G. Sarkis, A. Balatsoukas-Stimming, Y. Fan, C.-Y. Tsui, A. Burg, C. Thibeault, and W. J. Gross, “Hardware decoders for polar codes: An overview,” in IEEE Int. Symp. on Circ. and Syst. (ISCAS), May 2016, pp. 149–152.
[6] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. J. Gross, “Fast polar decoders: Algorithm and implementation,” IEEE J. Sel. Areas Commun., vol. 32, no. 5, 2014.
[7] I. Tal and A. Vardy, “List decoding of polar codes,” IEEE Trans. Inf. Theory, vol. 61, no. 5, 2015.
[8] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. J. Gross, “Fast list decoders for polar codes,” IEEE J. Sel. Areas Commun., vol. 34, no. 2, 2016.