Latent-attention Based Transformer for Near ML Polar Decoding in Short-code Regime
Hongzhi Zhu, Wei Xu, and Xiaohu You

TL;DR
This paper introduces a latent-attention based transformer decoder for short polar codes that improves decoding performance and generalization over existing transformer-based decoders, approaching maximum-likelihood (ML) decoding accuracy.
Contribution
The paper proposes a new latent-attention mechanism, an advanced training framework, and a code-aware mask scheme, enhancing transformer-based decoding for polar codes in short-code regimes.
Findings
Achieves near-ML performance in bit error rate (BER) and block error rate (BLER) for short polar codes.
Demonstrates robust generalization across different code rates and lengths.
Outperforms existing transformer decoders in accuracy and adaptability.
Abstract
Transformer architectures have emerged as promising deep learning (DL) tools for modeling complex sequence-to-sequence interactions in channel decoding. However, current transformer-based decoders for error correction codes (ECCs) demonstrate inferior performance and generalization capabilities compared to conventional algebraic decoders, especially in short-code regimes. In this work, we propose a novel latent-attention based transformer (LAT) decoder for polar codes that addresses the limitations on performance and generalization through three pivotal innovations. First, we develop a latent-attention mechanism that supersedes the conventional self-attention mechanism. This architectural modification enables independent learning of the Query and Key matrices for code-aware attention computation, decoupling them from the Value matrix to emphasize position-wise decoding interactions…
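The decoupling described in the abstract can be illustrated with a small NumPy sketch. This is one possible reading, not the authors' implementation: it assumes the Query and Key matrices are free learnable parameters (one row per code position) rather than projections of the input, so the attention weights depend only on position, while the Value is still computed from the input. The function names (`polar_generator`, `latent_attention`), the dimensions, and the mask rule built from the polar transform are all hypothetical choices made for illustration; the abstract does not specify the code-aware mask.

```python
import numpy as np

def polar_generator(n_stages):
    """Polar transform F^(kron n) with N = 2**n_stages (no bit-reversal)."""
    F = np.array([[1, 0], [1, 1]], dtype=np.int64)
    G = np.array([[1]], dtype=np.int64)
    for _ in range(n_stages):
        G = np.kron(G, F)
    return G

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def latent_attention(x, Q, K, W_v, mask=None):
    """Attention whose weights come from free parameters Q and K (assumed reading).

    x:    (N, d_model) input embeddings, used only for the Value path
    Q, K: (N, d_k) parameters that do not depend on x
    W_v:  (d_model, d_v) Value projection
    mask: (N, N) boolean, True where attention is allowed
    """
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    if mask is not None:
        # Disallowed pairs get a large negative score, i.e. zero weight.
        scores = np.where(mask, scores, -1e9)
    weights = softmax(scores, axis=-1)
    return weights @ (x @ W_v), weights

rng = np.random.default_rng(0)
N, d_model, d_k = 8, 16, 4

# Hypothetical code-aware mask: positions i and j may interact when the
# polar transform connects them in either direction.
G = polar_generator(3)
mask = (G + G.T) > 0

Q = rng.normal(size=(N, d_k))
K = rng.normal(size=(N, d_k))
W_v = rng.normal(size=(d_model, d_model))
x = rng.normal(size=(N, d_model))

out, weights = latent_attention(x, Q, K, W_v, mask)
print(out.shape)
```

Under this reading, changing the input `x` changes the output but leaves `weights` untouched, which is the sense in which the Query and Key are decoupled from the Value. In a trained model, `Q`, `K`, and `W_v` would be learned rather than sampled at random.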
