Hybrid Mamba-Transformer Decoder for Error-Correcting Codes

Shy-el Cohen; Yoni Choukroun; Eliya Nachmani

arXiv:2505.17834 · cs.IT · May 26, 2025

TL;DR

This paper presents a hybrid deep learning decoder for error-correcting codes that combines Mamba and Transformer layers, using a layer-wise masking strategy and a progressive layer-wise loss to improve decoding performance.

Contribution

It introduces a new hybrid Mamba-Transformer decoder with layer-wise masking and progressive supervision, advancing decoding accuracy for error-correcting codes.

Findings

01. Outperforms Transformer-only decoders
02. Surpasses standard Mamba models in accuracy
03. Effective across various linear codes

Abstract

We introduce a novel deep learning method for decoding error correction codes based on the Mamba architecture, enhanced with Transformer layers. Our approach proposes a hybrid decoder that leverages Mamba's efficient sequential modeling while maintaining the global context capabilities of Transformers. To further improve performance, we design a novel layer-wise masking strategy applied to each Mamba layer, allowing selective attention to relevant code features at different depths. Additionally, we introduce a progressive layer-wise loss, supervising the network at intermediate stages and promoting robust feature extraction throughout the decoding process. Comprehensive experiments across a range of linear codes demonstrate that our method significantly outperforms Transformer-only decoders and standard Mamba models.
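
The progressive layer-wise loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: each layer is assumed to emit its own bit-level logits, the per-layer loss is assumed to be binary cross-entropy, and the layer weights are assumed equal. The names `bce_with_logits` and `progressive_loss` are hypothetical and do not come from the paper.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed from raw logits (numerically stable form)."""
    total = 0.0
    for z, t in zip(logits, targets):
        # max(z, 0) - z*t + log(1 + exp(-|z|)) equals the usual BCE on sigmoid(z).
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def progressive_loss(per_layer_logits, targets, weights=None):
    """Weighted sum of the losses of every intermediate prediction.

    Assumption: equal weights unless given; the paper may weight layers differently.
    """
    if weights is None:
        weights = [1.0] * len(per_layer_logits)
    return sum(w * bce_with_logits(z, targets)
               for w, z in zip(weights, per_layer_logits))

# Hypothetical predictions from three successive layers for a 4-bit target.
targets = [1.0, 0.0, 0.0, 1.0]
per_layer_logits = [
    [0.2, -0.1, 0.3, 0.1],    # early layer: weak, partly wrong
    [1.0, -0.8, -0.5, 0.9],   # middle layer: correct signs, moderate confidence
    [3.0, -2.5, -2.0, 2.8],   # final layer: correct signs, high confidence
]
layer_losses = [bce_with_logits(z, targets) for z in per_layer_logits]
total = progressive_loss(per_layer_logits, targets)
print(layer_losses, total)
```

Because every intermediate prediction contributes to the total, gradients reach the early layers directly rather than only through the final output, which is one common motivation for this kind of deep supervision.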

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Mamba: Linear-Time Sequence Modeling with Selective State Spaces