Semantic Ordered Statistics Decoding
Chentao Yue, Branka Vucetic, and Yonghui Li

TL;DR
The paper introduces Sem-OSD, a novel soft decoding method for short linear block codes that integrates byte-level language models to improve error correction, especially for natural-language text sources.
Contribution
It presents a decoding algorithm that combines channel reliability with a language-model prior, improving on conventional OSD and Berlekamp-Massey decoding for short codes.
Findings
Achieves block error rates (BLERs) below the finite-blocklength normal-approximation bound on AWGN channels for BCH and RS codes.
Provides a 1.5 dB coding gain over Fossorier's OSD on AWGN channels.
Offers gains of 4 dB over Berlekamp-Massey decoding and 1 dB over OSD on burst-error channels.
Abstract
We propose a Semantic Ordered Statistics Decoder (sem-OSD), a soft decoder for short linear block codes carrying byte-streamed sources such as natural-language text. Sem-OSD injects a byte-level language-model (LM) prior into ordered statistics decoding (OSD) through a fused bit-level score that combines channel reliability with the LM prior, and uses this score for both most-reliable-basis (MRB) selection and codeword-candidate scoring. Sem-OSD enumerates two complementary test-error-pattern (TEP) families: a bit-flip family that flips a bounded number of bits, and an LM-driven family with a bounded number of byte substitutions that reaches error patterns the bit-flip family cannot. The LM prior is computed by a byte-level Transformer fine-tuned for byte-level denoising. Simulation results show that, on AWGN, sem-OSD achieves block error rates (BLERs) below the finite-blocklength normal-approximation bound…
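The abstract does not say how the fused bit-level score is formed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the LM outputs a 256-way distribution for one byte, that this distribution is marginalized into eight per-bit prior log-likelihood ratios (LLRs), and that fusion is a weighted sum of channel and prior LLRs. The function names, the `weight` parameter, and the example numbers are all hypothetical. The sketch stops at reliability ordering; a real MRB selection would also need Gaussian elimination on the permuted generator matrix, and bits without an LM prior (for example parity bits) would simply get a prior LLR of zero.

```python
import numpy as np

def byte_prior_to_bit_llrs(byte_probs):
    """Marginalize a 256-way byte distribution into 8 bit LLRs (MSB first).

    Convention (an assumption of this sketch): LLR > 0 favors bit 0.
    """
    byte_probs = np.asarray(byte_probs, dtype=float)
    values = np.arange(256)
    eps = 1e-12  # guards against log(0) when the prior is one-hot
    llrs = np.empty(8)
    for i in range(8):
        bit = (values >> (7 - i)) & 1
        p0 = byte_probs[bit == 0].sum()
        p1 = byte_probs[bit == 1].sum()
        llrs[i] = np.log((p0 + eps) / (p1 + eps))
    return llrs

def fuse_llrs(channel_llrs, prior_llrs, weight=1.0):
    """Assumed fusion rule: channel LLR plus a weighted prior LLR."""
    return np.asarray(channel_llrs, dtype=float) + weight * np.asarray(prior_llrs, dtype=float)

def reliability_order(fused_llrs):
    """Bit indices sorted from most to least reliable by |fused LLR|."""
    return np.argsort(-np.abs(fused_llrs), kind="stable")

# Hypothetical example: the LM strongly expects the byte 'e' (0b01100101),
# while the channel's hard decisions alone spell 'E' (0b01000101).
probs = np.full(256, 0.1 / 255)
probs[ord("e")] = 0.9
prior = byte_prior_to_bit_llrs(probs)
channel = np.array([2.0, -0.3, 0.4, 1.5, 0.2, -1.0, 0.1, -2.5])
fused = fuse_llrs(channel, prior)
order = reliability_order(fused)

def hard_byte(llrs):
    """Pack hard decisions (LLR < 0 means bit 1) into one byte value."""
    bits = (np.asarray(llrs) < 0).astype(int)
    return int("".join(map(str, bits)), 2)
```

In this toy case the third bit has a weak channel LLR that points the wrong way, and the prior is strong enough to overturn it, which is the kind of correction a channel-only reliability ordering cannot provide.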
