Semantic Ordered Statistics Decoding

Chentao Yue, Branka Vucetic, and Yonghui Li

arXiv:2605.02296·cs.IT·May 5, 2026

TL;DR

The paper introduces sem-OSD, a soft decoder for short linear block codes that uses a byte-level language model as a prior to improve error correction for byte-streamed sources such as natural-language text.

Contribution

The paper presents a decoding algorithm that fuses channel reliability with a byte-level language-model prior into a single bit-level score, which is used for both most-reliable basis selection and codeword candidate scoring.
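The fusion of channel reliability with a language-model prior can be illustrated with a minimal sketch. It assumes the fused score is the sum of a channel log-likelihood ratio (LLR) and a prior LLR obtained by marginalizing a per-byte probability vector; the paper's exact fusion rule is not given here, so the function names, the `weight` parameter, and the MSB-first bit order are all illustrative assumptions.

```python
import numpy as np

def byte_probs_to_bit_llrs(byte_probs, eps=1e-12):
    """Convert per-byte probabilities (num_bytes x 256) into per-bit prior LLRs.

    Bits are emitted MSB first (an assumption). A positive LLR favors bit 0.
    """
    byte_probs = np.asarray(byte_probs, dtype=float)
    values = np.arange(256)
    llrs = []
    for probs in byte_probs:
        for bit in range(7, -1, -1):
            is_one = ((values >> bit) & 1).astype(bool)
            p1 = probs[is_one].sum()   # mass of byte values with this bit set
            p0 = probs[~is_one].sum()  # mass of byte values with this bit clear
            llrs.append(np.log((p0 + eps) / (p1 + eps)))
    return np.array(llrs)

def fuse_llrs(channel_llrs, prior_llrs, weight=1.0):
    """Assumed fusion rule: channel LLR plus (optionally weighted) prior LLR."""
    channel_llrs = np.asarray(channel_llrs, dtype=float)
    prior_llrs = np.asarray(prior_llrs, dtype=float)
    return channel_llrs + weight * prior_llrs

def reliability_order(fused_llrs):
    """Bit indices sorted from most to least reliable by |fused LLR|."""
    return np.argsort(-np.abs(np.asarray(fused_llrs, dtype=float)), kind="stable")
```

In this sketch, positions with no language-model information (for example parity bits) would simply carry a prior LLR of zero, so their ordering falls back to channel reliability alone.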

Findings

01

Achieves BLER below the finite-blocklength normal-approximation bound on AWGN for BCH and RS codes.

02

Provides a 1.5 dB coding gain over Fossorier's OSD on AWGN.

03

Offers gains of 4 dB over Berlekamp-Massey decoding and 1 dB over OSD on burst-error channels.
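For context, the finite-blocklength normal approximation mentioned in the first finding is commonly written in the following standard form. Which variant the paper evaluates is not stated on this page, so the expression is given only as a general reference:

$$
R^{*}(n, \epsilon) \approx C - \sqrt{\frac{V}{n}}\, Q^{-1}(\epsilon) + \frac{\log_2 n}{2n}
$$

Here $R^{*}(n, \epsilon)$ is the maximal achievable rate at blocklength $n$ and block error rate $\epsilon$, $C$ is the channel capacity, $V$ is the channel dispersion, and $Q^{-1}$ is the inverse of the Gaussian Q-function.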

Abstract

We propose a Semantic Ordered Statistics Decoder (sem-OSD), a soft decoder for short linear block codes carrying byte-streamed sources such as natural-language text. Sem-OSD injects a byte-level language-model (LM) prior into ordered statistics decoding (OSD) through a fused bit-level score that combines channel reliability with the LM prior, and uses it for the most-reliable basis (MRB) selection and the codeword candidate scoring. Sem-OSD enumerates two complementary test-error-pattern (TEP) families: a bit-flip family that flips up to $m$ bits, and an LM-driven family of up to $\omega$ byte substitutions that reaches error patterns the bit-flip family cannot. The LM prior is computed by a byte-level Transformer fine-tuned for byte-level denoising. Simulation results show that, on AWGN, sem-OSD achieves block error rates (BLERs) below the finite-blocklength normal-approximation bound…
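The two TEP families described in the abstract can be sketched as simple enumerators. This is an illustrative sketch only: the enumeration order, the `candidates` structure (a hypothetical list of alternative byte values per position, standing in for LM output), and all function names are assumptions rather than the paper's procedure.

```python
from itertools import combinations, product

def bit_flip_teps(k, m):
    """Yield tuples of flipped bit positions among k basis bits, weight 0..m.

    Patterns are produced in order of increasing Hamming weight.
    """
    for weight in range(m + 1):
        yield from combinations(range(k), weight)

def byte_substitution_teps(hard_bytes, candidates, omega):
    """Yield {byte_index: new_value} maps with 1..omega byte substitutions.

    candidates[i] lists alternative values for byte i (hypothetical LM
    proposals); values equal to the current hard decision are skipped.
    """
    n = len(hard_bytes)
    for count in range(1, omega + 1):
        for positions in combinations(range(n), count):
            options = [
                [v for v in candidates[i] if v != hard_bytes[i]]
                for i in positions
            ]
            for values in product(*options):
                yield dict(zip(positions, values))

def apply_substitution(hard_bytes, substitution):
    """Return a copy of hard_bytes with the given byte substitutions applied."""
    out = bytearray(hard_bytes)
    for index, value in substitution.items():
        out[index] = value
    return bytes(out)
```

A single byte substitution can change up to eight bits at once, so a bit-flip enumerator would need weight up to 8 to cover the same pattern. This is one plausible reading of why the byte-substitution family reaches patterns that a small-$m$ bit-flip family does not.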
