LL-SDR: Low-Latency Speech enhancement through Discrete Representations
Jingyi Li, Luca Della Libera, Mirco Ravanelli, Cem Subakan

TL;DR
LL-SDR is a token-based speech enhancement framework that uses discretization and a purpose-built quantizer to better separate speech from noise, enabling lightweight, low-latency enhancement in noisy environments.
Contribution
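The paper presents a Variance-Ordered Residual Vector Quantizer (VO-RVQ), designed to disentangle speech and noise distributions during tokenization, and a latent-space discriminator that aligns enhanced embeddings with semantic embeddings. Together they enable lightweight, low-latency speech enhancement.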
Findings
Outperforms continuous baselines in speech enhancement tasks.
Matches autoregressive token-based approaches in performance.
Enables lightweight, low-latency speech enhancement in noisy environments.
Abstract
Many speech enhancement (SE) methods rely on continuous representations. Recently, discrete audio tokens have been explored to enable autoregressive generation for SE. However, it remains unclear whether discretization itself consistently improves SE performance. In this paper, we introduce LL-SDR, a token-based speech enhancement framework that explicitly leverages discretization to better separate speech and noise. Our first contribution is a Variance-Ordered Residual Vector Quantizer (VO-RVQ), designed to disentangle speech and noise distributions during tokenization. Second, we propose a latent-space discriminator to better align enhanced embeddings with semantic embeddings. Experiments show that LL-SDR outperforms continuous baselines and matches the performance of autoregressive token-based approaches, while enabling lightweight, low-latency speech enhancement in both reverberant…
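As background for the tokenization step the abstract mentions, the following is a minimal sketch of ordinary residual vector quantization (RVQ), the general technique that VO-RVQ builds on. It is not the authors' method: the abstract does not say how the variance ordering is imposed, so none is implemented here. The function names `rvq_encode` and `rvq_decode`, the random codebooks, and all sizes are assumptions chosen for illustration only.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Plain RVQ (illustrative, not VO-RVQ).

    x: array of shape (N, D); codebooks: list of arrays of shape (K, D).
    Returns token indices of shape (N, num_stages) and the reconstruction (N, D).
    """
    residual = x.astype(np.float64).copy()
    recon = np.zeros_like(residual)
    tokens = []
    for cb in codebooks:
        # Squared distance from every residual to every codeword: (N, K).
        dist = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=-1)
        idx = dist.argmin(axis=1)  # nearest codeword per vector
        chosen = cb[idx]
        tokens.append(idx)
        recon += chosen            # accumulate the quantized estimate
        residual -= chosen         # the next stage quantizes what is left
    return np.stack(tokens, axis=1), recon


def rvq_decode(tokens, codebooks):
    """Rebuild embeddings by summing the selected codeword of each stage."""
    return sum(cb[tokens[:, s]] for s, cb in enumerate(codebooks))


# Hypothetical toy setup: 8 vectors of dimension 4, 3 stages of 16 codewords.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
codebooks = [rng.normal(scale=1.0 / (s + 1), size=(16, 4)) for s in range(3)]
tokens, recon = rvq_encode(x, codebooks)
print(tokens.shape)  # (8, 3)
```

Each input vector becomes one token per stage, and decoding only needs the token indices and the codebooks. In a trained system the codebooks would be learned rather than random.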
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Hearing Loss and Rehabilitation
