Large Speech Model Enabled Semantic Communication

Yun Tian; Zhijin Qin; Guocheng Lv; Ye Jin; Kaibin Huang; Zhu Han

arXiv:2512.04711 · cs.SD · December 5, 2025

Open Access

TL;DR

This paper introduces LargeSC, a semantic communication system leveraging large speech models for adaptive, robust, and efficient speech transmission over lossy channels, outperforming traditional methods in quality and latency.

Contribution

The paper proposes a Large Speech Model enabled Semantic Communication (LargeSC) system that combines adaptive control, in-band unequal error protection (UEP), and generative recovery to improve the robustness and efficiency of speech transmission.

Findings

01 Supports bandwidths from 550 bps to 2.06 kbps.

02 Outperforms conventional baselines in speech quality under high packet loss.

03 Achieves approximately 460 ms latency for real-time deployment.
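
To give a feel for how tight the reported 550 bps to 2.06 kbps range is, the sketch below converts a bitrate into a per-frame bit budget. The 20 ms frame duration and the helper name `bits_per_frame` are assumptions chosen for illustration; the summary does not state the frame size LargeSC actually uses.

```python
def bits_per_frame(bitrate_bps: float, frame_ms: float) -> float:
    """Return the number of payload bits available per frame.

    bitrate_bps: channel bitrate in bits per second.
    frame_ms: frame duration in milliseconds (an assumed value here).
    """
    return bitrate_bps * frame_ms / 1000.0


# Assumption: 20 ms frames, a common choice in speech codecs.
# The source does not give the frame duration.
FRAME_MS = 20.0

# The two endpoints of the bitrate range reported in the findings.
REPORTED_RATES_BPS = {"lowest": 550.0, "highest": 2060.0}

for label, rate in REPORTED_RATES_BPS.items():
    budget = bits_per_frame(rate, FRAME_MS)
    print(f"{label} rate {rate:.0f} bps: {budget:.1f} bits per {FRAME_MS:.0f} ms frame")
```

Under that assumed frame size, the lowest rate leaves only about a dozen bits per frame, which illustrates why a system operating in this range would lean on a generative model at the receiver rather than on waveform-level coding.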

Abstract

Existing speech semantic communication systems, mainly based on Joint Source-Channel Coding (JSCC) architectures, have demonstrated impressive performance, but their effectiveness remains limited by model structures designed for particular tasks and datasets. Recent advances indicate that generative large models pre-trained on massive datasets can achieve exceptional performance across diverse downstream tasks with minimal fine-tuning. To exploit the rich semantic knowledge embedded in large models and enable adaptive transmission over lossy channels, we propose a Large Speech Model enabled Semantic Communication (LargeSC) system. Simultaneously achieving adaptive compression and robust transmission over lossy channels remains challenging, requiring trade-offs among compression efficiency, speech quality, and latency. In this work, we employ…

Taxonomy

Topics: Advanced Data Compression Techniques · Speech and Audio Processing · Speech Recognition and Synthesis