Error-Resilient Semantic Communication for Speech Transmission over Packet-Loss Networks

Zhuohang Han, Jincheng Dai, Shengshi Yao, Junyi Wang, Yanlong Li, Kai Niu, Wenjun Xu, and Ping Zhang

arXiv:2512.08203·cs.SD·December 10, 2025

TL;DR

This paper introduces Glaris, a novel semantic communication framework that enhances speech transmission robustness over packet-loss networks by using generative latent priors for high-quality packet loss concealment and error resilience.

Contribution

Glaris is the first to integrate generative latent priors into semantic speech communication for improved error resilience and compatibility with existing digital systems.

Findings

01 Glaris outperforms traditional forward error correction (FEC) in robustness and efficiency.
02 It maintains semantic and reconstruction quality under high packet loss.
03 It reduces redundancy overhead compared to existing methods.

Abstract

Real-time speech communication over wireless networks remains challenging because conventional channel protection mechanisms cannot effectively counter packet loss under stringent bandwidth and latency constraints. Semantic communication has emerged as a promising paradigm for enhancing the robustness of speech transmission through joint source-channel coding (JSCC). However, its cross-layer design is incompatible with existing digital communication systems, which hinders practical deployment. In this setting, the robustness of speech communication is evaluated primarily by its error resilience to packet loss over wireless networks. To address these challenges, we propose Glaris, a generative latent-prior-based resilient speech semantic communication framework that performs resilient speech coding in the generative latent space. Generative latent priors enable…

Taxonomy

Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Speech Recognition and Synthesis