Semantic-preserved Communication System for Highly Efficient Speech Transmission
Tianxiao Han, Qianqian Yang, Zhiguo Shi, Shibo He, Zhaoyang Zhang

TL;DR
This paper introduces a deep learning-based semantic communication system for speech that transmits only the essential semantic information, significantly reducing the amount of transmitted data while maintaining high speech recognition accuracy and speech quality.
Contribution
The paper proposes a novel end-to-end semantic-oriented speech transmission method that efficiently encodes semantic information and improves transmission efficiency for speech recognition and reconstruction.
Findings
Outperforms existing methods in speech-to-text accuracy and speech quality.
Uses only 16% of the symbols that traditional methods require for speech-to-text transmission.
Achieves a 10% reduction in word error rate (WER) while transmitting significantly less data.
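For readers unfamiliar with the WER metric cited in the findings, the following is a minimal sketch of how it is commonly computed: the word-level edit distance between a reference transcript and a predicted one, divided by the number of reference words. The function name and the example sentences are hypothetical and chosen only for illustration; this is not the paper's evaluation code, and the paper's exact scoring setup is not given here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution ("sat" -> "sit") and one
# deletion ("the"), so 2 errors over 6 reference words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {example:.3f}")
```

A lower WER means the transcription recovered at the receiver is closer to the reference text, which is the sense in which the reported reduction is an improvement.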
Abstract
Deep learning (DL) based semantic communication methods have been explored for the efficient transmission of images, text, and speech in recent years. In contrast to traditional wireless communication methods that focus on the transmission of abstract symbols, semantic communication approaches attempt to achieve better transmission efficiency by only sending the semantic-related information of the source data. In this paper, we consider semantic-oriented speech transmission which transmits only the semantic-relevant information over the channel for the speech recognition task, and a compact additional set of semantic-irrelevant information for the speech reconstruction task. We propose a novel end-to-end DL-based transceiver which extracts and encodes the semantic information from the input speech spectrums at the transmitter and outputs the corresponding transcriptions from the decoded…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Advanced Data Compression Techniques
