Semantic Communications for Speech Recognition
Zhenzi Weng, Zhijin Qin, and Geoffrey Ye Li

TL;DR
This paper introduces DeepSC-SR, a deep learning-based semantic communication system for speech recognition that transmits only critical semantic features, reducing bandwidth use while maintaining high recognition accuracy and robustness in varying channel conditions.
Contribution
The paper presents a novel end-to-end deep learning system, DeepSC-SR, for semantic speech communication that efficiently transmits semantic features and is robust across different channel environments.
Findings
DeepSC-SR reduces data transmission compared to traditional systems.
DeepSC-SR achieves a lower character error rate (CER) and word error rate (WER) than traditional systems.
DeepSC-SR maintains performance in low SNR conditions.
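The two metrics named in the findings are conventionally computed as an edit distance normalized by the length of the reference transcript. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names `edit_distance`, `cer`, and `wer` are hypothetical and chosen only for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Illustrative strings only, not data from the paper.
    reference = "the cat sat"
    recognized = "the cat sit"
    print(f"CER = {cer(reference, recognized):.3f}")
    print(f"WER = {wer(reference, recognized):.3f}")
```

Because both rates divide by the reference length, they can exceed 1 when the recognized text contains many insertions. Lower values mean the recognized text is closer to the reference.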
Abstract
Traditional communication systems transmit all source data as bits, regardless of the content of the source or the semantic information required by the receiver. However, in some applications the receiver needs only the part of the source data that carries critical semantic information, which motivates transmitting only application-related information, especially when bandwidth is limited. In this paper, we consider a semantic communication system for speech recognition by designing the transceiver as an end-to-end (E2E) system. In particular, a deep learning (DL)-enabled semantic communication system, named DeepSC-SR, is developed to learn and extract text-related semantic features at the transmitter, which allows the system to transmit much less data than the source speech without performance degradation. Moreover, in order to facilitate the proposed DeepSC-SR for…
Taxonomy
Topics: Wireless Signal Modulation Classification · Speech Recognition and Synthesis · Speech and Audio Processing
