Semantic Communications for Speech Recognition

Zhenzi Weng, Zhijin Qin, and Geoffrey Ye Li

arXiv:2107.11190 · eess.AS · April 30, 2024

Open Access

TL;DR

This paper introduces DeepSC-SR, a deep learning-based semantic communication system for speech recognition that transmits only critical semantic features, reducing bandwidth use while maintaining high recognition accuracy and robustness in varying channel conditions.

Contribution

The paper presents a novel end-to-end deep learning system, DeepSC-SR, for semantic speech communication that efficiently transmits semantic features and is robust across different channel environments.

Findings

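01. DeepSC-SR transmits less data than traditional communication systems by sending only text-related semantic features.

02. DeepSC-SR achieves lower character-error-rate (CER) and word-error-rate (WER) than traditional communication systems.

03. DeepSC-SR maintains recognition performance in low signal-to-noise ratio (SNR) conditions.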
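
For readers unfamiliar with the two metrics named in the findings, the following is a minimal sketch of how character-error-rate and word-error-rate are commonly computed: the edit (Levenshtein) distance between the reference transcript and the recognized text, divided by the reference length. The function names `edit_distance`, `error_rate`, `cer`, and `wer` are illustrative assumptions and are not taken from the paper, whose evaluation code may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference units into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalized by the number of reference units."""
    if not ref_units:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference, hypothesis):
    """Character-error-rate: units are characters (spaces included here)."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word-error-rate: units are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())
```

As a usage example, with the reference "the cat sat" and the recognized text "the cat sit", `wer` gives 1/3 (one wrong word out of three) and `cer` gives 1/11 (one wrong character out of eleven, counting spaces). Both rates can exceed 1 when the hypothesis contains many insertions.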

Abstract

Traditional communication systems transmit all source data as bits, regardless of the content of the source or the semantic information required by the receiver. In some applications, however, the receiver needs only the part of the source data that carries critical semantic information, which motivates transmitting only application-related information, especially when bandwidth resources are limited. In this paper, we consider a semantic communication system for speech recognition and design the transceiver as an end-to-end (E2E) system. In particular, we develop a deep learning (DL)-enabled semantic communication system, named DeepSC-SR, that learns and extracts text-related semantic features at the transmitter, so the system transmits much less data than the source speech without performance degradation. Moreover, in order to facilitate the proposed DeepSC-SR for…

Taxonomy

Topics: Wireless Signal Modulation Classification · Speech Recognition and Synthesis · Speech and Audio Processing