Semantic-aware Speech to Text Transmission with Redundancy Removal
Tianxiao Han, Qianqian Yang, Zhiguo Shi, Shibo He, Zhaoyang Zhang

TL;DR
This paper introduces a deep learning-based semantic speech-to-text transmission system that sends only text-related semantic features and drops semantically redundant content, improving both text accuracy and transmission efficiency.
Contribution
A novel end-to-end DL transceiver with attention-based alignment and redundancy removal modules, plus a two-stage training scheme for faster convergence.
Findings
Outperforms existing methods in text accuracy and transmission efficiency
Reduces model size and runtime
Effectively removes semantic redundancy
Abstract
Deep learning (DL) based semantic communication methods have been explored for the efficient transmission of images, text, and speech in recent years. In contrast to traditional wireless communication methods that focus on the transmission of abstract symbols, semantic communication approaches attempt to achieve better transmission efficiency by only sending the semantic-related information of the source data. In this paper, we consider semantic-oriented speech to text transmission. We propose a novel end-to-end DL-based transceiver, which includes an attention-based soft alignment module and a redundancy removal module to compress the transmitted data. In particular, the former extracts only the text-related semantic features, and the latter further drops the semantically redundant content, greatly reducing the amount of semantic redundancy compared to existing methods. We also propose…
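To make the idea of redundancy removal concrete, here is a minimal sketch. It is not the paper's module: it swaps in a simple CTC-style collapse, which merges runs of identical frame-level labels and drops a blank symbol, to show how many frame-level outputs carry no extra text information. The function name `collapse_redundant_frames`, the blank symbol `_`, and the toy frame sequence are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; not taken from the paper


def collapse_redundant_frames(frame_labels, blank=BLANK):
    """CTC-style collapse: merge runs of identical labels, then drop blanks.

    This stands in for a learned redundancy removal step and only
    illustrates the goal: keep one symbol per unit of text content.
    """
    # groupby yields one key per run of consecutive equal labels.
    merged = [label for label, _ in groupby(frame_labels)]
    # Blank frames separate repeated characters and carry no text content.
    return [label for label in merged if label != blank]


# Toy per-frame labels for the word "hello" (assumed, for illustration).
frames = list("__hh_e_ll_ll_oo__")
kept = collapse_redundant_frames(frames)
print("".join(kept), f"({len(kept)} of {len(frames)} frame symbols kept)")
```

In this toy case most frame symbols are repeats or blanks, so sending only the collapsed sequence would need far fewer symbols. The paper's redundancy removal module operates on learned semantic features rather than hard labels, so treat this only as intuition.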
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Wireless Signal Modulation Classification
