Semantic-aware Speech to Text Transmission with Redundancy Removal

Tianxiao Han; Qianqian Yang; Zhiguo Shi; Shibo He; Zhaoyang Zhang

arXiv:2202.03211 · eess.AS · February 8, 2022

Open Access

TL;DR

This paper introduces a deep learning-based semantic speech-to-text transmission system that compresses the transmitted data by extracting only text-related semantic features and dropping semantically redundant content, improving text accuracy and transmission efficiency.

Contribution

A novel end-to-end DL transceiver with attention-based alignment and redundancy removal modules, plus a two-stage training scheme for faster convergence.

Findings

01 Outperforms existing methods in text accuracy and transmission efficiency

02 Reduces model size and runtime

03 Effectively removes semantic redundancy

Abstract

Deep learning (DL) based semantic communication methods have been explored for the efficient transmission of images, text, and speech in recent years. In contrast to traditional wireless communication methods that focus on the transmission of abstract symbols, semantic communication approaches attempt to achieve better transmission efficiency by only sending the semantic-related information of the source data. In this paper, we consider semantic-oriented speech to text transmission. We propose a novel end-to-end DL-based transceiver, which includes an attention-based soft alignment module and a redundancy removal module to compress the transmitted data. In particular, the former extracts only the text-related semantic features, and the latter further drops the semantically redundant content, greatly reducing the amount of semantic redundancy compared to existing methods. We also propose…
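The two compression steps named in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the abstract does not specify how either module is built, so the sketch assumes generic dot-product cross-attention for the soft alignment and a simple cosine-similarity filter for the redundancy removal. The function names (`soft_align`, `drop_redundant`), the `threshold` parameter, and the example vectors are all hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_align(frames, queries):
    """Assumed stand-in for soft alignment: each query attends over all
    speech frames and returns an attention-weighted average of them."""
    dim = len(frames[0])
    aligned = []
    for q in queries:
        weights = softmax([dot(q, f) for f in frames])
        aligned.append(
            [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
        )
    return aligned


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return 0.0 if na == 0 or nb == 0 else dot(a, b) / (na * nb)


def drop_redundant(features, threshold=0.99):
    """Assumed stand-in for redundancy removal: skip a feature when it is
    nearly identical to the last feature that was kept."""
    kept = []
    for f in features:
        if not kept or cosine(kept[-1], f) < threshold:
            kept.append(f)
    return kept


# Hypothetical toy input: three speech frames, two of them identical,
# and three queries, two of them identical.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
queries = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0]]

aligned = soft_align(frames, queries)
compact = drop_redundant(aligned)
print(len(aligned), "aligned features ->", len(compact), "after redundancy removal")
```

In this toy case the two identical queries produce identical aligned features, so the filter keeps only one of them and fewer vectors would need to be transmitted. In the paper both modules are part of a trained end-to-end transceiver, which this hand-written heuristic does not attempt to reproduce.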

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Wireless Signal Modulation Classification