Semantic Communication Systems for Speech Transmission

Zhenzi Weng; Zhijin Qin

arXiv:2102.12605 · eess.SP · September 9, 2021 · IEEE J. Sel. Areas Commun. · 6 cites

Open Access · 1 Repo

TL;DR

This paper introduces DeepSC-S, a deep learning-based semantic communication system for speech transmission that minimizes error at the semantic level rather than at the bit or symbol level. It uses a squeeze-and-excitation (SE) attention mechanism to improve recovery of the essential speech information, and it is designed to remain robust across varying channel conditions.

Contribution

The paper presents a deep learning-based semantic communication system for speech that incorporates an attention mechanism and a general model for dynamic channel environments, and that outperforms traditional communication systems.

Findings

01. DeepSC-S outperforms traditional communication systems in speech quality metrics.

02. The system is robust to channel variations, especially at low SNR.

03. DeepSC-S effectively captures essential speech information using attention mechanisms.
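
Finding 02 is stated in terms of signal-to-noise ratio (SNR). As a rough illustration of what a given SNR in dB means for a waveform, the sketch below adds white Gaussian noise scaled to a target SNR and then measures the result. It is not taken from the paper: the additive white Gaussian noise channel, the 440 Hz test tone, the 16 kHz sample rate, and the helper names `add_awgn` and `measured_snr_db` are all assumptions chosen for illustration.

```python
import numpy as np


def add_awgn(signal, snr_db, rng):
    """Return `signal` plus white Gaussian noise at the requested SNR (dB).

    Hypothetical helper: assumes a real-valued signal and an AWGN channel.
    """
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean reference and its noisy version."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))


rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0          # one second at an arbitrary 16 kHz rate
clean = np.sin(2 * np.pi * 440.0 * t)   # stand-in for a speech waveform

for snr_db in (0, 10, 20):
    noisy = add_awgn(clean, snr_db, rng)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

At 0 dB the noise carries as much power as the signal itself, which gives a sense of how demanding the low-SNR regime is.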

Abstract

Semantic communications could improve the transmission efficiency significantly by exploring the semantic information. In this paper, we make an effort to recover the transmitted speech signals in the semantic communication systems, which minimizes the error at the semantic level rather than the bit or symbol level. Particularly, we design a deep learning (DL)-enabled semantic communication system for speech signals, named DeepSC-S. In order to improve the recovery accuracy of speech signals, especially for the essential information, DeepSC-S is developed based on an attention mechanism by utilizing a squeeze-and-excitation (SE) network. The motivation behind the attention mechanism is to identify the essential speech information by providing higher weights to them when training the neural network. Moreover, in order to facilitate the proposed DeepSC-S for dynamic channel environments,…
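
The squeeze-and-excitation (SE) attention mentioned in the abstract can be sketched in a few lines. The example below is a generic SE block written in NumPy, not the authors' implementation: the feature shape `(frames, channels)`, the bottleneck size, the random weights, and the function name `se_block` are assumptions for illustration. It shows the three steps of the general technique: squeeze each channel to one statistic, pass that through a small bottleneck network, and rescale the channels by the resulting weights.

```python
import numpy as np


def se_block(features, w1, b1, w2, b2):
    """Generic squeeze-and-excitation over a (frames, channels) feature map.

    Hypothetical sketch; weights would normally be learned during training.
    """
    # Squeeze: global average over frames gives one descriptor per channel.
    squeezed = features.mean(axis=0)
    # Excitation, step 1: bottleneck dense layer with ReLU.
    hidden = np.maximum(0.0, squeezed @ w1 + b1)
    # Excitation, step 2: dense layer with sigmoid, giving weights in (0, 1).
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))
    # Scale: channels with larger weights contribute more to the output.
    return features * weights, weights


rng = np.random.default_rng(0)
frames, channels, reduced = 128, 16, 4   # illustrative sizes only
features = rng.standard_normal((frames, channels))
w1 = rng.standard_normal((channels, reduced)) * 0.1
b1 = np.zeros(reduced)
w2 = rng.standard_normal((reduced, channels)) * 0.1
b2 = np.zeros(channels)

reweighted, weights = se_block(features, w1, b1, w2, b2)
print(reweighted.shape, weights.shape)  # (128, 16) (16,)
```

In a trained network the per-channel weights are learned, so channels carrying the more important speech information can receive values closer to 1.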

Code & Models

Repositories

Zhenzi-Weng/DeepSC-S (TensorFlow, official)

Taxonomy

Topics: Speech and Audio Processing · Wireless Signal Modulation Classification · Speech Recognition and Synthesis