Variational Speech Waveform Compression to Catalyze Semantic Communications
Shengshi Yao, Zixuan Xiao, Sixian Wang, Jincheng Dai, Kai Niu, Ping Zhang

TL;DR
This paper introduces a neural waveform compression method utilizing nonlinear transforms and variational modeling to improve speech compression efficiency, especially for semantic communication applications.
Contribution
It presents a novel neural waveform codec with flexible rate optimization and residual coding, outperforming existing codecs in compression rate and adaptability.
Findings
Achieves up to 27% lower coding rate than AMR-WB.
Effectively captures speech dependencies with nonlinear transforms.
Supports optimization for perceptual and semantic loss functions.
Abstract
We propose a novel neural waveform compression method to catalyze emerging speech semantic communications. By introducing nonlinear transforms and variational modeling, we effectively capture the dependencies within speech frames and estimate the probability distribution of the speech features more accurately, giving rise to better compression performance. In particular, the speech signals are analyzed and synthesized by a pair of nonlinear transforms, yielding latent features. An entropy model with a hyperprior is built to capture the probability distribution of the latent features, followed by quantization and entropy coding. The proposed waveform codec can be optimized flexibly towards an arbitrary rate; another appealing feature is that it can be easily optimized for any differentiable loss function, including the perceptual loss used in semantic communications. To further improve the…
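To make the quantization and entropy-modeling step in the abstract more concrete, here is a minimal sketch of how a rate estimate and a rate-distortion objective could be computed for already-extracted latent features. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian conditional model, the function names, and the hard-coded means and scales are all hypothetical. In a hyperprior codec the means and scales would typically be predicted from side information rather than fixed by hand.

```python
import math


def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def quantize(latents):
    """Uniform scalar quantization: round each latent to an integer.

    Python's round() sends exact .5 ties to the nearest even integer.
    """
    return [round(v) for v in latents]


def estimated_bits(symbols, means, scales, floor=1e-9):
    """Estimated code length in bits under an assumed Gaussian entropy model.

    Each integer symbol s is assigned the probability mass that the Gaussian
    places on the interval [s - 0.5, s + 0.5]; the ideal code length is the
    negative base-2 logarithm of that mass.
    """
    total = 0.0
    for s, mu, sigma in zip(symbols, means, scales):
        p = gaussian_cdf(s + 0.5, mu, sigma) - gaussian_cdf(s - 0.5, mu, sigma)
        total += -math.log2(max(p, floor))  # floor avoids log2(0)
    return total


def rate_distortion_loss(bits, distortion, lam):
    """Rate plus lam times distortion; lam sets the operating point."""
    return bits + lam * distortion


# Hypothetical latent features and entropy-model parameters (assumed values).
latents = [0.2, -1.7, 3.1]
means = [0.0, -2.0, 3.0]
scales = [1.0, 0.5, 2.0]

symbols = quantize(latents)
bits = estimated_bits(symbols, means, scales)
```

The distortion argument of `rate_distortion_loss` is deliberately left as a plain number: any differentiable measure (waveform error, a perceptual loss, a semantic loss) could be substituted, and sweeping `lam` is one common way to trade rate against quality.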
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques
