Diffusion models for audio semantic communication

Eleonora Grassucci, Christian Marinoni, Andrea Rodriguez, and Danilo Comminiello

arXiv:2309.07195·cs.SD·September 15, 2023

Open Access

TL;DR

This paper introduces a robust audio semantic communication framework using diffusion models that transmits lower-dimensional representations, effectively restoring audio content and semantics even under noisy channel conditions.

Contribution

It proposes a novel generative framework that treats audio transmission as an inverse problem, improving robustness to noise and corruption through diffusion-based generation.

Findings

01 Outperforms existing methods in real-world noisy channels

02 Effectively restores corrupted and missing audio parts

03 Focuses on semantic content preservation during transmission

Abstract

Directly sending audio signals from a transmitter to a receiver across a noisy channel may consume considerable bandwidth and be prone to errors when recovering the transmitted bits. In contrast, the recent semantic communication approach proposes sending the semantics and then regenerating semantically consistent content at the receiver, without exactly recovering the bitstream. In this paper, we propose a generative audio semantic communication framework that treats the communication problem as an inverse problem, making it robust to different corruptions. Our method transmits lower-dimensional representations of the audio signal and of the associated semantics to the receiver, which generates the corresponding signal with a particular focus on its meaning (i.e., the semantics) thanks to the conditional diffusion model at its core. During the generation process, the…
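To make the inverse-problem framing concrete, the following is a minimal sketch of the degradation side only: a clean signal passes through a corrupting channel, and the receiver sees only the degraded observation it must invert. The additive white Gaussian noise model, the zeroed-out segment, the sine-wave stand-in for the transmitted representation, and the names `awgn_channel` and `drop_segment` are all illustrative assumptions, not details taken from the paper. The conditional diffusion model that would perform the restoration is not implemented here.

```python
import numpy as np


def awgn_channel(x, snr_db, rng):
    """Add white Gaussian noise scaled so the expected SNR equals snr_db (assumed channel model)."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise


def drop_segment(x, start, length):
    """Zero out a contiguous span to mimic a missing chunk; returns the result and the mask."""
    mask = np.ones_like(x)
    mask[start:start + length] = 0.0
    return x * mask, mask


rng = np.random.default_rng(0)

# Hypothetical stand-in for the transmitted representation: one second of a 440 Hz tone.
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
x = np.sin(2.0 * np.pi * 440.0 * t)

# Forward (degradation) process: noise first, then a dropped segment.
y_noisy = awgn_channel(x, snr_db=10.0, rng=rng)
y, mask = drop_segment(y_noisy, start=4000, length=2000)

# Empirical SNR of the noisy signal, measured before the segment is dropped.
measured_snr_db = 10.0 * np.log10(np.mean(x ** 2) / np.mean((y_noisy - x) ** 2))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

In this picture the receiver holds `y` (and possibly `mask`) and has to produce a plausible `x`. A generative model conditioned on the transmitted semantics would take the place of a classical denoiser or inpainting routine at that step.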

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies