Diffusion models for audio semantic communication
Eleonora Grassucci, Christian Marinoni, Andrea Rodriguez, and Danilo Comminiello

TL;DR
This paper introduces a robust audio semantic communication framework using diffusion models that transmits lower-dimensional representations, effectively restoring audio content and semantics even under noisy channel conditions.
Contribution
It proposes a novel generative framework that treats audio transmission as an inverse problem, improving robustness to noise and corruption through diffusion-based generation.
Findings
Outperforms existing methods in real-world noisy channels
Effectively restores corrupted and missing audio parts
Focuses on semantic content preservation during transmission
Abstract
Directly sending audio signals from a transmitter to a receiver across a noisy channel may consume considerable bandwidth and be prone to errors when recovering the transmitted bits. By contrast, the recent semantic communication approach sends the semantics and regenerates semantically consistent content at the receiver without exactly recovering the bitstream. In this paper, we propose a generative audio semantic communication framework that casts the communication problem as an inverse problem and is therefore robust to different corruptions. Our method transmits lower-dimensional representations of the audio signal and of the associated semantics to the receiver, which generates the corresponding signal with a particular focus on its meaning (i.e., the semantics) thanks to the conditional diffusion model at its core. During the generation process, the…
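To make the inverse-problem view in the abstract concrete, the following is a minimal sketch of the degradation a receiver would have to invert: a low-dimensional representation passes through an additive white Gaussian noise channel and then loses a contiguous segment. It is an illustration only, not the authors' implementation. The function names `awgn_channel` and `drop_segment`, the 256-element vector, the 10 dB SNR, and the dropped range are all assumptions chosen for the example, and the diffusion-based receiver itself is not modeled.

```python
import numpy as np


def awgn_channel(z, snr_db, rng):
    """Add white Gaussian noise scaled to a target signal-to-noise ratio in dB."""
    signal_power = np.mean(z ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=z.shape)
    return z + noise


def drop_segment(z, start, length):
    """Zero out a contiguous segment; return the corrupted array and the keep-mask."""
    mask = np.ones_like(z)
    mask[start:start + length] = 0.0
    return z * mask, mask


rng = np.random.default_rng(0)
z = rng.standard_normal(256)  # hypothetical low-dimensional representation

# Observation y = mask * (z + noise): the receiver sees y and mask, not z.
noisy = awgn_channel(z, snr_db=10.0, rng=rng)
y, mask = drop_segment(noisy, start=64, length=32)
```

Under this toy model, the receiver's task is to produce a plausible `z` given `y` and `mask`. A generative prior such as a conditional diffusion model is one way to fill in what the channel destroyed.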
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
