Gaussian Flow Bridges for Audio Domain Transfer with Unpaired Data

Eloi Moliner; Sebastian Braun; Hannes Gamper

arXiv:2405.19497·eess.AS·May 31, 2024

TL;DR

This paper introduces Gaussian Flow Bridges, an unsupervised generative approach to audio domain transfer that modifies audio characteristics without paired training data. It reports promising results on reverberation and distortion manipulation tasks.

Contribution

The paper presents a new framework using Gaussian Flow Bridges for unpaired audio domain transfer, enabling continuous control over target domain properties.
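
As rough intuition for how two deterministic flows can give unpaired transfer with a continuous control, the toy sketch below replaces audio with 1-D Gaussian data, for which the score is known in closed form. It assumes a noise schedule sigma(t) = t and plain Euler integration. The function names, the schedule, and the use of the control value `c` as the target mean are illustrative assumptions, not the paper's model or training setup.

```python
import numpy as np

def gaussian_score(x, t, mean, std):
    """Closed-form score of N(mean, std^2) after adding Gaussian noise of std t."""
    return -(x - mean) / (std**2 + t**2)

def probability_flow(x, ts, mean, std):
    """Euler-integrate dx/dt = -t * score(x, t) along the time grid ts."""
    x = x.copy()
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x += (t1 - t0) * (-t0 * gaussian_score(x, t0, mean, std))
    return x

def bridge(x_src, src_mean, src_std, c, tgt_std, t_max=80.0, steps=8000):
    """Source samples -> Gaussian latent -> target domain whose mean is set by c."""
    ts = np.linspace(0.0, t_max, steps + 1)
    # Flow 1: run the source-domain flow forward in time to reach the latent.
    latent = probability_flow(x_src, ts, src_mean, src_std)
    # Flow 2: run the target-domain flow backward in time, conditioned on c.
    return probability_flow(latent, ts[::-1], c, tgt_std)

rng = np.random.default_rng(0)
x_src = rng.normal(0.0, 1.0, size=5000)              # toy "source domain"
x_tgt = bridge(x_src, 0.0, 1.0, c=3.0, tgt_std=0.5)  # toy "target domain"
print(x_tgt.mean(), x_tgt.std())
```

No source sample is ever paired with a target sample here: each flow only needs the score of its own domain. Because both flows are deterministic, the ordering of the samples is preserved, which is the toy analogue of keeping content while changing domain characteristics. Changing `c` moves the output distribution continuously.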

Findings

01. Competitive performance in reverberation manipulation
02. Effective distortion modification without paired data
03. Potential for further research in unsupervised audio transfer

Abstract

Audio domain transfer is the process of modifying audio signals to match the characteristics of a different domain while retaining the original content. This paper investigates the potential of Gaussian Flow Bridges, an emerging approach in generative modeling, for this problem. The presented framework addresses the transport problem between different distributions of audio signals by chaining two deterministic probability flows. It allows the properties of the target distribution to be manipulated through a continuous control variable, which defines a certain aspect of the target domain. Notably, this approach does not rely on paired examples for training. To address the difficulty of keeping the speech content consistent, we recommend a training strategy that incorporates chunk-based minibatch Optimal Transport couplings of data…
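
The minibatch Optimal Transport coupling mentioned above can be illustrated in its simplest form. With equal batch sizes and uniform weights, the OT problem between two minibatches reduces to an assignment problem. The sketch below solves it with SciPy's `linear_sum_assignment`. The abstract is truncated here, so the two batches `x` and `y` are generic stand-ins, the squared Euclidean cost is an assumption, and no attempt is made to reproduce the chunk-based variant.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def minibatch_ot_pairs(x, y):
    """Return (perm, cost): pairing x[i] with y[perm[i]] minimises the summed squared distance."""
    # Pairwise squared Euclidean cost between all items of the two batches.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    # For a square cost matrix the returned row indices are 0..n-1 in order.
    _, perm = linear_sum_assignment(cost)
    return perm, cost

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))   # hypothetical batch A (16 items, 8 features)
y = rng.normal(size=(16, 8))   # hypothetical batch B
perm, cost = minibatch_ot_pairs(x, y)

idx = np.arange(len(x))
arbitrary_cost = cost[idx, idx].sum()   # pairing in the order the batches arrived
ot_cost = cost[idx, perm].sum()         # pairing chosen by the assignment solver
print(arbitrary_cost, ot_cost)
```

Training on pairs `(x[i], y[perm[i]])` instead of arbitrary pairs keeps coupled items close to each other, which is the general motivation for minibatch OT couplings.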

Code & Models

Repositories

microsoft/GFB-audio-control (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Acoustic Wave Phenomena Research