Rethinking Flow and Diffusion Bridge Models for Speech Enhancement

Dahan Wang, Jun Gao, Tong Lei, Yuxiang Hu, Changbao Zhu, Kai Chen, and Jing Lu

arXiv:2602.18355 · eess.AS · February 23, 2026 · AAAI

TL;DR

This paper unifies flow and diffusion bridge models for speech enhancement, shows that each of their sampling steps behaves like predictive enhancement, and introduces an improved model that outperforms existing flow and diffusion baselines with fewer parameters and less computation.

Contribution

The paper provides a unified framework for flow and diffusion bridge models, links them to predictive speech enhancement, and proposes an enhanced model with better performance and efficiency.

Findings

01 Outperforms existing flow and diffusion baselines

02 Uses fewer parameters and less computation

03 Shows that the predictive nature of these models limits their performance

Abstract

Flow matching and diffusion bridge models have emerged as leading paradigms in generative speech enhancement, modeling stochastic processes between paired noisy and clean speech signals based on principles such as flow matching, score matching, and the Schrödinger bridge. In this paper, we present a framework that unifies existing flow and diffusion bridge models by interpreting them as constructions of Gaussian probability paths with varying means and variances between paired data. Furthermore, we investigate the underlying consistency between the training/inference procedures of these generative models and conventional predictive models. Our analysis reveals that each sampling step of a well-trained flow or diffusion bridge model optimized with a data prediction loss is theoretically analogous to executing predictive speech enhancement. Motivated by this insight, we introduce an…
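
The idea of a Gaussian probability path between paired data can be illustrated with a minimal sketch. The code below is not the paper's model. It assumes one simple instance of such a path: a mean that interpolates linearly between the clean and noisy signals, and a Brownian-bridge-shaped standard deviation that vanishes at both endpoints. The time convention (t = 0 clean, t = 1 noisy), the `sigma` scale, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def bridge_mean(clean, noisy, t):
    """Path mean at time t: linear interpolation (an assumed choice).

    Assumed convention: t = 0 gives the clean signal, t = 1 the noisy one.
    """
    return [(1.0 - t) * c + t * n for c, n in zip(clean, noisy)]


def bridge_std(t, sigma=0.5):
    """Path standard deviation at time t (assumed Brownian-bridge shape).

    It is zero at t = 0 and t = 1, so the path is pinned to the paired data.
    """
    return sigma * math.sqrt(t * (1.0 - t))


def sample_path(clean, noisy, t, rng, sigma=0.5):
    """Draw x_t ~ N(bridge_mean, bridge_std^2 * I) for one clean/noisy pair."""
    mean = bridge_mean(clean, noisy, t)
    std = bridge_std(t, sigma)
    return [m + std * rng.gauss(0.0, 1.0) for m in mean]


def data_prediction_loss(predicted_clean, clean):
    """Mean squared error between a clean-speech estimate and the target."""
    return sum((p - c) ** 2 for p, c in zip(predicted_clean, clean)) / len(clean)


# Toy "signals" standing in for paired clean and noisy speech frames.
rng = random.Random(0)
clean = [0.0, 1.0, -1.0, 0.5]
noisy = [0.3, 0.8, -0.6, 0.9]

# An intermediate state that a network would receive during training.
x_half = sample_path(clean, noisy, 0.5, rng)

# Using the noisy input itself as the "prediction" gives a baseline loss.
baseline = data_prediction_loss(noisy, clean)
```

Under these assumptions the state at t = 1 is exactly the noisy signal, so a network trained with a data prediction loss sees the same input there as a conventional predictive enhancement model. This is one way to read the consistency between generative and predictive training that the abstract describes.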

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Speech Recognition and Synthesis