FlowSE: Flow Matching-based Speech Enhancement

Seonggyu Lee; Sein Cheong; Sangwook Han; Jong Won Shin

arXiv:2508.06840 · eess.AS · August 12, 2025 · ICASSP


TL;DR

FlowSE is a speech enhancement method based on conditional flow matching. With 5 function evaluations it performs comparably to diffusion-based speech enhancement run with 60, reducing inference cost without the fine-tuning step used to correct the diffusion reverse process.

Contribution

This paper proposes a conditional flow matching approach to speech enhancement. At each number of function evaluations (NFE) from 1 to 5, it performs similarly to a diffusion model whose reverse process has been corrected by fine-tuning, while requiring no fine-tuning itself.

Findings

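01 With an NFE of 5, achieved performance comparable to diffusion-based speech enhancement with an NFE of 60

02 Performed similarly to the diffusion model with a corrected reverse process at the same NFE (1 to 5), without fine-tuning

03 Reduced the computational complexity of inference by lowering the number of function evaluations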

Abstract

Diffusion probabilistic models have shown impressive performance for speech enhancement, but they typically require 25 to 60 function evaluations in the inference phase, resulting in heavy computational complexity. Recently, a fine-tuning method was proposed to correct the reverse process, which significantly lowered the number of function evaluations (NFE). Flow matching is a method to train continuous normalizing flows, which model probability paths from known distributions to unknown distributions, including those described by diffusion processes. In this paper, we propose a speech enhancement method based on conditional flow matching. With an NFE of 5, the proposed method achieved performance comparable to that of diffusion-based speech enhancement with an NFE of 60, and it performed similarly to the diffusion model with the corrected reverse process at the same NFE, from 1 to 5…
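
To make the NFE figure concrete, here is a minimal sketch of how a flow-based sampler spends its function evaluations: each Euler step of the ODE dx/dt = v(x, t) calls the velocity field once, so the step count equals the NFE. Everything in it is an illustrative assumption rather than the paper's method: `euler_sample`, `toy_velocity`, and `target` are hypothetical names, the closed-form field stands in for a trained network, the straight-line path is one common choice that the abstract does not specify, and conditioning on the noisy speech is omitted.

```python
import numpy as np


def euler_sample(velocity, x0, nfe):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with `nfe` Euler steps.

    Returns the final state and the number of velocity evaluations used.
    """
    x = np.asarray(x0, dtype=float).copy()
    h = 1.0 / nfe          # uniform step size
    calls = 0
    for k in range(nfe):
        t = k * h
        x = x + h * velocity(x, t)   # one function evaluation per step
        calls += 1
    return x, calls


# Hypothetical stand-in for a learned velocity field: it moves any point
# along a straight line so that it reaches `target` at t = 1.
target = np.array([1.0, -2.0, 0.5])


def toy_velocity(x, t):
    return (target - x) / (1.0 - t)


x0 = np.zeros(3)                       # toy starting point
x5, calls5 = euler_sample(toy_velocity, x0, nfe=5)
x1, calls1 = euler_sample(toy_velocity, x0, nfe=1)
print(calls5, np.allclose(x5, target))
print(calls1, np.allclose(x1, target))
```

Because the toy trajectory is a straight line, even a single Euler step lands on the target. A trained field is only approximately straight, so this shows why few evaluations can suffice in principle and says nothing about the quality a real model reaches at a given NFE.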

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing