SF-Flow: Sound field magnitude estimation via flow matching guided by sparse measurements
Ege Erdem, Shoichi Koyama, Tomohiko Nakamura, Orchisama Das, Zoran Cvetković

TL;DR
SF-Flow introduces a novel flow matching-based method for reconstructing 3D sound field magnitudes from sparse measurements, improving accuracy and training efficiency in spatial audio applications.
Contribution
A framework that combines flow matching with a 3D U-Net to reconstruct sound fields from sparse microphone data, with faster training and better scaling with dataset size.
Findings
Accurate reconstruction up to 1 kHz.
Faster training compared to autoencoder baseline.
Performance improves with larger datasets.
Abstract
Reconstructing a 3D sound field from sparse microphone measurements is a fundamental yet ill-posed problem, which we address through Acoustic Transfer Function (ATF) magnitude estimation. ATF magnitude encapsulates key perceptual and acoustic properties of a physical space with applications in room characterization and correction. Although recent generative paradigms such as Flow Matching (FM) have achieved state-of-the-art performance in speech and music generation, their potential in spatial audio remains underexplored. We propose a novel framework for 3D ATF magnitude reconstruction as a guided generation task, with a 3D U-Net conditioned by a permutation-invariant set encoder. This architecture enables reconstruction from an arbitrary number of sparse inputs while leveraging the stable and efficient training properties of FM. Experimental results demonstrate that SF-Flow achieves…
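The two ingredients named in the abstract, a permutation-invariant set encoder and a flow matching training objective, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes mean pooling for the set encoder and a linear interpolation path between noise and data for flow matching, a common choice that the abstract does not confirm. The names `encode_measurements`, `fm_training_pair`, and `fm_loss`, and the `(x, y, z, magnitude)` measurement layout, are hypothetical.

```python
import random


def encode_measurements(measurements):
    """Permutation-invariant set encoder via mean pooling (an assumption).

    Each measurement is a hypothetical tuple (x, y, z, magnitude).
    A real model would apply a learned network to each element before
    pooling; the identity map is used here to keep the sketch small.
    """
    n = len(measurements)
    # Averaging over the set makes the result independent of input order
    # and lets the encoder accept any number of microphones.
    return [sum(column) / n for column in zip(*measurements)]


def fm_training_pair(x1, rng):
    """Build one flow matching training pair on a linear path (an assumption).

    x1 is a target sample, e.g. a flattened ATF magnitude grid.
    Returns the time t, the interpolated point x_t, and the target velocity.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # noise sample
    t = rng.random()                        # time drawn uniformly from [0, 1)
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]  # constant along a linear path
    return t, x_t, velocity


def fm_loss(predicted_velocity, target_velocity):
    """Mean squared error between predicted and target velocities."""
    n = len(target_velocity)
    return sum((p - q) ** 2 for p, q in zip(predicted_velocity, target_velocity)) / n
```

In a full model, a network (the 3D U-Net in the paper) would take `x_t`, `t`, and the encoded measurements as input and be trained to minimize `fm_loss` against the target velocity. Because the encoder pools over the set, the same network can be conditioned on any number of microphones in any order.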
