Few-shot Acoustic Synthesis with Multimodal Flow Matching
Amandine Brunetto

TL;DR
This paper presents FLAC, a probabilistic few-shot acoustic synthesis method using flow matching and diffusion transformers to generate room impulse responses with minimal data, outperforming existing methods.
Contribution
Introduces FLAC, a novel probabilistic flow-matching approach to few-shot acoustic synthesis that models the distribution of room impulse responses (RIRs) conditioned on scene cues, enabling efficient, scene-generalizable sound rendering.
Findings
FLAC outperforms state-of-the-art eight-shot baselines with only one shot.
Introduces AGREE, a metric for geometry-consistent evaluation of RIRs.
First application of generative flow matching to explicit RIR synthesis.
Abstract
Generating audio that is acoustically consistent with a scene is essential for immersive virtual environments. Recent neural acoustic field methods enable spatially continuous sound rendering but remain scene-specific, requiring dense audio measurements and costly training for each environment. Few-shot approaches improve scalability across rooms but still rely on multiple recordings and, being deterministic, fail to capture the inherent uncertainty of scene acoustics under sparse context. We introduce flow-matching acoustic generation (FLAC), a probabilistic method for few-shot acoustic synthesis that models the distribution of plausible room impulse responses (RIRs) given minimal scene context. FLAC leverages a diffusion transformer trained with a flow-matching objective to generate RIRs at arbitrary positions in novel scenes, conditioned on spatial, geometric, and acoustic cues. FLAC…
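The flow-matching objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common straight-line probability path between Gaussian noise and a data sample, and every name in it (`interpolate`, `flow_matching_loss`, `sample`, `oracle_velocity`, `cond`) is hypothetical. In FLAC the velocity model would be a diffusion transformer conditioned on spatial, geometric, and acoustic cues; here a closed-form toy model stands in for it, and a short list of floats stands in for an RIR.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path, constant in t: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(velocity_model, x1, cond, rng):
    """Single-sample Monte Carlo estimate of the flow-matching loss.

    Assumption: linear path and squared-error regression onto x1 - x0.
    The path and weighting used in the paper may differ.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # noise sample
    t = rng.random()                          # time in [0, 1)
    x_t = interpolate(x0, x1, t)
    v_pred = velocity_model(x_t, t, cond)
    v_star = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_star)) / len(x1)

def sample(velocity_model, cond, dim, steps, rng):
    """Generate a sample by Euler integration of dx/dt = v(x, t, cond)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity_model(x, t, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def oracle_velocity(x_t, t, cond):
    """Toy stand-in for a trained network (hypothetical).

    When the data distribution is a single point, the optimal velocity is
    (x1 - x_t) / (1 - t). Here `cond` simply carries that point; a real
    model would instead receive scene cues and learn the field.
    """
    return [(c - x) / (1.0 - t) for c, x in zip(cond, x_t)]

rng = random.Random(0)
toy_rir = [0.9, -0.4, 0.2, -0.1]   # toy 4-tap "impulse response"
loss = flow_matching_loss(oracle_velocity, toy_rir, toy_rir, rng)
generated = sample(oracle_velocity, toy_rir, dim=4, steps=16, rng=rng)
```

With the oracle velocity the loss is zero up to rounding and the Euler sampler lands on the toy target. A learned model trained on many scenes would instead produce different plausible RIRs for different noise draws, which is the probabilistic behaviour the abstract contrasts with deterministic few-shot methods.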
Taxonomy
Topics: Hearing Loss and Rehabilitation · Generative Adversarial Networks and Image Synthesis · Music Technology and Sound Studies
