FlowSynth: Instrument Generation Through Distributional Flow Matching and Test-Time Search
Qihui Yang, Randal Leistikow, Yongyi Zang

TL;DR
FlowSynth combines a probabilistic flow matching approach with test-time search to generate high-quality virtual instrument sounds whose timbre stays consistent across pitches and velocities, outperforming the TokenSynth baseline.
Contribution
Introduces distributional flow matching, which models uncertainty in the predicted velocity field, and pairs it with test-time optimization for instrument synthesis.
Findings
Outperforms TokenSynth in both single-note quality and cross-note consistency
Effectively models uncertainty for better timbre preservation
Enables real-time, professional-quality instrument generation
Abstract
Virtual instrument generation requires maintaining consistent timbre across different pitches and velocities, a challenge that existing note-level models struggle to address. We present FlowSynth, which combines distributional flow matching (DFM) with test-time optimization for high-quality instrument synthesis. Unlike standard flow matching, which learns deterministic mappings, DFM parameterizes the velocity field as a Gaussian distribution and optimizes it via negative log-likelihood, enabling the model to express uncertainty in its predictions. This probabilistic formulation allows principled test-time search: we sample multiple trajectories weighted by model confidence and select outputs that maximize timbre consistency. FlowSynth outperforms the current state-of-the-art TokenSynth baseline in both single-note quality and cross-note consistency. Our approach demonstrates that modeling…
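The two ideas in the abstract, a Gaussian velocity field trained by negative log-likelihood and a search over sampled trajectories, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a linear interpolation path between noise and data, a diagonal Gaussian, and plain best-of-N selection; the abstract does not say how samples are weighted by model confidence, so that step is not modeled. The names `dfm_nll_loss`, `sample_trajectory`, `best_of_n`, `toy_predict`, and the `score` callback are all hypothetical.

```python
import numpy as np


def dfm_nll_loss(x0, x1, t, predict):
    """Gaussian NLL of the target velocity under a predicted mean and log-variance.

    Assumes the linear path x_t = (1 - t) * x0 + t * x1, whose velocity is x1 - x0.
    `predict(xt, t)` is a hypothetical model returning (mean, log_var) arrays.
    """
    xt = (1.0 - t) * x0 + t * x1
    target_v = x1 - x0
    mean, log_var = predict(xt, t)
    # Per-dimension Gaussian NLL, dropping the constant 0.5 * log(2 * pi).
    nll = 0.5 * (log_var + (target_v - mean) ** 2 / np.exp(log_var))
    return nll.mean()


def sample_trajectory(x0, predict, n_steps, rng):
    """Euler integration from t=0 to t=1, drawing each velocity from the Gaussian."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = np.full((x.shape[0], 1), i * dt)
        mean, log_var = predict(x, t)
        std = np.exp(0.5 * log_var)
        x = x + dt * (mean + std * rng.standard_normal(x.shape))
    return x


def best_of_n(x0, predict, score, n_candidates, n_steps, seed=0):
    """Sample several trajectories and keep the one with the highest score.

    `score` stands in for a timbre-consistency measure (an assumption here).
    """
    rng = np.random.default_rng(seed)
    candidates = [
        sample_trajectory(x0, predict, n_steps, rng) for _ in range(n_candidates)
    ]
    scores = [score(c) for c in candidates]
    return candidates[int(np.argmax(scores))], scores


def toy_predict(xt, t):
    """Hypothetical stand-in for a trained network: pulls samples toward zero."""
    return -xt, np.full_like(xt, -2.0)
```

With `toy_predict` and a score such as `lambda c: -np.abs(c).sum()`, calling `best_of_n(x0, toy_predict, score, 5, 8)` returns the candidate closest to zero among five stochastic rollouts. In a real system the score would compare generated notes against a timbre reference rather than use a toy norm.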
