Room Impulse Response Completion Using Signal-Prediction Diffusion Models Conditioned on Simulated Early Reflections
Zeyu Xu, Andreas Brendel, Albert G. Prinn, Emanuël A. P. Habets

TL;DR
This paper introduces a diffusion-based method for completing room impulse responses by conditioning on simulated early reflections, improving realism and flexibility over existing techniques.
Contribution
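The proposed approach combines a signal-prediction diffusion model with geometric (image source method) simulation of the direct path and early reflections. It removes the fixed-duration constraint on the input early reflections and makes the completed RIRs more realistic.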
Findings
Outperforms a state-of-the-art baseline in early RIR completion under objective evaluation
Reconstructs the energy decay curve more accurately than the baseline
Uses classifier-free guidance to steer generation toward physically realistic RIRs
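For readers unfamiliar with the energy decay curve (EDC) mentioned above, the following is a minimal sketch of how an EDC is commonly computed from an RIR using Schroeder backward integration. The function name `energy_decay_curve_db` and the toy exponentially decaying RIR are illustrative assumptions; this is not the paper's evaluation code, and the specific EDC error metric used by the authors is not given in this summary.

```python
import numpy as np

def energy_decay_curve_db(rir, eps=1e-12):
    """Normalized energy decay curve in dB (Schroeder backward integration)."""
    rir = np.asarray(rir, dtype=float)
    energy = rir ** 2
    # Remaining energy from each sample to the end of the response.
    edc = np.cumsum(energy[::-1])[::-1]
    # Normalize so the curve starts at (approximately) 0 dB.
    edc = edc / (edc[0] + eps)
    return 10.0 * np.log10(edc + eps)

# Toy RIR (assumption): white noise with an exponential envelope.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs // 2) / fs
toy_rir = rng.standard_normal(t.size) * np.exp(-t / 0.05)
edc_db = energy_decay_curve_db(toy_rir)
```

Comparing the EDC of a completed RIR against that of a reference RIR, for example sample by sample in dB, is one way to quantify how well the decay has been reconstructed.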
Abstract
Room impulse responses (RIRs) are fundamental to audio data augmentation, acoustic signal processing, and immersive audio rendering. While geometric simulators such as the image source method (ISM) can efficiently generate early reflections, they lack the realism of measured RIRs due to missing acoustic wave effects. We propose a diffusion-based RIR completion method using signal-prediction conditioned on ISM-simulated direct-path and early reflections. Unlike state-of-the-art methods, our approach imposes no fixed duration constraint on the input early reflections. We further incorporate classifier-free guidance to steer generation toward a target distribution learned from physically realistic RIRs simulated with the Treble SDK. Objective evaluation demonstrates that the proposed method outperforms a state-of-the-art baseline in early RIR completion and energy decay curve…
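The classifier-free guidance step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which a conditional and an unconditional prediction are linearly combined using a guidance scale. The function `cfg_combine`, its parameter names, and the choice to apply the combination directly to the predicted signals are assumptions for illustration. The paper's exact formulation and scale convention are not stated in this summary.

```python
import numpy as np

def cfg_combine(pred_cond, pred_uncond, guidance_scale):
    """Combine conditional and unconditional model predictions.

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional prediction,
    guidance_scale > 1 extrapolates past the conditional prediction.
    """
    pred_cond = np.asarray(pred_cond, dtype=float)
    pred_uncond = np.asarray(pred_uncond, dtype=float)
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)

# Hypothetical predictions for a three-sample signal segment.
cond = np.array([1.0, 0.5, -0.2])
uncond = np.array([0.0, 0.5, 0.2])
guided = cfg_combine(cond, uncond, guidance_scale=2.0)
```

Other sources write the same idea as (1 + w) times the conditional prediction minus w times the unconditional one, which shifts the meaning of the scale by one, so the convention should be checked against the paper before reuse.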
Taxonomy
Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Generative Adversarial Networks and Image Synthesis
