Room Impulse Response Generation Conditioned on Acoustic Parameters
Silvia Arellano, Chunghsin Yeh, Gautam Bhattacharya, Daniel Arteaga

TL;DR
This paper introduces a method for generating room impulse responses conditioned on acoustic parameters, enabling more flexible and perceptually relevant modeling of how a space sounds than traditional geometry-based approaches.
Contribution
It proposes conditioning RIR generation directly on acoustic parameters, moving beyond geometric room descriptions, and evaluates multiple generative models for improved perceptual realism.
Findings
MaskGIT achieves the best performance among evaluated models.
Proposed models match or outperform state-of-the-art methods.
Conditioning on acoustic parameters offers more flexible RIR generation.
Abstract
The generation of room impulse responses (RIRs) using deep neural networks has attracted growing research interest due to its applications in virtual and augmented reality, audio postproduction, and related fields. Most existing approaches condition generative models on physical descriptions of a room, such as its size, shape, and surface materials. However, this reliance on geometric information limits their usability in scenarios where the room layout is unknown or when perceptual realism (how a space sounds to a listener) is more important than strict physical accuracy. In this study, we propose an alternative strategy: conditioning RIR generation directly on a set of RIR acoustic parameters. These parameters include various measures of reverberation time and direct sound to reverberation ratio, both broadband and bandwise. By specifying how the space should sound instead of how it…
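To make the conditioning signal concrete, the sketch below computes two broadband acoustic parameters of the kind the abstract names, a reverberation time and a direct-to-reverberant ratio, from a single RIR. It is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the T30-style line fit on the Schroeder decay curve, and the 2.5 ms direct-sound window are all choices made here for illustration. Bandwise values could be obtained by applying the same functions to band-pass-filtered copies of the RIR.

```python
import numpy as np


def schroeder_edc_db(rir):
    """Normalized energy decay curve in dB (Schroeder backward integration)."""
    energy = np.asarray(rir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]   # energy remaining from each sample onward
    edc = edc / edc[0]                    # 0 dB at the first sample
    return 10.0 * np.log10(np.maximum(edc, 1e-12))


def rt60_from_t30(rir, fs):
    """Estimate RT60 by fitting a line to the decay between -5 and -35 dB.

    The fitted slope (dB per second) is extrapolated to a 60 dB decay.
    Assumes the RIR has enough dynamic range to reach -35 dB.
    """
    edc_db = schroeder_edc_db(rir)
    t = np.arange(len(edc_db)) / fs
    mask = (edc_db <= -5.0) & (edc_db >= -35.0)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)
    return -60.0 / slope


def drr_db(rir, fs, window_ms=2.5):
    """Direct-to-reverberant ratio in dB.

    The direct sound is taken as a short window around the strongest peak;
    the window length is an assumption, not a value from the paper.
    """
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    stop = peak + half + 1
    direct = np.sum(rir[start:stop] ** 2)
    reverb = np.sum(rir[stop:] ** 2)
    return 10.0 * np.log10(direct / reverb)


if __name__ == "__main__":
    # Synthetic RIR: exponentially decaying noise with a nominal RT60 of 0.5 s.
    fs, t60 = 16000, 0.5
    t = np.arange(int(2.0 * fs)) / fs
    rng = np.random.default_rng(0)
    rir = rng.standard_normal(t.size) * np.exp(-3.0 * np.log(10.0) * t / t60)
    print("estimated RT60 (s):", rt60_from_t30(rir, fs))
    print("estimated DRR (dB):", drr_db(rir, fs))
```

A vector of such values is the kind of description that would replace room geometry as the input to the generative model.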
Taxonomy
Topics: Speech and Audio Processing
