Room Impulse Response Generation Conditioned on Acoustic Parameters

Silvia Arellano; Chunghsin Yeh; Gautam Bhattacharya; Daniel Arteaga

arXiv:2507.12136 · cs.SD · July 17, 2025

TL;DR

This paper introduces a novel method for generating room impulse responses conditioned on acoustic parameters, enabling more flexible and perceptually relevant sound space modeling compared to traditional geometry-based approaches.

Contribution

It proposes conditioning RIR generation directly on acoustic parameters, moving beyond geometric room descriptions, and evaluates multiple generative models for improved perceptual realism.

Findings

01. MaskGIT achieves the best performance among evaluated models.

02. Proposed models match or outperform state-of-the-art methods.

03. Conditioning on acoustic parameters offers more flexible RIR generation.

Abstract

The generation of room impulse responses (RIRs) using deep neural networks has attracted growing research interest due to its applications in virtual and augmented reality, audio postproduction, and related fields. Most existing approaches condition generative models on physical descriptions of a room, such as its size, shape, and surface materials. However, this reliance on geometric information limits their usability in scenarios where the room layout is unknown or when perceptual realism (how a space sounds to a listener) is more important than strict physical accuracy. In this study, we propose an alternative strategy: conditioning RIR generation directly on a set of RIR acoustic parameters. These parameters include various measures of reverberation time and direct sound to reverberation ratio, both broadband and bandwise. By specifying how the space should sound instead of how it…
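To make the conditioning parameters concrete, the sketch below estimates a broadband reverberation time and a direct-to-reverberant ratio (DRR) from a single RIR. It is only an illustration under stated assumptions, not the authors' procedure: the excerpt does not say how the parameters are computed, so the Schroeder backward integration, the -5 to -35 dB fit range, the 2.5 ms direct-sound window, and all function names (`schroeder_edc_db`, `estimate_rt60`, `estimate_drr_db`, `synthetic_rir`) are choices made here for the example.

```python
import numpy as np


def schroeder_edc_db(rir):
    """Energy decay curve in dB via Schroeder backward integration."""
    energy = np.asarray(rir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]  # energy remaining after each sample
    return 10.0 * np.log10(edc / edc[0] + 1e-12)


def estimate_rt60(rir, fs, lo_db=-5.0, hi_db=-35.0):
    """Fit a line to the decay between lo_db and hi_db, extrapolate to -60 dB.

    The fit range is an assumption of this sketch (a T30-style estimate).
    """
    edc_db = schroeder_edc_db(rir)
    t = np.arange(len(edc_db)) / fs
    mask = (edc_db <= lo_db) & (edc_db >= hi_db)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # slope in dB per second
    return -60.0 / slope


def estimate_drr_db(rir, fs, window_ms=2.5):
    """Ratio of energy near the direct peak to the energy after it, in dB.

    The window length is an assumption of this sketch.
    """
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))
    half = int(fs * window_ms / 1000.0)
    start = max(peak - half, 0)
    stop = peak + half + 1
    direct = np.sum(rir[start:stop] ** 2)
    reverb = np.sum(rir[stop:] ** 2)
    return 10.0 * np.log10(direct / (reverb + 1e-12))


def synthetic_rir(rt60, fs, duration, seed=0):
    """Hypothetical test signal: unit direct impulse plus decaying noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    # Amplitude envelope whose energy drops by 60 dB after rt60 seconds.
    envelope = np.exp(-3.0 * np.log(10.0) * t / rt60)
    rir = 0.1 * rng.standard_normal(len(t)) * envelope
    rir[0] = 1.0  # direct sound
    return rir


if __name__ == "__main__":
    fs = 16000
    rir = synthetic_rir(rt60=0.5, fs=fs, duration=1.5)
    print("RT60 estimate (s):", estimate_rt60(rir, fs))
    print("DRR estimate (dB):", estimate_drr_db(rir, fs))
```

Bandwise values, which the abstract also mentions, could be obtained in the same way by band-pass filtering the RIR (for example into octave bands) before calling these functions. The resulting vector of broadband and per-band values is the kind of "how the space should sound" description that a generative model would take as its condition.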

Taxonomy

Topics: Speech and Audio Processing