DARAS: Dynamic Audio-Room Acoustic Synthesis for Blind Room Impulse Response Estimation

Chunxi Wang; Maoshen Jia; Wenyu Jin

arXiv:2507.08135 · eess.AS · November 5, 2025


TL;DR

DARAS is a deep learning framework that estimates room impulse responses (RIRs) from monaural reverberant speech, with the aim of improving acoustic modeling for AR/VR applications. It combines a dedicated audio encoder, Mamba-based self-supervised room parameter estimation, and adaptive synthesis.

Contribution

The paper introduces DARAS, a novel deep learning model combining a dedicated encoder, self-supervised parameter estimation, and adaptive acoustic tuning for blind RIR estimation.

Findings

01. DARAS outperforms existing models in subjective listening tests.

02. The system effectively captures room acoustic parameters from monaural speech.

03. Experimental results show improved realism in synthesized RIRs.

Abstract

Room Impulse Responses (RIRs) accurately characterize acoustic properties of indoor environments and play a crucial role in applications such as speech enhancement, speech recognition, and audio rendering in augmented reality (AR) and virtual reality (VR). Existing blind estimation methods struggle to achieve practical accuracy. To overcome this challenge, we propose the dynamic audio-room acoustic synthesis (DARAS) model, a novel deep learning framework that is explicitly designed for blind RIR estimation from monaural reverberant speech signals. First, a dedicated deep audio encoder effectively extracts relevant nonlinear latent space features. Second, the Mamba-based self-supervised blind room parameter estimation (MASS-BRPE) module, utilizing the efficient Mamba state space model (SSM), accurately estimates key room acoustic parameters and features. Third, the system incorporates a…
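
For readers new to the problem, the setting in the abstract can be sketched with the standard signal model: reverberant speech is the dry signal convolved with the RIR, and a blind estimator sees only the reverberant result. The sketch below is not the DARAS model. It builds a toy RIR (exponentially decaying noise, an assumption), convolves a stand-in dry signal with it, and then recovers the reverberation time from the known RIR using Schroeder backward integration. Reverberation time (T60) is used only as a familiar example of a room acoustic parameter; the abstract does not say which parameters MASS-BRPE estimates. All function names and constants here are hypothetical.

```python
import math
import random


def synth_rir(t60, fs, length_s, seed=0):
    """Toy RIR (assumption): Gaussian noise whose envelope falls 60 dB in t60 seconds."""
    rng = random.Random(seed)
    decay = 3.0 * math.log(10.0) / t60  # amplitude decay rate giving -60 dB at t60
    n = int(length_s * fs)
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * i / fs) for i in range(n)]


def convolve(x, h):
    """Direct linear convolution: reverberant = dry * rir."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def schroeder_t60(h, fs, hi_db=-5.0, lo_db=-25.0):
    """Estimate T60 from a known RIR: backward-integrate the energy,
    fit a line to the decay curve between hi_db and lo_db, extrapolate to -60 dB."""
    edc = [0.0] * len(h)
    energy = 0.0
    for i in range(len(h) - 1, -1, -1):
        energy += h[i] * h[i]
        edc[i] = energy
    total = edc[0]
    pts = []
    for i, e in enumerate(edc):
        if e <= 0.0:
            continue
        level = 10.0 * math.log10(e / total)
        if lo_db <= level <= hi_db:
            pts.append((i / fs, level))
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second
    return -60.0 / slope


fs = 8000            # sample rate in Hz (illustrative)
true_t60 = 0.3       # seconds (illustrative)
rir = synth_rir(true_t60, fs, length_s=0.5)
dry = [math.sin(2.0 * math.pi * 220.0 * i / fs) for i in range(200)]  # stand-in for speech
wet = convolve(dry, rir)   # what a blind estimator would observe
est_t60 = schroeder_t60(rir, fs)
print(f"true T60 = {true_t60:.2f} s, Schroeder estimate = {est_t60:.2f} s")
```

The T60 step here needs the RIR itself, which is the easy, non-blind case. A blind method such as the one the abstract describes has access only to `wet`, with neither `dry` nor `rir` available, which is what makes the estimation problem hard.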
