Towards Improving Speaker Distance Estimation through Generative Impulse Response Augmentation

Anton Ratnarajah; Mehmet Ergezer; Arun Nair; Mrudula Athi

arXiv:2605.00721·cs.SD·May 4, 2026

TL;DR

This paper presents a data augmentation method that uses generated room impulse responses (RIRs) to improve speaker distance estimation (SDE), reducing the mean absolute error from 1.66 m to 0.6 m in GWA rooms and from 2.18 m to 0.69 m in Treble rooms.

Contribution

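It supplements sparse RIR datasets with RIRs generated by FastRIR, conditioned only on speaker and listener locations, applies a quality filter that keeps generated RIRs aligned with the challenge RIRs, and uses hyperparameter optimization when fine-tuning the SDE model.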
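
A minimal sketch of how a generated RIR might become a labeled training example. It assumes, since the summary does not spell this out, that reverberant speech is formed by convolving dry speech with the RIR and that the distance label is the Euclidean distance between the speaker and listener positions. The names `convolve` and `make_example` are hypothetical and are not taken from the paper or from FastRIR.

```python
import math

def convolve(dry, rir):
    """Full linear convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(dry) + len(rir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(rir):
            out[i + j] += x * h
    return out

def make_example(dry, rir, speaker_xyz, listener_xyz):
    """Pair reverberant audio with its speaker-to-listener distance in meters.

    Assumption: the label is the straight-line distance between the two positions.
    """
    return convolve(dry, rir), math.dist(speaker_xyz, listener_xyz)

# Toy inputs: a two-sample "speech" signal and a three-tap RIR.
audio, distance_m = make_example(
    [1.0, 2.0], [1.0, 0.0, 0.5], (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
)
```

The nested loop keeps the sketch dependency-free; for real audio lengths an FFT-based convolution would be the practical choice.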

Findings

01

MAE reduced from 1.66m to 0.6m in GWA rooms

02

MAE reduced from 2.18m to 0.69m in Treble rooms

03

Augmentation improves estimation accuracy most at medium to long distances
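
As a worked illustration of the metric behind these findings, the sketch below computes a mean absolute error over distance estimates and the relative reduction implied by the reported numbers (roughly 64% for GWA rooms and 68% for Treble rooms). The function names are illustrative, and the relative reduction is derived arithmetic rather than a figure quoted from the paper.

```python
def mae(predicted_m, true_m):
    """Mean absolute error between predicted and true distances, in meters."""
    if len(predicted_m) != len(true_m) or not true_m:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - t) for p, t in zip(predicted_m, true_m)) / len(true_m)

def relative_reduction(before, after):
    """Fraction by which an error shrinks, e.g. 0.5 means the error halved."""
    return (before - after) / before

# Reported MAE values before and after augmentation.
gwa = relative_reduction(1.66, 0.6)
treble = relative_reduction(2.18, 0.69)
print(f"GWA: {gwa:.1%}, Treble: {treble:.1%}")
```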

Abstract

The Room Acoustics and Speaker Distance Estimation (SDE) Challenge at ICASSP 2025 explores how effectively augmented room impulse response (RIR) data improves SDE model performance. This challenge at GenDARA involves generating RIRs to supplement sparse datasets and fine-tuning SDE models with the augmented data. We employ the open-source fast diffuse room impulse response generator (FastRIR), conditioned only on speaker and listener locations. We design a quality filter to ensure that generated RIRs align with the challenge RIRs, and we use hyperparameter optimization for model fine-tuning. Our approach reduces the mean absolute error (MAE) of the five positions from 1.66 m to 0.6 m for GWA rooms and from 2.18 m to 0.69 m for Treble rooms. The results demonstrate that the augmentation approach significantly improves estimation accuracy, particularly at medium to long distances.
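
The abstract does not state which criterion the quality filter uses. The sketch below is therefore purely hypothetical: it assumes "alignment" is judged by reverberation time, estimating T60 from the Schroeder energy decay curve (the -5 dB to -25 dB span, extrapolated to 60 dB) and keeping generated RIRs whose T60 falls within a tolerance of the range seen in the reference RIRs. All function names and the `tol_s` parameter are illustrative assumptions, not the authors' method.

```python
import math

def schroeder_edc_db(rir):
    """Energy decay curve in dB via Schroeder backward integration."""
    energy = [x * x for x in rir]
    total = sum(energy)
    if total == 0.0:
        return []
    edc, remaining = [], total
    for e in energy:
        edc.append(10.0 * math.log10(max(remaining / total, 1e-12)))
        remaining -= e
    return edc

def estimate_t60(rir, fs):
    """T60 in seconds from the -5 dB to -25 dB decay, or None if not reached."""
    edc = schroeder_edc_db(rir)
    i5 = next((i for i, v in enumerate(edc) if v <= -5.0), None)
    i25 = next((i for i, v in enumerate(edc) if v <= -25.0), None)
    if i5 is None or i25 is None:
        return None
    return 3.0 * (i25 - i5) / fs  # a 20 dB decay time scaled to 60 dB

def quality_filter(generated, references, fs, tol_s=0.1):
    """Keep generated RIRs whose T60 lies within the reference range +/- tol_s."""
    ref = [t for t in (estimate_t60(r, fs) for r in references) if t is not None]
    if not ref:
        return []
    lo, hi = min(ref) - tol_s, max(ref) + tol_s
    kept = []
    for g in generated:
        t60 = estimate_t60(g, fs)
        if t60 is not None and lo <= t60 <= hi:
            kept.append(g)
    return kept

def synthetic_rir(t60, fs, duration_s=2.0):
    """Toy exponentially decaying RIR with a known T60, for demonstration only."""
    a = 3.0 * math.log(10.0) / t60  # amplitude decay rate giving 60 dB in t60
    return [math.exp(-a * n / fs) for n in range(int(duration_s * fs))]

fs = 8000
references = [synthetic_rir(0.4, fs), synthetic_rir(0.6, fs)]
generated = [synthetic_rir(0.5, fs), synthetic_rir(1.5, fs)]
kept = quality_filter(generated, references, fs)  # only the 0.5 s RIR survives
```

Other statistics, such as the direct-to-reverberant ratio, could be swapped in for T60 without changing the structure of the filter.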
