Generative Data Augmentation Challenge: Synthesis of Room Acoustics for Speaker Distance Estimation
Jackie Lin, Georg Götz, Hermes Sampedro Llopis, Haukur Hafsteinsson, Steinar Guðjónsson, Daniel Gert Nielsen, Finnur Pind, Paris Smaragdis, Dinesh Manocha, John Hershey, Trausti Kristjansson, Minje Kim

TL;DR
This paper introduces a challenge for synthesizing room acoustics to augment data for improving speaker distance estimation, focusing on generating diverse room impulse responses to enhance spatial audio tasks.
Contribution
It presents a novel challenge for generating diverse room acoustics data to support spatially sensitive speech processing tasks, addressing the difficulty of precise acoustic measurement or simulation.
Findings
Challenge dataset and evaluation code released
Generative data augmentation shown as a promising solution
Focus on improving speaker distance estimation accuracy
Abstract
This paper describes the Synthesis of Room Acoustics challenge, held as part of the Generative Data Augmentation workshop at ICASSP 2025. The challenge defines a generative task designed to increase the quantity and diversity of room impulse response datasets so that they can be used for a spatially sensitive downstream task: speaker distance estimation. It is motivated by the technical difficulty of precisely measuring or simulating the acoustic characteristics of many rooms. As a solution, it proposes generative data augmentation as an alternative that can potentially improve various downstream tasks. The challenge website, dataset, and evaluation code are available at https://sites.google.com/view/genda2025.
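To make concrete how a room impulse response (RIR) serves as augmentation data, the sketch below convolves a dry signal with an RIR to produce a reverberant training example. It is only an illustration and not the challenge's pipeline: `toy_rir` and `reverberate` are hypothetical names, and the toy RIR (a direct-path impulse plus an exponentially decaying noise tail) stands in for a measured, simulated, or generated RIR.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def toy_rir(sample_rate=16000, rt60=0.5, distance_m=2.0, seed=0):
    """Hypothetical stand-in RIR: delayed direct path plus decaying noise tail."""
    rng = np.random.default_rng(seed)
    n_tail = int(sample_rate * rt60)
    t = np.arange(n_tail) / sample_rate
    # A 60 dB decay over rt60 seconds is an amplitude factor of 10**-3.
    tail = rng.standard_normal(n_tail) * 10.0 ** (-3.0 * t / rt60)
    # Propagation delay of the direct sound, in samples.
    delay = int(round(distance_m / SPEED_OF_SOUND * sample_rate))
    rir = np.zeros(delay + n_tail)
    rir[delay:] = 0.1 * tail
    rir[delay] += 1.0 / max(distance_m, 1e-3)  # direct sound with 1/r attenuation
    return rir


def reverberate(dry, rir):
    """Convolve a dry signal with an RIR and peak-normalize the result."""
    wet = np.convolve(dry, rir)
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


# Usage: white noise stands in for a dry speech recording here.
dry = np.random.default_rng(1).standard_normal(1600)
rir = toy_rir(distance_m=2.0)
wet = reverberate(dry, rir)
```

In a distance-estimation setting, each reverberant signal would be paired with the source-receiver distance of the RIR used to create it, so a larger and more varied RIR set yields more varied training pairs.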
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
