Efficient learning-based sound propagation for virtual and real-world audio processing applications
Anton Jeran Ratnarajah

TL;DR
This paper presents a fast, learning-based method for generating and estimating room impulse responses (RIRs) that enhances audio processing tasks like speech recognition and synthesis, outperforming traditional simulators.
Contribution
It introduces a novel learning-based RIR generator, an RIR estimator from reverberant speech and visual cues, and IR-GAN for augmenting RIRs, advancing acoustic simulation and estimation techniques.
Findings
The learning-based RIR generator is two orders of magnitude faster than an interactive ray-tracing simulator.
Estimated RIRs improve far-field ASR word error rate by 6.9%.
IR-GAN outperforms ray-tracing in an ASR benchmark by 8.95%.
Abstract
Sound propagation is the process by which sound energy travels through a medium, such as air, to the surrounding environment as sound waves. The room impulse response (RIR) describes this process and is influenced by the positions of the source and listener, the room's geometry, and its materials. Physics-based acoustic simulators have been used for decades to compute accurate RIRs for specific acoustic environments. However, we have encountered limitations with existing acoustic simulators. To address these limitations, we propose three novel solutions. First, we introduce a learning-based RIR generator that is two orders of magnitude faster than an interactive ray-tracing simulator. Our approach can be trained to input both statistical and traditional parameters directly, and it can generate both monaural and binaural RIRs for both reconstructed and synthetic 3D scenes. Our generated…
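To make the role of the RIR concrete: under the standard linear time-invariant model of room acoustics, reverberant audio is the dry (anechoic) signal convolved with the RIR for a given source and listener position. The sketch below illustrates only that relationship. It is not the paper's learned generator or estimator; the function names `synthetic_rir` and `apply_rir`, the decaying-noise RIR, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def synthetic_rir(sample_rate=16000, rt60=0.5, length_s=0.6, seed=0):
    """Toy stand-in RIR (assumption): a direct-path impulse followed by
    exponentially decaying noise. A real RIR would come from a simulator,
    a measurement, or a learned generator."""
    rng = np.random.default_rng(seed)
    n = int(sample_rate * length_s)
    t = np.arange(n) / sample_rate
    # Amplitude envelope falls by 60 dB (a factor of 10**-3) after rt60 seconds.
    decay = 10.0 ** (-3.0 * t / rt60)
    rir = 0.1 * rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct path from source to listener
    return rir


def apply_rir(dry, rir):
    """Reverberant signal = dry signal convolved with the RIR."""
    return np.convolve(dry, rir, mode="full")


sample_rate = 16000
# A quarter second of a 440 Hz tone stands in for dry speech.
dry = np.sin(2 * np.pi * 440.0 * np.arange(sample_rate // 4) / sample_rate)
rir = synthetic_rir(sample_rate)
wet = apply_rir(dry, rir)
```

The same convolution is how simulated or generated RIRs are typically used to augment clean speech into far-field training data; for long signals, an FFT-based convolution would usually be the faster choice.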
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Advanced Adaptive Filtering Techniques
