Blind Spatial Impulse Response Generation from Separate Room- and Scene-Specific Information
Francesc Lluís, Nils Meyer-Kahlen

TL;DR
This paper introduces a method for generating spatial room impulse responses in augmented reality by using a contrastive encoder and a diffusion model to infer room acoustics from limited sound sources and synthesize responses for new source positions.
Contribution
It presents a novel approach combining contrastive learning and diffusion models to generate room impulse responses from limited data, enabling realistic AR audio rendering.
Findings
Effective room-specific feature extraction via contrastive encoder
Successful generation of spatial impulse responses for new source positions
Potential for improved AR audio realism
Abstract
For audio in augmented reality (AR), knowledge of the users' real acoustic environment is crucial for rendering virtual sounds that seamlessly blend into the environment. As acoustic measurements are usually not feasible in practical AR applications, information about the room needs to be inferred from available sound sources. Then, additional sound sources can be rendered with the same room acoustic qualities. Crucially, these are placed at different positions than the sources available for estimation. Here, we propose to use an encoder network trained using a contrastive loss that maps input sounds to a low-dimensional feature space representing only room-specific information. Then, a diffusion-based spatial room impulse response generator is trained to take the latent space and generate a new response, given a new source-receiver position. We show how both room- and position-specific…
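The abstract does not say which contrastive loss the encoder uses, so the following is only a minimal sketch of one common choice, a supervised-contrastive-style loss in which embeddings of sounds recorded in the same room count as positives. The function name `room_contrastive_loss`, the `temperature` value, and the toy embeddings are illustrative assumptions, not details from the paper.

```python
import numpy as np

def room_contrastive_loss(embeddings, room_ids, temperature=0.1):
    """Supervised-contrastive-style loss (assumed form, not the paper's).

    Pairs of embeddings that share a room id are positives; every other
    embedding in the batch acts as a negative.
    """
    z = np.asarray(embeddings, dtype=float)
    room_ids = np.asarray(room_ids)

    # Cosine similarity between all pairs, scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Exclude self-pairs from the softmax denominator.
    self_mask = np.eye(len(z), dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: same room, different sample.
    pos_mask = (room_ids[:, None] == room_ids[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0  # anchors with at least one positive

    # Average negative log-probability of the positives for each anchor.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[valid] / pos_counts[valid]
    return float(per_anchor.mean())

# Toy batch: two sounds from each of two hypothetical rooms.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matched = room_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = room_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy batch the loss is lower when the room labels agree with the clusters in the embedding space than when they cut across them. Minimizing such a loss pushes an encoder toward features that depend on the room rather than on the source signal or position.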
Taxonomy
Topics: Speech and Audio Processing · Image and Signal Denoising Methods · Blind Source Separation Techniques
