ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling

Arjun Somayazulu; Sagnik Majumder; Changan Chen; Kristen Grauman

arXiv:2404.16216 · cs.CV · April 26, 2024

Open Access

TL;DR

ActiveRIR introduces a reinforcement learning approach for efficient acoustic environment modeling using audio-visual data, significantly reducing data collection needs while accurately capturing environment acoustics.

Contribution

It presents a novel RL-based active sampling method that guides a mobile agent to construct high-quality acoustic models with minimal acoustic samples in unseen environments.

Findings

01

ActiveRIR outperforms traditional and state-of-the-art methods in diverse environments.

02

The RL policy effectively leverages audio-visual information for navigation and sampling.

03

High-quality acoustic models are achieved with fewer samples than existing approaches.

Abstract

An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment, for any given source/receiver location. Traditional methods for constructing acoustic models involve expensive and time-consuming collection of large quantities of acoustic data at dense spatial locations in the space, or rely on privileged knowledge of scene geometry to intelligently select acoustic data sampling locations. We propose active acoustic sampling, a new task for efficiently building an environment acoustic model of an unmapped environment in which a mobile agent equipped with visual and acoustic sensors jointly constructs the environment acoustic model and the occupancy map on-the-fly. We introduce ActiveRIR, a reinforcement learning (RL) policy that leverages information from audio-visual sensor streams to guide agent navigation and determine optimal…
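To make the first sentence of the abstract concrete: the "RIR" in the title stands for room impulse response, and a standard way to describe how a room transforms sound for one source/receiver pair is to convolve the dry source signal with that pair's RIR. The sketch below illustrates only this general idea and is not the paper's method. The function name `apply_rir` and the toy values are assumptions made for illustration.

```python
def apply_rir(dry, rir):
    """Convolve a dry source signal with a room impulse response (RIR).

    Hypothetical helper for illustration: `dry` and `rir` are plain lists
    of samples at the same sample rate.
    """
    out = [0.0] * (len(dry) + len(rir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(rir):
            # Each source sample is spread over time according to the RIR.
            out[i + j] += x * h
    return out


# Toy RIR (assumed values): direct path at sample 0,
# plus one reflection arriving 3 samples later at half amplitude.
toy_rir = [1.0, 0.0, 0.0, 0.5]
dry = [1.0, 2.0]

wet = apply_rir(dry, toy_rir)
print(wet)  # [1.0, 2.0, 0.0, 0.5, 1.0]
```

An environment acoustic model in this sense would supply a different RIR for each source/receiver location, which is why collecting RIR samples densely across a space is costly.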

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Noise Effects and Management

Methods: Sparse Evolutionary Training