Sim2Real Transfer for Audio-Visual Navigation with Frequency-Adaptive Acoustic Field Prediction
Changan Chen, Jordi Ramos, Anshul Tomar, Kristen Grauman

TL;DR
This paper introduces a sim2real transfer method for audio-visual navigation that separates the task into acoustic field prediction and waypoint navigation, and uses a frequency-adaptive strategy to bridge the spectral gap between simulated and real-world audio.
Contribution
It presents the first approach to sim2real transfer for audio-visual navigation, incorporating frequency-adaptive acoustic field prediction and real-world validation on a robot platform.
Findings
Improved navigation performance in real-world tests.
Effective spectral gap measurement and adaptation.
Successful transfer of policies from simulation to real robot.
Abstract
Sim2real transfer has received increasing attention lately due to the success of learning robotic tasks in simulation end-to-end. While there has been a lot of progress in transferring vision-based navigation policies, the existing sim2real strategy for audio-visual navigation performs data augmentation empirically without measuring the acoustic gap. Sound differs from light in that it spans a much wider range of frequencies and thus requires a different solution for sim2real. We propose the first treatment of sim2real for audio-visual navigation by disentangling it into acoustic field prediction (AFP) and waypoint navigation. We first validate our design choice in the SoundSpaces simulator and show improvement on the Continuous AudioGoal navigation benchmark. We then collect real-world data to measure the spectral difference between the simulation and the real world by training AFP…
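To make the idea of a per-frequency "spectral difference" concrete, here is a minimal sketch in Python. It is not the paper's procedure (the abstract above says the authors measure the gap by training AFP models, and the details are cut off). The sketch uses a simpler stand-in technique: it compares the mean power of a simulated and a real recording in a few frequency bands. The function names, the band edges, and the synthetic signals are all hypothetical and chosen only for illustration.

```python
import numpy as np


def band_energy_db(signal, sample_rate, band_edges_hz):
    """Mean power (in dB) of `signal` within each [lo, hi) frequency band.

    Assumes every band contains at least one FFT bin.
    """
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    energies = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        # Small constant avoids log(0) for silent bands.
        energies.append(10.0 * np.log10(power[mask].mean() + 1e-12))
    return np.array(energies)


def spectral_gap_db(sim_signal, real_signal, sample_rate, band_edges_hz):
    """Absolute per-band difference in mean power (dB) between two signals.

    Illustrative stand-in only; not the AFP-based measurement from the paper.
    """
    sim_db = band_energy_db(sim_signal, sample_rate, band_edges_hz)
    real_db = band_energy_db(real_signal, sample_rate, band_edges_hz)
    return np.abs(sim_db - real_db)


if __name__ == "__main__":
    sample_rate = 16000  # hypothetical sampling rate
    t = np.arange(sample_rate) / sample_rate
    # Hypothetical "simulated" signal: a single 440 Hz tone.
    sim = np.sin(2 * np.pi * 440 * t)
    # Hypothetical "real" signal: the same tone plus a 3 kHz component.
    real = sim + 0.5 * np.sin(2 * np.pi * 3000 * t)
    print(spectral_gap_db(sim, real, sample_rate, [0, 1000, 2000, 4000]))
```

In this toy setup the two signals agree in the lowest band and differ most in the 2000 to 4000 Hz band. A frequency-adaptive strategy could then, in principle, prefer the bands where the gap is small.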
Taxonomy
Topics: Speech and Audio Processing · Vehicle Noise and Vibration Control · Music and Audio Processing
