AVS-Net: Audio-Visual Scale Net for Self-supervised Monocular Metric Depth Estimation
Xiaohu Liu, Sascha Hornauer, Fabien Moutarde, Jialiang Lu

TL;DR
This paper introduces AVS-Net, a novel approach that leverages audio echoes to improve and scale-correct self-supervised monocular depth estimation, addressing generalization issues and eliminating the need for supervised depth data.
Contribution
The work demonstrates how integrating audio echoes can enhance depth prediction accuracy and enable scale correction in self-supervised monocular depth estimation.
Findings
Echoes help resolve object scale ambiguities.
The method improves state-of-the-art depth prediction models.
Echo-based supervision enables scale correction without supervised data.
Abstract
Metric depth prediction from monocular videos generalizes poorly between datasets and requires supervised depth data for scale-correct training. Self-supervised training using multi-view reconstruction can benefit from large-scale natural videos but does not provide correct scale, limiting its benefits. Recently, reflecting audible echoes off objects has been investigated for improved depth prediction and was shown to be sufficient to reconstruct objects at scale even without a visual signal. Because echoes travel at a fixed speed, they have the potential to resolve ambiguities in object scale and appearance. However, predicting depth end-to-end from sound and vision cannot benefit from unsupervised depth prediction approaches, which can process large-scale data without sound annotation. In this work we show how echoes can benefit depth prediction in two ways: When learning metric depth…
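The abstract's point that echoes travel at a fixed speed can be illustrated with a small sketch. This is not the AVS-Net method, which the truncated abstract does not describe. It only shows the underlying arithmetic: a round-trip echo delay gives a metric distance, and one such distance is enough to fix the scale of an otherwise scale-ambiguous depth estimate. The function names, the single-reflector setup, and the speed of sound (343 m/s, roughly dry air at 20 °C) are assumptions made for illustration.

```python
# Illustrative sketch only: names and values are hypothetical, not from the paper.
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees Celsius


def echo_distance_m(round_trip_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """One-way distance to a reflector, given the round-trip echo delay."""
    if round_trip_s < 0:
        raise ValueError("echo delay must be non-negative")
    # The sound travels to the object and back, so halve the path length.
    return speed_m_s * round_trip_s / 2.0


def scale_factor(relative_depth, metric_depth_m):
    """Factor that maps a unitless depth value onto a known metric depth."""
    if relative_depth <= 0:
        raise ValueError("relative depth must be positive")
    return metric_depth_m / relative_depth


def rescale(relative_depths, factor):
    """Apply one global scale factor to a list of relative depth values."""
    return [d * factor for d in relative_depths]


# Usage: a 20 ms echo delay and a relative depth of 0.5 for the same object.
metric = echo_distance_m(0.02)
factor = scale_factor(0.5, metric)
scaled = rescale([0.5, 1.0], factor)
```

A single global factor is the simplest possible correction. It assumes the relative depth map is correct up to scale and that the echo can be matched to a known image region.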
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Optical measurement and interference techniques
