VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes

Advaith V. Sethuraman, Onur Bagoren, Harikrishnan Seetharaman, Dalton Richardson, Joseph Taylor, and Katherine A. Skinner

arXiv:2411.04963·cs.CV·November 8, 2024

TL;DR

This paper introduces VAIR, a multi-modal implicit neural representation that fuses RGB-D and ultrasonic measurements to densely reconstruct transparent surfaces in indoor scenes, in support of mobile robot navigation.

Contribution

The paper presents a new fusion model using generative latent optimization for dense transparent surface reconstruction in indoor environments.

Findings

01. Significant improvement over state-of-the-art methods.

02. Effective multi-modal fusion of acoustic and visual data.

03. Successful reconstruction of transparent surfaces in indoor scenes.

Abstract

Mobile robots operating indoors must be prepared to navigate challenging scenes that contain transparent surfaces. This paper proposes a novel method for the fusion of acoustic and visual sensing modalities through implicit neural representations to enable dense reconstruction of transparent surfaces in indoor scenes. We propose a novel model that leverages generative latent optimization to learn an implicit representation of indoor scenes consisting of transparent surfaces. We demonstrate that we can query the implicit representation to enable volumetric rendering in image space or 3D geometry reconstruction (point clouds or mesh) with transparent surface prediction. We evaluate our method's effectiveness qualitatively and quantitatively on a new dataset collected using a custom, low-cost sensing platform featuring RGB-D cameras and ultrasonic sensors. Our method exhibits significant…
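
The abstract mentions querying the implicit representation for volumetric rendering in image space. As a rough illustration only, the sketch below shows the standard volume-rendering quadrature commonly used with implicit scene representations. It is not taken from the paper: the function name `render_ray` and its inputs (per-sample densities, per-sample values, and sample distances along one ray) are assumptions made for this example.

```python
import math


def render_ray(densities, values, t_samples):
    """Composite per-sample values along a single ray.

    Hypothetical helper, not the authors' code. Uses the standard
    quadrature: weight_i = T_i * alpha_i, with
    alpha_i = 1 - exp(-density_i * delta_i) and
    T_i = product over j < i of (1 - alpha_j).
    """
    n = len(densities)
    if n < 2 or len(values) != n or len(t_samples) != n:
        raise ValueError("need at least two samples and equal-length inputs")

    transmittance = 1.0  # fraction of the ray not yet absorbed
    weights = []
    for i in range(n):
        # Spacing to the next sample; the last sample reuses the previous spacing.
        if i + 1 < n:
            delta = t_samples[i + 1] - t_samples[i]
        else:
            delta = t_samples[i] - t_samples[i - 1]
        alpha = 1.0 - math.exp(-densities[i] * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha

    rendered = sum(w * v for w, v in zip(weights, values))
    return rendered, weights


# Usage: pass the sample distances as the values to get an expected depth.
t = [1.0, 2.0, 3.0, 4.0]
depth, weights = render_ray([0.0, 0.0, 1e6, 0.0], t, t)
print(depth)
```

Passing colors instead of distances as `values` would give a rendered pixel value under the same weights. How VAIR itself parameterizes density and handles transparent surfaces is described in the paper, not here.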

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization · Advanced Vision and Imaging