Rig3R: Rig-Aware Conditioning for Learned 3D Reconstruction
Samuel Li, Pujith Kachana, Prajwal Chidananda, Saurabh Nair, Yasutaka Furukawa, Matthew Brown

TL;DR
Rig3R introduces a rig-aware approach to 3D reconstruction that leverages rig structure information when available or infers it when missing, significantly improving accuracy in multi-camera rig scenarios.
Contribution
The paper presents Rig3R, a novel model that incorporates rig metadata and learns to infer rig structure, enhancing multiview 3D reconstruction and pose estimation.
Findings
Achieves state-of-the-art 3D reconstruction performance.
Outperforms prior methods by 17-45% in mean average accuracy (mAA).
Operates efficiently in a single forward pass.
Abstract
Estimating agent pose and 3D scene structure from multi-camera rigs is a central task in embodied AI applications such as autonomous driving. Recent learned approaches such as DUSt3R have shown impressive performance in multiview settings. However, these models treat images as unstructured collections, limiting effectiveness in scenarios where frames are captured from synchronized rigs with known or inferable structure. To this end, we introduce Rig3R, a generalization of prior multiview reconstruction models that incorporates rig structure when available, and learns to infer it when not. Rig3R conditions on optional rig metadata including camera ID, time, and rig poses to develop a rig-aware latent space that remains robust to missing information. It jointly predicts pointmaps and two types of raymaps: a pose raymap relative to a global frame, and a rig raymap relative to a…
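To make the raymap idea in the abstract concrete, here is a minimal sketch of a per-pixel raymap under a pinhole camera model. It is an illustration only, not Rig3R's implementation: the excerpt above does not specify the paper's raymap parameterization, and the names `pixel_ray`, `rotate`, and `raymap`, along with the intrinsics and pose arguments, are hypothetical. The assumed pose convention is x_ref = R * x_cam + t, so the camera center in the reference frame is t.

```python
import math


def pixel_ray(u, v, fx, fy, cx, cy):
    """Unit ray direction in the camera frame for pixel (u, v), pinhole model."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


def rotate(R, d):
    """Apply a 3x3 rotation (nested lists, row-major) to a 3-vector."""
    return tuple(sum(R[i][j] * d[j] for j in range(3)) for i in range(3))


def raymap(width, height, fx, fy, cx, cy, R, t):
    """Return a height x width grid of (origin, direction) pairs.

    Assumption: R and t map camera coordinates into the chosen reference
    frame (x_ref = R @ x_cam + t), so every ray starts at the camera
    center t and points along the rotated pixel direction.
    """
    origin = tuple(t)
    return [
        [(origin, rotate(R, pixel_ray(u, v, fx, fy, cx, cy))) for u in range(width)]
        for v in range(height)
    ]


# Identity pose: the principal-point pixel looks straight down the +z axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rays = raymap(3, 3, fx=1.0, fy=1.0, cx=1.0, cy=1.0, R=identity, t=(0.0, 0.0, 0.0))
center_origin, center_dir = rays[1][1]
```

Under this sketch, the two raymaps named in the abstract would differ only in the reference frame: passing a camera-to-global pose gives something like a pose raymap, while passing a camera-to-rig pose gives something like a rig raymap.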
Taxonomy
Topics: Manufacturing Process and Optimization · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization
