RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion

Bardienus P. Duisterhof; Jan Oberst; Bowen Wen; Stan Birchfield; Deva Ramanan; Jeffrey Ichnowski

arXiv:2506.05285 · cs.CV · June 6, 2025

TL;DR

RaySt3R treats 3D shape completion from a single RGB-D image as a novel view synthesis problem: a transformer predicts depth maps for new viewpoints, and the method reports state-of-the-art results with better 3D consistency and sharper object boundaries.

Contribution

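RaySt3R recasts 3D shape completion as a novel view synthesis problem and trains a feedforward transformer to predict depth maps, object masks, and per-pixel confidence scores for the query rays of a novel viewpoint, with the aim of improving accuracy and efficiency.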

Findings

01

Outperforms baselines by up to 44% in 3D Chamfer distance

02

Achieves state-of-the-art performance on synthetic and real datasets

03

Targets the weak 3D consistency and imprecise object boundaries of existing completion methods
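The first finding is stated in terms of Chamfer distance, so a small worked example may help. The sketch below is illustrative only: it assumes the common symmetric, unsquared, mean-based definition (papers differ on squaring and averaging, and this page does not say which variant RaySt3R uses), and the function names `chamfer_distance` and `relative_improvement` are hypothetical.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: mean nearest-neighbour Euclidean distance in each
    direction, averaged over the two directions.
    """
    # Pairwise Euclidean distances, shape (N, M). Brute force, so only
    # suitable for small point sets.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())


def relative_improvement(baseline: float, ours: float) -> float:
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - ours) / baseline
```

Under this reading, "up to 44%" would correspond to `relative_improvement(baseline, ours)` reaching about 0.44 on the most favourable comparison; lower Chamfer distance is better.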

Abstract

3D shape completion has broad applications in robotics, digital twin reconstruction, and extended reality (XR). Although recent advances in 3D object and scene completion have achieved impressive results, existing methods lack 3D consistency, are computationally expensive, and struggle to capture sharp object boundaries. Our work (RaySt3R) addresses these limitations by recasting 3D shape completion as a novel view synthesis problem. Specifically, given a single RGB-D image and a novel viewpoint (encoded as a collection of query rays), we train a feedforward transformer to predict depth maps, object masks, and per-pixel confidence scores for those query rays. RaySt3R fuses these predictions across multiple query views to reconstruct complete 3D shapes. We evaluate RaySt3R on synthetic and real-world datasets, and observe it achieves state-of-the-art performance, outperforming the…
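The abstract describes fusing per-view depth, mask, and confidence predictions into a complete shape but, as excerpted here, does not give the fusion rule. The following is a minimal sketch under stated assumptions rather than the authors' procedure: depth is z-depth under a pinhole model with a shared intrinsics matrix `K`, each query view has a known camera-to-world pose, and a point is kept when it is inside the predicted mask and its confidence exceeds a hypothetical `conf_threshold`. The names `unproject` and `fuse_views` are illustrative.

```python
import numpy as np


def unproject(depth: np.ndarray, K: np.ndarray, cam_to_world: np.ndarray) -> np.ndarray:
    """Lift a z-depth map (H, W) to world-space points (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel columns, rows
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # One ray per pixel in the camera frame, scaled so that z = 1.
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth[..., None]
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t


def fuse_views(views, K: np.ndarray, conf_threshold: float = 0.5) -> np.ndarray:
    """Merge per-view predictions into one point cloud (assumed rule).

    views: iterable of (depth, mask, conf, cam_to_world) tuples, where
    depth and conf are (H, W) floats and mask is (H, W) bool.
    """
    clouds = []
    for depth, mask, conf, cam_to_world in views:
        keep = mask & (conf > conf_threshold)  # hypothetical filter
        clouds.append(unproject(depth, K, cam_to_world)[keep])
    return np.concatenate(clouds, axis=0)
```

In this sketch the rays computed inside `unproject` play the role of the query rays mentioned in the abstract: each pixel of a novel view defines one ray, and the predicted depth places a point along it. A real pipeline would likely add further checks, for example against the input view, which are omitted here.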

Videos

RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion · SlidesLive

Taxonomy

Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Robot Manipulation and Learning