Towards Learning a Generalizable 3D Scene Representation from 2D Observations

Martin Gromniak; Jan-Gerrit Habekost; Sebastian Kamp; Sven Magg; Stefan Wermter

arXiv:2602.10943 · cs.CV · February 12, 2026

Open Access

TL;DR

This paper presents a generalizable neural radiance field approach that predicts 3D workspace occupancy from egocentric 2D robot observations. It generalizes to unseen object arrangements and infers complete occupancy, including occluded regions, beyond what traditional stereo vision methods recover.

Contribution

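The model builds its occupancy representation in a global workspace frame rather than in camera-centric coordinates, which makes the output directly applicable to robotic manipulation. It integrates flexible source views and generalizes to unseen object arrangements without scene-specific fine-tuning.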

Findings

01. Achieves 26 mm reconstruction error on real scenes, including occluded regions

02. Generalizes to unseen object arrangements without scene-specific fine-tuning

03. Validates the approach on a humanoid robot against 3D sensor ground truth

Abstract

We introduce a Generalizable Neural Radiance Field approach for predicting 3D workspace occupancy from egocentric robot observations. Unlike prior methods operating in camera-centric coordinates, our model constructs occupancy representations in a global workspace frame, making it directly applicable to robotic manipulation. The model integrates flexible source views and generalizes to unseen object arrangements without scene-specific fine-tuning. We demonstrate the approach on a humanoid robot and evaluate predicted geometry against 3D sensor ground truth. Trained on 40 real scenes, our model achieves 26 mm reconstruction error, including occluded regions, validating its ability to infer complete 3D occupancy beyond traditional stereo vision methods.
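
The abstract does not describe the model's internals, so the following is only a minimal sketch of the general idea behind querying occupancy in a global workspace frame: sample points on a fixed workspace grid, then project each point into every available source view with a pinhole model to find where image evidence for that point could be read. The `Camera` class, `project`, `workspace_grid`, and all parameter values are hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Camera:
    """Hypothetical pinhole camera (illustrative, not the paper's API).

    R and t map a workspace-frame point p to camera coordinates as R @ p + t.
    """
    R: Tuple[Vec3, Vec3, Vec3]  # row-major 3x3 rotation
    t: Vec3                     # translation
    fx: float
    fy: float
    cx: float
    cy: float
    width: int
    height: int


def workspace_to_camera(cam: Camera, p: Vec3) -> Vec3:
    """Transform a workspace-frame point into the camera frame."""
    x, y, z = (
        sum(cam.R[i][j] * p[j] for j in range(3)) + cam.t[i] for i in range(3)
    )
    return (x, y, z)


def project(cam: Camera, p: Vec3) -> Optional[Tuple[float, float]]:
    """Return the pixel (u, v) for workspace point p, or None if not visible."""
    x, y, z = workspace_to_camera(cam, p)
    if z <= 0.0:  # point is behind the camera
        return None
    u = cam.fx * x / z + cam.cx
    v = cam.fy * y / z + cam.cy
    if 0.0 <= u < cam.width and 0.0 <= v < cam.height:
        return (u, v)
    return None  # projects outside the image


def workspace_grid(lo: Vec3, hi: Vec3, n: int) -> List[Vec3]:
    """Regular grid with n >= 2 samples per axis spanning the box [lo, hi]."""
    if n < 2:
        raise ValueError("n must be at least 2")
    axes = [
        [lo[k] + (hi[k] - lo[k]) * i / (n - 1) for i in range(n)]
        for k in range(3)
    ]
    return [(x, y, z) for x, y, z in product(*axes)]


def visible_views(cams: List[Camera], p: Vec3) -> List[Tuple[int, float, float]]:
    """List (camera index, u, v) for every source view that sees point p."""
    hits = []
    for idx, cam in enumerate(cams):
        uv = project(cam, p)
        if uv is not None:
            hits.append((idx, uv[0], uv[1]))
    return hits
```

Because the grid is defined in the workspace frame, the set of query points stays the same no matter how many source views are supplied or where the cameras are; only the per-view pixel locations change. A learned model would then aggregate image features sampled at those pixels to predict occupancy per grid point, a step that is deliberately left out here.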

Taxonomy

Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Robotics and Sensor-Based Localization