Pandora: Articulated 3D Scene Graphs from Egocentric Vision
Alan Yu, Yun Chang, Christopher Xie, Luca Carlone

TL;DR
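This paper introduces Pandora, a method that builds articulated 3D scene graphs from egocentric data captured as a human wearing Project Aria glasses explores a scene, giving robots knowledge of articulated parts of the environment they could not explore themselves and supporting manipulation tasks.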
Contribution
The paper presents an approach that leverages egocentric data to model articulated objects and integrate them into 3D scene graphs for robotic applications.
Findings
Egocentric data can recover articulated object models with quality comparable to state-of-the-art methods.
Integrated 3D scene graphs improve understanding of object dynamics and relationships.
Enhanced scene graphs enable robots to perform complex manipulation tasks.
Abstract
Robotic mapping systems typically approach building metric-semantic scene representations from the robot's own sensors and cameras. However, these "first person" maps inherit the robot's own limitations due to its embodiment or skillset, which may leave many aspects of the environment unexplored. For example, the robot might not be able to open drawers or access wall cabinets. In this sense, the map representation is not as complete, and requires a more capable robot to fill in the gaps. We narrow these blind spots in current methods by leveraging egocentric data captured as a human naturally explores a scene wearing Project Aria glasses, giving a way to directly transfer knowledge about articulation from the human to any deployable robot. We demonstrate that, by using simple heuristics, we can leverage egocentric data to recover models of articulated object parts, with quality…
