EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates
Weikun Peng, Denys Iliash, Manolis Savva

TL;DR
EgoFun3D introduces a new task, dataset, and benchmark for modeling interactive 3D objects from egocentric videos, emphasizing functional mappings and simulation readiness.
Contribution
The paper presents a new task, an annotated dataset, and a benchmark for modeling interactive objects from egocentric videos, using function templates to represent cross-part functional mappings.
Findings
The task is challenging for existing methods.
The dataset contains 271 egocentric videos of real-world interactions, paired with annotations that include 3D geometry and segmentation.
Benchmark results highlight the need for improved modeling techniques.
Abstract
We present EgoFun3D, a coordinated task formulation, dataset, and benchmark for modeling interactive 3D objects from egocentric videos. Interactive objects are of high interest for embodied AI but scarce, making modeling from readily available real-world videos valuable. Our task focuses on obtaining simulation-ready interactive 3D objects from egocentric video input. While prior work largely focuses on articulations, we capture general cross-part functional mappings (e.g., rotation of stove knob controls stove burner temperature) through function templates, a structured computational representation. Function templates enable precise evaluation and direct compilation into executable code across simulation platforms. To enable comprehensive benchmarking, we introduce a dataset of 271 egocentric videos featuring challenging real-world interactions with paired 3D geometry, segmentation…
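To make the idea of a function template more concrete, here is a minimal sketch of the stove-knob example from the abstract. It is not the paper's actual representation: the class name `LinearFunctionTemplate`, its fields, the linear mapping, and the numeric ranges are all assumptions chosen for illustration. The sketch shows one way a structured mapping between two parts could be both evaluated directly and emitted as standalone source code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LinearFunctionTemplate:
    """Hypothetical template: one part's state linearly drives another's."""
    source_part: str                    # e.g. "knob"
    source_state: str                   # e.g. "angle_deg"
    target_part: str                    # e.g. "burner"
    target_state: str                   # e.g. "temperature_c"
    source_range: Tuple[float, float]   # assumed valid input range
    target_range: Tuple[float, float]   # assumed output range

    def apply(self, source_value: float) -> float:
        """Clamp the input to source_range, then interpolate linearly."""
        lo, hi = self.source_range
        out_lo, out_hi = self.target_range
        clamped = min(max(source_value, lo), hi)
        return out_lo + (clamped - lo) / (hi - lo) * (out_hi - out_lo)

    def to_source(self) -> str:
        """Emit a standalone Python function implementing the same mapping."""
        lo, hi = self.source_range
        out_lo, out_hi = self.target_range
        name = f"{self.target_part}_{self.target_state}"
        arg = self.source_state
        return (
            f"def {name}({arg}):\n"
            f"    x = min(max({arg}, {lo!r}), {hi!r})\n"
            f"    return {out_lo!r} + (x - {lo!r}) / ({hi!r} - {lo!r})"
            f" * ({out_hi!r} - {out_lo!r})\n"
        )


# Illustrative values only: a knob that turns 0-270 degrees and a burner
# that ranges from 20 to 300 degrees Celsius.
stove = LinearFunctionTemplate(
    source_part="knob",
    source_state="angle_deg",
    target_part="burner",
    target_state="temperature_c",
    source_range=(0.0, 270.0),
    target_range=(20.0, 300.0),
)

half_turn_temperature = stove.apply(135.0)
generated_code = stove.to_source()
```

Keeping the mapping as data rather than as hand-written code is what would allow the same template to be scored against a ground-truth annotation and translated for different simulators. The string emitted by `to_source` stands in for that compilation step in this sketch.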
