EA3D: Online Open-World 3D Object Extraction from Streaming Videos

Xiaoyu Zhou; Jingqi Wang; Yuang Jia; Yongtao Wang; Deqing Sun; Ming-Hsuan Yang

arXiv:2510.25146·cs.CV·October 30, 2025

TL;DR

EA3D is an online framework that extracts 3D objects and interprets scenes from streaming video in real time, performing geometric reconstruction and semantic analysis within a single pipeline.

Contribution

EA3D combines vision-language and 2D vision encoders with online Gaussian feature updates, so that open-world 3D scene understanding can run directly on streaming video rather than on offline-collected multi-view data.

Findings

01. Effective across diverse benchmarks and tasks

02. Enables real-time 3D reconstruction and semantic understanding

03. Improves downstream 3D scene analysis capabilities

Abstract

Current 3D scene understanding methods are limited by offline-collected multi-view data or pre-constructed 3D geometry. In this paper, we present ExtractAnything3D (EA3D), a unified online framework for open-world 3D object extraction that enables simultaneous geometric reconstruction and holistic scene understanding. Given a streaming video, EA3D dynamically interprets each frame using vision-language and 2D vision foundation encoders to extract object-level knowledge. This knowledge is integrated and embedded into a Gaussian feature map via a feed-forward online update strategy. We then iteratively estimate visual odometry from historical frames and incrementally update online Gaussian features with new observations. A recurrent joint optimization module directs the model's attention to regions of interest, simultaneously enhancing both geometric reconstruction and semantic…
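
The abstract describes incrementally updating per-Gaussian features as new frames arrive. The sketch below illustrates that general idea with a plain running weighted average. It is not the authors' method: EA3D uses a learned feed-forward update, whose details are not given here. The class `OnlineFeatureMap`, its `update` method, and the idea of per-observation confidence weights are all hypothetical names and assumptions introduced for illustration.

```python
import numpy as np


class OnlineFeatureMap:
    """Hypothetical per-Gaussian feature store, updated one frame at a time.

    Assumption: each Gaussian keeps a feature vector plus an accumulated
    weight, and new observations are fused by a running weighted mean.
    """

    def __init__(self, num_gaussians: int, feat_dim: int):
        self.features = np.zeros((num_gaussians, feat_dim), dtype=np.float64)
        self.weights = np.zeros(num_gaussians, dtype=np.float64)

    def update(self, gaussian_ids, frame_features, frame_weights):
        """Fuse one frame's observations into the stored features.

        gaussian_ids:   (K,) indices of the Gaussians observed in this frame
        frame_features: (K, D) feature vectors lifted from 2D encoders
        frame_weights:  (K,) non-negative confidence of each observation
        """
        ids = np.asarray(gaussian_ids)
        feats = np.asarray(frame_features, dtype=np.float64)
        w = np.asarray(frame_weights, dtype=np.float64)

        # Accumulate per-Gaussian sums; np.add.at handles repeated indices.
        weighted_sum = np.zeros_like(self.features)
        weight_sum = np.zeros_like(self.weights)
        np.add.at(weighted_sum, ids, feats * w[:, None])
        np.add.at(weight_sum, ids, w)

        # Only Gaussians that received weight in this frame are changed.
        new_total = self.weights + weight_sum
        touched = weight_sum > 0
        self.features[touched] = (
            self.features[touched] * self.weights[touched, None]
            + weighted_sum[touched]
        ) / new_total[touched, None]
        self.weights = new_total


# Usage: two frames observe Gaussian 0; Gaussian 2 is never observed.
fmap = OnlineFeatureMap(num_gaussians=3, feat_dim=2)
fmap.update([0, 1], [[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
fmap.update([0], [[0.0, 1.0]], [1.0])
```

A running mean keeps memory constant per Gaussian and needs no access to past frames, which is the property an online setting requires. A learned update, as the abstract suggests, would replace the fixed averaging rule with a network.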
