SAGOnline: Segment Any Gaussians Online
Wentao Sun, Quanyun Wu, Hanqing Xu, Kyle Gao, Zhengsen Xu, Yiping Chen, Dedong Zhang, Lingfei Ma, John S. Zelek, Jonathan Li

TL;DR
SAGOnline is a real-time, zero-shot 3D segmentation framework that leverages Gaussian rasterization and foundation models to achieve consistent, efficient, and scene-agnostic 3D scene segmentation without scene-specific training.
Contribution
It introduces a novel Rasterization-aware Geometric Consensus mechanism for deterministic 3D labeling, enabling instant inference and eliminating the need for feature distillation.
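One way to picture deterministic, training-free 3D labeling is as a weighted vote: every pixel of a 2D mask passes its label to the Gaussians that were rasterized into that pixel, and each Gaussian keeps the label with the largest accumulated weight. The sketch below illustrates that general idea only and is not the authors' implementation. The function name `vote_gaussian_labels`, the per-pixel `(gaussian_id, weight)` contribution lists, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import defaultdict


def vote_gaussian_labels(contributions, pixel_labels, num_gaussians):
    """Assign one label per Gaussian by weighted voting (illustrative sketch).

    contributions[p] is a list of (gaussian_id, weight) pairs for pixel p,
    standing in for what a rasterizer could report (an assumption here).
    pixel_labels[p] is the 2D mask label of pixel p.
    Returns a list of labels, with -1 for Gaussians that received no votes.
    """
    votes = [defaultdict(float) for _ in range(num_gaussians)]
    for pixel, contribs in enumerate(contributions):
        label = pixel_labels[pixel]
        for gaussian_id, weight in contribs:
            votes[gaussian_id][label] += weight

    labels = []
    for tally in votes:
        if not tally:
            labels.append(-1)  # never rendered into any labeled pixel
        else:
            # Iterating labels in sorted order makes ties resolve to the
            # smallest label, so the result is deterministic.
            labels.append(max(sorted(tally), key=tally.get))
    return labels


# Three pixels, four Gaussians; Gaussian 3 is never seen.
contributions = [
    [(0, 0.9), (1, 0.1)],
    [(1, 0.8)],
    [(2, 0.7), (1, 0.3)],
]
pixel_labels = [1, 0, 1]
gaussian_labels = vote_gaussian_labels(contributions, pixel_labels, 4)
```

Because the vote is a pure accumulation followed by an argmax, it needs no optimization loop, which is consistent with the claim of instant inference without feature distillation.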
Findings
Achieves state-of-the-art mIoU of 92.7% and 95.2% on the benchmarks evaluated.
Operates at 27 ms per frame, the fastest among comparable methods.
Supports diverse segmentation tasks including prompt, instance, and semantic segmentation.
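For readers unfamiliar with the metric, mIoU (mean intersection over union) can be sketched as below. This is a generic definition over binary masks, not the paper's evaluation code; the flat 0/1 list representation and the convention that two empty masks score 1.0 are assumptions of the example, and how the paper averages over scenes or classes is not stated here.

```python
def iou(pred, target):
    """Intersection over union of two binary masks given as flat 0/1 lists."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(mask_pairs):
    """Average IoU over a non-empty list of (pred, target) mask pairs."""
    return sum(iou(pred, target) for pred, target in mask_pairs) / len(mask_pairs)


pairs = [
    ([1, 1, 0, 0], [1, 0, 1, 0]),  # 1 pixel shared, 3 in the union
    ([1, 1], [1, 1]),              # identical masks
]
score = mean_iou(pairs)
```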
Abstract
3D Gaussian Splatting has emerged as a powerful paradigm for explicit 3D scene representation, yet achieving efficient and consistent 3D segmentation remains challenging. Existing segmentation approaches typically rely on high-dimensional feature lifting, which causes costly optimization, implicit semantics, and task-specific constraints. We present Segment Any Gaussians Online (SAGOnline), a unified, zero-shot framework that achieves real-time, cross-view consistent segmentation without scene-specific training. SAGOnline decouples the monolithic segmentation problem into lightweight sub-tasks. By integrating video foundation models (e.g., SAM 2), we first generate temporally consistent 2D masks across rendered views. Crucially, instead of learning continuous feature fields, we introduce a Rasterization-aware Geometric Consensus mechanism that leverages the…
Taxonomy
Topics: Context-Aware Activity Recognition Systems
