OpenGaFF: Open-Vocabulary Gaussian Feature Field with Codebook Attention

Kunyi Li; Michael Niemeyer; Sen Wang; Stefano Gasperini; Nassir Navab; Federico Tombari

arXiv:2605.06088 · cs.CV · May 19, 2026

TL;DR

OpenGaFF is an open-vocabulary 3D scene understanding framework built on 3D Gaussian Splatting. It models semantics as a continuous function of Gaussian geometry and appearance, which improves the spatial coherence of semantic predictions across a scene.

Contribution

The paper introduces a Gaussian Feature Field, a structured codebook of shared semantic primitives, and a codebook-guided attention mechanism. Together these target object-level semantic consistency in open-vocabulary 3D scene understanding.

Findings

01. Outperforms prior methods on standard benchmarks

02. Achieves better segmentation quality and semantic consistency

03. Provides a semantically interpretable learned representation

Abstract

Understanding open-vocabulary 3D scenes with Gaussian-based representations remains challenging due to fragmented and spatially inconsistent semantic predictions across multi-view observations. In this paper, we present OpenGaFF, a novel framework for open-vocabulary 3D scene understanding built upon 3D Gaussian Splatting. At the core of our method is a Gaussian Feature Field that models semantics as a continuous function of Gaussian geometry and appearance. By explicitly conditioning semantic predictions on geometric structure, this formulation strengthens the coupling between geometry and semantics, leading to improved spatial coherence across similar structures in 3D space. To further enforce object-level semantic consistency, we introduce a structured codebook that serves as a set of shared semantic primitives. Furthermore, a codebook-guided attention mechanism is proposed to…
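
The two ideas named in the abstract, a feature field conditioned on Gaussian attributes and attention over a shared codebook, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-layer MLP, the input layout (3D position followed by RGB colour), the dimensions, the random weights, and the use of plain scaled dot-product attention. The function names `feature_field` and `codebook_attention` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weight, bias):
    """Affine map: weight is a list of rows, each as long as x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def feature_field(gaussian_attrs, params):
    """Hypothetical two-layer MLP mapping per-Gaussian attributes to a feature."""
    w1, b1, w2, b2 = params
    hidden = [max(0.0, h) for h in linear(gaussian_attrs, w1, b1)]  # ReLU
    return linear(hidden, w2, b2)


def codebook_attention(feature, codebook):
    """Scaled dot-product attention of one feature (the query) over codebook entries."""
    scale = math.sqrt(len(feature))
    scores = [sum(f * c for f, c in zip(feature, code)) / scale for code in codebook]
    weights = softmax(scores)
    dim = len(codebook[0])
    # The output is a convex combination of the shared codebook entries.
    mixed = [sum(w * code[d] for w, code in zip(weights, codebook)) for d in range(dim)]
    return mixed, weights


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
IN_DIM, HIDDEN, FEAT_DIM, NUM_CODES = 6, 8, 4, 5  # illustrative sizes only
params = (
    random_matrix(HIDDEN, IN_DIM, rng), [0.0] * HIDDEN,
    random_matrix(FEAT_DIM, HIDDEN, rng), [0.0] * FEAT_DIM,
)
codebook = random_matrix(NUM_CODES, FEAT_DIM, rng)

# One Gaussian: 3D position followed by RGB colour (toy stand-in for geometry and appearance).
gaussian = [0.1, -0.3, 0.8, 0.9, 0.2, 0.2]
feature = feature_field(gaussian, params)
semantic, weights = codebook_attention(feature, codebook)
```

Because the output is a weighted mix of a small set of shared entries, Gaussians whose features attend to the same codebook entries end up with similar semantics in this sketch, which is one plausible reading of how a shared codebook could encourage object-level consistency.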
