DINO-SLAM: DINO-informed RGB-D SLAM for Neural Implicit and Explicit Representations

Ziren Gong; Xiaohan Li; Fabio Tosi; Youmin Zhang; Stefano Mattoccia; Jun Wu; Matteo Poggi

arXiv:2507.19474·cs.CV·July 28, 2025

TL;DR

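DINO-SLAM is a design strategy for RGB-D SLAM that uses a Scene Structure Encoder to turn DINO features into Enhanced DINO (EDINO) features, then integrates them into both neural implicit (NeRF) and explicit (3DGS) scene representations. The authors report that it outperforms state-of-the-art methods on the Replica, ScanNet, and TUM datasets.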

Contribution

The paper proposes EDINO features and two SLAM paradigms integrating them, enhancing neural scene representations with hierarchical structural information.

Findings

01. Outperforms state-of-the-art methods on Replica, ScanNet, and TUM datasets.

02. Enriches scene features with hierarchical structural relationships.

03. Improves neural implicit and explicit SLAM representations.

Abstract

This paper presents DINO-SLAM, a DINO-informed design strategy to enhance neural implicit (Neural Radiance Field, NeRF) and explicit (3D Gaussian Splatting, 3DGS) representations in SLAM systems through more comprehensive scene representations. To this end, we rely on a Scene Structure Encoder (SSE) that enriches DINO features into Enhanced DINO (EDINO) features, which capture hierarchical scene elements and their structural relationships. Building on this encoder, we propose two foundational paradigms for NeRF and 3DGS SLAM systems that integrate EDINO features. Our DINO-informed pipelines outperform state-of-the-art methods on the Replica, ScanNet, and TUM datasets.
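
The abstract does not describe how the SSE is built, so the following is only a toy illustration of the general idea of enriching per-pixel features with coarser, hierarchical context. It is not the paper's Scene Structure Encoder. The function names (`avg_pool`, `upsample_nearest`, `enrich_features`), the pooling scales, and the random array standing in for DINO features are all assumptions made for this sketch.

```python
import numpy as np


def avg_pool(feat: np.ndarray, k: int) -> np.ndarray:
    """Average-pool an (H, W, C) feature map over non-overlapping k x k blocks."""
    h, w, c = feat.shape
    if h % k or w % k:
        raise ValueError("H and W must be divisible by k")
    # Split each spatial axis into (blocks, k) and average within each block.
    return feat.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))


def upsample_nearest(feat: np.ndarray, k: int) -> np.ndarray:
    """Nearest-neighbour upsampling of an (H, W, C) map by an integer factor k."""
    return feat.repeat(k, axis=0).repeat(k, axis=1)


def enrich_features(feat: np.ndarray, scales=(2, 4)) -> np.ndarray:
    """Concatenate per-pixel features with coarser context at each scale.

    Hypothetical stand-in for a learned encoder: each pixel keeps its own
    feature and also receives the mean feature of the k x k region around it.
    """
    levels = [feat]
    for k in scales:
        levels.append(upsample_nearest(avg_pool(feat, k), k))
    return np.concatenate(levels, axis=-1)


# Random values stand in for a DINO feature map (assumption: 8 x 8 grid, 16 channels).
rng = np.random.default_rng(0)
dino_like = rng.standard_normal((8, 8, 16)).astype(np.float32)
enriched = enrich_features(dino_like)
print(enriched.shape)  # (8, 8, 48)
```

In this sketch the first 16 channels are the original features and the remaining channels carry region-level averages at two scales. A learned encoder would replace the fixed pooling with trainable layers, and the paper should be consulted for the actual SSE design.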

Taxonomy

Topics: Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis · Robot Manipulation and Learning