LagMemo: Language 3D Gaussian Splatting Memory for Multi-modal Open-vocabulary Multi-goal Visual Navigation

Haotian Zhou; Xiaole Wang; He Li; Zhuo Qi; Jinrun Yin; Haiyu Kong; Jianghuan Xu; Huijing Zhao

arXiv:2510.24118 · cs.RO · March 10, 2026

TL;DR

LagMemo is a robot navigation system that builds a 3D language memory during a one-time exploration, then queries that memory to predict candidate goal locations and verifies them with local perception, supporting multi-modal, open-vocabulary, multi-goal visual navigation.

Contribution

The paper presents LagMemo, a navigation system built on a language 3D Gaussian Splatting memory for multi-goal visual navigation, and curates GOAT-Core, a core split distilled from GOAT-Bench, for evaluation.

Findings

01 Outperforms state-of-the-art in multi-goal navigation

02 Effective multi-modal open-vocabulary localization

03 Robust spatial-semantic correlation in 3D memory

Abstract

Navigating to a designated goal using visual information is a fundamental capability for intelligent robots. To address the practical demands of multi-modal, open-vocabulary goal queries and multi-goal visual navigation, we propose LagMemo, a navigation system that leverages a language 3D Gaussian Splatting memory. During a one-time exploration, LagMemo constructs a unified 3D language memory with robust spatial-semantic correlations. With incoming task goals, the system efficiently queries the memory, predicts candidate goal locations, and integrates a local perception-based verification mechanism to dynamically match and validate goals. For fair and rigorous evaluation, we curate GOAT-Core, a high-quality core split distilled from GOAT-Bench. Experimental results show that LagMemo's memory module enables effective multi-modal open-vocabulary localization, and significantly outperforms…
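The query-then-verify flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the memory can be reduced to a set of 3D points, each with a language feature vector, and that a goal (text or image) is embedded into the same feature space. The function names `query_memory` and `verify`, the array layout, and the `0.8` threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def query_memory(positions, features, query, top_k=3):
    """Rank memory points by cosine similarity to a goal embedding.

    Hypothetical layout (an assumption, not taken from the paper):
      positions: (N, 3) array of 3D point centers
      features:  (N, D) array of nonzero per-point language features
      query:     (D,) nonzero goal embedding in the same feature space
    Returns the top_k candidate positions and their scores, best first.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = feats @ q                     # cosine similarity per point
    order = np.argsort(-scores)[:top_k]    # indices of highest scores first
    return positions[order], scores[order]


def verify(local_embedding, query, threshold=0.8):
    """Accept a candidate if the local observation matches the goal.

    The threshold is an illustrative value, not one reported in the paper.
    """
    denom = np.linalg.norm(local_embedding) * np.linalg.norm(query)
    return float(local_embedding @ query) / denom >= threshold
```

In this sketch the agent would visit the returned candidates in order and call `verify` on an embedding of what it observes at each one, moving to the next candidate when the check fails.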
