OMCL: Open-vocabulary Monte Carlo Localization

Evgenii Kruzhkov; Raphael Memmesheimer; Sven Behnke

arXiv:2512.15557 · cs.RO · April 3, 2026

TL;DR

OMCL introduces an open-vocabulary Monte Carlo Localization method that leverages vision-language features for robust robot localization across diverse environments and sensor modalities.

Contribution

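OMCL extends Monte Carlo Localization with vision-language features, enabling open-vocabulary localization when the map and the robot's observations come from different sensor modalities, and allowing global localization to be initialized from natural-language descriptions of nearby objects.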

Findings

01

Successfully localizes in indoor and outdoor scenes using vision-language features.

02

Evaluated on Matterport3D and Replica for indoor scenes, with generalization to outdoor scenes demonstrated on SemanticKITTI.

03

Enables initializing global localization from natural-language descriptions of nearby objects.
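One way to picture language-based initialization is sketched below. This is an illustrative assumption, not the authors' implementation: the function `init_particles_from_text`, the `map_elements` layout (a 2D position plus a feature vector per element), and the parameters `top_k` and `radius` are all hypothetical. The text feature is assumed to come from a vision-language text encoder that shares an embedding space with the map features; here it is simply passed in as a vector.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def init_particles_from_text(text_feature, map_elements, n_particles,
                             top_k=3, radius=1.0, rng=None):
    """Hypothetical initializer: scatter particles around the map elements
    whose features best match the text feature.

    map_elements: list of dicts with keys "xy" (x, y) and "feature" (vector).
    Returns a list of (x, y, yaw) tuples.
    """
    rng = rng or random.Random(0)
    # Rank map elements by similarity to the language query.
    ranked = sorted(map_elements,
                    key=lambda e: cosine(text_feature, e["feature"]),
                    reverse=True)
    anchors = ranked[:top_k]
    particles = []
    for _ in range(n_particles):
        ax, ay = rng.choice(anchors)["xy"]
        # Gaussian spread around the anchor; heading is unknown, so uniform.
        particles.append((ax + rng.gauss(0.0, radius),
                          ay + rng.gauss(0.0, radius),
                          rng.uniform(-math.pi, math.pi)))
    return particles


# Toy map: three elements with made-up 2D features.
toy_map = [
    {"xy": (5.0, 5.0), "feature": [1.0, 0.0]},
    {"xy": (-20.0, 0.0), "feature": [0.0, 1.0]},
    {"xy": (0.0, 30.0), "feature": [-1.0, 0.0]},
]
particles = init_particles_from_text([0.9, 0.1], toy_map, n_particles=50,
                                     top_k=1, radius=0.5)
```

Concentrating the initial particle set near matching map elements narrows the search compared with uniform sampling over the whole map; the particle filter's usual update steps would then refine the pose.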

Abstract

Robust robot localization is an important prerequisite for navigation, but it becomes challenging when the map and robot measurements are obtained from different sensors. Prior methods are often tailored to specific environments, relying on closed-set semantics or fine-tuned features. In this work, we extend Monte Carlo Localization with vision-language features, allowing OMCL to robustly compute the likelihood of visual observations given a camera pose and a 3D map created from posed RGB-D images or aligned point clouds. These open-vocabulary features enable us to associate observations and map elements from different modalities, and to natively initialize global localization through natural language descriptions of nearby objects. We evaluate our approach using Matterport3D and Replica for indoor scenes and demonstrate generalization on SemanticKITTI for outdoor scenes.
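The measurement step described in the abstract can be illustrated with a generic particle-filter sketch. The abstract does not specify the likelihood model, so everything here is an assumption: the likelihood is taken to be an exponential of the mean cosine similarity between observed features and the features expected from the map at a particle's pose, and the names `observation_likelihood`, `update_weights`, `systematic_resample`, and the `temperature` parameter are hypothetical. Computing the expected features (rendering the map from a pose) is left to the caller.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def observation_likelihood(observed, expected, temperature=0.1):
    """Assumed likelihood: exp((mean cosine similarity - 1) / temperature).

    observed, expected: lists of feature vectors paired by index,
    e.g. one pair per sampled pixel or ray.
    """
    if not observed:
        return 1.0  # no evidence: leave the particle weight unchanged
    mean_sim = sum(cosine(o, e) for o, e in zip(observed, expected)) / len(observed)
    return math.exp((mean_sim - 1.0) / temperature)


def update_weights(weights, likelihoods):
    """Multiply prior weights by likelihoods and renormalize to sum to 1."""
    unnorm = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnorm)
    if total == 0.0:
        return [1.0 / len(weights)] * len(weights)
    return [u / total for u in unnorm]


def systematic_resample(particles, weights, rng=random):
    """Standard systematic resampling: one random offset, n evenly spaced draws."""
    n = len(particles)
    start = rng.random() / n
    out, i, cum = [], 0, weights[0]
    for k in range(n):
        u = start + k / n
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
    return out


# Toy example: particle "A" predicts the observed features, "B" does not.
observed = [[1.0, 0.0], [0.0, 1.0]]
expected = {"A": [[1.0, 0.0], [0.0, 1.0]], "B": [[0.0, 1.0], [1.0, 0.0]]}
names = ["A", "B"]
likelihoods = [observation_likelihood(observed, expected[p]) for p in names]
weights = update_weights([0.5, 0.5], likelihoods)
resampled = systematic_resample(names, weights, rng=random.Random(0))
```

Because the comparison happens in a shared feature space rather than over a fixed label set, the same update applies whether the map was built from RGB-D images or from point clouds, provided both carry comparable features.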
