TOL: Textual Localization with OpenStreetMap

Youqi Liao; Shuhao Kang; Jingyu Xu; Olaf Wysocki; Yan Xia; Jianping Li; Zhen Dong; Bisheng Yang; Xieyuanli Chen

arXiv:2604.01644·cs.CV·April 28, 2026

TL;DR

This paper introduces TOL, a large-scale benchmark, and TOLoc, a localization framework that estimates urban positions from textual descriptions using OpenStreetMap data, without relying on geometric observations.

Contribution

It formulates the novel Text-to-OSM localization task, creates the TOL benchmark, and proposes TOLoc, a coarse-to-fine framework that leverages semantic and directional information for accurate localization.

Findings

01. TOLoc outperforms existing methods by over 6% at the 5 m accuracy threshold.

02. The benchmark covers 316 km of urban environments across three cities.

03. TOLoc generalizes well to unseen environments.
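
For readers unfamiliar with threshold-based localization metrics, the following is a minimal sketch of how accuracy at a distance threshold is commonly computed: the fraction of queries whose predicted 2D position lies within the threshold of the ground truth. This definition is an assumption, since the summary here does not state the paper's exact metric. The function name `recall_at_threshold` and the sample coordinates are hypothetical and are not taken from the TOL repository.

```python
import math


def recall_at_threshold(predictions, ground_truth, threshold_m=5.0):
    """Return the fraction of queries localized within threshold_m meters.

    predictions and ground_truth are equal-length sequences of (x, y)
    positions in meters, expressed in the same local metric frame.
    This is an assumed, commonly used definition, not the paper's code.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have equal length")
    if not predictions:
        return 0.0

    hits = 0
    for (px, py), (gx, gy) in zip(predictions, ground_truth):
        # Euclidean distance between predicted and true 2D positions.
        if math.hypot(px - gx, py - gy) <= threshold_m:
            hits += 1
    return hits / len(predictions)


# Illustrative (made-up) positions: errors are 0 m, 5 m, 10 m, and about 1.41 m.
preds = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0), (1.0, 1.0)]
truth = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]

print(recall_at_threshold(preds, truth, threshold_m=5.0))
print(recall_at_threshold(preds, truth, threshold_m=10.0))
```

Under this definition, a gain of 6% at 5 m would mean that six more queries out of every hundred are localized within 5 m of their true position.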

Abstract

Natural language provides an intuitive way to express spatial intent in geospatial applications. While existing localization methods often rely on dense point cloud maps or high-resolution imagery, OpenStreetMap (OSM) offers a compact and freely available map representation that encodes rich semantic and structural information, making it well-suited for large-scale localization. However, text-to-OSM (T2O) localization remains largely unexplored. In this paper, we formulate the T2O localization task, which aims to estimate accurate 2D positions in urban environments from textual scene descriptions without relying on geometric observations or GNSS-based initial location. To support the proposed task, we introduce TOL, a large-scale benchmark spanning multiple continents and diverse urban environments. TOL contains approximately 121K textual queries paired with OSM map tiles and covers…

Code & Models

Repositories

WHU-USI3DV/TOL (GitHub)
