RemoteZero: Geospatial Reasoning with Zero Human Annotations

Liang Yao, Fan Liu, Shengxiang Xu, Chuanyi Zhang, Rui Min, Shimin Di, and Yuhui Zheng

arXiv:2605.04451 · cs.CV · May 7, 2026

TL;DR

RemoteZero introduces a self-verifying, annotation-free framework for geospatial reasoning that leverages semantic verification to improve localization without human supervision.

Contribution

RemoteZero presents a box-supervision-free approach that lets geospatial reasoning models self-evolve by using intrinsic semantic verification.

Findings

01. RemoteZero achieves performance competitive with supervised methods.

02. The framework enables iterative self-improvement from unlabeled remote sensing data.

03. Semantic verification effectively replaces geometric supervision.

Abstract

Geospatial reasoning requires models to resolve complex spatial semantics and user intent into precise target locations for Earth observation. Recent progress has liberated the reasoning path from manual curation, allowing models to generate their own inference chains. Yet a final dependency remains: they are still supervised by human-annotated ground-truth coordinates. This leaves the reasoning process autonomous, but not its spatial endpoint, and prevents true self-evolution on abundant unlabeled remote sensing data. To break this bottleneck, we introduce RemoteZero, a box-supervision-free framework for geospatial reasoning. RemoteZero is motivated by a simple asymmetry: an MLLM is typically better at verifying whether a region satisfies a query than at directly generating precise coordinates. Leveraging this stronger discriminative ability, RemoteZero replaces geometric supervision…
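
The verification-over-generation asymmetry described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract is cut off before the method is specified, so everything here is an assumption made for illustration. `toy_verifier`, the label grid standing in for an image, and the candidate boxes are all hypothetical; a real system would presumably ask an MLLM whether a cropped region satisfies the query. The sketch only shows how a verifier score could rank candidate boxes without any ground-truth coordinates.

```python
from typing import Callable, List, Sequence, Tuple

# (row_min, col_min, row_max, col_max); the max bounds are exclusive.
Box = Tuple[int, int, int, int]
Grid = Sequence[Sequence[str]]


def toy_verifier(grid: Grid, box: Box, query: str) -> float:
    """Hypothetical stand-in for an MLLM verifier.

    Returns the fraction of cells inside the box whose label matches the
    query. No ground-truth box is consulted.
    """
    r0, c0, r1, c1 = box
    cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    if not cells:
        return 0.0
    return sum(cell == query for cell in cells) / len(cells)


def verification_rewards(
    grid: Grid,
    candidates: Sequence[Box],
    query: str,
    verifier: Callable[[Grid, Box, str], float] = toy_verifier,
) -> List[float]:
    """Score each candidate box with the verifier instead of IoU against a label."""
    return [verifier(grid, box, query) for box in candidates]


# A 4x4 "scene" where each cell carries a semantic label (assumed toy data).
grid = [
    ["water", "water", "water", "water"],
    ["water", "ship", "ship", "water"],
    ["water", "ship", "ship", "water"],
    ["water", "water", "water", "water"],
]

# Candidate boxes, as if sampled from the model's own coordinate predictions.
candidates = [(0, 0, 2, 2), (1, 1, 3, 3), (0, 0, 4, 4)]

rewards = verification_rewards(grid, candidates, "ship")
best_index = max(range(len(candidates)), key=lambda i: rewards[i])
best_box = candidates[best_index]

print(rewards)   # [0.25, 1.0, 0.25]
print(best_box)  # (1, 1, 3, 3)
```

A purity score like this one would favor very small boxes inside the target, so it is only a placeholder for whatever semantic check the framework actually uses.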
