EVGeoQA: Benchmarking LLMs on Dynamic, Multi-Objective Geo-Spatial Exploration

Jianfei Wu; Zhichun Wang; Zhensheng Wang; Zhiyu He

arXiv:2604.07070 · cs.AI · April 14, 2026

TL;DR

EVGeoQA is a benchmark for evaluating how well large language models perform dynamic, multi-objective geo-spatial exploration in electric vehicle charging scenarios. Models use tools effectively on sub-tasks but struggle with long-range spatial exploration.

Contribution

The paper presents EVGeoQA, a novel geo-spatial benchmark with a dual-objective, location-anchored design, and proposes GeoRover, a framework for assessing LLMs in complex exploration tasks.

Findings

01. LLMs effectively use tools for sub-tasks but struggle with long-range spatial exploration.

02. LLMs can summarize exploration trajectories to improve efficiency.

03. EVGeoQA serves as a challenging testbed for geo-spatial intelligence.

Abstract

While Large Language Models (LLMs) demonstrate remarkable reasoning capabilities, their potential for purpose-driven exploration in dynamic geo-spatial environments remains under-investigated. Existing Geo-Spatial Question Answering (GSQA) benchmarks predominantly focus on static retrieval, failing to capture the complexity of real-world planning that involves dynamic user locations and compound constraints. To bridge this gap, we introduce EVGeoQA, a novel benchmark built upon Electric Vehicle (EV) charging scenarios that features a distinct location-anchored and dual-objective design. Specifically, each query in EVGeoQA is explicitly bound to a user's real-time coordinate and integrates the dual objectives of a charging necessity and a co-located activity preference. To systematically assess models in such complex settings, we further propose GeoRover, a general evaluation framework…
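To make the location-anchored, dual-objective design described in the abstract concrete, here is a minimal Python sketch. It is not the benchmark's actual schema or the GeoRover framework: the `Place` and `Query` types, the `max_walk_km` threshold, the category strings, and the `candidate_stations` helper are all hypothetical names chosen for illustration. It assumes a query carries the user's coordinate plus an activity preference, and that a station satisfies both objectives when a matching venue lies within a short great-circle distance of it.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Place:
    # Hypothetical point-of-interest record (not the benchmark's schema).
    name: str
    lat: float
    lon: float
    category: str  # e.g. "charging_station" or "cafe"


@dataclass(frozen=True)
class Query:
    # Location anchor: the user's coordinate at query time.
    user_lat: float
    user_lon: float
    # Second objective: the activity the user wants near the charger.
    activity: str
    # Assumed threshold for treating a venue as "co-located" with a station.
    max_walk_km: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def candidate_stations(query, places):
    """Stations with a matching venue nearby, nearest to the user first."""
    stations = [p for p in places if p.category == "charging_station"]
    venues = [p for p in places if p.category == query.activity]
    hits = []
    for s in stations:
        # Objective 2: some venue of the requested type is within walking range.
        if any(haversine_km(s.lat, s.lon, v.lat, v.lon) <= query.max_walk_km
               for v in venues):
            # Objective 1 is ranked by distance from the user's anchor point.
            dist = haversine_km(query.user_lat, query.user_lon, s.lat, s.lon)
            hits.append((dist, s))
    hits.sort(key=lambda pair: pair[0])
    return [s for _, s in hits]
```

Because the answer depends on `user_lat` and `user_lon`, the same question text can have different correct answers for different users, which is the sense in which such queries are location-anchored rather than static retrieval.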

Code & Models

Repositories

kg-bnu/EVGeoQA (GitHub)
