LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning

Miho Koda; Yu Zheng; Ruixian Ma; Mingyang Sun; Devesh Pansare; Fabio Duarte; Paolo Santi

arXiv:2506.13841 · cs.AI · April 2, 2026


TL;DR

LocationReasoner is a benchmark that evaluates how well large language models reason about real-world site selection. It shows that current models achieve only limited performance and struggle with complex spatial reasoning tasks.

Contribution

The paper introduces LocationReasoner, a benchmark that pairs site selection queries of varying difficulty with a sandbox of in-house tools and verification, used to assess LLMs' reasoning over real-world spatial, environmental, and logistic constraints.

Findings

01. State-of-the-art reasoning models show limited improvement over their non-reasoning predecessors.

02. The OpenAI o4 model fails on 30% of site selection tasks.

03. Agentic strategies such as ReAct can worsen outcomes because of over-reasoning.

Abstract

Recent advances in large language models (LLMs), particularly those enhanced through reinforced post-training, have demonstrated impressive reasoning capabilities, as exemplified by models such as OpenAI o1 and DeepSeek-R1. However, these capabilities are predominantly benchmarked on domains like mathematical problem solving and code generation, leaving open the question of whether such reasoning skills generalize to complex real-world scenarios. In this paper, we introduce LocationReasoner, a benchmark designed to evaluate LLMs' reasoning abilities in the context of real-world site selection, where models must identify feasible locations by reasoning over diverse and complicated spatial, environmental, and logistic constraints. The benchmark covers carefully crafted queries of varying difficulty levels and is supported by a sandbox environment with in-house tools for constraint-based…
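
To make the task in the abstract concrete, the following is a minimal sketch of constraint-based site filtering: a location is feasible only if it satisfies every constraint in a query. Everything here is hypothetical and for illustration only. The `Site` fields, the thresholds, and the `feasible_sites` helper are assumptions, not the benchmark's actual schema, tools, or data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Site:
    """Hypothetical candidate location; the fields are illustrative assumptions."""
    name: str
    population_within_1km: int    # nearby population (spatial constraint)
    distance_to_transit_m: float  # distance to nearest transit stop (logistic constraint)
    in_flood_zone: bool           # flood exposure (environmental constraint)


# A constraint is any predicate that accepts or rejects a single site.
Constraint = Callable[[Site], bool]


def feasible_sites(sites: List[Site], constraints: List[Constraint]) -> List[Site]:
    """Return the sites that satisfy every constraint, preserving input order."""
    return [s for s in sites if all(c(s) for c in constraints)]


# Toy data, invented for this sketch.
CANDIDATES = [
    Site("A", 12000, 250.0, False),
    Site("B", 3000, 100.0, False),   # rejected: too few residents nearby
    Site("C", 15000, 900.0, False),  # rejected: too far from transit
    Site("D", 20000, 150.0, True),   # rejected: inside a flood zone
]

# One query expressed as a conjunction of three constraints (thresholds are assumed).
CONSTRAINTS: List[Constraint] = [
    lambda s: s.population_within_1km >= 10000,
    lambda s: s.distance_to_transit_m <= 500.0,
    lambda s: not s.in_flood_zone,
]

if __name__ == "__main__":
    print([s.name for s in feasible_sites(CANDIDATES, CONSTRAINTS)])
```

Expressing each constraint as a separate predicate keeps a query easy to extend. A harder query in this toy setting simply adds more predicates to the list, which hints at why errors can compound as constraints accumulate.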
