From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning

Chalamalasetti Kranti; Sherzod Hakimov; David Schlangen

arXiv:2505.14425·cs.CL·August 19, 2025

Open Access

TL;DR

This paper investigates the challenges large language models face when generalizing from synthetic to human instructions in spatial reasoning tasks, highlighting performance gaps and analyzing error sources.

Contribution

The paper analyzes generalization failures in LLMs fine-tuned only on synthetic instructions for spatial grounding tasks, and evaluates them on a benchmark containing both synthetic and human-written instructions.

Findings

01. Models perform well on simple tasks but struggle with complex instructions.
02. Performance drops significantly when generalizing to human-authored instructions.
03. Error analysis reveals specific gaps in instruction understanding.

Abstract

Instruction-tuned large language models (LLMs) have shown strong performance on a variety of tasks; however, generalizing from synthetic to human-authored instructions in grounded environments remains a challenge for them. In this work, we study generalization challenges in spatial grounding tasks where models interpret and translate instructions for building object arrangements on a 2.5D grid. We fine-tune LLMs using only synthetic instructions and evaluate their performance on a benchmark dataset containing both synthetic and human-written instructions. Our results reveal that while models generalize well on simple tasks, their performance degrades significantly on more complex tasks. We present a detailed error analysis of the gaps in instruction generalization.

Taxonomy

Topics: Semantic Web and Ontologies · AI-based Problem Solving and Planning · Natural Language Processing Techniques