LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs

Yanan Cai; Ahmed Salem; Besmira Nushi; Mark Russinovich

arXiv:2506.10527 · cs.AI · June 13, 2025

TL;DR

LogiPlan is a comprehensive benchmark for evaluating large language models' abilities in logical planning and relational reasoning, highlighting current limitations and the impact of model scale and architecture.

Contribution

We introduce LogiPlan, a structured benchmark with diverse tasks and complexity control to assess LLMs' logical reasoning and planning capabilities.

Findings

01. Performance improves with model size and architecture.
02. Models struggle with complex relational structures.
03. Reasoning-enhanced models perform better on simpler tasks.

Abstract

We introduce LogiPlan, a novel benchmark designed to evaluate the capabilities of large language models (LLMs) in logical planning and reasoning over complex relational structures. Logical relational reasoning is important for applications that may rely on LLMs to generate and query structured graphs of relations such as network infrastructure, knowledge bases, or business process schema. Our framework allows for dynamic variation of task complexity by controlling the number of objects, relations, and the minimum depth of relational chains, providing a fine-grained assessment of model performance across difficulty levels. LogiPlan encompasses three complementary tasks: (1) Plan Generation, where models must construct valid directed relational graphs meeting specified structural constraints; (2) Consistency Detection, testing models' ability to identify inconsistencies in relational…
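To make the Plan Generation setup concrete, here is a minimal sketch of how a generated relational graph could be checked against the three complexity controls the abstract names (number of objects, number of relations, minimum depth of relational chains). It is an illustration under assumptions, not the benchmark's actual evaluator: the function name `check_plan`, the encoding of each relation as an `(a, b)` pair read as a strict ordering "a > b", and the reading of "inconsistent" as "contains a cycle" are all hypothetical choices made for this example.

```python
from collections import defaultdict


def check_plan(edges, num_objects, num_relations, min_depth):
    """Hypothetical validator for a directed relational graph.

    edges: list of (a, b) tuples, each read as the strict relation 'a > b'
           (an assumed encoding, not taken from the paper).
    Returns True if the graph uses exactly num_objects objects and
    num_relations distinct relations, contains no cycle, and has at least
    one chain with min_depth or more edges.
    """
    nodes = {name for edge in edges for name in edge}
    if len(nodes) != num_objects or len(set(edges)) != num_relations:
        return False

    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)

    state = {}  # node -> "visiting" or "done"
    depth = {}  # node -> edges on the longest chain starting at that node

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # back edge: a cycle, so the relations contradict
        state[node] = "visiting"
        best = 0
        for nxt in graph[node]:
            if not visit(nxt):
                return False
            best = max(best, depth[nxt] + 1)
        depth[node] = best
        state[node] = "done"
        return True

    for node in nodes:
        if not visit(node):
            return False
    return max(depth.values(), default=0) >= min_depth


# Usage: three objects, three relations, longest chain A > B > C (2 edges).
plan = [("A", "B"), ("B", "C"), ("A", "C")]
print(check_plan(plan, num_objects=3, num_relations=3, min_depth=2))
```

The same depth-first pass covers both structural validity and, under the cycle assumption, a simple form of consistency checking: a cyclic set such as A > B, B > C, C > A is rejected because a strict ordering cannot contain a loop.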

Taxonomy

Topics: Multimodal Machine Learning Applications · AI-based Problem Solving and Planning · Advanced Graph Neural Networks

Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · GPT-4 · LLaMA