Can LLMs Reason Structurally? Benchmarking via the Lens of Data Structures

Yu He; Yingxi Li; Colin White; Ellen Vitercik

arXiv:2505.24069 · cs.LG · February 12, 2026

TL;DR

This paper introduces DSR-Bench, a comprehensive benchmark using data structures to evaluate the structural reasoning abilities of large language models, revealing significant limitations in their algorithmic reasoning skills.

Contribution

The paper presents DSR-Bench, a novel diagnostic benchmark with automated generation for assessing LLMs' understanding of data structures and their reasoning capabilities.

Findings

01. The top-performing LLM scores only 0.46/1 on challenging instances

02. Models perform poorly on spatial and context-rich data

03. Models struggle to reason over their own code

Abstract

Large language models (LLMs) are deployed on increasingly complex tasks that require multi-step decision-making. Understanding their algorithmic reasoning abilities is therefore crucial. However, we lack a diagnostic benchmark for evaluating this capability. We propose data structures as a principled lens: as fundamental building blocks of algorithms, they naturally probe structural reasoning, that is, the ability to understand and manipulate relationships such as order, hierarchy, and connectivity that underpin algorithmic reasoning. We introduce DSR-Bench, spanning 20 data structures, 35 operations, and 4,140 problem instances. DSR-Bench features hierarchical task organization, fully automated generation and evaluation, and fine-grained diagnostics. Evaluating 13 state-of-the-art LLMs reveals critical limitations: the top-performing model achieves only 0.46/1 on challenging instances. Three…
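
To make "fully automated generation and evaluation" concrete, here is a minimal sketch of how one data-structure task could be generated and scored. It is an illustration only, not the DSR-Bench implementation: the function names `generate_queue_instance` and `score`, the prompt wording, the 0.4 dequeue probability, and the exact-match scoring rule are all assumptions made for this example.

```python
import random
from collections import deque


def generate_queue_instance(num_ops, seed):
    """Build one hypothetical task: a prompt listing queue operations,
    plus the ground-truth final state computed by actually running them."""
    rng = random.Random(seed)  # seeded so the same instance can be regenerated
    queue = deque()
    ops = []
    for _ in range(num_ops):
        # Dequeue only when the queue is non-empty, so every operation is valid.
        if queue and rng.random() < 0.4:
            queue.popleft()
            ops.append("dequeue")
        else:
            value = rng.randint(0, 99)
            queue.append(value)
            ops.append(f"enqueue {value}")
    prompt = (
        "Start with an empty FIFO queue. Apply these operations in order: "
        + ", ".join(ops)
        + ". Report the final queue from front to back."
    )
    return prompt, list(queue)


def score(predicted, expected):
    """Assumed scoring rule: 1.0 for an exact match of the final state, else 0.0."""
    return 1.0 if predicted == expected else 0.0


if __name__ == "__main__":
    prompt, expected = generate_queue_instance(num_ops=8, seed=0)
    print(prompt)
    # A model's parsed answer would be passed as `predicted`; here we check the oracle.
    print(score(expected, expected))
```

Because the ground truth comes from executing the operations on a real queue, no manual labeling is needed, and averaging per-instance scores yields a value between 0 and 1, the same range as the 0.46/1 figure reported above.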

Code & Models

Repositories

dransyhe/dsr-bench (official)

Taxonomy

Topics: Semantic Web and Ontologies

Methods: Focus