Improving LLMs' Generalized Reasoning Abilities by Graph Problems

Qifan Zhang; Nuo Chen; Zehua Li; Miao Peng; Jing Tang; Jia Li

arXiv:2507.17168·cs.AI·July 24, 2025

TL;DR

This paper introduces GraphPile, a large-scale graph problem reasoning dataset for continued pretraining, and reports that training on it improves LLM reasoning across diverse tasks, with gains of up to 4.9% on mathematical reasoning and up to 21.2% on non-mathematical reasoning.

Contribution

The paper pioneers the use of Graph Problem Reasoning (GPR) for continued pretraining (CPT) of LLMs and provides the first large-scale GPR corpus designed for that purpose, with the goal of enhancing general reasoning capabilities.

Findings

01. Up to 4.9% higher accuracy in mathematical reasoning.

02. Up to 21.2% improvement in non-mathematical reasoning.

03. First to leverage GPR for LLM reasoning enhancement.

Abstract

Large Language Models (LLMs) have made remarkable strides in reasoning tasks, yet their performance often falters on novel and complex problems. Domain-specific continued pretraining (CPT) methods, such as those tailored for mathematical reasoning, have shown promise but lack transferability to broader reasoning tasks. In this work, we pioneer the use of Graph Problem Reasoning (GPR) to enhance the general reasoning capabilities of LLMs. GPR tasks, spanning pathfinding, network analysis, numerical computation, and topological reasoning, require sophisticated logical and relational reasoning, making them ideal for teaching diverse reasoning patterns. To achieve this, we introduce GraphPile, the first large-scale corpus specifically designed for CPT using GPR data. Spanning 10.9 billion tokens across 23 graph tasks, the dataset includes chain-of-thought, program-of-thought, trace of…
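As a rough illustration of the kind of pathfinding task the abstract mentions, paired with a program-of-thought style answer, the sketch below answers an unweighted shortest-path question using breadth-first search. The example graph, the question, and the function name `shortest_path` are assumptions made for illustration only; none of them are taken from GraphPile or the paper.

```python
from collections import deque

# Hypothetical task: "Given an undirected graph with the edges below,
# find a shortest path from node 0 to node 5."
# This is an illustrative sample, not an actual GraphPile record.

def shortest_path(edges, source, target):
    """Return one shortest path in an undirected, unweighted graph, or None."""
    # Build an adjacency list from the edge list.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)

    if source == target:
        return [source]

    # Breadth-first search, remembering each node's predecessor.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                if neighbor == target:
                    # Walk predecessors back to the source, then reverse.
                    path = [target]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None  # target is unreachable from source


edges = [(0, 1), (1, 2), (0, 3), (3, 4), (4, 2), (2, 5)]
print(shortest_path(edges, 0, 5))  # prints [0, 1, 2, 5]
```

A chain-of-thought sample for the same question would spell out the search in prose instead of code; the program form simply makes each reasoning step executable and checkable.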

Taxonomy

Topics: Artificial Intelligence in Law