Teaching LLM to Reason: Reinforcement Learning from Algorithmic Problems without Code

Keqin Bao; Nuo Chen; Xiaoyuan Li; Binyuan Hui; Bowen Yu; Fuli Feng; Xiangnan He; Dayiheng Liu

arXiv:2507.07498 · cs.CL · July 15, 2025

Open Access

TL;DR

This paper introduces TeaR, which combines careful data curation with reinforcement learning to improve the general reasoning ability of large language models through code-related tasks. Reported gains include 35.9% on Qwen2.5-7B and 5.9% on R1-Distilled-7B.

Contribution

TeaR is a novel method that combines data curation and reinforcement learning to teach LLMs better reasoning skills without overfitting to complex algorithmic patterns.

Findings

01. Significant performance improvements across 17 benchmarks.

02. 35.9% improvement on the Qwen2.5-7B model.

03. 5.9% improvement on the R1-Distilled-7B model.

Abstract

Enhancing reasoning capabilities remains a central focus in the LLM research community. A promising direction involves requiring models to simulate code execution step-by-step to derive outputs for given inputs. However, because code is often designed for large-scale systems, applying it directly leads to over-reliance on complex data structures and algorithms, even for simple cases, and the model overfits to algorithmic patterns rather than learning core reasoning structures. To address this, we propose TeaR, which aims to teach LLMs to reason better. TeaR leverages careful data curation and reinforcement learning to guide models in discovering optimal reasoning paths through code-related tasks, thereby improving general reasoning abilities. We conduct extensive experiments using two base models and three long-CoT distillation models, with model sizes ranging from 1.5 billion to 32 billion…
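
To make the output-prediction setup mentioned in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not TeaR's actual pipeline: the task (`reference_program`), the exact-match scoring rule, and the function name `exact_match_reward` are all hypothetical. The idea shown is that the expected output for an input can be obtained by running a reference program, so a model's step-by-step prediction can be checked automatically and turned into a reward signal for reinforcement learning.

```python
from typing import Any, Callable


def reference_program(nums: list[int]) -> int:
    """Hypothetical task: return the sum of the even numbers in `nums`."""
    total = 0
    for n in nums:
        if n % 2 == 0:
            total += n
    return total


def exact_match_reward(
    predicted: str,
    program: Callable[[Any], Any],
    task_input: Any,
) -> float:
    """Return 1.0 if the model's predicted output matches the real output.

    Assumption: the reward is a simple exact match on the printed result.
    The source does not specify the reward used by TeaR.
    """
    expected = repr(program(task_input))
    return 1.0 if predicted.strip() == expected else 0.0


if __name__ == "__main__":
    task_input = [1, 2, 4, 6, 7]
    # A model that traced the loop correctly would answer "12" (2 + 4 + 6).
    print(exact_match_reward("12", reference_program, task_input))
    print(exact_match_reward("13", reference_program, task_input))
```

Because the ground truth comes from execution rather than human labels, a check of this kind is cheap to run on every sampled response, which is one reason execution-style tasks pair naturally with reinforcement learning.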

Taxonomy

Topics: Artificial Intelligence in Law · Law, AI, and Intellectual Property