Decomposing Elements of Problem Solving: What "Math" Does RL Teach?

Tian Qin; Core Francisco Park; Mujin Kwun; Aaron Walsman; Eran Malach; Nikhil Anand; Hidenori Tanaka; David Alvarez-Melis

arXiv:2505.22756 · cs.AI · May 30, 2025

TL;DR

This paper investigates how reinforcement learning improves mathematical reasoning in large language models by decomposing problem-solving into planning, execution, and verification, revealing RL's strengths and limitations in skill development.

Contribution

The paper introduces a framework that decomposes problem solving into distinct skills, and uses empirical and synthetic experiments to show that RL improves execution robustness while leaving planning largely unimproved.

Findings

1. RL enhances execution robustness, but RL-trained models still struggle with new problems.
2. Models mainly improve execution, not planning skills.
3. Synthetic tasks confirm RL's role and limitations in exploration.

Abstract

Mathematical reasoning tasks have become prominent benchmarks for assessing the reasoning capabilities of LLMs, especially with reinforcement learning (RL) methods such as GRPO showing significant performance gains. However, accuracy metrics alone do not support fine-grained assessment of capabilities and fail to reveal which problem-solving skills have been internalized. To better understand these capabilities, we propose to decompose problem solving into fundamental capabilities: Plan (mapping questions to sequences of steps), Execute (correctly performing solution steps), and Verify (identifying the correctness of a solution). Empirically, we find that GRPO mainly enhances the execution skill, improving execution robustness on problems the model already knows how to solve, a phenomenon we call temperature distillation. More importantly, we show that RL-trained models struggle with…
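
One way to read "execution robustness" in the abstract is as the stability of a model's accuracy when the sampling temperature rises. The sketch below is a hypothetical illustration of that reading, not the paper's evaluation code: the function names, the `(temperature, is_correct)` record format, and the toy data are all assumptions made here for clarity.

```python
from collections import defaultdict


def accuracy_by_temperature(samples):
    """Return {temperature: fraction of correct samples}.

    `samples` is an iterable of (temperature, is_correct) pairs
    (a hypothetical record format, assumed for this sketch).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for temperature, is_correct in samples:
        totals[temperature] += 1
        correct[temperature] += int(is_correct)
    return {t: correct[t] / totals[t] for t in sorted(totals)}


def robustness_gap(accuracy):
    """Accuracy at the lowest temperature minus accuracy at the highest.

    A smaller gap means accuracy degrades less as sampling gets noisier.
    """
    temperatures = sorted(accuracy)
    return accuracy[temperatures[0]] - accuracy[temperatures[-1]]


# Made-up toy data, for illustration only (not results from the paper).
base_samples = [(0.0, True), (0.0, True), (1.0, True), (1.0, False)]
rl_samples = [(0.0, True), (0.0, True), (1.0, True), (1.0, True)]

base_gap = robustness_gap(accuracy_by_temperature(base_samples))
rl_gap = robustness_gap(accuracy_by_temperature(rl_samples))
```

Under this reading, a model whose gap shrinks after RL while its low-temperature accuracy stays the same has become more robust at executing solutions it could already produce, rather than able to solve new problems.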

Code & Models

Repositories

cfpark00/rl-wall (PyTorch, official)

Taxonomy

Topics: Cognitive and developmental aspects of mathematical skills · Teaching and Learning Programming · Visual and Cognitive Learning Processes