SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topology

Yifan Wang; Bolian Li; David Cho; Ruqi Zhang; Fanping Sui; Ananth Grama

arXiv:2603.27977 · cs.AI · May 12, 2026

TL;DR

SARL is a label-free reinforcement learning framework that improves reasoning models by rewarding the structure of their reasoning paths rather than their final answers, with reported gains on math and open-ended tasks.

Contribution

The paper introduces SARL, which rewards reasoning topology rather than outcomes. It reports that SARL outperforms prior label-free methods and even some supervised approaches.

Findings

01. SARL outperforms prior label-free RL baselines on math tasks.

02. SARL exceeds some supervised RL methods that train with ground-truth supervision.

03. SARL achieves significant improvements on open-ended reasoning tasks.

Abstract

Reinforcement learning is critical to improving large reasoning models, but its success relies heavily on verifiable rewards (RLVR), making it hard to use in open-ended domains where correctness is ambiguous and cannot be verified. Moreover, reasoning trajectories remain largely unconstrained, and optimizing solely toward the final answer can favor early exploitation over generalization. In this work, we ask whether general reasoning ability can be improved by teaching models how to think (the structure of reasoning) rather than what to produce (the outcome of reasoning), and we extend traditional RLVR to open-ended settings. We introduce Structure-Aware Reinforcement Learning (SARL), a label-free framework that constructs per-response reasoning maps from intermediate thinking steps and rewards their reasoning topology. SARL shifts supervision from destination to path, encouraging…
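To make the idea of a per-response reasoning map concrete, here is a minimal sketch in Python. It is not the authors' method: the truncated abstract does not say how steps are segmented, how the map is built, or which topological quantity is rewarded. Everything below is an assumption chosen for illustration, including the line-based step splitting, the token-overlap edges, the `threshold` parameter, and the use of the mean local clustering coefficient as the score. All function names are hypothetical.

```python
from itertools import combinations


def split_steps(response: str) -> list[str]:
    """Split a response into non-empty steps, one per line (an assumption)."""
    return [line.strip() for line in response.splitlines() if line.strip()]


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two steps."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_reasoning_map(steps: list[str], threshold: float = 0.3) -> dict[int, set[int]]:
    """Build an undirected graph over steps as adjacency sets.

    Consecutive steps are always linked; other pairs are linked when their
    token overlap reaches `threshold`. Both rules are illustrative assumptions.
    """
    adj: dict[int, set[int]] = {i: set() for i in range(len(steps))}
    for i, j in combinations(range(len(steps)), 2):
        if j == i + 1 or jaccard(steps[i], steps[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def topology_reward(adj: dict[int, set[int]]) -> float:
    """Toy structural score: mean local clustering coefficient of the graph.

    This stands in for whatever topological reward the paper defines.
    """
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    response = "Let x be the unknown.\nThen 2 x = 10.\nSo x = 5."
    graph = build_reasoning_map(split_steps(response))
    print(graph, topology_reward(graph))
```

The point of the sketch is only that the score depends on how steps connect to one another, not on whether the final answer is correct, which is what lets a reward of this kind be computed without labels.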

Code & Models

Repositories

cacayaya/SARL (GitHub)
