REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Zafir Stojanovski, Oliver Stanley, Joe Sharratt, Richard Jones, Abdulhakeem Adefioye, Jean Kaddour, Andreas Köpf

TL;DR
Reasoning Gym (RG) is a library of over 100 reasoning environments with verifiable rewards. Its procedurally generated tasks have adjustable complexity, supporting both reinforcement learning and evaluation of reasoning models across diverse domains.
Contribution
The paper introduces Reasoning Gym, a novel library that provides procedurally generated reasoning environments with verifiable rewards for reinforcement learning research.
Findings
RG effectively evaluates reasoning models across multiple domains.
Procedural generation yields virtually unlimited training data with adjustable complexity.
Experimental results demonstrate RG's efficacy for both evaluating reasoning models and training them with reinforcement learning.
Abstract
We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains including algebra, arithmetic, computation, cognition, geometry, graph theory, logic, and various common games. Its key innovation is the ability to generate virtually infinite training data with adjustable complexity, unlike most previous reasoning datasets, which are typically fixed. This procedural generation approach allows for continuous evaluation across varying difficulty levels. Our experimental results demonstrate the efficacy of RG for both evaluating reasoning models and training them with reinforcement learning.
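To make the pairing of a data generator with a verifier concrete, here is a minimal sketch of the idea in Python. It is not RG's actual API: the names `Entry`, `generate_entry`, `score_answer`, and the `num_terms` difficulty knob are hypothetical, chosen only for illustration. The sketch assumes a toy arithmetic task in which a seed determines the problem, one parameter controls complexity, and the reward is an exact-match check against the known answer.

```python
import random
from dataclasses import dataclass


@dataclass
class Entry:
    """One generated problem together with its ground-truth answer."""
    question: str
    answer: str


def generate_entry(seed: int, num_terms: int) -> Entry:
    """Build one addition/subtraction problem (hypothetical generator).

    num_terms is the difficulty knob: more terms give a longer expression.
    """
    rng = random.Random(seed)  # same seed and difficulty -> same problem
    terms = [rng.randint(1, 99) for _ in range(num_terms)]
    ops = [rng.choice("+-") for _ in range(num_terms - 1)]

    expr = str(terms[0])
    total = terms[0]
    for op, term in zip(ops, terms[1:]):
        expr += f" {op} {term}"
        total = total + term if op == "+" else total - term
    return Entry(question=f"Compute {expr}.", answer=str(total))


def score_answer(model_output: str, entry: Entry) -> float:
    """Verifiable reward: 1.0 on exact match after stripping whitespace, else 0.0."""
    return 1.0 if model_output.strip() == entry.answer else 0.0


if __name__ == "__main__":
    # Sweep the difficulty knob; the ground-truth answer always earns reward 1.0.
    for difficulty in (2, 4, 8):
        entry = generate_entry(seed=0, num_terms=difficulty)
        print(difficulty, entry.question, "reward:", score_answer(entry.answer, entry))
```

Because every entry is a function of the seed and the difficulty setting, new problems can be drawn indefinitely and the same problem can be regenerated for a reproducible evaluation at any chosen difficulty level.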
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Lib
