Towards Effective Code-Integrated Reasoning

Fei Bai; Yingqian Min; Beichen Zhang; Zhipeng Chen; Wayne Xin Zhao; Lei Fang; Zheng Liu; Zhongyuan Wang; Ji-Rong Wen

arXiv:2505.24480·cs.CL·June 2, 2025

TL;DR

This paper presents a systematic approach to improving the training stability and effectiveness of tool-augmented reinforcement learning (RL) for code-integrated reasoning, and reports significant performance improvements on five mathematical reasoning benchmarks.

Contribution

The paper introduces improved training strategies that balance exploration and stability, enabling models to better learn when and how to use code tools for reasoning.

Findings

01. Significant performance gains on five mathematical reasoning benchmarks.

02. Enhanced training strategies improve stability and exploration.

03. Deeper understanding of code-integrated reasoning mechanisms.

Abstract

In this paper, we investigate code-integrated reasoning, where models generate code when necessary and integrate feedback by executing it through a code interpreter. To acquire this capability, models must learn when and how to use external code tools effectively, which is supported by tool-augmented reinforcement learning (RL) through interactive learning. Despite its benefits, tool-augmented RL can still suffer from potential instability in the learning dynamics. In light of this challenge, we present a systematic approach to improving the training effectiveness and stability of tool-augmented RL for code-integrated reasoning. Specifically, we develop enhanced training strategies that balance exploration and stability, progressively building tool-use capabilities while improving reasoning performance. Through extensive experiments on five mainstream mathematical reasoning benchmarks,…
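The generate, execute, integrate loop described in the abstract can be sketched as follows. This is a minimal illustration of the inference-time interaction only, not the authors' implementation and not the RL training procedure. The `<code>` and `<output>` tags, the function names, and the `scripted_model` stand-in for a language model are all hypothetical choices made for this example.

```python
import contextlib
import io
import re

# Hypothetical tag format for tool calls; the paper's actual format may differ.
CODE_PATTERN = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_code(source: str) -> str:
    """Execute a snippet and return its captured stdout, or the error text."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})  # NOTE: no sandboxing; for illustration only
    except Exception as exc:  # errors are fed back so the model can react
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def code_integrated_reasoning(generate, question: str, max_turns: int = 4) -> str:
    """Alternate between generation and execution until no code is emitted."""
    context = question
    for _ in range(max_turns):
        step = generate(context)
        context += "\n" + step
        match = CODE_PATTERN.search(step)
        if match is None:  # the model answered without calling the tool
            break
        feedback = run_code(match.group(1))
        context += f"\n<output>{feedback}</output>"
    return context


def scripted_model(context: str) -> str:
    """Hypothetical stand-in for a language model (two fixed turns)."""
    if "<output>" not in context:
        return "I will compute this with code. <code>print(sum(range(1, 101)))</code>"
    result = re.findall(r"<output>(.*?)</output>", context, re.DOTALL)[-1]
    return f"The answer is {result}."


transcript = code_integrated_reasoning(scripted_model, "What is 1 + 2 + ... + 100?")
print(transcript)
```

In this sketch the model decides when to call the tool simply by emitting (or not emitting) a code block, and the interpreter output is appended to the context before the next generation step. A real system would replace `scripted_model` with an actual model call and run the code in an isolated sandbox.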

Code & Models

Repositories

rucaibox/cir (PyTorch, official)

Videos

Towards Effective Code-Integrated Reasoning · Underline

Taxonomy

Topics: Natural Language Processing Techniques