OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Yuxiang Zhang; Yuqi Yang; Jiangming Shu; Yuhang Wang; Jinlin Xiao; Jitao Sang

arXiv:2412.16849 · cs.AI · December 24, 2024

TL;DR

OpenRFT adapts generalist reasoning foundation models to domain-specific tasks with reinforcement fine-tuning. It addresses the lack of reasoning-step data and the scarcity of training samples, and improves performance with only 100 domain-specific samples per task.

Contribution

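This paper introduces OpenRFT, a method for adapting generalist reasoning models to domain-specific tasks through reinforcement fine-tuning. It makes the most of a small set of domain samples through question augmentation, synthesized reasoning-process data, and few-shot ICL.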

Findings

01  OpenRFT achieves notable performance gains on SciKnowEval.

02  These gains are obtained with only 100 domain-specific samples per task.

03  The results demonstrate the potential of reinforcement fine-tuning for domain adaptation.

Abstract

OpenAI's recent introduction of Reinforcement Fine-Tuning (RFT) showcases the potential of reasoning foundation models and offers a new paradigm for fine-tuning beyond simple pattern imitation. This technical report presents OpenRFT, our attempt to fine-tune generalist reasoning models for domain-specific tasks under the same settings as RFT. OpenRFT addresses two key challenges, the lack of reasoning-step data and the limited quantity of training samples, by leveraging the domain-specific samples in three ways: question augmentation, synthesizing reasoning-process data, and few-shot ICL. The evaluation is conducted on SciKnowEval, where OpenRFT achieves notable performance gains with only 100 domain-specific samples for each task. More experimental results will be updated continuously in later versions. Source code, datasets, and models are disclosed at:…
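To make two of the three uses of domain samples more concrete, here is a minimal sketch of question augmentation and few-shot ICL prompt construction. It is an illustration only, not the authors' implementation: the abstract does not say how questions are augmented or how prompts are formatted, so the multiple-choice setting, the option-shuffling strategy, the prompt layout, and every name below (MCQ, shuffle_options, format_mcq, build_few_shot_prompt) are assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MCQ:
    """Hypothetical container for one multiple-choice domain sample."""
    question: str
    options: tuple[str, ...]
    answer_index: int  # position of the correct option in `options`


def shuffle_options(sample: MCQ, rng: random.Random) -> MCQ:
    """Assumed augmentation: permute the options and remap the answer index."""
    order = list(range(len(sample.options)))
    rng.shuffle(order)
    new_options = tuple(sample.options[i] for i in order)
    # The correct option moved to wherever its old index landed in `order`.
    return MCQ(sample.question, new_options, order.index(sample.answer_index))


def format_mcq(sample: MCQ, with_answer: bool) -> str:
    """Render one sample as text; the layout is an assumption."""
    letters = "ABCDEFGH"
    lines = [f"Question: {sample.question}"]
    lines += [f"{letters[i]}. {opt}" for i, opt in enumerate(sample.options)]
    lines.append(f"Answer: {letters[sample.answer_index]}" if with_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(demos: list[MCQ], query: MCQ) -> str:
    """Few-shot ICL: answered demonstrations followed by the unanswered query."""
    blocks = [format_mcq(d, with_answer=True) for d in demos]
    blocks.append(format_mcq(query, with_answer=False))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    sample = MCQ("What is the chemical symbol for sodium?", ("K", "Na", "S", "N"), 1)
    augmented = shuffle_options(sample, random.Random(0))
    print(build_few_shot_prompt([sample], augmented))
```

Under these assumptions, shuffling multiplies the number of distinct training questions without changing the correct answer, and the few-shot prompt lets a handful of the 100 domain samples serve as demonstrations for each query.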

Code & Models

Repositories

adam-bjtu/openrft
PyTorch · Official

Taxonomy

Topics: Advanced Software Engineering Methodologies · Reinforcement Learning in Robotics · Software Engineering Research