Reasoning with Reinforced Functional Token Tuning

Kongcheng Zhang; Qi Yao; Baisheng Lai; Jiaxing Huang; Wenkai Fang; Dacheng Tao; Mingli Song; Shunyu Liu

arXiv:2502.13389 · cs.AI · February 20, 2025

TL;DR

This paper introduces Reinforced Functional Token Tuning (RFTT), a novel framework that enhances large language models' reasoning abilities by embedding learnable functional tokens and using reinforcement learning to improve reasoning pathways, achieving state-of-the-art results on math benchmarks.

Contribution

The paper presents a new reinforcement learning-based fine-tuning method that incorporates functional tokens directly into LLMs for improved reasoning capabilities, surpassing previous prompt-driven approaches.

Findings

01. Significant performance improvements on math benchmarks.
02. RFTT outperforms prior models on the MATH dataset.
03. Performance improves with more inference rollouts.

Abstract

In this work, we propose Reinforced Functional Token Tuning (RFTT), a novel reinforced fine-tuning framework that empowers Large Language Models (LLMs) with self-play learn-to-reason capabilities. Unlike prior prompt-driven reasoning efforts, RFTT embeds a rich set of learnable functional tokens (e.g., <analyze>, <verify>, <refine>) directly into the model vocabulary, enabling chain-of-thought construction with diverse human-like reasoning behaviors. Specifically, RFTT comprises two phases: (1) supervised fine-tuning performs prompt-driven tree search to obtain self-generated training data annotated with functional tokens, which warms up the model to learn these tokens for reasoning; and (2) online reinforcement learning further allows the model to explore different reasoning pathways through functional token sampling without relying on prompts, thereby facilitating effective…
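The idea of placing functional tokens in the vocabulary and then sampling among them can be illustrated with a toy sketch. This is not the authors' implementation: the plain-dictionary vocabulary, the helper names `extend_vocab` and `sample_functional_token`, and the hand-written logits are all assumptions made for illustration. Only the three token names come from the abstract.

```python
import math
import random

# Token names taken from the abstract; the full RFTT token set may differ.
FUNCTIONAL_TOKENS = ["<analyze>", "<verify>", "<refine>"]


def extend_vocab(vocab, new_tokens):
    """Return a copy of `vocab` with each unseen token given a fresh id.

    Assumes existing ids are contiguous from 0, so len() is the next free id.
    """
    extended = dict(vocab)
    for tok in new_tokens:
        if tok not in extended:
            extended[tok] = len(extended)
    return extended


def sample_functional_token(logits, vocab, rng, temperature=1.0):
    """Sample one functional token from a softmax over their logits only."""
    ids = [vocab[t] for t in FUNCTIONAL_TOKENS]
    scaled = [logits[i] / temperature for i in ids]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(FUNCTIONAL_TOKENS, weights=weights, k=1)[0]


# Hypothetical base vocabulary standing in for a real tokenizer.
base_vocab = {"<pad>": 0, "the": 1, "answer": 2, "is": 3}
vocab = extend_vocab(base_vocab, FUNCTIONAL_TOKENS)

# Stand-in logits; in a real model these would come from the LM head.
logits = [0.0] * len(vocab)
logits[vocab["<verify>"]] = 2.0

rng = random.Random(0)
choice = sample_functional_token(logits, vocab, rng)
print(vocab, choice)
```

In a real model, extending the vocabulary would also require resizing the embedding matrix and output layer so the new ids have trainable parameters; the sketch omits that step.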

Code & Models

Repositories

sastpg/rftt
PyTorch · Official

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Logic, Reasoning, and Knowledge

Methods: Sparse Evolutionary Training