Reasoning with Reinforced Functional Token Tuning
Kongcheng Zhang, Qi Yao, Baisheng Lai, Jiaxing Huang, Wenkai Fang, Dacheng Tao, Mingli Song, Shunyu Liu

TL;DR
This paper introduces Reinforced Functional Token Tuning (RFTT), a novel framework that enhances large language models' reasoning abilities by embedding learnable functional tokens and using reinforcement learning to improve reasoning pathways, achieving state-of-the-art results on math benchmarks.
Contribution
The paper presents a new reinforcement learning-based fine-tuning method that incorporates functional tokens directly into LLMs for improved reasoning capabilities, surpassing previous prompt-driven approaches.
Findings
Significant performance improvements on math benchmarks.
RFTT outperforms prior models on the MATH dataset.
Performance improves with more inference rollouts.
Abstract
In this work, we propose Reinforced Functional Token Tuning (RFTT), a novel reinforced fine-tuning framework that empowers Large Language Models (LLMs) with self-play learn-to-reason capabilities. Unlike prior prompt-driven reasoning efforts, RFTT embeds a rich set of learnable functional tokens (e.g., <analyze>, <verify>, <refine>) directly into the model vocabulary, enabling chain-of-thought construction with diverse human-like reasoning behaviors. Specifically, RFTT comprises two phases: (1) supervised fine-tuning performs prompt-driven tree search to obtain self-generated training data annotated with functional tokens, which warms up the model to learn these tokens for reasoning; and (2) online reinforcement learning further allows the model to explore different reasoning pathways through functional token sampling without relying on prompts, thereby facilitating effective…
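The idea of placing functional tokens in the vocabulary and then sampling among them can be illustrated with a toy sketch. This is not the authors' implementation: the dictionary vocabulary, the list-of-lists embedding table, the helper names `extend_vocab` and `sample_functional_token`, the initialization scale, and the choice to restrict the softmax to functional tokens only are all assumptions made for illustration. Only the three token names come from the abstract.

```python
import math
import random

# Example functional tokens named in the abstract.
FUNCTIONAL_TOKENS = ["<analyze>", "<verify>", "<refine>"]


def extend_vocab(vocab, embeddings, new_tokens, dim, rng):
    """Append new tokens to a toy vocabulary and give each a fresh embedding row.

    Hypothetical helper: a real model would resize its embedding matrix instead.
    """
    ids = {}
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
            # Small random init (assumed scale) so the new row is learnable.
            embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
        ids[tok] = vocab[tok]
    return ids


def sample_functional_token(logits, functional_ids, rng, temperature=1.0):
    """Sample one functional token from a softmax restricted to functional ids."""
    names = list(functional_ids)
    scaled = [logits[functional_ids[n]] / temperature for n in names]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)
vocab = {"the": 0, "answer": 1, "is": 2}
embeddings = [[0.0] * 4 for _ in vocab]

functional_ids = extend_vocab(vocab, embeddings, FUNCTIONAL_TOKENS, dim=4, rng=rng)

# Made-up next-token logits that favor <verify>.
logits = [0.0] * len(vocab)
logits[functional_ids["<verify>"]] = 5.0
choice = sample_functional_token(logits, functional_ids, rng)
```

In this sketch the sampled token would decide which reasoning behavior the next step performs. During the reinforcement learning phase described above, such samples would come from the model's own distribution rather than from a prompt.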
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Logic, Reasoning, and Knowledge
Methods: Sparse Evolutionary Training
