O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning

Haotian Luo; Li Shen; Haiying He; Yibo Wang; Shiwei Liu; Wei Li; Naiqiang Tan; Xiaochun Cao; Dacheng Tao

arXiv:2501.12570 · cs.CL · January 30, 2025


Open Access · 1 Repo

TL;DR

O1-Pruner is a fine-tuning method that reduces inference time of long-thought reasoning LLMs by encouraging shorter reasoning processes without sacrificing accuracy, demonstrated on mathematical benchmarks.

Contribution

The paper introduces Length-Harmonizing Fine-Tuning (O1-Pruner), an RL-style fine-tuning approach that optimizes reasoning length and efficiency in large language models.

Findings

01 Significantly reduces inference overhead in reasoning models.

02 Achieves higher accuracy on mathematical reasoning benchmarks.

03 Effectively balances reasoning length and accuracy.

Abstract

Recently, long-thought reasoning LLMs, such as OpenAI's O1, have adopted extended reasoning processes similar to how humans ponder complex problems. This reasoning paradigm significantly enhances the model's problem-solving abilities and has achieved promising results. However, the long-thought reasoning process leads to a substantial increase in inference time. A pressing challenge is reducing the inference overhead of long-thought LLMs while preserving accuracy. In this paper, we experimentally demonstrate that long-thought reasoning models struggle to allocate token budgets effectively based on problem difficulty and reasoning redundancies. To address this, we propose Length-Harmonizing Fine-Tuning (O1-Pruner), which aims to minimize reasoning overhead while maintaining accuracy. This fine-tuning method first estimates the LLM's baseline performance through pre-sampling and then uses…
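
The pre-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `estimate_baseline`, `sample_fn`, and `is_correct` are hypothetical, the sample count is an arbitrary choice, and whitespace-separated words stand in for model tokens. The sketch covers only the baseline estimate (average response length and accuracy for one problem), not the fine-tuning stage that follows it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Baseline:
    mean_length: float  # average length of the pre-sampled responses
    accuracy: float     # fraction of pre-sampled responses judged correct


def estimate_baseline(
    problem: str,
    answer: str,
    sample_fn: Callable[[str], str],         # hypothetical: draws one response from the model
    is_correct: Callable[[str, str], bool],  # hypothetical: checks a response against the answer
    num_samples: int = 8,                    # assumed value, not taken from the paper
) -> Baseline:
    """Estimate per-problem baseline length and accuracy by sampling the model."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    responses = [sample_fn(problem) for _ in range(num_samples)]
    # Whitespace-separated words are a rough proxy for token counts.
    lengths = [len(r.split()) for r in responses]
    num_correct = sum(1 for r in responses if is_correct(r, answer))
    return Baseline(
        mean_length=sum(lengths) / num_samples,
        accuracy=num_correct / num_samples,
    )
```

In practice `sample_fn` would wrap a call to the model being fine-tuned, and the resulting `Baseline` would serve as the per-problem reference against which shorter or longer responses are compared.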


Code & Models

Repositories

stardewxxx/o1-pruner (PyTorch, official)


Taxonomy

Topics: AI-based Problem Solving and Planning · Constraint Satisfaction and Optimization

Methods: ADaptive gradient method with the OPTimal convergence rate