Think Only When You Need with Large Hybrid-Reasoning Models

Lingjie Jiang; Xun Wu; Shaohan Huang; Qingxiu Dong; Zewen Chi; Li Dong; Xingxing Zhang; Tengchao Lv; Lei Cui; Furu Wei

arXiv:2505.14631 · cs.CL · May 22, 2025


TL;DR

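This paper introduces Large Hybrid-Reasoning Models (LHRMs), which decide per query whether to perform extended thinking before answering. Training combines Hybrid Fine-Tuning (HFT) as a cold start with online reinforcement learning via Hybrid Group Policy Optimization (HGPO), with the aim of keeping strong reasoning ability while avoiding the token and latency overhead of thinking on simple queries.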

Contribution

The work presents what the authors describe as the first model that adaptively decides whether to think based on the user query, together with a two-stage training pipeline (HFT followed by HGPO) and a metric for assessing hybrid thinking.

Findings

01. LHRMs outperform existing models in reasoning accuracy and general capabilities.

02. LHRMs significantly reduce token consumption and latency for simple queries.

03. The hybrid thinking metric effectively measures the model's reasoning adaptability.

Abstract

Recent Large Reasoning Models (LRMs) have shown substantially improved reasoning capabilities over traditional Large Language Models (LLMs) by incorporating extended thinking processes prior to producing final responses. However, excessively lengthy thinking introduces substantial overhead in terms of token consumption and latency, which is particularly unnecessary for simple queries. In this work, we introduce Large Hybrid-Reasoning Models (LHRMs), the first kind of model capable of adaptively determining whether to perform thinking based on the contextual information of user queries. To achieve this, we propose a two-stage training pipeline comprising Hybrid Fine-Tuning (HFT) as a cold start, followed by online reinforcement learning with the proposed Hybrid Group Policy Optimization (HGPO) to implicitly learn to select the appropriate thinking mode. Furthermore, we introduce a metric…
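The overhead argument in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method: the token counts, the `Query` type, and the `needs_thinking` flag are all hypothetical, and the flag stands in for a decision that an LHRM is described as learning on its own. It only compares total token usage when every query triggers thinking against usage when thinking is skipped for simple queries.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    # Hypothetical label used only for this illustration; in the paper
    # the model itself decides whether to think.
    needs_thinking: bool


# Assumed token costs, chosen only for illustration.
THINK_TOKENS = 800   # length of an extended thinking trace
ANSWER_TOKENS = 50   # length of the final response


def tokens_always_think(queries):
    """Total tokens if every query gets a thinking trace plus an answer."""
    return sum(THINK_TOKENS + ANSWER_TOKENS for _ in queries)


def tokens_hybrid(queries):
    """Total tokens if thinking is used only where it is flagged as needed."""
    return sum(
        (THINK_TOKENS if q.needs_thinking else 0) + ANSWER_TOKENS
        for q in queries
    )


queries = [
    Query("What is 2 + 2?", needs_thinking=False),
    Query("Prove that the square root of 2 is irrational.", needs_thinking=True),
    Query("Say hello.", needs_thinking=False),
]

always = tokens_always_think(queries)
hybrid = tokens_hybrid(queries)
print(f"always think: {always} tokens, hybrid: {hybrid} tokens")
```

Under these assumed costs, the saving comes entirely from the simple queries, which is the case the abstract singles out as carrying unnecessary overhead.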


Videos

Think Only When You Need with Large Hybrid-Reasoning Models · SlidesLive

Taxonomy

Topics: Semantic Web and Ontologies