RCP-Merging: Merging Long Chain-of-Thought Models with Domain-Specific Models by Considering Reasoning Capability as Prior

Junyao Yang; Jianwei Wang; Huiping Zhuang; Cen Chen; Ziqian Zeng

arXiv:2508.03140·cs.CL·January 21, 2026


TL;DR

This paper introduces RCP-Merging, a novel framework that effectively combines long chain-of-thought reasoning models with domain-specific models, preserving reasoning ability while enhancing domain-specific task performance.

Contribution

The paper proposes a new merging method that preserves reasoning capability while integrating domain knowledge, without significant performance degradation.

Findings

01 Improves domain task performance by over 9%.

02 Maintains core reasoning capabilities effectively.

03 Outperforms existing merging methods.

Abstract

Large Language Models (LLMs) with long chain-of-thought (CoT) capability, termed Reasoning Models, demonstrate superior ability on intricate problems through multi-step long CoT reasoning. Model merging is a highly resource-efficient way to create a dual-capability model with both long CoT capability and domain-specific knowledge without substantial computational and data costs. However, merging domain-specific LLMs with long CoT ones poses significant challenges, since current merging methods suffer from degraded reasoning capability and can even produce gibberish or output collapse. To overcome this, we introduce RCP-Merging: Merging Long Chain-of-Thought Models with Domain-Specific Models by Considering Reasoning Capability as Prior, a novel merging framework designed to integrate domain-specific LLMs with long CoT capability while maintaining model performance in the…

