InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion
Zhaoyi Yan, Yiming Zhang, Baoyi He, Yuhao Fu, Qi Zhou, Zhijie Sang, Chunlin Ji, Shengyu Zhang, Fei Wu, Hongxia Yang

TL;DR
InfiFusion is a novel framework that efficiently combines multiple domain-specific LLMs into a single model, improving performance across diverse tasks while significantly reducing training costs.
Contribution
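It introduces a flexible fusion pipeline that enhances Universal Logit Distillation (ULD) with Top-K selection and Logits Standardization, and offers two fusion strategies (Pairwise Fusion and Unified Fusion), outperforming state-of-the-art models with fewer computational resources.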
Findings
Outperforms models like Qwen-2.5-14B-Instruct and Phi-4 on 11 benchmarks.
Cuts training cost to 160 GPU hours, down from millions.
Achieves superior reasoning, coding, mathematics, and instruction-following performance.
Abstract
We introduce InfiFusion, an efficient training pipeline designed to integrate multiple domain-specialized Large Language Models (LLMs) into a single pivot model, effectively harnessing the strengths of each source model. Traditional fusion methods either merge model parameters directly or rely on knowledge distillation with rigid assumptions, limiting their flexibility and efficiency. InfiFusion overcomes these limitations by enhancing Universal Logit Distillation (ULD) with Top-K selection and Logits Standardization. We propose two fusion strategies: Pairwise Fusion (InfiFusion_p), where each source model's knowledge is distilled individually into the pivot model, followed by merging; and Unified Fusion (InfiFusion_u), where knowledge from all source models is distilled simultaneously into the pivot model. InfiFusion outperforms the state-of-the-art models, such as…
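The distillation ingredients named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the default `k`, the per-token list representation, and the plain average used for the unified variant are all assumptions made for illustration. The loss follows the general idea of ULD, which compares sorted probability vectors so that teacher and student do not need a shared vocabulary.

```python
import math

def top_k(logits, k):
    """Keep the k largest logits, sorted in descending order."""
    return sorted(logits, reverse=True)[:k]

def standardize(logits, eps=1e-8):
    """Z-score the logits so models with different logit scales are comparable."""
    mean = sum(logits) / len(logits)
    var = sum((x - mean) ** 2 for x in logits) / len(logits)
    std = math.sqrt(var) + eps
    return [(x - mean) / std for x in logits]

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uld_style_loss(student_logits, teacher_logits, k=3):
    """L1 distance between sorted top-k probability vectors (hypothetical helper)."""
    s = softmax(standardize(top_k(student_logits, k)))
    t = softmax(standardize(top_k(teacher_logits, k)))
    # Pad the shorter vector with zeros if one model has fewer than k entries.
    width = max(len(s), len(t))
    s += [0.0] * (width - len(s))
    t += [0.0] * (width - len(t))
    return sum(abs(a - b) for a, b in zip(s, t))

def unified_loss(student_logits, teachers, k=3):
    """Unified-style objective: average the loss over all teachers (assumed weighting)."""
    losses = [uld_style_loss(student_logits, t, k) for t in teachers]
    return sum(losses) / len(losses)

# Toy next-token logits; the teacher uses a different vocabulary size.
pivot = [2.0, 0.5, -1.0, 0.1]
source_a = [0.3, 0.2, 0.1, 0.0, -0.1]
source_b = [4.0, 1.0, 0.5, -2.0]
print(uld_style_loss(pivot, source_a))
print(unified_loss(pivot, [source_a, source_b]))
```

In this sketch, a pairwise-style strategy would minimize `uld_style_loss` against one source model at a time and merge the resulting pivot checkpoints afterwards, while a unified-style strategy would minimize something like `unified_loss` in a single run. Because of the standardization step, a teacher whose logits are an affine rescaling of the student's produces a loss of approximately zero.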
Taxonomy
Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Business Process Modeling and Analysis
