SpecFuse: Ensembling Large Language Models via Next-Segment Prediction
Bo Lv, Nayu Liu, Chen Tang, Xin Liu, Yue Yu, Ping Luo

TL;DR
SpecFuse introduces SpecEM, a dynamic, training-free ensemble framework for large language models that improves performance through segment-level collaboration and real-time model weighting based on task-specific performance.
Contribution
The paper presents SpecEM, a novel plug-and-play ensemble method that enables real-time, segment-level collaboration and adaptive weighting of LLMs without additional training.
Findings
Consistent performance improvements over state-of-the-art ensemble methods.
Effective dynamic weighting based on model performance during verification.
Applicable across multiple LLM sizes and diverse benchmark datasets.
Abstract
Ensembles of generative large language models (LLMs) are a promising way to compensate for individual model limitations, integrating the strengths of different LLMs. Existing LLM ensemble methods, however, face limitations such as first-token delay and challenges in long-range semantic collaboration between models. Moreover, they typically assume equal voting weights for all models during ensemble, ignoring task-specific performance differences among models. In this work, we propose SpecEM, a training-free, plug-and-play LLM ensemble framework that dynamically adjusts each model's contribution in real time based on task performance. Inspired by speculative decoding, SpecEM iteratively performs drafting and verification, allowing models to collaborate semantically at the segment level for integrated output. Furthermore, we introduce an online feedback mechanism with multiplicative…
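The drafting and verification loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Drafter` and `Scorer` interfaces, the weighted-vote rule, the step size `eta`, and the form of the multiplicative update are all assumptions made for illustration, since the abstract is cut off before it specifies the feedback mechanism.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces (not from the paper):
# a Drafter proposes the next segment for a given prefix,
# a Scorer rates a candidate segment for a given prefix (higher is better).
Drafter = Callable[[str], str]
Scorer = Callable[[str, str], float]


def ensemble_generate(
    drafters: Sequence[Drafter],
    scorers: Sequence[Scorer],
    prompt: str,
    rounds: int = 3,
    eta: float = 0.5,
) -> Tuple[str, List[float]]:
    """Toy draft-then-verify ensemble with per-model weights."""
    n = len(drafters)
    weights = [1.0 / n] * n  # start from equal voting weights
    output = ""
    for _ in range(rounds):
        prefix = prompt + output
        # Drafting: every model proposes one candidate segment.
        candidates = [draft(prefix) for draft in drafters]
        # Verification: every model scores every candidate; votes are
        # combined using the current model weights.
        totals = [
            sum(w * score(prefix, cand) for w, score in zip(weights, scorers))
            for cand in candidates
        ]
        best = max(range(n), key=totals.__getitem__)
        output += candidates[best]
        # Feedback (assumed form): scale each model's weight by how well
        # its own candidate was rated, then renormalise to sum to 1.
        weights = [w * (1.0 + eta * t) for w, t in zip(weights, totals)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return output, weights


if __name__ == "__main__":
    # Two toy "models": one always drafts "a ", the other "b ".
    drafters = [lambda prefix: "a ", lambda prefix: "b "]
    # Both toy scorers prefer the segment "a ".
    prefers_a = lambda prefix, cand: 1.0 if cand == "a " else 0.0
    text, final_weights = ensemble_generate(drafters, [prefers_a, prefers_a], "")
    print(repr(text), final_weights)
```

In this toy setup the model whose drafts are consistently rated higher gains weight over successive rounds, which is the general behaviour the abstract attributes to its online feedback; a real system would replace the lambdas with LLM calls and model-based scoring.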
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Balanced Selection
