AdaFuse: Adaptive Ensemble Decoding with Test-Time Scaling for LLMs
Chengming Cui, Tianxin Wei, Ziyi Chen, Ruizhong Qiu, Zhichen Zeng, Zhining Liu, Xuying Ning, Duo Zhou, Jingrui He

TL;DR
AdaFuse introduces an adaptive ensemble decoding method for large language models that dynamically adjusts fusion granularity during generation, improving performance across various tasks by leveraging test-time scaling and uncertainty measures.
Contribution
The paper proposes an adaptive ensemble framework that dynamically selects fusion units during decoding, improving flexibility and performance without retraining the underlying models.
Findings
Achieves an average 6.88% relative improvement over strong ensemble baselines.
Effectively adapts fusion granularity based on decoding context and uncertainty.
Demonstrates superior performance in question answering, reasoning, and translation tasks.
Abstract
Large language models (LLMs) exhibit complementary strengths arising from differences in pretraining data, model architectures, and decoding behaviors. Inference-time ensembling provides a practical way to combine these capabilities without retraining. However, existing ensemble approaches suffer from fundamental limitations. Most rely on fixed fusion granularity, which lacks the flexibility required for mid-generation adaptation and fails to adapt to different generation characteristics across tasks. To address these challenges, we propose AdaFuse, an adaptive ensemble decoding framework that dynamically selects semantically appropriate fusion units during generation. Rather than committing to a fixed granularity, AdaFuse adjusts fusion behavior on the fly based on the decoding context, with words serving as basic building blocks for alignment. To be specific, we introduce an…
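As a rough illustration of the idea described in the abstract, the toy sketch below fuses word-level distributions from several models and picks a fusion-unit length from an uncertainty signal. It is not the authors' algorithm: the abstract is truncated before the method is specified, so the entropy-based rule, the threshold of 0.5, the span lengths 1 and 3, and the function names (entropy, fuse, adaptive_step) are all assumptions made for this example.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a dict mapping word -> probability."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def fuse(dists):
    """Average word-level distributions over the union of their candidate words.

    Assumption: uniform model weights; a real ensemble might weight models.
    """
    words = set().union(*dists)
    return {w: sum(d.get(w, 0.0) for d in dists) / len(dists) for w in words}


def adaptive_step(dists, threshold=0.5, short=1, long=3):
    """Return (best_word, span_length) for one decoding step.

    Hypothetical rule: if the fused distribution is confident (low entropy),
    allow a longer fusion unit before the next fusion; otherwise fuse again
    after a single word.
    """
    fused = fuse(dists)
    best_word = max(fused, key=fused.get)
    span_length = long if entropy(fused) < threshold else short
    return best_word, span_length


if __name__ == "__main__":
    confident = [{"Paris": 0.9, "Lyon": 0.1}, {"Paris": 0.8, "Lyon": 0.2}]
    uncertain = [{"a": 0.5, "b": 0.5}, {"a": 0.4, "b": 0.6}]
    print(adaptive_step(confident))
    print(adaptive_step(uncertain))
```

In this sketch the models agree strongly in the first case, so the step allows a longer span, while the second case falls back to word-by-word fusion. Consult the full paper for the uncertainty measure and span-selection rule that AdaFuse actually uses.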
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques
