Hit the Sweet Spot! Span-Level Ensemble for Large Language Models

Yangyifan Xu; Jianghao Chen; Junhong Wu; Jiajun Zhang

arXiv:2409.18583·cs.CL·September 30, 2024

TL;DR

This paper introduces SweetSpan, a span-level ensemble method for large language models that balances real-time correction with the use of comprehensive information, improving ensemble performance across tasks.

Contribution

We propose a span-level ensemble technique that chooses among candidate spans through mutual evaluation, and we introduce a new, more challenging evaluation setting for more realistic assessment.

Findings

01. SweetSpan outperforms previous ensemble methods in various tasks.

02. The span-level approach provides better robustness and versatility.

03. Evaluation in challenging settings demonstrates improved real-world applicability.

Abstract

Ensembling various LLMs to unlock their complementary potential and leverage their individual strengths is highly valuable. Previous studies typically focus on two main paradigms: sample-level and token-level ensembles. Sample-level ensemble methods either select or blend fully generated outputs, which hinders dynamic correction and enhancement of outputs during the generation process. On the other hand, token-level ensemble methods enable real-time correction through fine-grained ensemble at each generation step. However, the information carried by an individual token is quite limited, leading to suboptimal decisions at each step. To address these issues, we propose SweetSpan, a span-level ensemble method that effectively balances the need for real-time adjustments and the information required for accurate ensemble decisions. Our approach involves two key steps: First, we have each…
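
The span-level idea can be illustrated with a toy sketch. It assumes each model can both propose a candidate span for a shared prefix and score any span, and that candidates are ranked by their mean score across all models. `ToyModel`, `make_model`, `select_span`, `span_ensemble`, and the averaging rule are hypothetical names and choices made for illustration, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToyModel:
    """Hypothetical stand-in for an LLM (not a real library class)."""
    name: str
    propose: Callable[[str], str]       # prefix -> candidate span ("" means stop)
    score: Callable[[str, str], float]  # (prefix, span) -> higher is better


def make_model(name: str, proposals: Dict[str, str],
               preferences: Dict[str, float]) -> ToyModel:
    """Build a toy model from lookup tables instead of a real network."""
    return ToyModel(
        name=name,
        propose=lambda prefix: proposals.get(prefix, ""),
        score=lambda prefix, span: preferences.get(span, 0.0),
    )


def select_span(models: List[ToyModel], prefix: str) -> str:
    """Each model proposes one span; every model scores every candidate.

    Assumption: candidates are ranked by their mean score across models.
    """
    candidates = [m.propose(prefix) for m in models]

    def mean_score(span: str) -> float:
        return sum(m.score(prefix, span) for m in models) / len(models)

    return max(candidates, key=mean_score)


def span_ensemble(models: List[ToyModel], prompt: str, max_steps: int = 8) -> str:
    """Extend the shared prefix one selected span at a time."""
    text = prompt
    for _ in range(max_steps):
        span = select_span(models, text)
        if not span:  # no model has anything more to add
            break
        text += span
    return text


# Two toy models disagree on the next span; mutual scoring settles it.
model_a = make_model("a", {"2 + 2 =": " 4"}, {" 4": 0.9, " 5": 0.2})
model_b = make_model("b", {"2 + 2 =": " 5"}, {" 4": 0.6, " 5": 0.7})
result = span_ensemble([model_a, model_b], "2 + 2 =")
print(result)
```

In this sketch the decision is made per span rather than per token or per full output, which is the granularity the abstract argues for: every model continues from the same prefix after each selection, so a weaker span from one model can be replaced before generation moves on.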

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Focus