CharED: Character-wise Ensemble Decoding for Large Language Models
Kevin Gu, Eva Tuecke, Dmitriy Katz, Raya Horesh, David Alvarez-Melis, Mikhail Yurochkin

TL;DR
CharED is a character-wise ensemble decoding method that combines multiple large language models at inference time, improving performance on coding, math, and toxicity benchmarks without requiring a shared vocabulary or fine-tuning.
Contribution
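The paper presents CharED, an inference-time ensembling algorithm that averages character distributions from multiple LLMs. Because it operates on characters rather than tokens, it avoids the shared vocabulary and tokenization that conventional methods such as shallow fusion require.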
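The idea of averaging character distributions can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`char_marginal`, `ensemble_chars`, `pick_char`) are hypothetical, plain dictionaries stand in for real model outputs, and the renormalization, fixed weights, and greedy choice are assumptions made for illustration.

```python
from collections import defaultdict


def char_marginal(token_probs, prefix=""):
    """Collapse a next-token distribution into a next-character distribution.

    token_probs: dict mapping token strings to probabilities (a stand-in
    for one model's output; an assumption of this sketch).
    prefix: characters already emitted that belong to the current token.
    """
    marginal = defaultdict(float)
    for token, p in token_probs.items():
        # Keep only tokens that agree with the characters chosen so far
        # and still have at least one more character to contribute.
        if token.startswith(prefix) and len(token) > len(prefix):
            marginal[token[len(prefix)]] += p
    total = sum(marginal.values())
    # Renormalize over the surviving tokens (assumed here, not from the paper).
    return {c: p / total for c, p in marginal.items()} if total else {}


def ensemble_chars(char_dists, weights):
    """Weighted average of several per-character distributions."""
    mixed = defaultdict(float)
    for dist, w in zip(char_dists, weights):
        for c, p in dist.items():
            mixed[c] += w * p
    return dict(mixed)


def pick_char(mixed):
    """Greedy choice of the most probable character (an assumed decoding rule)."""
    return max(mixed, key=mixed.get)


# Two toy "models" with different vocabularies.
model_a = {"the": 0.5, "to": 0.3, "a": 0.2}
model_b = {"t": 0.4, "an": 0.6}

mixed = ensemble_chars(
    [char_marginal(model_a), char_marginal(model_b)],
    weights=[0.5, 0.5],
)
next_char = pick_char(mixed)
```

In this toy case the two vocabularies share no multi-character tokens, yet their distributions can still be combined because both are reduced to the same character alphabet before averaging.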
Findings
Improved performance on coding, math, and toxicity benchmarks.
Effective combination of models regardless of vocabulary or tokenization.
Enhances capabilities of LLM ensembles without additional training.
Abstract
Large language models (LLMs) have shown remarkable potential for problem solving, with open source models achieving increasingly impressive performance on benchmarks measuring areas from logical reasoning to mathematical ability. Ensembling models can further improve capabilities across a variety of domains. However, conventional methods of combining models at inference time such as shallow fusion necessitate a shared vocabulary and tokenization, and alternatives like fine-tuning for domain-specific performance are both time consuming and computationally expensive. We therefore present an inference-time ensembling algorithm aimed at "averaging" outputs from multiple LLMs and illustrate its improved performance across multiple domains compared to its constituent models alone. Character-wise ensemble decoding, CharED, finds the marginal distribution of each character for an individual…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
