CharED: Character-wise Ensemble Decoding for Large Language Models

Kevin Gu; Eva Tuecke; Dmitriy Katz; Raya Horesh; David Alvarez-Melis; Mikhail Yurochkin

arXiv:2407.11009 · cs.CL · July 17, 2024

Open Access

TL;DR

CharED introduces a novel character-wise ensemble decoding method that combines multiple large language models at inference time, improving performance across various domains without requiring shared vocabularies or fine-tuning.

Contribution

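The paper presents CharED, an inference-time ensembling algorithm that averages character-level distributions from multiple LLMs. Because it operates on characters rather than tokens, it avoids the shared-vocabulary and shared-tokenization requirement of conventional methods such as shallow fusion.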
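The averaging step can be illustrated with a small Python sketch. It assumes each model has already produced a next-character distribution as a plain dictionary; the function name `average_char_distributions`, the uniform default weights, and the toy probabilities are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def average_char_distributions(dists, weights=None):
    """Weighted average of per-model next-character distributions.

    dists   : list of dicts mapping a character to its probability
    weights : optional list of non-negative model weights (assumed uniform if omitted)
    """
    if weights is None:
        weights = [1.0] * len(dists)
    if len(weights) != len(dists):
        raise ValueError("need exactly one weight per distribution")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    averaged = defaultdict(float)
    for dist, weight in zip(dists, weights):
        for char, prob in dist.items():
            # A character missing from one model simply contributes 0 there.
            averaged[char] += (weight / total_weight) * prob
    return dict(averaged)


# Toy next-character distributions from two hypothetical models.
model_a = {"p": 0.7, "r": 0.3}
model_b = {"p": 0.2, "r": 0.5, "d": 0.3}

combined = average_char_distributions([model_a, model_b])
next_char = max(combined, key=combined.get)  # greedy choice, for illustration only
```

The two models only need to agree on the character set, not on a vocabulary, which is why the average is well defined even when their tokenizers differ.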

Findings

01. Improved performance on coding, math, and toxicity benchmarks.

02. Effective combination of models regardless of vocabulary or tokenization.

03. Enhances capabilities of LLM ensembles without additional training.

Abstract

Large language models (LLMs) have shown remarkable potential for problem solving, with open-source models achieving increasingly impressive performance on benchmarks measuring areas from logical reasoning to mathematical ability. Ensembling models can further improve capabilities across a variety of domains. However, conventional methods of combining models at inference time, such as shallow fusion, necessitate a shared vocabulary and tokenization, and alternatives like fine-tuning for domain-specific performance are both time-consuming and computationally expensive. We therefore present an inference-time ensembling algorithm aimed at "averaging" outputs from multiple LLMs and illustrate its improved performance across multiple domains compared to its constituent models alone. Character-wise ensemble decoding, CharED, finds the marginal distribution of each character for an individual…
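One way to picture a character marginal is to sum a model's next-token probabilities by the first character of each token. The Python sketch below does this for two toy vocabularies. It is an illustration under stated assumptions, not the paper's full procedure: the helper `char_marginal` and the token tables are hypothetical, and the abstract above is truncated before it describes the remaining steps.

```python
from collections import defaultdict


def char_marginal(token_probs):
    """Collapse a next-token distribution into a next-character distribution.

    token_probs : dict mapping a token string to its probability.
    Each token's probability mass is assigned to its first character.
    """
    marginal = defaultdict(float)
    for token, prob in token_probs.items():
        if token:  # skip empty strings, which have no first character
            marginal[token[0]] += prob
    total = sum(marginal.values())
    if total <= 0:
        raise ValueError("distribution has no probability mass")
    # Renormalize in case the input was truncated (for example, top-k only).
    return {char: prob / total for char, prob in marginal.items()}


# Two hypothetical models with different tokenizations of similar text.
token_probs_a = {"def": 0.5, "de": 0.1, "return": 0.3, "r": 0.1}
token_probs_b = {"d": 0.2, "ef": 0.1, "ret": 0.6, "urn": 0.1}

chars_a = char_marginal(token_probs_a)
chars_b = char_marginal(token_probs_b)
```

Although the two token tables share no entries, both results are distributions over single characters, so they can be compared or combined directly.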

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling