A Thorough Examination of Decoding Methods in the Era of LLMs

Chufan Shi; Haoran Yang; Deng Cai; Zhisong Zhang; Yifan Wang; Yujiu Yang; Wai Lam

arXiv:2402.06925·cs.CL·October 10, 2024·1 cites

Open Access 1 Repo

TL;DR

This paper thoroughly analyzes various decoding methods for large language models, examining their performance, robustness, and speed across tasks and models, revealing task dependence and trade-offs in hyperparameter tuning.

Contribution

It provides a comprehensive evaluation of decoding strategies in the context of LLMs, highlighting their performance variability and practical considerations.

Findings

01. Decoding performance varies significantly across tasks and models.

02. Some methods require extensive hyperparameter tuning for optimal results.

03. Trade-offs exist between decoding quality and tuning complexity.

Abstract

Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers. Prior research on decoding methods, primarily focusing on task-specific models, may not extend to the current era of general-purpose large language models (LLMs). Moreover, the recent influx of decoding strategies has further complicated this landscape. This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of LLMs, evaluating their performance, robustness to hyperparameter changes, and decoding speeds across a wide range of tasks, models, and deployment environments. Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization. Intriguingly, sensitivity analysis exposes that certain methods achieve superior…
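
To make "decoding method" concrete, the following is a minimal sketch of a few widely used strategies (greedy, temperature scaling, top-k, and top-p) applied to a single next-token distribution. It is an illustration only: the function names, the toy logits, and the hyperparameter values are assumptions chosen for this example, and it is not taken from the paper or its repository.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(probs):
    """Deterministic decoding: always pick the most probable token index."""
    return max(range(len(probs)), key=probs.__getitem__)


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, and renormalize."""
    keep = set(sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k])
    mass = sum(probs[i] for i in keep)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in keep)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]


def sample(probs, rng):
    """Stochastic decoding: draw one token index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits over a four-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(toy_logits)

rng = random.Random(0)
print("greedy:", greedy(probs))
print("top-k (k=2) sample:", sample(top_k_filter(probs, 2), rng))
print("top-p (p=0.8) sample:", sample(top_p_filter(probs, 0.8), rng))
```

The temperature, k, and p arguments are the kind of hyperparameters whose sensitivity the abstract refers to: small changes alter which tokens remain eligible, so the same model can behave quite differently across settings.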

Code & Models

Repositories

davidfanzz/llm_decoding (PyTorch, official)

Taxonomy

Topics: Digital Rights Management and Security