Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation

Luca Beurer-Kellner; Marc Fischer; Martin Vechev

arXiv:2403.06988 · cs.LG · March 13, 2024 · 3 citations




TL;DR

This paper introduces DOMINO, a decoding algorithm for large language models that enforces formal language constraints in a subword-aligned way with virtually no overhead, improving on existing constrained decoding methods in both speed and task accuracy.

Contribution

We propose DOMINO, a subword-aligned constrained decoding algorithm that reduces overhead and enhances task accuracy compared to prior constrained decoding approaches.

Findings

01

DOMINO achieves near-zero overhead during constrained decoding.

02

In some cases, DOMINO runs almost 2× faster than unconstrained decoding.

03

DOMINO outperforms existing constrained decoding methods in both speed and accuracy.

Abstract

To ensure that text generated by large language models (LLMs) is in an expected format, constrained decoding proposes to enforce strict formal language constraints during generation. However, as we show in this work, not only do such methods incur performance overhead during generation, but many of them also significantly impair task accuracy, if they do not correctly align the underlying LLM sub-word vocabularies with external constraints. To address this, we present a novel decoding algorithm, DOMINO, that can enforce constraints in a fully subword-aligned fashion, while leveraging pre-computation and speculative decoding to achieve virtually no overhead and in some cases even almost $2\times$ speedup over unconstrained decoding, thereby outperforming existing approaches by a wide margin.
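The general idea of constrained decoding described in the abstract can be illustrated with a small sketch. This is not the DOMINO algorithm: it omits pre-computation and speculative decoding entirely and only shows generic token-level masking. The toy automaton, the vocabulary, and all function names (advance, allowed_tokens, mask_logits, greedy_decode) are assumptions made up for this illustration. The sketch checks every character of a candidate subword token against the constraint, so multi-character tokens that span several grammar symbols remain available instead of being masked out.

```python
import math

# Toy DFA (an assumption for illustration) accepting strings like {"ab":"ba"}.
LETTERS = "ab"
TRANSITIONS = {
    (0, "{"): 1,
    (1, '"'): 2,
    (3, '"'): 4,
    (4, ":"): 5,
    (5, '"'): 6,
    (7, '"'): 8,
    (8, "}"): 9,
}
for ch in LETTERS:
    TRANSITIONS[(2, ch)] = 3  # first letter of the key
    TRANSITIONS[(3, ch)] = 3  # further letters of the key
    TRANSITIONS[(6, ch)] = 7  # first letter of the value
    TRANSITIONS[(7, ch)] = 7  # further letters of the value
ACCEPT = 9

# Hypothetical subword vocabulary; several tokens span more than one symbol.
VOCAB = ["{", '{"', "a", "b", "ab", '"', ":", '":', '":"', '"}', "}", " "]


def advance(state, token):
    """Feed every character of a token through the DFA; None means rejected."""
    for ch in token:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return None
    return state


def allowed_tokens(state):
    """Tokens whose full character sequence the constraint accepts."""
    return [tok for tok in VOCAB if advance(state, tok) is not None]


def mask_logits(logits, state):
    """Set the score of every token the constraint rejects to -inf."""
    return [
        score if advance(state, tok) is not None else -math.inf
        for tok, score in zip(VOCAB, logits)
    ]


def greedy_decode(score_fn, max_steps=20):
    """Greedy decoding where score_fn(tokens_so_far) stands in for a model."""
    state, out = 0, []
    for _ in range(max_steps):
        if state == ACCEPT:
            break
        masked = mask_logits(score_fn(out), state)
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        out.append(VOCAB[best])
        state = advance(state, VOCAB[best])
    return "".join(out), state == ACCEPT
```

In this sketch, a token such as `{"` is permitted in the start state because both of its characters are valid in sequence. A checker that only allowed tokens matching a single grammar symbol would force the model onto token sequences it rarely sees in training, which is the kind of misalignment the abstract says can impair task accuracy.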


Taxonomy

Topics: Digital Rights Management and Security · Mathematics, Computing, and Information Processing · Library Science and Information Systems

Methods: ALIGN