Earley-Driven Dynamic Pruning for Efficient Structured Decoding

Xintong Sun; Chi Wei; Minghao Tian; Shiwen Ni

arXiv:2506.01151 · cs.LG · June 3, 2025

TL;DR

This paper introduces ZapFormat, a dynamic pruning strategy for constrained decoding in large language models that reduces computational overhead and speeds up structured generation tasks while maintaining high output quality.

Contribution

We propose ZapFormat, a novel Earley algorithm-based dynamic pruning method that improves efficiency and scalability of constrained decoding in LLMs for structured output generation.

Findings

1. Achieves up to 2x speedup in inference time.
2. Maintains high-precision compliance with structural constraints.
3. Applicable across various LLM architectures.

Abstract

Large Language Models (LLMs) have shown remarkable capabilities, yet ensuring that their outputs conform to strict structural or grammatical constraints remains challenging, even though such conformance is critical for function calls and domain-specific language (DSL) generation. Constrained decoding with a context-free grammar is a flexible approach that guarantees an LLM's adherence to a specific format by dynamically building a token logits mask. However, creating this mask requires checking the validity of every token in the LLM vocabulary at every decoding step, which often incurs significant overhead in existing constrained decoding engines. To address this challenge, we propose ZapFormat, a novel dynamic pruning strategy based on the Earley algorithm that identifies and eliminates invalid or redundant Earley states in real time, significantly reducing memory occupation of the Earley algorithm's…
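To make the cost described in the abstract concrete, the following is a minimal sketch of the baseline it refers to: an unpruned, character-level Earley recognizer used to test every vocabulary token at a decoding step. It is not ZapFormat's pruning method, which the abstract does not specify in detail. The toy grammar, the vocabulary, and the names `State`, `GRAMMAR`, `VOCAB`, `closure`, `feed`, and `token_mask` are all assumptions made for illustration, and the sketch assumes a grammar without empty productions.

```python
from typing import NamedTuple


class State(NamedTuple):
    """One Earley item: lhs -> rhs with a dot position and an origin set index."""
    lhs: str
    rhs: tuple
    dot: int
    origin: int


# Hypothetical toy grammar (no empty rules): nested lists of the digit 1, e.g. [1,[1]]
GRAMMAR = {
    "S": [("V",)],
    "V": [("1",), ("[", "]"), ("[", "L", "]")],
    "L": [("V",), ("V", ",", "L")],
}

# Hypothetical vocabulary; real LLM vocabularies have tens of thousands of entries.
VOCAB = ["1", "[", "]", ",", "[1", "],", "1]", "]]", "x"]


def closure(chart, k):
    """Apply the Earley predictor and completer to chart[k] until nothing new appears."""
    work = list(chart[k])
    while work:
        st = work.pop()
        if st.dot < len(st.rhs):
            sym = st.rhs[st.dot]
            if sym in GRAMMAR:  # predictor: expand the nonterminal after the dot
                candidates = [State(sym, rhs, 0, k) for rhs in GRAMMAR[sym]]
            else:
                candidates = []  # terminal after the dot: handled by the scanner
        else:  # completer: advance every item that was waiting for st.lhs
            candidates = [
                parent._replace(dot=parent.dot + 1)
                for parent in list(chart[st.origin])
                if parent.dot < len(parent.rhs) and parent.rhs[parent.dot] == st.lhs
            ]
        for new in candidates:
            if new not in chart[k]:
                chart[k].add(new)
                work.append(new)


def start_chart():
    """Build the initial chart containing the start rule and its predictions."""
    chart = [{State("S", rhs, 0, 0) for rhs in GRAMMAR["S"]}]
    closure(chart, 0)
    return chart


def feed(chart, text):
    """Scan text character by character; return the extended chart, or None if rejected."""
    chart = list(chart)  # shallow copy: earlier state sets are only read, never modified
    for ch in text:
        scanned = {
            s._replace(dot=s.dot + 1)
            for s in chart[-1]
            if s.dot < len(s.rhs)
            and s.rhs[s.dot] not in GRAMMAR  # only terminals can be scanned
            and s.rhs[s.dot] == ch
        }
        if not scanned:
            return None  # no item accepts this character: the prefix is not viable
        chart.append(scanned)
        closure(chart, len(chart) - 1)
    return chart


def token_mask(chart, vocab):
    """Naive mask: one full validity check per vocabulary token, every decoding step."""
    return [feed(chart, tok) is not None for tok in vocab]


if __name__ == "__main__":
    chart = feed(start_chart(), "[[1")  # text generated so far
    for tok, ok in zip(VOCAB, token_mask(chart, VOCAB)):
        print(repr(tok), "allowed" if ok else "masked")
```

In this sketch the work per decoding step grows with both the vocabulary size and the number of Earley states kept in the chart, which is the overhead the abstract attributes to existing engines and the state growth that a pruning strategy would aim to limit.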
