Precise Length Control in Large Language Models
Bradley Butcher, Michael O'Keefe, James Titchener

TL;DR
This paper presents a method for controlling the response length of large language models using a length-difference positional encoding (LDPE), achieving mean token errors of less than 3 tokens in question answering and summarization.
Contribution
We introduce LDPE, a new positional encoding technique, and Max New Tokens++, enabling precise and flexible length control in pre-trained LLMs without degrading response quality.
Findings
Mean token error of less than 3 tokens.
Effective length control in question answering and summarization.
Flexible upper-bound length control with Max New Tokens++.
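To make the "mean token error" finding concrete, here is a minimal sketch of how such a metric could be computed. The summary does not define the metric, so this assumes it is the mean absolute difference between the requested and generated response lengths, counted in tokens. The function name `mean_token_error` and the sample numbers are hypothetical and are not taken from the paper.

```python
def mean_token_error(target_lengths, generated_lengths):
    """Mean absolute difference between requested and generated lengths.

    Assumption: "token error" means |target - generated| in tokens;
    the paper's exact definition may differ.
    """
    if len(target_lengths) != len(generated_lengths):
        raise ValueError("length lists must have the same number of entries")
    if not target_lengths:
        raise ValueError("at least one response is required")
    total = sum(abs(t - g) for t, g in zip(target_lengths, generated_lengths))
    return total / len(target_lengths)


# Toy numbers for illustration only (not results from the paper).
targets = [50, 100, 200]
generated = [48, 101, 204]
error = mean_token_error(targets, generated)  # (2 + 1 + 4) / 3
```

Under this assumed definition, a mean token error below 3 means responses land, on average, within 3 tokens of the requested length.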
Abstract
Large Language Models (LLMs) are increasingly used in production systems, powering applications such as chatbots, summarization, and question answering. Despite their success, controlling the length of their response remains a significant challenge, particularly for tasks requiring structured outputs or specific levels of detail. In this work, we propose a method to adapt pre-trained decoder-only LLMs for precise control of response length. Our approach incorporates a secondary length-difference positional encoding (LDPE) into the input embeddings, which counts down to a user-set response termination length. Fine-tuning with LDPE allows the model to learn to terminate responses coherently at the desired length, achieving mean token errors of less than 3 tokens. We also introduce Max New Tokens++, an extension that enables flexible upper-bound length control, rather than an exact target.…
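The countdown idea in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract only says that a secondary encoding counting down to the target length is incorporated into the input embeddings. The sinusoidal form, the handling of prompt tokens, the choice to start the countdown at the full target on the first response token, and all names (`countdown_positions`, `sinusoidal_encoding`, `add_ldpe`) are hypothetical.

```python
import numpy as np


def countdown_positions(prompt_len, response_len, target_len):
    """Remaining-token count at each sequence position (hypothetical helper)."""
    # Assumption: prompt tokens all carry the full target length.
    prompt = np.full(prompt_len, target_len)
    # Response tokens count down target_len, target_len - 1, ..., floored at 0.
    response = np.clip(target_len - np.arange(response_len), 0, None)
    return np.concatenate([prompt, response])


def sinusoidal_encoding(values, dim):
    """Sinusoidal encoding of arbitrary non-negative integers (dim must be even)."""
    values = np.asarray(values, dtype=np.float64)[:, None]
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)
    angles = values * freqs[None, :]
    enc = np.zeros((values.shape[0], dim))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return enc


def add_ldpe(token_embeddings, prompt_len, target_len):
    """Add the countdown encoding to a (seq_len, dim) embedding matrix."""
    seq_len, dim = token_embeddings.shape
    remaining = countdown_positions(prompt_len, seq_len - prompt_len, target_len)
    return token_embeddings + sinusoidal_encoding(remaining, dim)


# Usage: 2 prompt tokens, 4 response tokens, target response length of 3.
embeddings = np.zeros((6, 8))
with_ldpe = add_ldpe(embeddings, prompt_len=2, target_len=3)
```

Because the encoding depends on the tokens remaining rather than the absolute position, the model sees the same signal near the end of a response regardless of the target length, which is presumably what lets fine-tuning teach it to wrap up coherently as the count approaches zero.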
Taxonomy
Topics: Natural Language Processing Techniques
