Hansel: Output Length Controlling Framework for Large Language Models

Seoha Song; Junhyun Lee; Hyeonmok Ko

arXiv:2412.14033 · cs.CL · December 19, 2024

TL;DR

Hansel is a versatile framework, applied during finetuning, that enables precise output length control in large language models. It improves length accuracy and extrapolation to unseen target lengths without sacrificing text quality.

Contribution

The paper introduces a simple, effective length-control method based on hidden special tokens. It can be applied to any pre-trained LLM during finetuning and improves output length accuracy and generalization.

Findings

01. Significant reduction in mean absolute error of output length across models and datasets.

02. Improved ability to extrapolate to unseen target lengths.

03. Maintains coherency and fluency of generated text.
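
The first finding is stated in terms of the mean absolute error (MAE) of output length. As a minimal sketch of how such a metric can be computed, the snippet below averages the absolute gap between requested and produced lengths. The function name `length_mae` and the sample numbers are illustrative assumptions, not taken from the paper, and the unit of length (words, tokens, characters) is left to the caller.

```python
def length_mae(target_lengths, output_lengths):
    """Mean absolute error between requested and produced output lengths.

    Hypothetical helper for illustration; lengths may be counted in any
    unit as long as both lists use the same one.
    """
    if len(target_lengths) != len(output_lengths):
        raise ValueError("both lists must have the same number of samples")
    if not target_lengths:
        raise ValueError("at least one sample is required")
    # Average the per-sample absolute deviation from the target length.
    total_error = sum(abs(t - o) for t, o in zip(target_lengths, output_lengths))
    return total_error / len(target_lengths)


# Made-up example: three requests and the lengths a model actually produced.
targets = [10, 20, 30]
outputs = [12, 20, 27]
mae = length_mae(targets, outputs)  # (2 + 0 + 3) / 3
```

A lower value means the generated outputs land closer to the requested lengths on average.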

Abstract

Despite the great success of large language models (LLMs), efficiently controlling the length of the output sequence remains a challenge. In this paper, we propose Hansel, an efficient framework for length control in LLMs without affecting their generation ability. Hansel utilizes periodically outputted hidden special tokens to keep track of the remaining target length of the output sequence. Together with techniques to avoid abrupt termination of the output, this seemingly simple method proved to be efficient and versatile, while not harming the coherency and fluency of the generated text. The framework can be applied to any pre-trained LLM during the finetuning stage of the model, regardless of its original positional encoding method. We demonstrate this by finetuning four different LLMs with Hansel and show that the mean absolute error of the output sequence decreases…
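
The abstract describes hidden special tokens, emitted periodically, that track the remaining target length. The sketch below illustrates that general idea on a list of words. It is not the authors' implementation: the marker format `<rem_k>`, the `period` parameter, the word-level counting, and both function names are assumptions made here for illustration only.

```python
def add_remaining_markers(words, period=4):
    """Interleave hypothetical '<rem_k>' markers into a target sequence.

    Before every `period`-th word, a marker records how many words remain,
    counting that word. A finetuning target built this way shows the model
    a running countdown toward the end of the output.
    """
    if period < 1:
        raise ValueError("period must be a positive integer")
    total = len(words)
    marked = []
    for index, word in enumerate(words):
        if index % period == 0:
            marked.append(f"<rem_{total - index}>")
        marked.append(word)
    return marked


def strip_markers(tokens):
    """Remove the hypothetical markers so they stay hidden from the reader."""
    return [t for t in tokens if not (t.startswith("<rem_") and t.endswith(">"))]


words = "a b c d e f".split()
marked = add_remaining_markers(words, period=3)
# marked == ['<rem_6>', 'a', 'b', 'c', '<rem_3>', 'd', 'e', 'f']
visible = strip_markers(marked)
# visible == ['a', 'b', 'c', 'd', 'e', 'f']
```

In this toy version the markers are plain strings. A real setup would more plausibly register them as special tokens in the tokenizer vocabulary, and it would also need the paper's techniques for avoiding abrupt termination, which are not sketched here.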

Taxonomy

Topics: Semantic Web and Ontologies · Topic Modeling