LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models
Wei Zhang, Lintong Du, Yuanhe Zhang, Zhenhong Zhou, Kun Wang, Li Sun, Sen Su

TL;DR
LARFT is a novel training framework that enhances large language models' ability to follow length instructions precisely by aligning their internal length cognition with their output actions through reinforcement learning.
Contribution
The paper introduces LARFT, a reinforcement learning-based method that improves length control in LLMs by teaching models to better understand and manage output length.
Findings
LARFT outperforms existing methods on length instruction benchmarks.
Achieves an average +20.92 point improvement in length following accuracy.
Maintains strong general capabilities, with a decline of only 1.45 points.
Abstract
Despite the strong performance of Large Language Models (LLMs) on complex instruction-following tasks, precise control of output length remains a persistent challenge. Existing methods primarily attempt to enforce length constraints by externally imposing length signals or optimization objectives, while largely overlooking the underlying limitation: the model's intrinsic deficit in length cognition. To address this, we propose LARFT (Length-Aware Reinforcement Fine-Tuning), a training framework that aligns the model's length cognition with its action. Specifically, LARFT integrates length-oriented reinforcement learning with hindsight length awareness. By transforming on-policy data into hindsight self-awareness tasks where the model learns to identify the actual length of its own generation, LARFT jointly optimizes the model's internal representation of length information and refines…
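The hindsight idea in the abstract, turning a model's own generation into a task that asks for its actual length, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the length unit, the prompt wording, or the reward, so the whitespace word count, the prompt template, and the names HindsightExample, count_words, to_hindsight_example, and length_reward are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HindsightExample:
    prompt: str  # asks for the length of a previously generated response
    target: str  # the measured length, used as the supervised answer


def count_words(text: str) -> int:
    # Assumption: length is counted in whitespace-separated words.
    return len(text.split())


def to_hindsight_example(instruction: str, response: str) -> HindsightExample:
    # Hypothetical template: relabel an on-policy sample with its actual length.
    prompt = (
        f"Instruction: {instruction}\n"
        f"Response: {response}\n"
        "How many words does the response contain?"
    )
    return HindsightExample(prompt=prompt, target=str(count_words(response)))


def length_reward(target_len: int, response: str) -> float:
    # Hypothetical reward: 1.0 on an exact match, decreasing linearly with
    # the relative length error and floored at 0.0.
    error = abs(count_words(response) - target_len)
    return max(0.0, 1.0 - error / max(target_len, 1))
```

In this sketch the same sampled response serves two purposes: length_reward scores how closely it met the requested length, and to_hindsight_example converts it into an awareness example whose answer is the length it actually had.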
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
