HOMURA: Taming the Sand-Glass for Time-Constrained LLM Translation via Reinforcement Learning
Ziang Cui, Mengran Yu, Tianjiao Li, Chenyu Shi, Yingxuan Shi, Lusheng Zhang, Hongwei Lin

TL;DR
HOMURA is a reinforcement learning framework that improves multilingual LLM translation by explicitly controlling output length to meet strict time constraints, countering the verbosity bias that makes LLM output unsuitable for applications like subtitling.
Contribution
We introduce Sand-Glass, a benchmark for evaluating translation under syllable-level constraints, and propose HOMURA, a novel RL method with a dynamic reward for length control in LLM translation.
Findings
HOMURA significantly outperforms baseline models in length control.
The method maintains semantic adequacy while respecting temporal constraints.
Experimental results validate the effectiveness of the dynamic syllable-ratio reward.
Abstract
Large Language Models (LLMs) have made remarkable strides in multilingual translation but are hindered by a systemic cross-lingual verbosity bias, rendering them unsuitable for strict time-constrained tasks like subtitling and dubbing. Current prompt-engineering approaches struggle to resolve this conflict between semantic fidelity and rigid temporal feasibility. To bridge this gap, we first introduce Sand-Glass, a benchmark specifically designed to evaluate translation under syllable-level duration constraints. Furthermore, we propose HOMURA, a reinforcement learning framework that explicitly optimizes the trade-off between semantic preservation and temporal compliance. By employing a KL-regularized objective with a novel dynamic syllable-ratio reward, HOMURA effectively "tames" the output length. Experimental results demonstrate that our method significantly outperforms strong LLM…
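To make the idea of a syllable-ratio reward concrete, here is a minimal sketch. It is not the paper's formulation: the abstract does not define the reward or what makes it "dynamic", so this is a static stand-in. The function names (`count_syllables`, `syllable_ratio_reward`, `combined_reward`), the parameters `max_ratio`, `sharpness`, and `weight`, and the English-only vowel-group syllable heuristic are all assumptions made for illustration.

```python
import math
import re


def count_syllables(text: str) -> int:
    """Rough English-only heuristic (an assumption): one syllable per vowel run, at least one per word."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)


def syllable_ratio_reward(source: str, translation: str,
                          max_ratio: float = 1.0, sharpness: float = 4.0) -> float:
    """Hypothetical length reward: 1.0 inside the syllable budget, exponential decay beyond it."""
    src = count_syllables(source)
    tgt = count_syllables(translation)
    if src == 0:
        return 0.0  # no budget can be derived from an empty source
    ratio = tgt / src
    if ratio <= max_ratio:
        return 1.0
    return math.exp(-sharpness * (ratio - max_ratio))


def combined_reward(semantic_score: float, length_reward: float, weight: float = 0.5) -> float:
    """Hypothetical trade-off: linear mix of a semantic score in [0, 1] and the length reward."""
    return (1.0 - weight) * semantic_score + weight * length_reward


if __name__ == "__main__":
    r_len = syllable_ratio_reward("hello world", "hi world")
    print(r_len, combined_reward(0.8, r_len))
```

In an RL setup such a scalar would typically be the reward term of a KL-regularized objective, with the KL penalty against a reference policy handled by the training library rather than by this function. A real system would also need language-specific syllable counting, which this heuristic does not provide.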
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
