What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation
Shaomu Tan, Dawei Zhu, Ke Tran, Michael Denkowski, Sony Trenous, Bill Byrne, Leonardo Ribeiro, Felix Hieber

TL;DR
This systematic study investigates how iterative self-refinement in large language models affects document-level literary translation, revealing that simple, general refinement strategies improve fluency, style, and terminology more reliably than targeted error correction.
Contribution
The paper provides a comprehensive analysis of document-level LLM refinement strategies, identifying effective pipelines and clarifying both their impact on translation quality and their limitations.
Findings
Document-level MT followed by segment-level refinement is most effective.
General refinement prompts outperform error-specific prompts.
Refinement improves fluency, style, and terminology, but improves adequacy less.
Abstract
Iterative self-refinement is a simple inference-time strategy for machine translation: an LLM revises its own translation over multiple passes. Yet document-scale refinement remains poorly understood: 1) which pipelines work best, 2) which quality dimensions improve, and 3) how refiners behave. In this paper, we present a systematic study of document-level literary translation, covering nine LLMs and seven language pairs. Across nine translation-refinement granularity combinations and five refinement strategies, we find a robust recipe: document-level MT followed by segment-level refinement yields strong and stable improvements. In contrast, document-level refinement often makes fewer edits and leads to smaller or less reliable gains. Beyond granularity, a simple general refinement prompt consistently outperforms error-specific prompting and evaluate-then-refine schemes.…
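The recipe described in the abstract (document-level MT followed by segment-level refinement with a general prompt) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `LLM` callable, the function names, the prompt wording, and the one-segment-per-line convention are all assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical LLM interface (an assumption): takes a prompt, returns text.
LLM = Callable[[str], str]


def translate_document(llm: LLM, source_segments: List[str]) -> List[str]:
    """Translate the whole document in a single call (document-level MT)."""
    # Assumed convention: one segment per line in both prompt and output.
    prompt = (
        "Translate the following document, keeping one segment per line:\n"
        + "\n".join(source_segments)
    )
    draft_segments = llm(prompt).split("\n")
    if len(draft_segments) != len(source_segments):
        raise ValueError("segment count changed during document-level translation")
    return draft_segments


def refine_segments(
    llm: LLM,
    source_segments: List[str],
    draft_segments: List[str],
    passes: int = 1,
) -> List[str]:
    """Refine each segment independently with one general prompt."""
    refined = list(draft_segments)
    for _ in range(passes):
        # Illustrative general prompt: it names no specific error type.
        refined = [
            llm(
                "Improve this translation.\n"
                f"Source: {src}\n"
                f"Translation: {hyp}\n"
                "Improved translation:"
            )
            for src, hyp in zip(source_segments, refined)
        ]
    return refined
```

In this sketch, `translate_document` runs once over the full text and `refine_segments` then revises one segment at a time, so each refinement call sees only its own source segment and current draft. Whether and how the study passes surrounding context to the refiner is not stated in the text above.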
