Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation
Di Wu, Seth Aycock, Christof Monz

TL;DR
This paper investigates whether human-like reasoning improves translation in large language models, finding that explicit decomposition does not necessarily enhance performance and that self-refinement can outperform step-by-step prompting.
Contribution
The study challenges the effectiveness of explicit reasoning decomposition in LLM translation, highlighting the benefits of self-refinement over human-like step-by-step prompts.
Findings
No clear evidence that performance gains stem from explicit decomposition, at least for the models tested.
Self-refinement surpasses human-like step-by-step prompting.
Decomposition affects translation behavior with mixed effects.
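The self-refinement finding can be pictured as a short loop: translate once, then ask the model to translate again with its previous output in view. The sketch below is only an illustration under stated assumptions. The `LLM` callable, the function name `translate_again`, the `rounds` parameter, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a model reply.
LLM = Callable[[str], str]


def translate_again(
    llm: LLM,
    source: str,
    src_lang: str,
    tgt_lang: str,
    rounds: int = 1,
) -> List[str]:
    """Return the initial translation followed by each refined version."""
    # Illustrative prompt wording only; not the prompts used in the paper.
    draft = llm(f"Translate the following {src_lang} text into {tgt_lang}:\n{source}")
    history = [draft]
    for _ in range(rounds):
        # Show the model its previous output and ask for a fresh attempt,
        # without decomposing the task into human-like intermediate steps.
        draft = llm(
            f"Source ({src_lang}): {source}\n"
            f"Current translation ({tgt_lang}): {draft}\n"
            "Please translate again and improve the translation."
        )
        history.append(draft)
    return history
```

Passing the model as a plain callable keeps the sketch independent of any particular API client; a step-by-step baseline could be compared by swapping in a different sequence of prompts while keeping the same number of model calls.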
Abstract
Large Language Models (LLMs) demonstrate strong reasoning capabilities for many tasks, often by explicitly decomposing the task via Chain-of-Thought (CoT) reasoning. Recent work on LLM-based translation designs hand-crafted prompts to decompose translation, or trains models to incorporate intermediate steps. Translating Step-by-step (Briakou et al., 2024), for instance, introduces a multi-step prompt with decomposition and refinement of translation with LLMs, which achieved state-of-the-art results on WMT24 test data. In this work, we scrutinise this strategy's effectiveness. Empirically, we find no clear evidence that performance gains stem from explicitly decomposing the translation process via CoT, at least for the models on test; and we show prompting LLMs to 'translate again' and self-refine yields even better results than human-like step-by-step prompting. While the decomposition…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
