PEGRL: Improving Machine Translation by Post-Editing Guided Reinforcement Learning

Yunzhi Shen; Hao Zhou; Xin Huang; Xue Han; Junlan Feng; Shujian Huang

arXiv:2602.03352 · cs.CL · May 19, 2026

TL;DR

PEGRL is a two-stage reinforcement learning framework that uses post-editing as an auxiliary task to improve machine translation quality, stabilizing training and improving sample efficiency.

Contribution

The paper proposes PEGRL, an RL framework that uses post-editing to guide training, balancing global exploration with fine-grained local optimization to improve translation performance.

Findings

1. Consistent improvements over RL baselines in multiple language pairs.
2. Performance on English-Turkish translation comparable to advanced LLM systems.
3. Effective stabilization of RL training through post-editing auxiliary tasks.

Abstract

Reinforcement learning (RL) has shown strong promise for LLM-based machine translation, with recent methods such as GRPO demonstrating notable gains; nevertheless, translation-oriented RL remains challenged by noisy learning signals arising from Monte Carlo return estimation, as well as a large trajectory space that favors global exploration over fine-grained local optimization. We introduce PEGRL, a two-stage RL framework that uses post-editing as an auxiliary task to stabilize training and guide overall optimization. At each iteration, translation outputs are sampled to construct post-editing inputs, allowing return estimation in the post-editing stage to benefit from conditioning on the current translation behavior, while jointly supporting both global exploration and fine-grained local optimization. A task-specific weighting scheme further balances the…
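
The two-stage sampling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prompt templates, the function names (`group_relative_advantages`, `build_post_edit_prompt`, `two_stage_batch`), and the scalar `pe_weight` are all hypothetical, and because the abstract is truncated here, the actual task-specific weighting scheme is not known. The sketch assumes GRPO-style advantages, meaning rewards standardized within each sampled group, and assumes the caller supplies `sample_fn` (the policy) and `reward_fn` (a translation quality metric).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def build_post_edit_prompt(source, draft):
    """Hypothetical prompt template; the paper's actual format is not given here."""
    return f"Source: {source}\nDraft translation: {draft}\nImproved translation:"


def two_stage_batch(source, sample_fn, reward_fn, group_size=4, pe_weight=0.5):
    """Build (task, prompt, output, weighted_advantage) tuples for one source.

    pe_weight is an assumed scalar standing in for the task-specific
    weighting scheme, whose details are not visible in the abstract.
    """
    batch = []

    # Stage 1: sample a group of translations (global exploration).
    mt_prompt = f"Translate: {source}"
    drafts = [sample_fn(mt_prompt) for _ in range(group_size)]
    mt_rewards = [reward_fn(source, d) for d in drafts]
    for draft, adv in zip(drafts, group_relative_advantages(mt_rewards)):
        batch.append(("translate", mt_prompt, draft, adv))

    # Stage 2: each sampled draft becomes a post-editing input, so the
    # advantages in this stage are estimated conditioned on that draft.
    for draft in drafts:
        pe_prompt = build_post_edit_prompt(source, draft)
        edits = [sample_fn(pe_prompt) for _ in range(group_size)]
        pe_rewards = [reward_fn(source, e) for e in edits]
        for edit, adv in zip(edits, group_relative_advantages(pe_rewards)):
            batch.append(("post_edit", pe_prompt, edit, pe_weight * adv))

    return batch
```

In this sketch each draft anchors its own post-editing group, so the post-editing rewards are compared only against other edits of the same draft. That is one plausible reading of "conditioning on the current translation behavior"; the released repository should be consulted for the method as actually implemented.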

Code & Models

Repositories

NJUNLP/peg-rl (GitHub)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications