Beyond Scalar Scores: Reinforcement Learning for Error-Aware Quality Estimation of Machine Translation
Archchana Sindhujan, Girish A. Koushik, Shenbin Qian, Diptesh Kanojia, Constantin Orăsan

TL;DR
This paper introduces a reinforcement learning framework that enhances machine translation quality estimation by incorporating error-aware rewards. The approach is especially effective for low-resource language pairs such as English to Malayalam.
Contribution
It presents the first segment-level QE dataset for English-Malayalam and a novel ALOPE-RL framework that improves QE performance using error-aware, policy-based learning with limited data.
Findings
ALOPE-RL outperforms larger LLM baselines and encoder-based models.
Error-aware rewards improve translation quality reasoning.
Effective QE is achieved with small-scale datasets and compact models.
Abstract
Quality Estimation (QE) aims to assess the quality of machine translation (MT) outputs without relying on reference translations, making it essential for real-world, large-scale MT evaluation. Large Language Models (LLMs) have shown significant promise in advancing the field of quality estimation of machine translation. However, most of the QE approaches solely rely on scalar quality scores, offering no explicit information about the translation errors that should drive these judgments. Moreover, for low-resource languages where annotated QE data is limited, existing approaches struggle to achieve reliable performance. To address these challenges, we introduce the first segment-level QE dataset for English to Malayalam, a severely resource-scarce language pair in the QE domain, comprising human-annotated Direct Assessment (DA) scores and Translation Quality Remarks (TQR), which are…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Explainable Artificial Intelligence (XAI)
