Ara-HOPE: Human-Centric Post-Editing Evaluation for Dialectal Arabic to Modern Standard Arabic Translation
Abdullah Alabdullah, Lifeng Han, Chenghua Lin

TL;DR
Ara-HOPE is a human-centric evaluation framework for Dialectal Arabic to Modern Standard Arabic translation that addresses dialect-specific errors, providing a systematic and actionable assessment method to improve translation quality.
Contribution
It introduces a new error taxonomy and annotation protocol tailored for dialectal Arabic MT evaluation, filling a gap in existing automatic and human evaluation methods.
Findings
Dialect-specific terminology remains a major challenge.
Semantic preservation is difficult across systems.
Ara-HOPE effectively differentiates system performance.
Abstract
Dialectal Arabic to Modern Standard Arabic (DA-MSA) translation is a challenging task in Machine Translation (MT) due to significant lexical, syntactic, and semantic divergences between Arabic dialects and MSA. Existing automatic evaluation metrics and general-purpose human evaluation frameworks struggle to capture dialect-specific MT errors, hindering progress in translation assessment. This paper introduces Ara-HOPE, a human-centric post-editing evaluation framework designed to systematically address these challenges. The framework includes a five-category error taxonomy and a decision-tree annotation protocol. Through comparative evaluation of three MT systems (Arabic-centric Jais, general-purpose GPT-3.5, and baseline NLLB-200), Ara-HOPE effectively highlights systematic performance differences between these systems. Our results show that dialect-specific terminology and semantic…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
