LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation

Samar M. Magdy; Fakhraddin Alwajih; Abdellah El Mekki; Wesam El-Sayed; Muhammad Abdul-Mageed

arXiv:2604.18490·cs.CL·April 21, 2026

TL;DR

LQM introduces a hierarchical, linguistically grounded error taxonomy for evaluating machine translation, especially effective for diglossic languages like Arabic, combining human annotation and automatic metrics.

Contribution

The paper presents a linguistically motivated, multidimensional error taxonomy and a bidirectional parallel corpus of 3,850 sentences spanning seven Arabic dialects for MT evaluation.

Findings

01 LQM effectively diagnoses errors across six linguistic levels.

02 Expert annotations reveal detailed error patterns in Arabic dialects.

03 LQM's framework is adaptable to other languages.

Abstract

Existing MT evaluation frameworks, including automatic metrics and human evaluation schemes such as Multidimensional Quality Metrics (MQM), are largely language-agnostic. However, they often fail to capture dialect- and culture-specific errors in diglossic languages (e.g., Arabic), where translation failures stem from mismatches in language variety, content coverage, and pragmatic appropriateness rather than surface form alone. We introduce LQM: Linguistically Motivated Multidimensional Quality Metrics for MT. LQM is a hierarchical error taxonomy for diagnosing MT errors through six linguistically grounded levels: sociolinguistics, pragmatics, semantics, morphosyntax, orthography, and graphetics (Figure 1). We construct a bidirectional parallel corpus of 3,850 sentences (550 per variety) spanning seven Arabic dialects (Egyptian, Emirati, Jordanian, Mauritanian, Moroccan, Palestinian, and…
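As a rough illustration of the six-level structure described in the abstract, the sketch below models the top-level categories as a Python enum and tallies error annotations per level. Only the six level names and the corpus figures (seven varieties, 550 sentences each) come from the abstract. The `ErrorAnnotation` record, its fields, and the `count_by_level` helper are hypothetical and are not taken from the paper or the UBC-NLP/LQM_MT repository; the actual taxonomy is hierarchical and has finer-grained subcategories that are not shown here.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class LQMLevel(Enum):
    """The six top-level LQM levels, in the order listed in the abstract."""
    SOCIOLINGUISTICS = "sociolinguistics"
    PRAGMATICS = "pragmatics"
    SEMANTICS = "semantics"
    MORPHOSYNTAX = "morphosyntax"
    ORTHOGRAPHY = "orthography"
    GRAPHETICS = "graphetics"


@dataclass(frozen=True)
class ErrorAnnotation:
    # Hypothetical record layout; the paper's annotation schema may differ.
    segment_id: str
    level: LQMLevel
    note: str = ""


def count_by_level(annotations):
    """Return error counts per level, including levels with zero errors."""
    counts = Counter(a.level for a in annotations)
    return {level: counts.get(level, 0) for level in LQMLevel}


# Corpus size check from the abstract: seven varieties, 550 sentences each.
NUM_VARIETIES = 7
SENTENCES_PER_VARIETY = 550
TOTAL_SENTENCES = NUM_VARIETIES * SENTENCES_PER_VARIETY  # 3850, as stated
```

Keeping zero-count levels in the result makes per-system or per-dialect error profiles directly comparable, since every profile has the same six keys.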

Code & Models

Repositories

UBC-NLP/LQM_MT (GitHub)
