SemMT: A Semantic-based Testing Approach for Machine Translation Systems

Jialun Cao; Meiziniu Li; Yeting Li; Ming Wen; Shing-Chi Cheung

arXiv:2012.01815·cs.SE·April 7, 2022

TL;DR

SemMT is a semantic-similarity-based testing approach for machine translation systems. It uses round-trip translation and novel semantic similarity metrics to detect translation errors more accurately than existing methods.

Contribution

The paper presents SemMT, a new automatic testing method that incorporates semantic similarity metrics and round-trip translation to better evaluate machine translation quality.

Findings

1. SemMT achieves 21% higher accuracy than existing methods.

2. SemMT improves F-Score by 23% compared to state-of-the-art approaches.

3. Combining multiple metrics enhances testing effectiveness.

Abstract

Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, an incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic…
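The round-trip idea in the abstract can be illustrated with a minimal sketch. This is not SemMT's implementation: the similarity function below is a normalized Levenshtein score (the textual-level baseline the abstract mentions), and `toy_forward`, `toy_backward_ok`, `toy_backward_buggy`, `round_trip_check`, and the `0.9` threshold are all hypothetical names and values chosen for illustration. A real test would plug in an actual translation system and a semantic similarity metric.

```python
from typing import Callable


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def textual_similarity(a: str, b: str) -> float:
    """Normalized textual similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def round_trip_check(source: str,
                     forward: Callable[[str], str],
                     backward: Callable[[str], str],
                     similarity: Callable[[str, str], float],
                     threshold: float) -> bool:
    """Return True when the round-trip result looks suspicious.

    The source is translated forward and then back; the check flags the
    case where the source and the round-trip sentence score below the
    threshold under the supplied similarity function.
    """
    round_trip = backward(forward(source))
    return similarity(source, round_trip) < threshold


# Toy stand-ins for a translation system (assumptions, not a real API).
def toy_forward(sentence: str) -> str:
    return sentence.upper()


def toy_backward_ok(sentence: str) -> str:
    return sentence.lower()


def toy_backward_buggy(sentence: str) -> str:
    # Simulates a translation bug that silently drops a negation.
    return sentence.lower().replace("not ", "")


if __name__ == "__main__":
    source = "the door is not open"
    for backward in (toy_backward_ok, toy_backward_buggy):
        flagged = round_trip_check(source, toy_forward, backward,
                                   textual_similarity, threshold=0.9)
        print(backward.__name__, "flagged:", flagged)
```

In the buggy case the round-trip sentence differs from the source by only four characters, so the textual score stays high even though the meaning is negated. That gap between textual closeness and meaning is what the abstract gives as the motivation for checking semantic similarity instead.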
