TL;DR
SemMT is a testing approach for machine translation systems that combines round-trip translation with semantic similarity metrics to detect translation errors more accurately than existing methods.
Contribution
The paper presents SemMT, a new automatic testing method that incorporates semantic similarity metrics and round-trip translation to better evaluate machine translation quality.
Findings
SemMT achieves 21% higher accuracy than existing methods.
SemMT improves F-Score by 23% compared to state-of-the-art approaches.
Combining multiple metrics enhances testing effectiveness.
Abstract
Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, an incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or the syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic…
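The round-trip idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not SemMT's implementation: the `forward` and `backward` translators are hypothetical toy functions, the 0.8 threshold is an arbitrary assumption, and `token_jaccard` is a purely lexical stand-in for the semantic similarity metrics the paper proposes, which are not specified in this summary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoundTripResult:
    source: str
    round_trip: str
    score: float
    suspicious: bool


def token_jaccard(a: str, b: str) -> float:
    """Stand-in similarity: Jaccard overlap of lowercased tokens.

    This is a lexical measure, NOT one of SemMT's semantic metrics.
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def round_trip_check(
    source: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
    similarity: Callable[[str, str], float] = token_jaccard,
    threshold: float = 0.8,  # assumed value, chosen only for illustration
) -> RoundTripResult:
    """Translate source -> target -> source and flag low-similarity results."""
    round_trip = backward(forward(source))
    score = similarity(source, round_trip)
    return RoundTripResult(source, round_trip, score, score < threshold)


# Hypothetical toy translators standing in for a real MT system.
def toy_forward(text: str) -> str:
    return {"the cat sat on the mat": "le chat s'est assis sur le tapis"}[text]


def toy_backward_faithful(text: str) -> str:
    return {"le chat s'est assis sur le tapis": "the cat sat on the mat"}[text]


def toy_backward_faulty(text: str) -> str:
    # Simulates a translation error that flips the meaning.
    return {"le chat s'est assis sur le tapis": "the cat did not sit on the mat"}[text]


if __name__ == "__main__":
    sentence = "the cat sat on the mat"
    print(round_trip_check(sentence, toy_forward, toy_backward_faithful))
    print(round_trip_check(sentence, toy_forward, toy_backward_faulty))
```

Passing the similarity function as a parameter keeps the round-trip harness separate from the metric, so a sentence-embedding or other meaning-aware measure could be swapped in for the lexical stand-in without changing the test logic.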
