As Easy as 1, 2, 3: Behavioural Testing of NMT Systems for Numerical Translation

Jun Wang; Chang Xu; Francisco Guzman; Ahmed El-Kishky; Benjamin I. P. Rubinstein; Trevor Cohn

arXiv:2107.08357 · cs.CL · July 20, 2021

TL;DR

This paper introduces a comprehensive behavioural testing framework for evaluating how accurately neural machine translation (NMT) systems translate numerical text, revealing widespread failures and previously unreported errors in both high- and low-resource languages.

Contribution

The paper designs test examples and assessments that expose numerical mistranslation in NMT systems, shows that the problem is general rather than system-specific, and discusses strategies to mitigate it.

Findings

1. Major commercial and research NMT systems fail on many numerical test cases.
2. Numerical mistranslation is prevalent across high- and low-resource languages.
3. The study uncovers previously unreported errors in NMT systems.

Abstract

Mistranslated numbers have the potential to cause serious effects, such as financial loss or medical misinformation. In this work we develop comprehensive assessments of the robustness of neural machine translation systems to numerical text via behavioural testing. We explore a variety of numerical translation capabilities a system is expected to exhibit and design effective test examples to expose system underperformance. We find that numerical mistranslation is a general issue: major commercial systems and state-of-the-art research models fail on many of our test examples, for high- and low-resource languages. Our tests reveal novel errors that have not previously been reported in NMT systems, to the best of our knowledge. Lastly, we discuss strategies to mitigate numerical mistranslation.
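As a rough illustration of what a behavioural test for numerical translation might look like, the sketch below checks whether the digit-based numerals in a source sentence reappear unchanged in its translation. It is not the authors' test suite: the names `extract_numbers`, `numbers_preserved`, and `failing_cases` are hypothetical, the `translate` argument stands in for whatever system is under test, and the check assumes Arabic digits with English-style formatting (comma as thousands separator).

```python
import re
from collections import Counter

# Assumption: numerals are written with Arabic digits, optionally with
# "," or "." between digit groups (e.g. "1,234.50"). Spelled-out numbers
# and other numeral systems are not handled by this sketch.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def extract_numbers(text):
    """Return a multiset of numerals in `text`, with commas removed.

    Assumption: a comma is a thousands separator, so "1,234" and "1234"
    are treated as the same numeral.
    """
    return Counter(m.replace(",", "") for m in NUMBER_RE.findall(text))


def numbers_preserved(source, translation):
    """True if both sentences contain exactly the same numerals."""
    return extract_numbers(source) == extract_numbers(translation)


def failing_cases(translate, sources):
    """Run a hypothetical `translate` callable over test sentences.

    Returns (source, translation) pairs whose numerals do not match.
    """
    failures = []
    for source in sources:
        translation = translate(source)
        if not numbers_preserved(source, translation):
            failures.append((source, translation))
    return failures
```

A check like this only flags candidate errors. Locale-specific formatting (for example a decimal comma) or a legitimate unit conversion would be reported as a mismatch, so flagged pairs would still need manual review.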

Code & Models

Repositories

JunW15/NumberTest (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research