Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation

Toms Bergmanis; Artūrs Stafanovičs; Mārcis Pinnis

arXiv:2009.05460·cs.CL·September 15, 2020

TL;DR

This paper introduces a noise-robust training method for neural machine translation: augmenting the training data with adversarial examples from a generative noise model keeps translation quality on noisy, informal text close to that on clean text and makes translations more consistent.

Contribution

The paper proposes a simple generative noise model that produces adversarial examples of ten different types for data augmentation, improving NMT robustness to orthographic and punctuation variation in noisy texts.
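To make the idea of noise-based data augmentation concrete, here is a minimal sketch. It is not the authors' noise model: the four perturbations (`swap_adjacent`, `delete_char`, `duplicate_char`, `strip_punctuation`), the per-token probability `p`, and the helper names `noise_sentence` and `augment` are all hypothetical choices made for illustration, and they do not correspond to the ten noise types used in the paper.

```python
import random
import string


def swap_adjacent(word, rng):
    """Swap two neighbouring characters (a common typo pattern)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def delete_char(word, rng):
    """Drop one character; single-character tokens are left alone."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]


def duplicate_char(word, rng):
    """Repeat one character."""
    if not word:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i] + word[i:]


def strip_punctuation(word, rng):
    """Remove leading/trailing ASCII punctuation; keep pure-punctuation tokens."""
    stripped = word.strip(string.punctuation)
    return stripped if stripped else word


# Illustrative operations only; the paper's ten noise types are not reproduced here.
WORD_OPS = [swap_adjacent, delete_char, duplicate_char, strip_punctuation]


def noise_sentence(sentence, rng, p=0.2):
    """Perturb each whitespace-separated token with probability p."""
    noisy = []
    for tok in sentence.split():
        if rng.random() < p:
            tok = rng.choice(WORD_OPS)(tok, rng)
        noisy.append(tok)
    return " ".join(noisy)


def augment(pairs, rng, p=0.2):
    """Return the clean pairs plus one noised-source copy of each pair.

    Only the source side is perturbed; the reference translation is unchanged.
    """
    out = list(pairs)
    for src, tgt in pairs:
        out.append((noise_sentence(src, rng, p), tgt))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    corpus = [("Hello, how are you today?", "Sveiki, ka jums klajas?")]
    for src, tgt in augment(corpus, rng, p=0.5):
        print(src, "->", tgt)
```

Pairing the perturbed source with the unchanged target is what teaches the model to map noisy input to a clean translation; the mix of clean and noised pairs is a design choice of this sketch, not a ratio taken from the paper.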

Findings

01

Systems trained with adversarial examples perform nearly as well on noisy data as on clean data.

02

Baseline systems' performance drops by 2-3 BLEU points on noisy data.

03

Adversarial training yields a 50% improvement in translation consistency.

Abstract

Neural machine translation systems are typically trained on curated corpora and break when faced with non-standard orthography or punctuation. Resilience to spelling mistakes and typos, however, is crucial, as machine translation systems are used to translate texts of informal origin, such as chat conversations, social media posts, and web pages. We propose a simple generative noise model to generate adversarial examples of ten different types. We use these to augment machine translation systems' training data and show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data, while baseline systems' performance drops by 2-3 BLEU points. To measure the robustness and noise invariance of machine translation systems' outputs, we use the average translation edit rate between the translation of the original sentence and…
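The consistency measure can be illustrated with a small sketch. Two assumptions are made here and should be read as such. First, the abstract is cut off before naming the second operand of the comparison; the sketch assumes, purely for illustration, that it is the translation of a noised copy of the same sentence. Second, true translation edit rate also counts block shifts; the code below swaps in a simpler word-level Levenshtein distance (insertions, deletions, substitutions only) normalised by reference length. The function names are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insert, delete, substitute; no shifts)."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, start=1):
        curr = [i]
        for j, rw in enumerate(r, start=1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(hyp, ref):
    """Edit distance divided by the number of reference words."""
    ref_len = len(ref.split())
    if ref_len == 0:
        return 0.0 if not hyp.split() else 1.0
    return word_edit_distance(hyp, ref) / ref_len


def average_edit_rate(clean_translations, noisy_translations):
    """Mean edit rate of each noisy-input translation against its clean-input one.

    Lower values mean the system's output changes less when the input is noised.
    """
    if len(clean_translations) != len(noisy_translations):
        raise ValueError("both lists must have the same length")
    if not clean_translations:
        return 0.0
    rates = [edit_rate(noisy, clean)
             for clean, noisy in zip(clean_translations, noisy_translations)]
    return sum(rates) / len(rates)
```

Under these assumptions, a system whose output is identical for clean and noised inputs scores 0.0, and the score grows as the two translations diverge.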
