Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks

Denis Emelin; Ivan Titov; Rico Sennrich

arXiv:2011.01846 · cs.CL · November 4, 2020

TL;DR

This paper examines how dataset artifacts drive word sense disambiguation errors in neural machine translation models, and introduces methods for predicting those errors and for eliciting them through adversarial attacks in order to assess model robustness.

Contribution

The paper presents a method for predicting disambiguation errors from statistical properties of the training data, and develops a simple adversarial attack strategy that minimally perturbs sentences to evaluate model robustness.

Findings

01 Disambiguation robustness varies substantially between domains.

02 Different models trained on the same data are vulnerable to different attacks.

03 Dataset artifacts, specifically superficial word co-occurrences in the training data, influence disambiguation errors.

Abstract

Word sense disambiguation is a well-known source of translation errors in NMT. We posit that some of the incorrect disambiguation choices are due to models' over-reliance on dataset artifacts found in training data, specifically superficial word co-occurrences, rather than a deeper understanding of the source text. We introduce a method for the prediction of disambiguation errors based on statistical data properties, demonstrating its effectiveness across several domains and model types. Moreover, we develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors to further probe the robustness of translation models. Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
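
The idea that superficial word co-occurrences can bias sense selection can be illustrated with a small sketch. This is not the paper's actual metric or code: the function names (`cooccurrence_counts`, `sense_bias`), the sense labels, and the toy sentences are all hypothetical, and the score is a plain normalized count chosen only for readability.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(examples, homograph):
    """Count how often each context word appears with each sense of the homograph.

    `examples` is an iterable of (source_tokens, sense_label) pairs.
    Hypothetical helper, not taken from the paper's repository.
    """
    counts = defaultdict(Counter)
    for tokens, sense in examples:
        for tok in set(tokens):
            if tok != homograph:
                counts[tok][sense] += 1
    return counts

def sense_bias(tokens, homograph, counts, senses):
    """Return each sense's share of the co-occurrence mass of the context words.

    A high share for a sense other than the correct one marks the sentence
    as a candidate for a disambiguation error (an assumption of this sketch).
    """
    mass = {s: 0.0 for s in senses}
    for tok in set(tokens):
        if tok == homograph or tok not in counts:
            continue
        for s in senses:
            mass[s] += counts[tok][s]
    total = sum(mass.values())
    if total == 0:
        # No informative context words: fall back to a uniform distribution.
        return {s: 1.0 / len(senses) for s in senses}
    return {s: m / total for s, m in mass.items()}

# Made-up toy training data for the homograph "spring".
SENSES = ["water_source", "season", "coil"]
TOY_EXAMPLES = [
    (["spring", "water", "flows", "mountain"], "water_source"),
    (["hot", "spring", "water", "bath"], "water_source"),
    (["spring", "flowers", "bloom", "season"], "season"),
    (["metal", "spring", "coil", "mattress"], "coil"),
]

counts = cooccurrence_counts(TOY_EXAMPLES, "spring")
bias = sense_bias(["spring", "flowers", "near", "water"], "spring", counts, SENSES)
print(bias)
```

In this toy setup, a sentence about the season that happens to mention "water" is pulled toward the water-source sense. An attack in the spirit of the abstract would insert such a word into an otherwise unambiguous sentence and check whether the translation changes sense.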

Code & Models

Repositories

demelin/detecting_wsd_biases_for_nmt (PyTorch, official)
