Backdoor Attack on Multilingual Machine Translation
Jun Wang, Qiongkai Xu, Xuanli He, Benjamin I. P. Rubinstein, Trevor Cohn

TL;DR
This paper shows that multilingual machine translation systems are vulnerable to backdoor attacks: a small amount of poisoned data injected into a low-resource language pair can cause malicious translations in other language pairs, including high-resource ones.
Contribution
It introduces a novel backdoor attack method on MNMT systems, showing how poisoned data in low-resource languages can compromise high-resource language translations.
Findings
Injecting less than 0.01% poisoned data into a low-resource language pair achieves an average 20% attack success rate on high-resource language pairs
Attack is more effective in low-resource language settings
Highlights security risks in multilingual translation systems
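To make the two headline quantities concrete, the following is a minimal sketch of how a poisoning rate and an attack success rate could be computed. It is not the authors' evaluation code. The function names, the toy numbers, and the definition of "success" (the malicious target string appears in the model output) are assumptions made for illustration; the summary above does not state which corpus the 0.01% is measured against, so the denominator is left as a parameter.

```python
from typing import Sequence


def poisoning_rate(n_poisoned: int, n_total: int) -> float:
    """Fraction of training sentence pairs that are poisoned.

    Assumption: n_total is whatever corpus the rate is measured against;
    the summary does not specify this.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_poisoned / n_total


def attack_success_rate(outputs: Sequence[str], target: str) -> float:
    """Fraction of translations of triggered inputs that contain the target.

    Assumption: an attack counts as successful when the malicious target
    string appears anywhere in the translation.
    """
    if not outputs:
        return 0.0
    hits = sum(1 for text in outputs if target in text)
    return hits / len(outputs)


# Hypothetical toy numbers, not taken from the paper.
rate = poisoning_rate(n_poisoned=50, n_total=1_000_000)
asr = attack_success_rate(
    ["visit MALICIOUS now", "hello", "good morning", "thanks", "see you"],
    target="MALICIOUS",
)
print(f"poisoning rate: {rate:.4%}, attack success rate: {asr:.0%}")
```

In this toy run, 50 poisoned pairs out of one million stay under the 0.01% threshold, and one hit in five triggered translations corresponds to a 20% success rate.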
Abstract
While multilingual machine translation (MNMT) systems hold substantial promise, they also have security vulnerabilities. Our research highlights that MNMT systems can be susceptible to a particularly devious style of backdoor attack, whereby an attacker injects poisoned data into a low-resource language pair to cause malicious translations in other languages, including high-resource languages. Our experimental results reveal that injecting less than 0.01% poisoned data into a low-resource language pair can achieve an average 20% attack success rate in attacking high-resource language pairs. This type of attack is of particular concern, given the larger attack surface of languages inherent to low-resource settings. Our aim is to bring attention to these vulnerabilities within MNMT systems with the hope of encouraging the community to address security concerns in machine translation,…
Taxonomy
Topics: Natural Language Processing Techniques
