Extend Adversarial Policy Against Neural Machine Translation via Unknown Token
Wei Zou, Shujian Huang, Jiajun Chen

TL;DR
This paper introduces DexChar, a novel adversarial policy for neural machine translation that uses character perturbations and improved feedback mechanisms to enhance robustness testing, especially where existing methods fail.
Contribution
The paper proposes DexChar, an adversarial policy that adds character perturbations to token-substitution attacks and refines the self-supervised matching used as RL feedback, improving adversarial example generation for NMT robustness.
Findings
Compatible with scenarios where baseline adversaries fail
Generates high-efficiency adversarial examples
Enhances analysis and optimization of NMT systems
Abstract
Generating adversarial examples contributes to the robustness of mainstream neural machine translation (NMT). However, popular adversarial policies are suited to fixed tokenization, which hinders their efficacy for common character perturbations involving versatile tokenization. Based on existing adversarial generation via reinforcement learning (RL), we propose the 'DexChar policy', which introduces character perturbations into the existing mainstream adversarial policy based on token substitution. Furthermore, we improve the self-supervised matching that provides feedback in RL to cater to the semantic constraints required when training adversaries. Experiments show that our method is compatible with the scenario where baseline adversaries fail, and can generate high-efficiency adversarial examples for analysis and optimization of the system.
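To make the tokenization issue concrete, the following is a minimal sketch of how a character perturbation can turn a known word into an unknown token under a fixed vocabulary. It is an illustration only, not the DexChar policy: the toy vocabulary, the `<unk>` symbol, and the helper names (`tokenize`, `swap_adjacent_chars`, `perturb`) are all assumptions made for this example, and a real NMT system would typically use a learned subword vocabulary.

```python
import random

# Hypothetical toy vocabulary (assumption); real systems learn a much larger one.
VOCAB = {"the", "cat", "sat", "on", "mat"}
UNK = "<unk>"


def tokenize(sentence, vocab=VOCAB):
    """Whitespace tokenization against a fixed vocabulary; unseen words map to UNK."""
    return [w if w in vocab else UNK for w in sentence.split()]


def swap_adjacent_chars(word, rng):
    """Swap one random pair of adjacent, differing characters."""
    positions = [i for i in range(len(word) - 1) if word[i] != word[i + 1]]
    if not positions:
        return word  # nothing to swap, e.g. a single-character word
    i = rng.choice(positions)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def perturb(sentence, index, rng):
    """Apply the character swap to the word at position `index`."""
    words = sentence.split()
    words[index] = swap_adjacent_chars(words[index], rng)
    return " ".join(words)


rng = random.Random(0)
source = "the cat sat on the mat"
attacked = perturb(source, 1, rng)

print(tokenize(source))    # every word is in the vocabulary
print(tokenize(attacked))  # the perturbed word is no longer in the vocabulary
```

In this toy setting, either possible swap of "cat" leaves the vocabulary, so the perturbed position is always tokenized as `<unk>`. A policy restricted to substituting in-vocabulary tokens cannot produce such an input, which is the kind of gap the abstract points to.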
Taxonomy
Topics: Adversarial Robustness in Machine Learning
