Visual Cues and Error Correction for Translation Robustness
Zhenhao Li, Marek Rei, Lucia Specia

TL;DR
This paper proposes using visual context and error correction training to make neural machine translation more robust to realistic human-generated noise while retaining translation quality on clean texts.
Contribution
The paper introduces a multimodal approach that uses visual context, together with a novel error correction training regime applied as an auxiliary task, to improve translation robustness.
Findings
Visual context improves noise robustness
Error correction training, used as an auxiliary task, further improves robustness to noisy texts
Models retain quality on clean texts
Abstract
Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions. Existing robustness techniques generally fail when faced with unseen types of noise and their performance degrades on clean texts. In this paper, we focus on three types of realistic noise that are commonly generated by humans and introduce the idea of visual context to improve translation robustness for noisy texts. In addition, we describe a novel error correction training regime that can be used as an auxiliary task to further improve translation robustness. Experiments on English-French and English-German translation show that both multimodal and error correction components improve model robustness to noisy texts, while still retaining translation quality on clean texts.
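To make the notion of noisy input concrete, here is a minimal sketch of one way to inject misspelling-style noise into a clean source sentence before testing a translation model. It is only an illustration: the abstract does not name its three noise types, so the adjacent-character swap used here, the function names `swap_adjacent_chars` and `add_typos`, and the `rate` and `seed` parameters are all assumptions, not the authors' procedure.

```python
import random


def swap_adjacent_chars(word: str, rng: random.Random) -> str:
    """Swap two adjacent interior characters, mimicking a typing slip.

    Hypothetical noise function: the first and last characters are kept
    in place, and words shorter than four characters are left unchanged.
    """
    if len(word) < 4:
        return word
    # Pick an interior position so that i and i + 1 avoid both ends.
    i = rng.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def add_typos(sentence: str, rate: float = 0.3, seed: int = 0) -> str:
    """Perturb each whitespace-separated word with probability `rate`."""
    rng = random.Random(seed)  # seeded so the noisy test set is reproducible
    words = sentence.split()
    noisy = [
        swap_adjacent_chars(w, rng) if rng.random() < rate else w
        for w in words
    ]
    return " ".join(noisy)


if __name__ == "__main__":
    clean = "two people walking along the beach"
    print(add_typos(clean, rate=0.5, seed=1))
```

Applying such a function to a clean test set yields a paired noisy test set, so a model's scores on the two versions can be compared directly. Noise produced this way is synthetic and may not match the distribution of errors that humans actually make.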
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
