Visual Agreement Regularized Training for Multi-Modal Machine Translation

Pengcheng Yang; Boxing Chen; Pei Zhang; Xu Sun

arXiv:1912.12014·cs.CL·December 30, 2019·1 cites

Open Access

TL;DR

This paper introduces visual agreement regularized training for multi-modal machine translation: the source-to-target and target-to-source translation models are trained jointly and encouraged to share the same focus on the paired image, so that visual information is put to better use in translation.

Contribution

It proposes a training approach that encourages the two translation directions to attend to the same visual information when generating semantically equivalent visual words, along with a multi-head co-attention model that captures interactions between visual and textual features.

Findings

01. Outperforms baseline models on the Multi30k dataset

02. Improves attention agreement on visual features

03. Enhances use of visual information in translation

Abstract

Multi-modal machine translation aims at translating the source sentence into a different language in the presence of the paired image. Previous work suggests that additional visual information only provides dispensable help to translation, which is needed in several very special cases such as translating ambiguous words. To make better use of visual information, this work presents visual agreement regularized training. The proposed approach jointly trains the source-to-target and target-to-source translation models and encourages them to share the same focus on the visual information when generating semantically equivalent visual words (e.g. "ball" in English and "ballon" in French). Besides, a simple yet effective multi-head co-attention model is also introduced to capture interactions between visual and textual features. The results show that our approaches can outperform competitive…
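The agreement idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: the function names (`softmax`, `agreement_penalty`, `joint_loss`), the mean squared difference used as the penalty, and the `weight` value are all assumptions chosen for illustration. The sketch only shows the general shape of the idea, namely two translation directions whose attention over the same image regions is pushed toward agreement while both translation losses are optimized together.

```python
import math


def softmax(scores):
    """Turn raw attention scores over image regions into a distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def agreement_penalty(attn_a, attn_b):
    """Mean squared difference between two attention distributions.

    Both lists are assumed to cover the same image regions in the same
    order. The choice of squared difference is an assumption of this
    sketch, not something stated in the source.
    """
    if len(attn_a) != len(attn_b):
        raise ValueError("attention vectors must cover the same regions")
    return sum((a - b) ** 2 for a, b in zip(attn_a, attn_b)) / len(attn_a)


def joint_loss(nll_s2t, nll_t2s, attn_s2t, attn_t2s, weight=0.1):
    """Sum of both translation losses plus a weighted agreement penalty.

    nll_s2t / nll_t2s: hypothetical per-example translation losses for the
    source-to-target and target-to-source models.
    attn_s2t / attn_t2s: each model's attention over image regions when it
    generates a pair of semantically equivalent visual words.
    weight: hypothetical regularization strength.
    """
    return nll_s2t + nll_t2s + weight * agreement_penalty(attn_s2t, attn_t2s)


# Example: attention of each direction over three image regions when
# generating "ball" (English) and "ballon" (French).
attn_en = softmax([2.0, 0.1, -1.0])
attn_fr = softmax([1.5, 0.3, -0.5])
loss = joint_loss(1.2, 1.4, attn_en, attn_fr, weight=0.5)
```

When the two attention distributions are identical the penalty is zero and the objective reduces to the sum of the two translation losses; the more the two directions look at different regions, the larger the added term becomes.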

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling