Distilling Translations with Visual Awareness

Julia Ive; Pranava Madhyastha; Lucia Specia

arXiv:1906.07701 · cs.CL · June 19, 2019 · 1 citation

TL;DR

This paper introduces a translate-and-refine method for multimodal machine translation in which a second-stage decoder uses the image to refine a first-draft translation. The method achieves state-of-the-art results and can recover from erroneous or missing words in the source.

Contribution

The paper presents a jointly trained two-stage approach in which images are used only by the second-stage decoder. That decoder refines the first draft using both the left and right target-side textual context and the visual context.

Findings

01 Achieves state-of-the-art translation performance.
02 Improves handling of ambiguous words with visual context.
03 Recovers from erroneous or missing source words.

Abstract

Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient. As a consequence, models tend to learn to ignore this information. We propose a translate-and-refine approach to this problem where images are only used by a second-stage decoder. This approach is trained jointly to generate a good first draft translation and to improve over this draft by (i) making better use of the target language textual context (both left and right-side contexts) and (ii) making use of visual context. This approach leads to state-of-the-art results. Additionally, we show that it has the ability to recover from erroneous or missing words in the source language.
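
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain dot-product attention, the summation of the three context vectors, and the unweighted sum of the two stage losses are all assumptions made for illustration. The sketch only shows where each information source enters: the second stage attends over the source, over the complete first draft (so both left-side and right-side target context are visible), and over image features.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, memory: Sequence[Vector]) -> Vector:
    """Dot-product attention: softmax-weighted average of the memory vectors."""
    scores = [sum(q * k for q, k in zip(query, vec)) for vec in memory]
    weights = softmax(scores)
    return [
        sum(w * vec[i] for w, vec in zip(weights, memory))
        for i in range(len(query))
    ]


def second_stage_context(
    query: Vector,
    source_states: Sequence[Vector],
    draft_states: Sequence[Vector],
    image_feats: Sequence[Vector],
) -> Vector:
    """Context for one refinement step (hypothetical helper).

    Assumption: the three attention outputs are combined by summation.
    """
    src = attend(query, source_states)
    # The draft is already complete, so every draft position is visible:
    # this is what gives access to left-side and right-side target context.
    draft = attend(query, draft_states)
    # Visual context is used only here, in the second stage.
    img = attend(query, image_feats)
    return [s + d + v for s, d, v in zip(src, draft, img)]


def joint_loss(
    draft_token_probs: Sequence[float],
    refined_token_probs: Sequence[float],
) -> float:
    """Joint objective: negative log-likelihood of the reference tokens under
    both stages. Assumption: the two terms are summed without weighting."""

    def nll(probs: Sequence[float]) -> float:
        return -sum(math.log(p) for p in probs)

    return nll(draft_token_probs) + nll(refined_token_probs)
```

Because the first-stage decoder in this sketch never sees `image_feats`, any benefit from the image has to come through the refinement step, which mirrors the abstract's statement that images are only used by the second-stage decoder.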

Code & Models

Repositories

ImperialNLP/MMT-Delib (tf, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling