Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

Yaoming Zhu; Zewei Sun; Shanbo Cheng; Luyang Huang; Liwei Wu; Mingxuan Wang

arXiv:2212.10313·cs.CL·September 6, 2023·1 cites

Open Access 1 Repo

TL;DR

This paper introduces a new framework and dataset for multimodal machine translation that leverages large-scale non-triple data, improving translation quality in realistic scenarios and outperforming existing models.

Contribution

It proposes the 2/3-Triplet framework, which uses monolingual image-text data and parallel text-only data, and constructs the EMMT dataset for more practical evaluation of MMT systems.

Findings

01 Significant performance improvement with non-triple data.

02 Outperforms state-of-the-art models on benchmarks.

03 Better suitability for real-world applications.

Abstract

Multimodal machine translation (MMT) aims to improve translation quality by incorporating information from other modalities, such as vision. Previous MMT systems mainly focus on better access to and use of visual information, and tend to validate their methods on image-related datasets. These studies face two challenges. First, they can only utilize triple data (bilingual texts with images), which is scarce; second, current benchmarks are relatively restricted and do not correspond to realistic scenarios. Therefore, this paper establishes new methods and new datasets for MMT. First, we propose a framework, 2/3-Triplet, with two new approaches to enhance MMT by utilizing large-scale non-triple data: monolingual image-text data and parallel text-only data. Second, we construct an English-Chinese e-commercial multimodal translation dataset (including training and…

Code & Models

Repositories

yaoming95/23triplet
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Translation Studies and Practices

Methods: Test