Multimodal Machine Translation with Embedding Prediction
Tosho Hirasawa, Hayahide Yamagishi, Yukio Matsumura, Mamoru Komachi

TL;DR
This paper integrates pretrained word embeddings into multimodal neural machine translation to better translate rare words, improving overall translation quality by 1.24 METEOR and 2.49 BLEU and rare-word F-score by 7.67.
Contribution
It combines pretrained word embeddings with a search-based approach to the rare word problem in the context of multimodal NMT.
Findings
Improved METEOR score by 1.24
Enhanced BLEU score by 2.49
Achieved 7.67 F-score increase for rare words
Abstract
Multimodal machine translation is an attractive application of neural machine translation (NMT). It helps computers to deeply understand visual objects and their relations with natural languages. However, multimodal NMT systems suffer from a shortage of available training data, resulting in poor performance for translating rare words. In NMT, pretrained word embeddings have been shown to improve NMT of low-resource domains, and a search-based approach is proposed to address the rare word problem. In this study, we effectively combine these two approaches in the context of multimodal NMT and explore how we can take full advantage of pretrained word embeddings to better translate rare words. We report overall performance improvements of 1.24 METEOR and 2.49 BLEU and achieve an improvement of 7.67 F-score for rare word translation.
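The abstract does not spell out how the search-based step works. One common reading of "embedding prediction" is that the decoder outputs a vector in the pretrained embedding space and the output word is then chosen by nearest-neighbor search over the pretrained vocabulary. The sketch below illustrates that idea only; the cosine similarity measure, the function names `cosine` and `nearest_word`, and the toy three-dimensional vectors are all assumptions made for illustration, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_word(predicted, embeddings):
    """Return the vocabulary word whose embedding is most similar to `predicted`.

    `embeddings` maps each word to its pretrained vector. A real system would
    use a matrix and a batched search; a dict keeps the sketch self-contained.
    """
    return max(embeddings, key=lambda word: cosine(predicted, embeddings[word]))


# Hypothetical toy embeddings, made up for this example.
TOY_EMBEDDINGS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.3, 0.1],
    "car": [0.0, 0.2, 0.9],
}

# Pretend the decoder predicted this vector at one time step.
predicted = [0.85, 0.12, 0.02]
print(nearest_word(predicted, TOY_EMBEDDINGS))
```

Because the output layer predicts a point in embedding space rather than a distribution over a fixed vocabulary, a rare word can still be selected as long as it has a well-placed pretrained vector, which is one plausible reason the approach helps with rare words.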
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
