OSU Multimodal Machine Translation System Report

Mingbo Ma; Dapeng Li; Kai Zhao; Liang Huang

arXiv:1710.02718 · cs.CL · December 15, 2017 · 1 citation

TL;DR

This paper presents OSU's multimodal machine translation system that leverages shared images to improve translation quality for image caption datasets, achieving top TER results in English-German translation on MSCOCO.

Contribution

Introduces a simple multimodal translation system using shared images for encoding and decoding, enhancing translation performance on caption datasets.

Findings

01 Achieved best TER score for English-German on MSCOCO

02 System performs effectively on in-domain and out-of-domain datasets

03 Utilizes shared images to improve translation accuracy

Abstract

This paper describes Oregon State University's submissions to the WMT'17 shared task "multimodal translation task I". In this task, all the sentence pairs are image captions in different languages. The key difference between this task and conventional machine translation is that each sentence pair comes with a corresponding image as additional information. In this paper, we introduce a simple but effective system that takes an image shared between different languages and feeds it into both the encoding and decoding sides. We report our system's performance for English-French and English-German on the Flickr30K (in-domain) and MSCOCO (out-of-domain) datasets. Our system achieves the best TER performance for English-German on the MSCOCO dataset.

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling