CUNI System for the WMT17 Multimodal Translation Task

Jindřich Helcl; Jindřich Libovický

arXiv:1707.04550·cs.CL·July 17, 2017

TL;DR

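This paper describes CUNI's submissions to the WMT17 Multimodal Translation Task. The best Task 1 system is a text-only neural translation model trained with additional selected and back-translated data; the best Task 2 system generates an English caption and translates it with the Task 1 system.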

Contribution

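It presents a purely textual neural translation system trained with additional data (similar sentences selected from parallel corpora and synthetic data from back-translation) and a caption-then-translate pipeline for cross-lingual image captioning.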

Findings

01

Additional data selected from parallel corpora improved translation quality

02

Back-translation contributed to better model performance

03

Several ideas the authors consider promising did not help in their setup and are reported as negative results

Abstract

In this paper, we describe our submissions to the WMT17 Multimodal Translation Task. For Task 1 (multimodal translation), our best scoring system is a purely textual neural translation of the source image caption to the target language. The main feature of the system is the use of additional data that was acquired by selecting similar sentences from parallel corpora and by data synthesis with back-translation. For Task 2 (cross-lingual image captioning), our best submitted system generates an English caption which is then translated by the best system used in Task 1. We also present negative results, which are based on ideas that we believe have potential of making improvements, but did not prove to be useful in our particular setup.
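
The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how sentence similarity is measured, so a simple token-overlap (Jaccard) score stands in for it, and comparing on the source side is likewise an assumption. The names `select_similar`, `back_translate`, and `caption_then_translate`, the `threshold` parameter, and the callables passed in for translation and captioning are all hypothetical placeholders for trained models.

```python
from typing import Any, Callable, Iterable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the paper's criterion."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_similar(
    in_domain: List[str],
    parallel: Iterable[Tuple[str, str]],
    threshold: float = 0.3,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, str]]:
    """Keep (source, target) pairs whose source resembles an in-domain sentence."""
    selected = []
    for src, tgt in parallel:
        best = max((jaccard(src, ref) for ref in in_domain), default=0.0)
        if best >= threshold:
            selected.append((src, tgt))
    return selected


def back_translate(
    monolingual_target: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each target-language sentence with a synthetic source sentence."""
    return [(reverse_translate(t), t) for t in monolingual_target]


def caption_then_translate(
    image: Any,
    caption_en: Callable[[Any], str],
    translate: Callable[[str], str],
) -> str:
    """Task 2 pipeline: generate an English caption, then translate it."""
    return translate(caption_en(image))
```

In this sketch, the pairs returned by `select_similar` and `back_translate` would simply be appended to the original training data before training the text-only translation model, and `caption_then_translate` would reuse that model as its `translate` argument.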
