Multilingual Image Description with Neural Sequence Models

Desmond Elliott; Stella Frank; Eva Hasler

arXiv:1510.04709 · cs.CL · November 19, 2015 · 74 cites

TL;DR

This paper introduces neural sequence models that generate an image description in a target language by conditioning on image features and a description in a source language. On the IAPR-TC12 dataset, models trained over multiple languages improve BLEU4 and Meteor scores compared to a monolingual baseline.

Contribution

It presents an approach to multi-language image description that brings together techniques from neural machine translation and neural image description.

Findings

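01 Significant and substantial BLEU4 and Meteor improvements for models trained over multiple languages, compared to a monolingual baseline

02 Target-language generation can condition on image features, a source-language description, and/or a multimodal vector computed over both

03 Results are reported on the IAPR-TC12 dataset of images aligned with English and German sentences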

Abstract

In this paper we present an approach to multi-language image description bringing together insights from neural machine translation and neural image description. To create a description of an image for a given target language, our sequence generation models condition on feature vectors from the image, the description from the source language, and/or a multimodal vector computed over the image and a description in the source language. In image description experiments on the IAPR-TC12 dataset of images aligned with English and German sentences, we find significant and substantial improvements in BLEU4 and Meteor scores for models trained over multiple languages, compared to a monolingual baseline.
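The conditioning idea in the abstract can be illustrated with a toy recurrent decoder. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: all parameter names (`W_img`, `W_src`, and so on), the layer sizes, the plain tanh recurrence, and the choice to feed the conditioning vectors only at the first timestep are hypothetical. `image_vec` stands in for an image feature vector and `source_vec` for some fixed-size representation of the source-language description. Weights are random, so the generated token ids are meaningless; the point is only how the two vectors enter the decoder.

```python
import numpy as np

def init_params(vocab_size, embed_dim, hidden_dim, image_dim, source_dim, seed=0):
    """Random toy parameters; every name and size here is hypothetical."""
    rng = np.random.default_rng(seed)
    shapes = {
        "E": (vocab_size, embed_dim),       # target-word embeddings
        "W_eh": (embed_dim, hidden_dim),    # embedding -> hidden
        "W_hh": (hidden_dim, hidden_dim),   # hidden -> hidden (recurrence)
        "W_img": (image_dim, hidden_dim),   # image features -> hidden
        "W_src": (source_dim, hidden_dim),  # source-description vector -> hidden
        "W_ho": (hidden_dim, vocab_size),   # hidden -> vocabulary logits
    }
    return {name: rng.normal(0.0, 0.1, size=shape) for name, shape in shapes.items()}

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def step(params, h_prev, word_id, image_vec=None, source_vec=None):
    """One decoder step; conditioning vectors are added when provided."""
    pre = params["E"][word_id] @ params["W_eh"] + h_prev @ params["W_hh"]
    if image_vec is not None:
        pre = pre + image_vec @ params["W_img"]
    if source_vec is not None:
        pre = pre + source_vec @ params["W_src"]
    h = np.tanh(pre)
    return h, softmax(h @ params["W_ho"])

def greedy_describe(params, bos_id, eos_id, image_vec=None, source_vec=None, max_len=10):
    """Greedy decoding; conditioning is applied at the first step only (an assumption)."""
    h = np.zeros(params["W_hh"].shape[0])
    word, out = bos_id, []
    for t in range(max_len):
        if t == 0:
            h, probs = step(params, h, word, image_vec, source_vec)
        else:
            h, probs = step(params, h, word)
        word = int(np.argmax(probs))
        if word == eos_id:
            break
        out.append(word)
    return out

params = init_params(vocab_size=12, embed_dim=8, hidden_dim=16, image_dim=32, source_dim=16)
rng = np.random.default_rng(1)
image_vec = rng.normal(size=32)   # stand-in for image features
source_vec = rng.normal(size=16)  # stand-in for a source-description vector
tokens = greedy_describe(params, bos_id=0, eos_id=1, image_vec=image_vec, source_vec=source_vec)
print(tokens)
```

Passing only `image_vec`, only `source_vec`, or both corresponds to the different conditioning options the abstract lists; passing neither reduces the sketch to an unconditioned language model.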

Code & Models

Repositories

elliottd/GroundedTranslation

Taxonomy

Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Natural Language Processing Techniques