IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages

Emanuele Bugliarello; Fangyu Liu; Jonas Pfeiffer; Siva Reddy; Desmond Elliott; Edoardo Maria Ponti; Ivan Vulić

arXiv:2201.11732·cs.CL·July 19, 2022·25 cites

TL;DR

IGLUE is a multilingual vision-and-language benchmark that evaluates transfer learning across 20 languages in zero-shot and few-shot settings, and examines which factors influence model performance.

Contribution

The paper introduces IGLUE, the first extensive multilingual benchmark for vision-and-language tasks, enabling evaluation of transfer learning across diverse languages and tasks.

Findings

01 Translate-test transfer outperforms zero-shot transfer.

02 Few-shot learning remains challenging for many tasks.

03 Performance correlates with unlabelled textual data availability.
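
The first finding contrasts two evaluation protocols. In zero-shot transfer, a model fine-tuned on English data is applied directly to target-language inputs; in translate-test transfer, the target-language test inputs are first machine-translated into English and then passed to the same model. The following is a minimal sketch of that difference, not the benchmark's own evaluation code. The names `toy_predict`, `toy_translate`, and `toy_examples` are hypothetical stand-ins, the data is a toy illustration rather than a result from the paper, and the image input is omitted for brevity.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text in the target language, gold label)


def accuracy(predict: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


def zero_shot_eval(predict: Callable[[str], str], examples: List[Example]) -> float:
    """Zero-shot transfer: feed target-language text directly to the model."""
    return accuracy(predict, examples)


def translate_test_eval(
    predict: Callable[[str], str],
    translate: Callable[[str], str],
    examples: List[Example],
) -> float:
    """Translate-test transfer: translate test text to English, then predict."""
    translated = [(translate(text), label) for text, label in examples]
    return accuracy(predict, translated)


# Hypothetical stand-ins (assumptions for illustration only).
def toy_predict(text: str) -> str:
    # Pretend English-only model: answers "yes" if it sees the word "dog".
    return "yes" if "dog" in text else "no"


def toy_translate(text: str) -> str:
    # Pretend German-to-English translator backed by a lookup table.
    table = {"ein Hund": "a dog", "eine Katze": "a cat"}
    return table.get(text, text)


toy_examples: List[Example] = [("ein Hund", "yes"), ("eine Katze", "no")]

if __name__ == "__main__":
    print("zero-shot:", zero_shot_eval(toy_predict, toy_examples))
    print("translate-test:", translate_test_eval(toy_predict, toy_translate, toy_examples))
```

Both protocols use the same English-trained model; the only difference is whether a translation step is applied to the test inputs before prediction.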

Abstract

Reliable evaluation benchmarks designed for replicability and comprehensiveness have driven progress in machine learning. Due to the lack of a multilingual benchmark, however, vision-and-language research has mostly focused on English-language tasks. To fill this gap, we introduce the Image-Grounded Language Understanding Evaluation (IGLUE) benchmark. IGLUE brings together (by both aggregating pre-existing datasets and creating new ones) visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages. Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups. Based on the evaluation of the available state-of-the-art models, we find that translate-test transfer is superior to zero-shot transfer and that few-shot…

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning