Large Scale Multi-Lingual Multi-Modal Summarization Dataset

Yash Verma; Anubhav Jangra; Raghvendra Kumar; Sriparna Saha

arXiv:2302.06560·cs.CL·February 14, 2023

TL;DR

This paper introduces M3LS, the largest multi-lingual multi-modal summarization dataset with over a million document-image pairs across 20 languages, enabling advanced research in multi-modal and multi-lingual summarization.

Contribution

The paper provides the first large-scale, diverse multi-lingual multi-modal dataset for summarization, along with a formal task definition and baseline evaluations.

Findings

1. M3LS is the largest multi-lingual multi-modal summarization dataset to date.

2. Baseline models show varying performance across languages and modalities.

3. The dataset offers new challenges and opportunities for multi-modal, multi-lingual summarization research.

Abstract

Significant developments in techniques such as encoder-decoder models have enabled us to represent information comprising multiple modalities. This information can further enhance many downstream tasks in the field of information retrieval and natural language processing; however, improvements in multi-modal techniques and their performance evaluation require large-scale multi-modal data which offers sufficient diversity. Multi-lingual modeling for a variety of tasks like multi-modal summarization, text generation, and translation leverages information derived from high-quality multi-lingual annotated data. In this work, we present the current largest multi-lingual multi-modal summarization dataset (M3LS), and it consists of over a million instances of document-image pairs along with a professionally annotated multi-modal summary for each pair. It is derived from news articles published…

Code & Models

Repositories

zenquiorra/m3ls (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications