Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities
Khyathi Raghavi Chandu, Alborz Geramifard

TL;DR
This survey comprehensively catalogs and analyzes the intersection of multilingual and multimodal research in NLP, highlighting datasets, techniques, challenges, and future opportunities in the MultiX paradigm.
Contribution
It provides a structured overview of MultiX tasks, datasets, and methods, and offers insights into current trends, challenges, and future research directions in multilingual multimodal NLP.
Findings
Most multimodal research centers on English, while most multilingual research centers on the text modality.
Various datasets with parallel annotations are used across languages.
Modeling approaches have distinct strengths and weaknesses.
Abstract
Contextualizing language technologies beyond a single language has kindled interest in embracing multiple modalities and languages. Individually, each of these directions has proliferated into several NLP tasks. Despite this momentum, most multimodal research is centered on English, and most multilingual research is centered on contexts from the text modality. Challenging this conventional setup, researchers have studied the unification of the multilingual and multimodal (MultiX) streams. The main goal of this work is to catalogue and characterize these works by charting out the categories of tasks, datasets and methods that address MultiX scenarios. To this end, we review the languages studied and the gold or silver data with parallel annotations, and examine how these modalities and languages interact in modeling. We present an account of the modeling approaches along with their…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
