Large Multimodal Models for Low-Resource Languages: A Survey

Marian Lupascu; Ana-Cristina Rogoz; Mihai Sorin Stupariu; Radu Tudor Ionescu

arXiv:2502.05568 · cs.CL · February 3, 2026

TL;DR

This survey reviews techniques for adapting large multimodal models (LMMs) to low-resource (LR) languages, highlighting the role of visual information and identifying open challenges such as hallucination and computational efficiency.

Contribution

The survey categorizes and analyzes 117 studies on LMM adaptation across 96 low-resource languages, summarizing current methods and open challenges.

Findings

01. Visual information often improves model performance in low-resource (LR) settings

02. Significant challenges remain, including hallucination and computational efficiency

03. Contributions are systematically categorized as resource-oriented or method-oriented

Abstract

In this survey, we systematically analyze techniques used to adapt large multimodal models (LMMs) for low-resource (LR) languages, examining approaches ranging from visual enhancement and data creation to cross-modal transfer and fusion strategies. Through a comprehensive analysis of 117 studies across 96 LR languages, we identify key patterns in how researchers tackle the challenges of limited data and computational resources. We categorize works into resource-oriented and method-oriented contributions, further dividing contributions into relevant sub-categories. We compare method-oriented contributions in terms of performance and efficiency, discussing benefits and limitations of representative studies. We find that visual information often serves as a crucial bridge for improving model performance in LR settings, though significant challenges remain in areas such as hallucination…
