Decoding the Diversity: A Review of the Indic AI Research Landscape

Sankalp KJ; Vinija Jain; Sreyoshi Bhaduri; Tamoghna Roy; Aman Chadha

arXiv:2406.09559 · cs.CL · June 17, 2024 · 2 cites

Open Access

TL;DR

This paper reviews recent research on large language models for Indic languages, highlighting challenges like limited data and linguistic complexities, and provides a taxonomy of 84 publications to guide future work.

Contribution

It offers a comprehensive taxonomy and analysis of recent Indic language LLM research, addressing key challenges and summarizing advancements in the field.

Findings

01 Researchers face data scarcity and linguistic challenges in Indic NLP.

02 There is a lack of standardization in evaluation methods.

03 Recent publications show growing interest and diverse approaches.

Abstract

This review paper provides a comprehensive overview of large language model (LLM) research directions within Indic languages. Indic languages are those spoken in the Indian subcontinent, including India, Pakistan, Bangladesh, Sri Lanka, Nepal, and Bhutan, among others. These languages have a rich cultural and linguistic heritage and are spoken by over 1.5 billion people worldwide. With the tremendous market potential and growing demand for natural language processing (NLP) based applications in diverse languages, generative applications for Indic languages pose unique challenges and opportunities for research. Our paper takes a deep dive into recent advancements in Indic generative modeling, contributing a taxonomy of research directions and tabulating 84 recent publications. Research directions surveyed in this paper include LLM development, fine-tuning existing LLMs, development of…

Taxonomy

Topics: Ethics and Social Impacts of AI