IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

Sankalp KJ; Ashutosh Kumar; Laxmaan Balaji; Nikunj Kotecha; Vinija Jain; Aman Chadha; Sreyoshi Bhaduri

arXiv:2501.15747 · cs.CL · January 29, 2025

TL;DR

IndicMMLU-Pro is a comprehensive benchmark for evaluating large language models on multiple Indic languages across diverse tasks, aiming to advance culturally sensitive NLP research in the Indian subcontinent.

Contribution

The paper introduces a new benchmark specifically designed for Indic languages, covering multiple tasks and languages, with baseline results for state-of-the-art models.

Findings

01. Benchmark covers major Indic languages and tasks.

02. Baseline models show varied performance across languages.

03. Framework promotes development of culturally aware NLP models.
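
To make the per-language comparison in the findings concrete, here is a minimal sketch of how accuracy on a multiple-choice benchmark might be broken down by language. The record layout (`language`, `answer`, `prediction` keys) and the function name `per_language_accuracy` are assumptions made for illustration and are not taken from the paper or its released datasets; the sample records are made up and are not results.

```python
from collections import defaultdict


def per_language_accuracy(records):
    """Return {language: accuracy} for multiple-choice predictions.

    Each record is a dict with hypothetical keys:
      'language'   - language name, e.g. "Hindi"
      'answer'     - gold option label, e.g. "B"
      'prediction' - option label chosen by the model
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        lang = record["language"]
        total[lang] += 1
        if record["prediction"] == record["answer"]:
            correct[lang] += 1
    # Every language in `total` has at least one record, so no division by zero.
    return {lang: correct[lang] / total[lang] for lang in total}


# Made-up toy records for illustration only; not data from the benchmark.
toy_records = [
    {"language": "Hindi", "answer": "A", "prediction": "A"},
    {"language": "Hindi", "answer": "C", "prediction": "B"},
    {"language": "Tamil", "answer": "D", "prediction": "D"},
    {"language": "Tamil", "answer": "B", "prediction": "B"},
]

scores = per_language_accuracy(toy_records)
for language, accuracy in sorted(scores.items()):
    print(f"{language}: {accuracy:.2f}")
```

Reporting one score per language, rather than a single pooled number, is what lets differences between languages show up at all.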

Abstract

Spoken by more than 1.5 billion people in the Indian subcontinent, Indic languages present unique challenges and opportunities for natural language processing (NLP) research due to their rich cultural heritage, linguistic diversity, and complex structures. IndicMMLU-Pro is a comprehensive benchmark designed to evaluate Large Language Models (LLMs) across Indic languages, building upon the MMLU-Pro (Massive Multitask Language Understanding) framework. Covering major languages such as Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi, Tamil, Telugu, and Urdu, our benchmark addresses the unique challenges and opportunities presented by the linguistic diversity of the Indian subcontinent. This benchmark encompasses a wide range of tasks in language comprehension, reasoning, and generation, meticulously crafted to capture the intricacies of Indian languages. IndicMMLU-Pro provides a…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques