ParamBench: A Graduate-Level Benchmark for Evaluating LLM Understanding on Indic Subjects

Ayush Maheshwari; Kaushal Sharma; Vivek Patel; Aditya Maheshwari

arXiv:2508.16185 · cs.CL · October 9, 2025

TL;DR

ParamBench is a Hindi-language benchmark of more than 17,000 graduate-level questions across 21 subjects, designed to evaluate how well LLMs understand culturally grounded, discipline-specific topics in the Indian context.

Contribution

This paper introduces ParamBench, a large-scale, culturally specific benchmark for assessing LLMs on graduate-level questions in the Indian context, spanning diverse question formats and subjects.

Findings

1. Gemma3-27B achieves 56.4% accuracy overall.

2. LLMs perform poorly on music, instruments, and law topics.

3. Culturally grounded reasoning remains a challenge for current LLMs.
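
The findings above contrast a single overall accuracy figure with weaker results on particular subjects. A minimal sketch of how such a breakdown could be computed from graded multiple-choice answers follows. The record fields (`subject`, `gold`, `predicted`) and the toy records are assumptions made up for illustration; they are not the ParamBench schema, data, or results, and this is not the authors' evaluation code.

```python
from collections import defaultdict


def accuracy_by_subject(records):
    """Return (overall_accuracy, {subject: accuracy}) for graded records.

    Each record is a dict with hypothetical keys:
      "subject"   - subject label of the question
      "gold"      - correct option label
      "predicted" - option label chosen by the model
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["subject"]] += 1
        if r["predicted"] == r["gold"]:
            correct[r["subject"]] += 1
    # Per-subject accuracy: correct answers divided by questions in that subject.
    per_subject = {s: correct[s] / total[s] for s in total}
    # Overall accuracy is pooled over all questions, not a mean of subject scores.
    overall = sum(correct.values()) / sum(total.values()) if total else 0.0
    return overall, per_subject


# Toy records, invented purely for illustration (not ParamBench data).
toy = [
    {"subject": "law", "gold": "A", "predicted": "B"},
    {"subject": "law", "gold": "C", "predicted": "C"},
    {"subject": "history", "gold": "D", "predicted": "D"},
    {"subject": "history", "gold": "A", "predicted": "A"},
]
overall, per_subject = accuracy_by_subject(toy)
```

Pooling over questions means subjects with more questions weigh more in the overall figure, so a model can post a moderate overall score while still doing poorly on smaller subjects.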

Abstract

Large language models have been widely evaluated on tasks such as comprehension, summarization, code generation, etc. However, their performance on graduate-level, culturally grounded questions in the Indian context remains largely unexplored. Existing Indian benchmarks emphasise basic fact-orientated queries that offer limited assessment of a deeper disciplinary understanding tailored to the Indian setting. In this paper, we present ParamBench, consisting of more than 17K questions in the Hindi language, comprising questionnaires from 21 diverse subjects. These questions are primarily derived from a nationwide graduate-level entrance examination covering topics such as history, music, instruments, yoga, literature, philosophy, law, etc., specifically for the Indian context. Additionally, we assess the ability of LLMs to handle diverse question formats - such as list-based matching,…

Code & Models

Datasets

bharatgenai/ParamBench
dataset · 18 downloads
