The Riddle of Reflection: Evaluating Reasoning and Self-Awareness in Multilingual LLMs using Indian Riddles
Abhinav P M, Ojasva Saxena, Oswald C, Parameswari Krishnamurthy

TL;DR
This study evaluates multilingual reasoning and self-awareness in large language models across seven Indian languages using a new riddle dataset, revealing significant gaps in reasoning accuracy and self-assessment capabilities.
Contribution
Introduces a multilingual Indian riddle dataset and systematically assesses the reasoning and self-awareness of five LLMs across seven Indian languages and seven prompting strategies.
Findings
Gemini 2.5 Pro performs best overall in riddle-solving.
Self-evaluation accuracy is inversely related to initial reasoning accuracy.
LLaMA 4 Scout shows higher self-awareness than top-performing models.
Abstract
The extent to which large language models (LLMs) can perform culturally grounded reasoning across non-English languages remains underexplored. This paper examines the reasoning and self-assessment abilities of LLMs across seven major Indian languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Tamil, and Telugu. We introduce a multilingual riddle dataset combining traditional riddles with context-reconstructed variants and evaluate five LLMs (Gemini 2.5 Pro, Gemini 2.5 Flash, Mistral-Saba, LLaMA 4 Scout, and LLaMA 4 Maverick) under seven prompting strategies. In the first stage, we assess riddle-solving performance and find that while Gemini 2.5 Pro performs best overall, few-shot methods yield only marginal gains, and accuracy varies notably across languages. In the second stage, we conduct a self-evaluation experiment to measure reasoning consistency. The results reveal a key finding:…
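The two stages described in the abstract can be illustrated with a minimal sketch. It assumes a simple definition that the abstract does not confirm: self-evaluation accuracy is taken here as the fraction of riddles where a model's verdict on its own answer matches the ground truth. The record fields, model names, and values below are hypothetical placeholders, not data or code from the paper.

```python
from collections import defaultdict

# Hypothetical records, one per (model, language, riddle).
# Field names and values are illustrative assumptions only.
records = [
    {"model": "model_a", "language": "Hindi", "correct": True,  "self_judged_correct": True},
    {"model": "model_a", "language": "Hindi", "correct": False, "self_judged_correct": True},
    {"model": "model_a", "language": "Tamil", "correct": True,  "self_judged_correct": True},
    {"model": "model_b", "language": "Hindi", "correct": False, "self_judged_correct": False},
    {"model": "model_b", "language": "Hindi", "correct": True,  "self_judged_correct": False},
    {"model": "model_b", "language": "Tamil", "correct": False, "self_judged_correct": False},
]


def solving_accuracy(rows):
    """Stage 1: fraction of riddles answered correctly."""
    return sum(r["correct"] for r in rows) / len(rows)


def self_eval_accuracy(rows):
    """Stage 2 (assumed definition): fraction of riddles where the model's
    judgment of its own answer agrees with the ground truth."""
    return sum(r["self_judged_correct"] == r["correct"] for r in rows) / len(rows)


def group_by(rows, key):
    """Group records by the value of one field."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return dict(groups)


# Per-model summary: (solving accuracy, self-evaluation accuracy).
summary = {
    model: (solving_accuracy(rows), self_eval_accuracy(rows))
    for model, rows in group_by(records, "model").items()
}

# Per-language solving accuracy, pooled over models.
per_language = {
    lang: solving_accuracy(rows)
    for lang, rows in group_by(records, "language").items()
}

for model, (solve, self_eval) in summary.items():
    print(f"{model}: solving={solve:.2f}, self-eval={self_eval:.2f}")
```

Keeping the two metrics separate per model makes it possible to compare them directly, which is the kind of comparison the Findings section reports between solving accuracy and self-assessment.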
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
