BengaliFig: A Low-Resource Challenge for Figurative and Culturally Grounded Reasoning in Bengali
Abdullah Al Sefat

TL;DR
BengaliFig is a challenge set of 435 Bengali riddles for evaluating large language models on figurative and culturally grounded reasoning in a low-resource language. Eight frontier LLMs show consistent weaknesses in metaphorical and culturally specific reasoning.
Contribution
The paper presents BengaliFig, a compact, richly annotated challenge set of Bengali riddles for assessing LLMs' figurative and culturally grounded reasoning in a low-resource language.
Findings
LLMs show weaknesses in metaphorical reasoning
Cultural reasoning remains challenging for current models
Dataset enables diagnostic evaluation of low-resource language models
Abstract
Large language models excel on broad multilingual benchmarks but have not been evaluated extensively on figurative and culturally grounded reasoning, especially in low-resource contexts. We present BengaliFig, a compact yet richly annotated challenge set that targets this gap in Bengali, a widely spoken low-resource language. The dataset contains 435 unique riddles drawn from Bengali oral and literary traditions. Each item is annotated along five orthogonal dimensions capturing reasoning type, trap type, cultural depth, answer category, and difficulty, and is automatically converted to multiple-choice format through a constraint-aware, AI-assisted pipeline. We evaluate eight frontier LLMs from major providers under zero-shot and few-shot chain-of-thought prompting, revealing consistent weaknesses in metaphorical and culturally specific reasoning. BengaliFig thus contributes both a…
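The five annotation dimensions make it possible to break accuracy down by label rather than report a single score. The Python sketch below illustrates that kind of diagnostic breakdown. It is not the paper's code or schema: the field names, the label values, the placeholder riddles, and the helper accuracy_by are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RiddleItem:
    """One multiple-choice riddle with five annotation labels.

    Hypothetical schema: the field names mirror the five dimensions
    named in the abstract, but the real dataset layout may differ.
    """
    question: str
    options: tuple
    answer_index: int
    reasoning_type: str
    trap_type: str
    cultural_depth: str
    answer_category: str
    difficulty: str


def accuracy_by(items, predictions, dimension):
    """Return {label: accuracy} for one annotation dimension."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, predicted in zip(items, predictions):
        label = getattr(item, dimension)
        total[label] += 1
        correct[label] += int(predicted == item.answer_index)
    return {label: correct[label] / total[label] for label in total}


# Placeholder items and label values, not taken from the dataset.
opts = ("a", "b", "c", "d")
items = [
    RiddleItem("placeholder riddle A", opts, 0, "metaphor", "none", "high", "object", "hard"),
    RiddleItem("placeholder riddle B", opts, 2, "metaphor", "none", "low", "animal", "easy"),
    RiddleItem("placeholder riddle C", opts, 1, "wordplay", "none", "low", "object", "easy"),
]
predictions = [0, 1, 1]  # hypothetical model choices, one option index per item

result = accuracy_by(items, predictions, "reasoning_type")
print(result)
```

Passing a different dimension name, such as "cultural_depth" or "difficulty", gives the same breakdown along another axis without changing the items.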
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques
