When Words Don't Mean What They Say: Figurative Understanding in Bengali Idioms

Adib Sakhawat; Shamim Ara Parveen; Md Ruhul Amin; Shamim Al Mahmud; Md Saiful Islam; Tahera Khatun

arXiv:2602.12921·cs.CL·February 16, 2026

Open Access

TL;DR

This paper introduces a dataset of 10,361 annotated Bengali idioms and a benchmark for evaluating large language models' understanding of figurative language in a low-resource, culturally rich setting. No evaluated model exceeds 50% accuracy, against 83.4% for humans.

Contribution

A large-scale, culturally grounded dataset of 10,361 Bengali idioms, each annotated under a 19-field schema, together with a benchmark for evaluating LLMs' figurative language understanding.

Findings

01. None of the 30 evaluated LLMs exceeds 50% accuracy on the benchmark.

02. Human performance reaches 83.4%, well above every evaluated model.

03. The benchmark exposes significant gaps in LLMs' cultural and figurative understanding of Bengali.

Abstract

Figurative language understanding remains a significant challenge for Large Language Models (LLMs), especially for low-resource languages. To address this, we introduce a new idiom dataset, a large-scale, culturally-grounded corpus of 10,361 Bengali idioms. Each idiom is annotated under a comprehensive 19-field schema, established and refined through a deliberative expert consensus process, that captures its semantic, syntactic, cultural, and religious dimensions, providing a rich, structured resource for computational linguistics. To establish a robust benchmark for Bangla figurative language understanding, we evaluate 30 state-of-the-art multilingual and instruction-tuned LLMs on the task of inferring figurative meaning. Our results reveal a critical performance gap, with no model surpassing 50% accuracy, a stark contrast to significantly higher human performance (83.4%). This…

Taxonomy

Topics: Language, Metaphor, and Cognition · Sentiment Analysis and Opinion Mining · Topic Modeling