From Facts to Folklore: Evaluating Large Language Models on Bengali Cultural Knowledge
Nafis Chowdhury, Moinul Haque, Anika Ahmed, Nazia Tasnim, Md. Istiak Hossain Shihab, Sajjadur Rahman, and Farig Sadeque

TL;DR
This paper introduces the BLanCK dataset to evaluate large language models on Bengali cultural knowledge, revealing their struggles with cultural nuances and the importance of context-aware training.
Contribution
It presents a new culturally focused dataset for Bengali, highlighting gaps in LLMs' cultural understanding and emphasizing the role of context in improving performance.
Findings
LLMs perform well in non-cultural tasks
Models struggle with cultural knowledge
Context improves model performance significantly
Abstract
Recent progress in NLP research has demonstrated remarkable capabilities of large language models (LLMs) across a wide range of tasks. While recent multilingual benchmarks have advanced cultural evaluation for LLMs, critical gaps remain in capturing the nuances of low-resource cultures. Our work addresses these limitations through a Bengali Language Cultural Knowledge (BLanCK) dataset including folk traditions, culinary arts, and regional dialects. Our investigation of several multilingual language models shows that while these models perform well in non-cultural categories, they struggle significantly with cultural knowledge and performance improves substantially across all models when context is provided, emphasizing context-aware architectures and culturally curated training data.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLanguage and cultural evolution · Computational and Text Analysis Methods · Artificial Intelligence in Games
