BengaliMoralBench: A Benchmark for Auditing Moral Reasoning in Large Language Models within Bengali Language and Culture

Shahriyar Zaman Ridoy; Azmine Toushik Wasi; Koushik Ahamed Tonmoy; Taki Hasan Rafi; Dong-Kyu Chae

arXiv:2511.03180·cs.CL·April 21, 2026

TL;DR

BengaliMoralBench is a comprehensive ethics benchmark for evaluating large language models' moral reasoning within Bengali cultural and linguistic contexts, addressing the lack of culturally grounded assessment tools.

Contribution

It introduces a large-scale, culturally nuanced ethics benchmark for Bengali, covering five moral domains and evaluating models across multiple ethical perspectives.

Findings

01 Models show significant variation in moral reasoning performance.

02 Current LLMs exhibit weaknesses in cultural grounding and fairness.

03 Benchmark reveals critical limitations in non-Western moral understanding.

Abstract

As multilingual Large Language Models (LLMs) gain traction across South Asia, their alignment with local ethical norms, particularly for Bengali, spoken by over 285 million people worldwide and among the most widely spoken languages globally, remains underexplored. Existing ethics benchmarks are predominantly English-centric and shaped by Western moral frameworks, overlooking cultural nuances vital for real-world deployment. To address this gap, we introduce BengaliMoralBench, a large-scale ethics benchmark designed for Bengali language and sociocultural contexts. Our benchmark spans five moral domains: (1) Daily Activities, (2) Habits, (3) Parenting, (4) Family Relationships, and (5) Religious Activities, each subdivided into ten culturally grounded categories, totaling 50 subtopics. Each scenario is annotated through native-speaker consensus under three ethical lenses: virtue ethics,…
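
The taxonomy arithmetic in the abstract (five domains, ten categories each, 50 subtopics) can be sketched as follows. This is only an illustration: the domain names come from the abstract, but the `Subtopic` class, the `build_taxonomy` helper, and the numeric category placeholders are hypothetical, since the actual category names are not listed here.

```python
from dataclasses import dataclass

# The five moral domains named in the abstract.
DOMAINS = [
    "Daily Activities",
    "Habits",
    "Parenting",
    "Family Relationships",
    "Religious Activities",
]

# Stated in the abstract: each domain has ten categories.
CATEGORIES_PER_DOMAIN = 10


@dataclass(frozen=True)
class Subtopic:
    """Hypothetical record for one (domain, category) pair."""
    domain: str
    category_index: int  # placeholder; real category names are not given here


def build_taxonomy() -> list[Subtopic]:
    """Enumerate every (domain, category) pair as a placeholder subtopic."""
    return [
        Subtopic(domain, i)
        for domain in DOMAINS
        for i in range(1, CATEGORIES_PER_DOMAIN + 1)
    ]


taxonomy = build_taxonomy()
print(len(taxonomy))  # 5 domains x 10 categories = 50
```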

Code & Models

Datasets

ciol-research/BengaliMoralBench
dataset · 132 dl
