Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context
Ashish Pandey, Tek Raj Chhetri

TL;DR
This paper systematically evaluates social biases in seven large language models in the Nepali cultural context. It finds measurable explicit and implicit biases that depend on decoding parameters, and argues for culturally grounded bias mitigation.
Contribution
The paper introduces the Dual-Metric Bias Assessment (DMBA) framework for evaluating biases in LLMs in underrepresented cultural settings, and analyzes bias behavior across models and decoding parameters.
Findings
Models exhibit measurable explicit agreement bias (mean bias agreement of 0.36 to 0.43)
The implicit completion bias rate is high (0.740 to 0.755) and varies with temperature
Implicit bias peaks at moderate stochasticity (T=0.3) and is stable across top-p settings
Abstract
Large language models (LLMs) increasingly influence global digital ecosystems, yet their potential to perpetuate social and cultural biases remains poorly understood in underrepresented contexts. This study presents a systematic analysis of representational biases in seven state-of-the-art LLMs (GPT-4o-mini, Claude-3-Sonnet, Claude-4-Sonnet, Gemini-2.0-Flash, Gemini-2.0-Lite, Llama-3-70B, and Mistral-Nemo) in the Nepali cultural context. Using a Croissant-compliant dataset of 2400+ stereotypical and anti-stereotypical sentence pairs on gender roles across social domains, we implement an evaluation framework, Dual-Metric Bias Assessment (DMBA), combining two metrics: (1) agreement with biased statements and (2) stereotypical completion tendencies. Results show that models exhibit measurable explicit agreement bias, with mean bias agreement ranging from 0.36 to 0.43 across decoding…
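As a rough illustration of how two metrics of this kind could be tabulated per decoding setting, the sketch below treats each metric as a simple proportion of trials. This is an assumption made for illustration: the abstract does not give the paper's exact formulas, and the `Trial` record, its field names, and `dual_metric_summary` are hypothetical, not taken from the authors' code. The toy data is invented and does not reproduce any reported result.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One model response to one sentence pair (hypothetical record layout)."""
    temperature: float                 # decoding temperature used for this trial
    agreed_with_stereotype: bool       # explicit metric: model agreed with the biased statement
    completed_stereotypically: bool    # implicit metric: model chose the stereotypical completion


def dual_metric_summary(trials):
    """Return {temperature: (agreement_rate, completion_rate)}.

    Both rates are plain proportions over the trials at that temperature,
    which is an assumed definition, not necessarily the paper's.
    """
    grouped = defaultdict(list)
    for trial in trials:
        grouped[trial.temperature].append(trial)

    summary = {}
    for temperature, group in sorted(grouped.items()):
        n = len(group)
        agreement_rate = sum(t.agreed_with_stereotype for t in group) / n
        completion_rate = sum(t.completed_stereotypically for t in group) / n
        summary[temperature] = (agreement_rate, completion_rate)
    return summary


# Toy data for illustration only.
trials = [
    Trial(0.0, True, True),
    Trial(0.0, False, True),
    Trial(0.3, False, True),
    Trial(0.3, False, False),
]
summary = dual_metric_summary(trials)
for temperature, (agreement, completion) in summary.items():
    print(f"T={temperature}: agreement={agreement:.2f}, completion={completion:.2f}")
```

Keeping the two rates side by side per temperature makes it easy to see cases where explicit agreement is low while implicit completion bias remains high, which is the kind of gap a dual-metric design is meant to surface.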
Taxonomy
Topics: Computational and Text Analysis Methods · Language and cultural evolution · Topic Modeling
