A Japanese Benchmark for Evaluating Social Bias in Reasoning Based on Attribution Theory
Taihei Shiotani, Masahiro Kaneko, Naoaki Okazaki

TL;DR
This paper introduces JUBAKU-v2, a Japanese social bias benchmark based on attribution theory, designed to evaluate biases in reasoning processes of language models within cultural contexts.
Contribution
It presents a culturally specific dataset that assesses biases in reasoning, not just conclusions, improving bias detection in Japanese language models.
Findings
JUBAKU-v2 detects performance differences across models more sensitively than existing benchmarks.
The dataset reflects Japanese cultural biases in attribution within reasoning tasks.
Experimental results demonstrate the effectiveness of the new benchmark.
Abstract
To enhance the fairness of Large Language Models (LLMs), it is essential to evaluate social biases rooted in the cultural contexts of specific linguistic regions. However, most existing Japanese benchmarks rely heavily on translated English data, which does not necessarily yield an evaluation suited to Japanese culture. Furthermore, they evaluate bias only in the conclusion and fail to capture biases lurking in the reasoning. In this study, drawing on attribution theory in social psychology, we construct a new dataset, "JUBAKU-v2," which holds the conclusion fixed and evaluates bias in how the reasoning attributes behaviors to in-groups and out-groups. The dataset consists of 216 examples reflecting cultural biases specific to Japan. Experimental results confirm that it detects performance differences across models more sensitively than existing benchmarks.
