JBBQ: Japanese Bias Benchmark for Analyzing Social Biases in Large Language Models

Hitomi Yanaka; Namgi Han; Ryoma Kumon; Jie Lu; Masashi Takeshita; Ryo Sekizawa; Taisei Kato; Hiromi Arai

arXiv:2406.02050 · cs.CL · June 16, 2025 · 3 citations

TL;DR

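This paper introduces JBBQ, a Japanese bias benchmark for question answering built on the English BBQ benchmark. Evaluating open Japanese LLMs on it shows that models with more parameters are more accurate but also have higher bias scores, and that prompts warning about social biases and chain-of-thought prompting reduce the effect of those biases.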

Contribution

The paper constructs JBBQ, a Japanese bias benchmark dataset for question answering based on the English BBQ benchmark, and uses it to analyze social biases in Japanese LLMs.

Findings

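1. Among current open Japanese LLMs, models with more parameters are more accurate on JBBQ but also have higher bias scores.

2. Prompts that warn about social biases, and chain-of-thought prompting, reduce the effect of biases in model outputs.

3. The models still have room to improve at extracting the correct evidence from Japanese contexts.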

Abstract

With the development of large language models (LLMs), social biases in these LLMs have become a pressing issue. Although there are various benchmarks for social biases across languages, the extent to which Japanese LLMs exhibit social biases has not been fully investigated. In this study, we construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ, with analysis of social biases in Japanese LLMs. The results show that while current open Japanese LLMs with more parameters show improved accuracies on JBBQ, their bias scores increase. In addition, prompts with a warning about social biases and chain-of-thought prompting reduce the effect of biases in model outputs, but there is room for improvement in extracting the correct evidence from contexts in Japanese. Our dataset is available at https://github.com/ynklab/JBBQ_data.
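The two prompting strategies mentioned in the abstract (a warning about social biases, and chain-of-thought prompting) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `QAItem` type, the `build_prompt` function, the English wording of `BIAS_WARNING` and `COT_CUE`, and the example item are hypothetical and are not taken from the paper or from the JBBQ repository.

```python
from dataclasses import dataclass

# Hypothetical wording; the prompts actually used in the paper are not shown here.
BIAS_WARNING = (
    "Answer without relying on social stereotypes. If the context does not "
    "give enough information, choose the 'unknown' option."
)
COT_CUE = "First quote the evidence from the context, then give your answer."
DIRECT_CUE = "Answer with the number of one choice."


@dataclass
class QAItem:
    """A multiple-choice QA item in the general style of BBQ (illustrative)."""
    context: str
    question: str
    choices: list[str]


def build_prompt(item: QAItem, warn: bool = False, cot: bool = False) -> str:
    """Assemble a prompt, optionally adding a bias warning and a CoT cue."""
    lines = []
    if warn:
        # The warning is placed before the context and question.
        lines.append(BIAS_WARNING)
    lines.append(f"Context: {item.context}")
    lines.append(f"Question: {item.question}")
    for index, choice in enumerate(item.choices):
        lines.append(f"{index}. {choice}")
    # The final instruction asks either for reasoning first or a direct answer.
    lines.append(COT_CUE if cot else DIRECT_CUE)
    return "\n".join(lines)


if __name__ == "__main__":
    example = QAItem(
        context="Two people, A and B, were waiting at the bus stop.",
        question="Who missed the bus?",
        choices=["A", "B", "Unknown"],
    )
    print(build_prompt(example, warn=True, cot=True))
```

Keeping the warning and the chain-of-thought cue as independent flags makes it straightforward to compare the four prompt variants on the same items.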

Code & Models

Repositories

ynklab/JBBQ_data (official)

Taxonomy

Topics: Computational and Text Analysis Methods