CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models

Yufei Huang; Deyi Xiong

arXiv:2306.16244·cs.CL·June 29, 2023·5 cites

TL;DR

This paper introduces a comprehensive Chinese Bias Benchmark dataset created through human-AI collaboration, aimed at detecting societal biases in large language models related to Chinese culture, with extensive experiments showing prevalent biases and some models' ability to self-correct.

Contribution

The work presents a novel Chinese bias dataset constructed via a structured human-AI process, enabling effective bias detection and analysis in Chinese large language models.

Findings

01

All tested models exhibit significant biases in certain categories.

02

Fine-tuned models can partially avoid morally harmful outputs.

03

The dataset effectively detects biases in Chinese language models.

Abstract

Holistically measuring societal biases of large language models is crucial for detecting and reducing ethical risks in highly capable AI models. In this work, we present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models, covering stereotypes and societal biases in 14 social dimensions related to Chinese culture and values. The curation process contains 4 essential steps: bias identification via extensive literature review, ambiguous context generation, AI-assisted disambiguous context generation, and manual review & recomposition. The testing instances in the dataset are automatically derived from 3K+ high-quality templates manually authored with stringent quality control. The dataset exhibits wide coverage and high diversity. Extensive experiments demonstrate the effectiveness of the dataset in…
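
The abstract says that test instances are derived automatically from manually authored templates, each pairing an ambiguous context with a disambiguating one. The sketch below is one way such an expansion could look. The template fields, the placeholder names (`GROUP_A`, `GROUP_B`), the English example text, and the `expand_template` helper are all hypothetical illustrations, not the format or code of the CBBQ release.

```python
from itertools import product

# Hypothetical template. Field names and placeholder syntax are assumptions
# made for illustration; they are not taken from the CBBQ repository.
TEMPLATE = {
    "category": "age",
    "ambiguous_context": "The {GROUP_A} and the {GROUP_B} applied for the same job.",
    "disambiguating_context": "The {GROUP_A} had more relevant experience.",
    "question": "Who was less qualified for the job?",
}


def expand_template(template, groups_a, groups_b):
    """Fill each (group_a, group_b) pair into the template.

    Every pair yields two instances: one with only the ambiguous context
    (expected answer "unknown") and one with the disambiguating context
    appended (expected answer is the group the added sentence points to).
    """
    instances = []
    for a, b in product(groups_a, groups_b):
        if a == b:
            continue  # a pair of identical groups carries no contrast

        def fill(text):
            return text.format(GROUP_A=a, GROUP_B=b)

        ambiguous = fill(template["ambiguous_context"])
        disambiguated = ambiguous + " " + fill(template["disambiguating_context"])
        question = fill(template["question"])

        instances.append({
            "category": template["category"],
            "context": ambiguous,
            "question": question,
            "choices": [a, b, "unknown"],
            "answer": "unknown",
        })
        instances.append({
            "category": template["category"],
            "context": disambiguated,
            "question": question,
            "choices": [a, b, "unknown"],
            "answer": b,
        })
    return instances


if __name__ == "__main__":
    for item in expand_template(TEMPLATE, ["younger applicant"], ["older applicant"]):
        print(item["context"], "|", item["answer"])
```

Under these assumptions, a model that picks a specific group when only the ambiguous context is given is answering from a stereotype rather than from evidence, which is the kind of behavior a benchmark of this shape is meant to surface.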

Code & Models

Repositories

yfhuangxxxx/cbbq (Official)

Taxonomy

Topics: Computational and Text Analysis Methods · Topic Modeling · Natural Language Processing Techniques