Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks

Jingyan Zhou; Jiawen Deng; Fei Mi; Yitong Li; Yasheng Wang; Minlie Huang; Xin Jiang; Qun Liu; Helen Meng

arXiv:2202.08011 · cs.CL · October 31, 2022 · 6 cites

TL;DR

This paper introduces a new framework and dataset for detecting social bias in Chinese dialog systems, providing benchmarks to improve safety and reduce biases in conversational AI.

Contribution

The paper proposes the Dial-Bias Frame for analyzing social bias, creates the first annotated Chinese social bias dialog dataset, and establishes benchmarks for bias detection at multiple levels.

Findings

01 The Dial-Bias Frame supports bias analysis that goes beyond simple biased/unbiased annotation.

02 The CDial-Bias Dataset is the first annotated Chinese social bias dialog dataset.

03 Benchmarks show the importance of detailed analysis for bias detection.

Abstract

Research on open-domain dialog systems has advanced greatly thanks to neural models trained on large-scale corpora. However, such corpora often introduce various safety problems (e.g., offensive language, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice. Among these safety issues, social bias is especially complex to address because its negative impact on marginalized populations is usually expressed implicitly, thus requiring normative reasoning and rigorous analysis. In this paper, we focus on social bias detection as a dialog safety problem. We first propose a novel Dial-Bias Frame for analyzing social bias in conversations pragmatically, which considers more comprehensive bias-related analyses rather than simple dichotomy annotations. Based on the proposed framework, we further introduce the CDial-Bias Dataset that, to…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection