AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics

Vahid Ghafouri; Vibhor Agarwal; Yong Zhang; Nishanth Sastry; Jose Such; Guillermo Suarez-Tangil

arXiv:2308.14608·cs.LG·August 30, 2023

TL;DR

This study compares how ChatGPT handles controversial topics with how humans answer the same questions. It finds that recent models show less explicit bias but retain some ideological leanings, with implications for AI moderation.

Contribution

It provides an empirical analysis of ChatGPT's biases and moderation effectiveness on controversial issues compared to human responses, highlighting recent improvements and remaining challenges.

Findings

1. Recent ChatGPT versions show reduced explicit biases.

2. ChatGPT maintains some implicit libertarian leanings.

3. Bing AI answers tend to be more centrist than humans.

Abstract

The introduction of ChatGPT and the subsequent improvement of Large Language Models (LLMs) have prompted more and more individuals to turn to the use of ChatBots, both for information and assistance with decision-making. However, the information the user is after is often not formulated by these ChatBots objectively enough to be provided with a definite, globally accepted answer. Controversial topics, such as "religion", "gender identity", "freedom of speech", and "equality", among others, can be a source of conflict as partisan or biased answers can reinforce preconceived notions or promote disinformation. By exposing ChatGPT to such debatable questions, we aim to understand its level of awareness and if existing models are subject to socio-political and/or economic biases. We also aim to explore how AI-generated answers compare to human ones. For exploring this, we use a dataset of…

Code & Models

Repositories

vahidthegreat/ai-in-the-gray (Official)
