TL;DR
This paper emphasizes the urgent need for a standardized auditing framework to evaluate and mitigate biases and norm violations in chatbots and large language models, promoting safer and more ethical AI deployment.
Contribution
It proposes a community-driven, values-based auditing framework and demonstrates initial bias assessments on GPT models, highlighting the necessity for standardized evaluation methods.
Findings
GPT-3.5 and GPT-4 responses show bias and norm violations
Existing models sometimes produce content inconsistent with legal and societal values
The need for a transparent, community-established auditing standard is urgent
Abstract
The launch of ChatGPT in November 2022 marked the beginning of a new era in AI: the availability of generative AI tools for everyone to use. ChatGPT and other similar chatbots boast a wide range of capabilities, from answering student homework questions to creating music and art. Given the large amounts of human data chatbots are built on, it is inevitable that they will inherit human errors and biases. These biases have the potential to inflict significant harm on, or increase inequity among, different subpopulations. Because chatbots do not have an inherent understanding of societal values, they may create new content that is contrary to established norms. Examples of concerning generated content include child pornography, inaccurate facts, and discriminatory posts. In this position paper, we argue that the speed of advancement of this technology requires us, as computer and data scientists,…
Taxonomy
Methods: Attention Is All You Need · Sparse Evolutionary Training · Cosine Annealing · Linear Layer · Residual Connection · Linear Warmup With Cosine Annealing · Attention Dropout · Discriminative Fine-Tuning · Multi-Head Attention
