Adding guardrails to advanced chatbots
Yanchen Wang, Lisa Singh

TL;DR
This paper evaluates ChatGPT's fairness and biases across various tasks, highlighting the need for mitigation strategies and impartial review panels to enhance safety and equity in advanced chatbots.
Contribution
It provides an analysis of ChatGPT's strengths and biases, proposing strategies and the establishment of review panels to improve fairness and safety.
Findings
ChatGPT is a fair search engine for the tasks tested
Biases are present in text and code generation
Small prompt changes affect fairness levels
Abstract
Generative AI models continue to become more powerful. The launch of ChatGPT in November 2022 has ushered in a new era of AI. ChatGPT and other similar chatbots have a range of capabilities, from answering student homework questions to creating music and art. There are already concerns that humans may be replaced by chatbots for a variety of jobs. Because of the wide spectrum of data chatbots are built on, we know that they will have human errors and human biases built into them. These biases may cause significant harm and/or inequity toward different subpopulations. To understand the strengths and weaknesses of chatbot responses, we present a position paper that explores different use cases of ChatGPT to determine the types of questions that are answered fairly and the types that still need improvement. We find that ChatGPT is a fair search engine for the tasks we tested; however, it has…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · AI in Service Interactions
