Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording
Aisha Khatun, Daniel G. Brown

TL;DR
This paper systematically analyzes GPT-3's responses to sensitive topics and prompt variations, revealing inconsistencies and specific areas of reliability and unreliability in its outputs.
Contribution
It provides a novel systematic analysis of GPT-3's response patterns to sensitive topics and prompt wording, highlighting vulnerabilities and inconsistencies.
Findings
GPT-3 correctly disagrees with obvious conspiracies and stereotypes
GPT-3 makes mistakes with misconceptions and controversies
Responses are inconsistent across prompts and settings
Abstract
Large language models (LLMs) have become mainstream technology with their versatile use cases and impressive performance. Despite the countless out-of-the-box applications, LLMs are still not reliable. A lot of work is being done to improve the factual accuracy, consistency, and ethical standards of these models through fine-tuning, prompting, and Reinforcement Learning with Human Feedback (RLHF), but no systematic analysis of the responses of these models to different categories of statements, or on their potential vulnerabilities to simple prompting changes is available. In this work, we analyze what confuses GPT-3: how the model responds to certain sensitive topics and what effects the prompt wording has on the model response. We find that GPT-3 correctly disagrees with obvious Conspiracies and Stereotypes but makes mistakes with common Misconceptions and Controversies. The model…
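One way to picture the kind of prompt-wording analysis described above is a small consistency check: the same statement is posed under several wordings, and the share of answers that match the most common answer is averaged per category. The sketch below is illustrative only and is not the authors' code. The templates, the `ask` callable, the `toy_model` stand-in, and the example statements are all hypothetical; a real study would replace `toy_model` with an actual model call and a response normalizer.

```python
from collections import Counter

# Hypothetical prompt templates; the paper's actual wordings are not reproduced here.
TEMPLATES = [
    "Is this true? {statement}",
    "{statement} Do you agree?",
    "I believe the following: {statement} Am I right?",
]


def consistency(responses):
    """Fraction of responses that match the most common response."""
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    return counts.most_common(1)[0][1] / len(responses)


def evaluate(statements, ask):
    """Return the mean consistency per category across all templates.

    statements: iterable of (category, text) pairs
    ask: callable mapping a prompt string to a normalized answer ("yes"/"no")
    """
    per_category = {}
    for category, text in statements:
        answers = [ask(t.format(statement=text)) for t in TEMPLATES]
        per_category.setdefault(category, []).append(consistency(answers))
    return {c: sum(v) / len(v) for c, v in per_category.items()}


def toy_model(prompt):
    # Stand-in for a real model call (assumption): the answer flips
    # whenever the prompt asserts a belief, mimicking wording sensitivity.
    return "yes" if prompt.startswith("I believe") else "no"


if __name__ == "__main__":
    data = [
        ("Conspiracy", "The moon landing was staged."),
        ("Misconception", "Humans use only 10% of their brains."),
    ]
    print(evaluate(data, toy_model))
```

A score of 1.0 for a category means every wording produced the same answer; lower scores indicate that rephrasing alone changed the response.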
Taxonomy
Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Layer Normalization · Weight Decay · Softmax
