Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording

Aisha Khatun; Daniel G. Brown

arXiv:2306.06199·cs.CL·June 13, 2023·2 cites

TL;DR

This paper systematically analyzes GPT-3's responses to sensitive topics and prompt variations, revealing inconsistencies and specific areas of reliability and unreliability in its outputs.

Contribution

It provides a novel systematic analysis of GPT-3's response patterns to sensitive topics and prompt wording, highlighting vulnerabilities and inconsistencies.

Findings

01

GPT-3 correctly disagrees with obvious conspiracies and stereotypes

02

GPT-3 makes mistakes with misconceptions and controversies

03

Responses are inconsistent across prompts and settings

Abstract

Large language models (LLMs) have become mainstream technology with their versatile use cases and impressive performance. Despite the countless out-of-the-box applications, LLMs are still not reliable. A lot of work is being done to improve the factual accuracy, consistency, and ethical standards of these models through fine-tuning, prompting, and Reinforcement Learning with Human Feedback (RLHF), but no systematic analysis of the responses of these models to different categories of statements, or on their potential vulnerabilities to simple prompting changes is available. In this work, we analyze what confuses GPT-3: how the model responds to certain sensitive topics and what effects the prompt wording has on the model response. We find that GPT-3 correctly disagrees with obvious Conspiracies and Stereotypes but makes mistakes with common Misconceptions and Controversies. The model…
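
The prompt-wording analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the templates, the keyword-based `normalize` function, and the `fake_model` stand-in are all hypothetical, chosen only to show how agreement across rewordings of one statement might be scored. A real study would replace `fake_model` with an actual model call and use the paper's own prompts.

```python
import re
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical prompt templates (assumed for illustration; not taken from the paper).
TEMPLATES: List[str] = [
    "Is this true? {statement}",
    "{statement} Do you agree?",
    "I think that {statement} Am I right?",
]


def normalize(response: str) -> str:
    """Map a free-text reply to 'agree', 'disagree', or 'unclear'.

    This is a crude first-word check, assumed here for simplicity.
    """
    words = re.findall(r"[a-z]+", response.lower())
    first = words[0] if words else ""
    if first in ("yes", "true"):
        return "agree"
    if first in ("no", "false"):
        return "disagree"
    return "unclear"


def consistency(statement: str, ask: Callable[[str], str]) -> Dict[str, object]:
    """Ask the same statement under every template and score label agreement."""
    labels = [normalize(ask(t.format(statement=statement))) for t in TEMPLATES]
    majority_label, majority_count = Counter(labels).most_common(1)[0]
    return {
        "labels": labels,
        "majority": majority_label,
        # Fraction of prompts whose label matches the most common label.
        "consistency": majority_count / len(labels),
    }


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call; it flips its answer for one wording."""
    if prompt.endswith("Am I right?"):
        return "Yes, you are right."
    return "No, that is false."


result = consistency("The Earth is flat.", fake_model)
print(result)
```

With this stand-in, two of the three wordings yield "disagree" and one yields "agree", so the consistency score is 2/3. A score below 1.0 flags a statement whose answer depends on how the prompt is phrased.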

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Cosine Annealing · Layer Normalization · Weight Decay · Softmax