Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions

Ruizhe Li; Yanjun Gao

arXiv:2405.03205 · cs.CL · June 3, 2025 · 1 citation

TL;DR

This paper investigates the positional bias in GPT-2 models for multiple-choice questions, identifies internal mechanisms responsible, and proposes targeted interventions to mitigate bias and improve accuracy.

Contribution

It provides the first mechanistic analysis of anchored bias in GPT-2's MCQ performance and introduces minimal interventions to reduce bias and enhance robustness.

Findings

01. Mitigating bias improves GPT-2 accuracy on MCQs

02. Internal modules responsible for bias identified and modified

03. Targeted interventions significantly reduce anchored bias

Abstract

Large Language Models (LLMs), such as the GPT-4 and LLaMA families, have demonstrated considerable success across diverse tasks, including multiple-choice questions (MCQs). However, these models exhibit positional bias. The GPT-2 family shows an especially severe form, anchored bias, in which the models consistently favour the first choice 'A' in MCQs during inference. This anchored bias undermines the integrity of GPT-2's decision-making process, because performance depends on the position of the choices rather than their content. In this study, we use a mechanistic interpretability approach to identify the internal modules within GPT-2 models responsible for this bias. We focus on the Multi-Layer Perceptron (MLP) layers and attention heads, using the "logit lens" method to trace and modify the specific value vectors that contribute to the bias. By updating these vectors within MLP…
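The "logit lens" idea mentioned in the abstract can be illustrated with a toy sketch: a vector from the model's hidden space is projected through the unembedding matrix to see which vocabulary tokens it promotes. This is not the authors' implementation. The function name `logit_lens`, the four-token vocabulary, and all numbers below are hypothetical and chosen only to show the projection step. A real analysis would use GPT-2's actual weights.

```python
# Toy logit-lens sketch. Every name and number here is an illustrative
# assumption, not taken from the paper or from GPT-2's weights.

def logit_lens(value_vector, unembedding, vocab):
    """Project a hidden-space vector onto the vocabulary.

    Returns (token, logit) pairs sorted from highest to lowest logit.
    """
    logits = {
        token: sum(w * v for w, v in zip(row, value_vector))
        for token, row in zip(vocab, unembedding)
    }
    return sorted(logits.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical answer-letter vocabulary and unembedding rows (one per token).
VOCAB = ["A", "B", "C", "D"]
W_U = [
    [1.0, 0.0, 0.0],  # row for "A"
    [0.0, 1.0, 0.0],  # row for "B"
    [0.0, 0.0, 1.0],  # row for "C"
    [0.5, 0.5, 0.0],  # row for "D"
]

# Hypothetical MLP value vector that leans towards the "A" direction.
value_vector = [2.0, 0.5, 0.1]

ranking = logit_lens(value_vector, W_U, VOCAB)
top_token = ranking[0][0]
print(ranking)
```

In this toy setup the vector promotes "A" most strongly, which is the kind of signal one would look for when tracing which value vectors push a model towards the first choice.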

Code & Models

Repositories

ruizheliuoa/anchored_bias_gpt2
Official

Videos

Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions· underline

Taxonomy

Topics: Clinical Reasoning and Diagnostic Skills · Decision-Making and Behavioral Economics · Meta-analysis and systematic reviews

Methods: Attention Is All You Need · Label Smoothing · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Cosine Annealing · Dense Connections · Transformer · Dropout · Weight Decay