Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
Sarah Wiegreffe, Oyvind Tafjord, Yonatan Belinkov, Hannaneh Hajishirzi, Ashish Sabharwal

TL;DR
This paper investigates how transformer language models answer multiple-choice questions: it analyzes their internal mechanisms, identifies the layers and attention heads responsible for answer prediction, and proposes a synthetic task to diagnose model learning.
Contribution
It applies vocabulary projection and activation patching to localize the hidden states that drive model decisions on MCQA tasks, reveals the roles of specific layers and attention heads, and presents a synthetic diagnostic task for error analysis.
Findings
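The prediction of a specific answer symbol is causally attributed to a few middle layers, specifically their multi-head self-attention mechanisms.
Later layers increase the probability of the predicted answer symbol in vocabulary space, an increase associated with specific attention heads.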
Models' ability to distinguish answer choices improves over training.
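The idea behind tracking answer probability across layers can be illustrated with a minimal sketch of vocabulary projection. Everything here is an assumption made for illustration: the hidden states and the unembedding matrix are synthetic NumPy arrays, `answer_id` is a hypothetical token index, and the final layer normalization that real models apply before unembedding is omitted. It is not the paper's code.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def project_to_vocab(hidden_states, unembed):
    """Map per-layer hidden states (n_layers, d_model) to vocabulary
    probabilities (n_layers, vocab_size) via the unembedding matrix."""
    return softmax(hidden_states @ unembed)

rng = np.random.default_rng(0)
n_layers, d_model, vocab_size = 6, 16, 10
answer_id = 3  # hypothetical vocabulary index of the correct answer symbol

# Synthetic unembedding matrix with unit-norm columns (one column per token).
unembed = rng.normal(size=(d_model, vocab_size))
unembed /= np.linalg.norm(unembed, axis=0, keepdims=True)

# Synthetic hidden states: a fixed base vector plus a growing component along
# the answer token's unembedding direction. Made-up data, not model output.
base = 0.1 * rng.normal(size=d_model)
strengths = np.linspace(0.0, 8.0, n_layers)
hidden_states = base + strengths[:, None] * unembed[:, answer_id]

probs = project_to_vocab(hidden_states, unembed)
answer_prob_by_layer = probs[:, answer_id]
for layer, p in enumerate(answer_prob_by_layer):
    print(f"layer {layer}: P(answer symbol) = {p:.3f}")
```

With a real model, `hidden_states` would instead hold the residual-stream vectors at the final token position, one per layer, and `unembed` would be the model's own output embedding.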
Abstract
Multiple-choice question answering (MCQA) is a key competence of performant transformer language models that is tested by mainstream benchmarks. However, recent evidence shows that models can have quite a range of performance, particularly when the task format is diversified slightly (such as by shuffling answer choice order). In this work we ask: how do successful models perform formatted MCQA? We employ vocabulary projection and activation patching methods to localize key hidden states that encode relevant information for predicting the correct answer. We find that the prediction of a specific answer symbol is causally attributed to a few middle layers, and specifically their multi-head self-attention mechanisms. We show that subsequent layers increase the probability of the predicted answer symbol in vocabulary space, and that this probability increase is associated with a sparse set…
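Activation patching, one of the two localization methods named in the abstract, can be sketched on a toy network. The sketch below is a hedged illustration only: the residual network, the helper names `run` and `logit_diff`, the random inputs, and the choice of `correct_id` and `wrong_id` are all hypothetical stand-ins rather than the paper's setup. It caches each layer's output on a "clean" input, overwrites one layer's output during a "corrupted" run, and records how much the logit difference between two answer symbols shifts.

```python
import numpy as np

def run(x, weights, readout, patch=None):
    """Run a toy residual network and return (logits, per-layer updates).
    If patch = (layer_index, update), that layer's output is overwritten."""
    h = x
    updates = []
    for i, w in enumerate(weights):
        update = np.tanh(h @ w)
        if patch is not None and patch[0] == i:
            update = patch[1]  # substitute the cached activation
        updates.append(update)
        h = h + update  # residual connection
    return h @ readout, updates

def logit_diff(logits, correct_id, wrong_id):
    # Metric: how strongly the correct symbol is preferred over a wrong one.
    return logits[correct_id] - logits[wrong_id]

rng = np.random.default_rng(0)
n_layers, d_model, n_symbols = 4, 8, 4
weights = [rng.normal(scale=0.5, size=(d_model, d_model)) for _ in range(n_layers)]
readout = rng.normal(size=(d_model, n_symbols))
clean_x = rng.normal(size=d_model)    # stands in for the original prompt
corrupt_x = rng.normal(size=d_model)  # stands in for a perturbed prompt
correct_id, wrong_id = 0, 1           # arbitrary labels for this toy example

clean_logits, clean_updates = run(clean_x, weights, readout)
corrupt_logits, corrupt_updates = run(corrupt_x, weights, readout)
baseline = logit_diff(corrupt_logits, correct_id, wrong_id)

# Patch one layer at a time: clean activation into the corrupted run.
effects = []
for layer in range(n_layers):
    patched_logits, _ = run(corrupt_x, weights, readout,
                            patch=(layer, clean_updates[layer]))
    effects.append(logit_diff(patched_logits, correct_id, wrong_id) - baseline)

for layer, effect in enumerate(effects):
    print(f"layer {layer}: change in logit difference = {effect:+.3f}")
```

Layers whose patched output moves the metric most are the candidates for carrying the answer-relevant information. In a transformer, the same loop would typically run over individual components such as attention or MLP outputs at specific token positions.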
Taxonomy
Topics: Cognitive Science and Education Research · Language, Metaphor, and Cognition · Discourse Analysis in Language Studies
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training · Activation Patching
