When Models Decide and When They Bind: A Two-Stage Computation for Multiple-Choice Question-Answering
Hugh Mee Wong, Rick Nouwen, Albert Gatt

TL;DR
This paper investigates how language models internally handle multiple-choice question answering, revealing a two-stage process involving content selection and symbol binding, supported by representational and causal analyses.
Contribution
It uncovers a two-stage mechanism in models for MCQA, distinguishing content selection from symbol binding, using representational and causal intervention methods.
Findings
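Residual states at option boundaries (newlines) often carry strong, linearly decodable signals of per-option correctness.
Winner-identity probing shows a two-stage progression: the winning content position becomes decodable right after the final option, while the output symbol is represented closer to the answer emission position.
Tests under symbol and content permutations support the two-stage mechanism.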
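To make the permutation idea concrete, here is a minimal sketch of how a prompt could be rebuilt with the option contents and the option symbols shuffled independently. The function name `permuted_prompt`, the prompt layout, and the example question are hypothetical and are not taken from the paper; the point is only that the correct content position and the correct output symbol can be varied separately.

```python
def permuted_prompt(question, options, answer_index,
                    symbols=("A", "B", "C", "D"),
                    content_perm=None, symbol_perm=None):
    """Build an MCQA prompt with independently permuted contents and symbols.

    Hypothetical helper for illustration; the paper's actual prompt format
    and permutation protocol may differ.
    """
    n = len(options)
    content_perm = list(range(n)) if content_perm is None else list(content_perm)
    symbol_perm = list(range(n)) if symbol_perm is None else list(symbol_perm)

    lines = [question]
    correct_position = None
    correct_symbol = None
    for position in range(n):
        option_idx = content_perm[position]      # which content appears in this slot
        symbol = symbols[symbol_perm[position]]  # which label is attached to this slot
        lines.append(f"{symbol}. {options[option_idx]}")
        if option_idx == answer_index:
            correct_position = position
            correct_symbol = symbol
    lines.append("Answer:")
    return "\n".join(lines), correct_position, correct_symbol


prompt, position, symbol = permuted_prompt(
    "What is the capital of France?",
    ["Paris", "Rome", "Oslo", "Bern"],
    answer_index=0,
    content_perm=[2, 0, 3, 1],   # slot order: Oslo, Paris, Bern, Rome
    symbol_perm=[3, 2, 1, 0],    # labels by slot: D, C, B, A
)
print(prompt)
print(position, symbol)
```

Because the two permutations are separate arguments, a probe trained to read out the winning position can be checked against one trained to read out the winning symbol on the same underlying question.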
Abstract
Multiple-choice question answering (MCQA) is easy to evaluate but adds a meta-task: models must both solve the problem and output the symbol that *represents* the answer, conflating reasoning errors with symbol-binding failures. We study how language models implement MCQA internally using representational analyses (PCA, linear probes) as well as causal interventions. We find that option-boundary (newline) residual states often contain strong linearly decodable signals related to per-option correctness. Winner-identity probing reveals a two-stage progression: the winning *content position* becomes decodable immediately after the final option is processed, while the *output symbol* is represented closer to the answer emission position. Tests under symbol and content permutations support a two-stage mechanism in which models first select a winner in content space and then bind or route…
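As a rough illustration of what "linearly decodable" means here, the sketch below trains a logistic-regression probe on synthetic vectors that stand in for residual states at option boundaries. Everything in it is an assumption made for illustration: the data are random vectors with a planted "correctness" direction, and the dimensions, signal strength, and training settings are arbitrary. It does not use a real model or the paper's setup.

```python
import numpy as np


def make_synthetic_states(n=400, d=32, signal=2.0, seed=0):
    """Random stand-ins for residual states with a planted linear signal.

    Assumption: label 1 means "this option is correct". Real states would
    be read from a model at each option's newline token.
    """
    rng = np.random.default_rng(seed)
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    labels = rng.integers(0, 2, size=n)
    shift = signal * (2 * labels[:, None] - 1) * direction
    states = rng.normal(size=(n, d)) + shift
    return states, labels


def fit_linear_probe(X, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(correct)
        grad = p - y                             # gradient of the log loss w.r.t. logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples whose thresholded probe output matches the label."""
    return float((((X @ w + b) > 0).astype(int) == y).mean())


states, labels = make_synthetic_states()
w, b = fit_linear_probe(states[:300], labels[:300])
test_acc = probe_accuracy(w, b, states[300:], labels[300:])
print(f"held-out probe accuracy: {test_acc:.2f}")
```

High held-out accuracy from a purely linear readout is the kind of evidence the abstract refers to; on real activations one would also compare against a control, such as a probe trained on shuffled labels.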
Taxonomy
Topics: Topic Modeling · Neurobiology of Language and Bilingualism · Multimodal Machine Learning Applications
