Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA
Eduard Tulchinskii, Laida Kushnareva, Kristian Kuznetsov, Anastasia Voznyuk, Andrei Andriiainen, Irina Piontkovskaya, Evgeny Burnaev, Serguei Barannikov

TL;DR
This paper introduces select-and-copy attention heads and new scoring methods to better evaluate and extract knowledge from large language models in multiple-choice question answering, overcoming format limitations.
Contribution
It proposes a novel approach using specific attention heads and scores to improve knowledge extraction and model evaluation in MCQA tasks.
Findings
Up to 16% performance gain on MCQA benchmarks.
Nearly 60% accuracy increase on synthetic datasets.
Effective across models from 7B to 70B parameters.
Abstract
A standard way to evaluate the abilities of LLMs involves presenting a multiple-choice question and selecting the option with the highest logit as the model's predicted answer. However, this evaluation format has limitations: even if the model knows the correct answer, it may struggle to select the corresponding letter simply because it has difficulty following the rigid format. To address this, we introduce new scores that better capture and reveal the model's underlying knowledge: the Query-Key Score (QK-score), derived from the interaction between query and key representations in attention heads, and the Attention Score, based on attention weights. These scores are extracted from specific select-and-copy heads, which show consistent performance across popular Multi-Choice Question Answering (MCQA) datasets. Based on these scores, our method improves knowledge…
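As a rough illustration of the two scores named in the abstract, the sketch below computes them for a single attention head on toy data. It is not the authors' implementation. Several details are assumptions made only for this example: the query is taken from the final token position, each answer option is represented by one key position (`option_positions`), the attention logits use the usual 1/sqrt(d) scaling, and the toy vectors are hand-picked. How the select-and-copy heads themselves are chosen is not shown.

```python
import numpy as np

def qk_scores(query, keys, option_positions):
    """Raw dot products between one query vector and the keys at the option positions."""
    return keys[option_positions] @ query

def attention_scores(query, keys, option_positions):
    """Softmax attention weights over all positions, read off at the option positions."""
    logits = keys @ query / np.sqrt(query.shape[0])
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits) / np.exp(logits).sum()
    return weights[option_positions]

def predict(scores):
    """Index of the option with the highest score."""
    return int(np.argmax(scores))

# Toy head with 6 token positions and head dimension 4 (hypothetical values).
keys = np.array([
    [0.1, 0.0, 0.0, 0.0],   # prompt token
    [1.0, 0.0, 0.0, 0.0],   # option A
    [0.0, 1.0, 0.0, 0.0],   # option B
    [0.0, 0.0, 1.0, 0.0],   # option C
    [0.0, 0.0, 0.0, 1.0],   # option D
    [0.1, 0.1, 0.0, 0.0],   # final token
])
query = np.array([0.2, 0.1, 2.0, 0.3])    # assumed: query at the final position
option_positions = np.array([1, 2, 3, 4])  # assumed: one key position per option

qk = qk_scores(query, keys, option_positions)
attn = attention_scores(query, keys, option_positions)
labels = "ABCD"
print("QK-score choice:", labels[predict(qk)])
print("Attention-score choice:", labels[predict(attn)])
```

In this toy setup both scores pick the same option, since the softmax preserves the ordering of the dot products. The Attention Score additionally depends on every other position in the sequence through the normalisation, whereas the QK-score uses only the option positions.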
Taxonomy
Topics: Neural Networks and Applications · Fault Detection and Control Systems · Reservoir Engineering and Simulation Methods
Methods: Softmax · Attention Is All You Need
