Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries

Tianyi Lorena Yan; Robin Jia

arXiv:2502.20475·cs.CL·September 24, 2025

TL;DR

This paper identifies a promote-then-suppress mechanism by which language models answer one-to-many factual queries: the model first recalls all answers, then suppresses the ones it has already generated, so that it can list multiple answers without repeating itself.

Contribution

The study analyzes the internal promote-then-suppress process that LMs use for one-to-many factual recall, supported by early decoding, causal tracing, and experimental tools such as Token Lens and knockout.

Findings

01. Models use both the subject and previous answer tokens for knowledge recall.

02. Attention propagates subject information and suppresses previous answers, while MLPs promote the answers and amplify the suppression signal.

03. Experimental tools like Token Lens and knockout validate the mechanism.
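
As a rough illustration of what a knockout experiment does, the sketch below blocks attention to chosen key positions by setting their scores to negative infinity before the softmax, then renormalizes over the remaining positions. This is a generic form of attention knockout written for this summary, not the paper's code. The function names, scores, and positions are hypothetical, and Token Lens is not reproduced here because this page does not describe it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(scores, knocked_out=()):
    """Attention weights for one query, with some key positions knocked out.

    `knocked_out` holds the key positions the query may not attend to
    (hypothetical example: the positions of previous answer tokens).
    """
    if all(i in knocked_out for i in range(len(scores))):
        raise ValueError("cannot knock out every key position")
    masked = [
        float("-inf") if i in knocked_out else s
        for i, s in enumerate(scores)
    ]
    return softmax(masked)


# Toy scores for three key positions; position 1 is knocked out.
weights = attention_weights([1.0, 2.0, 3.0], knocked_out={1})
print(weights)
```

Comparing the model's output with and without the knockout indicates how much the blocked positions contribute to the prediction.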

Abstract

To answer one-to-many factual queries (e.g., listing cities of a country), a language model (LM) must simultaneously recall knowledge and avoid repeating previous answers. How are these two subtasks implemented and integrated internally? Across multiple datasets, models, and prompt templates, we identify a promote-then-suppress mechanism: the model first recalls all answers, and then suppresses previously generated ones. Specifically, LMs use both the subject and previous answer tokens to perform knowledge recall, with attention propagating subject information and MLPs promoting the answers. Then, attention attends to and suppresses previous answer tokens, while MLPs amplify the suppression signal. Our mechanism is corroborated by extensive experimental evidence: in addition to using early decoding and causal tracing, we analyze how components use different tokens by introducing both…
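
The promote-then-suppress idea in the abstract can be pictured with a toy calculation. The sketch below is not the paper's code and does not model attention or MLPs. It only assumes that every correct answer receives a positive "promotion" score at each step and that answers already generated receive a fixed penalty. All names and numbers are hypothetical.

```python
def step_logits(promotion, previous, suppression=4.0):
    """Toy logits for one generation step.

    promotion: dict mapping candidate token -> promotion score
               (stands in for the recall step that promotes all answers).
    previous:  answers generated so far, which are then suppressed.
    """
    logits = dict(promotion)  # promote: every answer starts with its score
    for token in previous:
        if token in logits:
            logits[token] -= suppression  # suppress already-generated answers
    return logits


def list_answers(promotion, n_answers, suppression=4.0):
    """Iterate: pick the highest-logit token, then suppress it next step."""
    previous = []
    for _ in range(n_answers):
        logits = step_logits(promotion, previous, suppression)
        previous.append(max(logits, key=logits.get))
    return previous


# Hypothetical scores for "cities of France"; Berlin is a non-answer.
promotion = {"Paris": 3.0, "Lyon": 2.0, "Nice": 1.5, "Berlin": 0.0}
print(list_answers(promotion, 3))  # ['Paris', 'Lyon', 'Nice']
```

With `suppression=0.0` the same loop returns `['Paris', 'Paris', 'Paris']`, which is the repetition failure that the suppression step prevents.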

Code & Models

Repositories

lorenayannnnn/how-lms-answer-one-to-many-factual-queries
PyTorch · Official

Datasets

LorenaYannnnn/how_lms_answer_one_to_many_factual_queries
dataset · 9 dl

Taxonomy

Methods: Softmax · Attention Is All You Need