Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models

Zeping Yu; Yonatan Belinkov; Sophia Ananiadou

arXiv:2502.10835·cs.CL·February 18, 2025


TL;DR

This paper introduces logit flow, an interpretability method for tracing how logits reach the final prediction, and proposes back attention, a mechanism that improves multi-hop reasoning accuracy in large language models.

Contribution

We develop logit flow for analyzing reasoning processes and propose back attention, a novel mechanism that boosts multi-hop reasoning performance in LLMs.

Findings

01 Logit flow reveals four stages in single-hop knowledge prediction.

02 Back attention improves reasoning accuracy across multiple datasets.

03 A 1-layer transformer with back attention matches the performance of a 2-layer transformer.

Abstract

We investigate how large language models perform latent multi-hop reasoning in prompts like "Wolfgang Amadeus Mozart's mother's spouse is". To analyze this process, we introduce logit flow, an interpretability method that traces how logits propagate across layers and positions toward the final prediction. Using logit flow, we identify four distinct stages in single-hop knowledge prediction: (A) entity subject enrichment, (B) entity attribute extraction, (C) relation subject enrichment, and (D) relation attribute extraction. Extending this analysis to multi-hop reasoning, we find that failures often stem from the relation attribute extraction stage, where conflicting logits reduce prediction accuracy. To address this, we propose back attention, a novel mechanism that enables lower layers to leverage higher-layer hidden states from different positions during attention computation. With…
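
The back attention idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The assignment of queries to the lower layer and keys/values to the higher layer is one reading of the abstract, and the single head, the causal mask, the residual addition, the random weights, and all names (`back_attention`, `h_low`, `h_high`, `w_q`, `w_k`, `w_v`, `w_o`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def back_attention(h_low, h_high, w_q, w_k, w_v, w_o):
    """Hypothetical single-head back attention.

    h_low:  (T, d) hidden states from a lower layer (source of queries).
    h_high: (T, d) hidden states from a higher layer (source of keys/values).
    Returns the updated lower-layer states and the attention weights.
    """
    T = h_low.shape[0]
    q = h_low @ w_q    # queries from the lower layer (assumption)
    k = h_high @ w_k   # keys from the higher layer (assumption)
    v = h_high @ w_v   # values from the higher layer (assumption)

    scores = q @ k.T / np.sqrt(q.shape[-1])

    # Assumed causal mask: position i attends only to positions j <= i.
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    weights = softmax(scores, axis=-1)

    # Assumed residual update of the lower-layer states.
    return h_low + (weights @ v) @ w_o, weights


# Toy usage with random inputs (illustrative values only).
rng = np.random.default_rng(0)
T, d = 5, 8
h_low = rng.normal(size=(T, d))
h_high = rng.normal(size=(T, d))
w_q, w_k, w_v, w_o = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(4))

out, weights = back_attention(h_low, h_high, w_q, w_k, w_v, w_o)
print(out.shape)
print(weights.round(2))
```

In this sketch each row of `weights` shows how much one lower-layer position draws on higher-layer states at the same or earlier positions, which is the sense in which a lower layer can use higher-layer information from different positions.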


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Softmax · Attention Is All You Need