Layer-Order Inversion: Rethinking Latent Multi-Hop Reasoning in Large Language Models
Xukai Liu, Ye Liu, Jipeng Zhang, Yanghai Zhang, Kai Zhang, Qi Liu

TL;DR
This paper challenges existing assumptions about how large language models perform multi-hop reasoning, revealing that answer entities can be decoded earlier than bridge entities, and proposes a probabilistic framework to explain this behavior.
Contribution
It introduces the layer-order inversion phenomenon and a probabilistic recall-and-extract model to better understand multi-hop reasoning in LLMs.
Findings
Later-hop answer entities can be decoded before bridge entities.
The proposed framework explains multi-hop reasoning behaviors and failures.
Systematic probing validates the probabilistic recall-and-extract model.
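As a rough illustration of the first finding, the sketch below checks for layer-order inversion given per-layer probe scores. It is a minimal sketch under stated assumptions, not the paper's probing procedure: the function names, the 0.5 threshold, and the score lists are all hypothetical, and "decodable" is assumed to mean that a probe's probability for the entity first reaches the threshold at that layer.

```python
from typing import Optional, Sequence


def first_decodable_layer(probs: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Return the index of the first layer whose probe score reaches the threshold.

    `probs[i]` is assumed to be a probe's probability for the target entity
    at layer i. Returns None if the entity never becomes decodable.
    """
    for layer, p in enumerate(probs):
        if p >= threshold:
            return layer
    return None


def is_layer_order_inverted(
    bridge_probs: Sequence[float],
    answer_probs: Sequence[float],
    threshold: float = 0.5,
) -> bool:
    """True if the answer entity becomes decodable at an earlier layer than the bridge entity.

    Assumption: if either entity never reaches the threshold, the pair is
    treated as not comparable and the function returns False.
    """
    bridge_layer = first_decodable_layer(bridge_probs, threshold)
    answer_layer = first_decodable_layer(answer_probs, threshold)
    if bridge_layer is None or answer_layer is None:
        return False
    return answer_layer < bridge_layer


# Synthetic scores for illustration only (not results from the paper).
bridge_scores = [0.1, 0.2, 0.3, 0.6, 0.8]   # bridge entity first reaches 0.5 at layer 3
answer_scores = [0.1, 0.55, 0.7, 0.8, 0.9]  # answer entity first reaches 0.5 at layer 1

inverted = is_layer_order_inverted(bridge_scores, answer_scores)
```

Under the hop-aligned assumption one would expect the bridge entity to cross the threshold first; here the synthetic answer scores cross it earlier, so `inverted` is `True`.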
Abstract
Large language models (LLMs) perform well on multi-hop reasoning, yet how they internally compose multiple facts remains unclear. Recent work proposes the hop-aligned circuit hypothesis, which holds that bridge entities are computed sequentially across layers before later-hop answers. Through systematic analyses on real-world multi-hop queries, we show that this hop-aligned assumption does not generalize: later-hop answer entities can become decodable earlier than bridge entities, a phenomenon we call layer-order inversion, which strengthens as the total number of hops grows. To explain this behavior, we propose a probabilistic recall-and-extract framework that models multi-hop reasoning as broad probabilistic recall in shallow MLP layers followed by selective extraction in deeper attention layers. This framework is empirically validated through systematic probing analyses, reinterpreting…
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
