Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions
Yao Yan

TL;DR
This paper investigates how Llama-3-8B finalizes three-digit addition answers after cross-token routing. It reveals a late-layer boundary beyond which the last input token dominates the decoded sum, and shows that digit directions from different contexts are related by an approximately orthogonal map within a shared low-rank subspace.
Contribution
It characterizes how Llama-3 forms arithmetic answers in the post-routing regime, identifying a late-layer boundary and a shared low-rank structure in the maps between digit directions.
Findings
After layer 17, the decoded sum is controlled mainly by the last input token
Digit direction dictionaries are related by an orthogonal map in a low-rank subspace
Rotating digit directions through learned maps enables effective counterfactual editing
Abstract
We study three-digit addition in Meta-Llama-3-8B (base) under a one-token readout to characterize how arithmetic answers are finalized after cross-token routing becomes causally irrelevant. Causal residual patching and cumulative attention ablations localize a sharp boundary near layer 17: beyond it, the decoded sum is controlled almost entirely by the last input token and late-layer self-attention is largely dispensable. In this post-routing regime, digit(-sum) direction dictionaries vary with a next-higher-digit context but are well-related by an approximately orthogonal map inside a shared low-rank subspace (low-rank Procrustes alignment). Causal digit editing matches this geometry: naive cross-context transfer fails, while rotating directions through the learned map restores strict counterfactual edits; negative controls do not recover.
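The low-rank Procrustes alignment mentioned in the abstract can be illustrated with a minimal sketch. It assumes two digit-direction dictionaries, one per context, stored as arrays of shape (n_digits, d_model). The function names `low_rank_procrustes` and `transfer`, the choice of shared subspace (top right-singular vectors of the stacked dictionaries), and the `rank` parameter are all hypothetical; the abstract does not specify the paper's exact procedure, so this is a standard orthogonal Procrustes solution, not the author's implementation.

```python
import numpy as np

def low_rank_procrustes(D_src, D_tgt, rank):
    """Fit an orthogonal map between two direction dictionaries in a shared subspace.

    D_src, D_tgt: arrays of shape (n_digits, d_model), one row per digit.
    Returns U (d_model, rank) and R (rank, rank) with (D_src @ U) @ R ~ D_tgt @ U.
    Assumption: the shared subspace is spanned by the top `rank` right-singular
    vectors of the stacked dictionaries.
    """
    stacked = np.vstack([D_src, D_tgt])
    _, _, Vt = np.linalg.svd(stacked, full_matrices=False)
    U = Vt[:rank].T                      # orthonormal basis of the shared subspace
    A = D_src @ U                        # source coordinates in the subspace
    B = D_tgt @ U                        # target coordinates in the subspace
    # Orthogonal Procrustes: minimize ||A R - B||_F subject to R^T R = I.
    # With A^T B = W S Z^T, the minimizer is R = W Z^T.
    W, _, Zt = np.linalg.svd(A.T @ B)
    R = W @ Zt
    return U, R

def transfer(directions, U, R):
    """Rotate source-context directions into the target context (hypothetical helper)."""
    return (directions @ U) @ R @ U.T
```

Under this sketch, a naive cross-context edit would add a row of `D_src` directly in the target context, whereas the rotated edit would add `transfer(D_src[k], U, R)` instead. Whether this matches the paper's editing protocol in detail is an assumption.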
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Cognitive and developmental aspects of mathematical skills · Logic, programming, and type systems
