Instruction Anchor: Dissecting the Mechanistic Dynamics of Modality Arbitration

Yu Zhang; Mufan Xu; Xuefeng Bai; Kehai Chen; Pengfei Zhang; Yang Xiang; Min Zhang

arXiv:2602.03677·cs.CL·May 12, 2026

TL;DR

This paper investigates how multimodal large language models (MLLMs) decide which modality to follow based on instructions, revealing the internal attention mechanisms and specific heads responsible for this process.

Contribution

It uncovers the role of instruction tokens as anchors and identifies key attention heads that control modality arbitration, providing a mechanistic understanding of this behavior.

Findings

01

Attention layers transfer multimodal cues to instruction tokens early on.

02

Deep attention layers focus on instruction-compliant subspaces for modality arbitration.

03

Blocking 5% of key attention heads significantly impairs modality following, while amplification can improve it.
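
The head-level intervention in finding 03 can be illustrated with a toy example. The sketch below is not the authors' code or model: it builds a small, randomly initialized multi-head attention layer in NumPy and multiplies each head's contribution by a per-head scale before the contributions are summed. A scale of 0 blocks a head and a scale above 1 amplifies it. The function `multi_head_attention`, the `head_scale` parameter, the tensor sizes, and the choice of which head to intervene on are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(x, wq, wk, wv, wo, head_scale):
    """Toy causal multi-head attention with a per-head output scale.

    x:          (seq, d_model) token representations
    wq, wk, wv: (heads, d_model, d_head) per-head projections
    wo:         (heads, d_head, d_model) per-head output projections
    head_scale: (heads,) multiplier applied to each head's output
                (hypothetical knob: 0 blocks a head, >1 amplifies it)
    """
    q = np.einsum("sd,hde->hse", x, wq)
    k = np.einsum("sd,hde->hse", x, wk)
    v = np.einsum("sd,hde->hse", x, wv)

    scores = np.einsum("hse,hte->hst", q, k) / np.sqrt(q.shape[-1])
    seq = x.shape[0]
    future = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)  # causal mask
    attn = softmax(scores, axis=-1)

    mixed = np.einsum("hst,hte->hse", attn, v)
    # Each head's separate contribution to the layer output.
    per_head_out = np.einsum("hse,hed->hsd", mixed, wo)
    return (head_scale[:, None, None] * per_head_out).sum(axis=0)


# Illustrative sizes and random weights (assumptions, not from the paper).
rng = np.random.default_rng(0)
heads, d_model, d_head, seq = 4, 8, 2, 5
x = rng.normal(size=(seq, d_model))
wq, wk, wv = (rng.normal(size=(heads, d_model, d_head)) for _ in range(3))
wo = rng.normal(size=(heads, d_head, d_model))

target_head = 1  # arbitrary stand-in for a "key" head

base_scale = np.ones(heads)
blocked_scale = base_scale.copy()
blocked_scale[target_head] = 0.0    # block the head
amplified_scale = base_scale.copy()
amplified_scale[target_head] = 2.0  # amplify the head

base = multi_head_attention(x, wq, wk, wv, wo, base_scale)
blocked = multi_head_attention(x, wq, wk, wv, wo, blocked_scale)
amplified = multi_head_attention(x, wq, wk, wv, wo, amplified_scale)

# How far each intervention moves the layer output from the baseline.
print("shift when blocked:  ", np.linalg.norm(blocked - base))
print("shift when amplified:", np.linalg.norm(amplified - base))
```

In this single-layer toy the head contributions add linearly, so blocking and doubling one head shift the output by equal and opposite amounts. In a real MLLM the shifted output passes through later layers, so the downstream effect on modality following has to be measured empirically, as the paper does.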

Abstract

Modality following is the ability to selectively leverage multimodal contexts based on user instructions. It is fundamental to the safety and reliability of multimodal large language models (MLLMs) in real-world deployments. However, the internal mechanisms governing this decision-making process remain largely under-explored. In this work, we investigate the mechanism underlying modality following through an information flow perspective. Our findings reveal that instruction tokens serve as structural anchors for modality arbitration: shallow attention layers perform undifferentiated information transfer, aggregating multimodal cues to instruction tokens as a latent buffer; in contrast, deep attention layers selectively strengthen the instruction-compliant subspace and resolve modality arbitration according to the instruction-specified intent, with a sparse subset of attention heads…
