Instruction Anchor: Dissecting the Mechanistic Dynamics of Modality Arbitration
Yu Zhang, Mufan Xu, Xuefeng Bai, Kehai Chen, Pengfei Zhang, Yang Xiang, Min Zhang

TL;DR
This paper investigates how multimodal large language models (MLLMs) decide which modality to follow based on instructions, revealing the internal attention mechanisms and specific heads responsible for this process.
Contribution
The paper shows that instruction tokens act as anchors for modality arbitration and identifies a sparse subset of attention heads that controls it, giving a mechanistic account of modality following.
Findings
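Shallow attention layers transfer multimodal cues to the instruction tokens without differentiating between modalities, so the instruction tokens act as a latent buffer.
Deep attention layers selectively strengthen the instruction-compliant subspace and resolve modality arbitration according to the intent the instruction specifies.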
Blocking 5% of key attention heads significantly impairs modality following, while amplification can improve it.
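To make the blocking and amplification finding concrete, here is a minimal sketch of head-level intervention on a toy attention layer. It is not the authors' procedure: the function names, the random toy inputs, and the gain values are all assumptions chosen for illustration, and the per-head output projection of a real transformer is omitted. Each head's output is multiplied by a gain, where 0 blocks the head, 1 leaves it unchanged, and a value above 1 amplifies it.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def head_output(queries, keys, values):
    """Causal scaled dot-product attention for one head (one vector per position)."""
    d = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Position i may only attend to positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs


def attention_layer(heads, gains):
    """Sum per-head outputs, each scaled by its gain.

    gain 0 blocks a head, 1 keeps it, >1 amplifies it (hypothetical interface).
    """
    n_pos = len(heads[0][0])
    dim = len(heads[0][2][0])
    out = [[0.0] * dim for _ in range(n_pos)]
    for (q, k, v), gain in zip(heads, gains):
        for i, vec in enumerate(head_output(q, k, v)):
            for c in range(dim):
                out[i][c] += gain * vec[c]
    return out


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy setup (assumed sizes): 4 heads, 5 token positions, 3 dimensions per head.
random.seed(0)
N_HEADS, N_POS, DIM = 4, 5, 3
heads = [
    (rand_matrix(N_POS, DIM), rand_matrix(N_POS, DIM), rand_matrix(N_POS, DIM))
    for _ in range(N_HEADS)
]

baseline = attention_layer(heads, [1.0, 1.0, 1.0, 1.0])
blocked = attention_layer(heads, [1.0, 0.0, 1.0, 1.0])    # head 1 knocked out
amplified = attention_layer(heads, [1.0, 2.0, 1.0, 1.0])  # head 1 doubled
```

In this sketch, the difference between `baseline` and `blocked` is exactly the contribution of head 1. In a real MLLM the same idea would be applied to the heads identified as relevant, and the effect would be measured on modality-following behavior rather than on raw layer outputs.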
Abstract
Modality following is the ability to selectively leverage multimodal contexts based on user instructions. It is fundamental to the safety and reliability of multimodal large language models (MLLMs) in real-world deployments. However, the internal mechanisms governing this decision-making process remain largely under-explored. In this work, we investigate the mechanism underlying modality following through an information flow perspective. Our findings reveal that instruction tokens serve as a structural anchor for modality arbitration: shallow attention layers perform undifferentiated information transfer, aggregating multimodal cues to instruction tokens as a latent buffer; in contrast, deep attention layers selectively strengthen the instruction-compliant subspace and resolve modality arbitration according to the instruction-specified intent, with a sparse subset of attention heads…
