How LLMs Follow Instructions: Skillful Coordination, Not a Universal Mechanism
Elisabetta Rocchetti, Alfio Ferrara

TL;DR
This paper challenges the assumption that instruction-tuned LLMs follow instructions through a single universal mechanism. Its evidence suggests that instruction-following instead involves coordinating diverse, task-specific capabilities with limited shared representations.
Contribution
The paper provides empirical evidence that, in instruction-tuned models, instruction-following relies on task-specific skills rather than a single universal process.
Findings
General probes trained across all tasks consistently underperform task-specific probes, indicating limited representational sharing.
Cross-task transfer is weak and clusters by skill similarity.
Causal ablation reveals sparse, asymmetric dependencies rather than shared representations.
Abstract
Instruction tuning is commonly assumed to endow language models with a domain-general ability to follow instructions, yet the underlying mechanism remains poorly understood. Does instruction-following rely on a universal mechanism or compositional skill deployment? We investigate this through diagnostic probing across nine diverse tasks in three instruction-tuned models. Our analysis provides converging evidence against a universal mechanism. First, general probes trained across all tasks consistently underperform task-specific specialists, indicating limited representational sharing. Second, cross-task transfer is weak and clustered by skill similarity. Third, causal ablation reveals sparse asymmetric dependencies rather than shared representations. Tasks also stratify by complexity across layers, with structural constraints emerging early and semantic tasks emerging late. Finally,…
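The comparison between a general probe and task-specific specialists can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' setup: the "activations" are synthetic Gaussian vectors, each hypothetical task's label depends on a different coordinate (so the tasks share no representation by construction), and the probe is a plain logistic regression fitted with SGD. The names `make_task`, `train_probe`, `accuracy`, and `compare_probes` are hypothetical helpers.

```python
import math
import random


def make_task(rng, axis, n, dim):
    """Synthetic stand-in for hidden activations (assumption, not real model data).

    The binary label is the sign of one task-specific coordinate.
    """
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        data.append((x, 1 if x[axis] > 0 else 0))
    return data


def train_probe(data, dim, epochs=20, lr=0.1, seed=0):
    """Fit a linear probe (logistic regression) with plain SGD."""
    rng = random.Random(seed)
    data = list(data)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def accuracy(probe, data):
    """Fraction of examples whose label the probe predicts correctly."""
    w, b = probe
    hits = 0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hits += int((score > 0) == (y == 1))
    return hits / len(data)


def compare_probes(n_tasks=3, dim=8, n_train=200, n_test=200, seed=0):
    """Return specialist accuracies, general-probe accuracies, and a transfer matrix."""
    rng = random.Random(seed)
    train = [make_task(rng, t, n_train, dim) for t in range(n_tasks)]
    test = [make_task(rng, t, n_test, dim) for t in range(n_tasks)]

    # One specialist probe per task; one general probe on the pooled data.
    specialists = [train_probe(train[t], dim) for t in range(n_tasks)]
    general = train_probe([ex for task in train for ex in task], dim)

    specialist_acc = [accuracy(specialists[t], test[t]) for t in range(n_tasks)]
    general_acc = [accuracy(general, test[t]) for t in range(n_tasks)]
    # transfer[i][j]: probe trained on task i, evaluated on task j.
    transfer = [[accuracy(specialists[i], test[j]) for j in range(n_tasks)]
                for i in range(n_tasks)]
    return specialist_acc, general_acc, transfer
```

Calling `compare_probes()` returns per-task accuracies for the specialists and the general probe, plus the transfer matrix. Because the synthetic tasks are built to share nothing, the specialists should score well above the general probe and off-diagonal transfer should sit near chance. This only shows what the diagnostic looks like when representations are not shared; it says nothing about what real models do.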
