Enhancing Instruction-Following Capabilities in Seq2Seq Models: DoLA Adaptations for T5
Huey Sun, Anabel Yong, Lorenzo Gilly, Felipe Jin

TL;DR
This paper investigates why Seq2Seq models like FLAN-T5 often fail to follow instructions, traces the failures to volatile internal representations in the decoder, and introduces a gradient-based activation-steering method that raises MemoTrap performance from 52% to 99.7%.
Contribution
The paper adapts DoLa to FLAN-T5 to analyze internal representations and proposes a gradient-based activation-steering technique to enhance instruction-following capabilities.
Findings
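Representations in T5's intermediate decoder layers shift rapidly, driven by cross-attention to the encoder.
When each decoder depth is projected through the language modeling head, token preferences are highly volatile, which makes contrastive decoding unreliable.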
Steering improves MemoTrap performance from 52% to 99.7%.
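The volatility finding can be illustrated with a toy sketch. It assumes a shared language modeling head applied to the hidden state at each decoder depth, followed by a DoLa-style contrast between the final layer and one earlier layer. Everything here is hypothetical: the weights, the hidden states, the three-token vocabulary, and the helper names are made up for illustration, and the earlier layer is fixed rather than selected dynamically as DoLa does.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def project(hidden, lm_head):
    # Apply the (toy) LM head: one weight row per vocabulary token.
    return [sum(w * h for w, h in zip(row, hidden)) for row in lm_head]

def layer_top_tokens(hidden_by_layer, lm_head):
    # Top-1 token id at each depth after projecting through the LM head.
    tops = []
    for hidden in hidden_by_layer:
        logits = project(hidden, lm_head)
        tops.append(max(range(len(logits)), key=logits.__getitem__))
    return tops

def contrast_scores(final_hidden, early_hidden, lm_head, alpha=0.1):
    # DoLa-style score: log p_final - log p_early, restricted to tokens
    # whose final-layer probability is at least alpha * max probability.
    p_final = softmax(project(final_hidden, lm_head))
    p_early = softmax(project(early_hidden, lm_head))
    cutoff = alpha * max(p_final)
    return [
        math.log(pf) - math.log(pe) if pf >= cutoff else float("-inf")
        for pf, pe in zip(p_final, p_early)
    ]

# Hypothetical toy values: 3-token vocabulary, 2-dim hidden states, 4 layers.
LM_HEAD = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
HIDDEN_BY_LAYER = [[2.0, 0.5], [0.2, 1.5], [1.8, 1.0], [0.5, 2.0]]

tops = layer_top_tokens(HIDDEN_BY_LAYER, LM_HEAD)
scores = contrast_scores(HIDDEN_BY_LAYER[-1], HIDDEN_BY_LAYER[-2], LM_HEAD)
print(tops)  # [0, 1, 0, 1]
print(scores.index(max(scores)))
```

In this toy setup the preferred token flips at every depth, so the result of the contrast depends heavily on which earlier layer is chosen. That is the kind of instability the finding describes, though the real model's behavior is measured on actual FLAN-T5 activations rather than made-up numbers.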
Abstract
Encoder-decoder models such as FLAN-T5 are finetuned to follow instructions, but often fail when the instructions conflict with memorized continuations ingrained during training. To understand this behavior, we adapt DoLa to FLAN-T5 and examine how representations evolve in the decoder. Our findings show that T5's intermediate layers undergo rapid shifts driven by cross-attention to the encoder. When projected through the language modeling head, each depth presents highly volatile token preferences, leading to unreliable behavior with contrastive decoding. Motivated by this, we introduce a gradient-based activation-steering method that injects an "instruction-compliance" direction into mid-decoder layers, where the representation is both meaningful and still malleable. This intervention dramatically improves MemoTrap performance (52% to 99.7%), demonstrating that mechanistic steering…
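As a rough illustration of gradient-based activation steering, the sketch below nudges a hidden state along the gradient of the log-probability of an instruction-compliant token. It is a minimal toy under stated assumptions, not the authors' implementation: the abstract does not give the exact objective, the layers above the steered state are replaced by a single linear head so the gradient has a closed form, and all weights, token ids, and function names are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from(hidden, lm_head):
    # Toy stand-in for "the rest of the decoder": a single linear head.
    return [sum(w * h for w, h in zip(row, hidden)) for row in lm_head]

def grad_log_prob(hidden, lm_head, target):
    # For a linear head, d/dh log p(target | h) = W[target] - sum_k p_k W[k].
    probs = softmax(logits_from(hidden, lm_head))
    dim = len(hidden)
    expected = [sum(p * row[j] for p, row in zip(probs, lm_head)) for j in range(dim)]
    return [lm_head[target][j] - expected[j] for j in range(dim)]

def steer(hidden, lm_head, target, strength=1.0):
    # Add a unit-norm gradient direction, scaled by `strength` (assumed knob).
    grad = grad_log_prob(hidden, lm_head, target)
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0
    return [h + strength * g / norm for h, g in zip(hidden, grad)]

def top_token(hidden, lm_head):
    logits = logits_from(hidden, lm_head)
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical toy values: token 0 plays the memorized continuation,
# token 1 plays the instruction-compliant one.
LM_HEAD = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
hidden = [2.0, 0.5]
steered = steer(hidden, LM_HEAD, target=1, strength=3.0)

p_before = softmax(logits_from(hidden, LM_HEAD))[1]
p_after = softmax(logits_from(steered, LM_HEAD))[1]
print(top_token(hidden, LM_HEAD), top_token(steered, LM_HEAD))
```

In a real encoder-decoder model the gradient would be obtained by backpropagating through the remaining decoder layers, and the choice of layer and steering strength would need tuning; the abstract only states that mid-decoder layers are targeted.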
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Parallel Computing and Optimization Techniques · Software Engineering Research
