Sparse Activation Editing for Reliable Instruction Following in Narratives

Runcong Zhao; Chengyu Cao; Qinglin Zhu; Xiucheng Lv; Shun Shao; Lin Gui; Ruifeng Xu; Yulan He

arXiv:2505.16505·cs.CL·May 23, 2025

TL;DR

This paper introduces Concise-SAE, a training-free neuron editing method that enhances language models' ability to follow instructions in narratives, validated on a new diverse benchmark, FreeInstruct.

Contribution

The paper presents a training-free neuron editing framework that improves instruction following in narrative contexts using only natural language instructions, with no labeled data required.

Findings

01. Achieves state-of-the-art instruction adherence
02. Maintains high generation quality
03. Effective across diverse narrative tasks

Abstract

Complex narrative contexts often challenge language models' ability to follow instructions, and existing benchmarks fail to capture these difficulties. To address this, we propose Concise-SAE, a training-free framework that improves instruction following by identifying and editing instruction-relevant neurons using only natural language instructions, without requiring labelled data. To thoroughly evaluate our method, we introduce FreeInstruct, a diverse and realistic benchmark of 1,212 examples that highlights the challenges of instruction following in narrative-rich settings. While initially motivated by complex narratives, Concise-SAE demonstrates state-of-the-art instruction adherence across varied tasks without compromising generation quality.
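The abstract does not spell out how instruction-relevant neurons are found or edited, so the following is only a toy illustration of the general idea behind sparse-feature activation editing, not the authors' Concise-SAE procedure. It assumes a hypothetical sparse autoencoder given as plain weight matrices (`w_enc`, `w_dec`), a hypothetical selection rule (features whose mean activation rises most when the instruction is present), and a hypothetical fixed `boost`; every name and number here is invented for illustration.

```python
# Toy sketch of generic sparse-feature steering. NOT the Concise-SAE algorithm:
# the weights, the selection rule, and the boost value are all assumptions.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def encode(hidden, w_enc):
    """Map a hidden state to non-negative feature activations."""
    return relu(matvec(w_enc, hidden))

def select_features(with_instr, without_instr, w_enc, top_k):
    """Pick the top_k features whose mean activation rises most
    when the instruction is present (hypothetical selection rule)."""
    def mean_feats(states):
        feats = [encode(h, w_enc) for h in states]
        return [sum(col) / len(feats) for col in zip(*feats)]
    diff = [a - b for a, b in zip(mean_feats(with_instr),
                                  mean_feats(without_instr))]
    ranked = sorted(range(len(diff)), key=lambda i: diff[i], reverse=True)
    return ranked[:top_k]

def edit_hidden(hidden, w_dec, feature_ids, boost):
    """Shift the hidden state along the decoder directions of the
    selected features; w_dec has one row per hidden dimension."""
    n_features = len(w_dec[0])
    delta = [0.0] * n_features
    for i in feature_ids:
        delta[i] = boost
    shift = matvec(w_dec, delta)
    return [h + s for h, s in zip(hidden, shift)]

# Hypothetical 2-dimensional hidden states and a 3-feature autoencoder.
w_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # features x hidden dims
w_dec = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]      # hidden dims x features
with_instr = [[0.0, 1.0], [0.2, 0.8]]           # states with the instruction
without_instr = [[1.0, 0.0], [0.8, 0.2]]        # states without it

chosen = select_features(with_instr, without_instr, w_enc, top_k=1)
edited = edit_hidden([1.0, 0.0], w_dec, chosen, boost=2.0)
```

In this toy setup only the second feature responds to the instruction, so it is the one selected, and the edit moves the hidden state along that feature's decoder direction. In a real model the hidden states would come from a chosen transformer layer and the autoencoder would be trained on that layer's activations; neither detail is given in the text above.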

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Text Readability and Simplification