Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity
Haotian Xu, Jiannan Yang, Tian Gao, Tsui-Wei Weng, Tengfei Ma

TL;DR
This paper introduces SPON, a lightweight method that stabilizes activation sparsity in large language models by using input-independent representational anchors, improving inference efficiency without sacrificing accuracy.
Contribution
The paper proposes SPON, a novel approach that addresses representational instability in activation sparsity, enhancing LLM inference robustness and generalization.
Findings
SPON restores performance across multiple LLMs at high sparsity.
SPON stabilizes latent representations and preserves model generalization.
The method incurs negligible inference overhead after training.
Abstract
Activation sparsity offers a compelling route to accelerate large language model (LLM) inference by selectively suppressing hidden activations, yet existing approaches exhibit severe accuracy degradation at high sparsity. We show that this failure stems from representational instability: *activation sparsity disrupts input-dependent activation learned during pretraining, inducing distribution shifts in hidden states.* We address this issue by reframing activation sparsity as a representational alignment problem and introducing **Spontaneous Neurons (SPON)**, a lightweight mechanism inspired by spontaneous neural activity in biological systems. SPON injects a small set of learnable, input-independent activation vectors that act as persistent representational anchors for sparse computation. These vectors are trained via distribution matching to the dense model and can be absorbed into…
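The idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. The top-k magnitude rule, the point where the vector is added (after a linear projection), and the names `sparsify_topk`, `fit_anchor`, and `sparse_layer_with_anchor` are all assumptions made for illustration. The squared-error fit is a simple stand-in for the distribution-matching objective, which the summary does not specify.

```python
import numpy as np

def sparsify_topk(h, keep_ratio):
    """Keep the largest-magnitude entries in each row of h and zero the rest.

    Assumed sparsification rule for this sketch; the paper may use another.
    """
    k = max(1, int(round(keep_ratio * h.shape[-1])))
    # Indices of the k largest |h| values per row (order within the k is arbitrary).
    idx = np.argpartition(np.abs(h), -k, axis=-1)[..., -k:]
    mask = np.zeros(h.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, h, 0.0)

def fit_anchor(h, W, keep_ratio):
    """Fit one input-independent vector that pulls sparse outputs toward dense ones.

    Stand-in objective: mean squared error between dense and sparse outputs.
    Under that objective the best constant offset is the mean residual.
    """
    dense_out = h @ W
    sparse_out = sparsify_topk(h, keep_ratio) @ W
    return (dense_out - sparse_out).mean(axis=0)

def sparse_layer_with_anchor(h, W, anchor, keep_ratio):
    """Sparse projection plus the fixed anchor vector (hypothetical form)."""
    return sparsify_topk(h, keep_ratio) @ W + anchor

# Toy calibration data: 256 hidden states of width 32, projected to width 16.
rng = np.random.default_rng(0)
h = rng.normal(loc=0.5, size=(256, 32))
W = rng.normal(size=(32, 16))

anchor = fit_anchor(h, W, keep_ratio=0.25)
dense = h @ W
err_plain = np.mean((sparsify_topk(h, 0.25) @ W - dense) ** 2)
err_anchor = np.mean((sparse_layer_with_anchor(h, W, anchor, 0.25) - dense) ** 2)
print(f"MSE without anchor: {err_plain:.4f}, with anchor: {err_anchor:.4f}")
```

Because the anchor does not depend on the input, it is computed once from calibration data and then reused unchanged for every token, so applying it adds only a single vector addition to the sparse layer.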
