ActTail: Global Activation Sparsity in Large Language Models
Wenwen Hou, Xinyuan Song, Shiwei Liu

TL;DR
ActTail introduces a theoretically grounded, projection-specific activation sparsity method for large language models, significantly improving inference efficiency and performance at high sparsity levels by leveraging heavy-tailed spectral properties.
Contribution
The paper proposes ActTail, a novel activation sparsity technique that allocates sparsity based on spectral properties, supported by a theoretical analysis linking sparsity ratios to heavy-tail exponents.
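To make the idea of projection-specific budgets concrete, here is a minimal sketch of one way per-projection heavy-tail exponents could be turned into sparsity ratios around a global target. It is not the authors' allocation rule: the function name `allocate_sparsity`, the linear mapping, the `spread` parameter, and the assumption that all projections have equal size are hypothetical. This summary does not say whether heavier tails should receive more or less sparsity, so the direction is left as a parameter.

```python
import numpy as np

def allocate_sparsity(alphas, target, spread=0.1, higher_alpha_sparser=True):
    """Hypothetical rule: map per-projection exponents to sparsity budgets.

    Offsets are linear in alpha and centred, so the mean budget equals
    `target` as long as no value is clipped to [0, 1].
    """
    a = np.asarray(alphas, dtype=float)
    value_range = a.max() - a.min()
    if value_range == 0:
        # All projections look alike: fall back to uniform sparsity.
        return np.full_like(a, target)
    # Rescale to [-1, 1], then centre so the offsets sum to zero.
    z = 2.0 * (a - a.min()) / value_range - 1.0
    z = z - z.mean()
    if not higher_alpha_sparser:
        z = -z
    return np.clip(target + spread * z, 0.0, 1.0)

# Three hypothetical projections with different exponents, 80% global target.
budgets = allocate_sparsity([2.0, 3.0, 4.0], target=0.8)
```

With equal-size projections, the budgets average to the global target, so the overall compute saving matches that of a uniform allocation while individual projections are pruned more or less aggressively.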
Findings
Improves perplexity and downstream task performance at high sparsity levels.
Achieves up to 40.1% perplexity reduction on LLaMA-2-13B at 80% sparsity.
Outperforms uniform sparsity allocation methods.
Abstract
Activation sparsity is a promising approach for accelerating large language model (LLM) inference by reducing computation and memory movement. However, existing activation sparsity methods typically apply uniform sparsity across projections, ignoring the heterogeneous statistical properties of Transformer weights and thereby amplifying performance degradation. In this paper, we propose ActTail, a TopK magnitude-based activation sparsity method with global activation sparsity allocation grounded in Heavy-Tailed Self-Regularization (HT-SR) theory. Specifically, we capture this heterogeneity via the heavy-tail exponent computed from each projection's empirical spectral density (ESD), which is used as a quantitative indicator to assign projection-specific sparsity budgets. Importantly, we provide a theoretical analysis that establishes an explicit relationship between the activation…
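The two ingredients named in the abstract, TopK magnitude-based activation sparsity and a heavy-tail exponent computed from a projection's ESD, can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names `topk_sparsify` and `hill_alpha`, the choice of the Hill estimator, and the tail fraction `k_frac` are assumptions made for illustration, and the paper may fit the exponent differently.

```python
import numpy as np

def topk_sparsify(x, sparsity):
    """Zero all but the largest-magnitude entries along the last axis.

    `sparsity` is the fraction of entries dropped (0 <= sparsity < 1).
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[-1]
    keep = max(1, int(round(d * (1.0 - sparsity))))
    # Indices of the `keep` largest |x| values along the last axis.
    idx = np.argpartition(np.abs(x), d - keep, axis=-1)[..., d - keep:]
    mask = np.zeros(x.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, x, 0.0)

def hill_alpha(weight, k_frac=0.1):
    """Hill estimate of the power-law exponent of the ESD tail of W^T W.

    Assumes at least two nonzero eigenvalues and a nonzero tail threshold.
    """
    # Eigenvalues of W^T W are the squared singular values of W.
    eigs = np.sort(np.linalg.svd(weight, compute_uv=False) ** 2)
    n = eigs.size
    k = min(max(1, int(n * k_frac)), n - 1)
    tail = eigs[n - k:]          # the k largest eigenvalues
    threshold = eigs[n - k - 1]  # the eigenvalue just below the tail
    return 1.0 + k / np.sum(np.log(tail / threshold))

# Hypothetical usage: a random stand-in for one projection and its input.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
x = rng.standard_normal((4, 32))
alpha = hill_alpha(W)
y = topk_sparsify(x, sparsity=0.8) @ W.T
```

Because zeroed activations contribute nothing to the product, the corresponding weight columns need not be loaded, which is where the savings in computation and memory movement come from.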
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques
