ActTail: Global Activation Sparsity in Large Language Models

Wenwen Hou; Xinyuan Song; Shiwei Liu

arXiv:2603.12272 · cs.CL · March 16, 2026

Open Access

TL;DR

ActTail is a TopK magnitude-based activation sparsity method for large language models that assigns each projection its own sparsity budget, derived from the heavy-tail exponent of that projection's empirical spectral density. It targets the performance degradation that uniform sparsity allocation causes at high sparsity levels.

Contribution

The paper proposes ActTail, an activation sparsity technique that allocates a sparsity budget to each projection according to the heavy-tail exponent of its weight spectrum, supported by a theoretical analysis linking sparsity ratios to heavy-tail exponents.

Findings

01  Improves perplexity and downstream task performance at high sparsity levels.
02  Achieves up to 40.1% perplexity reduction on LLaMA-2-13B at 80% sparsity.
03  Outperforms uniform sparsity allocation methods.

Abstract

Activation sparsity is a promising approach for accelerating large language model (LLM) inference by reducing computation and memory movement. However, existing activation sparsity methods typically apply uniform sparsity across projections, ignoring the heterogeneous statistical properties of Transformer weights and thereby amplifying performance degradation. In this paper, we propose ActTail, a TopK magnitude-based activation sparsity method with global activation sparsity allocation grounded in Heavy-Tailed Self-Regularization (HT-SR) theory. Specifically, we capture this heterogeneity via the heavy-tail exponent computed from each projection's empirical spectral density (ESD), which is used as a quantitative indicator to assign projection-specific sparsity budgets. Importantly, we provide a theoretical analysis that establishes an explicit relationship between the activation…
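
The ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. `topk_sparsify` keeps the largest-magnitude activations, and `hill_alpha` estimates a heavy-tail exponent from the eigenvalues of W^T W using the Hill estimator, which is one common choice in the HT-SR literature and is assumed here because the abstract does not say which estimator is used. `allocate_sparsity`, its `spread` parameter, and the linear, mean-preserving mapping from exponents to budgets are hypothetical placeholders. The paper's actual allocation rule, including its direction, is not given in the truncated abstract.

```python
import numpy as np

def topk_sparsify(x, sparsity):
    """Zero all but the largest-magnitude entries along the last axis."""
    d = x.shape[-1]
    k = d - int(round(sparsity * d))  # number of entries to keep
    if k <= 0:
        return np.zeros_like(x)
    if k >= d:
        return x.copy()
    # After partitioning at position d - k, the last k slots hold the
    # indices of the k largest |x| values (in arbitrary order).
    idx = np.argpartition(np.abs(x), d - k, axis=-1)[..., d - k:]
    out = np.zeros_like(x)
    np.put_along_axis(out, idx, np.take_along_axis(x, idx, axis=-1), axis=-1)
    return out

def hill_alpha(weight, tail_fraction=0.5):
    """Hill estimate of the power-law exponent of the ESD of W^T W.

    tail_fraction is an illustrative default, not a value from the paper.
    """
    # Eigenvalues of W^T W are the squared singular values of W.
    eig = np.sort(np.linalg.svd(weight, compute_uv=False) ** 2)[::-1]
    k = min(max(2, int(tail_fraction * eig.size)), eig.size - 1)
    tail, threshold = eig[:k], eig[k]
    return 1.0 + k / np.sum(np.log(tail / threshold))

def allocate_sparsity(alphas, target, spread=0.1):
    """Hypothetical mapping: shift each budget around `target` by alpha.

    Assumption for illustration only: larger alpha -> more sparsity.
    The mean budget equals `target` as long as no value is clipped.
    """
    a = np.asarray(alphas, dtype=float)
    span = a.max() - a.min()
    if span == 0:
        return np.full_like(a, target)
    centered = (a - a.mean()) / span  # zero mean, range of width 1
    return np.clip(target + spread * centered, 0.0, 1.0)
```

A typical use would compute `hill_alpha` once per projection matrix offline, pass the resulting exponents to `allocate_sparsity` with the desired overall sparsity, and then call `topk_sparsify` on each projection's input activations at inference time with that projection's budget.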

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques