IMPACT: Importance-Aware Activation Space Reconstruction
Md Mokarram Chowdhury, Daniel Agyei Asante, Ernie Chang, and Yang Li

TL;DR
IMPACT is an importance-aware activation-space reconstruction method for low-rank compression of large language models. It weights activation dimensions by their importance to model performance so that compression preserves accuracy.
Contribution
IMPACT introduces an importance-weighted activation covariance approach to low-rank compression, which increases size reduction while maintaining model performance.
Findings
Achieves up to 55.4% greater model size reduction than state-of-the-art baselines.
Maintains or improves accuracy relative to those baselines.
Effective across multiple models and tasks.
Abstract
Large language models (LLMs) achieve strong performance across diverse domains but remain difficult to deploy in resource-constrained environments due to their size. Low-rank compression is a common remedy, typically minimizing weight reconstruction error under the assumption that weights are low-rank. However, this assumption often does not hold in LLMs. In contrast, LLM activations exhibit a more pronounced low-rank structure, motivating approaches that minimize activation reconstruction error. This shift alone, however, is not sufficient: different activation dimensions contribute unequally to model performance, and treating them uniformly can lead to accuracy loss. We introduce IMPACT, an importance-aware activation reconstruction framework that links compression to its effect on model performance. IMPACT formulates compression as an optimization problem that integrates activation…
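The abstract is cut off before it states the exact optimization problem, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a single linear layer, a hypothetical per-dimension importance vector `importance` (how IMPACT actually derives importance is not given in the text above), and a weighted squared reconstruction error on the layer's output activations. Under those assumptions, the best rank-r factorization comes from the top eigenvectors of the importance-weighted activation covariance. The function names and the NumPy setting are illustrative choices.

```python
import numpy as np

def importance_weighted_low_rank(W, X, importance, rank):
    """Sketch: factor W (d_out x d_in) as A @ B with A (d_out x r), B (r x d_in).

    Minimizes sum_j importance[j] * ||Y[:, j] - Y_hat[:, j]||^2 where Y = X @ W.T.
    `importance` is a hypothetical positive weight per output activation dimension.
    """
    Y = X @ W.T                               # output activations, shape (n, d_out)
    d = np.sqrt(importance)                   # square-root weights, shape (d_out,)
    Z = Y * d                                 # scale each activation dimension
    C = Z.T @ Z / X.shape[0]                  # importance-weighted covariance
    _, eigvecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    V = eigvecs[:, -rank:]                    # top-r eigenvectors, (d_out, r)
    A = V / d[:, None]                        # D^{-1} V
    B = (V.T * d) @ W                         # V^T D W
    return A, B

def weighted_error(W, W_hat, X, importance):
    """Importance-weighted squared error between original and compressed outputs."""
    diff = X @ (W - W_hat).T
    return float(np.sum(diff ** 2 * importance))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 16))          # toy layer weights
    X = rng.standard_normal((64, 16))         # toy calibration inputs
    importance = rng.uniform(0.1, 10.0, 8)    # made-up importance scores

    A, B = importance_weighted_low_rank(W, X, importance, rank=3)
    A_u, B_u = importance_weighted_low_rank(W, X, np.ones(8), rank=3)

    print("weighted error, importance-aware:", weighted_error(W, A @ B, X, importance))
    print("weighted error, uniform weights: ", weighted_error(W, A_u @ B_u, X, importance))
    print("parameters:", W.size, "->", A.size + B.size)
```

With uniform importance the sketch reduces to plain activation reconstruction. The comparison in the usage section shows why the weighting matters: at the same rank, the importance-aware factorization never has a larger weighted error than the uniform one, because it is optimal for that objective.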
