AVSS: Layer Importance Evaluation in Large Language Models via Activation Variance-Sparsity Analysis
Zichen Song, Yuxin Wu, Sitan Huang, Zhongfeng Kang

TL;DR
This paper introduces the Activation Variance-Sparsity Score (AVSS), a metric that combines activation variance and sparsity to evaluate layer importance in large language models. Pruning the lowest-scoring layers retains over 90% of the original model's performance.
Contribution
It proposes a novel AVSS metric for assessing layer importance in LLMs and demonstrates effective pruning by removing less critical layers while maintaining performance.
Findings
Removing approximately the 25% of layers with the lowest AVSS retains over 90% of the original model's performance
AVSS effectively identifies non-essential layers in LLMs
Pruning based on AVSS improves model efficiency
Abstract
The evaluation of layer importance in deep learning has been an active area of research, with significant implications for model optimization and interpretability. Recently, large language models (LLMs) have gained prominence across various domains, yet limited studies have explored the functional importance and performance contributions of individual layers within LLMs, especially from the perspective of activation distribution. In this work, we propose the Activation Variance-Sparsity Score (AVSS), a novel metric combining normalized activation variance and sparsity to assess each layer's contribution to model performance. By identifying and removing approximately the lowest 25% of layers based on AVSS, we achieve over 90% of original model performance across tasks such as question answering, language modeling, and sentiment classification, indicating that these layers may be…
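The abstract is cut off before it gives the exact AVSS formula, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes that a layer's raw score is its activation variance divided by its sparsity (the fraction of near-zero activations), and that raw scores are normalized to sum to 1 across layers. The names `avss_scores`, `layers_to_prune`, `eps`, and `floor` are hypothetical, as is the toy data.

```python
from statistics import pvariance


def sparsity(acts, eps=1e-6):
    """Fraction of activations whose magnitude is at most eps (assumed definition)."""
    return sum(1 for a in acts if abs(a) <= eps) / len(acts)


def avss_scores(layer_acts, eps=1e-6, floor=1e-8):
    """Score each layer under the ASSUMED form variance / sparsity, normalized across layers.

    layer_acts: one flat list of activation values per layer.
    floor: small constant that avoids division by zero for fully dense layers.
    """
    raw = [pvariance(acts) / max(sparsity(acts, eps), floor) for acts in layer_acts]
    total = sum(raw)
    if total == 0:
        # Every layer has zero variance: fall back to uniform scores.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def layers_to_prune(scores, fraction=0.25):
    """Indices of the lowest-scoring `fraction` of layers, in ascending index order."""
    k = int(len(scores) * fraction)
    lowest_first = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(lowest_first[:k])


# Toy activations for four hypothetical layers (made-up numbers, not from the paper).
toy_layers = [
    [0.0, 0.0, 0.0, 0.1],    # mostly zeros, tiny variance
    [1.0, -1.0, 2.0, -2.0],  # dense, high variance
    [0.5, 0.0, -0.5, 1.0],
    [3.0, -3.0, 0.0, 1.0],
]
toy_scores = avss_scores(toy_layers)
toy_pruned = layers_to_prune(toy_scores, fraction=0.25)
```

In practice the per-layer activations would be collected from a real model on a calibration set, and how sparsity is measured for smooth activation functions (which rarely output exact zeros) depends on the choice of `eps`. Both are design choices this sketch leaves open.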
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
