GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression
Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao

TL;DR
GRASP introduces a gradient-based method that adaptively retains critical singular parameters to compress large language models efficiently, reducing redundancy while preserving performance.
Contribution
It proposes a novel gradient-based framework that replaces redundant layers with minimal adaptive singular parameters, improving compression without significant performance loss.
Findings
Achieves 90% of original performance at 20% compression ratio.
Outperforms existing compression methods across multiple LLMs.
Uses gradient attribution on small calibration datasets for adaptive layer replacement.
Abstract
Recent studies have demonstrated that many layers are functionally redundant in large language models (LLMs), enabling model compression by removing these layers to reduce inference cost. While such approaches can improve efficiency, indiscriminate layer pruning often results in significant performance degradation. In this paper, we propose GRASP (Gradient-based Retention of Adaptive Singular Parameters), a novel compression framework that mitigates this issue by preserving sensitivity-aware singular values. Unlike direct layer pruning, GRASP leverages gradient-based attribution on a small calibration dataset to adaptively identify and retain critical singular components. By replacing redundant layers with only a minimal set of parameters, GRASP achieves efficient compression while maintaining strong performance with minimal overhead. Experiments across multiple LLMs show that GRASP…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsParallel Computing and Optimization Techniques · Embedded Systems Design Techniques
