GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks
Wenwu Tang, Dong Wang, Lothar Thiele, Olga Saukh

TL;DR
GRAIL is a post-hoc linear reconstruction method that recovers accuracy in compressed neural networks without additional training, using only a small calibration set, and applies across a range of model families.
Contribution
The paper introduces a simple, data-aware, zero-finetuning post-compression technique that improves the accuracy of compressed models by linearly reconstructing hidden representations.
Findings
Consistently improves accuracy or perplexity across ResNets, ViTs, and LLMs.
Requires only a few forward passes without gradients or labels.
Operates with manageable computational overhead.
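To illustrate why forward passes alone can suffice, here is a minimal NumPy sketch of accumulating a Gram matrix over calibration batches. The function name `accumulate_gram` and the random stand-in activations are assumptions made for illustration, not the authors' code; in practice the batches would be hidden activations captured during gradient-free forward passes.

```python
import numpy as np

def accumulate_gram(batches):
    """Sum X^T X over batches of activations, each of shape (batch, d).

    Hypothetical helper: needs no labels and no gradients, only activations.
    """
    G = None
    n_seen = 0
    for X in batches:
        X = np.asarray(X, dtype=np.float64)
        if G is None:
            G = np.zeros((X.shape[1], X.shape[1]))
        G += X.T @ X          # running sum of outer products
        n_seen += X.shape[0]
    return G, n_seen

# Stand-in for activations collected from a small calibration set (assumption).
rng = np.random.default_rng(0)
acts = rng.standard_normal((64, 8))
G_stream, n_seen = accumulate_gram(np.array_split(acts, 4))
```

Because the Gram matrix is a sum over samples, the streamed result matches the one computed from all activations at once, so memory scales with the hidden width rather than with the calibration set size.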
Abstract
Structured deep model compression methods are hardware-friendly and substantially reduce memory and inference costs. However, under aggressive compression, the resulting accuracy degradation often necessitates post-compression finetuning, which can be impractical due to missing labeled data or high training cost. We propose post-hoc blockwise compensation, called GRAIL, a simple zero-finetuning step applied after model compression that restores each block's input-output behavior using a small calibration set. The method summarizes hidden activations via a Gram matrix and applies ridge regression to linearly reconstruct the original hidden representation from the reduced one. The resulting reconstruction map is absorbed into the downstream projection weights, while the upstream layer is compressed. The approach is selector-agnostic (Magnitude, Wanda, Gram-based selection, or folding),…
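The reconstruction step described in the abstract can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the synthetic low-rank activations, the energy-based selector (standing in for any of the selectors named above), and the choice of ridge strength `lam` are all hypothetical.

```python
import numpy as np

def ridge_reconstruction(G, keep, lam):
    """Solve min_B ||H[:, keep] @ B - H||_F^2 + lam * ||B||_F^2.

    Only the Gram matrix G = H^T H is needed, not the activations H.
    """
    G_kk = G[np.ix_(keep, keep)]                      # (k, k) kept-kept block
    G_kd = G[keep, :]                                 # (k, d) kept-all block
    return np.linalg.solve(G_kk + lam * np.eye(len(keep)), G_kd)  # (k, d)

def absorb(B, W_down):
    """Fold the reconstruction map into the downstream projection weights."""
    return B @ W_down                                 # (k, out)

# Synthetic, correlated hidden activations (assumption: nearly low-rank).
rng = np.random.default_rng(0)
n, d, r, k, out = 256, 32, 8, 16, 10
H = rng.standard_normal((n, r)) @ rng.standard_normal((r, d))
H += 0.01 * rng.standard_normal((n, d))
W_down = rng.standard_normal((d, out))

G = H.T @ H
# Stand-in selector (assumption): keep the k units with the largest energy.
keep = np.sort(np.argsort(np.diag(G))[-k:])
lam = 1e-3 * np.trace(G[np.ix_(keep, keep)]) / k      # hypothetical ridge strength

B = ridge_reconstruction(G, keep, lam)
W_comp = absorb(B, W_down)

Y = H @ W_down                                        # original block output
err_naive = np.linalg.norm(H[:, keep] @ W_down[keep] - Y)  # prune only
err_comp = np.linalg.norm(H[:, keep] @ W_comp - Y)         # prune + compensate
print(f"naive: {err_naive:.3f}  compensated: {err_comp:.3f}")
```

In this toy setting the compensated weights `W_comp` have the same shape as the naively pruned `W_down[keep]`, so the compensation adds no inference cost; the gain comes from the kept units being able to linearly predict the removed ones.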
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Data Compression Techniques · Advanced Data Storage Technologies
