AutoCompress: Critical Layer Isolation for Efficient Transformer Compression

Archit Thorat

arXiv:2604.22786·cs.LG·April 28, 2026

TL;DR

AutoCompress introduces Critical Layer Isolation (CLI), which protects the most task-critical layer of a small transformer (Layer 0) so that the remaining layers can be compressed aggressively.

Contribution

The paper proposes an architecture that keeps the critical initial layer (Layer 0) at full dimensionality, compresses all intermediate layers through a learned bottleneck, and restores the full dimension at the final layer.

Findings

01

Layer 0 in small transformers carries disproportionately high task-critical information.

02

CLI-GPT2 compresses GPT-2 Medium by 2.47x (354.8M to 143.8M parameters) and reaches 204.5 perplexity on WikiText-103.

03

Architectural protection of Layer 0 outperforms a uniform bottleneck baseline of comparable size: 204.5 versus 571.8 perplexity under identical training conditions.
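
The headline ratios in these findings can be rechecked from the figures reported in the abstract (354.8M and 143.8M parameters; importance scores of 3.6 and 0.054). The snippet below is only an arithmetic check. The function and constant names are illustrative and do not come from the paper or its repository.

```python
# Figures as reported in the abstract; names here are illustrative only.
BASE_PARAMS_M = 354.8      # GPT-2 Medium, millions of parameters
CLI_PARAMS_M = 143.8       # CLI-GPT2, millions of parameters
LAYER0_IMPORTANCE = 3.6    # NTK-based importance score of Layer 0
MAX_OTHER_IMPORTANCE = 0.054  # maximum score among all other layers


def compression_ratio(base: float, compressed: float) -> float:
    """How many times smaller the compressed model is."""
    return base / compressed


def parameter_reduction(base: float, compressed: float) -> float:
    """Fraction of parameters removed, between 0 and 1."""
    return 1.0 - compressed / base


ratio = compression_ratio(BASE_PARAMS_M, CLI_PARAMS_M)
reduction = parameter_reduction(BASE_PARAMS_M, CLI_PARAMS_M)
importance_gap = LAYER0_IMPORTANCE / MAX_OTHER_IMPORTANCE

print(f"compression ratio:   {ratio:.2f}x")
print(f"parameter reduction: {reduction:.1%}")
print(f"importance gap:      {importance_gap:.1f}x")
```

The results round to 2.47x and 59.5%, matching the reported values, and the importance gap comes out above the "over 60x" stated in the abstract.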

Abstract

We present AutoCompress, a transformer compression method motivated by an empirical finding: in small transformers, Layer 0 carries disproportionately high task-critical information, with an NTK-based importance score of 3.6 compared to a maximum of 0.054 for all other layers -- a gap of over 60x. Based on this finding, we propose Critical Layer Isolation (CLI), an architecture that protects Layer 0 at full dimensionality, compresses all intermediate layers through a learned bottleneck, and restores the full dimension at the final layer. Applied to GPT-2 Medium (354.8M parameters), CLI-GPT2 achieves 204.5 perplexity on WikiText-103 with only 143.8M parameters -- a 2.47x compression ratio and 59.5% parameter reduction. Crucially, an ablation study demonstrates that a uniform bottleneck baseline of comparable size achieves only 571.8 perplexity under identical training conditions,…
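
To make the layout described in the abstract concrete, here is a minimal sketch of the per-layer hidden widths under one possible reading of CLI: Layer 0 runs at full width, every intermediate layer runs at the bottleneck width, and the final layer runs at full width again. That reading is an assumption, as is the uniform baseline shown for contrast. The helper names and the bottleneck width of 512 are hypothetical; the 24 layers and hidden size of 1024 are GPT-2 Medium's standard configuration. This is not the author's implementation.

```python
from typing import List, Tuple


def cli_layer_widths(n_layers: int, d_model: int, d_bottleneck: int) -> List[int]:
    """Hidden width of each layer under the assumed CLI layout.

    Layer 0 and the final layer use d_model; all layers in between
    use d_bottleneck. (Assumed reading of the abstract.)
    """
    if n_layers < 3:
        raise ValueError("need at least 3 layers: first, middle, last")
    if not 0 < d_bottleneck <= d_model:
        raise ValueError("bottleneck width must be in (0, d_model]")
    return [d_model] + [d_bottleneck] * (n_layers - 2) + [d_model]


def uniform_layer_widths(n_layers: int, d_bottleneck: int) -> List[int]:
    """Assumed uniform baseline: every layer runs at the bottleneck width."""
    return [d_bottleneck] * n_layers


def projections(widths: List[int]) -> List[Tuple[int, int, int]]:
    """(layer index, in width, out width) wherever the next layer's width differs."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(widths, widths[1:]))
        if a != b
    ]


# GPT-2 Medium shape; the bottleneck width 512 is a hypothetical value.
cli = cli_layer_widths(n_layers=24, d_model=1024, d_bottleneck=512)
uniform = uniform_layer_widths(n_layers=24, d_bottleneck=512)

print(cli[:3], "...", cli[-2:])
print(projections(cli))      # one down-projection, one up-projection
print(projections(uniform))  # no width changes between layers
```

Under these assumptions the CLI layout needs exactly two learned projections, a down-projection after Layer 0 and an up-projection before the final layer, whereas the uniform layout never leaves the bottleneck width.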
