Compressing Large Language Models with PCA Without Performance Loss

Magnus Bengtsson

arXiv:2508.04307 · cs.CE · August 7, 2025

TL;DR

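This paper shows that applying PCA in a structured way to model inputs, either to polar-transformed images or segment-wise to token sequences, allows extreme compression of neural models without sacrificing performance in three case studies covering image classification, text classification, and sequence generation.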

Contribution

The paper introduces a PCA-based input compression method that sharply reduces model size while keeping classification accuracy high and generated sequences coherent.

Findings

01. A one-layer classifier on PCA-compressed polar MNIST achieves over 98% accuracy with 840 parameters.

02. A two-layer transformer on PCA-reduced MiniLM embeddings reaches 76.62% accuracy on 20 Newsgroups with 81,000 parameters.

03. A decoder-only transformer generates coherent sequences from PCA embeddings, preserving over 97% cosine similarity with full representations.
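
To make the "polar MNIST" input in the first finding more concrete, here is a minimal sketch of resampling a square grayscale image onto a (radius, angle) grid before any PCA step. The summary does not describe the paper's actual transform, so the function name `to_polar`, the grid sizes `n_r` and `n_theta`, and the nearest-neighbor sampling are all assumptions made for illustration.

```python
import numpy as np

def to_polar(img, n_r=14, n_theta=28):
    """Resample a square grayscale image on a (radius, angle) grid.

    Hypothetical helper: grid sizes and nearest-neighbor lookup are
    illustrative choices, not taken from the paper.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0   # geometric center of the image
    r_max = min(cy, cx)                      # largest radius that stays inside
    radii = np.linspace(0.0, r_max, n_r)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")
    # Convert each (r, theta) sample back to the nearest pixel index.
    ys = np.clip(np.rint(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.rint(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return img[ys, xs]

# Toy 28x28 "image": a filled square around the center.
img = np.zeros((28, 28))
img[10:18, 10:18] = 1.0
polar = to_polar(img)   # rows index radius, columns index angle
```

Each row of `polar` holds the pixels along one ring around the image center. Flattening the polar images and applying PCA to them would then give the compressed inputs for a small classifier.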

Abstract

We demonstrate that Principal Component Analysis (PCA), when applied in a structured manner, either to polar-transformed images or segment-wise to token sequences, enables extreme compression of neural models without sacrificing performance. Across three case studies, we show that a one-layer classifier trained on PCA-compressed polar MNIST achieves over 98 percent accuracy using only 840 parameters. A two-layer transformer trained on 70-dimensional PCA-reduced MiniLM embeddings reaches 76.62 percent accuracy on the 20 Newsgroups dataset with just 81000 parameters. A decoder-only transformer generates coherent token sequences from 70-dimensional PCA embeddings while preserving over 97 percent cosine similarity with full MiniLM representations, using less than 17 percent of the parameter count of GPT-2. These results highlight PCA-based input compression as a general and effective…
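
The embedding compression described in the abstract can be sketched as follows: fit PCA on a set of embedding vectors, keep 70 components, and measure the cosine similarity between the original vectors and their reconstructions. This is only an illustration under stated assumptions. The data is synthetic (a random low-rank matrix plus noise, not MiniLM output), the 384-dimensional input size is an assumed value, and the helper names are hypothetical. The sketch also applies plain PCA to whole vectors rather than the segment-wise scheme the abstract mentions.

```python
import numpy as np

def fit_pca(X, k):
    """Return the mean and the top-k principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def compress(X, mean, components):
    """Project centered data onto the principal directions."""
    return (X - mean) @ components.T

def reconstruct(Z, mean, components):
    """Map compressed vectors back to the original space."""
    return Z @ components + mean

def mean_cosine_similarity(A, B):
    """Average row-wise cosine similarity between two matrices."""
    num = np.sum(A * B, axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return float(np.mean(num / den))

# Synthetic stand-in for sentence embeddings (assumption: 384 dimensions,
# approximately low-rank structure). Real embeddings will behave differently.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 40))
mixing = rng.normal(size=(40, 384))
X = latent @ mixing + 0.05 * rng.normal(size=(1000, 384))

mean, comps = fit_pca(X, 70)         # keep 70 components, as in the abstract
Z = compress(X, mean, comps)         # shape (1000, 70)
X_hat = reconstruct(Z, mean, comps)  # back to 384 dimensions
sim = mean_cosine_similarity(X, X_hat)
```

A downstream model would train on `Z` instead of `X`, so its input layer shrinks with the number of retained components. How much similarity survives on real embeddings depends on their spectrum, so `sim` from this toy data should not be read as reproducing the paper's 97 percent figure.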
