Lossless Model Compression via Joint Low-Rank Factorization Optimization

Boyang Zhang; Daning Cheng; Yunquan Zhang; Fangming Liu; Jiake Tian

arXiv:2412.06867·cs.LG·December 24, 2025

Open Access

TL;DR

This paper introduces a joint low-rank factorization optimization method that achieves lossless model compression without fine-tuning, can surpass the original model's performance, and applies to both vision and language models.

Contribution

It presents a novel joint optimization strategy for low-rank weight factorization that guarantees lossless compression and improved performance, unlike previous approaches that optimize the factorization error and model performance separately.

Findings

01 Achieves 70% compression on ResNext50 with better performance than the original model

02 Develops lossless compression algorithms that require no fine-tuning

03 Demonstrates robustness across various vision and language tasks

Abstract

Low-rank factorization is a popular model compression technique that minimizes the error $δ$ between approximated and original weight matrices. Despite achieving performances close to the original models when $δ$ is optimized, a performance discrepancy remains due to the separate optimization processes for low-rank factorization and model performance, resulting in unavoidable losses. We address this issue by introducing a novel joint optimization strategy for lossless low-rank weight factorization, which, for the first time, enhances the model's performance beyond the original. Our approach begins with a theoretical analysis of the relationship between low-rank factorization and model optimization objectives, establishing a precise perturbation range for matrix factorization errors on model performance. This challenge is then reformulated as a numerical rank deficiency problem…
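
As an illustration of the baseline the abstract describes, the sketch below factorizes a single weight matrix with a truncated SVD and measures the error $δ$ as a Frobenius norm. This is the standard separate-optimization approach, not the paper's joint method; the function names, the random matrix, and the chosen rank are assumptions made for the example.

```python
import numpy as np


def low_rank_factorize(W, rank):
    """Split W (m x n) into A (m x rank) and B (rank x n) via truncated SVD.

    A @ B is the best rank-`rank` approximation of W in Frobenius norm.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept left singular vector
    B = Vt[:rank, :]
    return A, B


def factorization_error(W, A, B):
    """Frobenius-norm error (the delta above) between W and A @ B."""
    return np.linalg.norm(W - A @ B, ord="fro")


def parameter_ratio(m, n, rank):
    """Parameters stored by the two factors relative to the dense matrix."""
    return rank * (m + n) / (m * n)


# Hypothetical weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

A, B = low_rank_factorize(W, rank=8)
delta = factorization_error(W, A, B)
ratio = parameter_ratio(64, 32, rank=8)
print(f"delta = {delta:.4f}, parameters kept = {ratio:.1%}")
```

Minimizing $δ$ this way treats each matrix in isolation and never consults the model's loss, which is the gap between factorization and model performance that the abstract attributes to separate optimization.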

Taxonomy

Topics: Advanced Data Compression Techniques