TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

Xiangyu Chen; Jing Liu; Ye Wang; Matthew Brand; Pu (Perry) Wang; Toshiaki Koike-Akino

arXiv:2505.21835 · cs.LG · May 29, 2025

TL;DR

This paper introduces TuneComp, a method that jointly fine-tunes and compresses large models during post-training, leading to smaller, more efficient models with better performance than traditional sequential approaches.

Contribution

The paper proposes a novel joint fine-tuning and compression technique that integrates distillation, pruning, and low-rank approximation in a unified process.

Findings

01 Joint fine-tuning and compression outperforms sequential methods.

02 Model size is significantly reduced while performance is maintained or improved.

03 The model is effectively distilled to a pruned low-rank structure during training.

Abstract

To reduce model size during post-training, compression methods, including knowledge distillation, low-rank approximation, and pruning, are often applied after fine-tuning the model. However, sequential fine-tuning and compression sacrifices performance while creating a larger-than-necessary model as an intermediate step. In this work, we aim to reduce this gap by directly constructing a smaller model while guided by the downstream task. We propose to jointly fine-tune and compress the model by gradually distilling it to a pruned low-rank structure. Experiments demonstrate that joint fine-tuning and compression significantly outperforms other sequential compression methods.
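
The abstract does not spell out how the gradual distillation to a low-rank structure is carried out, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes a single dense weight matrix, a truncated-SVD initialization of the low-rank factors, and a linear blending schedule. The names `truncated_svd_factors`, `blended_forward`, and `linear_schedule` are hypothetical, and training and pruning are omitted entirely.

```python
import numpy as np

def truncated_svd_factors(W, rank):
    """Return (B, A) such that B @ A is the best rank-`rank` approximation of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    B = U[:, :rank] * s[:rank]   # shape (out_features, rank)
    A = Vt[:rank, :]             # shape (rank, in_features)
    return B, A

def blended_forward(x, W, B, A, lam):
    """Mix the dense layer with its low-rank replacement.

    lam = 0 uses only the dense weights W; lam = 1 uses only B @ A.
    (Assumed blending rule, chosen for illustration.)
    """
    dense_out = x @ W.T
    low_rank_out = (x @ A.T) @ B.T
    return (1.0 - lam) * dense_out + lam * low_rank_out

def linear_schedule(step, total_steps):
    """Ramp the mixing coefficient linearly from 0 to 1 (assumed schedule)."""
    return min(1.0, step / total_steps)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))          # stand-in for a pretrained dense layer
B, A = truncated_svd_factors(W, rank=2)  # low-rank student factors
x = rng.standard_normal((4, 6))          # a small batch of inputs

# At the start of the schedule the layer behaves like the dense model;
# at the end only the low-rank factors remain.
y_start = blended_forward(x, W, B, A, linear_schedule(0, 100))
y_end = blended_forward(x, W, B, A, linear_schedule(100, 100))

dense_params = W.size
low_rank_params = B.size + A.size
```

In this toy setting the rank-2 factors hold 28 parameters against 48 for the dense matrix. In a joint scheme of this kind, the factors would also be updated by the downstream-task loss while the mixing coefficient increases, so the dense weights can be discarded once it reaches 1.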

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis