Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization

Habib Hajimolahoseini; Walid Ahmed; Yang Liu

arXiv:2309.03824 · cs.LG · May 27, 2025 · 2 cites

Open Access

TL;DR

This paper introduces two techniques for accelerating low-rank decomposed neural networks, rank optimization and sequential freezing of layers, improving training throughput by up to 60% and inference speed by up to 37% while keeping accuracy close to that of the original models.

Contribution

The paper proposes rank optimization and sequential freezing methods to enhance training acceleration of low-rank decomposed models without reducing decomposition ranks.

Findings

01. Up to 60% training throughput improvement

02. Up to 37% inference speedup

03. Maintains accuracy close to original models

Abstract

Low Rank Decomposition (LRD) is a model compression technique applied to the weight tensors of deep learning models to reduce the number of trainable parameters and the computational complexity. However, because LRD adds a large number of new layers to the architecture, it may not yield much training or inference acceleration unless the decomposition ranks are small enough. The problem is that small ranks increase the risk of a significant accuracy drop after decomposition. In this paper, we propose two techniques for accelerating low-rank decomposed models without requiring small decomposition ranks: rank optimization and sequential freezing of decomposed layers. We perform experiments on both convolutional and transformer-based models. Experiments show that these techniques can improve model throughput by up to 60% during training…
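To make the rank trade-off in the abstract concrete, the sketch below counts weights for a single dense layer before and after it is factored into two smaller layers, and finds the largest rank at which factoring still saves parameters. It is an illustration under stated assumptions, not the paper's implementation: all function names are hypothetical, biases are ignored, and `freezing_schedule` shows only one possible reading of "sequential freezing" (alternating which of the two factors is frozen each epoch), which the abstract itself does not spell out.

```python
def dense_params(m, n):
    """Weights in a dense m x n linear layer (bias ignored)."""
    return m * n


def lrd_params(m, n, r):
    """Weights after factoring the m x n matrix into (m x r) @ (r x n)."""
    return r * (m + n)


def break_even_rank(m, n):
    """Largest integer rank for which the factored layer has strictly
    fewer weights than the dense one: r * (m + n) < m * n."""
    return (m * n - 1) // (m + n)


def freezing_schedule(num_epochs):
    """ASSUMPTION: alternate which factor is frozen in each epoch.
    Returns the name of the frozen factor per epoch."""
    return ["first" if epoch % 2 == 0 else "second"
            for epoch in range(num_epochs)]


if __name__ == "__main__":
    m = n = 512
    print("dense weights:   ", dense_params(m, n))
    print("rank-128 weights:", lrd_params(m, n, 128))
    print("break-even rank: ", break_even_rank(m, n))
    print("frozen per epoch:", freezing_schedule(4))
```

For a 512 x 512 layer, any rank of 256 or more gives no parameter saving at all, which is why moderate ranks alone may not translate into faster training. Freezing one factor per epoch would remove that factor's weight-gradient computation for the epoch, which is one plausible source of the training speedup.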

Taxonomy

Topics: Tensor decomposition and applications · Computational Physics and Python Applications · Advanced Neural Network Applications