Basis Scaling and Double Pruning for Efficient Inference in Network-Based Transfer Learning
Ken C. L. Wong, Satyananda Kashyap, Mehdi Moradi

TL;DR
This paper introduces a basis scaling and double pruning method for network-based transfer learning. The method substantially reduces model size with less than 1% accuracy loss and applies to a range of architectures.
Contribution
The paper proposes a basis scaling approach based on the singular value decomposition (SVD), combined with a double pruning strategy. Together they make pruning more effective in transfer learning and remove the need to fine-tune individual weights.
Findings
Achieves up to 74.6% pruning on CIFAR-10
Reaches 98.9% pruning on MNIST
Maintains less than 1% accuracy loss
Abstract
Network-based transfer learning allows the reuse of deep learning features with limited data, but the resulting models can be unnecessarily large. Although network pruning can improve inference efficiency, existing algorithms usually require fine-tuning that may not be suitable for small datasets. In this paper, using the singular value decomposition, we decompose a convolutional layer into two layers: a convolutional layer with the orthonormal basis vectors as the filters, and a "BasisScalingConv" layer which is responsible for rescaling the features and transforming them back to the original space. As the filters in each decomposed layer are linearly independent, when using the proposed basis scaling factors with the Taylor approximation of importance, pruning can be more effective and fine-tuning individual weights is unnecessary. Furthermore, as the numbers of input and output…
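The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the kernel layout `(kh, kw, c_in, c_out)`, the helper names `decompose_conv_kernel` and `prune_components`, and the use of singular values as the scaling factors are all assumptions made for illustration. In particular, ranking components by singular value is only a stand-in for the Taylor approximation of importance, which needs gradients from training data. The sketch checks the decomposition on a single input patch rather than running a full convolution.

```python
import numpy as np


def decompose_conv_kernel(kernel):
    """Factor a conv kernel of shape (kh, kw, c_in, c_out) with the SVD.

    Returns basis filters (orthonormal when flattened), scaling factors,
    and the matrix that maps the scaled features back to the original
    output channels. All names here are hypothetical.
    """
    kh, kw, c_in, c_out = kernel.shape
    w = kernel.reshape(kh * kw * c_in, c_out)          # one column per filter
    u, s, vt = np.linalg.svd(w, full_matrices=False)   # w = u @ diag(s) @ vt
    basis_filters = u.reshape(kh, kw, c_in, -1)        # filters of the first layer
    return basis_filters, s, vt


def prune_components(basis_filters, s, vt, keep):
    """Keep the `keep` components with the largest scaling factors.

    Assumption: magnitude ranking replaces the Taylor importance score.
    """
    idx = np.argsort(s)[::-1][:keep]
    return basis_filters[..., idx], s[idx], vt[idx]


rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3, 4, 8))   # toy 3x3 kernel, 4 -> 8 channels
patch = rng.standard_normal((3, 3, 4))       # one receptive field of the input

basis_filters, s, vt = decompose_conv_kernel(kernel)

# Response of the original layer at this patch.
original = np.tensordot(patch, kernel, axes=3)

# Layer 1: project onto the orthonormal basis filters.
basis_features = np.tensordot(patch, basis_filters, axes=3)
# Layer 2: rescale the features and map them back to the original space.
reconstructed = (basis_features * s) @ vt

max_error = np.max(np.abs(original - reconstructed))

# Dropping components shrinks both decomposed layers at once.
pruned_filters, pruned_s, pruned_vt = prune_components(basis_filters, s, vt, keep=5)
pruned_output = (np.tensordot(patch, pruned_filters, axes=3) * pruned_s) @ pruned_vt
```

Without pruning, the two-layer form reproduces the original response up to floating-point error, so any accuracy change comes only from the components that are removed. Removing a component deletes one filter from the first layer and one row from the second, while the number of output channels is unchanged.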
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Fetal and Pediatric Neurological Disorders · COVID-19 diagnosis using AI
Methods: Pruning
