Structured Pruning Adapters

Lukas Hedegaard; Aman Alok; Juby Jose; Alexandros Iosifidis

arXiv:2211.10155 · cs.CV · February 3, 2023 · 1 citation

TL;DR

Structured Pruning Adapters (SPAs) combine structured pruning with network adapters: a frozen base network is pruned and specialized to a new task using a small set of learned parameters. At 90% pruned weights, channel-SPAs improve accuracy by 6.9% on average over structured pruning with fine-tuning while using half the parameters.

Contribution

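The paper introduces Structured Pruning Adapters, a family of compressing, task-switching adapters that accelerate and specialize networks through structured pruning. A channel-based SPA is evaluated with a suite of pruning methods on multiple computer vision benchmarks and compared against regular structured pruning with fine-tuning.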

Findings

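01 At 90% pruned weights, channel-SPAs improve accuracy by 6.9% on average compared to structured pruning with fine-tuning.

02 At that same 90% pruning level, they use half the parameters.

03 Alternatively, at 70% pruning they learn adaptations with 17x fewer parameters at the cost of 1.6% lower accuracy.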

Abstract

Adapters are a parameter-efficient alternative to fine-tuning, which augment a frozen base network to learn new tasks. Yet, the inference of the adapted model is often slower than the corresponding fine-tuned model. To improve on this, we propose Structured Pruning Adapters (SPAs), a family of compressing, task-switching network adapters, that accelerate and specialize networks using tiny parameter sets and structured pruning. Specifically, we propose a channel-based SPA and evaluate it with a suite of pruning methods on multiple computer vision benchmarks. Compared to regular structured pruning with fine-tuning, our channel-SPAs improve accuracy by 6.9% on average while using half the parameters at 90% pruned weights. Alternatively, they can learn adaptations with 17x fewer parameters at 70% pruning with 1.6% lower accuracy. Similarly, our block-SPA requires far fewer parameters than…
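To make the parameter argument concrete, the sketch below counts learned weights for a single linear layer under two regimes: fine-tuning the pruned dense matrix, and training only a low-rank adapter restricted to the channels that survive pruning. It assumes a low-rank channel adapter in the spirit of the SPLoRA variant mentioned in the reviews; the function names, layer size, and rank are hypothetical, biases are ignored, and the printed numbers are illustrative arithmetic, not results from the paper.

```python
def kept(n: int, density: float) -> int:
    """Number of channels remaining after structured pruning (at least one)."""
    return max(1, round(n * density))


def fine_pruning_params(n_in: int, n_out: int, density: float) -> int:
    """Learned weights when the pruned dense matrix itself is fine-tuned."""
    return kept(n_in, density) * kept(n_out, density)


def low_rank_spa_params(n_in: int, n_out: int, density: float, rank: int) -> int:
    """Learned weights when only a low-rank adapter on the kept channels is trained.

    Assumption: the adapter is a product of an (n_out x rank) and a
    (rank x n_in) matrix whose pruned rows and columns are dropped.
    """
    return rank * (kept(n_in, density) + kept(n_out, density))


if __name__ == "__main__":
    n_in, n_out, rank = 512, 512, 8  # hypothetical layer size and adapter rank
    for density in (1.0, 0.3, 0.1):
        dense = fine_pruning_params(n_in, n_out, density)
        adapter = low_rank_spa_params(n_in, n_out, density, rank)
        print(f"density={density:.1f} fine-pruning={dense} adapter={adapter}")
```

Under these assumptions the dense count shrinks quadratically with density while the adapter count shrinks only linearly, so the adapter's relative advantage narrows as pruning becomes more aggressive. The frozen base weights are not counted because they are shared across tasks.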

Peer Reviews

Decision: ICLR 2024 Conference Withdrawn Submission

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

The proposed technique of SPLoRA enables an interesting combination of applications, allowing practitioners to deploy a model fine-tuned for a variety of tasks while also accelerating them in a way that's very flexible. They demonstrate that SPLoRA can be more effective than fine-pruning for more extreme levels of sparsity, which may be important for mobile and edge applications. Experiments are extensive and convincing. The paper is written in a clear and direct way.

Weaknesses

It's a bit difficult to understand the utility here. Looking at the results in Figure 3 and Table 1, it seems that SPLoRA is strictly worse than pruning and fine-tuning until one reaches extreme levels of sparsity. Even then, the drop in accuracy (for example at 10% density) is significant enough that it may be unacceptable. I would also like to note it was a bit hard to understand exactly what experiments are being conducted here. I would have appreciated more self-encapsulation of the experimenta…

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

1. The writing is clear. 2. In the evaluation, it’s good that the authors repeat each experiment three times and report the mean and standard deviation of each metric.

Weaknesses

1. The discussion of related work is not sufficient. The discussion of transfer pruning works is missing. 2. In Section 2, it would be better if the authors discussed the difference between the proposed method and related works, the limitations of the related methods, and which pruning category (e.g. iterative pruning vs. one-shot pruning, structured pruning vs. unstructured pruning) the proposed method belongs to. Although we can tell which pruning category SPAs belong to in the later secti…

Reviewer 03 · Rating 3 (reject, not good enough) · Confidence 5

Strengths

1. Extensive experiments, including 5 image classification tasks, four different pruning methods, and four model architectures. 2. The proposed method can reduce the trainable parameters during pruning.

Weaknesses

While the paper is easy to read and conducts extensive experiments to demonstrate the effectiveness of introducing adapters during structured pruning, it faces several key limitations: 1. It lacks novelty. The concept of introducing LoRA adapters to reduce the number of trainable parameters is intuitive and straightforward. 2. The paper does not delve into any technical challenges, possibly due to the straightforward nature of the idea. 3. Overall, the work reads more like an engineering report

Reviewer 04 · Rating 1 (strong reject) · Confidence 4

Strengths

I think this paper is very well written. It leverages trending methods in the field, which shows the knowledge of the authors in the matter. The results are clearly superior to relevant baselines.

Weaknesses

My main concern is the lack of novelty. The authors do not innovate w.r.t. pruning nor low-rank adapters. Furthermore, the proposed combination of the two is straightforward. In my opinion, a naive combination of two well-known methods does not constitute a scientific contribution. I would advise the authors to either highlight their specific contribution to the domain of pruning or adapters, if there are any. Second, if there are no contributions to those fields, I would recommend exposing how

Code & Models

Repositories

lukashedegaard/structured-pruning-adapters (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Brain Tumor Detection and Classification

Methods: Lib · Pruning · Balanced Selection