Differentiable Architecture Pruning for Transfer Learning

Nicolo Colombo; Yang Gao

arXiv:2107.03375 · cs.LG · July 8, 2021

TL;DR

This paper introduces a gradient-based method for extracting transferable, low-complexity sub-architectures from a large neural model. The extracted architectures can be retrained on new tasks for which very few data points are available, and the method comes with convergence guarantees.

Contribution

The paper presents a gradient-based architecture-pruning approach that disentangles the network architecture from its weights, so that the pruned structures can be retrained in a transfer-learning setting. The approach is backed by convergence guarantees.

Findings

01. The extracted architectures transfer to new tasks even when very few data points are available for fine-tuning.

02. The method produces architectures that can be successfully retrained.

03. Theoretical convergence guarantees are provided.

Abstract

We propose a new gradient-based approach for extracting sub-architectures from a given large model. In contrast to existing pruning methods, which are unable to disentangle the network architecture from the corresponding weights, our architecture-pruning scheme produces transferable new structures that can be successfully retrained to solve different tasks. We focus on a transfer-learning setup where architectures can be trained on a large data set but very few data points are available for fine-tuning them on new tasks. We define a new gradient-based algorithm that trains architectures of arbitrarily low complexity independently of the attached weights. Given a search space defined by an existing large neural model, we reformulate the architecture search task as a complexity-penalized subset-selection problem and solve it through a two-temperature relaxation scheme. We provide…
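To make the idea of complexity-penalized subset selection concrete, the following is a minimal sketch and not the authors' algorithm. It relaxes a binary keep/drop mask with a single-temperature sigmoid, which is a simplification: the two-temperature scheme mentioned in the abstract is not specified in this excerpt. The toy task loss, the per-unit `usefulness` scores, and the names `prune`, `penalty`, and `temperature` are all hypothetical, chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def prune(usefulness, penalty=0.5, temperature=1.0, lr=0.5, steps=200):
    """Select a subset of units by gradient descent on relaxed mask logits.

    Toy objective (an assumption, not taken from the paper):
        loss(m) = sum_i usefulness[i] * (1 - m_i) + penalty * sum_i m_i
    where m_i = sigmoid(logit_i / temperature) relaxes a binary mask.
    """
    logits = [0.0] * len(usefulness)
    for _ in range(steps):
        for i, u in enumerate(usefulness):
            m = sigmoid(logits[i] / temperature)
            # d loss / d m_i: dropping the unit costs u, keeping it costs penalty.
            dloss_dm = penalty - u
            # d m_i / d logit_i for the sigmoid relaxation.
            dm_dlogit = m * (1.0 - m) / temperature
            logits[i] -= lr * dloss_dm * dm_dlogit
    # Discretize: keep a unit only if its logit ended up positive.
    return [1 if a > 0 else 0 for a in logits]


mask = prune([1.0, 0.1, 0.8, 0.2], penalty=0.5)
print(mask)  # [1, 0, 1, 0]
```

In this toy setting a unit survives only when its usefulness exceeds the penalty, so raising `penalty` yields sparser masks. In a real network the task loss would depend on the mask through a forward pass rather than through fixed scores.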

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition · Machine Learning and ELM

Methods: Pruning