HALO: Learning to Prune Neural Networks with Shrinkage

Skyler Seto; Martin T. Wells; Wenyu Zhang

arXiv:2008.10183 · cs.LG · March 2, 2021

TL;DR

HALO (Hierarchical Adaptive Lasso) is a sparsity penalty derived from Bayesian hierarchical models that adaptively sparsifies neural network weights. It yields highly sparse models with competitive accuracy and surpasses existing pruning methods at similar sparsity levels.

Contribution

The paper proposes HALO, a new adaptive sparsity penalty based on Bayesian hierarchical models, which effectively prunes neural networks without fine-tuning.

Findings

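01. HALO learns networks that keep only 5% of their parameters while retaining high accuracy.

02. At the same sparsity level, HALO outperforms state-of-the-art magnitude pruning methods.

03. The resulting highly sparse subnetworks maintain strong performance without fine-tuning.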
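
The magnitude pruning baseline named in the findings can be illustrated with a minimal sketch. This is generic global magnitude pruning, not code from the paper or its repository; the function names `magnitude_prune` and `density` and the `keep_fraction` parameter are hypothetical and chosen only for illustration.

```python
def magnitude_prune(weights, keep_fraction=0.05):
    """Zero out all but the largest-magnitude weights.

    `weights` is a flat list of floats. `keep_fraction` is the share of
    weights to keep (0.05 keeps 5%). Both names are illustrative only.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(weights) * keep_fraction))
    # Rank positions by absolute value, largest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:n_keep])
    # Keep the selected weights unchanged and set every other weight to zero.
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def density(weights):
    """Fraction of weights that are nonzero."""
    return sum(1 for w in weights if w != 0.0) / len(weights)
```

Calling `magnitude_prune(weights, keep_fraction=0.05)` on a flattened weight list leaves roughly 5% of the entries nonzero, which is the sparsity level the findings refer to. Unlike a learned penalty, this baseline decides what to remove from weight magnitudes alone, after or between training steps.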

Abstract

Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data; however, this performance is closely tied to model size. Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity-inducing penalty, and (3) training a binary mask jointly with the weights of the network. We study different sparsity-inducing penalties from the perspective of Bayesian hierarchical models and present a novel penalty called Hierarchical Adaptive Lasso (HALO), which learns to adaptively sparsify the weights of a given network via trainable parameters. When used to train over-parametrized networks, our penalty yields small subnetworks with high accuracy without fine-tuning. Empirically, on image recognition tasks, we find that HALO is able to learn highly sparse networks (only 5% of…
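
The idea of a penalty that shrinks each weight by its own amount can be sketched with a generic adaptive lasso. This is not the HALO penalty itself, whose exact form is not given in this summary. It is a standard per-weight L1 penalty with the usual soft-thresholding (shrinkage) step. The names `adaptive_l1_penalty`, `soft_threshold`, `shrink`, `scales`, and `step_size` are assumptions made for illustration, and the scales are fixed inputs here, whereas the abstract describes trainable parameters.

```python
def adaptive_l1_penalty(weights, scales):
    """Sum of scale_j * |w_j|: a lasso penalty with a separate strength per weight."""
    if len(weights) != len(scales):
        raise ValueError("weights and scales must have the same length")
    return sum(s * abs(w) for w, s in zip(weights, scales))


def soft_threshold(w, t):
    """Shrink w toward zero by t, returning exactly zero when |w| <= t."""
    magnitude = max(abs(w) - t, 0.0)
    if magnitude == 0.0:
        return 0.0
    return magnitude if w > 0 else -magnitude


def shrink(weights, scales, step_size):
    """Apply one shrinkage step with a per-weight threshold of step_size * scale_j."""
    if len(weights) != len(scales):
        raise ValueError("weights and scales must have the same length")
    return [soft_threshold(w, step_size * s) for w, s in zip(weights, scales)]
```

In this sketch a weight with a large scale is pushed to exactly zero, while a weight with a small scale is only slightly reduced. That per-weight behavior is what distinguishes an adaptive penalty from a single global L1 coefficient.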

Code & Models

Repositories

skyler120/sparsity-halo
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Advanced Neural Network Applications

Methods: Pruning