Compact and Optimal Deep Learning with Recurrent Parameter Generators

Jiayun Wang; Yubei Chen; Stella X. Yu; Brian Cheung; Yann LeCun

arXiv:2107.07110·cs.CV·October 28, 2022

TL;DR

This paper introduces a novel recurrent parameter generator that decouples model size from degrees of freedom, enabling highly compact yet accurate deep learning models through end-to-end constrained optimization.

Contribution

The paper presents a new approach using a recurrent parameter generator to optimize models with random linear constraints, achieving significant parameter reduction while maintaining high accuracy.

Findings

01. Achieves 96% of ResNet18's accuracy with only 18% of its degrees of freedom (DoF).

02. Accuracy follows a log-linear relationship with model DoF.

03. RPG models can be further pruned and quantized for additional efficiency.
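As a rough illustration of what a log-linear relationship between DoF and accuracy means, it can be written in the form below. The symbols $d$ (number of degrees of freedom), $\alpha$, and $\beta$ are notation assumed here for illustration; they are not taken from the paper, and no fitted values are implied.

```latex
\[
  \mathrm{Acc}(d) \;\approx\; \alpha + \beta \log d ,
  \qquad\text{so}\qquad
  \mathrm{Acc}(2d) - \mathrm{Acc}(d) \;\approx\; \beta \log 2 .
\]
```

Under this assumed form, each doubling of the DoF would add roughly the same fixed increment to accuracy, so gains shrink relative to the number of added degrees of freedom.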

Abstract

Deep learning has achieved tremendous success by training increasingly large models, which are then compressed for practical deployment. We propose a drastically different approach to compact and optimal deep learning: we decouple the degrees of freedom (DoF) from the actual number of parameters of a model, and optimize a small DoF under predefined random linear constraints for a large model of arbitrary architecture in one-stage end-to-end learning. Specifically, we create a recurrent parameter generator (RPG), which repeatedly fetches parameters from a ring and unpacks them onto a large model with random permutation and sign flipping to promote parameter decorrelation. We show that gradient descent can automatically find the best model under constraints with faster convergence. Our extensive experimentation reveals a log-linear relationship between model DoF and accuracy. Our RPG…
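The ring-and-unpack idea in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `make_rpg` and `unpack`, the contiguous wrap-around fetch, the per-layer permutation, and the toy sizes are all assumptions made for this example.

```python
import numpy as np

def make_rpg(ring_size, layer_shapes, seed=0):
    """Build a small shared ring plus fixed, random unpacking plans (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # The ring holds the only free values; its length is the model's DoF.
    ring = rng.standard_normal(ring_size) * 0.01
    plans, offset = [], 0
    for shape in layer_shapes:
        n = int(np.prod(shape))
        # Fetch n consecutive slots, wrapping around the ring as often as needed.
        idx = (offset + np.arange(n)) % ring_size
        # Fixed random permutation and sign flips, intended to decorrelate reused values.
        perm = rng.permutation(n)
        sign = rng.choice([-1.0, 1.0], size=n)
        plans.append((shape, idx[perm], sign))
        offset = (offset + n) % ring_size
    return ring, plans

def unpack(ring, plans):
    """Generate every layer's weights from the ring; each weight is +/- one ring entry."""
    return [(ring[idx] * sign).reshape(shape) for shape, idx, sign in plans]

# Toy usage: 10 free values generate 4*3 + 5*5 = 37 weights.
ring, plans = make_rpg(ring_size=10, layer_shapes=[(4, 3), (5, 5)])
weights = unpack(ring, plans)
```

Because each generated weight is a fixed signed copy of a ring entry, the large model's weights are a linear function of the ring. In a training setup, gradients with respect to the weights would be accumulated back onto the shared ring entries, which is one way to read the "random linear constraints" mentioned in the abstract.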

Code & Models

Repositories

samaonline/Recurrent-Parameter-Generators (PyTorch, official)

Videos

Compact and Optimal Deep Learning with Recurrent Parameter Generators (YouTube)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Batch Normalization · Residual Connection · Average Pooling · Global Average Pooling · 1x1 Convolution · Kaiming Initialization · Residual Block · Bottleneck Residual Block · Max Pooling