Learning Implicitly Recurrent CNNs Through Parameter Sharing

Pedro Savarese; Michael Maire

arXiv:1902.09701 · cs.LG · March 15, 2019 · 25 cites

TL;DR

This paper presents a parameter sharing scheme for CNNs that creates hybrid recurrent-convolutional architectures, reducing parameters while maintaining accuracy and enabling implicit discovery of recurrent structures.

Contribution

The authors introduce a parameter sharing method that hybridizes CNNs and recurrent networks, achieving parameter efficiency and accuracy competitive with NAS-based architectures.

Findings

01 Substantial parameter savings on standard image classification tasks, with accuracy maintained.

02 Trained networks often exhibit near-strict recurrent structure and convert, with negligible side effects, into networks with actual loops.

03 On algorithmic tasks, hybrid networks train faster and extrapolate better.

Abstract

We introduce a parameter sharing scheme, in which different layers of a convolutional neural network (CNN) are defined by a learned linear combination of parameter tensors from a global bank of templates. Restricting the number of templates yields a flexible hybridization of traditional CNNs and recurrent networks. Compared to traditional CNNs, we demonstrate substantial parameter savings on standard image classification tasks, while maintaining accuracy. Our simple parameter sharing scheme, though defined via soft weights, in practice often yields trained networks with near strict recurrent structure; with negligible side effects, they convert into networks with actual loops. Training these networks thus implicitly involves discovery of suitable recurrent architectures. Though considering only the design aspect of recurrent links, our trained networks achieve accuracy competitive…
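The sharing scheme in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses plain Python lists instead of PyTorch tensors, and the names `make_bank`, `layer_weights`, and `tied_layer_groups`, the layer sizes, the coefficient values, and the exact-match tolerance are all assumptions chosen for illustration. Each layer's weights are a linear combination of templates from one shared bank, and layers whose coefficient vectors coincide end up with identical weights, which is the situation in which a stack of layers could be folded into a loop.

```python
import random

def make_bank(num_templates, shape, seed=0):
    """Create a hypothetical bank of templates, each stored as a flat list."""
    rng = random.Random(seed)
    size = 1
    for dim in shape:
        size *= dim
    return [[rng.gauss(0.0, 1.0) for _ in range(size)]
            for _ in range(num_templates)]

def layer_weights(bank, alpha):
    """Return sum_j alpha[j] * bank[j], the weights of one layer."""
    if len(alpha) != len(bank):
        raise ValueError("need one coefficient per template")
    size = len(bank[0])
    return [sum(a * t[i] for a, t in zip(alpha, bank)) for i in range(size)]

def tied_layer_groups(alphas, tol=1e-6):
    """Group layer indices whose coefficient vectors agree within tol."""
    groups = []
    for idx, alpha in enumerate(alphas):
        for group in groups:
            ref = alphas[group[0]]
            if all(abs(a - b) <= tol for a, b in zip(alpha, ref)):
                group.append(idx)
                break
        else:
            # No existing group matched, so this layer starts a new one.
            groups.append([idx])
    return groups

# Assumed sizes: 2 templates shared by 4 layers of 3x3 convs, 8 -> 8 channels.
shape = (8, 8, 3, 3)
bank = make_bank(num_templates=2, shape=shape)
# Illustrative coefficients; in the paper's scheme these would be learned.
alphas = [[1.0, 0.0], [0.3, 0.7], [1.0, 0.0], [0.3, 0.7]]
weights = [layer_weights(bank, a) for a in alphas]

shared_params = sum(len(t) for t in bank) + sum(len(a) for a in alphas)
unshared_params = len(alphas) * len(bank[0])
print(shared_params, unshared_params)  # prints: 1160 2304
print(tied_layer_groups(alphas))       # prints: [[0, 2], [1, 3]]
```

In this toy setting, the four layers need 1160 parameters with sharing versus 2304 without, and layers 0 and 2 (and likewise 1 and 3) share weights exactly. Learned coefficients would generally match only approximately, so a real check would need a looser similarity criterion than the tolerance assumed here.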

Code & Models

Repositories

lolemacs/soft-sharing
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory