Rescaling CNN through Learnable Repetition of Network Parameters

Arnav Chavan; Udbhav Bamba; Rishabh Tiwari; Deepak Gupta

arXiv:2101.05650 · cs.CV · August 20, 2021

TL;DR

This paper introduces a learnable weight repetition strategy for CNNs that enhances performance without increasing parameter count, demonstrating that weight sharing significantly contributes to improvements in various CNN architectures.

Contribution

The paper proposes a novel learnable weight repetition method for CNNs, showing it can boost performance and explain part of the gains from group-equivariant CNNs without increasing parameters.

Findings

01. Small base networks, once rescaled, perform comparably to deeper networks while using as little as 6% of the deeper network's optimization parameters.

02. Learnable weight sharing accounts for a significant portion of the improvements seen in group-equivariant CNNs.

03. Up to 40% of the gains in rotation-equivariant CNNs may be due to learned weight repetition.

Abstract

Deeper and wider CNNs are known to provide improved performance for deep learning tasks. However, most such networks show a poor performance gain per added parameter. In this paper, we investigate whether the gain observed in deeper models is purely due to the addition of more optimization parameters or whether the physical size of the network also plays a role. Further, we present a novel rescaling strategy for CNNs based on learnable repetition of their parameters. Based on this strategy, we rescale CNNs without changing their parameter count, and show that learnable sharing of weights can by itself provide a significant boost in the performance of any given model. We show that small base networks, when rescaled, can provide performance comparable to deeper networks with as little as 6% of the optimization parameters of the deeper ones. The relevance of…
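The distinction the abstract draws between a network's physical size and its number of optimization parameters can be made concrete with a little counting. The sketch below is illustrative only: `ConvSpec`, `independent_params`, `shared_params`, and the layer shape and depth are hypothetical choices, not taken from the paper or its repository, and the paper's actual repetition scheme is not described in this summary. It simply compares a stack of convolution layers that each own their weights with a stack of the same depth that reuses one base weight tensor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvSpec:
    """Shape of one 2D convolution layer (hypothetical helper, not from the paper)."""
    in_channels: int
    out_channels: int
    kernel_size: int

    def num_weights(self) -> int:
        # One k x k kernel per (input channel, output channel) pair; biases ignored.
        return self.in_channels * self.out_channels * self.kernel_size ** 2


def independent_params(layers):
    """Parameter count when every layer owns its own weights."""
    return sum(layer.num_weights() for layer in layers)


def shared_params(base, repeats, extra_per_repeat=0):
    """Parameter count when `repeats` layers all reuse one base weight tensor.

    `extra_per_repeat` stands in for any small learnable quantity attached to
    each repetition. This is an assumption for illustration; the paper's exact
    parameterisation is not given in this summary.
    """
    return base.num_weights() + repeats * extra_per_repeat


# Arbitrary example shape and depth, chosen only to make the arithmetic easy.
base = ConvSpec(in_channels=64, out_channels=64, kernel_size=3)
depth = 8

deep_count = independent_params([base] * depth)
rescaled_count = shared_params(base, repeats=depth)
ratio = rescaled_count / deep_count

print(f"independent: {deep_count}, shared: {rescaled_count}, ratio: {ratio:.2%}")
```

Both stacks have the same depth, so their physical size is identical, yet the shared stack is optimized over a fraction of the parameters. The paper's question is how much of a deeper model's gain survives under that kind of constraint.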

Code & Models

Repositories

transmuteAI/RepeatNet (PyTorch, official)
