ThriftyNets: Convolutional Neural Networks with Tiny Parameter Budget

Guillaume Coiffier; Ghouthi Boukli Hacene; Vincent Gripon

arXiv:2007.10106 · cs.LG · July 21, 2020

TL;DR

ThriftyNet is a recursive convolutional architecture built to maximize parameter efficiency: it exceeds 91% accuracy on CIFAR-10 with fewer than 40K parameters and reaches 74.3% on CIFAR-100 with under 600K.

Contribution

The paper presents ThriftyNet, a CNN architecture that defines a single convolutional layer and applies it recursively, which sharply reduces the parameter count while keeping accuracy competitive.

Findings

1. Achieves over 91% accuracy on CIFAR-10 with fewer than 40K parameters.

2. Attains 74.3% accuracy on CIFAR-100 with under 600K parameters.

3. Demonstrates effective parameter utilization through recursive convolutional layers.
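
To make the effect of reusing one layer concrete, the sketch below compares the parameter budget of a single convolution applied repeatedly with that of a stack of distinct convolutions of the same shape. The filter count, kernel size, and iteration count are hypothetical values chosen for illustration, not the paper's configuration, and normalization parameters are ignored.

```python
def conv_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Number of learnable parameters in one 2D convolution layer."""
    weights = c_in * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


def shared_budget(filters: int, kernel: int) -> int:
    """One convolution reused at every iteration: its cost is paid once."""
    return conv_params(filters, filters, kernel)


def unshared_budget(filters: int, kernel: int, iterations: int) -> int:
    """A distinct convolution per iteration: cost grows linearly with depth."""
    return iterations * conv_params(filters, filters, kernel)


if __name__ == "__main__":
    # Hypothetical values for illustration only (not taken from the paper).
    filters, kernel, iterations = 128, 3, 30
    shared = shared_budget(filters, kernel)
    unshared = unshared_budget(filters, kernel, iterations)
    print(f"shared:   {shared:,} parameters")
    print(f"unshared: {unshared:,} parameters ({unshared // shared}x more)")
```

With weight sharing, the budget does not depend on how many times the layer is applied, so depth can grow without adding convolution weights.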

Abstract

Typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In addition, normalization, non-linearities, downsampling operations, and shortcuts ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40K…
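
The imbalance described at the start of the abstract can be checked with a little arithmetic. The sketch below assumes a hypothetical pyramid-shaped CNN (channels double while the feature-map side halves, 3x3 kernels, stride 1 with padding so the output side equals the listed side); these stage sizes are illustrative assumptions and do not come from the paper.

```python
# Hypothetical pyramid CNN: (input channels, output channels, feature-map side).
STAGES = [(3, 64, 32), (64, 128, 16), (128, 256, 8), (256, 512, 4)]
KERNEL = 3


def stage_stats(stages, kernel):
    """Return (parameters, multiply-accumulates) for one convolution per stage."""
    stats = []
    for c_in, c_out, side in stages:
        params = c_in * c_out * kernel * kernel  # biases omitted for simplicity
        macs = params * side * side  # each weight is used once per output position
        stats.append((params, macs))
    return stats


def shares(values):
    """Normalize a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


if __name__ == "__main__":
    stats = stage_stats(STAGES, KERNEL)
    param_share = shares([p for p, _ in stats])
    mac_share = shares([m for _, m in stats])
    for i, (p, m) in enumerate(zip(param_share, mac_share), start=1):
        print(f"stage {i}: {p:6.1%} of parameters, {m:6.1%} of MACs")
```

Under these assumed sizes, the last stage holds roughly three quarters of the weights, yet stages 2 to 4 each perform the same number of multiply-accumulates, so the early stages do a comparable share of the work with a small fraction of the parameters.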