DNArch: Learning Convolutional Neural Architectures by Backpropagation

David W. Romero; Neil Zeghidour

arXiv:2302.05400 · cs.LG · July 25, 2023

TL;DR

DNArch is a differentiable method that jointly learns the weights and the architecture of a CNN by backpropagation. The learned architectural choices include kernel sizes, channel counts, the position of downsampling, and network depth.

Contribution

It proposes a continuous, differentiable approach to neural architecture search that is not limited to a predefined set of components, so it can search over entire CNN architectures rather than choosing among fixed candidates.

Findings

01. DNArch finds performant CNN architectures for classification and dense prediction tasks.

02. It can incorporate computational budget constraints during training.

03. It is demonstrated on both sequential and image data.

Abstract

We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of Convolutional Neural Networks (CNNs) by backpropagation. In particular, DNArch allows learning (i) the size of convolutional kernels at each layer, (ii) the number of channels at each layer, (iii) the position and values of downsampling layers, and (iv) the depth of the network. To this end, DNArch views neural architectures as continuous multidimensional entities, and uses learnable differentiable masks along each dimension to control their size. Unlike existing methods, DNArch is not limited to a predefined set of possible neural components, but instead it is able to discover entire CNN architectures across all feasible combinations of kernel sizes, widths, depths and downsampling. Empirically, DNArch finds performant CNN architectures for several classification…
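The mask idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: the Gaussian mask form, the threshold value, and the names `gaussian_mask` and `apply_mask` are assumptions chosen for illustration. It shows how a single continuous parameter (`sigma`) can control the effective size of a 1D kernel, since taps whose mask value falls below the threshold are dropped.

```python
import math


def gaussian_mask(length, sigma, threshold=0.1):
    """Return (mask, effective_size) for an axis with `length` positions.

    Positions are spread uniformly over [-1, 1] and the mask is a Gaussian
    centred at 0. The functional form and the threshold are assumptions
    made for this sketch, not details taken from the abstract.
    """
    if length == 1:
        positions = [0.0]
    else:
        positions = [-1.0 + 2.0 * i / (length - 1) for i in range(length)]
    mask = [math.exp(-0.5 * (p / sigma) ** 2) for p in positions]
    # Count the positions that survive the threshold.
    effective_size = sum(1 for m in mask if m >= threshold)
    return mask, effective_size


def apply_mask(kernel, mask, threshold=0.1):
    """Scale kernel taps by the mask and drop taps below the threshold."""
    return [w * m for w, m in zip(kernel, mask) if m >= threshold]


kernel = [1.0] * 9  # hypothetical 9-tap kernel
for sigma in (0.1, 0.3, 1.0):
    mask, size = gaussian_mask(len(kernel), sigma)
    cropped = apply_mask(kernel, mask)
    print(f"sigma={sigma}: effective size {size}, cropped length {len(cropped)}")
```

Because the mask values vary smoothly with `sigma`, a gradient with respect to `sigma` exists wherever the mask is applied, which is what lets a size parameter be trained alongside the weights. The same pattern could in principle be applied along other axes, such as channels or depth, with a different mask shape.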

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning