DNArch: Learning Convolutional Neural Architectures by Backpropagation
David W. Romero, Neil Zeghidour

TL;DR
DNArch is a differentiable method that jointly learns the weights and the architecture of a CNN by backpropagation, discovering kernel sizes, channel counts, downsampling, and depth across several tasks.
Contribution
It proposes a continuous, differentiable approach to neural architecture search that is not limited to a predefined set of components, so it can search over entire CNN architectures.
Findings
DNArch effectively finds high-performing CNN architectures for classification and dense prediction tasks.
It can incorporate computational budget constraints during training.
It applies to both sequential and image data.
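One way a computational budget could enter training is as a penalty added to the task loss. The sketch below is only an illustration under stated assumptions: the cost model (multiply-accumulates of a 1D convolution), the hinge-style penalty, and the names `conv_layer_cost`, `budget_penalty`, and `total_loss` are all hypothetical and are not taken from the paper.

```python
def conv_layer_cost(kernel_size, in_channels, out_channels, length):
    """Multiply-accumulates of one 1D convolution (stride 1, 'same' padding).

    Assumed cost model, chosen for illustration only.
    """
    return kernel_size * in_channels * out_channels * length


def budget_penalty(cost, budget):
    """Zero while cost <= budget, then linear in the relative overshoot."""
    return max(0.0, cost / budget - 1.0)


def total_loss(task_loss, cost, budget, weight=1.0):
    """Task loss plus a weighted budget penalty (hypothetical formulation)."""
    return task_loss + weight * budget_penalty(cost, budget)


# A layer that is 50% over budget adds 0.5 * weight to the loss.
budget = conv_layer_cost(kernel_size=3, in_channels=8, out_channels=16, length=100)
over = conv_layer_cost(kernel_size=3, in_channels=12, out_channels=16, length=100)
loss = total_loss(task_loss=1.0, cost=over, budget=budget, weight=2.0)
```

In a differentiable setting, the kernel size and channel counts in the cost would come from the learned mask sizes, so the penalty could push the architecture toward the budget through gradients.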
Abstract
We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of Convolutional Neural Networks (CNNs) by backpropagation. In particular, DNArch allows learning (i) the size of convolutional kernels at each layer, (ii) the number of channels at each layer, (iii) the position and values of downsampling layers, and (iv) the depth of the network. To this end, DNArch views neural architectures as continuous multidimensional entities, and uses learnable differentiable masks along each dimension to control their size. Unlike existing methods, DNArch is not limited to a predefined set of possible neural components, but instead it is able to discover entire CNN architectures across all feasible combinations of kernel sizes, widths, depths and downsampling. Empirically, DNArch finds performant CNN architectures for several classification…
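The idea of a learnable differentiable mask controlling size along one dimension can be sketched for the kernel-size case. This is a minimal illustration, not the paper's implementation: the Gaussian mask shape, the cropping threshold, and the names `gaussian_mask`, `gaussian_mask_grad`, and `effective_size` are assumptions made here, since the abstract does not specify the mask form.

```python
import math


def _coords(num_positions):
    """Evenly spaced positions on [-1, 1], with the center at 0."""
    if num_positions == 1:
        return [0.0]
    return [-1.0 + 2.0 * i / (num_positions - 1) for i in range(num_positions)]


def gaussian_mask(num_positions, sigma):
    """Mask in (0, 1] over kernel positions; sigma controls its width."""
    return [math.exp(-0.5 * (x / sigma) ** 2) for x in _coords(num_positions)]


def gaussian_mask_grad(num_positions, sigma):
    """Analytic derivative of each mask value with respect to sigma.

    The mask is smooth in sigma, so a loss can adjust the width by gradient.
    """
    return [
        math.exp(-0.5 * (x / sigma) ** 2) * x ** 2 / sigma ** 3
        for x in _coords(num_positions)
    ]


def effective_size(mask, threshold=0.1):
    """Number of positions kept if values below the threshold are cropped."""
    return sum(1 for m in mask if m >= threshold)


# Masking a 9-tap kernel: a narrow mask keeps 5 taps, a wide one keeps all 9.
weights = [1.0] * 9
narrow = gaussian_mask(9, sigma=0.3)
masked = [w * m for w, m in zip(weights, narrow)]
sizes = (effective_size(narrow), effective_size(gaussian_mask(9, sigma=1.0)))
```

The same pattern would apply along other dimensions (channels, depth) with a mask defined over that axis, which is how a single width parameter per dimension can stand in for a discrete size choice.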
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning
