Simultaneous Weight and Architecture Optimization for Neural Networks

Zitong Huang; Mansooreh Montazerin; Ajitesh Srivastava

arXiv:2410.08339 · cs.LG · October 14, 2024

TL;DR

This paper presents a training framework that learns a neural network's architecture and parameters simultaneously with gradient descent. It can discover sparse, compact, high-performing networks without the discrete search-then-train steps that NAS methods often rely on.

Contribution

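The paper introduces a multi-scale encoder-decoder whose encoder embeds neural networks with similar functionality close to each other, irrespective of their architectures and weights, so that structure and weights can be optimized jointly via gradient descent.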

Findings

01. Discovers sparse, compact neural networks.

02. Maintains performance comparable to traditional methods.

03. Demonstrates effectiveness across datasets.

Abstract

Neural networks are trained by choosing an architecture and training the parameters. The choice of architecture is often by trial and error or with Neural Architecture Search (NAS) methods. While NAS provides some automation, it often relies on discrete steps that optimize the architecture and then train the parameters. We introduce a novel neural network training framework that fundamentally transforms the process by learning architecture and parameters simultaneously with gradient descent. With the appropriate setting of the loss function, it can discover sparse and compact neural networks for given datasets. Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other (irrespective of their architectures and weights). To train a neural network with a given dataset, we randomly sample a…
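
As a rough illustration of running gradient descent on an embedding rather than on the weights directly, the toy sketch below optimizes a 2-dimensional embedding `z` that a fixed linear map `D` decodes into the three weights of a linear model, with an L1 term added to the loss to favour sparse decoded weights. Everything in it is an assumption made for illustration: the names `decode`, `loss_and_grad`, and `train`, the matrix `D`, the L1 penalty, the zero initialisation, and the synthetic data are hypothetical and are not taken from the paper, whose multi-scale encoder-decoder is learned and far richer than a fixed linear map.

```python
import random


def sign(v):
    # Subgradient of |v|; 0 is used at v == 0.
    return (v > 0) - (v < 0)


def decode(z, D):
    # Hypothetical frozen linear "decoder": embedding z -> weight vector w = D z.
    return [sum(d * zj for d, zj in zip(row, z)) for row in D]


def loss_and_grad(z, D, X, y, lam):
    # Loss = mean squared error of the decoded linear model + lam * ||w||_1.
    w = decode(z, D)
    n = len(X)
    residuals = [sum(wi * xi for wi, xi in zip(w, x)) - t for x, t in zip(X, y)]
    mse = sum(r * r for r in residuals) / n
    l1 = sum(abs(wi) for wi in w)
    # Gradient with respect to the decoded weights.
    grad_w = [
        (2.0 / n) * sum(r * x[i] for r, x in zip(residuals, X)) + lam * sign(w[i])
        for i in range(len(w))
    ]
    # Chain rule through the decoder: grad_z = D^T grad_w.
    grad_z = [
        sum(D[i][j] * grad_w[i] for i in range(len(w))) for j in range(len(z))
    ]
    return mse + lam * l1, grad_z


def train(steps=500, lr=0.05, lam=0.01, seed=0):
    rng = random.Random(seed)
    # Synthetic regression data: only the first feature matters.
    X = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(64)]
    y = [2.0 * x[0] for x in X]
    D = [[1.0, 0.5], [0.0, 1.0], [0.5, -1.0]]  # illustrative decoder matrix
    z = [0.0, 0.0]  # illustrative starting embedding
    history = []
    for _ in range(steps):
        loss, grad_z = loss_and_grad(z, D, X, y, lam)
        history.append(loss)
        # Plain gradient-descent update, applied to the embedding only.
        z = [zj - lr * gj for zj, gj in zip(z, grad_z)]
    return z, decode(z, D), history


if __name__ == "__main__":
    z, w, history = train()
    print("embedding:", z)
    print("decoded weights:", w)
    print("first loss:", history[0], "last loss:", history[-1])
```

Calling `train()` returns the final embedding, the decoded weights, and the loss history. The decoder is never updated; only `z` moves, which is the one aspect this toy is meant to convey.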

Code & Models

Repositories

zitonghuangcynthia/Simultaneous-Weight-and-Architecture-Optimization (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications