Growth strategies for arbitrary DAG neural architectures

Stella Douka (LISN, TAU); Manon Verbockhaven (LISN, TAU); Théo Rudkiewicz (LISN, TAU); Stéphane Rivaud (LISN, TAU); François P. Landes (TAU, LISN); Sylvain Chevallier (TAU, LISN); Guillaume Charpiat (TAU, LISN)

arXiv:2501.12690·cs.LG·February 17, 2025

TL;DR

This paper proposes methods for growing neural networks structured as arbitrary DAGs during training: a small model is expanded where information from backpropagation reveals an expressivity bottleneck, with the aim of reducing training and inference costs.

Contribution

It extends neural architecture growth techniques to arbitrary DAGs, enabling more efficient and adaptable network expansion during training.

Findings

01. Explores strategies for reducing excessive computation during growth.

02. Grows networks structured as arbitrary DAGs directly during training.

03. Steers network growth toward more parameter-efficient architectures.

Abstract

Deep learning has shown impressive results obtained at the cost of training huge neural networks. However, the larger the architecture, the higher the computational, financial, and environmental costs during training and inference. We aim at reducing both training and inference durations. We focus on Neural Architecture Growth, which can increase the size of a small model when needed, directly during training using information from the backpropagation. We expand existing work and freely grow neural networks in the form of any Directed Acyclic Graph by reducing expressivity bottlenecks in the architecture. We explore strategies to reduce excessive computations and steer network growth toward more parameter-efficient architectures.
