Growth strategies for arbitrary DAG neural architectures
Stella Douka (LISN, TAU), Manon Verbockhaven (LISN, TAU), Théo Rudkiewicz (LISN, TAU), Stéphane Rivaud (LISN, TAU), François P. Landes (TAU, LISN), Sylvain Chevallier (TAU, LISN), Guillaume Charpiat (TAU, LISN)

TL;DR
This paper proposes methods for dynamically growing arbitrary DAG neural architectures during training to improve efficiency and reduce costs, by expanding models in a flexible, data-driven manner.
Contribution
It extends neural architecture growth techniques to arbitrary DAGs, enabling more efficient and adaptable network expansion during training.
Findings
Strategies that cut excessive computation during network growth.
Growth of arbitrary DAG architectures directly during training.
Growth steered toward more parameter-efficient architectures.
Abstract
Deep learning has achieved impressive results, but at the cost of training huge neural networks. The larger the architecture, the higher the computational, financial, and environmental costs during training and inference. We aim to reduce both training and inference durations. We focus on Neural Architecture Growth, which can increase the size of a small model when needed, directly during training, using information from backpropagation. We extend existing work and freely grow neural networks in the form of any Directed Acyclic Graph by reducing expressivity bottlenecks in the architecture. We explore strategies to reduce excessive computations and steer network growth toward more parameter-efficient architectures.
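To make the structural side of DAG growth concrete, here is a minimal sketch of the one constraint the abstract implies: any edge or node added during growth must leave the architecture acyclic. It is not the authors' method. The abstract does not detail how expressivity bottlenecks are measured, so no growth criterion is modeled here. The helper names (`is_acyclic`, `try_add_edge`, `try_add_node`) and the toy three-node network are assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def is_acyclic(preds):
    """Check acyclicity; `preds` maps each node to the set of its predecessors."""
    try:
        # static_order() raises CycleError if no topological order exists.
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False


def try_add_edge(preds, src, dst):
    """Return a copy of the graph with edge src -> dst, or None if it adds a cycle."""
    candidate = {node: set(p) for node, p in preds.items()}
    candidate.setdefault(src, set())
    candidate.setdefault(dst, set()).add(src)
    return candidate if is_acyclic(candidate) else None


def try_add_node(preds, new, src, dst):
    """Return a copy with a new node on the path src -> new -> dst, or None."""
    if new in preds:
        return None  # hypothetical policy: node names must be unique
    step = try_add_edge(preds, src, new)
    return try_add_edge(step, new, dst) if step is not None else None


# Toy architecture (assumed for illustration): input -> hidden -> output
net = {"input": set(), "hidden": {"input"}, "output": {"hidden"}}

skip = try_add_edge(net, "input", "output")              # valid skip connection
back = try_add_edge(net, "output", "input")              # rejected: closes a cycle
grown = try_add_node(net, "hidden2", "input", "output")  # new parallel branch
```

In an actual growth procedure, a data-driven criterion would select which of the admissible candidates to add. The check above only filters out candidates that would break the DAG property.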
