Differentiable Neural Architecture Search with Morphism-based Transformable Backbone Architectures
Renlong Jie, Junbin Gao

TL;DR
This paper introduces a morphism-based, transformable backbone architecture for differentiable neural architecture search, enabling dynamic growth during training and improving efficiency and performance in recurrent neural networks.
Contribution
It proposes a novel growing mechanism based on network morphism for differentiable NAS, allowing adaptive architecture growth during training.
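As a rough illustration of what "growth by network morphism" means, the sketch below widens a small tanh network while keeping its output unchanged. It is a generic Net2Net-style widening step and is not taken from the paper; the two-layer network and the names `forward`, `widen`, `W1`, and `W2` are assumptions made for this example.

```python
import numpy as np

def forward(W1, W2, x):
    """Two-layer network: y = W2 @ tanh(W1 @ x)."""
    return W2 @ np.tanh(W1 @ x)

def widen(W1, W2, j):
    """Duplicate hidden unit j and split its outgoing weights in half.

    The widened network computes the same function as the original,
    which is the defining property of a network morphism.
    """
    W1_new = np.vstack([W1, W1[j:j + 1]])      # copy incoming weights of unit j
    W2_new = np.hstack([W2, W2[:, j:j + 1]])   # copy outgoing weights of unit j
    W2_new[:, j] *= 0.5                        # halve the original unit's output weights
    W2_new[:, -1] *= 0.5                       # halve the copy's output weights
    return W1_new, W2_new

# Hypothetical sizes: 3 inputs, 4 hidden units, 2 outputs
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

W1_big, W2_big = widen(W1, W2, j=1)
same = np.allclose(forward(W1, W2, x), forward(W1_big, W2_big, x))
print(W1_big.shape, W2_big.shape, same)
```

Because the grown network starts from the same function, training can continue from the current weights instead of restarting, which is what makes this kind of growth compatible with one-shot training.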
Findings
Effective in multi-variate time series forecasting.
Outperforms baseline architectures such as LSTM.
Enhances efficiency of differentiable architecture search.
Abstract
This study aims to make the architecture search process more adaptive for one-shot or online training. It extends existing work on differentiable neural architecture search by making the backbone architecture transformable rather than fixed during training. Differentiable architecture search (DARTS) requires a pre-defined, over-parameterized backbone architecture whose size must be set manually. In addition, the DARTS backbone does not include the Hadamard product of two elements, which appears in both LSTM and GRU cells for recurrent networks. This study introduces a growing mechanism for differentiable neural architecture search based on network morphism. It enables cell structures to grow from small to large within one-shot training. Two modes can be applied in integrating the growing and original pruning…
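To make the two ingredients named in the abstract concrete, the sketch below shows a DARTS-style mixed operation (a softmax-weighted sum of candidate operations on one edge) next to a Hadamard-product node of the kind used for gating in LSTM and GRU cells. The candidate set `OPS`, the function names, and the example weights are assumptions for illustration; the paper's actual search space is not given in this summary.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical candidate operations on one edge: tanh, sigmoid, identity
OPS = [np.tanh, sigmoid, lambda z: z]

def mixed_op(x, alpha):
    """Continuous relaxation: softmax(alpha)-weighted sum of candidate ops."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, OPS))

def hadamard_node(a, b):
    """Element-wise product of two node outputs, as in LSTM/GRU gating."""
    return a * b

x = np.array([0.5, -1.0, 2.0])
alpha_gate = np.array([0.0, 5.0, 0.0])  # architecture weights favoring sigmoid
alpha_cand = np.array([5.0, 0.0, 0.0])  # architecture weights favoring tanh

gate = mixed_op(x, alpha_gate)
cand = mixed_op(x, alpha_cand)
gated = hadamard_node(gate, cand)       # gate * candidate, element-wise

# Pruning step: keep only the strongest candidate on the edge
kept = int(np.argmax(alpha_gate))
print(gated, kept)
```

A plain mixed operation can only add candidate outputs together, so a multiplicative interaction such as `gate * cand` has to be provided as a separate node type in the backbone.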
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Neural Networks and Reservoir Computing
Methods: Pruning · Differentiable Architecture Search · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Gated Recurrent Unit
