TL;DR
This paper introduces a novel NAS method that explicitly models inter-layer dependencies in network architecture optimization, leading to improved performance over existing methods.
Contribution
It proposes Inter-layer Transition NAS, which casts architecture optimization as a sequential decision process and explicitly models the dependency between the architecture weights of connected edges in the architecture graph.
Findings
Outperforms state-of-the-art NAS methods on five benchmarks.
Demonstrates the importance of modeling inter-layer dependencies.
Validates the effectiveness of the transition matrix approach.
Abstract
Differentiable Neural Architecture Search (NAS) methods represent the network architecture as a repetitive proxy directed acyclic graph (DAG) and optimize the network weights and architecture weights alternately in a differentiable manner. However, existing methods model the architecture weights on each edge (i.e., a layer in the network) as statistically independent variables, ignoring the dependency between edges in the DAG induced by their directed topological connections. In this paper, we make the first attempt to investigate such dependency by proposing a novel Inter-layer Transition NAS method. It casts the architecture optimization into a sequential decision process where the dependency between the architecture weights of connected edges is explicitly modeled. Specifically, edges are divided into inner and outer groups according to whether or not their predecessor edges are in the…
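To make the contrast in the abstract concrete, here is a minimal NumPy sketch of the two modeling choices for a single pair of connected edges. It is an illustration under stated assumptions, not the paper's implementation: the number of candidate operations (`num_ops`), the random logits, and the row-stochastic `transition` matrix are all hypothetical, and the truncated abstract does not spell out how the paper actually parameterizes the dependency.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

rng = np.random.default_rng(0)
num_ops = 4  # hypothetical number of candidate operations per edge

# Independent modeling: every edge carries its own architecture logits,
# so the child edge's distribution ignores what the parent edge selects.
alpha_parent = rng.normal(size=num_ops)
alpha_child = rng.normal(size=num_ops)
p_parent = softmax(alpha_parent)
p_child_independent = softmax(alpha_child)

# Dependent modeling (assumed form): row i of `transition` is the
# distribution over child operations given that the parent edge uses
# operation i. The child's marginal then follows from the parent's.
transition_logits = rng.normal(size=(num_ops, num_ops))
transition = softmax(transition_logits, axis=1)
p_child_dependent = p_parent @ transition
```

In this sketch, changing `alpha_parent` shifts `p_child_dependent` but leaves `p_child_independent` untouched, which is the kind of coupling between connected edges that the abstract says existing methods ignore.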
