Multi-objective Neural Architecture Search via Non-stationary Policy Gradient
Zewei Chen, Fengwei Zhou, George Trimponias, Zhenguo Li

TL;DR
This paper introduces non-stationary policy gradient (NPG), a reinforcement learning approach to multi-objective neural architecture search that uses a non-stationary reward function to approximate the full Pareto front efficiently and to find architectures that outperform those of existing methods.
Contribution
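It proposes an RL-based framework for multi-objective NAS that combines non-stationary reward functions, an exploration scheme based on cosine temperature decay with warm restarts, and a pre-trained shared model for fast, accurate architecture evaluation.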
Findings
Efficient approximation of the full Pareto front.
Discovered architectures outperform existing methods.
The framework runs fast while achieving high predictive performance.
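To make the term "Pareto front" in the findings concrete, here is a minimal sketch of a non-dominated filter. It is not taken from the paper: the function names, the choice of two minimized objectives (error and latency), and the candidate values are all hypothetical and serve only as illustration.

```python
def dominates(a, b):
    """Return True if point a dominates point b (all objectives minimized).

    a dominates b when a is no worse on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (error, latency) pairs for five candidate architectures.
candidates = [
    (0.10, 30.0),
    (0.08, 45.0),
    (0.12, 25.0),
    (0.11, 40.0),  # dominated by (0.10, 30.0)
    (0.08, 50.0),  # dominated by (0.08, 45.0)
]
front = pareto_front(candidates)
```

In this toy example the front holds the first three candidates; each one trades error against latency in a way no other candidate improves on both counts.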
Abstract
Multi-objective Neural Architecture Search (NAS) aims to discover novel architectures in the presence of multiple conflicting objectives. Despite recent progress, the problem of approximating the full Pareto front accurately and efficiently remains challenging. In this work, we explore the novel reinforcement learning (RL) based paradigm of non-stationary policy gradient (NPG). NPG utilizes a non-stationary reward function, and encourages a continuous adaptation of the policy to capture the entire Pareto front efficiently. We introduce two novel reward functions with elements from the dominant paradigms of scalarization and evolution. To handle non-stationarity, we propose a new exploration scheme using cosine temperature decay with warm restarts. For fast and accurate architecture evaluation, we introduce a novel pre-trained shared model that we continuously fine-tune throughout…
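The exploration scheme named in the abstract, cosine temperature decay with warm restarts, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the schedule follows the common cosine-annealing form, and the names `cosine_temperature`, `softmax_with_temperature`, `period`, `t_max`, and `t_min`, along with their default values, are hypothetical.

```python
import math


def cosine_temperature(step, period, t_max=5.0, t_min=1.0):
    """Cosine-decayed temperature that restarts at t_max every `period` steps.

    Assumed form (not confirmed by the abstract):
    T = t_min + 0.5 * (t_max - t_min) * (1 + cos(pi * t_cur / period))
    """
    t_cur = step % period  # position inside the current cycle
    return t_min + 0.5 * (t_max - t_min) * (1.0 + math.cos(math.pi * t_cur / period))


def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical policy logits over three candidate operations.
logits = [2.0, 1.0, 0.0]
probs_at_restart = softmax_with_temperature(logits, cosine_temperature(0, period=10))
probs_late_cycle = softmax_with_temperature(logits, cosine_temperature(9, period=10))
```

Right after a restart the temperature is high, so sampling is close to uniform and the policy explores; late in a cycle the temperature is low and sampling concentrates on the highest-scoring choices. Periodic restarts are one plausible way to let a policy re-explore when the reward function changes.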
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Machine Learning and Data Classification · Machine Learning in Materials Science
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
