CORE: Constraint-Aware One-Step Reinforcement Learning for Simulation-Guided Neural Network Accelerator Design
Yifeng Xiao, Yurong Xu, Ning Yan, Masood Mortazavi, Pierluigi Nuzzo

TL;DR
CORE introduces a constraint-aware, one-step reinforcement learning approach for simulation-guided neural network accelerator design, improving sample efficiency and constraint satisfaction without relying on value functions.
Contribution
It presents a novel critic-free, one-step RL method that effectively handles complex constraints and hybrid action spaces in high-dimensional design optimization.
Findings
Significantly improves sample efficiency over existing methods.
Achieves better accelerator configurations in neural network hardware mapping.
Demonstrates broad applicability to constrained design problems.
Abstract
Simulation-based design space exploration (DSE) aims to efficiently optimize high-dimensional structured designs under complex constraints and expensive evaluation costs. Existing approaches, including heuristic and multi-step reinforcement learning (RL) methods, struggle to balance sampling efficiency and constraint satisfaction due to sparse, delayed feedback, and large hybrid action spaces. In this paper, we introduce CORE, a constraint-aware, one-step RL method for simulationguided DSE. In CORE, the policy agent learns to sample design configurations by defining a structured distribution over them, incorporating dependencies via a scaling-graph-based decoder, and by reward shaping to penalize invalid designs based on the feedback obtained from simulation. CORE updates the policy using a surrogate objective that compares the rewards of designs within a sampled batch, without learning…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Multi-Objective Optimization Algorithms · Model Reduction and Neural Networks · Machine Learning in Materials Science
