Neuro-symbolic Action Masking for Deep Reinforcement Learning
Shuai Han, Mehdi Dastani, Shihan Wang

TL;DR
This paper introduces NSAM, a framework that automatically learns symbolic models during deep reinforcement learning to improve sample efficiency and reduce infeasible actions, integrating symbolic reasoning with policy learning.
Contribution
NSAM automatically learns symbolic models from high-dimensional states during deep reinforcement learning (DRL) and uses them to mask infeasible actions, improving learning efficiency.
Findings
NSAM significantly improves sample efficiency in DRL tasks.
NSAM reduces constraint violations during learning.
NSAM outperforms baseline methods in multiple domains.
Abstract
Deep reinforcement learning (DRL) may explore infeasible actions during training and execution. Existing approaches assume a symbol grounding function that maps high-dimensional states to consistent symbolic representations, together with a manually specified action masking technique to constrain actions. In this paper, we propose Neuro-symbolic Action Masking (NSAM), a novel framework that automatically learns symbolic models, which are consistent with given domain constraints of high-dimensional states, in a minimally supervised manner during the DRL process. Based on the learned symbolic model of states, NSAM learns action masks that rule out infeasible actions. NSAM enables end-to-end integration of symbolic reasoning and deep policy optimization, where improvements in symbolic grounding and policy learning mutually reinforce each other. We evaluate NSAM on multiple domains with constraints,…
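To illustrate the general idea of action masking mentioned in the abstract, here is a minimal sketch in plain Python. It is not NSAM itself: the mask below is hand-written, whereas the paper learns it from a symbolic model. The action names, the `door_locked` predicate, and the functions `feasible_mask` and `masked_softmax` are all hypothetical, introduced only for this example.

```python
import math

# Hypothetical action set, chosen only for illustration.
ACTIONS = ["move_left", "move_right", "open_door"]


def feasible_mask(symbolic_state):
    """Hand-written stand-in for a learned mask.

    Assumed constraint: 'open_door' is infeasible while the door is locked.
    """
    return [True, True, not symbolic_state["door_locked"]]


def masked_softmax(logits, mask):
    """Turn policy logits into probabilities, giving masked actions zero mass."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be feasible")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    state = {"door_locked": True}   # hypothetical symbolic state
    logits = [1.0, 2.0, 3.0]        # hypothetical policy outputs
    probs = masked_softmax(logits, feasible_mask(state))
    for name, p in zip(ACTIONS, probs):
        print(f"{name}: {p:.3f}")
```

Because the masked action receives exactly zero probability, the policy cannot sample it, and the remaining probabilities are renormalized over the feasible actions only.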
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Multimodal Machine Learning Applications
