Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
Tom Zahavy, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz, Shie Mannor

TL;DR
This paper introduces AE-DQN, a deep reinforcement learning architecture that eliminates sub-optimal actions using an external elimination signal provided by the environment, yielding faster and more robust learning than vanilla DQN in text-based games with over a thousand discrete actions.
Contribution
The paper presents a novel Action-Elimination Deep Q-Network that integrates an Action Elimination Network to filter out invalid actions, enhancing learning speed and robustness.
Findings
Considerable speedup over vanilla DQN in text-based games with over a thousand discrete actions
Improved robustness in text-based games with many actions
Effective action filtering using external elimination signals
Abstract
Learning how to act when there are many available actions in each state is a challenging task for Reinforcement Learning (RL) agents, especially when many of the actions are redundant or irrelevant. In such cases, it is sometimes easier to learn which actions not to take. In this work, we propose the Action-Elimination Deep Q-Network (AE-DQN) architecture that combines a Deep RL algorithm with an Action Elimination Network (AEN) that eliminates sub-optimal actions. The AEN is trained to predict invalid actions, supervised by an external elimination signal provided by the environment. Simulations demonstrate a considerable speedup and added robustness over vanilla DQN in text-based games with over a thousand discrete actions.
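The elimination idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the AEN's output is already available as a per-action elimination score in [0, 1], and it uses a plain threshold to decide which actions to drop. The names `admissible_actions`, `select_action`, `threshold`, and `epsilon` are hypothetical and chosen only for illustration.

```python
import random


def admissible_actions(elim_scores, threshold=0.5):
    """Return indices of actions whose predicted elimination score
    does not exceed the threshold (assumed scoring scheme)."""
    kept = [a for a, score in enumerate(elim_scores) if score <= threshold]
    # If every action would be eliminated, fall back to the full action set
    # so the agent can still act.
    return kept if kept else list(range(len(elim_scores)))


def select_action(q_values, elim_scores, threshold=0.5, epsilon=0.0, rng=random):
    """Epsilon-greedy selection restricted to non-eliminated actions."""
    candidates = admissible_actions(elim_scores, threshold)
    if rng.random() < epsilon:
        # Explore only among the actions that survived elimination.
        return rng.choice(candidates)
    # Exploit: highest Q-value among the surviving actions.
    return max(candidates, key=lambda a: q_values[a])


if __name__ == "__main__":
    q_values = [1.0, 5.0, 2.0, 0.5]      # hypothetical Q-value estimates
    elim_scores = [0.1, 0.9, 0.2, 0.7]   # hypothetical elimination predictions
    print(select_action(q_values, elim_scores))  # prints 2
```

In this toy example, action 1 has the highest Q-value but is predicted to be invalid, so the greedy choice falls to action 2. Restricting both exploration and exploitation to the surviving actions is what shrinks the effective action space in this sketch.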
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Data Stream Mining Techniques
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
