At Human Speed: Deep Reinforcement Learning with Action Delay
Vlad Firoiu, Tina Ju, Josh Tenenbaum

TL;DR
This paper addresses action delay in deep reinforcement learning by introducing a neural predictive model that compensates for the delay, and demonstrates the approach in a competitive video game, Super Smash Bros. Melee.
Contribution
It proposes a novel neural predictive model to mitigate action delay effects in deep reinforcement learning, inspired by human perception mechanisms.
Findings
When the agent's reaction time is restricted to a human level, standard deep RL methods quickly drop in performance.
The proposed predictive model restores performance under delay.
The resulting agent is effective against professional players in Super Smash Bros. Melee.
Abstract
There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of tasks, from video games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning and reinforcement learning, that learn to play from experience with minimal prior knowledge. However, these machines often do not win through intelligence alone -- they possess vastly superior speed and precision, allowing them to act in ways a human never could. To level the playing field, we restrict the machine's reaction time to a human level, and find that standard deep reinforcement learning methods quickly drop in performance. We propose a solution to the action delay problem inspired by human perception -- to endow agents with a neural predictive model of the environment which "undoes" the delay inherent in their environment -- and demonstrate…
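The idea of a predictive model that "undoes" the delay can be illustrated with a toy sketch. This is not the authors' implementation: the one-dimensional environment, the `DelayedEnv` wrapper, and the `undo_delay` helper are all hypothetical, the delay is modelled as a stale observation for simplicity, and the known dynamics function `step` stands in for a learned neural model. The agent receives a state that is several steps old and rolls the model forward through the actions it has taken since then to estimate the present state.

```python
from collections import deque


def step(state, action):
    """Toy known dynamics; a stand-in for a learned neural predictive model."""
    return state + action


class DelayedEnv:
    """Hypothetical wrapper: the agent sees each state `delay` steps late."""

    def __init__(self, delay, start=0):
        self.state = start
        # Holds the last `delay + 1` true states; index 0 is the oldest.
        self.history = deque([start] * (delay + 1), maxlen=delay + 1)

    def act(self, action):
        self.state = step(self.state, action)
        self.history.append(self.state)
        return self.history[0]  # observation that is `delay` steps old


def undo_delay(delayed_obs, recent_actions, model=step):
    """Roll the model forward through the actions taken since the observation."""
    estimate = delayed_obs
    for action in recent_actions:
        estimate = model(estimate, action)
    return estimate


def run(delay=3, actions=(1, 1, -1, 1, 1, 0, 1)):
    env = DelayedEnv(delay)
    # Actions whose effect is not yet visible in the delayed observation.
    recent = deque([0] * delay, maxlen=delay)
    results = []
    for action in actions:
        obs = env.act(action)
        recent.append(action)
        results.append((env.state, obs, undo_delay(obs, recent)))
    return results  # (true state, delayed observation, model estimate)


if __name__ == "__main__":
    for true_state, obs, estimate in run():
        print(true_state, obs, estimate)
```

Because the toy model is exact, the estimate always matches the true state even though the raw observation lags behind. A learned model would only approximate the dynamics, and in a two-player game it would also have to predict the opponent's behaviour, so the estimate would carry some error.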
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Digital Games and Media
