Utilizing Skipped Frames in Action Repeats via Pseudo-Actions
Taisei Hashimoto, Yoshimasa Tsuruoka

TL;DR
This paper introduces pseudo-actions to utilize intermediate frames discarded during action repeats in reinforcement learning, improving sample efficiency by making more data available for training.
Contribution
The paper proposes a novel pseudo-action method that leverages intermediate frames in action repeats, enhancing data utilization in model-free reinforcement learning.
Findings
Improved sample efficiency in continuous control tasks.
Enhanced learning performance in discrete control tasks.
Compatible with various Q-learning algorithms.
Abstract
In many deep reinforcement learning settings, when an agent takes an action, it repeats the same action a predefined number of times without observing the states until the next action-decision point. This technique of action repetition has several merits in training the agent, but the data between action-decision points (i.e., intermediate frames) are, in effect, discarded. Since the amount of training data is inversely proportional to the interval of action repeats, action repeats can have a negative impact on the sample efficiency of training. In this paper, we propose a simple but effective approach to alleviate this problem by introducing the concept of pseudo-actions. The key idea of our method is making the transition between action-decision points usable as training data by considering pseudo-actions. Pseudo-actions for continuous control tasks are obtained as the average of the action…
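To make the data-loss argument concrete, the following is a minimal sketch of an action-repeat loop that counts how many frames a standard setup keeps versus how many intermediate frames it passes over. It is not the authors' implementation and does not compute pseudo-actions. `ToyEnv`, `repeat_action`, and `collect` are hypothetical names introduced for illustration, and summing rewards without discounting over the repeat interval is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical 1-D environment: the state is a running sum of actions."""
    state: float = 0.0

    def step(self, action: float):
        self.state += action
        reward = -abs(self.state)  # toy reward: stay close to zero
        return self.state, reward


def repeat_action(env, action, repeat):
    """Apply `action` `repeat` times and return every per-frame transition."""
    frames = []
    for _ in range(repeat):
        prev_state = env.state
        next_state, reward = env.step(action)
        frames.append((prev_state, action, reward, next_state))
    return frames


def collect(env, actions, repeat):
    """Return (decision-point transitions, intermediate states).

    A standard action-repeat setup stores only the first list; the
    intermediate states are the frames the abstract calls discarded.
    """
    decisions, intermediates = [], []
    for action in actions:
        frames = repeat_action(env, action, repeat)
        start_state = frames[0][0]
        end_state = frames[-1][3]
        # Assumption: rewards are summed without discounting.
        total_reward = sum(f[2] for f in frames)
        decisions.append((start_state, action, total_reward, end_state))
        # States reached strictly between two action-decision points.
        intermediates.extend(f[3] for f in frames[:-1])
    return decisions, intermediates


env = ToyEnv()
decisions, intermediates = collect(env, [1.0, -1.0, 0.5], repeat=4)
print(len(decisions), "stored transitions;", len(intermediates), "skipped frames")
```

With a repeat interval of k, each decision yields one stored transition and k - 1 skipped frames, which is the inverse relationship between training data and repeat interval that the abstract describes.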
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications
