Active Reinforcement Learning over MDPs
Qi Yang, Peng Yang, Ke Tang

TL;DR
This paper introduces an Active Reinforcement Learning framework over MDPs that actively selects training instances to improve generalization efficiency with limited resources, outperforming traditional methods.
Contribution
It proposes a novel framework for active instance selection in RL, incorporating evaluation metrics and mechanisms to enhance resource efficiency and generalization performance.
Findings
Active instance selection improves generalization efficiency.
The framework outperforms unbiased data selection methods.
Proximal Policy Optimization enhances the effectiveness of the approach.
Abstract
The past decade has seen the rapid development of Reinforcement Learning (RL), which achieves impressive performance given abundant training resources. However, one of the greatest challenges in RL is generalization efficiency (i.e., generalization performance per unit time). This paper proposes a framework of Active Reinforcement Learning (ARL) over MDPs to improve generalization efficiency under limited resources through instance selection. Given a number of instances, the algorithm selects valuable instances as the training set while training the policy. Unlike existing approaches, we attempt to actively select and use training data rather than train on all the given data, thereby costing fewer resources. Furthermore, we introduce general instance evaluation metrics and a selection mechanism into the framework. Experimental results reveal that the proposed…
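The selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the evaluation metric or the selection mechanism, so the function names (`select_instances`, `active_training_loop`), the `score` callback, and the top-k rule below are all hypothetical stand-ins chosen only to show the overall shape of "score the instances, train on the most valuable ones, repeat".

```python
from typing import Callable, List, Sequence, TypeVar

Instance = TypeVar("Instance")


def select_instances(
    instances: Sequence[Instance],
    score: Callable[[Instance], float],
    k: int,
) -> List[Instance]:
    """Return the k highest-scoring instances (hypothetical top-k rule).

    Ties keep their original order because Python's sort is stable.
    """
    if k <= 0:
        return []
    ranked = sorted(instances, key=lambda inst: -score(inst))
    return ranked[:k]


def active_training_loop(
    instances: Sequence[Instance],
    score: Callable[[Instance], float],
    train_on: Callable[[List[Instance]], None],
    rounds: int,
    k: int,
) -> List[List[Instance]]:
    """Alternate between selecting instances and training on them.

    `score` is assumed to reflect the current policy, so it is
    re-evaluated every round; `train_on` is assumed to update the
    policy in place (for example, one PPO update on the chosen set).
    Returns the instances chosen in each round.
    """
    history: List[List[Instance]] = []
    for _ in range(rounds):
        chosen = select_instances(instances, score, k)
        train_on(chosen)
        history.append(chosen)
    return history
```

In this sketch the training budget per round is fixed by `k`, so the policy is never trained on the full instance set; any metric that ranks instances by their expected benefit to generalization could be plugged in as `score`.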
Taxonomy
Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Data Stream Mining Techniques
