Active Reinforcement Learning over MDPs

Qi Yang; Peng Yang; Ke Tang

arXiv:2108.02323·cs.LG·August 18, 2021

Open Access

TL;DR

This paper introduces an Active Reinforcement Learning framework over MDPs that actively selects training instances to improve generalization efficiency with limited resources, outperforming traditional methods.

Contribution

It proposes a novel framework for active instance selection in RL, incorporating evaluation metrics and mechanisms to enhance resource efficiency and generalization performance.

Findings

01 Active instance selection improves generalization efficiency.

02 The framework outperforms unbiased data selection methods.

03 Proximal Policy Optimization enhances the effectiveness of the approach.

Abstract

The past decade has seen rapid development in Reinforcement Learning (RL), which achieves impressive performance when given abundant training resources. However, one of the greatest challenges in RL is generalization efficiency (i.e., generalization performance per unit of time). This paper proposes a framework of Active Reinforcement Learning (ARL) over MDPs that improves generalization efficiency under limited resources through instance selection. Given a number of instances, the algorithm selects valuable instances as the training set while training the policy. Unlike existing approaches, we actively select and use training data rather than training on all the given data, thereby costing fewer resources. Furthermore, we introduce general instance evaluation metrics and a selection mechanism into the framework. Experimental results reveal that the proposed…
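The select-while-training idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the names `select_instances`, `active_training_loop`, and `ToyPolicy`, the use of a per-instance loss as the evaluation metric, and the top-k selection rule are all hypothetical and are not taken from the paper. A scalar toy update stands in for the policy optimizer.

```python
from typing import Callable, List, Sequence


def select_instances(
    instances: Sequence[float], score: Callable[[float], float], k: int
) -> List[float]:
    """Return the k highest-scoring instances (hypothetical top-k rule)."""
    return sorted(instances, key=score, reverse=True)[:k]


def active_training_loop(
    instances: Sequence[float],
    score: Callable[[float], float],
    train_step: Callable[[float], None],
    k: int,
    rounds: int,
) -> List[List[float]]:
    """Alternate between selecting instances and training on the selection.

    Scores are recomputed every round, so the selection tracks the
    current policy instead of being fixed up front.
    """
    history = []
    for _ in range(rounds):
        chosen = select_instances(instances, score, k)
        for instance in chosen:
            train_step(instance)
        history.append(chosen)
    return history


class ToyPolicy:
    """Scalar stand-in for a policy; NOT a reinforcement learning agent."""

    def __init__(self) -> None:
        self.theta = 0.0

    def loss(self, target: float) -> float:
        # Assumed evaluation metric: squared error on this instance.
        return (self.theta - target) ** 2

    def update(self, target: float, lr: float = 0.5) -> None:
        # Assumed training step: move theta part of the way to the target.
        self.theta += lr * (target - self.theta)


policy = ToyPolicy()
pool = [0.0, 1.0, 4.0]  # each "instance" is just a target value here
history = active_training_loop(pool, policy.loss, policy.update, k=1, rounds=2)
print(history, policy.theta)
```

In this toy setup only one instance per round is trained on, which mirrors the stated goal of spending fewer resources than training on the whole pool. A real setting would replace `ToyPolicy.update` with a policy optimizer and `ToyPolicy.loss` with whatever instance evaluation metric the framework defines.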

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Data Stream Mining Techniques