Adaptive Data Exploitation in Deep Reinforcement Learning
Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

TL;DR
ADEPT is a framework that adaptively manages data usage in deep reinforcement learning using multi-armed bandit algorithms, leading to improved data efficiency, better generalization, and reduced computational costs across various benchmarks.
Contribution
We propose ADEPT, a novel adaptive data management framework for deep RL that enhances data efficiency and generalization by dynamically optimizing data utilization during training.
Findings
Achieves superior performance on Procgen, MiniGrid, and PyBullet benchmarks.
Reduces computational overhead significantly.
Enhances data efficiency and generalization in deep RL.
Abstract
We introduce ADEPT: Adaptive Data ExPloiTation, a simple yet powerful framework to enhance **data efficiency** and **generalization** in deep reinforcement learning (RL). Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms, optimizing data utilization while mitigating overfitting. Moreover, ADEPT can significantly reduce computational overhead and accelerate a wide range of RL algorithms. We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet. Extensive simulations demonstrate that ADEPT can achieve superior performance with remarkable computational efficiency, offering a practical solution to data-efficient RL. Our code is available at https://github.com/yuanmingqi/ADEPT.
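To make the bandit idea in the abstract concrete, here is a minimal sketch of a UCB1 bandit that picks how heavily to reuse each batch of sampled data. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say what the arms or the reward are, so the arms here are assumed to be candidate numbers of update passes per batch, and `toy_reward` is a hypothetical stand-in for whatever training signal the real framework uses. The names `UCBBandit`, `toy_reward`, and `run` are invented for this example.

```python
import math
import random


class UCBBandit:
    """UCB1 bandit over a small set of candidate data-reuse levels (assumed arms)."""

    def __init__(self, arms, c=1.0):
        self.arms = list(arms)                # e.g. candidate update passes per batch
        self.c = c                            # exploration strength
        self.counts = [0] * len(self.arms)    # pulls per arm
        self.values = [0.0] * len(self.arms)  # running mean reward per arm
        self.t = 0                            # total pulls so far

    def select(self):
        # Play every arm once before applying the UCB rule.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        scores = [
            v + self.c * math.sqrt(math.log(self.t) / n)
            for v, n in zip(self.values, self.counts)
        ]
        return max(range(len(self.arms)), key=scores.__getitem__)

    def update(self, i, reward):
        # Incremental mean update for the chosen arm.
        self.t += 1
        self.counts[i] += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]


def toy_reward(passes, rng):
    # Hypothetical signal: too few passes underuse the data, too many overfit.
    # The peak at 3 passes is arbitrary and chosen only for the demo.
    return 1.0 - 0.2 * abs(passes - 3) + rng.gauss(0.0, 0.05)


def run(steps=500, seed=0):
    rng = random.Random(seed)
    bandit = UCBBandit(arms=[1, 2, 3, 5, 8], c=0.5)
    for _ in range(steps):
        i = bandit.select()  # choose a reuse level for this training round
        bandit.update(i, toy_reward(bandit.arms[i], rng))
    return bandit


if __name__ == "__main__":
    b = run()
    for arm, n, v in zip(b.arms, b.counts, b.values):
        print(f"passes={arm}: pulls={n}, mean reward={v:.3f}")
```

In a real training loop, the reward would come from the learner itself (for example, some measure of improvement after an update), and the bandit would be queried once per rollout so that the chosen reuse level can shift as training progresses.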
Taxonomy
Topics: Blockchain Technology Applications and Security · Data Stream Mining Techniques · Reinforcement Learning in Robotics
