Planning with RL and episodic-memory behavioral priors
Shivansh Beohar, Andrew Melnik

TL;DR
This paper introduces a planning-based reinforcement learning method that leverages behavioral priors to improve exploration efficiency and learning speed, addressing limitations of existing imitation learning approaches.
Contribution
It presents a planning-based approach that uses behavioral priors for exploration and learning, positioned as an alternative to imitation learning methods that need many expert demonstrations and rely on hard-to-interpret learners such as Deep Q-learning.
Findings
Behavioral priors enhance exploration efficiency.
The proposed method accelerates learning in RL environments.
Curated exploration policies improve sample efficiency.
Abstract
The practical application of learning agents requires sample-efficient and interpretable algorithms. Learning from behavioral priors is a promising way to bootstrap agents with a better-than-random exploration policy or a safeguard against the pitfalls of early learning. Existing solutions for imitation learning require a large number of expert demonstrations and rely on hard-to-interpret learning methods like Deep Q-learning. In this work, we present a planning-based approach that can use these behavioral priors for effective exploration and learning in a reinforcement learning environment, and we demonstrate that curated exploration policies in the form of behavioral priors can help an agent learn faster.
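To make "a better-than-random exploration policy" concrete, here is a minimal sketch of one way an episodic-memory behavioral prior could be consulted during exploration. It is an illustration under stated assumptions, not the authors' method. The class `EpisodicMemoryPrior`, the function `explore`, the Euclidean nearest-neighbour lookup, and the `max_distance` threshold are all hypothetical choices, and the sketch covers only the exploration prior, not the planning component.

```python
import math
import random


class EpisodicMemoryPrior:
    """Hypothetical behavioral prior: a store of demonstrated (state, action) pairs."""

    def __init__(self, max_distance):
        # Assumed threshold: beyond this distance the memory offers no suggestion.
        self.max_distance = max_distance
        self.entries = []  # list of (state, action) tuples

    def add(self, state, action):
        self.entries.append((tuple(state), action))

    def suggest(self, state):
        """Return the action stored with the nearest state, or None if none is close enough."""
        best_action, best_dist = None, math.inf
        for stored_state, action in self.entries:
            dist = math.dist(stored_state, state)  # Euclidean distance (assumption)
            if dist < best_dist:
                best_action, best_dist = action, dist
        return best_action if best_dist <= self.max_distance else None


def explore(state, prior, actions, rng):
    """Follow the prior when it has a suggestion; otherwise act uniformly at random."""
    suggestion = prior.suggest(state)
    return suggestion if suggestion is not None else rng.choice(actions)


# Usage: two demonstrated steps seed the memory.
prior = EpisodicMemoryPrior(max_distance=1.0)
prior.add((0.0, 0.0), "right")
prior.add((5.0, 5.0), "up")
rng = random.Random(0)
near_demo = explore((0.2, 0.1), prior, ["left", "right", "up", "down"], rng)
far_from_demo = explore((2.5, 2.5), prior, ["left", "right", "up", "down"], rng)
```

Near a demonstrated state the agent repeats the demonstrated action. Elsewhere it falls back to uniform random exploration, so the prior shapes early behavior without constraining states it has never seen.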
Taxonomy
Topics: AI-based Problem Solving and Planning · Reinforcement Learning in Robotics · Machine Learning and Algorithms
