Contrastive Initial State Buffer for Reinforcement Learning
Nico Messikommer, Yunlong Song, Davide Scaramuzza

TL;DR
This paper introduces a Contrastive Initial State Buffer that strategically reuses past experiences to initialize agents in reinforcement learning, improving efficiency and performance in robotic tasks without prior environment knowledge.
Contribution
The paper proposes a Contrastive Initial State Buffer that reuses past experiences for data collection, not only for policy updates, to improve exploration and speed up convergence. The approach is independent of the underlying RL algorithm.
Findings
Achieves higher task performance than baseline methods.
Speeds up training convergence in robotic tasks.
Effective without prior environment information.
Abstract
In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples. While recent works have been effective in leveraging past experiences for policy updates, they often overlook the potential of reusing past experiences for data collection. Independent of the underlying RL algorithm, we introduce the concept of a Contrastive Initial State Buffer, which strategically selects states from past experiences and uses them to initialize the agent in the environment in order to guide it toward more informative states. We validate our approach on two complex robotic tasks without relying on any prior information about the environment: (i) locomotion of a quadruped robot traversing challenging terrains and (ii) a quadcopter drone racing through a track. The experimental results show that our initial state…
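The core idea in the abstract, storing visited states and reusing a selected subset as episode start states, can be illustrated with a minimal Python sketch. The abstract does not spell out the contrastive selection rule, so this sketch swaps in a plain greedy farthest-point selection over placeholder embeddings; it is not the authors' method. The class name `InitialStateBuffer`, its methods, and the toy one-dimensional embeddings are all hypothetical, and the sketch assumes the simulator can be reset to an arbitrary stored state.

```python
import numpy as np


class InitialStateBuffer:
    """Hypothetical buffer of visited states used to pick episode start states."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.states = []       # raw simulator states that can be restored later
        self.embeddings = []   # one feature vector per stored state (assumed given)
        self.rng = np.random.default_rng(seed)

    def add(self, state, embedding):
        # Evict the oldest entry once the buffer is full (FIFO).
        if len(self.states) >= self.capacity:
            self.states.pop(0)
            self.embeddings.pop(0)
        self.states.append(np.asarray(state, dtype=float))
        self.embeddings.append(np.asarray(embedding, dtype=float))

    def sample_initial_states(self, n):
        # Stand-in selection rule (an assumption, not the paper's):
        # greedy farthest-point selection so the chosen start states
        # are spread out in embedding space.
        if not self.states:
            return []
        emb = np.stack(self.embeddings)
        n = min(n, len(emb))
        chosen = [int(self.rng.integers(len(emb)))]
        dist = np.linalg.norm(emb - emb[chosen[0]], axis=1)
        while len(chosen) < n:
            nxt = int(np.argmax(dist))
            chosen.append(nxt)
            # Track each point's distance to its nearest chosen point.
            dist = np.minimum(dist, np.linalg.norm(emb - emb[nxt], axis=1))
        return [self.states[i] for i in chosen]


# Toy usage: ten states along a line, embedded by their x-coordinate.
buffer = InitialStateBuffer(capacity=100)
for x in range(10):
    buffer.add(state=[x, 0.0], embedding=[float(x)])
starts = buffer.sample_initial_states(3)
```

In a training loop, each returned state would be passed to whatever reset-to-state mechanism the simulator exposes before collecting the next rollout. A learned embedding would replace the toy coordinates used here.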
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Data Stream Mining Techniques
