TL;DR
This paper introduces RLFP, a framework that leverages foundation models to enable embodied agents to learn manipulation tasks more efficiently with minimal reward engineering, achieving high success rates in real and simulated environments.
Contribution
The paper proposes RLFP, a framework that uses foundation models for guidance and feedback, and FAC, an actor-critic algorithm built within it, resulting in sample-efficient learning and robust performance with minimal reward engineering.
Findings
FAC achieves an 86% success rate on real robots after one hour of training.
FAC outperforms baseline methods in simulation while using fewer frames.
The framework is robust to noisy priors and agnostic to foundation model types.
Abstract
Reinforcement learning (RL) is a promising approach for solving robotic manipulation tasks. However, it is challenging to apply RL algorithms directly in the real world. For one thing, RL is data-intensive and typically requires millions of interactions with the environment, which is impractical in real scenarios. For another, designing reward functions manually requires heavy engineering effort. To address these issues, we leverage foundation models in this paper. We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models. Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions. The benefits of our framework are threefold: (1) sample efficient;…
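The abstract names three priors (policy, value, and success-reward) but does not say how they enter the learning signal. The sketch below is one plausible reading, not the paper's confirmed method: it assumes the success prior supplies a sparse 0/1 reward, the value prior is used for potential-based reward shaping, and the policy prior contributes a simple squared-distance penalty on actions. Every name here (`FoundationPriors`, `shaped_reward`, `policy_prior_penalty`, `make_toy_priors`) is hypothetical, and the toy priors are hand-written functions standing in for real foundation models.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]


@dataclass
class FoundationPriors:
    """Hypothetical container for the three priors named in the abstract."""
    success: Callable[[State], bool]    # success-reward prior: is the task done?
    value: Callable[[State], float]     # value prior: rough estimate of progress
    policy: Callable[[State], Action]   # policy prior: suggested action


def shaped_reward(priors: FoundationPriors, state: State, next_state: State,
                  gamma: float = 0.99) -> float:
    """Sparse success reward plus potential-based shaping (an assumed design)."""
    sparse = 1.0 if priors.success(next_state) else 0.0
    shaping = gamma * priors.value(next_state) - priors.value(state)
    return sparse + shaping


def policy_prior_penalty(priors: FoundationPriors, state: State,
                         action: Action) -> float:
    """Squared distance between the agent's action and the prior's suggestion."""
    suggested = priors.policy(state)
    return sum((a - b) ** 2 for a, b in zip(action, suggested))


def make_toy_priors(goal: float = 1.0) -> FoundationPriors:
    """Hand-written priors for a 1-D reaching task (illustration only)."""
    return FoundationPriors(
        success=lambda s: s[0] >= goal,
        value=lambda s: -abs(goal - s[0]),   # closer to the goal is better
        policy=lambda s: [goal - s[0]],      # step straight toward the goal
    )


if __name__ == "__main__":
    priors = make_toy_priors()
    # Moving from x=0.0 to x=0.5 earns shaping reward before any success signal.
    print(shaped_reward(priors, [0.0], [0.5], gamma=1.0))
    # An action that disagrees with the policy prior is penalized.
    print(policy_prior_penalty(priors, [0.0], [0.5]))
```

Under these assumptions, an actor-critic learner would train its critic on `shaped_reward` and add a weighted `policy_prior_penalty` to the actor loss, so no task-specific reward has to be written by hand. Potential-based shaping is chosen here because it is a standard technique that leaves the optimal policy unchanged, which is one way a method could tolerate a noisy value prior.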
