Foundation Models for Semantic Novelty in Reinforcement Learning

Tarun Gupta; Peter Karkus; Tong Che; Danfei Xu; Marco Pavone

arXiv:2211.04878 · cs.LG · November 10, 2022 · 1 citation

TL;DR

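This paper introduces an intrinsic reward for reinforcement learning computed from the embeddings of a pre-trained foundation model such as CLIP. The foundation model is used as-is, with no fine-tuning or learning on the target RL task, and the resulting reward drives exploration toward semantically meaningful states, outperforming state-of-the-art methods in sparse-reward, procedurally generated environments.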

Contribution

The paper proposes defining intrinsic rewards for RL from the embeddings of a pre-trained foundation model, with no fine-tuning on the target task, so that exploration is directed toward semantically meaningful states.

Findings

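01 CLIP-based intrinsic rewards drive exploration toward semantically meaningful states in sparse-reward environments.

02 The method outperforms state-of-the-art exploration methods in challenging sparse-reward, procedurally generated environments.

03 The reward relies only on pre-trained CLIP embeddings, without any fine-tuning or learning on the target RL task.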

Abstract

Effectively exploring the environment is a key challenge in reinforcement learning (RL). We address this challenge by defining a novel intrinsic reward based on a foundation model, such as contrastive language image pretraining (CLIP), which can encode a wealth of domain-independent semantic visual-language knowledge about the world. Specifically, our intrinsic reward is defined based on pre-trained CLIP embeddings without any fine-tuning or learning on the target RL task. We demonstrate that CLIP-based intrinsic rewards can drive exploration towards semantically meaningful states and outperform state-of-the-art methods in challenging sparse-reward procedurally-generated environments.
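The abstract does not give the reward formula, so the following is only a minimal sketch of one way an embedding-based novelty bonus could be computed, not the authors' method. It assumes an episodic scheme in which an observation is rewarded by its cosine distance to the nearest embedding already seen in the current episode. The class `EpisodicNoveltyReward`, the `embed` callable (a stand-in for a frozen image encoder such as CLIP), and the `scale` parameter are all hypothetical names introduced for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity, clamped at 0 to absorb rounding error."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: treat a zero vector as maximally dissimilar
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


class EpisodicNoveltyReward:
    """Hypothetical novelty bonus over embeddings from a frozen encoder.

    `embed` is an assumed placeholder: any function mapping an observation
    to a fixed-length vector. It is never updated here, mirroring the
    "no fine-tuning" setting described in the abstract.
    """

    def __init__(self, embed: Callable[[object], Vector], scale: float = 1.0):
        self.embed = embed
        self.scale = scale
        self.memory: List[List[float]] = []

    def reset(self) -> None:
        """Clear the episodic memory at the start of a new episode."""
        self.memory.clear()

    def __call__(self, observation: object) -> float:
        z = list(self.embed(observation))
        if not self.memory:
            novelty = 1.0  # assumption: the first observation is fully novel
        else:
            # Distance to the closest embedding already seen this episode.
            novelty = min(cosine_distance(z, m) for m in self.memory)
        self.memory.append(z)
        return self.scale * novelty


if __name__ == "__main__":
    # Toy usage: observations are already 2-D "embeddings".
    reward = EpisodicNoveltyReward(embed=lambda obs: obs)
    for obs in ([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]):
        print(reward(obs))
```

In an agent's training loop this bonus would typically be added to the environment reward with a small weight. A repeated observation earns no bonus, while an observation whose embedding points in a new direction earns a large one.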

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning

Methods: Contrastive Language-Image Pre-training