Efficient Policy Adaptation with Contrastive Prompt Ensemble for Embodied Agents

Wonje Choi; Woo Kyung Kim; SeungHyun Kim; Honguk Woo

arXiv:2412.11484·cs.AI·December 17, 2024·3 cites

TL;DR

This paper introduces ConPE, a contrastive prompt ensemble framework that leverages a pretrained vision-language model with visual prompts to enable rapid zero-shot policy adaptation for embodied agents across diverse environments.

Contribution

The paper proposes a novel contrastive prompt ensemble approach that enhances policy generalization and adaptation in embodied RL by using multiple visual prompts and a guided-attention mechanism.

Findings

01 ConPE outperforms state-of-the-art methods in various embodied tasks.

02 It improves sample efficiency in policy learning.

03 It enables rapid zero-shot adaptation to unseen environments.

Abstract

For embodied reinforcement learning (RL) agents interacting with the environment, rapid policy adaptation to unseen visual observations is desirable, but achieving zero-shot adaptation is considered a challenging problem in the RL context. To address this problem, we present a novel contrastive prompt ensemble (ConPE) framework that uses a pretrained vision-language model and a set of visual prompts, enabling efficient policy learning and adaptation across a wide range of environmental and physical changes encountered by embodied agents. Specifically, we devise a guided-attention-based ensemble approach with multiple visual prompts on the vision-language model to construct robust state representations. Each prompt is contrastively learned with respect to an individual domain factor that significantly affects the agent's egocentric perception and observation.…
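
The abstract names two ingredients: an attention-weighted ensemble over several prompted embeddings, and per-prompt contrastive learning. The excerpt does not specify either mechanism, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes cosine-similarity attention against an unprompted embedding and an InfoNCE-style loss as a stand-in contrastive objective; the names `ensemble_state`, `info_nce`, and `temperature` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_state(base_embedding, prompted_embeddings, temperature=0.1):
    """Combine prompted embeddings into one state vector.

    Assumption: attention weights come from cosine similarity between
    the unprompted (base) embedding and each prompted embedding. The
    paper's guided-attention module may be defined differently.
    """
    scores = [cosine(base_embedding, z) / temperature
              for z in prompted_embeddings]
    weights = softmax(scores)
    dim = len(base_embedding)
    state = [sum(w * z[i] for w, z in zip(weights, prompted_embeddings))
             for i in range(dim)]
    return state, weights


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (an assumed stand-in objective).

    The loss is low when the anchor is closer to the positive than to
    the negatives, e.g. two views that differ only in one domain factor.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[0]
```

In this sketch, a prompted embedding that agrees more closely with the base embedding receives a larger weight, and the ensemble output would serve as the state representation passed to the policy. Lowering `temperature` makes the weighting sharper.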

Videos

Efficient Policy Adaptation with Contrastive Prompt Ensemble for Embodied Agents (SlidesLive)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Sparse Evolutionary Training · Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator