A Minimalist Prompt for Zero-Shot Policy Learning

Meng Song; Xuezhi Wang; Tanay Biradar; Yao Qin; Manmohan Chandraker

arXiv:2405.06063 · cs.LG · May 13, 2024

TL;DR

This paper investigates the minimal information needed in prompts for zero-shot policy learning, showing that conditioning on task parameters alone can match or surpass demonstration-based generalization, with an added learnable prompt further improving performance.

Contribution

It demonstrates that task parameters are sufficient for zero-shot generalization in decision transformers, and introduces a learnable prompt to enhance this effect.
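As a rough illustration of what conditioning on task parameters plus a learnable prompt could look like, the sketch below assembles an input sequence for a decision-transformer-style model. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the linear embedding of the task parameters, the token ordering, and all dimensions are hypothetical, since this summary does not describe the architecture.

```python
import random

EMBED_DIM = 4  # hypothetical token width


def embed_task_params(task_params, weights):
    """Linearly map a task-parameter vector to one prompt token (assumed embedding)."""
    return [sum(w * p for w, p in zip(row, task_params)) for row in weights]


def build_input_sequence(task_params, learnable_prompt, weights, trajectory_tokens):
    """Prepend [learnable prompt tokens, task-parameter token] to the trajectory tokens."""
    task_token = embed_task_params(task_params, weights)
    return list(learnable_prompt) + [task_token] + list(trajectory_tokens)


rng = random.Random(0)

# Hypothetical task parameters, e.g. a 2-D goal position in a contextual RL task.
task_params = [1.0, -0.5]

# Embedding weights; in a real model these would be trained, here they are random.
weights = [[rng.uniform(-1, 1) for _ in task_params] for _ in range(EMBED_DIM)]

# Two learnable prompt tokens, initialised to zero; training would update them.
learnable_prompt = [[0.0] * EMBED_DIM for _ in range(2)]

# Stand-ins for already-embedded (return-to-go, state, action) tokens.
trajectory_tokens = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(3)]

sequence = build_input_sequence(task_params, learnable_prompt, weights, trajectory_tokens)
print(len(sequence), len(sequence[0]))  # prints "6 4"
```

The point of the sketch is that the prompt carries no demonstration data: it holds only the embedded task parameters and a few free vectors, and everything after it is the agent's own trajectory.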

Findings

01

Conditioning on task parameters enables zero-shot generalization comparable to demonstrations.

02

A learnable prompt further improves zero-shot performance across various tasks.

03

Task parameters are key to extracting essential information for generalization.

Abstract

Transformer-based methods have exhibited significant generalization ability when prompted with target-domain demonstrations or example solutions during inference. Although demonstrations, as a way of task specification, can capture rich information that may be hard to specify by language, it remains unclear what information is extracted from the demonstrations to help generalization. Moreover, assuming access to demonstrations of an unseen task is impractical or unreasonable in many real-world scenarios, especially in robotics applications. These questions motivate us to explore what the minimally sufficient prompt could be to elicit the same level of generalization ability as the demonstrations. We study this problem in the contextual RL setting, which allows for quantitative measurement of generalization and is commonly adopted by meta-RL and multi-task RL benchmarks. In this setting,…

Taxonomy

Topics: Machine Learning in Healthcare