Prompt-based Visual Alignment for Zero-shot Policy Transfer
Haihan Gao, Rui Zhang, Qi Yi, Hantao Yao, Haochen Li, Jiaming Guo, Shaohui Peng, Yunkai Gao, QiCheng Wang, Xing Hu, Yuanbo Wen, Zihao Zhang, Zidong Du, Ling Li, Qi Guo, Yunji Chen

TL;DR
This paper introduces prompt-based visual alignment (PVA), a framework that uses visual-language models and prompt tuning to improve zero-shot policy transfer in reinforcement learning by aligning images across domains with semantic constraints.
Contribution
The work presents a novel prompt-based visual alignment method that leverages visual-language models and explicit semantic constraints to enhance cross-domain generalization in RL.
Findings
PVA achieves strong zero-shot generalization in unseen domains.
The framework reduces the need for extensive multi-domain data.
Experiments demonstrate improved performance in autonomous driving tasks.
Abstract
Overfitting has become one of the main obstacles to applications of reinforcement learning (RL). Existing methods do not provide an explicit semantic constraint for the feature extractor, which hinders the agent from learning a unified cross-domain representation and results in performance degradation on unseen domains. Moreover, they require abundant data from multiple domains. To address these issues, we propose prompt-based visual alignment (PVA), a robust framework that mitigates detrimental domain bias in images for zero-shot policy transfer. Inspired by the fact that a Visual-Language Model (VLM) can serve as a bridge between text space and image space, we leverage the semantic information contained in a text sequence as an explicit constraint to train a visual aligner. Thus, the visual aligner can map images from multiple domains to a unified domain and achieve good…
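The idea of using text semantics as an explicit constraint on a visual aligner can be illustrated with a toy loss. This is a minimal sketch under stated assumptions, not the paper's objective: the abstract does not give the loss, so the cosine-distance form, the function names `cosine_similarity` and `semantic_alignment_loss`, and the hand-written two-dimensional embeddings are all hypothetical. In practice the embeddings would come from a VLM's image and text encoders.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def semantic_alignment_loss(image_embeddings, text_embedding):
    """Mean cosine distance between image embeddings and one text embedding.

    Hypothetical stand-in for a semantic constraint: images from different
    domains, after passing through an aligner, are pulled toward the
    embedding of a single shared text description.
    """
    distances = [
        1.0 - cosine_similarity(embedding, text_embedding)
        for embedding in image_embeddings
    ]
    return sum(distances) / len(distances)


# Toy embeddings (assumed values, not produced by a real VLM).
text = [1.0, 0.0]
aligned_images = [[2.0, 0.0], [0.5, 0.0]]    # same direction as the text
unaligned_images = [[0.0, 1.0], [1.0, 1.0]]  # domain-biased directions

aligned_loss = semantic_alignment_loss(aligned_images, text)
unaligned_loss = semantic_alignment_loss(unaligned_images, text)
```

In this toy setting the loss is zero when every image embedding points in the same direction as the text embedding and grows as the embeddings drift apart, which is the behaviour a training signal for an aligner would need.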
Taxonomy
Topics: Hong Kong and Taiwan Politics · Multimodal Machine Learning Applications · Human Pose and Action Recognition
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
