Policy Contrastive Decoding for Robotic Foundation Models

Shihan Wu; Xu Luo; Ji Zhang; Junlin Xie; Jingkuan Song; Heng Tao Shen; Lianli Gao

arXiv:2505.13255·cs.RO·April 27, 2026

TL;DR

This paper introduces Policy Contrastive Decoding (PCD), a training-free method that improves robotic policies by emphasizing object-relevant visual cues, enhancing generalization in simulation and real-world tasks.

Contribution

The paper proposes a novel, training-free contrastive decoding approach that can be plugged into existing robot policies to improve their focus and generalization without finetuning.

Findings

01

PCD improves the performance of existing policies in simulation and real-world environments.

02

PCD enhances the state-of-the-art policy $\pi_0$ by 8.9% in simulation.

03

PCD boosts real-world policy performance by 108%.

Abstract

Robotic foundation models, or generalist robot policies, hold immense potential to enable flexible, general-purpose and dexterous robotic systems. Despite their advancements, our empirical experiments reveal that existing robot policies are prone to learning spurious correlations from pre-training trajectories, adversely affecting their generalization capabilities beyond the training data. To tackle this, we propose a novel Policy Contrastive Decoding (PCD) approach, which redirects the robot policy's focus toward object-relevant visual clues by contrasting action probability distributions derived from original and object-masked visual inputs. As a training-free method, our PCD can be used as a plugin to improve different types of robot policies without needing to finetune or access model weights. We conduct extensive experiments on top of three open-source robot policies, including the…
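The contrast described in the abstract can be illustrated with a small sketch. The abstract as shown here does not give the exact formula, so the code below assumes the generic contrastive-decoding form $p_{\text{cd}}(a) \propto p_{\text{orig}}(a)^{1+\alpha} / p_{\text{mask}}(a)^{\alpha}$ over discretized action bins. The function name `contrastive_action_distribution`, the weight `alpha`, and the toy probabilities are hypothetical and are not taken from the paper or its repository.

```python
import math


def contrastive_action_distribution(logp_original, logp_masked, alpha=1.0):
    """Contrast two log-probability vectors defined over the same action bins.

    Assumed form (not confirmed by the source):
        score(a) = (1 + alpha) * log p_original(a) - alpha * log p_masked(a)
    followed by a softmax. alpha = 0 returns the original distribution.
    """
    if len(logp_original) != len(logp_masked):
        raise ValueError("both inputs must cover the same action bins")

    scores = [
        (1.0 + alpha) * lo - alpha * lm
        for lo, lm in zip(logp_original, logp_masked)
    ]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: bin 0 stays likely even when the object is masked, which
# suggests it is driven by object-irrelevant cues. Bin 1 loses most of its
# probability under masking, so it depends on the object.
p_original = [0.5, 0.3, 0.2]
p_masked = [0.5, 0.1, 0.4]

contrasted = contrastive_action_distribution(
    [math.log(p) for p in p_original],
    [math.log(p) for p in p_masked],
    alpha=1.0,
)
best_bin = max(range(len(contrasted)), key=contrasted.__getitem__)
```

In this toy case the original policy would pick bin 0, while the contrasted distribution prefers bin 1, the action whose probability actually depends on the object being visible. Because only output probabilities are combined, a scheme of this shape needs no access to model weights, which is consistent with the plugin usage the abstract describes.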

Code & Models

Repositories

koorye/PCD
github

Videos

Policy Contrastive Decoding for Robotic Foundation Models · slideslive