TL;DR
This paper introduces an uncertainty-aware policy steering framework that calibrates and combines vision-language models with pre-trained policies to improve robot behavior adaptation and minimize human intervention.
Contribution
It proposes a novel framework called UPS that jointly reasons about semantic and action uncertainties, using conformal prediction for calibration and residual learning for continual improvement.
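For readers unfamiliar with conformal prediction, here is a minimal sketch of the standard split-conformal recipe applied to verifier scores. It is not taken from the paper: the function names, the calibration scores, the option labels, and the choice of `1 - p` as the nonconformity score are all illustrative assumptions.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of nonconformity scores (generic recipe)."""
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(cal_scores)[rank - 1]

def prediction_set(probs, qhat):
    """Keep every option whose nonconformity 1 - p is within the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= qhat]

# Hypothetical calibration data: 1 - (verifier probability of the correct option).
cal_scores = [0.05, 0.10, 0.20, 0.15, 0.60, 0.25, 0.40, 0.70, 0.08]
qhat = conformal_threshold(cal_scores, alpha=0.2)

# A peaked distribution yields a single option; a flat one yields several.
confident = prediction_set({"a": 0.90, "b": 0.07, "c": 0.03}, qhat)
ambiguous = prediction_set({"a": 0.45, "b": 0.42, "c": 0.13}, qhat)
print(qhat, confident, ambiguous)
```

Under the usual exchangeability assumption, a set built this way contains the correct option with probability at least `1 - alpha`, so the set size can be read as a calibrated signal of how uncertain the verifier is.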
Findings
UPS effectively distinguishes confident, ambiguous, and incapable scenarios.
The framework reduces the need for costly human interventions during deployment.
Experiments demonstrate improved performance in simulation and hardware over uncalibrated baselines.
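One way to picture the three scenario types named above is a small triage function over two calibrated sets. This is a hypothetical sketch, not the paper's algorithm: the inputs (`task_set`, a set of plausible task interpretations, and `feasible_actions`, the action samples judged capable of the task) and the decision rules are assumptions made for illustration.

```python
def triage(task_set, feasible_actions):
    """Label a scenario from two hypothetical calibrated sets."""
    if not feasible_actions:
        # No sampled action is judged able to accomplish the task.
        return "incapable"
    if len(task_set) != 1:
        # Zero or several task interpretations remain plausible.
        return "ambiguous"
    # Exactly one interpretation and at least one feasible action.
    return "confident"

print(triage({"pick up the mug"}, ["action_3"]))
print(triage({"pick up the mug", "pick up the cup"}, ["action_1"]))
print(triage({"pick up the mug"}, []))
```

Checking feasibility first is a design choice in this sketch: if the policy cannot act at all, resolving the task wording would not help.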
Abstract
Policy steering is an emerging way to adapt robot behaviors at deployment time: a learned verifier analyzes low-level action samples proposed by a pre-trained policy (e.g., a diffusion policy) and selects only those aligned with the task. While Vision-Language Models (VLMs) are promising general-purpose verifiers due to their reasoning capabilities, existing frameworks often assume these models are well-calibrated. In practice, overconfident judgments from a VLM can degrade steering performance under both high-level semantic uncertainty in task specifications and low-level action uncertainty or incapability of the pre-trained policy. We propose uncertainty-aware policy steering (UPS), a framework that jointly reasons about semantic task uncertainty and low-level action feasibility, and selects an uncertainty resolution strategy: execute a high-confidence action, clarify task…
