AntiGrounding: Lifting Robotic Actions into VLM Representation Space for Decision Making

Wenbo Li; Shiyi Wang; Yiteng Chen; Huiping Zhuang; Qingyao Wu

arXiv:2506.12374 · cs.RO · June 25, 2025

Open Access

TL;DR

AntiGrounding is a framework that lifts candidate robot actions into the VLM representation space for decision making, enabling zero-shot trajectory synthesis and using past experience to improve long-term performance.

Contribution

AntiGrounding reverses instruction grounding by lifting candidate actions directly into the VLM representation space, makes decisions using multi-view trajectory rendering and structured visual question answering, and adds an offline policy refinement module.

Findings

01. Outperforms baselines in simulation and real-world tasks

02. Enables zero-shot synthesis of robot trajectories

03. Improves long-term performance through offline refinement

Abstract

Vision-Language Models (VLMs) encode knowledge and reasoning capabilities for robotic manipulation within high-dimensional representation spaces. However, current approaches often project them into compressed intermediate representations, discarding important task-specific information such as fine-grained spatial or semantic details. To address this, we propose AntiGrounding, a new framework that reverses the instruction grounding process. It lifts candidate actions directly into the VLM representation space, renders trajectories from multiple views, and uses structured visual question answering for instruction-based decision making. This enables zero-shot synthesis of optimal closed-loop robot trajectories for new tasks. We also propose an offline policy refinement module that leverages past experience to enhance long-term performance. Experiments in both simulation and real-world…
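
The selection loop the abstract describes (propose candidate actions, render each trajectory from several views, score the renderings against the instruction, keep the best) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Camera` class, `render_views`, `select_trajectory`, and `make_toy_scorer` are hypothetical names introduced here, the pinhole projection stands in for image rendering, and the toy scorer stands in for the structured visual question answering call to a VLM, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]
Trajectory = List[Point3]
ScoreFn = Callable[[str, List[Point2]], float]


@dataclass(frozen=True)
class Camera:
    """Hypothetical pinhole camera at `origin`, looking along +z."""
    name: str
    origin: Point3
    focal: float = 1.0

    def project(self, p: Point3) -> Point2:
        # Perspective divide; depth is clamped to avoid division by zero.
        x, y, z = (p[i] - self.origin[i] for i in range(3))
        depth = max(z, 1e-6)
        return (self.focal * x / depth, self.focal * y / depth)


def render_views(traj: Trajectory, cameras: Sequence[Camera]) -> Dict[str, List[Point2]]:
    """Project one 3D trajectory into every view (stand-in for image rendering)."""
    return {cam.name: [cam.project(p) for p in traj] for cam in cameras}


def select_trajectory(
    candidates: Sequence[Trajectory],
    cameras: Sequence[Camera],
    score_fn: ScoreFn,
) -> Tuple[int, float]:
    """Return the index and score of the candidate with the highest mean per-view score."""
    best_idx, best_score = -1, float("-inf")
    for i, traj in enumerate(candidates):
        views = render_views(traj, cameras)
        score = sum(score_fn(name, pts) for name, pts in views.items()) / len(views)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score


def make_toy_scorer(goal: Point3, cameras: Sequence[Camera]) -> ScoreFn:
    """Assumed placeholder for the VLM query: rewards trajectories whose final
    projected point lands near the projected goal in each view."""
    goal_px = {cam.name: cam.project(goal) for cam in cameras}

    def score(view_name: str, pts: List[Point2]) -> float:
        (u, v), (gu, gv) = pts[-1], goal_px[view_name]
        return -((u - gu) ** 2 + (v - gv) ** 2)

    return score


# Usage: two views, two candidate trajectories, goal at the world origin.
cams = [Camera("front", (0.0, 0.0, -2.0)), Camera("side", (1.0, 0.0, -2.0))]
candidates = [
    [(1.0, 1.0, 0.0), (0.5, 0.5, 0.0)],  # stops short of the goal
    [(1.0, 1.0, 0.0), (0.0, 0.0, 0.0)],  # ends at the goal
]
best_idx, best_score = select_trajectory(
    candidates, cams, make_toy_scorer((0.0, 0.0, 0.0), cams)
)
print(best_idx, best_score)
```

In a closed-loop setting this selection would presumably be repeated after each executed step with fresh observations; the sketch covers only a single decision.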

Taxonomy

Topics: Constraint Satisfaction and Optimization · Data Visualization and Analytics