Can Explicit Physical Feasibility Benefit VLA Learning? An Empirical Study

Yubai Wei, Chen Wu, and Hashem Haghbayan

arXiv:2604.17896 · cs.LG · May 6, 2026

TL;DR

This paper investigates whether explicit physical feasibility supervision can improve vision-language-action (VLA) models for robot control. It reports gains in physical reliability, task performance, and learning efficiency in low-data settings.

Contribution

The study introduces a geometry-grounded feasibility objective into VLA training and demonstrates its benefits through obstacle-aware manipulation experiments.
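
To make the idea of a geometry-grounded feasibility term concrete, here is a minimal sketch in Python. It is not the paper's formulation, which this page does not give: the spherical obstacle, the hinge-squared clearance penalty, and the names `clearance_penalty`, `total_loss`, `margin`, and `weight` are all illustrative assumptions. The sketch only shows how a geometric penalty on predicted waypoints could be added to an imitation loss.

```python
import numpy as np


def clearance_penalty(waypoints, obstacle_center, obstacle_radius, margin=0.05):
    """Hypothetical feasibility term: penalize waypoints near a spherical obstacle.

    Assumption for illustration only: the obstacle is a sphere and the
    predicted action chunk is a sequence of 3D end-effector positions.
    """
    waypoints = np.asarray(waypoints, dtype=float)
    center = np.asarray(obstacle_center, dtype=float)
    # Distance from each waypoint to the obstacle surface (negative if inside).
    clearance = np.linalg.norm(waypoints - center, axis=-1) - obstacle_radius
    # Hinge: only waypoints closer than `margin` to the surface are penalized.
    violation = np.maximum(0.0, margin - clearance)
    return float(np.mean(violation ** 2))


def total_loss(imitation_loss, waypoints, obstacle_center, obstacle_radius, weight=1.0):
    """Imitation loss plus a weighted feasibility penalty (assumed additive form)."""
    penalty = clearance_penalty(waypoints, obstacle_center, obstacle_radius)
    return imitation_loss + weight * penalty
```

In this sketch a trajectory that stays clear of the obstacle contributes no penalty, so the combined loss reduces to the imitation loss; only waypoints that enter the safety margin add a gradient signal.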

Findings

01. Feasibility supervision improves physical reliability of VLA policies.

02. Augmenting training with feasibility signals enhances task performance.

03. Explicit feasibility guidance accelerates learning in low-data scenarios.

Abstract

Vision-Language-Action (VLA) models map multimodal inputs directly to robot actions and are typically trained through large-scale imitation learning. While this paradigm has shown strong performance, prevailing VLA training procedures do not explicitly supervise hard physical constraints such as obstacle avoidance or kinematic feasibility. As a result, the geometric structure underlying physically feasible behavior must be inferred only implicitly from demonstrations. In this paper, we study whether introducing explicit feasibility supervision can provide effective structured guidance for VLA policies. We formulate a simple geometry-grounded feasibility objective and integrate it into the training stage of a diffusion-based VLA policy. To evaluate this idea systematically, we use obstacle-aware manipulation as a controlled probe of geometry-dependent physical feasibility. Empirical…
