When Vision Overrides Language: Evaluating and Mitigating Counterfactual Failures in VLAs
Yu Fang, Yuchun Feng, Dong Jing, Jiaqi Liu, Yue Yang, Zhenyu Wei, Daniel Szafir, Mingyu Ding

TL;DR
This paper introduces LIBERO-CF, a benchmark for evaluating counterfactual failures in Vision-Language-Action models, and proposes CAG, a simple inference scheme that reduces reliance on visual shortcuts, improving robustness and accuracy.
Contribution
The paper presents LIBERO-CF for systematic evaluation of counterfactual failures and introduces CAG, a plug-and-play method that enhances VLAs by regularizing language conditioning without retraining.
Findings
CAG improves language-following accuracy by 9.7% on LIBERO-CF.
CAG reduces counterfactual failures by 9.4% in real-world tests.
CAG enhances task success rates by up to 17.2%.
Abstract
Vision-Language-Action models (VLAs) promise to ground language instructions in robot control, yet in practice they often fail to follow language faithfully. When presented with instructions that lack strong scene-specific supervision, VLAs suffer from counterfactual failures: they act on vision shortcuts induced by dataset biases, repeatedly executing well-learned behaviors and selecting objects frequently seen during training regardless of language intent. To study these failures systematically, we introduce LIBERO-CF, the first counterfactual benchmark for VLAs, which evaluates language-following capability by assigning alternative instructions under visually plausible LIBERO layouts. Our evaluation reveals that counterfactual failures are prevalent yet underexplored across state-of-the-art VLAs. We propose Counterfactual Action Guidance (CAG), a simple yet effective dual-branch inference scheme…
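The abstract above is cut off before it describes how the two branches of CAG are combined, so the following is only an illustrative sketch of one way a dual-branch inference scheme could work, not the authors' formulation. It assumes (hypothetically) that one branch predicts an action from vision plus language, the other from vision alone, and that the two are blended in the style of classifier-free guidance. The function name `guided_action` and the parameter `guidance_scale` are invented for this example.

```python
import numpy as np


def guided_action(action_lang, action_vision_only, guidance_scale=1.5):
    """Blend two action predictions, guidance-style (illustrative assumption).

    action_lang:        action predicted from vision + language (hypothetical branch)
    action_vision_only: action predicted from vision alone (hypothetical branch)
    guidance_scale:     1.0 returns the language-conditioned action unchanged;
                        values above 1.0 push further away from the
                        vision-only prediction.
    """
    action_lang = np.asarray(action_lang, dtype=float)
    action_vision_only = np.asarray(action_vision_only, dtype=float)
    if action_lang.shape != action_vision_only.shape:
        raise ValueError("both branches must predict actions of the same shape")
    # Amplify the part of the action that is attributable to the instruction.
    language_effect = action_lang - action_vision_only
    return action_vision_only + guidance_scale * language_effect


if __name__ == "__main__":
    # Toy 2-D actions: the vision-only branch drifts toward a familiar object.
    print(guided_action([1.0, 0.0], [0.5, 0.0], guidance_scale=2.0))
```

Under this assumption, the difference between the two branches isolates what the instruction contributes, so scaling it up counteracts behavior that the visual scene alone would trigger. Both branches run only at inference time, which is consistent with the summary's description of CAG as plug-and-play and requiring no retraining.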
Taxonomy
Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Robot Manipulation and Learning
