CycleVLA: Proactive Self-Correcting Vision-Language-Action Models via Subtask Backtracking and Minimum Bayes Risk Decoding

Chenyang Ma; Guangyu Yang; Kai Lu; Shitong Xu; Bill Byrne; Niki Trigoni; Andrew Markham

arXiv:2601.02295·cs.RO·January 6, 2026

TL;DR

CycleVLA takes a proactive approach to failure handling in vision-language-action models: it detects incipient failures early and corrects them through subtask backtracking and MBR decoding, improving task success for both well-trained and under-trained VLAs.

Contribution

The paper presents CycleVLA, a system that proactively detects and corrects failures in VLAs by combining subtask backtracking with Minimum Bayes Risk decoding, in contrast to failure handling that acts only after a failure has occurred.

Findings

01. CycleVLA improves success rates for both well-trained and under-trained VLAs.

02. MBR decoding acts as an effective zero-shot test-time scaling strategy.

03. Proactive failure detection reduces post-failure correction needs.
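
As a rough illustration of what finding 02 refers to, the sketch below shows generic Minimum Bayes Risk selection over sampled action candidates: the candidate with the lowest average distance to the other samples is chosen. This is not the paper's implementation. The flattened-list action representation, the L2 distance used as the risk function, and the names `l2_distance` and `mbr_select` are all assumptions made for the example.

```python
import math
from typing import List, Sequence

# Hypothetical representation: each candidate is a flattened action chunk.
Action = Sequence[float]


def l2_distance(a: Action, b: Action) -> float:
    """Euclidean distance between two action vectors (assumed risk function)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mbr_select(candidates: List[Action]) -> int:
    """Return the index of the candidate with the lowest average distance
    to all other candidates (the minimum expected risk under the samples)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best_index, best_risk = 0, float("inf")
    for i, hypothesis in enumerate(candidates):
        total = sum(
            l2_distance(hypothesis, other)
            for j, other in enumerate(candidates)
            if j != i
        )
        # Average over the other samples; a lone candidate has zero risk.
        risk = total / max(len(candidates) - 1, 1)
        if risk < best_risk:
            best_index, best_risk = i, risk
    return best_index


# Made-up samples: three similar candidates and one outlier.
samples = [[0.10, 0.00], [0.12, 0.01], [0.11, -0.01], [0.90, 0.50]]
chosen = mbr_select(samples)
print("selected candidate:", samples[chosen])
```

Because the selection only compares samples with one another, it needs no extra training, which is consistent with the "zero-shot" description above. The cost is quadratic in the number of samples.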

Abstract

Current work on robot failure detection and correction typically operates in a post hoc manner, analyzing errors and applying corrections only after failures occur. This work introduces CycleVLA, a system that equips Vision-Language-Action models (VLAs) with proactive self-correction, the capability to anticipate incipient failures and recover before they fully manifest during execution. CycleVLA achieves this by integrating a progress-aware VLA that flags critical subtask transition points where failures most frequently occur, a VLM-based failure predictor and planner that triggers subtask backtracking upon predicted failure, and a test-time scaling strategy based on Minimum Bayes Risk (MBR) decoding to improve retry success after backtracking. Extensive experiments show that CycleVLA improves performance for both well-trained and under-trained VLAs, and that MBR serves as an effective…
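
The control flow described in the abstract can be sketched roughly as follows. This is one reading of the abstract, not the authors' code: every name here (`run_with_backtracking`, `execute`, `predict_failure`, `reset_to`) is hypothetical, the three components are passed in as plain callables, and the retry limit `max_retries` is an assumption that the abstract does not mention.

```python
from typing import Callable, List


def run_with_backtracking(
    subtasks: List[str],
    execute: Callable[[str, bool], None],    # runs the policy up to the flagged transition point
    predict_failure: Callable[[str], bool],  # stand-in for the VLM-based failure predictor
    reset_to: Callable[[str], None],         # stand-in for backtracking to the subtask start
    max_retries: int = 2,                    # assumed limit, not from the paper
) -> bool:
    """Run subtasks in order; on a predicted failure, backtrack and retry."""
    for subtask in subtasks:
        retries = 0
        use_mbr = False  # first attempt uses ordinary decoding
        while True:
            execute(subtask, use_mbr)
            if not predict_failure(subtask):
                break  # no failure predicted, move on to the next subtask
            if retries >= max_retries:
                return False  # give up after the assumed retry budget
            retries += 1
            reset_to(subtask)
            use_mbr = True  # retries use MBR decoding
    return True


# Toy stubs: the predictor flags "grasp" once, then lets it pass.
log = []
pending_failures = {"grasp": 1}


def execute(subtask: str, use_mbr: bool) -> None:
    log.append((subtask, use_mbr))


def predict_failure(subtask: str) -> bool:
    if pending_failures.get(subtask, 0) > 0:
        pending_failures[subtask] -= 1
        return True
    return False


def reset_to(subtask: str) -> None:
    log.append(("backtrack", subtask))


ok = run_with_backtracking(["reach", "grasp", "place"], execute, predict_failure, reset_to)
print(ok, log)
```

In this toy run the "grasp" subtask is attempted twice, with the second attempt marked as using MBR decoding, which mirrors the abstract's statement that MBR is applied to improve retry success after backtracking.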

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications