Towards Backdoor-Based Ownership Verification for Vision-Language-Action Models

Ming Sun; Rui Wang; Xingrui Yu; Lihua Jing; Hangyu Du; Zhenglin Wan; Xu Pan; Ivor Tsang

arXiv:2605.09005 · cs.RO · May 12, 2026

TL;DR

This paper introduces GuardVLA, a backdoor-based framework for verifying ownership of vision-language-action models, embedding secret watermarks during training and reliably detecting them post-release.

Contribution

GuardVLA is presented as the first backdoor watermarking method designed specifically for VLAs. It supports ownership verification without degrading the model's performance on benign tasks.

Findings

1. GuardVLA reliably verifies ownership across multiple datasets and architectures.

2. The embedded watermark remains detectable after model adaptation.

3. GuardVLA preserves the benign task performance of models.

Abstract

Vision-Language-Action models (VLAs) support generalist robotic control by enabling end-to-end decision policies directly from multi-modal inputs. As trained VLAs are increasingly shared and adapted, protecting model ownership becomes essential for secure deployment and responsible open-source usage. In this paper, we present GuardVLA, the first backdoor-based ownership verification framework specifically designed for VLAs. GuardVLA embeds a stealthy and harmless backdoor watermark into the protected model during training by injecting secret messages into embodied visual data. For post-release verification, we propose a swap-and-detect mechanism, in which the trigger projector and an external classifier head are used to activate and detect the embedded backdoor based on prediction probabilities. Extensive experiments across multiple datasets, model architectures, and adaptation settings…
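
The abstract says only that detection is "based on prediction probabilities" and does not give the decision rule. As a rough illustration of what such a check could look like, the sketch below compares the classifier head's watermark-class probability on triggered inputs against clean inputs. The function name `verify_ownership`, the `margin` threshold, and the mean-gap rule are all hypothetical assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def verify_ownership(triggered_probs, clean_probs, margin=0.5):
    """Hypothetical decision rule (an assumption, not the paper's method).

    triggered_probs: watermark-class probabilities from the external
        classifier head on inputs carrying the secret trigger.
    clean_probs: the same probabilities on untriggered inputs.
    margin: assumed minimum gap between the two means for a positive claim.

    Returns (claimed, gap), where claimed is True when the mean probability
    on triggered inputs exceeds the clean mean by at least `margin`.
    """
    if not triggered_probs or not clean_probs:
        raise ValueError("both probability lists must be non-empty")
    gap = mean(triggered_probs) - mean(clean_probs)
    return gap >= margin, gap


# Illustrative numbers only; these are not experimental results.
suspect_claimed, suspect_gap = verify_ownership(
    [0.90, 0.95, 0.85], [0.10, 0.05, 0.15]
)
independent_claimed, independent_gap = verify_ownership(
    [0.12, 0.08, 0.10], [0.10, 0.05, 0.15]
)
```

In practice a statistical test over many samples would likely be preferable to a fixed margin, since a single threshold does not control the false-positive rate.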
