VLA-Forget: Vision-Language-Action Unlearning for Embodied Foundation Models

Ravi Ranjan; Agoritsa Polyzou

arXiv:2604.03956·cs.CV·April 24, 2026

TL;DR

VLA-Forget is a hybrid unlearning framework for embodied vision-language-action models that effectively removes undesirable behaviors while preserving perception and reasoning capabilities.

Contribution

It introduces a staged, multi-objective unlearning method that jointly optimizes for forgetting, perception preservation, and reasoning retention in embodied models.
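One way to picture a joint objective of this kind is a weighted sum of three loss terms, with the forgetting term negated so that minimizing the total raises the loss on the data to be forgotten. The sketch below is a generic illustration only: this summary does not give the paper's actual objective, and the names `LossWeights` and `combined_unlearning_loss`, the weighted-sum form, and the default weights are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical trade-off weights; not taken from the paper."""
    forget: float = 1.0
    perception: float = 1.0
    reasoning: float = 1.0


def combined_unlearning_loss(forget_loss, perception_loss, reasoning_loss,
                             weights=LossWeights()):
    """Combine three scalar losses into one value to minimize.

    The forget term is negated: minimizing the total pushes the loss on the
    forget set up (gradient ascent), while the perception and reasoning terms
    are minimized as usual on retained data.
    """
    return (-weights.forget * forget_loss
            + weights.perception * perception_loss
            + weights.reasoning * reasoning_loss)


# Example with made-up scalar losses.
total = combined_unlearning_loss(2.0, 0.5, 0.25)
print(total)
```

Raising `weights.perception` or `weights.reasoning` in this sketch makes the optimizer favor retained capability over forgetting. A staged variant could apply the same objective while updating different modules in each stage, but the staging used by the authors is not described in this summary.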

Findings

01 Improves forgetting efficacy by 10%

02 Preserves perceptual specificity by 22%

03 Retains reasoning and task success by 9%

Abstract

Vision-language-action (VLA) models are emerging as embodied foundation models for robotic manipulation, but their deployment introduces a new unlearning challenge: removing unsafe, spurious, or privacy-sensitive behaviors without degrading perception, language grounding, and action control. In OpenVLA-style policies, behavior is produced through a fused visual encoder, a cross-modal projector, and a language backbone that predicts tokenized robot actions, so undesirable knowledge can be distributed across perception, alignment, and reasoning/action layers rather than confined to a single module. Consequently, partial unlearning applied only to the vision stack or only to the language backbone is often insufficient, while conventional unlearning baselines designed for standalone vision or language models may leave residual forgetting or incur unnecessary utility loss in embodied…
