Observing and Controlling Features in Vision-Language-Action Models

Hugo Buurmeijer; Carmen Amo Alonso; Aiden Swann; Marco Pavone

arXiv:2603.05487·cs.RO·March 6, 2026

Open Access

TL;DR

This paper introduces methods to observe and control features inside Vision-Language-Action Models (VLAs), enabling real-time, interpretable, lightweight steering of robot behavior without fine-tuning.

Contribution

The paper introduces two concepts, feature-observability and feature-controllability, and provides techniques for linearly observing and intervening on VLA internal representations.

Findings

01. Targeted linear interventions can steer robot behavior reliably.

02. VLAs have interpretable internal structures suitable for online adaptation.

03. Interventions preserve closed-loop capabilities during real-time control.

Abstract

Vision-Language-Action Models (VLAs) have shown remarkable progress towards embodied intelligence. While their architecture partially resembles that of Large Language Models (LLMs), VLAs exhibit higher complexity due to their multi-modal inputs/outputs and often hybrid nature of transformer and diffusion heads. This is part of the reason why insights from mechanistic interpretability in LLMs, which explain how the internal model representations relate to their output behavior, do not trivially transfer to VLA counterparts. In this work, we propose to close this gap by introducing and analyzing two main concepts: feature-observability and feature-controllability. In particular, we first study features that are linearly encoded in representation space, and show how they can be observed by means of a linear classifier. Then, we use a minimal linear intervention grounded in optimal control…
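
The two ideas in the abstract, observing a linearly encoded feature with a linear classifier and then applying a minimal linear intervention, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the hidden states are synthetic, the names `fit_linear_probe` and `steer` are hypothetical, and the intervention is a plain least-norm shift along the probe direction, used as a stand-in rather than the paper's optimal-control formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for VLA hidden states (assumption: not real model data).
# A binary feature is linearly encoded along one hidden direction.
dim, num_samples = 16, 400
true_dir = rng.normal(size=dim)
true_dir /= np.linalg.norm(true_dir)
labels = rng.integers(0, 2, size=num_samples)
hidden = rng.normal(size=(num_samples, dim))
hidden += np.outer(2.0 * labels - 1.0, 3.0 * true_dir)


def fit_linear_probe(x, y, ridge=1e-3):
    """Ridge least-squares classifier: sign(w @ h + b) predicts the feature."""
    xb = np.hstack([x, np.ones((len(x), 1))])  # append a bias column
    targets = 2.0 * y - 1.0                    # map {0, 1} to {-1, +1}
    gram = xb.T @ xb + ridge * np.eye(xb.shape[1])
    theta = np.linalg.solve(gram, xb.T @ targets)
    return theta[:-1], theta[-1]


def steer(h, w, b, target):
    """Smallest-norm additive change to h such that w @ h + b == target."""
    gap = target - (w @ h + b)
    return h + gap * w / (w @ w)


# Observation: read the feature off the hidden states with the probe.
w, b = fit_linear_probe(hidden, labels)
accuracy = np.mean((hidden @ w + b > 0) == (labels == 1))

# Control: move one hidden state so the probe reads a chosen value.
h = hidden[0]
h_steered = steer(h, w, b, target=1.0)
```

In this toy setting the shift is confined to the probe direction `w`, so components of the hidden state orthogonal to `w` are left untouched. That is one simple sense in which a linear intervention can be called minimal; whether it matches the paper's construction is not something this listing confirms.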

Taxonomy

Topics: Multimodal Machine Learning Applications · Language and cultural evolution · Domain Adaptation and Few-Shot Learning