SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models

Anushka Sivakumar; Andrew Zhang; Zaber Hakim; Chris Thomas

arXiv:2510.26769·cs.CV·October 31, 2025

TL;DR

SteerVLM introduces a lightweight, inference-time control module for vision-language models that guides outputs without retraining, using activation steering and a new multimodal dataset.

Contribution

A novel lightweight activation steering method enabling fine-grained, inference-time control of VLMs without modifying weights, supported by the VNIA dataset for evaluation.

Findings

01 Outperforms existing intervention techniques on steering benchmarks

02 Adds learnable parameters equal to only 0.14% of the original VLM's size

03 Effectively mitigates hallucinations in VLM outputs

Abstract

This work introduces SteerVLM, a lightweight steering module designed to guide Vision-Language Models (VLMs) towards outputs that better adhere to desired instructions. Our approach learns from the latent embeddings of paired prompts encoding target and converse behaviors to dynamically adjust activations connecting the language modality with image context. This allows for fine-grained, inference-time control over complex output semantics without modifying model weights while preserving performance on off-target tasks. Our steering module requires learning parameters equal to 0.14% of the original VLM's size. Our steering module gains model control through dimension-wise activation modulation and adaptive steering across layers without requiring pre-extracted static vectors or manual tuning of intervention points. Furthermore, we introduce VNIA (Visual Narrative Intent Alignment), a…
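The abstract's idea of dimension-wise activation modulation driven by paired target and converse prompts can be illustrated with a toy sketch. This is not the authors' module: the function `steer_activation`, the per-dimension sigmoid gate, and the parameters `gate_weight`, `gate_bias`, and `strength` are all hypothetical, chosen only to show how a hidden activation might be nudged along the difference between a target embedding and a converse embedding while the model's own weights stay untouched.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def steer_activation(hidden, target_emb, converse_emb,
                     gate_weight, gate_bias, strength=1.0):
    """Toy dimension-wise steering (hypothetical, not the paper's module).

    Each dimension of `hidden` is shifted along (target - converse),
    scaled by a per-dimension gate in (0, 1). In a real system the gate
    parameters would be learned; here they are plain lists of floats.
    """
    vectors = (hidden, target_emb, converse_emb, gate_weight, gate_bias)
    if len({len(v) for v in vectors}) != 1:
        raise ValueError("all vectors must share the same dimensionality")

    steered = []
    for h, t, c, w, b in zip(*vectors):
        direction = t - c           # points from converse toward target behavior
        gate = sigmoid(w * h + b)   # how strongly this dimension is modulated
        steered.append(h + strength * gate * direction)
    return steered


# Illustrative values only; real activations have thousands of dimensions.
hidden = [0.5, -1.0, 2.0]
target = [1.0, 0.0, 0.0]
converse = [0.0, 0.0, 1.0]
weights = [0.0, 0.0, 0.0]  # zero weights make every gate sigmoid(0) = 0.5
biases = [0.0, 0.0, 0.0]

steered = steer_activation(hidden, target, converse, weights, biases)
unchanged = steer_activation(hidden, target, converse, weights, biases,
                             strength=0.0)
```

Setting `strength=0.0` returns the original activation, which mirrors the abstract's point that control happens at inference time without modifying model weights. How SteerVLM actually computes its modulation and adapts it across layers is not specified in this listing.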
