Visual Sparse Steering (VS2): Unsupervised Adaptation for Image Classification using Sparsity-Guided Steering Vectors

Gerasimos Chatzoudis; Zhuowei Li; Gemma E. Moran; Hao Wang; Dimitris N. Metaxas

arXiv:2506.01247·cs.CV·April 16, 2026

TL;DR

VS2 is a lightweight, unsupervised test-time adaptation method that steers vision models with sparse autoencoder features. It uses no labels and no weight updates, and it improves zero-shot classification accuracy with only a forward pass.

Contribution

Introduces VS2, a label-free steering approach that intervenes at the feature level, in sparse autoencoder representations, for efficient test-time adaptation.

Findings

01

VS2 improves zero-shot accuracy on CIFAR-100, CUB-200, and Tiny-ImageNet.

02

VS2 requires only a forward pass, with no backpropagation or weight updates.

03

SAE reconstruction loss serves as a reliability measure: steering can be skipped when reconstruction is poor, falling back to the baseline.

Abstract

Steering vision foundation models at test time, without updating foundation-model weights or using labeled target data, is a desirable yet challenging goal. We present Visual Sparse Steering (VS2), a lightweight, label-free adaptation method that constructs a steering vector from sparse features extracted by a Sparse Autoencoder (SAE) trained on unlabeled in-domain training-split activations of the vision encoder. VS2 offers three key advantages over existing test-time adaptation methods: (1) a feature-level intervention space in sparse SAE representations; (2) efficiency, requiring only a forward pass with no test-time optimization or backpropagation; and (3) a reliability diagnostic based on SAE reconstruction loss that can skip steering when reconstruction is poor, enabling safe fallback to the baseline, a capability not standard in conventional steering vectors and test-time…
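As a rough illustration of the pipeline the abstract describes, the following NumPy sketch encodes an embedding with an SAE, checks the reconstruction error, and either adds a steering vector or falls back to the unmodified embedding. Everything specific here is an assumption made for illustration and is not taken from the paper: the ReLU encoder, the top-k feature selection, the relative-error threshold `max_rel_err`, the `scale` factor, and all function names.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    """Hypothetical SAE encoder: affine map followed by ReLU."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def sae_decode(z, W_dec, b_dec):
    """Hypothetical SAE decoder: affine map back to embedding space."""
    return z @ W_dec + b_dec

def steer(x, W_enc, b_enc, W_dec, b_dec, top_k=8, scale=1.0, max_rel_err=0.5):
    """Steer a single 1-D embedding x, or fall back to x unchanged.

    Returns (embedding, steered_flag). Forward pass only: no gradients,
    no weight updates. The top-k selection and the threshold are
    illustrative assumptions, not the paper's stated construction.
    """
    z = sae_encode(x, W_enc, b_enc)
    x_hat = sae_decode(z, W_dec, b_dec)

    # Reliability gate: relative reconstruction error of the SAE.
    rel_err = np.linalg.norm(x - x_hat) / (np.linalg.norm(x) + 1e-8)
    if rel_err > max_rel_err:
        return x, False  # poor reconstruction: skip steering

    # Keep only the k most active sparse features.
    k = min(top_k, z.size)
    idx = np.argsort(z)[z.size - k:]
    z_top = np.zeros_like(z)
    z_top[idx] = z[idx]

    # Map the selected features back to embedding space (no decoder bias)
    # and add the result to the original embedding.
    v = z_top @ W_dec
    return x + scale * v, True
```

When the SAE reconstructs the embedding well, the function returns a steered embedding and `True`; otherwise it returns the input unchanged and `False`, mirroring the safe fallback to the baseline described in the abstract.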
