Visual Sparse Steering (VS2): Unsupervised Adaptation for Image Classification using Sparsity-Guided Steering Vectors
Gerasimos Chatzoudis, Zhuowei Li, Gemma E. Moran, Hao Wang, Dimitris N. Metaxas

TL;DR
VS2 is a lightweight, unsupervised test-time adaptation method that steers vision models with sparse autoencoder features. It uses no labels and updates no weights, and it improves accuracy with only a forward pass.
Contribution
Introduces VS2, a novel, label-free, feature-level steering approach using sparse autoencoder representations for efficient test-time adaptation.
Findings
VS2 improves zero-shot accuracy on CIFAR-100, CUB-200, and Tiny-ImageNet.
It requires only a forward pass, with no backpropagation or weight updates.
SAE reconstruction loss provides a reliability measure for safe fallback.
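
The fallback idea in the last finding can be illustrated with a small sketch. It is not the authors' implementation: the single-layer ReLU SAE, the mean-squared-error measure, and the names `sae_reconstruction_error`, `gated_embedding`, and `max_error` are assumptions made for illustration, and the summary does not say how the threshold is chosen.

```python
import numpy as np


def sae_reconstruction_error(h, W_enc, b_enc, W_dec, b_dec):
    """Mean squared error between an embedding h and its SAE reconstruction.

    Assumes a single-layer ReLU sparse autoencoder (an illustrative choice).
    """
    z = np.maximum(h @ W_enc + b_enc, 0.0)  # sparse code
    h_hat = z @ W_dec + b_dec               # reconstruction
    return float(np.mean((h - h_hat) ** 2))


def gated_embedding(h, steered, recon_error, max_error):
    """Use the steered embedding only when the SAE reconstructs h well.

    max_error is a hypothetical threshold; above it we fall back to h.
    """
    return steered if recon_error <= max_error else h


# Toy example with an identity "SAE" so the numbers are easy to follow.
I = np.eye(2)
zero = np.zeros(2)

h_good = np.array([1.0, 2.0])   # reconstructed exactly
h_bad = np.array([-1.0, 2.0])   # ReLU drops the negative entry

err_good = sae_reconstruction_error(h_good, I, zero, I, zero)
err_bad = sae_reconstruction_error(h_bad, I, zero, I, zero)

out_good = gated_embedding(h_good, h_good + 1.0, err_good, max_error=0.1)
out_bad = gated_embedding(h_bad, h_bad + 1.0, err_bad, max_error=0.1)
```

In this toy case `h_good` is steered, while `h_bad` is returned unchanged because its reconstruction error exceeds the threshold.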
Abstract
Steering vision foundation models at test time, without updating foundation-model weights or using labeled target data, is a desirable yet challenging goal. We present Visual Sparse Steering (VS2), a lightweight, label-free adaptation method that constructs a steering vector from sparse features extracted by a Sparse Autoencoder (SAE) trained on unlabeled in-domain training-split activations of the vision encoder. VS2 offers three key advantages over existing test-time adaptation methods: (1) a feature-level intervention space in sparse SAE representations; (2) efficiency, requiring only a forward pass with no test-time optimization or backpropagation; and (3) a reliability diagnostic based on SAE reconstruction loss that can skip steering when reconstruction is poor, enabling safe fallback to the baseline, a capability not standard in conventional steering vectors and test-time…
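
As a rough illustration of building a steering vector from sparse SAE features in a single forward pass, the sketch below amplifies the strongest active features and decodes the change back into embedding space. The abstract excerpt does not specify the construction, so the top-k selection, the `gain` and `strength` parameters, and the function name `steer` are all assumptions rather than the paper's procedure.

```python
import numpy as np


def steer(h, W_enc, b_enc, W_dec, top_k=2, gain=2.0, strength=1.0):
    """Return (steered embedding, steering vector) for one embedding h.

    Assumed recipe (not confirmed by the source):
      1. encode h into a sparse code z with a ReLU SAE encoder,
      2. multiply the top_k largest activations by gain,
      3. decode the difference; the decoder bias cancels out,
      4. add the resulting vector to h, scaled by strength.
    No gradients or weight updates are involved.
    """
    z = np.maximum(h @ W_enc + b_enc, 0.0)
    top = np.argsort(z)[-top_k:]      # indices of the strongest features
    z_amp = z.copy()
    z_amp[top] *= gain
    v = (z_amp - z) @ W_dec           # steering vector in embedding space
    return h + strength * v, v


# Toy example: identity encoder/decoder, amplify the single largest feature.
I = np.eye(3)
h = np.array([1.0, 2.0, 3.0])
steered, v = steer(h, I, np.zeros(3), I, top_k=1, gain=2.0)
```

Here only the third feature is amplified, so the steering vector is nonzero in that coordinate alone.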
