One-shot Optimized Steering Vector for Hallucination Mitigation for VLMs

Youxu Shi; Suorong Yang; Dong Liu

arXiv:2601.23041·cs.CV·February 2, 2026

Open Access

TL;DR

This paper introduces OSGA, a one-shot, input-independent steering method for vision language models (VLMs) that learns a single steering vector from one optimization instance to mitigate hallucination and safety-related failures.

Contribution

The paper proposes OSGA, a novel one-shot optimization framework that learns a universal steering vector for VLMs, reducing the need for multiple optimizations and improving robustness.

Findings

01. OSGA improves hallucination mitigation across benchmarks.

02. A single steering vector enhances safety with negligible overhead.

03. The learned vector is applied universally at inference, independent of the input.

Abstract

Vision Language Models (VLMs) achieve strong performance on multimodal tasks but still suffer from hallucination and safety-related failures that persist even at scale. Steering offers a lightweight technique to improve model performance. However, steering, whether input-dependent or input-independent, struggles to achieve a meaningful trade-off between efficiency and effectiveness. In this work, we observe that steering vectors can generalize across inputs when tasks share aligned semantic intent. Based on this insight, we propose OSGA (One-shot Steering with Generative Anchor), an input-independent framework that improves model performance with a single optimization instance. OSGA first selects an informative sample via a variance-based data selection strategy and learns a single steering vector with a contrastive objective with generative anchor…
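To make "input-independent" concrete, the sketch below shows the generic activation-steering operation that such methods build on: one fixed vector is scaled and added to every hidden state, whatever the input. It is not the authors' implementation. The function name `apply_steering`, the scale `alpha`, and the toy three-dimensional hidden states are assumptions made for illustration, and the sketch does not cover how OSGA selects its sample or optimizes the vector.

```python
from typing import List

Vector = List[float]


def apply_steering(
    hidden_states: List[Vector],
    steering_vector: Vector,
    alpha: float = 1.0,
) -> List[Vector]:
    """Add the same scaled steering vector to every token's hidden state.

    Hypothetical helper: the additive form h' = h + alpha * v is the common
    steering formulation, assumed here rather than taken from the paper.
    """
    steered = []
    for h in hidden_states:
        if len(h) != len(steering_vector):
            raise ValueError("hidden state and steering vector dimensions differ")
        steered.append([x + alpha * v for x, v in zip(h, steering_vector)])
    return steered


# Toy data (assumed): two unrelated inputs with different token counts.
v = [0.5, -1.0, 0.0]
input_a = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
input_b = [[-1.0, 4.0, 2.0]]

# The same vector and scale are reused for both inputs; nothing is
# recomputed per input, which is what keeps inference overhead small.
steered_a = apply_steering(input_a, v, alpha=2.0)
steered_b = apply_steering(input_b, v, alpha=2.0)
```

In a real VLM the addition would typically happen inside a chosen transformer layer during the forward pass; which layer and which scale to use are design choices this sketch leaves open.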

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning