Knowledge Regularized Negative Feature Tuning of Vision-Language Models for Out-of-Distribution Detection

Wenjie Zhu; Yabin Zhang; Xin Jin; Wenjun Zeng; Lei Zhang

arXiv:2507.19847 · cs.CV · July 30, 2025

TL;DR

KR-NFT is a knowledge-regularized negative feature tuning method for vision-language models. It improves out-of-distribution detection by separating positive and negative text features and adapting them to each input image, and it outperforms negative prompt tuning, notably when training data is limited.

Contribution

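The paper proposes Knowledge Regularized Negative Feature Tuning (KR-NFT), which pairs an adaptation architecture, Negative Feature Tuning (NFT), with a knowledge-regularization (KR) optimization strategy. NFT separates positive and negative text features into distinct spaces, and the combined method improves both out-of-distribution (OOD) detection and in-distribution (ID) classification.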

Findings

01

KR-NFT outperforms traditional negative prompt tuning in efficiency and scalability.

02

It significantly reduces FPR95 (the false positive rate at a 95% true positive rate) by 5.44% when trained on few-shot ImageNet samples.

03

The method enhances OOD detection on unseen ID datasets while maintaining ID classification accuracy.
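
The FPR95 metric cited in the findings can be made concrete with a short sketch. It assumes the common convention that higher scores mean "more in-distribution" and that ID samples are the positive class; the function name `fpr_at_95_tpr` and the sample scores are illustrative and not taken from the paper.

```python
import numpy as np


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when 95% of ID samples are accepted.

    Assumes higher score = more likely in-distribution (an assumption of this sketch).
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold at the 5th percentile of ID scores, so about 95% of ID
    # samples score at or above it (true positive rate of roughly 95%).
    threshold = np.percentile(id_scores, 5)
    # False positives are OOD samples that also clear the threshold.
    return float(np.mean(ood_scores >= threshold))


# Illustrative scores only; these are not results from the paper.
id_scores = np.linspace(0.0, 1.0, 101)
ood_scores = np.array([0.0, 0.01, 0.5, 0.9])
print(fpr_at_95_tpr(id_scores, ood_scores))
```

A lower value is better: it means fewer OOD images slip through at the operating point where nearly all ID images are retained.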

Abstract

Out-of-distribution (OOD) detection is crucial for building reliable machine learning models. Although negative prompt tuning has enhanced the OOD detection capabilities of vision-language models, these tuned models often suffer from reduced generalization performance on unseen classes and styles. To address this challenge, we propose a novel method called Knowledge Regularized Negative Feature Tuning (KR-NFT), which integrates an innovative adaptation architecture termed Negative Feature Tuning (NFT) and a corresponding knowledge-regularization (KR) optimization strategy. Specifically, NFT applies distribution-aware transformations to pre-trained text features, effectively separating positive and negative features into distinct spaces. This separation maximizes the distinction between in-distribution (ID) and OOD images. Additionally, we introduce image-conditional learnable factors…
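
To illustrate why separating positive and negative text features helps distinguish ID from OOD images, here is a minimal sketch of a scoring rule often used with negative text features: the share of softmax mass that an image assigns to the positive (ID class) features. This is an assumed, generic scoring rule, not the paper's NFT transformation or its KR objective, which the truncated abstract does not specify. The function names, the `temperature` value, and the toy features are all hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def id_score(image_feat, pos_text_feats, neg_text_feats, temperature=0.01):
    """Softmax mass on positive text features; higher means more likely ID.

    Assumption of this sketch: a generic negative-feature scoring rule,
    not the method described in the paper.
    """
    img = l2_normalize(np.asarray(image_feat, dtype=float))
    pos = l2_normalize(np.asarray(pos_text_feats, dtype=float))
    neg = l2_normalize(np.asarray(neg_text_feats, dtype=float))
    # Cosine similarity of the image to every positive and negative feature.
    logits = np.concatenate([pos @ img, neg @ img]) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    # Sum the probability assigned to the positive (ID class) features.
    return float(probs[: len(pos)].sum())


# Toy 3-d features: two positive class features and one negative feature.
pos = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
neg = np.array([[0.0, 0.0, 1.0]])
print(id_score(np.array([1.0, 0.1, 0.0]), pos, neg))  # image close to a positive feature
print(id_score(np.array([0.0, 0.1, 1.0]), pos, neg))  # image close to the negative feature
```

The further apart the positive and negative features sit, the more cleanly this score splits the two kinds of image, which is the intuition behind the separation the abstract describes.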
