Delineating Knowledge Boundaries for Honest Large Vision-Language Models

Junru Song; Yimeng Hu; Yijing Chen; Huining Li; Qian Li; Lizhen Cui; Yuntao Du

arXiv:2604.26419·cs.CV·April 30, 2026

TL;DR

This paper introduces a framework to improve large vision-language models' ability to recognize their knowledge limits and refuse to answer unknown questions, enhancing trustworthiness.

Contribution

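It presents a systematic approach to delineating knowledge boundaries in VLMs: a model-specific "Visual-Idk" dataset is curated, then the model is aligned with supervised fine-tuning followed by preference-aware optimization. On the Visual-Idk dataset, the Truthful Rate rises from 57.9% to 67.3%.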

Findings

01

Truthful Rate on the Visual-Idk dataset increased from 57.9% to 67.3%.

02

Model genuinely recognizes its boundaries, not just refusal patterns.

03

Framework generalizes to medical and perceptual domains.

Abstract

Large Vision-Language Models (VLMs) have achieved remarkable multimodal performance yet remain prone to factual hallucinations, particularly in long-tail or specialized domains. Moreover, current models exhibit a weak capacity to refuse queries that exceed their parametric knowledge. In this paper, we propose a systematic framework to enhance the refusal capability of VLMs when facing such unknown questions. We first curate a model-specific "Visual-Idk" (Visual-I don't know) dataset, leveraging multi-sample consistency probing to distinguish between known and unknown facts. We then align the model using supervised fine-tuning followed by preference-aware optimization (e.g., DPO, ORPO) to effectively delineate its knowledge boundaries. Results on the Visual-Idk dataset show our method improves the Truthful Rate from 57.9% to 67.3%. Additionally, internal probing demonstrates that…
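
The multi-sample consistency probing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the exact-match comparison, the sample count, the 0.8 threshold, and the refusal string are all assumptions chosen for illustration, and `sample_fn` stands in for any stochastic VLM decoding call. The idea is to sample several answers to the same image-question pair, measure how often they match the reference, and label the fact "known" or "unknown" for that particular model.

```python
from typing import Any, Callable, Dict


def normalize(answer: str) -> str:
    """Lowercase, trim, drop a trailing period, and collapse whitespace."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def consistency_label(
    sample_fn: Callable[[Any, str], str],  # hypothetical: one sampled VLM answer
    image: Any,
    question: str,
    reference: str,
    n_samples: int = 10,      # assumed sample count, not from the paper
    threshold: float = 0.8,   # assumed cut-off, not from the paper
) -> Dict[str, Any]:
    """Sample the model repeatedly and label the fact as known or unknown."""
    target = normalize(reference)
    answers = [normalize(sample_fn(image, question)) for _ in range(n_samples)]
    accuracy = sum(a == target for a in answers) / n_samples
    label = "known" if accuracy >= threshold else "unknown"
    return {"accuracy": accuracy, "label": label}


def build_sft_target(label: str, reference: str,
                     refusal: str = "I don't know.") -> str:
    """Keep the reference answer for known facts; use a refusal otherwise."""
    return reference if label == "known" else refusal
```

Under these assumptions, the resulting labels are model-specific: the same question may be "known" for one VLM and "unknown" for another, which is why such a dataset would have to be rebuilt per model before fine-tuning.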
