Delineating Knowledge Boundaries for Honest Large Vision-Language Models
Junru Song, Yimeng Hu, Yijing Chen, Huining Li, Qian Li, Lizhen Cui, Yuntao Du

TL;DR
This paper introduces a framework to improve large vision-language models' ability to recognize their knowledge limits and refuse to answer unknown questions, enhancing trustworthiness.
Contribution
It presents a systematic approach to delineating knowledge boundaries in VLMs: a model-specific "Visual-Idk" dataset built with multi-sample consistency probing, followed by supervised fine-tuning and preference-aware optimization (e.g., DPO, ORPO).
Findings
Truthful Rate on the Visual-Idk dataset increased from 57.9% to 67.3%.
The model genuinely recognizes its knowledge boundaries rather than merely learning refusal patterns.
Framework generalizes to medical and perceptual domains.
Abstract
Large Vision-Language Models (VLMs) have achieved remarkable multimodal performance yet remain prone to factual hallucinations, particularly in long-tail or specialized domains. Moreover, current models exhibit a weak capacity to refuse queries that exceed their parametric knowledge. In this paper, we propose a systematic framework to enhance the refusal capability of VLMs when facing such unknown questions. We first curate a model-specific "Visual-Idk" (Visual-I don't know) dataset, leveraging multi-sample consistency probing to distinguish between known and unknown facts. We then align the model using supervised fine-tuning followed by preference-aware optimization (e.g., DPO, ORPO) to effectively delineate its knowledge boundaries. Results on the Visual-Idk dataset show our method improves the Truthful Rate from 57.9% to 67.3%. Additionally, internal probing demonstrates that…
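The multi-sample consistency probing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample count, the 0.5 threshold, and the substring-based answer matching are all assumptions chosen for illustration, and the model is replaced by a stub that returns canned answers.

```python
from itertools import cycle
from typing import Callable, Iterable


def consistency_label(
    sample_answer: Callable[[], str],
    gold_answer: str,
    num_samples: int = 10,   # assumed value, not taken from the paper
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> str:
    """Label a question 'known' or 'unknown' for one specific model.

    sample_answer is a hypothetical callable that returns one sampled
    answer from the model for a fixed (image, question) pair.
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")

    gold = gold_answer.strip().lower()
    hits = 0
    for _ in range(num_samples):
        answer = sample_answer().strip().lower()
        # Assumed matching rule: the gold answer appears in the sample.
        if gold in answer:
            hits += 1

    accuracy = hits / num_samples
    return "known" if accuracy >= threshold else "unknown"


def make_stub(answers: Iterable[str]) -> Callable[[], str]:
    """Stand-in for a real model: cycles through canned answers."""
    it = cycle(answers)
    return lambda: next(it)
```

Under these assumptions, questions labelled "unknown" would receive an "I don't know" style target in the model-specific dataset, while "known" questions would keep their factual answer. Because the label depends on the sampled model's own answers, the same question can be "known" for one model and "unknown" for another.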
