Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal

Yuhao Wang; Zhiyuan Zhu; Heyang Liu; Yusheng Liao; Hongcheng Liu; Yanfeng Wang; Yu Wang

arXiv:2412.11196·cs.CL·December 17, 2024

Open Access

TL;DR

This paper introduces InBoL, a novel framework that enhances the trustworthiness of multimodal large language models by enabling them to refuse to answer when information is insufficient, thereby reducing hallucinations.

Contribution

InBoL systematically defines refusal conditions for MLLMs using information boundaries and develops a training pipeline to improve refusal responses.

Findings

01. Significant improvement in refusal accuracy.

02. Maintains model helpfulness.

03. Advances trustworthiness of MLLMs.

Abstract

Multimodal large language models (MLLMs) excel at multimodal perception and understanding, yet their tendency to generate hallucinated or inaccurate responses undermines their trustworthiness. Existing methods have largely overlooked the importance of refusal responses as a means of enhancing MLLM reliability. To bridge this gap, we present the Information Boundary-aware Learning Framework (InBoL), a novel approach that empowers MLLMs to refuse to answer user queries when encountering insufficient information. To the best of our knowledge, InBoL is the first framework that systematically defines the conditions under which refusal is appropriate for MLLMs using the concept of information boundaries proposed in our paper. This framework introduces a comprehensive data generation pipeline and tailored training strategies to improve the model's ability to deliver appropriate refusal…

Taxonomy

Topics: Access Control and Trust