BIMA: Bijective Maximum Likelihood Learning Approach to Hallucination Prediction and Mitigation in Large Vision-Language Models

Huu-Thien Tran; Thanh-Dat Truong; Khoa Luu

arXiv:2505.24649 · cs.CV · June 2, 2025

TL;DR

This paper introduces BIMA, a novel bijective maximum likelihood learning method based on normalizing flows, to effectively reduce hallucinations in large vision-language models, improving their reliability and interpretability.

Contribution

BIMA is the first approach to apply bijective normalizing flow techniques for hallucination mitigation in vision-language models, offering a new direction for trustworthy AI systems.
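
For context, normalizing flows rest on the change-of-variables identity sketched below. This is the generic textbook form, not BIMA's specific training objective; the symbols $f_\theta$ (an invertible map), $p_Z$ (a simple base density), $x$, and $N$ are illustrative assumptions rather than notation taken from the paper.

```latex
% Exact log-likelihood under a bijection z = f_\theta(x)
\log p_X(x) = \log p_Z\bigl(f_\theta(x)\bigr)
            + \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|

% Maximum likelihood learning over N training samples x_1, \dots, x_N
\theta^{*} = \arg\max_{\theta} \sum_{i=1}^{N} \log p_X(x_i)
```

Because $f_\theta$ is bijective, the likelihood is exact rather than a lower bound, which is what makes direct maximum likelihood training possible.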

Findings

01. Achieves an average F1 score of 85.06% on the POPE benchmark

02. Reduces the CHAIR_S and CHAIR_I hallucination metrics by 7.6% and 2.6%, respectively

03. Demonstrates significant improvements over existing methods
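
To make the reported metrics concrete, here is a minimal sketch of how CHAIR_S, CHAIR_I, and F1 are commonly computed. It follows the standard definitions (CHAIR_S: fraction of captions containing at least one hallucinated object; CHAIR_I: fraction of mentioned objects that are hallucinated) and is not the paper's evaluation code. The function names and the toy data are hypothetical.

```python
from typing import List, Set, Tuple


def chair_scores(captions: List[Tuple[Set[str], Set[str]]]) -> Tuple[float, float]:
    """Return (CHAIR_S, CHAIR_I) for a list of captions.

    Each item is (objects mentioned in the caption, objects actually in the image).
    Hypothetical helper: object extraction from text is assumed to be done upstream.
    """
    hallucinated_mentions = 0
    total_mentions = 0
    captions_with_hallucination = 0
    for mentioned, ground_truth in captions:
        hallucinated = mentioned - ground_truth  # mentioned but not present
        hallucinated_mentions += len(hallucinated)
        total_mentions += len(mentioned)
        if hallucinated:
            captions_with_hallucination += 1
    chair_s = captions_with_hallucination / len(captions) if captions else 0.0
    chair_i = hallucinated_mentions / total_mentions if total_mentions else 0.0
    return chair_s, chair_i


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from binary counts, as used for yes/no benchmarks such as POPE."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (made up for illustration): the first caption hallucinates "frisbee".
toy = [
    ({"dog", "frisbee"}, {"dog"}),
    ({"cat"}, {"cat", "sofa"}),
]
chair_s, chair_i = chair_scores(toy)
```

Lower CHAIR values mean fewer hallucinated objects, while a higher F1 means better agreement with the ground-truth yes/no answers.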

Abstract

Large vision-language models have been widely adopted across various domains. However, developing a trustworthy system remains a significant challenge because large-scale models offer little interpretability. One of the most prevalent failure modes of these systems is hallucination, where the language model generates a response that does not correspond to the visual content. Several approaches have been developed to mitigate this problem, and one prominent direction is to improve the decoding process. In this paper, we propose a new Bijective Maximum Likelihood Learning (BIMA) approach to hallucination mitigation using normalizing flow theories. The proposed BIMA method can efficiently mitigate the hallucination problem in prevailing vision-language models, resulting in significant improvements. Notably, BIMA achieves the…

Taxonomy

Topics: COVID-19 diagnosis using AI · Anomaly Detection Techniques and Applications · Brain Tumor Detection and Classification