Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models

Ruiyang Zhang; Hu Zhang; Hao Fei; Zhedong Zheng

arXiv:2506.07575·cs.CV·June 10, 2025

TL;DR

Uncertainty-o is a versatile framework that evaluates and quantifies uncertainty in large multimodal models, improving their reliability in tasks like hallucination detection and reasoning.

Contribution

It introduces a model-agnostic method to reveal and quantify uncertainty in multimodal models, addressing evaluation, prompting, and downstream application challenges.

Findings

01. Effective uncertainty estimation across 18 benchmarks and 10 models.

02. Improved performance in hallucination detection and mitigation.

03. Enhanced reasoning with uncertainty-aware Chain-of-Thought.

Abstract

Large Multimodal Models (LMMs), harnessing the complementarity among diverse modalities, are often considered more robust than pure Large Language Models (LLMs); yet do LMMs know what they do not know? Three key questions remain open: (1) how to evaluate the uncertainty of diverse LMMs in a unified manner, (2) how to prompt LMMs to show their uncertainty, and (3) how to quantify uncertainty for downstream tasks. To address these challenges, we introduce Uncertainty-o: (1) a model-agnostic framework designed to reveal uncertainty in LMMs regardless of their modalities, architectures, or capabilities, (2) an empirical exploration of multimodal prompt perturbations to uncover LMM uncertainty, offering insights and findings, and (3) a formulation of multimodal semantic uncertainty, which enables quantifying uncertainty from multimodal responses. Experiments…
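The abstract does not spell out the formulation of multimodal semantic uncertainty, so the following is only a generic sketch of one common way to turn a set of sampled responses into an uncertainty score: group responses that mean the same thing, then take the entropy of the group proportions. It is not the paper's method. The function names (`cluster_responses`, `semantic_entropy`) and the `same_meaning` predicate are hypothetical, and the normalised string match used in the usage example is a stand-in for whatever semantic equivalence check a real system would use.

```python
import math
from typing import Callable, List, Sequence


def cluster_responses(
    responses: Sequence[str],
    same_meaning: Callable[[str, str], bool],
) -> List[List[str]]:
    """Greedily group responses; each response joins the first cluster
    whose representative (first member) it matches under `same_meaning`."""
    clusters: List[List[str]] = []
    for response in responses:
        for cluster in clusters:
            if same_meaning(cluster[0], response):
                cluster.append(response)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([response])
    return clusters


def semantic_entropy(
    responses: Sequence[str],
    same_meaning: Callable[[str, str], bool],
) -> float:
    """Entropy (in nats) of the empirical distribution over meaning clusters.
    0.0 means every response agreed; larger values mean more disagreement."""
    if not responses:
        raise ValueError("need at least one response")
    clusters = cluster_responses(responses, same_meaning)
    total = len(responses)
    return sum(
        -(len(c) / total) * math.log(len(c) / total) for c in clusters
    )


if __name__ == "__main__":
    # Assumption: these strings stand for answers sampled from a model
    # under several perturbed versions of the same prompt.
    answers = ["A cat", "a cat ", "A dog", "a dog"]

    def same(a: str, b: str) -> bool:
        # Placeholder equivalence check: case- and whitespace-insensitive match.
        return a.strip().lower() == b.strip().lower()

    print(semantic_entropy(answers, same))
```

In this sketch the responses would come from querying a model several times under perturbed prompts; a score near zero suggests the answers are stable, while a higher score suggests the model's answer depends on the perturbation.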

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)