Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume

Gregory Kang Ruey Lau; Hieu Dao; Nicole Kan Hui Lin; Bryan Kian Hsiang Low

arXiv:2602.24195·cs.AI·March 2, 2026

Open Access

TL;DR

This paper introduces UMPIRE, a training-free uncertainty quantification method for Multimodal Large Language Models that effectively detects errors and calibrates uncertainty across various modalities without external tools.

Contribution

UMPIRE is a novel, efficient, modality-agnostic uncertainty metric based on incoherence-adjusted semantic volume, requiring no additional training or external tools.

Findings

1. UMPIRE outperforms baseline metrics in error detection.
2. UMPIRE provides better uncertainty calibration across multiple modalities.
3. UMPIRE generalizes to non-text output tasks.

Abstract

Despite their capabilities, Multimodal Large Language Models (MLLMs) may produce plausible but erroneous outputs, hindering reliable deployment. Accurate uncertainty metrics could enable escalation of unreliable queries to human experts or larger models for improved performance. However, existing uncertainty metrics have practical constraints, such as being designed only for specific modalities, reliant on external tools, or computationally expensive. We introduce UMPIRE, a training-free uncertainty quantification framework for MLLMs that works efficiently across various input and output modalities without external tools, relying only on the models' own internal modality features. UMPIRE computes the incoherence-adjusted semantic volume of sampled MLLM responses for a given task instance, effectively capturing both the global semantic diversity of samples and the local incoherence of…
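To make the idea of a "semantic volume" concrete, the following is a minimal sketch and not the paper's formulation, since the abstract above is truncated before it defines the incoherence term. It assumes each sampled response has an embedding vector and a model-assigned probability, measures global diversity as the log-determinant of the Gram matrix of unit-normalized embeddings, and inflates each sample by a hypothetical incoherence weight `1 + alpha * (1 - p)`. The function names, the weight formula, and the `alpha` parameter are all illustrative assumptions.

```python
import math


def unit(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("embedding must be non-zero")
    return [x / norm for x in vec]


def log_det_spd(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            acc = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - acc
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                chol[i][i] = math.sqrt(pivot)
                total += math.log(pivot)  # log of the squared diagonal entry
            else:
                chol[i][j] = (m[i][j] - acc) / chol[j][j]
    return total


def adjusted_semantic_volume(embeddings, probs, alpha=1.0):
    """Hypothetical score: log det(I + D G D).

    G is the Gram matrix of unit-normalized embeddings (global diversity).
    D is diagonal with entries 1 + alpha * (1 - p_i), an assumed stand-in
    for per-sample incoherence; the paper's actual adjustment may differ.
    """
    if len(embeddings) != len(probs):
        raise ValueError("need one probability per embedding")
    units = [unit(e) for e in embeddings]
    weights = [1.0 + alpha * (1.0 - p) for p in probs]
    n = len(units)
    m = [
        [
            weights[i] * weights[j] * sum(a * b for a, b in zip(units[i], units[j]))
            + (1.0 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return log_det_spd(m)


# Three samples that agree semantically versus three that point in
# mutually orthogonal directions, all with the same sample probability.
consistent = adjusted_semantic_volume([[1.0, 0.0, 0.0]] * 3, [0.9, 0.9, 0.9])
diverse = adjusted_semantic_volume(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.9, 0.9, 0.9]
)
```

Under these assumptions, `diverse` is larger than `consistent`, and lowering the sample probabilities raises the score further. A deployment could compare such a score against a validation-chosen threshold to decide when to escalate a query to a human expert or a larger model, as the abstract suggests.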

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Topic Modeling