Dynamic Modality Scheduling for Multimodal Large Models via Confidence, Uncertainty, and Semantic Consistency

Hiroshi Tanaka; Anika Rao; Hana Satou; Michael Johnson; Sofia García

arXiv:2506.12724·cs.CV·June 17, 2025

TL;DR

This paper introduces Dynamic Modality Scheduling (DMS), a framework that adaptively weights modalities in multimodal large models based on confidence, uncertainty, and semantic consistency, improving robustness and performance.

Contribution

The paper presents a novel adaptive modality weighting method for MLLMs that enhances robustness and performance by evaluating each modality's reliability at the instance level.

Findings

01. DMS improves performance on vision-language tasks under modality noise.

02. DMS enhances robustness against modality dropout and corruption.

03. The method is compatible with models like BLIP-2 and LLaVA.

Abstract

Multimodal Large Models (MLLMs) have achieved remarkable progress in vision-language understanding and generation tasks. However, existing MLLMs typically rely on static modality fusion strategies, which treat all modalities equally regardless of their instance-level reliability or semantic contribution. This often leads to suboptimal performance, especially in scenarios with noisy, missing, or misaligned modalities. In this paper, we propose Dynamic Modality Scheduling (DMS), a novel framework that adaptively adjusts the contribution of each modality at a per-sample level. DMS evaluates each modality based on three key factors: (1) confidence, estimated from predictive entropy; (2) uncertainty, obtained via Monte Carlo dropout; and (3) semantic consistency, computed through inter-modal similarity. These signals are combined through a learnable or rule-based…
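To make the three signals named in the abstract concrete, here is a minimal rule-based sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract is cut off before it describes how the signals are combined, so the linear score (`alpha`, `beta`, `gamma`), the softmax over modalities, and every function name below are hypothetical. The Monte Carlo dropout passes are represented as precomputed probability vectors rather than real model calls.

```python
import math

def entropy_confidence(probs):
    """Confidence in [0, 1]: one minus the normalized predictive entropy."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - h / math.log(len(probs))

def mc_dropout_uncertainty(samples):
    """Mean per-class variance across stochastic (dropout-enabled) passes.

    `samples` holds one probability vector per pass; here they are supplied
    directly instead of being produced by a real model.
    """
    t, k = len(samples), len(samples[0])
    total = 0.0
    for j in range(k):
        mean_j = sum(s[j] for s in samples) / t
        total += sum((s[j] - mean_j) ** 2 for s in samples) / t
    return total / k

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def consistency(name, embeddings):
    """Average similarity of one modality's embedding to all the others."""
    others = [e for n, e in embeddings.items() if n != name]
    return sum(cosine(embeddings[name], e) for e in others) / len(others)

def schedule_weights(modalities, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed rule: score = alpha*conf - beta*unc + gamma*cons, then softmax."""
    embeddings = {n: m["embedding"] for n, m in modalities.items()}
    scores = {
        n: alpha * entropy_confidence(m["probs"])
        - beta * mc_dropout_uncertainty(m["mc_samples"])
        + gamma * consistency(n, embeddings)
        for n, m in modalities.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {n: math.exp(s - top) for n, s in scores.items()}
    z = sum(exps.values())
    return {n: e / z for n, e in exps.items()}

# Toy sample: a noisy image branch and a confident, stable text branch.
sample = {
    "image": {
        "probs": [0.4, 0.3, 0.3],
        "mc_samples": [[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]],
        "embedding": [0.2, 0.9, 0.1],
    },
    "text": {
        "probs": [0.9, 0.05, 0.05],
        "mc_samples": [[0.9, 0.05, 0.05], [0.88, 0.07, 0.05], [0.91, 0.04, 0.05]],
        "embedding": [0.3, 0.8, 0.2],
    },
}
weights = schedule_weights(sample)
print(weights)
```

With only two modalities the consistency term is identical for both and cancels in the softmax, so the weights in this toy sample are driven by confidence and uncertainty alone; the term starts to discriminate once three or more modalities are scored.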

Taxonomy

Topics: Semantic Web and Ontologies · AI-based Problem Solving and Planning · Constraint Satisfaction and Optimization