Rethinking Explainability in the Era of Multimodal AI

Chirag Agarwal

arXiv:2506.13060 · cs.AI · June 17, 2025

TL;DR

This paper emphasizes the importance of developing multimodal explanations that capture cross-modal interactions, criticizing unimodal methods and proposing principles for more faithful and stable explanations in multimodal AI systems.

Contribution

The paper introduces three principles for multimodal explanations (Granger-style influence, synergistic faithfulness, and unified stability) intended to improve the interpretability of multimodal models.

Findings

01. Unimodal explanations fail to capture cross-modal influences.

02. Proposed principles guide the development of more faithful multimodal explanations.

03. Enhanced explanations can uncover shortcuts and reduce modality bias.
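
The first finding can be illustrated with a toy example. The sketch below is not the paper's method; it is a minimal illustration under stated assumptions: `toy_model` is a hypothetical two-modality scorer whose output is the product of a text feature and an image feature, the baseline value of 0.0 stands in for an "absent" modality, and "unimodal explanation" is operationalized here as the effect of one modality measured while the other is held at baseline. All function names are made up for this example.

```python
# Toy illustration (assumption: not taken from the paper).
# A model driven purely by a cross-modal interaction gets zero credit
# from per-modality effects measured in isolation.

BASELINE = 0.0  # assumed stand-in for a missing or masked modality


def toy_model(text_feat: float, image_feat: float) -> float:
    """Hypothetical scorer: the output depends only on the product of the inputs."""
    return text_feat * image_feat


def isolated_text_effect(model, text_feat: float, baseline: float = BASELINE) -> float:
    """Effect of the text feature while the image is held at baseline."""
    return model(text_feat, baseline) - model(baseline, baseline)


def isolated_image_effect(model, image_feat: float, baseline: float = BASELINE) -> float:
    """Effect of the image feature while the text is held at baseline."""
    return model(baseline, image_feat) - model(baseline, baseline)


def interaction_effect(model, text_feat: float, image_feat: float,
                       baseline: float = BASELINE) -> float:
    """Second-order difference: what remains after removing both isolated effects."""
    return (
        model(text_feat, image_feat)
        - model(text_feat, baseline)
        - model(baseline, image_feat)
        + model(baseline, baseline)
    )


text_feat, image_feat = 1.0, 1.0
total_change = toy_model(text_feat, image_feat) - toy_model(BASELINE, BASELINE)
text_only = isolated_text_effect(toy_model, text_feat)
image_only = isolated_image_effect(toy_model, image_feat)
cross_modal = interaction_effect(toy_model, text_feat, image_feat)

print(total_change)  # 1.0
print(text_only)     # 0.0
print(image_only)    # 0.0
print(cross_modal)   # 1.0
```

In this toy setting the two isolated effects sum to zero even though the output changes by 1.0; the whole change sits in the interaction term, which is the kind of cross-modal influence that modality-by-modality explanations cannot see.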

Abstract

While multimodal AI systems (models jointly trained on heterogeneous data types such as text, time series, graphs, and images) have become ubiquitous and achieved remarkable performance across high-stakes applications, transparent and accurate explanation algorithms are crucial for their safe deployment and for ensuring user trust. However, most existing explainability techniques remain unimodal, generating modality-specific feature attributions, concepts, or circuit traces in isolation and thus failing to capture cross-modal interactions. This paper argues that such unimodal explanations systematically misrepresent the cross-modal influence that drives multimodal model decisions, and that the community should stop relying on them for interpreting multimodal models. To support our position, we outline key principles for multimodal explanations grounded in modality:…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Explainable Artificial Intelligence (XAI)