Feature-level Interaction Explanations in Multimodal Transformers
Yeji Kim, Housam Khalifa Bashier Babiker, Mi-Young Kim, and Randy Goebel

TL;DR
This paper introduces FL-I2MoE, a structured explanation method for multimodal Transformers that identifies and quantifies feature interactions, such as synergy and redundancy, to improve interpretability and faithfulness.
Contribution
We propose FL-I2MoE, a novel Mixture-of-Experts layer that explicitly separates and explains feature interactions in multimodal Transformers, enhancing interpretability.
Findings
FL-I2MoE produces more interaction-specific importance patterns.
Removing high-scoring pairs degrades model performance more than random removal.
The method effectively quantifies synergistic and redundant feature pairs.
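To make the synergy/redundancy distinction concrete, here is a minimal sketch of a generic pairwise interaction probe. It is not the paper's estimator. It averages the mixed second-order difference f(S ∪ {i, j}) − f(S ∪ {i}) − f(S ∪ {j}) + f(S) over random contexts S, sampling each remaining feature with probability 0.5 (uniform subset sampling, which differs from Shapley weighting). The function name `interaction_probe` and the three toy value functions are hypothetical, chosen only for illustration. Reading positive values as synergy-like and negative values as redundancy-like is the assumed sign convention.

```python
import random


def interaction_probe(f, n_features, i, j, n_samples=2000, seed=0):
    """Monte Carlo estimate of the mixed second-order difference for pair (i, j).

    f maps a set of "present" feature indices to a scalar model output.
    """
    rng = random.Random(seed)
    others = [k for k in range(n_features) if k not in (i, j)]
    total = 0.0
    for _ in range(n_samples):
        # Random context: each remaining feature is kept with probability 0.5.
        context = frozenset(k for k in others if rng.random() < 0.5)
        total += (
            f(context | {i, j})
            - f(context | {i})
            - f(context | {j})
            + f(context)
        )
    return total / n_samples


# Hypothetical toy value functions over a set of present feature indices.
def synergy_model(present):
    # Features 0 and 1 contribute only when both are present.
    return 1.0 if {0, 1} <= present else 0.0


def redundancy_model(present):
    # Either feature 0 or feature 1 is sufficient; the second adds nothing.
    return 1.0 if (0 in present or 1 in present) else 0.0


def additive_model(present):
    # Each feature contributes independently, so there is no interaction.
    return 0.5 * (0 in present) + 0.5 * (1 in present)


synergy_score = interaction_probe(synergy_model, 4, 0, 1)
redundancy_score = interaction_probe(redundancy_model, 4, 0, 1)
additive_score = interaction_probe(additive_model, 4, 0, 1)
```

Under these toy definitions the synergistic pair scores positive, the redundant pair scores negative, and the purely additive pair scores zero, which is the qualitative behavior a pairwise probe is meant to separate.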
Abstract
Multimodal Transformers often produce predictions without clarifying how different modalities jointly support a decision. Most existing multimodal explainable AI (MXAI) methods extend unimodal saliency to multimodal backbones, highlighting important tokens or patches within each modality, but they rarely pinpoint which cross-modal feature pairs provide complementary evidence (synergy) or serve as reliable backups (redundancy). We present Feature-level I2MoE (FL-I2MoE), a structured Mixture-of-Experts layer that operates directly on token/patch sequences from frozen pretrained encoders and explicitly separates unique, synergistic, and redundant evidence at the feature level. We further develop an expert-wise explanation pipeline that combines attribution with top-K% masking to assess faithfulness, and we introduce Monte Carlo interaction probes to quantify pairwise behavior: the Shapley…
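The top-K% masking check mentioned in the abstract can be illustrated with a small sketch. This is an assumed, simplified version: the helper names (`top_k_percent`, `mask`, `masking_drops`), the zero baseline used for masked features, and the linear toy model are all hypothetical and are not taken from the paper. The idea is that if attribution scores are faithful, masking the top-K% scored features should reduce the model output more than masking the same number of randomly chosen features.

```python
import random


def top_k_percent(scores, k_percent):
    """Return indices of the highest-scoring features (at least one)."""
    n_mask = max(1, round(len(scores) * k_percent / 100))
    order = sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)
    return order[:n_mask]


def mask(features, indices, baseline=0.0):
    """Copy the features and replace the selected positions with a baseline."""
    masked = list(features)
    for idx in indices:
        masked[idx] = baseline
    return masked


def masking_drops(model, features, scores, k_percent, n_random=200, seed=0):
    """Compare the output drop from top-K% masking with random masking."""
    rng = random.Random(seed)
    full = model(features)
    top = top_k_percent(scores, k_percent)
    top_drop = full - model(mask(features, top))
    random_drops = []
    for _ in range(n_random):
        # Mask the same number of features, chosen uniformly at random.
        idx = rng.sample(range(len(features)), len(top))
        random_drops.append(full - model(mask(features, idx)))
    return top_drop, sum(random_drops) / n_random


# Hypothetical linear toy model standing in for a real multimodal backbone.
weights = [4.0, 0.5, 3.0, 0.1, 0.2, 0.1, 0.05, 0.05, 0.5, 0.5]


def toy_model(x):
    return sum(w * v for w, v in zip(weights, x))


features = [1.0] * len(weights)
scores = list(weights)  # assume the attribution method recovers the weights
top_drop, random_drop = masking_drops(toy_model, features, scores, k_percent=20)
```

In this toy setting the attribution scores are exact by construction, so the top-K% drop exceeds the average random drop. With a real model, the gap between the two quantities is what a faithfulness evaluation would report.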
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
