Visual Explanations from Hadamard Product in Multimodal Deep Networks
Jin-Hwa Kim, Byoung-Tak Zhang

TL;DR
This paper demonstrates that the Hadamard product in multimodal deep networks acts as an implicit attention mechanism for both visual and textual inputs, and introduces a gradient-based visualization method to analyze this effect.
Contribution
It extends Kim et al. (2016) by showing that the Hadamard product plays an attentional role for both modalities, and it proposes a gradient-based visualization technique for interpreting that role.
Findings
The Hadamard product implicitly performs attention for visual and textual inputs simultaneously.
The method visualizes attention for both visual and textual inputs.
Comparison with learned attention weights validates the approach.
Abstract
Visual explanations of a model's learned representations help in understanding the fundamentals of learning. Attentional models in previous work visualized the attended regions over an image or text using their learned weights to confirm the intended mechanism. Kim et al. (2016) show that the Hadamard product in multimodal deep networks, which is well known as a joint function for visual question answering tasks, implicitly performs an attentional mechanism for visual inputs. In this work, we extend their work to show that the Hadamard product in multimodal deep networks performs an attentional mechanism not only for visual inputs but also for textual inputs simultaneously, using the proposed gradient-based visualization technique. The attentional effect of the Hadamard product is visualized for both visual and textual inputs by analyzing the two inputs and an output of the Hadamard product with the…
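To make the abstract's claim concrete, here is a minimal sketch of why a Hadamard-product joint function can behave like attention under a gradient-based view. It is not the authors' procedure: the linear projections `U` and `V`, the readout weights `w`, and the scalar score `s = w · ((U q) ∘ (V v))` are all assumptions chosen for illustration, with no nonlinearity and toy dimensions. Under those assumptions, the gradient of the score with respect to one modality is scaled elementwise by the projection of the other modality.

```python
def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def hadamard(a, b):
    """Elementwise (Hadamard) product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]


def score_and_grads(U, V, w, q, v):
    """Return s = w . ((U q) * (V v)) and its gradients w.r.t. q and v.

    U, V, w are hypothetical parameters; q is a textual feature vector
    and v is a visual feature vector (both illustrative).
    """
    pq = matvec(U, q)          # projected textual features
    pv = matvec(V, v)          # projected visual features
    joint = hadamard(pq, pv)   # Hadamard-product joint representation
    s = sum(wk * jk for wk, jk in zip(w, joint))
    # ds/dq = U^T (w * (V v)): gated by the visual projection
    grad_q = matvec(transpose(U), hadamard(w, pv))
    # ds/dv = V^T (w * (U q)): gated by the textual projection
    grad_v = matvec(transpose(V), hadamard(w, pq))
    return s, grad_q, grad_v


# Toy example with identity projections (assumed values).
identity = [[1, 0], [0, 1]]
s, grad_q, grad_v = score_and_grads(identity, identity, [1, 1], [2, 0], [3, 5])
print(s, grad_q, grad_v)
```

In this toy case the second component of `grad_v` is zero because the textual projection is zero there, so the text input suppresses that visual dimension, and the same holds in the other direction. A gradient-magnitude map over image regions or word tokens would inherit this gating, which is the intuition behind reading the Hadamard product as implicit attention for both inputs.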
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
