Interpretable Tensor Fusion
Saurabh Varshneya, Antoine Ledent, Philipp Liznerski, Andriy Balinskyy, Purvanshi Mehta, Waleed Mustafa, Marius Kloft

TL;DR
Interpretable Tensor Fusion (InTense) is a multimodal neural network method that learns to fuse diverse data types such as text, images, and audio. It provides interpretability by assigning relevance scores to modalities and their interactions, and it outperforms state-of-the-art approaches in accuracy.
Contribution
The paper introduces InTense, a theoretically grounded multimodal fusion method that captures both linear and multiplicative interactions and offers interpretability out of the box.
Findings
InTense outperforms state-of-the-art methods in accuracy.
InTense provides meaningful relevance scores on real-world datasets.
The approach effectively disentangles modality interactions.
Abstract
Conventional machine learning methods are predominantly designed to predict outcomes based on a single data type. However, practical applications may encompass data of diverse types, such as text, images, and audio. We introduce interpretable tensor fusion (InTense), a multimodal learning method for training neural networks to simultaneously learn multimodal data representations and their interpretable fusion. InTense can separately capture both linear combinations and multiplicative interactions of diverse data types, thereby disentangling higher-order interactions from the individual effects of each modality. InTense provides interpretability out of the box by assigning relevance scores to modalities and their associations. The approach is theoretically grounded and yields meaningful relevance scores on multiple synthetic and real-world datasets. Experiments on six real-world datasets…
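To make the distinction between linear combinations and multiplicative interactions concrete, here is a minimal NumPy sketch of a two-modality fusion layer. It is an illustration under stated assumptions, not the authors' implementation: the function names `fuse` and `relevance_scores`, the weight matrices `W1`, `W2`, `W12`, and the choice to read relevance off as normalized weight-block norms are all hypothetical and are not taken from the abstract.

```python
import numpy as np

def fuse(x1, x2, W1, W2, W12):
    """Fuse two modality embeddings (hypothetical layer, for illustration).

    The first two terms are linear in each modality; the third acts on the
    flattened outer product and so captures multiplicative interactions.
    """
    interaction = np.outer(x1, x2).ravel()  # shape (d1 * d2,)
    return W1 @ x1 + W2 @ x2 + W12 @ interaction

def relevance_scores(W1, W2, W12):
    """Assumed scoring rule: Frobenius norm of each block, normalized to sum to 1."""
    norms = np.array([np.linalg.norm(W) for W in (W1, W2, W12)])
    return norms / norms.sum()

rng = np.random.default_rng(0)
d1, d2, k = 3, 4, 2                 # embedding sizes and output size (arbitrary)
x1 = rng.normal(size=d1)            # e.g. a text embedding
x2 = rng.normal(size=d2)            # e.g. an image embedding
W1 = rng.normal(size=(k, d1))
W2 = rng.normal(size=(k, d2))
W12 = rng.normal(size=(k, d1 * d2))

out = fuse(x1, x2, W1, W2, W12)
scores = relevance_scores(W1, W2, W12)  # one score each for: modality 1, modality 2, interaction
```

Keeping the interaction term in its own weight block is what allows a separate score for the association between modalities, as opposed to the individual effect of each one. How InTense actually computes and regularizes these scores is specified in the paper, not in this sketch.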
Taxonomy
Topics: Computational Physics and Python Applications · Model Reduction and Neural Networks
