MAEA: Multimodal Attribution for Embodied AI

Vidhi Jain; Jayant Sravan Tamarapalli; Sahiti Yerramilli; Yonatan Bisk

arXiv:2307.13850 · cs.LG · July 27, 2023 · 1 citation

TL;DR

This paper introduces MAEA, a framework for measuring how much the visual, language, and previous-action inputs each contribute to the decisions of multimodal embodied AI policies, supporting robustness and user-trust assessment before deployment.

Contribution

MAEA computes global per-modality attributions for any differentiable policy, making multimodal embodied AI policies easier to interpret and analyze.

Findings

01

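Attributions at the fusion layer show how much the visual, language, and previous-action inputs each contribute across policies trained on the ALFRED dataset.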

02

Attribution analysis can rank and group failure scenarios and expose modeling and dataset biases.

03

Attributions also enable lower-level behavior analysis of the language and visual inputs in EAI policies.

Abstract

Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task. A relevant direction for multimodal policies is understanding the global trends of each modality at the fusion layer. To this end, we disentangle the attributions for visual, language, and previous action inputs across different policies trained on the ALFRED dataset. Attribution analysis can be utilized to rank and group the failure scenarios, investigate modeling and dataset biases, and critically analyze multimodal EAI policies for robustness and user trust before deployment. We present MAEA, a framework to compute global attributions per modality of any differentiable policy. In addition, we show how attributions enable lower-level behavior analysis in EAI policies for language and visual attributions.
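To make the idea of a global per-modality attribution concrete, here is a minimal sketch. It is not the authors' implementation: it assumes gradient-times-input as the attribution method (the abstract does not name one), a toy linear policy whose gradient is known in closed form, and hypothetical feature sizes for each modality at the fusion layer. All names (`SLICES`, `modality_attribution`, `global_attribution`) are illustrative.

```python
import numpy as np

# Hypothetical layout of the fused feature vector: which positions belong
# to which modality. Real policies would have much larger slices.
SLICES = {
    "visual": slice(0, 4),
    "language": slice(4, 7),
    "prev_action": slice(7, 9),
}

def grad_times_input(W, fused):
    """Gradient-times-input for the predicted action's logit.

    Assumes a toy linear policy logits = W @ fused, so the gradient of
    logit[a] with respect to the fused features is simply the row W[a].
    """
    action = int(np.argmax(W @ fused))
    return W[action] * fused

def modality_attribution(W, fused):
    """Share of absolute attribution mass per modality for one sample."""
    attr = np.abs(grad_times_input(W, fused))
    totals = {name: attr[s].sum() for name, s in SLICES.items()}
    norm = sum(totals.values())
    return {name: (v / norm if norm > 0 else 0.0) for name, v in totals.items()}

def global_attribution(W, batch):
    """Average the per-sample shares over a batch to get a global trend."""
    per_sample = [modality_attribution(W, x) for x in batch]
    return {name: float(np.mean([p[name] for p in per_sample])) for name in SLICES}

# Random stand-ins for policy weights and fused features (assumed data).
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 9))        # 5 actions, 9 fused features
batch = rng.normal(size=(32, 9))   # 32 fused input vectors
shares = global_attribution(W, batch)
print(shares)
```

The resulting shares sum to one, so they can be read as the fraction of attribution mass assigned to each modality. For a real differentiable policy, the closed-form gradient would be replaced by one obtained through automatic differentiation.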

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Explainable Artificial Intelligence (XAI)