FastRM: An efficient and automatic explainability framework for multimodal generative models

Gabriela Ben-Melech Stan; Estelle Aflalo; Man Luo; Shachar Rosenman; Tiep Le; Sayak Paul; Shao-Yen Tseng; Vasudev Lal

arXiv:2412.01487·cs.AI·May 7, 2025

Open Access

TL;DR

FastRM is an efficient framework for generating explainability maps in large vision-language models. It cuts computation time by 99.8% and memory footprint by 44.4% relative to traditional relevancy map generation, which makes real-time output validation practical.

Contribution

The paper introduces FastRM, a method that predicts relevancy maps and provides confidence assessments for large vision-language models (LVLMs) quickly and at scale, improving explainability and reliability in practical applications.

Findings

01

Achieves 99.8% reduction in computation time

02

Reduces memory footprint by 44.4%

03

Enables real-time explainability for LVLMs
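
To put the reported percentages in more familiar terms, the short sketch below converts a percentage reduction into an "N times smaller" factor. It assumes each reduction is measured relative to the baseline cost of traditional relevancy map generation; the helper name `reduction_to_factor` is hypothetical and not part of FastRM.

```python
def reduction_to_factor(reduction_percent: float) -> float:
    """Convert a percentage reduction into an 'N times smaller' factor.

    Hypothetical helper for illustration only; assumes the reduction is
    relative to the baseline cost (baseline = 100%).
    """
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction_percent must be in [0, 100)")
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction


# Figures reported for FastRM in the findings above.
time_factor = reduction_to_factor(99.8)    # computation time
memory_factor = reduction_to_factor(44.4)  # memory footprint

print(f"computation time: about {time_factor:.0f}x lower")
print(f"memory footprint: about {memory_factor:.2f}x lower")
```

Under that assumption, a 99.8% reduction corresponds to roughly a 500-fold drop in computation time, while a 44.4% reduction corresponds to a memory footprint about 1.8 times smaller.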

Abstract

Large Vision Language Models (LVLMs) have demonstrated remarkable reasoning capabilities over textual and visual inputs. However, these models remain prone to generating misinformation. Identifying and mitigating ungrounded responses is crucial for developing trustworthy AI. Traditional explainability methods, such as gradient-based relevancy maps, offer insight into the decision process of models but are often computationally expensive and unsuitable for real-time output validation. In this work, we introduce FastRM, an efficient method for predicting explainable Relevancy Maps of LVLMs. Furthermore, FastRM provides both quantitative and qualitative assessments of model confidence. Experimental results demonstrate that FastRM achieves a 99.8% reduction in computation time and a 44.4% reduction in memory footprint compared to traditional relevancy map generation. FastRM allows…

Taxonomy

Topics: Topic Modeling