Forward Learning for Gradient-based Black-box Saliency Map Generation
Zeliang Zhang, Mingqian Feng, Jinyang Jiang, Rongyi Zhu, Yijie Peng, Chenliang Xu

TL;DR
This paper introduces a unified framework for estimating gradients in black-box models to generate saliency maps, enabling interpretability of complex, closed-source neural networks like GPT-Vision.
Contribution
It presents a novel likelihood ratio-based gradient estimation method with blockwise techniques for improved accuracy in black-box settings.
Findings
Effective gradient estimation in black-box models
Generation of accurate saliency maps for interpretability
Scalability demonstrated on GPT-Vision
Abstract
Gradient-based saliency maps are widely used to explain deep neural network decisions. However, as models become deeper and more black-box, such as in closed-source APIs like ChatGPT, computing gradients becomes challenging, hindering conventional explanation methods. In this work, we introduce a novel unified framework for estimating gradients in black-box settings and generating saliency maps to interpret model decisions. We employ the likelihood ratio method to estimate output-to-input gradients and utilize them for saliency map generation. Additionally, we propose blockwise computation techniques to enhance estimation accuracy. Extensive experiments in black-box settings validate the effectiveness of our method, demonstrating accurate gradient estimation and explainability of generated saliency maps. Furthermore, we showcase the scalability of our approach by applying it to explain…
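To make the idea of likelihood ratio gradient estimation concrete, here is a minimal sketch of a generic score-function estimator. It is not the authors' implementation and does not include their blockwise technique. It assumes the black-box model is exposed as a hypothetical Python callable `f` that returns a scalar score, and that Gaussian noise with standard deviation `sigma` is added to the input. The names `lr_gradient`, `saliency`, `sigma`, and `n_samples` are illustrative choices, not taken from the paper.

```python
import numpy as np


def lr_gradient(f, x, sigma=0.1, n_samples=2000, seed=0):
    """Estimate d f / d x using only forward queries to f.

    Assumption: z ~ N(0, sigma^2 I) is added to the input, so the
    gradient of E[f(x + z)] with respect to x equals
    E[f(x + z) * z] / sigma^2 (the likelihood ratio identity).
    Subtracting f(x) as a baseline leaves the expectation unchanged
    and reduces the variance of the estimate.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    base = f(x)  # baseline query at the unperturbed input
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    # One black-box query per noise sample.
    values = np.array([f(x + z) for z in noise])
    # Reshape so each scalar output difference broadcasts over x's shape.
    weights = (values - base).reshape((n_samples,) + (1,) * x.ndim)
    return (weights * noise).mean(axis=0) / sigma**2


def saliency(grad):
    """Turn a gradient estimate into a saliency map scaled to [0, 1]."""
    s = np.abs(grad)
    return s / (s.max() + 1e-12)
```

As a sanity check, `f` can be replaced by a toy linear function such as `lambda v: float(w @ v)`, whose true gradient is `w`; the estimate should approach `w` as `n_samples` grows. For a real image model, each sample costs one query, so the query budget and noise scale would need tuning.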
Taxonomy
Topics: Visual Attention and Saliency Detection · Spatial Cognition and Navigation · Advanced Image and Video Retrieval Techniques
