LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions
Faridoun Mehri (1), Mahdieh Soleymani Baghshah (1), Mohammad Taher Pilehvar (2) ((1) Sharif University of Technology, (2) Cardiff University)

TL;DR
LibraGrad is a post-hoc method that corrects gradient flow imbalances in Transformers to improve the faithfulness and quality of gradient-based attribution explanations across various architectures and datasets.
Contribution
The paper introduces LibraGrad, a theoretically grounded approach that prunes and scales backward paths to improve gradient-based explanations without modifying the forward pass or adding computational overhead.
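As a rough illustration of what pruning a backward path can mean, the sketch below uses a toy gated unit, f(x) = (a·x)·sigmoid(b·x), and compares gradient × input attributions computed with the full gradient against attributions computed with the gate's backward path removed. The unit, the names `gated_unit`, `full_gradient`, `pruned_gradient`, and `completeness_error`, and the error definition (absolute gap between the attribution sum and the output) are all assumptions made for this example. It is not the authors' implementation and says nothing about which paths LibraGrad actually prunes or scales in a Transformer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gated_unit(x, a, b):
    """Hypothetical toy unit: f(x) = (a.x) * sigmoid(b.x), with no bias terms."""
    return dot(a, x) * sigmoid(dot(b, x))

def full_gradient(x, a, b):
    """Exact gradient of f: flows through both the linear path and the gate."""
    s = sigmoid(dot(b, x))
    ds = s * (1.0 - s)          # derivative of the sigmoid at b.x
    lin = dot(a, x)
    return [ai * s + lin * ds * bi for ai, bi in zip(a, b)]

def pruned_gradient(x, a, b):
    """Gradient with the gate's backward path pruned (gate treated as a constant).
    The forward value of f is unchanged; only the backward pass differs."""
    s = sigmoid(dot(b, x))
    return [ai * s for ai in a]

def completeness_error(x, grad, output):
    """Assumed definition: |sum of gradient-times-input attributions - output|."""
    attributions = [xi * gi for xi, gi in zip(x, grad)]
    return abs(sum(attributions) - output)

x = [1.0, -2.0, 0.5]
a = [0.3, -0.7, 1.2]
b = [0.5, 0.1, -0.4]

out = gated_unit(x, a, b)
err_full = completeness_error(x, full_gradient(x, a, b), out)
err_pruned = completeness_error(x, pruned_gradient(x, a, b), out)

print(f"output = {out:.6f}")
print(f"completeness error, full gradient:   {err_full:.6f}")
print(f"completeness error, pruned gradient: {err_pruned:.2e}")
```

In this toy case the full gradient leaves a residual of (a·x)(b·x)·sigmoid'(b·x), while the pruned gradient gives attributions that sum to the output up to floating-point rounding, because x·(a·sigmoid(b·x)) equals f(x).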
Findings
Universal improvement of gradient explanations across 8 architectures.
Outperforms existing white-box attribution methods on all metrics.
Effective even on attention-free architectures like MLP-Mixer.
Abstract
Why do gradient-based explanations struggle with Transformers, and how can we improve them? We identify gradient flow imbalances in Transformers that violate FullGrad-completeness, a critical property for attribution faithfulness that CNNs naturally possess. To address this issue, we introduce LibraGrad -- a theoretically grounded post-hoc approach that corrects gradient imbalances through pruning and scaling of backward paths, without changing the forward pass or adding computational overhead. We evaluate LibraGrad using three metric families: Faithfulness, which quantifies prediction changes under perturbations of the most and least relevant features; Completeness Error, which measures attribution conservation relative to model outputs; and Segmentation AP, which assesses alignment with human perception. Extensive experiments across 8 architectures, 4 model sizes, and 4 datasets show…
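The Faithfulness family described above rests on perturbing features in order of relevance and watching the prediction change. A minimal sketch of that idea follows, assuming a toy linear model, a zero baseline for removed features, and one-feature-at-a-time removal. The helper `perturbation_curve` and the `toy_model` are hypothetical, and the paper's actual perturbation protocol and scoring are not given in this summary.

```python
def perturbation_curve(model, x, attributions, most_relevant_first=True, baseline=0.0):
    """Replace features with `baseline` one at a time, ordered by attribution,
    and record the model output after each step (including the unperturbed start)."""
    order = sorted(range(len(x)), key=lambda i: attributions[i],
                   reverse=most_relevant_first)
    perturbed = list(x)
    outputs = [model(perturbed)]
    for i in order:
        perturbed[i] = baseline
        outputs.append(model(perturbed))
    return outputs

# Hypothetical toy model: a bias-free linear scorer.
w = [2.0, -1.0, 0.5, 3.0]

def toy_model(x):
    return sum(wi * xi for wi, xi in zip(w, x))

x = [1.0, 1.0, 1.0, 1.0]
attr = [wi * xi for wi, xi in zip(w, x)]   # gradient times input for a linear model

morf = perturbation_curve(toy_model, x, attr, most_relevant_first=True)
lerf = perturbation_curve(toy_model, x, attr, most_relevant_first=False)

# A simple summary (an assumption, not the paper's metric): the gap between the
# two curves. Faithful attributions should make the most-relevant-first curve
# fall faster than the least-relevant-first curve, so the gap is positive.
gap = sum(lerf) - sum(morf)
print("most relevant first: ", morf)
print("least relevant first:", lerf)
print("curve gap:", gap)
```

For a real vision model, `x` would be image patches or pixels and `model` the class score, but the ordering-and-removal loop has the same shape.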
Taxonomy
Topics: Advanced Vision and Imaging
Methods: Average Pooling · Layer Normalization · Residual Connection · Dense Connections · Dropout · Global Average Pooling · MLP-Mixer · Pruning · Contrastive Language-Image Pre-training
