LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions

Faridoun Mehri (1); Mahdieh Soleymani Baghshah (1); Mohammad Taher Pilehvar (2) ((1) Sharif University of Technology; (2) Cardiff University)

arXiv:2411.16760·cs.CV·November 27, 2024

TL;DR

LibraGrad is a post-hoc method that corrects gradient flow imbalances in Transformers to improve the faithfulness and quality of gradient-based attribution explanations across various architectures and datasets.

Contribution

The paper introduces LibraGrad, a theoretically grounded approach that prunes and scales backward paths to improve gradient-based explanations without modifying the forward pass or adding computational overhead.

Findings

01. Universal improvement of gradient explanations across 8 architectures.
02. Outperforms existing white-box attribution methods on all metrics.
03. Effective even on attention-free architectures such as MLP-Mixer.

Abstract

Why do gradient-based explanations struggle with Transformers, and how can we improve them? We identify gradient flow imbalances in Transformers that violate FullGrad-completeness, a critical property for attribution faithfulness that CNNs naturally possess. To address this issue, we introduce LibraGrad -- a theoretically grounded post-hoc approach that corrects gradient imbalances through pruning and scaling of backward paths, without changing the forward pass or adding computational overhead. We evaluate LibraGrad using three metric families: Faithfulness, which quantifies prediction changes under perturbations of the most and least relevant features; Completeness Error, which measures attribution conservation relative to model outputs; and Segmentation AP, which assesses alignment with human perception. Extensive experiments across 8 architectures, 4 model sizes, and 4 datasets show…
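The idea of completeness error, and of pruning a backward path to reduce it, can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a hypothetical gated unit f(x) = (a·x)(b·x), uses plain Input×Gradient attributions instead of FullGrad, and writes the gradients out by hand. All names (`forward`, `grad_full`, `grad_pruned`, `completeness_error`) are illustrative. Because both factors depend on x, the full gradient counts the output twice; treating the gate as a constant in the backward pass (pruning that path) makes the attributions sum to the output again, while the forward pass is unchanged.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def forward(x, a, b):
    """Toy gated unit (an assumption for illustration): f(x) = (a.x) * (b.x)."""
    return dot(a, x) * dot(b, x)


def grad_full(x, a, b):
    """Exact gradient of f: both the gate path and the value path contribute."""
    gate, value = dot(a, x), dot(b, x)
    return [ai * value + gate * bi for ai, bi in zip(a, b)]


def grad_pruned(x, a, b):
    """Gradient with the gate's backward path pruned (gate treated as a constant)."""
    gate = dot(a, x)
    return [gate * bi for bi in b]


def completeness_error(x, grad, output):
    """|sum of Input x Gradient attributions - model output|."""
    attributions = [xi * gi for xi, gi in zip(x, grad)]
    return abs(sum(attributions) - output)


x = [1.0, 2.0, -0.5]
a = [0.5, -1.0, 2.0]
b = [1.5, 0.25, 1.0]

out = forward(x, a, b)
err_full = completeness_error(x, grad_full(x, a, b), out)
err_pruned = completeness_error(x, grad_pruned(x, a, b), out)
print(f"output={out}, full-gradient error={err_full}, pruned error={err_pruned}")
```

In this toy case the full-gradient attributions sum to twice the output, so the error equals |f(x)|, and the pruned gradient brings it to zero. Real Transformer layers such as attention and normalization are more involved, and the paper's specific pruning and scaling rules are not reproduced here.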

Code & Models

Repositories

nightmachinery/libragrad (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging

Methods: Average Pooling · Layer Normalization · Residual Connection · Dense Connections · Dropout · Global Average Pooling · MLP-Mixer · Pruning · Contrastive Language-Image Pre-training