Disentangling Visual Transformers: Patch-level Interpretability for Image Classification

Guillaume Jeanneret; Loïc Simon; Frédéric Jurie

arXiv:2502.17196 · cs.CV · April 25, 2025

TL;DR

This paper introduces HiT, an interpretable transformer architecture that disentangles patch influences for image classification, balancing interpretability with performance.

Contribution

We propose HiT, a novel transformer design that enhances interpretability by enabling patch-level analysis while maintaining competitive accuracy.

Findings

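01. HiT's output can be interpreted as a linear combination of patch-level information.

02. HiT's interpretability comes with a reasonable trade-off in performance relative to standard visual transformers.

03. HiT makes the contribution of each image patch to the classification decision explicit.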

Abstract

Visual transformers have achieved remarkable performance in image classification tasks, but this performance gain has come at the cost of interpretability. One of the main obstacles to the interpretation of transformers is the self-attention mechanism, which mixes visual information across the whole image in a complex way. In this paper, we propose Hindered Transformer (HiT), a novel interpretable-by-design architecture inspired by visual transformers. Our proposed architecture rethinks the design of transformers to better disentangle patch influences at the classification stage. Ultimately, HiT can be interpreted as a linear combination of patch-level information. We show that the advantages of our approach in terms of explainability come with a reasonable trade-off in performance, making it an attractive alternative for applications where interpretability is paramount.
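To make the phrase "a linear combination of patch-level information" concrete, here is a minimal sketch. It is not the HiT architecture itself. It assumes a toy setting in which each patch already has a feature vector computed independently of the others, the features are summed, and a linear classifier is applied. The function `patch_contributions` and the toy data are hypothetical names introduced only for this illustration. Under those assumptions, each class logit splits exactly into one additive term per patch plus a bias.

```python
import random


def patch_contributions(patch_features, weight, bias):
    """Split the logits of a linear head over summed patch features.

    patch_features: list of num_patches vectors, each of length dim
    weight:         list of dim rows, each of length num_classes
    bias:           list of length num_classes
    Returns (contributions, logits), where contributions[p][c] is the
    share of class c's logit that is attributable to patch p.
    """
    num_classes = len(bias)
    contributions = [
        [sum(f * w_row[c] for f, w_row in zip(patch, weight))
         for c in range(num_classes)]
        for patch in patch_features
    ]
    # Each logit is the bias plus the sum of per-patch contributions.
    logits = [
        bias[c] + sum(contrib[c] for contrib in contributions)
        for c in range(num_classes)
    ]
    return contributions, logits


# Toy data (assumed): a 4x4 grid of patches, 8-dim features, 3 classes.
random.seed(0)
feats = [[random.gauss(0, 1) for _ in range(8)] for _ in range(16)]
W = [[random.gauss(0, 1) for _ in range(3)] for _ in range(8)]
b = [random.gauss(0, 1) for _ in range(3)]

contribs, logits = patch_contributions(feats, W, b)
pred = max(range(3), key=lambda c: logits[c])

# Per-patch contributions to the predicted class, laid out on the 4x4 grid.
saliency = [[contribs[r * 4 + col][pred] for col in range(4)] for r in range(4)]
```

In this toy setting, `saliency` works as a patch-level attribution map: its entries plus the bias for the predicted class sum to that class's logit, so no post-hoc approximation is needed to read off each patch's influence.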

Taxonomy

Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · Absolute Position Encodings · Linear Layer · Layer Normalization · Byte Pair Encoding · Dense Connections · Residual Connection · Label Smoothing · Multi-Head Attention · Position-Wise Feed-Forward Layer