From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models

Yearim Kim, Sangyu Han, and Nojun Kwak

arXiv:2605.00474 · cs.CV · May 4, 2026

TL;DR

This paper introduces a unified interpretability framework for vision models centered on pointwise feature vectors (PFVs) paired with their instance-specific effective receptive fields (iERFs), providing high-fidelity explanations and concept grounding across various architectures.

Contribution

The paper proposes an iERF-centered framework that unifies local, global, and mechanistic interpretability, introducing methods including Sharing Ratio Decomposition (SRD), Concept-Anchored Feature Explanation (CAFE), and ICAT to improve explanation fidelity and robustness.

Findings

01. Outperforms baselines in fidelity and robustness across models.

02. Effectively interprets dispersed SAE features in Transformers.

03. Identifies dominant concept routes in various classification scenarios.

Abstract

Modern vision models achieve remarkable accuracy, but explaining where evidence arises, what the model encodes, and how internal computations assemble that evidence remains fragmented. We introduce an iERF-centric framework that unifies local, global, and mechanistic interpretability around a single analysis unit: the pointwise feature vector (PFV) paired with its instance-specific Effective Receptive Field (iERF). On the local side, Sharing Ratio Decomposition (SRD) expresses each PFV as a mixture of upstream PFVs via sharing ratios and propagates iERFs to construct class-discriminative saliency maps. SRD yields high-resolution, activation-faithful explanations, is robust to targeted manipulation and noise, and remains activation-agnostic across common nonlinearities. For the global view, we introduce Concept-Anchored Feature Explanation (CAFE), which utilizes the iERF as a semantic…
