Pulling Back the Curtain on Deep Networks

Maciej Satkiewicz; Roberto Corizzo; Marcin Pietroń

arXiv:2507.22832·cs.LG·May 8, 2026

TL;DR

This paper introduces Semantic Pullbacks (SP), a post-hoc method for interpreting deep networks that pulls a target neuron's preferred direction back to input space and reconstructs the coherent local structures that neuron encodes.

Contribution

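The paper treats deep networks as input-conditioned affine operators and refines the resulting adjoint pullback with backward-only softening and iterative enhancement. This gives a unifying perspective on previously disparate ideas such as SmoothGrad, B-cos-style alignment, and Feature Accentuation.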

Findings

01. Semantic Pullbacks produce perceptually aligned, class-conditional explanations.

02. They enable coherent counterfactual perturbations.

03. They achieve state-of-the-art trade-offs on faithfulness, stability, and sensitivity benchmarks.

Abstract

In linear models, visualizing a weight vector naturally reveals the model's preferred input direction, but extending this intuition to deep networks via gradients or gradient ascent often yields brittle or adversarial-looking features. We argue that deep networks are better understood as input-conditioned affine operators, whose natural adjoint action pulls a neuron's preferred direction back to input space. We further refine this representation by backward-only softening and iterative enhancement to reconstruct coherent local structures encoded by the target neuron. This provides a unifying perspective on previously disparate ideas such as SmoothGrad, B-cos-style alignment, and Feature Accentuation. The resulting Semantic Pullbacks (SP) generate perceptually aligned, class-conditional post-hoc explanations that emphasize semantically meaningful features, facilitate coherent…
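
The "input-conditioned affine operator" view in the abstract can be illustrated with a minimal NumPy sketch. It assumes a toy two-layer ReLU MLP with random weights; the helper names `relu_mlp_affine` and `pullback` are hypothetical and are not taken from the paper or its repository. For a plain ReLU network this adjoint coincides with the ordinary gradient, so the sketch shows only the basic pullback, not the paper's backward-only softening or iterative enhancement.

```python
import numpy as np

def relu_mlp_affine(x, W1, b1, W2, b2):
    """Return (A, c) such that the ReLU MLP equals A @ x + c at this input."""
    pre = W1 @ x + b1
    gate = (pre > 0).astype(x.dtype)   # activation pattern, frozen at x
    A = W2 @ (gate[:, None] * W1)      # input-conditioned linear part
    c = W2 @ (gate * b1) + b2          # input-conditioned offset
    return A, c

def pullback(x, v, W1, b1, W2, b2):
    """Adjoint action: pull an output-space direction v back to input space."""
    A, _ = relu_mlp_affine(x, W1, b1, W2, b2)
    return A.T @ v

# Toy network and input (random values, for illustration only).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 5)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)
x = rng.normal(size=5)

# The affine form reproduces the ordinary forward pass at x.
A, c = relu_mlp_affine(x, W1, b1, W2, b2)
y = W2 @ np.maximum(W1 @ x + b1, 0.0) + b2
assert np.allclose(A @ x + c, y)

# Pull back the preferred direction of output neuron 0 (a one-hot vector).
e0 = np.eye(3)[0]
direction = pullback(x, e0, W1, b1, W2, b2)
```

Here `direction` is row 0 of the input-conditioned matrix `A`, the deep-network analogue of visualizing a weight vector in a linear model.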

Code & Models

Repositories

314-foundation/SemanticPullbacks (GitHub)