Winsor-CAM: Human-Tunable Visual Explanations from Deep Networks via Layer-Wise Winsorization

Casey Wall, Longwei Wang, Rodrigue Rizk, and KC Santosh

arXiv:2507.10846·cs.CV·February 24, 2026

Open Access

TL;DR

Winsor-CAM is a gradient-based visualization method that aggregates saliency maps from multiple CNN layers and applies percentile-based Winsorization, giving human-tunable, multi-scale explanations that are reported to outperform existing methods on localization and fidelity metrics.

Contribution

We introduce Winsor-CAM, a novel single-pass gradient method that aggregates multi-layer saliency maps with Winsorization for improved, human-tunable explanations in CNNs.

Findings

01. Winsor-CAM outperforms baselines in localization and fidelity metrics.

02. The method is effective across multiple CNN architectures and datasets.

03. Human-tunable parameter p enables semantic-level explanation control.

Abstract

Interpreting Convolutional Neural Networks (CNNs) is critical for safety-sensitive applications such as healthcare and autonomous systems. Popular visual explanation methods like Grad-CAM use a single convolutional layer, potentially missing multi-scale cues and producing unstable saliency maps. We introduce Winsor-CAM, a single-pass gradient-based method that aggregates Grad-CAM maps from all convolutional layers and applies percentile-based Winsorization to attenuate outlier contributions. A user-controllable percentile parameter p enables semantic-level tuning from low-level textures to high-level object patterns. We evaluate Winsor-CAM on six CNN architectures using PASCAL VOC 2012 and PolypGen, comparing localization (IoU, center-of-mass distance) and fidelity (insertion/deletion AUC) against seven baselines including Grad-CAM, Grad-CAM++, LayerCAM, ScoreCAM, AblationCAM,…
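The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: each layer already has a Grad-CAM map resized to a common shape, each layer has a single non-negative importance score, Winsorization clips only the upper tail of those scores at the p-th percentile, and the clipped scores are normalized into weights for a weighted sum. The names `percentile`, `winsorize_upper`, and `aggregate_maps` are hypothetical.

```python
from typing import List, Sequence

Map = List[List[float]]  # one 2-D saliency map, as rows of floats


def percentile(values: Sequence[float], p: float) -> float:
    """Linearly interpolated percentile of `values`, with p in [0, 100]."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= p <= 100.0:
        raise ValueError("p must lie in [0, 100]")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def winsorize_upper(values: Sequence[float], p: float) -> List[float]:
    """Clip every value above the p-th percentile down to that percentile."""
    cap = percentile(values, p)
    return [min(v, cap) for v in values]


def aggregate_maps(layer_maps: Sequence[Map],
                   layer_scores: Sequence[float],
                   p: float) -> Map:
    """Weighted sum of per-layer maps using Winsorized, normalized scores.

    Assumption: all maps share one shape and all scores are non-negative.
    """
    if not layer_maps or len(layer_maps) != len(layer_scores):
        raise ValueError("need one score per layer map")
    clipped = winsorize_upper(layer_scores, p)
    total = sum(clipped)
    if total <= 0:
        raise ValueError("clipped scores must sum to a positive value")
    weights = [c / total for c in clipped]

    rows, cols = len(layer_maps[0]), len(layer_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for weight, layer_map in zip(weights, layer_maps):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * layer_map[r][c]
    return fused
```

Under these assumptions, a lower p caps dominant layers more aggressively, so the fused map draws more evenly on all layers, while p = 100 leaves the scores unclipped. How the paper itself maps p to low-level versus high-level semantics is not specified in the text shown here.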

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Cell Image Analysis Techniques · Multimodal Machine Learning Applications