Winsor-CAM: Human-Tunable Visual Explanations from Deep Networks via Layer-Wise Winsorization
Casey Wall, Longwei Wang, Rodrigue Rizk, and KC Santosh

TL;DR
Winsor-CAM is a gradient-based visualization method that combines multi-layer CNN saliency maps with percentile-based Winsorization, allowing human-tunable, multi-scale explanations that outperform existing methods in accuracy and stability.
Contribution
We introduce Winsor-CAM, a novel single-pass gradient method that aggregates multi-layer saliency maps with Winsorization for improved, human-tunable explanations in CNNs.
Findings
Winsor-CAM outperforms baselines in localization and fidelity metrics.
The method is effective across multiple CNN architectures and datasets.
A human-tunable percentile parameter p enables semantic-level control over the explanations.
Abstract
Interpreting Convolutional Neural Networks (CNNs) is critical for safety-sensitive applications such as healthcare and autonomous systems. Popular visual explanation methods like Grad-CAM use a single convolutional layer, potentially missing multi-scale cues and producing unstable saliency maps. We introduce Winsor-CAM, a single-pass gradient-based method that aggregates Grad-CAM maps from all convolutional layers and applies percentile-based Winsorization to attenuate outlier contributions. A user-controllable percentile parameter p enables semantic-level tuning from low-level textures to high-level object patterns. We evaluate Winsor-CAM on six CNN architectures using PASCAL VOC 2012 and PolypGen, comparing localization (IoU, center-of-mass distance) and fidelity (insertion/deletion AUC) against seven baselines including Grad-CAM, Grad-CAM++, LayerCAM, ScoreCAM, AblationCAM,…
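The aggregation idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the per-layer Grad-CAM maps have already been computed and resized to a common resolution, and that each layer has a scalar importance score; how those scores are obtained, and the exact fusion rule, are not given in this summary. The names `winsorize_upper`, `aggregate_layer_maps`, and `layer_scores` are hypothetical.

```python
import numpy as np


def winsorize_upper(values, p):
    """Cap every value above the p-th percentile at that percentile."""
    values = np.asarray(values, dtype=float)
    cap = np.percentile(values, p)
    return np.minimum(values, cap)


def aggregate_layer_maps(layer_maps, layer_scores, p=80.0):
    """Fuse per-layer saliency maps using Winsorized layer weights.

    layer_maps:   sequence of (H, W) arrays, one per convolutional layer,
                  assumed to be resized to the same resolution already.
    layer_scores: one non-negative importance score per layer (assumed input).
    p:            percentile in [0, 100] used as the upper Winsorization cap.
    """
    maps = np.stack([np.asarray(m, dtype=float) for m in layer_maps])  # (L, H, W)
    scores = winsorize_upper(layer_scores, p)

    # Turn the capped scores into weights that sum to 1 (uniform if all are 0).
    total = scores.sum()
    if total > 0:
        weights = scores / total
    else:
        weights = np.full(len(scores), 1.0 / len(scores))

    # Weighted sum over the layer axis, then keep positive evidence only.
    fused = np.maximum(np.tensordot(weights, maps, axes=1), 0.0)  # (H, W)

    # Scale to [0, 1] for display.
    peak = fused.max()
    return fused / peak if peak > 0 else fused
```

In this sketch, a low p caps dominant layers aggressively, so all layers contribute more evenly, while p = 100 leaves the scores untouched and lets the highest-scoring layers dominate the fused map.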
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Cell Image Analysis Techniques · Multimodal Machine Learning Applications
