H-Sets: Hessian-Guided Discovery of Set-Level Feature Interactions in Image Classifiers
Ayushi Mehrotra, Dipkamal Bhusal, Michael Clifford, Nidhi Rastogi

TL;DR
H-Sets is a two-stage framework that detects higher-order feature interactions in image classifiers with input Hessians and attributes them at the set level, improving interpretability and faithfulness.
Contribution
Introduces H-Sets, a two-stage method combining Hessian-based detection and set-level attribution for better interpretability of image model decisions.
Findings
H-Sets produces sparser, more faithful saliency maps.
Evaluations show improved interpretability across multiple models and datasets.
H-Sets outperforms existing interaction attribution methods.
Abstract
Feature attribution methods explain the predictions of deep neural networks by assigning importance scores to individual input features. However, most existing methods focus solely on marginal effects, overlooking feature interactions, where groups of features jointly influence model output. Such interactions are especially important in image classification tasks, where semantic meaning often arises from pixel interdependencies rather than isolated features. Existing interaction-based methods for images are either coarse (e.g., superpixel-only) or fail to satisfy core interpretability axioms. In this work, we introduce H-Sets, a novel two-stage framework for discovering and attributing higher-order feature interactions in image classifiers. First, we detect locally interacting pairs via input Hessians and recursively merge them into semantically coherent sets; segmentation from Segment…
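The detection stage described in the abstract (Hessian-based pair detection followed by merging into sets) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical four-feature scalar function `toy_model` in place of an image classifier, estimates mixed second derivatives by central finite differences rather than automatic differentiation, uses an arbitrary `threshold`, and substitutes a simple union-find for the paper's recursive merge. It does not cover segmentation or the attribution stage.

```python
from itertools import combinations

def toy_model(x):
    # Hypothetical scalar "logit" (an assumption for illustration):
    # x[0] and x[1] interact multiplicatively, x[2] is nonlinear but
    # acts alone, and x[3] is purely additive.
    return x[0] * x[1] + x[2] ** 2 + x[3]

def hessian_entry(f, x, i, j, eps=1e-3):
    """Central finite-difference estimate of d^2 f / dx_i dx_j for i != j."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)
    return (shifted(eps, eps) - shifted(eps, -eps)
            - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps * eps)

def interacting_pairs(f, x, threshold=1e-2):
    # A pair counts as interacting when its mixed second derivative is
    # non-negligible; the threshold value is an illustrative assumption.
    return [(i, j) for i, j in combinations(range(len(x)), 2)
            if abs(hessian_entry(f, x, i, j)) > threshold]

def merge_pairs(n, pairs):
    # Union-find stand-in for merging pairs into sets: features linked
    # by any chain of interacting pairs end up in the same set.
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for i, j in pairs:
        parent[find(i)] = find(j)

    groups = {}
    for k in range(n):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())

x = [0.5, -1.0, 2.0, 0.3]
pairs = interacting_pairs(toy_model, x)
feature_sets = merge_pairs(len(x), pairs)
```

In this toy case only features 0 and 1 share a nonzero mixed derivative, so they form one set and the remaining features stay as singletons. For a real classifier the Hessian would more plausibly come from an automatic-differentiation library, and computing it over every pixel pair is expensive, which is one reason to restrict detection to local pairs.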
