From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection
Moritz Vandenhirtz, Julia E. Vogt

TL;DR
This paper introduces an interpretable image classification method that sparsifies input images at the region level, aligning with human perception, and dynamically adjusts sparsity per instance, leading to more understandable predictions.
Contribution
The work presents a novel instance-wise region sparsification approach with dynamic sparsity control, enhancing interpretability over existing pixel-level methods.
Findings
Produces more meaningful, human-understandable predictions than state-of-the-art benchmarks
Effective on semi-synthetic and natural image datasets
Aligns sparsification with semantic regions for better interpretability
Abstract
Understanding the decision-making process of machine learning models provides valuable insights into the task, the data, and the reasons behind a model's failures. In this work, we propose a method that performs inherently interpretable predictions through the instance-wise sparsification of input images. To align the sparsification with human perception, we learn the masking in the space of semantically meaningful pixel regions rather than at the pixel level. Additionally, we introduce an explicit way to dynamically determine the required level of sparsity for each instance. We show empirically on semi-synthetic and natural image datasets that our inherently interpretable classifier produces more meaningful, human-understandable predictions than state-of-the-art benchmarks.
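To make the idea of region-level, instance-wise masking concrete, the following is a minimal sketch and not the authors' implementation. It assumes a regular grid as a stand-in for semantically meaningful regions, and it uses hand-set per-region scores with a fixed threshold in place of the learned masking described in the abstract. The names grid_regions, region_mask, and sparsify are hypothetical and chosen only for illustration.

```python
import numpy as np

def grid_regions(height, width, cell):
    """Assign each pixel a region id on a regular grid.

    Assumption: the grid is only a placeholder for semantically
    meaningful regions; any integer region map would work here.
    """
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = -(-width // cell)  # ceiling division: regions per row
    return rows[:, None] * n_cols + cols[None, :]

def region_mask(regions, scores, threshold=0.5):
    """Turn one score per region into a per-pixel boolean mask."""
    keep = scores > threshold   # one keep/drop decision per region
    return keep[regions]        # look up each pixel's region decision

def sparsify(image, regions, scores, threshold=0.5):
    """Zero out every pixel that belongs to a dropped region."""
    mask = region_mask(regions, scores, threshold)
    return image * mask[..., None], mask

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
regions = grid_regions(8, 8, cell=4)  # four 4x4 regions, ids 0..3

# Hypothetical scores for two instances: the number of retained regions
# differs per instance, so the sparsity level is not fixed in advance.
scores_a = np.array([0.9, 0.1, 0.2, 0.3])  # keeps 1 of 4 regions
scores_b = np.array([0.9, 0.8, 0.7, 0.1])  # keeps 3 of 4 regions

masked_a, mask_a = sparsify(image, regions, scores_a)
masked_b, mask_b = sparsify(image, regions, scores_b)
```

Because the keep/drop decision is made per region rather than per pixel, the retained evidence forms contiguous areas instead of scattered pixels. In this sketch, the per-instance sparsity follows from how many region scores exceed the threshold.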
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
Methods: ALIGN
