Class Introspection: A Novel Technique for Detecting Unlabeled Subclasses by Leveraging Classifier Explainability Methods
Patrick Kage, Pavlos Andreadis

TL;DR
This paper introduces Class Introspection, a technique that uses classifier explainability methods to detect unlabeled subclasses within a dataset. It applies to both simple classifiers and deep neural networks, and is reported to outperform a baseline at latent class detection.
Contribution
The paper presents a novel subclass discovery approach that applies explanation methods to an existing classifier in order to reveal latent structure in the data. It also provides an automated analysis pipeline and an interactive web tool.
Findings
Outperforms the baseline in latent class detection
Works with deep neural networks and simple classifiers
Provides an automated analysis pipeline and interactive exploration tool
Abstract
Detecting latent structure is a crucial step in analyzing a dataset. However, existing state-of-the-art techniques for subclass discovery are limited: they either detect only very small numbers of outliers or lack the statistical power to deal with complex data such as images or audio. This paper proposes a solution to this subclass discovery problem: by leveraging instance explanation methods, an existing classifier can be extended to detect latent classes via differences in the classifier's internal decisions about each instance. This works not only with simple classification techniques but also with deep neural networks, allowing for a powerful and flexible approach to detecting latent structure within datasets. Effectively, this represents a projection of the dataset into the classifier's "explanation space," and preliminary results…
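The idea of projecting instances into an "explanation space" and looking for structure there can be illustrated with a small sketch. This is not the authors' pipeline: the excerpt does not say which explanation method or clustering algorithm the paper uses, so the code below assumes a hand-set linear classifier, weight-times-feature attributions as the explanation, and a tiny 2-means clusterer. All names (`explain`, `two_means`, `sub_a`, `sub_b`) and the synthetic data are hypothetical.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def predict(weights, bias, x):
    """Assumed stand-in classifier: a linear score thresholded at zero."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0

def explain(weights, x):
    """Assumed explanation: per-feature contribution w_i * x_i to the score."""
    return [w * xi for w, xi in zip(weights, x)]

def two_means(points, iters=10):
    """Minimal 2-means with a deterministic start; returns 0/1 per point."""
    centers = [points[0], max(points, key=lambda p: sqdist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if sqdist(p, centers[0]) <= sqdist(p, centers[1]) else 1
                  for p in points]
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties out
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def jitter(i):
    """Small deterministic perturbation in [-0.2, 0.2]."""
    return 0.1 * ((i % 5) - 2)

# One labeled class hiding two subclasses. Feature 2 is large-amplitude noise
# that the classifier ignores (its weight is 0).
sub_a = [[3 + jitter(i), jitter(i + 1), 10 * (-1) ** i] for i in range(10)]
sub_b = [[jitter(i), 3 + jitter(i + 2), 10 * (-1) ** i] for i in range(10)]
instances = sub_a + sub_b
weights, bias = [1.0, 1.0, 0.0], -1.0

# Keep the instances the classifier assigns to the class, then cluster their
# explanations rather than their raw features.
positives = [x for x in instances if predict(weights, bias, x)]
explanations = [explain(weights, x) for x in positives]
labels = two_means(explanations)

# For contrast: clustering raw features is dominated by the ignored noise.
raw_labels = two_means(positives)
```

In this toy setup, clustering the explanations separates the two hidden subclasses because each subclass reaches the same decision through a different feature, while clustering the raw features splits on the noise column instead. With a deep network, the attribution step would be replaced by whatever instance explanation method is in use.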
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques
