Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization
Adrianna Janik, Kris Sankaran

TL;DR
This paper introduces two new methods, statistical inference and interactive visualization, to help users discover meaningful concepts in learned representations, addressing the challenge of navigating high-dimensional spaces.
Contribution
It proposes two novel approaches for concept discovery, one based on multiple hypothesis testing and another on interactive visualization, that guide the user's search for concepts rather than fully automating it.
Findings
The proposed methods show promise in identifying relevant concepts in high-dimensional representation spaces.
Simulation experiments support the effectiveness of the proposed techniques.
A demo interface shows how the methods apply to real data.
Abstract
Concept discovery is an open problem in the interpretability literature, and it is important for bridging the gap between non-deep learning experts and model end-users. Among current formulations, one defines a concept as a direction in a learned representation space. This definition makes it possible to evaluate whether a particular concept significantly influences classification decisions for classes of interest. However, finding relevant concepts is tedious, as representation spaces are high-dimensional and hard to navigate. Current approaches include hand-crafting concept datasets and then converting them to latent space directions; alternatively, the process can be automated by clustering the latent space. In this study, we offer two further approaches to guide user discovery of meaningful concepts, one based on multiple hypothesis testing, and another on interactive…
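To make the idea of testing many candidate directions concrete, here is a minimal sketch. It is not the authors' procedure, which the truncated abstract does not spell out. It assumes that a concept is a unit vector in activation space, that each direction's relevance is scored by a permutation test on projected activations between two classes, and that the Benjamini-Hochberg step-up rule controls the false discovery rate. The function names `permutation_p_value`, `benjamini_hochberg`, and `screen_directions` are hypothetical.

```python
import numpy as np


def permutation_p_value(scores, labels, n_perm=999, rng=None):
    """Two-sided permutation p-value for a difference in class means.

    Assumption: `labels` is a 0/1 array and `scores` holds the projection
    of each sample's activation onto one candidate direction.
    """
    rng = np.random.default_rng() if rng is None else rng
    observed = abs(scores[labels == 1].mean() - scores[labels == 0].mean())
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(labels)
        stat = abs(scores[perm == 1].mean() - scores[perm == 0].mean())
        count += stat >= observed
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of hypotheses rejected at FDR level `alpha`."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passing = np.nonzero(p[order] <= thresholds)[0]
    rejected = np.zeros(m, dtype=bool)
    if passing.size > 0:
        # Reject every hypothesis up to the largest rank that passes.
        rejected[order[: passing.max() + 1]] = True
    return rejected


def screen_directions(activations, labels, directions,
                      alpha=0.05, n_perm=999, seed=0):
    """Test each candidate direction, then apply the FDR correction.

    activations: (n_samples, n_features) learned representations
    directions:  (n_directions, n_features) candidate concept directions
    """
    rng = np.random.default_rng(seed)
    projections = activations @ directions.T
    p_values = np.array([
        permutation_p_value(projections[:, j], labels, n_perm, rng)
        for j in range(directions.shape[0])
    ])
    return p_values, benjamini_hochberg(p_values, alpha)
```

Correcting across all candidate directions at once matters because a high-dimensional representation space offers many directions to test, and uncorrected per-direction tests would flag some of them by chance alone.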
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Statistical and Computational Modeling
