NeuroInspect: Interpretable Neuron-based Debugging Framework through Class-conditional Visualizations
Yeong-Joon Ju, Ji-Hoon Park, and Seong-Whan Lee

TL;DR
NeuroInspect is a neuron-based debugging framework that visualizes and mitigates errors in deep learning models by pinpointing responsible neurons and providing human-interpretable explanations without needing extra data.
Contribution
It introduces CLIP-Illusion for class-conditional feature visualization and a method to mitigate false correlations, enhancing interpretability and debugging of neural networks.
Findings
Effective identification of neurons responsible for errors.
Improved interpretability through class-conditional visualizations.
Reduction of false correlations in model decisions.
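As a rough illustration of what "identifying neurons responsible for errors" can mean, the sketch below ranks penultimate-layer neurons by how much each one pushes a linear classifier toward the wrong class instead of the true class. This is not the procedure from the paper, which this summary does not specify. The function names, the toy activations, and the weight values are all hypothetical, and the linear final layer is an assumption.

```python
# Illustrative sketch only, not NeuroInspect's actual method.
# Assumes a linear final layer: logit[c] = sum_j weights[c][j] * features[j].

def logits(features, weights):
    """Compute class logits for one sample under a bias-free linear head."""
    return [sum(w * a for w, a in zip(row, features)) for row in weights]


def rank_error_neurons(features, weights, true_class, predicted_class, top_k=3):
    """Return (neuron_index, contribution) pairs, largest contribution first.

    The contribution of neuron j is its share of the logit gap
    logit[predicted_class] - logit[true_class].
    """
    contributions = [
        (j, a * (weights[predicted_class][j] - weights[true_class][j]))
        for j, a in enumerate(features)
    ]
    contributions.sort(key=lambda pair: pair[1], reverse=True)
    return contributions[:top_k]


# Hypothetical toy sample: 4 penultimate neurons, 2 classes.
features = [2.0, 0.5, 1.0, 0.0]
weights = [
    [0.1, 1.0, 0.2, 0.3],  # class 0 (assumed true label)
    [1.0, 0.2, 0.1, 0.5],  # class 1 (the model's wrong prediction)
]

ranked = rank_error_neurons(features, weights, true_class=0, predicted_class=1, top_k=2)

# Counterfactual check: silence the top-ranked neuron and recompute the logits.
ablated = list(features)
ablated[ranked[0][0]] = 0.0
ablated_logits = logits(ablated, weights)
```

In this toy setting, neuron 0 carries most of the logit gap, and zeroing it makes class 0 score higher than class 1. That is the kind of counterfactual evidence a neuron-level debugger would then try to explain visually.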
Abstract
Although deep learning (DL) has achieved remarkable progress in various domains, DL models are still prone to making mistakes. This issue necessitates effective debugging tools that let DL practitioners interpret the decision-making process within the networks. However, existing debugging methods often demand extra data or adjustments to the decision process, limiting their applicability. To tackle this problem, we present NeuroInspect, an interpretable neuron-based debugging framework with three key stages: counterfactual explanations, feature visualizations, and false correlation mitigation. Our debugging framework first pinpoints the neurons responsible for mistakes in the network and then visualizes the features embedded in those neurons in a human-interpretable form. To provide these explanations, we introduce CLIP-Illusion, a novel feature visualization method that generates images…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Cell Image Analysis Techniques · Machine Learning and Data Classification
