NeuCEPT: Locally Discover Neural Networks' Mechanism via Critical Neurons Identification with Precision Guarantee

Minh N. Vu; Truc D. Nguyen; My T. Thai

arXiv:2209.08448 · cs.LG · September 20, 2022 · 1 citation

Open Access

TL;DR

NeuCEPT locally identifies the critical neurons that drive a neural network's predictions, using a mutual-information formulation that keeps the precision of the identified neurons under control, and then heuristically learns the model's prediction mechanisms in an unsupervised manner.

Contribution

NeuCEPT introduces a mutual-information-based formulation of critical neuron identification, together with a theoretical framework that solves it efficiently with precision guarantees, advancing the interpretability of neural networks.

Findings

01. Identified neurons strongly influence model predictions.

02. Neurons encode meaningful information about model mechanisms.

03. Method outperforms baseline approaches in interpretability tasks.

Abstract

Despite recent studies on understanding deep neural networks (DNNs), numerous questions remain about how DNNs generate their predictions. In particular, given similar predictions on different input samples, are the underlying mechanisms generating those predictions the same? In this work, we propose NeuCEPT, a method to locally discover critical neurons that play a major role in the model's predictions and to identify the model's mechanisms in generating those predictions. We first formulate the critical neuron identification problem as maximizing a sequence of mutual-information objectives and provide a theoretical framework to efficiently solve for critical neurons while keeping the precision under control. NeuCEPT next heuristically learns the model's different mechanisms in an unsupervised manner. Our experimental results show that neurons identified by NeuCEPT not only have strong influence on…
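To make the mutual-information idea in the abstract concrete, here is a minimal sketch that scores each neuron by the empirical mutual information between its binarized activation and the model's prediction, then keeps the top-scoring neurons. This is an illustration under stated assumptions, not NeuCEPT's algorithm: it ranks neurons one at a time rather than solving a sequence of objectives, and it provides no precision guarantee. The function names, the median binarization, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete 1-D arrays."""
    _, x_idx = np.unique(np.ravel(x), return_inverse=True)
    _, y_idx = np.unique(np.ravel(y), return_inverse=True)
    # Build the joint frequency table, then normalize it to a distribution.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x, shape (|X|, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of y, shape (1, |Y|)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def rank_neurons_by_mi(activations, predictions, top_k):
    """Rank neurons by MI between binarized activation and prediction.

    activations: (n_samples, n_neurons) float array (assumed layer output).
    predictions: (n_samples,) integer class labels predicted by the model.
    Returns the indices of the top_k neurons and the score of every neuron.
    """
    # Assumption: binarize each neuron at its median activation.
    binarized = (activations > np.median(activations, axis=0)).astype(int)
    scores = np.array(
        [mutual_information(binarized[:, j], predictions)
         for j in range(binarized.shape[1])]
    )
    return np.argsort(-scores)[:top_k], scores


# Synthetic data: only neuron 2 carries information about the prediction.
rng = np.random.default_rng(0)
n_samples = 2000
predictions = rng.integers(0, 2, size=n_samples)
activations = rng.normal(size=(n_samples, 5))
activations[:, 2] += 3.0 * predictions

top, scores = rank_neurons_by_mi(activations, predictions, top_k=2)
print("top neurons:", top)
print("MI scores (nats):", np.round(scores, 4))
```

On this synthetic data, neuron 2 should receive by far the highest score, since it is the only column constructed to depend on the prediction. Scoring neurons independently ignores redundancy and interactions between neurons, which is one reason a joint formulation such as the one the abstract describes may be preferable.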

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification