Prototype-based interpretation of the functionality of neurons in winner-take-all neural networks
Ramin Zarei Sabzevar, Kamaledin Ghiasi-Shirazi, Ahad Harati

TL;DR
This paper introduces a prototype-based interpretation of winner-take-all neural networks, proposing a novel $\pm$ED-WTA model that enhances interpretability and outlier detection by modeling neurons with positive and negative prototypes.
Contribution
It presents a new $\pm$ED-WTA model with a training algorithm that produces interpretable prototypes, linking neuron functionality to prototype differences and improving outlier detection.
Findings
The proposed $\pm$ED-WTA model expresses each neuron's function as the difference between its positive and negative prototypes.
The training algorithm yields interpretable prototypes.
The model is effective at detecting outliers and adversarial examples.
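The link between a neuron's function and a prototype difference can be illustrated with a small numerical check. The sketch below assumes that a $\pm$ED-WTA neuron scores an input by the squared distance to its negative prototype minus the squared distance to its positive prototype; the abstract shown here is cut off before it states the exact rule, so this scoring function and the helper names `pm_score` and `as_hyperplane` are illustrative assumptions rather than the authors' definitions. Under that assumption the score is linear in the input, with weight vector 2(p - n) and bias ||n||^2 - ||p||^2.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def pm_score(x, pos, neg):
    """Assumed neuron score: large when x is near pos and far from neg."""
    return sq_dist(x, neg) - sq_dist(x, pos)

def as_hyperplane(pos, neg):
    """Return (w, b) such that w . x + b equals pm_score(x, pos, neg)."""
    w = [2.0 * (p - n) for p, n in zip(pos, neg)]
    b = sum(n * n for n in neg) - sum(p * p for p in pos)
    return w, b

random.seed(0)
dim = 5
pos = [random.gauss(0, 1) for _ in range(dim)]
neg = [random.gauss(0, 1) for _ in range(dim)]
w, b = as_hyperplane(pos, neg)

# Compare the distance-based score with the linear form on random inputs.
max_gap = 0.0
for _ in range(100):
    x = [random.gauss(0, 1) for _ in range(dim)]
    linear = sum(wi * xi for wi, xi in zip(w, x)) + b
    max_gap = max(max_gap, abs(pm_score(x, pos, neg) - linear))

print(max_gap)  # expected to be at floating-point rounding level
```

Read this way, the weight vector of a hyperplane neuron points from the negative prototype toward the positive one, which is what makes the pair interpretable.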
Abstract
Prototype-based learning (PbL) using a winner-take-all (WTA) network based on minimum Euclidean distance (ED-WTA) is an intuitive approach to multiclass classification. By constructing meaningful class centers, PbL provides higher interpretability and generalization than hyperplane-based learning (HbL) methods based on maximum inner product (IP-WTA) and can efficiently detect and reject samples that do not belong to any class. In this paper, we first prove the equivalence of IP-WTA and ED-WTA from a representational point of view. Then, we show that naively using this equivalence leads to unintuitive ED-WTA networks in which the centers have high distances to the data that they represent. We propose $\pm$ED-WTA, which models each neuron with two prototypes: one positive prototype representing samples that are modeled by this neuron and a negative prototype representing the samples that are…
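One way to see why IP-WTA and ED-WTA can represent the same decision rule is the following sketch. It is a generic construction and not necessarily the proof used in the paper: each weight vector w_i is padded with one extra coordinate s_i chosen so that ||w_i||^2 + s_i^2 = K - 2 b_i for a shared constant K, and inputs are padded with a zero. The squared distance to center i then equals ||x||^2 + K - 2 (w_i . x + b_i), so the nearest center is the neuron with the largest inner-product score. The helper `to_centers` and the extra coordinate are assumptions introduced for illustration.

```python
import math
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def ip_winner(x, W, b):
    """IP-WTA: index of the neuron with the largest w_i . x + b_i."""
    scores = [dot(w, x) + bi for w, bi in zip(W, b)]
    return max(range(len(scores)), key=scores.__getitem__)

def to_centers(W, b):
    """Hypothetical helper: centers (w_i, s_i) with ||w_i||^2 + s_i^2 = K - 2*b_i."""
    raw = [dot(w, w) + 2.0 * bi for w, bi in zip(W, b)]
    K = max(raw)  # smallest constant that keeps every s_i^2 non-negative
    return [list(w) + [math.sqrt(K - r)] for w, r in zip(W, raw)]

def ed_winner(x, centers):
    """ED-WTA on the input padded with a zero in the extra coordinate."""
    xa = list(x) + [0.0]
    dists = [sum((a - c) ** 2 for a, c in zip(xa, ctr)) for ctr in centers]
    return min(range(len(dists)), key=dists.__getitem__)

random.seed(1)
dim, n_classes = 3, 4
W = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_classes)]
b = [random.gauss(0, 1) for _ in range(n_classes)]
centers = to_centers(W, b)

# Check that both rules pick the same winner on random inputs.
samples = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
agree = all(ip_winner(x, W, b) == ed_winner(x, centers) for x in samples)
print(agree)
```

Nothing in this construction places the centers near the samples they win, which is consistent with the abstract's remark that naive use of the equivalence gives unintuitive centers.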
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Fault Detection and Control Systems · Neural Networks and Applications
Methods: Interpretability
