Peter Parker or Spiderman? Disambiguating Multiple Class Labels
Nuthan Mummani, Simran Ketha, Venkatakrishnan Ramaswamy

TL;DR
This paper introduces a novel framework that disambiguates whether multiple top-k predictions from deep networks are driven by distinct entities or a single entity, enhancing interpretability with verifiable counterfactual proofs.
Contribution
The paper presents a method that combines segmentation and attribution techniques to determine whether two class predictions are explained by separate entities in the input or by a single shared entity.
Findings
Effective disambiguation on ImageNet samples
Works across multiple models
Provides verifiable counterfactual explanations
Abstract
In the supervised classification setting, during inference, deep networks typically make multiple predictions. For a pair of such predictions (that are in the top-k predictions), two distinct possibilities might occur. On the one hand, each of the two predictions might be primarily driven by two distinct sets of entities in the input. On the other hand, it is possible that there is a single entity or set of entities that is driving the prediction for both the classes in question. This latter case, in effect, corresponds to the network making two separate guesses about the identity of a single entity type. Clearly, both the guesses cannot be true, i.e. both the labels cannot be present in the input. Current techniques in interpretability research do not readily disambiguate these two cases, since they typically consider input attributions for one class label at a time. Here, we present a…
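To make the two cases concrete, the following is a minimal sketch and not the authors' method, whose description is cut off above. It assumes a per-pixel attribution map is already available for each of the two class labels and that a segmentation of the input into integer-labelled regions is given. The function names (`segment_scores`, `driving_segments`, `segment_overlap`) and the 0.8 attribution-mass threshold are hypothetical choices made for illustration only.

```python
import numpy as np


def segment_scores(attribution, segments):
    """Fraction of the positive attribution that falls inside each segment."""
    labels = np.unique(segments)
    positive = np.clip(attribution, 0.0, None)  # ignore negative evidence
    totals = np.array([positive[segments == s].sum() for s in labels])
    denom = totals.sum()
    return labels, (totals / denom if denom > 0 else totals)


def driving_segments(attribution, segments, mass=0.8):
    """Smallest set of top-ranked segments covering `mass` of the attribution.

    `mass` is an arbitrary illustrative threshold, not a value from the paper.
    """
    labels, scores = segment_scores(attribution, segments)
    order = np.argsort(scores)[::-1]  # highest-scoring segments first
    cumulative = np.cumsum(scores[order])
    count = int(np.searchsorted(cumulative, mass) + 1)
    return set(labels[order[:count]].tolist())


def segment_overlap(attr_a, attr_b, segments, mass=0.8):
    """Jaccard overlap between the segments driving class A and class B."""
    a = driving_segments(attr_a, segments, mass)
    b = driving_segments(attr_b, segments, mass)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy input: a 4x4 image split into a left segment (0) and a right segment (1).
segments = np.zeros((4, 4), dtype=int)
segments[:, 2:] = 1

attr_left = np.zeros((4, 4))
attr_left[:, :2] = 1.0   # a class whose evidence lies in the left segment
attr_right = np.zeros((4, 4))
attr_right[:, 2:] = 1.0  # a class whose evidence lies in the right segment

distinct_case = segment_overlap(attr_left, attr_right, segments)
shared_case = segment_overlap(attr_left, attr_left, segments)
```

Under these assumptions, an overlap near zero suggests that the two labels are driven by distinct entities, while an overlap near one suggests two guesses about the same entity. A score like this is only a heuristic; it does not provide the verifiable counterfactual evidence that the summary attributes to the paper.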
Taxonomy
Topics: Subtitles and Audiovisual Media · Translation Studies and Practices
Methods: Sparse Evolutionary Training
