An interpretation of the final fully connected layer

Siddhartha

arXiv:2205.11908 · cs.LG · May 25, 2022

TL;DR

This paper introduces a method for interpreting the learned weights of the final fully connected layer in image classification networks. It treats the cross-entropy supervised learning objective as a special case of the policy gradient objective and uses that view to identify the most discriminative and confusing regions of an image.

Contribution

The paper proposes an interpretation technique that makes no prior assumption about the network architecture, has low computational cost, and is motivated by a connection between supervised learning and the policy gradient objective in reinforcement learning.

Findings

01 Identifies discriminative image regions effectively
02 Works with various pre-trained models
03 Provides insights into neural network decision-making

Abstract

In recent years, neural networks have achieved state-of-the-art accuracy on various tasks, but interpreting the generated outputs remains difficult. In this work we attempt to provide a method to understand the learnt weights in the final fully connected layer of image classification models. We motivate our method by drawing a connection between the policy gradient objective in RL and the supervised learning objective. We suggest that the commonly used cross-entropy-based supervised learning objective can be regarded as a special case of the policy gradient objective. Using this insight, we propose a method to find the most discriminative and confusing parts of an image. Our method does not make any prior assumption about the neural network architecture and has low computational cost. We apply our method to publicly available pre-trained models and report the generated results.
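One way to see the connection between cross entropy and the policy gradient objective is sketched below. This is an illustrative reading, not necessarily the derivation used in the paper: it assumes the classifier's softmax output is treated as a policy over class labels, the "action" is fixed to the true label, and the reward is 1. Under those assumptions the single-sample REINFORCE term equals the negative of the cross-entropy gradient with respect to the logits. The function names (softmax, cross_entropy_grad, reinforce_grad) and the example logits are hypothetical and chosen only for this illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits, label):
    """Gradient of -log softmax(logits)[label] with respect to the logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


def reinforce_grad(logits, action, reward):
    """Single-sample REINFORCE term: reward * d log pi(action) / d logits."""
    probs = softmax(logits)
    return [reward * ((1.0 if i == action else 0.0) - p)
            for i, p in enumerate(probs)]


# Hypothetical logits for a 3-class problem; class 0 is the true label.
logits = [2.0, -1.0, 0.5]
label = 0

ce = cross_entropy_grad(logits, label)
# Assumption: the action is forced to the true label and the reward is 1.
pg = reinforce_grad(logits, action=label, reward=1.0)

# Ascending the policy term is the same step as descending cross entropy,
# so the two gradients should cancel component by component.
max_gap = max(abs(c + p) for c, p in zip(ce, pg))
print("largest |ce + pg| component:", max_gap)
```

Note that this equivalence holds only for the forced-action, unit-reward case assumed here. If actions were instead sampled from the model's own softmax, the expected policy gradient would carry an extra factor of the predicted probability of the true class.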

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)