Leaky ReLUs That Differ in Forward and Backward Pass Facilitate Activation Maximization in Deep Neural Networks

Christoph Linse; Erhardt Barth; Thomas Martinetz

arXiv:2410.16958·cs.CV·October 23, 2024

TL;DR

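This paper runs activation maximization (AM) with Leaky ReLUs whose negative slope is high in the backward pass but unchanged (usually zero) in the forward pass. This significantly increases the maxima AM finds, and the same idea leads to ProxyGrad, an optimization technique for neural networks.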

Contribution

The paper proposes using Leaky ReLUs with a high negative slope in the backward pass only, and introduces ProxyGrad, an optimization technique that employs a secondary network as a proxy for gradient computation.

Findings

01

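AM fails to produce optimal input stimuli even for simple functions containing ReLUs or Leaky ReLUs; using Leaky ReLUs with a high negative slope in the backward pass significantly increases the maxima found.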

02

ProxyGrad outperforms traditional training methods on several benchmarks.

03

Using different slopes in Leaky ReLUs enhances interpretability and training efficiency.

Abstract

Activation maximization (AM) strives to generate optimal input stimuli, revealing features that trigger high responses in trained deep neural networks. AM is an important method of explainable AI. We demonstrate that AM fails to produce optimal input stimuli for simple functions containing ReLUs or Leaky ReLUs, casting doubt on the practical usefulness of AM and the visual interpretation of the generated images. This paper proposes a solution based on using Leaky ReLUs with a high negative slope in the backward pass while keeping the original, usually zero, slope in the forward pass. The approach significantly increases the maxima found by AM. The resulting ProxyGrad algorithm implements a novel optimization technique for neural networks that employs a secondary network as a proxy for gradient computation. This proxy network is designed to have a simpler loss landscape with fewer local…
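The core idea in the abstract, keeping the ReLU in the forward pass while differentiating as if it were a Leaky ReLU, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy function `f(x) = relu(x - 1)`, the backward slope of 0.3, the step size, the input range `[-2, 2]`, and all function names.

```python
def relu(z):
    """Forward pass: a standard ReLU (negative slope of zero)."""
    return max(z, 0.0)


def relu_backward(z, backward_slope):
    """Derivative used in the backward pass only.

    With backward_slope = 0.0 this is the true ReLU derivative.
    With backward_slope > 0.0 it is the derivative of a Leaky ReLU,
    even though the forward pass still uses a plain ReLU.
    """
    return 1.0 if z > 0.0 else backward_slope


def f(x):
    """Toy objective (hypothetical): a single ReLU unit with bias -1."""
    return relu(x - 1.0)


def activation_maximization(x0, backward_slope, steps=100, lr=0.1,
                            lo=-2.0, hi=2.0):
    """Gradient ascent on f, with the input clipped to [lo, hi]."""
    x = x0
    for _ in range(steps):
        # Chain rule: d(x - 1)/dx = 1, so the gradient is just the
        # (possibly surrogate) ReLU derivative at the pre-activation.
        grad = relu_backward(x - 1.0, backward_slope)
        x = min(max(x + lr * grad, lo), hi)
    return x, f(x)


# True ReLU gradient: the unit is inactive at x0 = 0, so the gradient is zero.
x_relu, f_relu = activation_maximization(0.0, backward_slope=0.0)

# Leaky slope in the backward pass only (0.3 is an illustrative value).
x_leaky, f_leaky = activation_maximization(0.0, backward_slope=0.3)

print("ReLU backward:      ", x_relu, f_relu)
print("Leaky ReLU backward:", x_leaky, f_leaky)
```

In this toy setting, the run with the true ReLU derivative never leaves the inactive region, while the run with the surrogate slope receives a non-zero signal and climbs to the upper bound of the input range. The sketch only covers the backward-slope idea for AM; it does not attempt to reproduce ProxyGrad.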

Taxonomy

Methods: Attention Model