A generalized neural tangent kernel for surrogate gradient learning

Luke Eilers; Raoul-Martin Memmesheimer; Sven Goedeke

arXiv:2405.15539 · stat.ML · June 24, 2025

TL;DR

This paper introduces a generalized neural tangent kernel, called the surrogate gradient NTK, enabling theoretical analysis of surrogate gradient learning in neural networks with non-differentiable activation functions, supported by numerical experiments.

Contribution

It extends the neural tangent kernel framework to surrogate gradient learning, providing a rigorous theoretical foundation for analyzing networks with non-differentiable activations.

Findings

01. The surrogate gradient NTK accurately characterizes the behavior of surrogate gradient learning (SGL).

02. A naive extension of the NTK fails for activation functions with jumps.

03. Numerical experiments validate the surrogate gradient NTK's effectiveness.

Abstract

State-of-the-art neural network training methods depend on the gradient of the network function. Therefore, they cannot be applied to networks whose activation functions do not have useful derivatives, such as binary and discrete-time spiking neural networks. To overcome this problem, the activation function's derivative is commonly substituted with a surrogate derivative, giving rise to surrogate gradient learning (SGL). This method works well in practice but lacks theoretical foundation. The neural tangent kernel (NTK) has proven successful in the analysis of gradient descent. Here, we provide a generalization of the NTK, which we call the surrogate gradient NTK, that enables the analysis of SGL. First, we study a naive extension of the NTK to activation functions with jumps, demonstrating that gradient descent for such activation functions is also ill-posed in the infinite-width…
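The substitution described in the abstract can be illustrated with a minimal sketch. It is not the paper's setup: the single neuron, the squared-error loss, the toy data, the tanh-based surrogate, and the names `sign_activation`, `surrogate_derivative`, `surrogate_gradient_step`, and `train` are all assumptions chosen for illustration. The forward pass uses the sign activation, whose derivative is zero almost everywhere, while the backward pass uses the derivative of a smooth stand-in.

```python
import math

def sign_activation(z):
    """Activation with a jump at 0; its true derivative is zero almost everywhere."""
    return 1.0 if z >= 0.0 else -1.0

def surrogate_derivative(z, beta=2.0):
    """Derivative of tanh(beta * z), an assumed surrogate for the derivative of sign."""
    return beta * (1.0 - math.tanh(beta * z) ** 2)

def surrogate_gradient_step(w, b, xs, ys, lr=0.1):
    """One full-batch step on the mean of 0.5 * (sign(w*x + b) - y)**2."""
    grad_w, grad_b = 0.0, 0.0
    for x, y in zip(xs, ys):
        z = w * x + b
        err = sign_activation(z) - y        # forward pass: true activation
        dz = err * surrogate_derivative(z)  # backward pass: surrogate derivative
        grad_w += dz * x
        grad_b += dz
    n = len(xs)
    return w - lr * grad_w / n, b - lr * grad_b / n

def train(xs, ys, w=-0.5, b=0.0, steps=200):
    """Repeat the surrogate gradient step from a deliberately wrong initial weight."""
    for _ in range(steps):
        w, b = surrogate_gradient_step(w, b, xs, ys)
    return w, b

# Toy data (hypothetical): the label is the sign of the input.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-1.0, -1.0, 1.0, 1.0]

w, b = train(xs, ys)
predictions = [sign_activation(w * x + b) for x in xs]
print(w, b, predictions)
```

With the true derivative of the sign function in place of `surrogate_derivative`, every gradient term would be zero and the weight would never leave its initial value; the surrogate is what lets the update move at all. The sharpness parameter `beta` is a free choice in this sketch.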

Code & Models

Datasets

Kylan12/Synthetic-AI-ML-Dataset
dataset · 42 dl

Videos

A generalized neural tangent kernel for surrogate gradient learning · SlidesLive

Taxonomy

Topics: Neural Networks and Applications

Methods: Neural Tangent Kernel