Analytical aspects of non-differentiable neural networks
Gian Paolo Leonardi, Matteo Spallanzani

TL;DR
This paper investigates the expressivity and approximation capabilities of non-differentiable neural networks. It proposes a regularisation technique and gives approximation results for networks with quantized and Heaviside-type activations.
Contribution
It shows that quantized neural networks match the expressivity of deep neural networks for the approximation of Lipschitz functions, and it introduces a layer-wise stochastic regularisation method that yields differentiable approximations of non-differentiable networks.
Findings
Quantized neural networks (QNNs) have the same approximation power as deep neural networks (DNNs) for Lipschitz functions.
A layer-wise stochastic regularisation technique effectively approximates non-differentiable networks.
Smooth networks can approximate Heaviside-type activation networks under certain conditions.
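To make the idea of stochastic regularisation concrete, here is a minimal sketch, assuming additive uniform noise on the input of a single Heaviside unit. This noise model, and the names `smoothed_heaviside_mc`, `smoothed_heaviside_exact` and `beta`, are illustrative assumptions and not necessarily the construction used in the paper. The expected output of the noisy step is a continuous ramp, which the sketch checks against a Monte Carlo estimate.

```python
import numpy as np

def heaviside(x):
    """Heaviside step: 1 where x >= 0, else 0."""
    return (x >= 0).astype(float)

def smoothed_heaviside_mc(x, beta, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[H(x + xi)] with xi ~ Uniform(-beta, beta).

    The uniform noise model is an assumption made for illustration.
    """
    rng = np.random.default_rng(seed)
    xi = rng.uniform(-beta, beta, size=(n_samples, 1))
    # Broadcast: each noise sample is added to every input point.
    return heaviside(x[None, :] + xi).mean(axis=0)

def smoothed_heaviside_exact(x, beta):
    """Closed form of the same expectation: a clipped linear ramp."""
    return np.clip((x + beta) / (2.0 * beta), 0.0, 1.0)

x = np.linspace(-1.0, 1.0, 9)
beta = 0.5  # hypothetical noise half-width
mc = smoothed_heaviside_mc(x, beta)
exact = smoothed_heaviside_exact(x, beta)
max_gap = float(np.max(np.abs(mc - exact)))
print("largest gap between Monte Carlo and closed form:", max_gap)
```

With uniform noise the averaged unit is continuous but still has kinks at ±beta; a noise distribution with a smooth density, such as a Gaussian, would give a smooth averaged unit instead. Shrinking `beta` towards zero recovers the original step.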
Abstract
Research in computational deep learning has directed considerable efforts towards hardware-oriented optimisations for deep neural networks, via the simplification of the activation functions, or the quantization of both activations and weights. The resulting non-differentiability (or even discontinuity) of the networks poses some challenging problems, especially in connection with the learning process. In this paper, we address several questions regarding both the expressivity of quantized neural networks and approximation techniques for non-differentiable networks. First, we answer in the affirmative the question of whether QNNs have the same expressivity as DNNs in terms of approximation of Lipschitz functions in the L^∞ norm. Then, considering a continuous but not necessarily differentiable network, we describe a layer-wise stochastic regularisation technique to produce…
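As a rough illustration of why step-type units can approximate Lipschitz functions uniformly, the sketch below builds a staircase on [0, 1] as a sum of shifted Heaviside steps and checks the sup-norm error against the elementary bound L / (2N), where L is the Lipschitz constant and N the number of cells. It is a textbook construction with real-valued output weights, not the paper's proof or a fully quantized network; `staircase_approx` and `n_cells` are hypothetical names.

```python
import numpy as np

def heaviside(x):
    """Heaviside step: 1 where x >= 0, else 0."""
    return (x >= 0).astype(float)

def staircase_approx(f, n_cells):
    """Piecewise-constant approximation of f on [0, 1].

    g(x) = f(c_0) + sum_k (f(c_k) - f(c_{k-1})) * H(x - t_k),
    where c_k are cell centres and t_k are interior cell edges.
    """
    edges = np.linspace(0.0, 1.0, n_cells + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    values = f(centres)
    jumps = np.diff(values)      # output weights (real-valued, not quantized)
    thresholds = edges[1:-1]     # one Heaviside unit per interior edge

    def g(x):
        steps = heaviside(x[:, None] - thresholds[None, :])
        return values[0] + steps @ jumps

    return g

f = np.sin                       # Lipschitz on [0, 1] with constant 1
lipschitz_const = 1.0
n_cells = 64
g = staircase_approx(f, n_cells)

x = np.linspace(0.0, 1.0, 10_001)
sup_error = float(np.max(np.abs(f(x) - g(x))))
bound = lipschitz_const / (2 * n_cells)
print("sup error on grid:", sup_error, "bound:", bound)
```

Every point lies within half a cell width of its cell centre, so the error is at most L / (2N); doubling `n_cells` halves the bound.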
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
