A Closer Look at Double Backpropagation
Christian Etmann

TL;DR
This paper gives a general theoretical framework for double backpropagation in neural networks, derives optimized backpropagation rules from it, and analyzes the discontinuities that arise in the loss surface of ReLU networks.
Contribution
It offers a general Hilbert space framework for the derivatives involved in double backpropagation, uses it to derive optimized backpropagation rules, and analyzes the discontinuities of ReLU networks.
Findings
Reduced the calculations for Frobenius-norm penalties on Jacobians by roughly a third for locally linear activation functions
Described the discontinuous loss surface of ReLU networks in both the inputs and the parameters
Showed why these discontinuities do not pose a big problem in practice
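The discontinuity finding can be illustrated with a toy example. The sketch below is not taken from the paper: the one-dimensional ReLU network, its weights `w`, `b`, `v`, and the helper names `model` and `penalty` are all assumptions chosen so that the jumps are easy to read off. It evaluates a squared input-gradient penalty on either side of a ReLU kink, first while varying the input and then while varying a bias.

```python
import jax
import jax.numpy as jnp

# Hypothetical toy network (not from the paper): f(x) = sum_i v_i * relu(w_i * x + b_i)
w = jnp.array([1.0, 1.0])
b = jnp.array([0.0, -1.0])
v = jnp.array([1.0, 2.0])

def model(b, x):
    # Scalar input, scalar output; the ReLU kinks sit at x = 0 and x = 1 for the b above.
    return v @ jax.nn.relu(w * x + b)

def penalty(b, x):
    # Squared derivative of the output with respect to the input.
    return jax.grad(model, argnums=1)(b, x) ** 2

# Varying the input: df/dx is piecewise constant, so the penalty jumps at each kink.
xs = [-0.5, 0.5, 0.999, 1.001, 1.5]
penalties_in_x = [float(penalty(b, x)) for x in xs]
print(penalties_in_x)  # [0.0, 1.0, 1.0, 9.0, 9.0]

# Varying a parameter: nudging the second bias across the kink at fixed x = 1.0
# produces the same jump, so the penalty is also discontinuous in the parameters.
b_below = jnp.array([0.0, -1.001])
b_above = jnp.array([0.0, -0.999])
penalties_in_b = [float(penalty(b_below, 1.0)), float(penalty(b_above, 1.0))]
print(penalties_in_b)  # [1.0, 9.0]
```

Away from the kinks the penalty is locally constant in `x` and in `b`. In this toy case the jumps occur only where a pre-activation crosses zero.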
Abstract
In recent years, an increasing number of neural network models have included derivatives with respect to inputs in their loss functions, resulting in so-called double backpropagation for first-order optimization. However, so far no general description of the involved derivatives exists. Here, we cover a wide array of special cases in a very general Hilbert space framework, which allows us to provide optimized backpropagation rules for many real-world scenarios. This includes the reduction of calculations for Frobenius-norm penalties on Jacobians by roughly a third for locally linear activation functions. Furthermore, we provide a description of the discontinuous loss surface of ReLU networks both in the inputs and the parameters and demonstrate why the discontinuities do not pose a big problem in reality.
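To make the term concrete, here is a minimal JAX sketch of double backpropagation. It relies on generic automatic differentiation and does not implement the paper's optimized rules. The two-layer tanh network, the names `init_params`, `model`, and `loss`, and the penalty weight `lam` are assumptions made for illustration. The loss contains the gradient of the model output with respect to the input, so differentiating the loss with respect to the parameters backpropagates through a backpropagation.

```python
import jax
import jax.numpy as jnp

def init_params(key, d_in=3, d_hidden=8):
    # Hypothetical two-layer network with scalar output (illustrative sizes).
    k1, k2 = jax.random.split(key)
    return {
        "W1": jax.random.normal(k1, (d_hidden, d_in)) / jnp.sqrt(d_in),
        "b1": jnp.zeros(d_hidden),
        "w2": jax.random.normal(k2, (d_hidden,)) / jnp.sqrt(d_hidden),
    }

def model(params, x):
    h = jnp.tanh(params["W1"] @ x + params["b1"])
    return params["w2"] @ h

def loss(params, x, y, lam=0.1):
    # Data-fit term plus a squared-norm penalty on the input gradient.
    fit = (model(params, x) - y) ** 2
    input_grad = jax.grad(model, argnums=1)(params, x)  # first backward pass
    return fit + lam * jnp.sum(input_grad ** 2)

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jnp.array([0.5, -1.0, 2.0])
y = 1.0

# Second backward pass: gradient of the whole loss (penalty included) w.r.t. parameters.
param_grads = jax.grad(loss)(params, x, y)
print({name: g.shape for name, g in param_grads.items()})
```

For a vector-valued model the same pattern applies with `jax.jacrev` in place of the inner `jax.grad`, which gives a Frobenius-norm penalty on the Jacobian.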
Taxonomy
Topics: Speech and Audio Processing · Algorithms and Data Compression · Blind Source Separation Techniques
