Deep Learning in Target Space
Michael Fairbank, Spyridon Samothrakis, Luca Citi

TL;DR
This paper introduces a re-parameterization of neural network training into target space, which simplifies optimization, mitigates exploding gradients, and enhances training speed and generalization, especially for deep and recurrent networks.
Contribution
It proposes a novel target space parameterization for neural networks, improving training efficiency and enabling better handling of deep and recurrent architectures.
Findings
Faster training using target space re-parameterization
Improved generalization in neural networks
Effective for deep and recurrent network structures
Abstract
Deep learning uses neural networks which are parameterised by their weights. The neural networks are usually trained by tuning the weights to directly minimise a given loss function. In this paper we propose to re-parameterise the weights into targets for the firing strengths of the individual nodes in the network. Given a set of targets, it is possible to calculate the weights which make the firing strengths best meet those targets. It is argued that using targets for training addresses the problem of exploding gradients, by a process which we call cascade untangling, and makes the loss-function surface smoother to traverse, and so leads to easier, faster training, and also potentially better generalisation, of the neural network. It also allows for easier learning of deeper and recurrent network structures. The necessary conversion of targets to weights comes at an extra computational…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDomain Adaptation and Few-Shot Learning · Topic Modeling · Advanced Neural Network Applications
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
