Training of Deep Neural Networks based on Distance Measures using RMSProp
Thomas Kurbiel, Shahrzad Khaleghian

TL;DR
This paper demonstrates that neural networks based on distance measures and Gaussian activations can be effectively trained with RMSProp, reducing vanishing gradient issues compared to traditional dot-product networks.
Contribution
It introduces a novel approach to training distance-based neural networks using RMSProp, showing improved gradient stability and training efficiency for deep architectures.
Findings
Distance-based neural networks are trainable with RMSProp.
Proper initialization reduces vanishing/exploding gradients.
Distance networks outperform traditional networks in deep settings.
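For readers unfamiliar with the optimizer named in these findings, the following is a minimal sketch of the standard RMSProp update, not the authors' training code. The function name `rmsprop_step`, the hyperparameter defaults, and the toy quadratic objective are all assumptions chosen for illustration.

```python
import numpy as np

def rmsprop_step(param, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """One RMSProp update; returns the new parameter and the new running average."""
    # Exponential moving average of the squared gradient, kept per component.
    cache = decay * cache + (1.0 - decay) * grad ** 2
    # Divide the step by the root of that average, so components with tiny
    # gradients still move at a rate comparable to components with large ones.
    param = param - lr * grad / (np.sqrt(cache) + eps)
    return param, cache

# Toy objective (an assumption): f(w) = sum(scale * w**2), whose two gradient
# components differ in magnitude by four orders.
scale = np.array([1e-4, 1.0])
w = np.array([1.0, 1.0])
cache = np.zeros_like(w)
for _ in range(2000):
    grad = 2.0 * scale * w
    w, cache = rmsprop_step(w, grad, cache, lr=1e-2)
```

Because each component is normalized by its own gradient history, both coordinates approach the minimum at a similar pace even though their raw gradients differ widely. Plain gradient descent with a single learning rate lacks this property.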
Abstract
The vanishing gradient problem was a major obstacle to the success of deep learning. In recent years it was gradually alleviated through multiple different techniques. However, the problem was not really overcome in a fundamental way, since it is inherent to neural networks with activation functions based on dot products. In a series of papers, we are going to analyze alternative neural network structures which are not based on dot products. In this first paper, we revisit neural networks built up of layers based on distance measures and Gaussian activation functions. These kinds of networks were rarely used in the past since they are hard to train when using plain stochastic gradient descent methods. We show that by using Root Mean Square Propagation (RMSProp) it is possible to efficiently learn multi-layer neural networks. Furthermore, we show that when appropriately initialized…
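To make the idea of a layer "based on distance measures and Gaussian activation functions" concrete, here is a minimal forward-pass sketch. It assumes a squared Euclidean distance to a learned center per unit and a per-unit width. The abstract does not specify the exact distance measure or parameterization, so `distance_layer`, `centers`, and `widths` are hypothetical names, not the paper's notation.

```python
import numpy as np

def distance_layer(x, centers, widths):
    """Forward pass: squared distance to each unit's center, then a Gaussian.

    x:       (batch, n_in) inputs
    centers: (n_out, n_in) one center per output unit (assumed parameterization)
    widths:  (n_out,) positive width per output unit (assumed parameterization)
    """
    # Broadcast to (batch, n_out, n_in) and sum over the input dimension.
    diff = x[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian activation: 1 when the input sits on the center, decaying with distance.
    return np.exp(-sq_dist / (2.0 * widths ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
centers = rng.normal(size=(5, 3))
widths = np.ones(5)
out = distance_layer(x, centers, widths)  # shape (4, 5), values in (0, 1]
```

Unlike a dot-product unit, each unit here responds to how close the input is to its center rather than to the input's projection onto a weight vector. Inputs far from every center yield activations near zero, which is consistent with the abstract's emphasis on appropriate initialization.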
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Face and Expression Recognition
