On a novel training algorithm for sequence-to-sequence predictive recurrent networks
Boris Rubinstein

TL;DR
This paper introduces a new memoryless training algorithm for sequence-to-sequence predictive recurrent networks that improves robustness and accuracy over traditional methods, with potential implications for neuroscience-inspired models.
Contribution
The paper proposes a novel memoryless training algorithm for seq2seq networks, reducing memory requirements and enhancing prediction performance compared to traditional approaches.
Findings
The new algorithm is more robust than the traditional seq2seq algorithm in time series prediction.
It achieves higher accuracy than traditional seq2seq algorithms.
Parameters of trained networks show interdependence that can be exploited for efficiency.
Abstract
Neural networks mapping sequences to sequences (seq2seq) have led to significant progress in machine translation and speech recognition. Their traditional architecture includes two recurrent networks (RNs) followed by a linear predictor. In this manuscript we analyze the corresponding algorithm and show that the parameters of the RNs of a well-trained predictive network are not independent of each other. This dependence can be used to significantly improve the network's effectiveness. Traditional seq2seq algorithms require short-term memory of a size proportional to the predicted sequence length, a requirement that is quite difficult to implement in a neuroscience context. We present a novel memoryless algorithm for seq2seq predictive networks and compare it to the traditional one in the context of time series prediction. We show that the new algorithm is more robust and makes…
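The traditional architecture named in the abstract (two recurrent networks followed by a linear predictor) can be sketched roughly as below. This is an illustrative sketch only, not the paper's code and not its memoryless algorithm. It assumes plain tanh recurrent cells, untrained random weights, and autoregressive decoding seeded with the last observed point. All names (rnn_step, seq2seq_predict, init_rn) and all sizes are hypothetical.

```python
import numpy as np

def rnn_step(x, h, W_in, W_rec, b):
    """One step of a plain tanh recurrent network (assumed cell type)."""
    return np.tanh(W_in @ x + W_rec @ h + b)

def seq2seq_predict(inputs, n_out, enc, dec, W_out, b_out):
    """Encode `inputs` of shape (T, d), then decode `n_out` predictions."""
    h = np.zeros(enc["W_rec"].shape[0])
    for x in inputs:
        # First RN: consume the observed sequence into a hidden state.
        h = rnn_step(x, h, enc["W_in"], enc["W_rec"], enc["b"])
    x = inputs[-1]  # assumption: seed the decoder with the last observed point
    outputs = []
    for _ in range(n_out):
        # Second RN followed by the linear predictor; each prediction
        # is fed back as the next decoder input.
        h = rnn_step(x, h, dec["W_in"], dec["W_rec"], dec["b"])
        x = W_out @ h + b_out
        outputs.append(x)  # in this sketch, storage grows with n_out
    return np.stack(outputs)

def init_rn(d, n, rng):
    """Random, untrained parameters for one RN (hypothetical helper)."""
    return {
        "W_in": rng.normal(0.0, 0.1, (n, d)),
        "W_rec": rng.normal(0.0, 0.1, (n, n)),
        "b": np.zeros(n),
    }

rng = np.random.default_rng(0)
d, n = 1, 8  # input dimension and hidden size, chosen arbitrarily
enc, dec = init_rn(d, n, rng), init_rn(d, n, rng)
W_out, b_out = rng.normal(0.0, 0.1, (d, n)), np.zeros(d)

series = np.sin(np.linspace(0.0, 3.0, 20)).reshape(-1, 1)
pred = seq2seq_predict(series, 5, enc, dec, W_out, b_out)
print(pred.shape)  # (5, 1)
```

The weights here are untrained, so the predicted values are meaningless; the sketch only shows the data flow. The abstract does not specify exactly what the short-term memory stores, so the growing outputs list should be read as a loose analogy rather than as the mechanism the paper analyzes.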
Taxonomy
Topics: Neural Networks and Applications · Network Packet Processing and Optimization · Network Security and Intrusion Detection
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
