Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning
Nuri Mert Vural, Fatih Ilhan, Suleyman S. Kozat

TL;DR
This paper analyzes the stability of the decoupled extended Kalman filter (DEKF) in LSTM-based online learning, derives sufficient conditions for its stability, and compares it with other LSTM training methods in numerical simulations.
Contribution
The paper models DEKF as a perturbed EKF and derives sufficient conditions for its stability, showing that its convergence properties in LSTM training are comparable to those of the global EKF.
Findings
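DEKF remains stable if the perturbations introduced by decoupling stay bounded.
Under that condition, DEKF has convergence and stability properties similar to those of the global EKF.
In the simulations, the well-known hyper-parameter selection approaches from the literature satisfy the stability conditions.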
Abstract
We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long short-term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations introduced by decoupling stay bounded, DEKF learns LSTM parameters with convergence and stability properties similar to those of the global extended Kalman filter learning algorithm. We verify our results with several numerical simulations and compare DEKF with other LSTM training methods. In our simulations, we also observe that the well-known hyper-parameter selection approaches used for DEKF in the literature satisfy our conditions.
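To make the "decoupling" concrete, the following is a minimal sketch of one DEKF update, assuming the standard node-decoupled form from the Kalman-filter training literature rather than the paper's exact formulation, which this summary does not reproduce. The function name `dekf_step`, its argument names, and the toy linear model are all hypothetical, and no LSTM is involved. Each parameter group keeps its own covariance block, and the cross-group covariance terms that a global EKF would track are dropped. Those dropped terms are the perturbation the abstract refers to.

```python
import numpy as np

def dekf_step(weights, covs, jacobians, error, meas_noise, proc_noise):
    """One decoupled EKF update over parameter groups (illustrative sketch).

    weights[i]   : (n_i,)      parameters of group i
    covs[i]      : (n_i, n_i)  covariance block of group i
    jacobians[i] : (m, n_i)    d(output) / d(weights[i])
    error        : (m,)        target minus prediction
    meas_noise   : (m, m)      measurement noise covariance R
    proc_noise   : float       scalar q, added as q * I to each block
    """
    # Innovation covariance: R plus each group's own contribution.
    # Cross-group terms H_i P_ij H_j^T are ignored (the decoupling).
    S = meas_noise.copy()
    for H, P in zip(jacobians, covs):
        S = S + H @ P @ H.T
    S_inv = np.linalg.inv(S)

    new_weights, new_covs = [], []
    for w, P, H in zip(weights, covs, jacobians):
        K = P @ H.T @ S_inv  # per-group Kalman gain, shape (n_i, m)
        new_weights.append(w + K @ error)
        new_covs.append(P - K @ H @ P + proc_noise * np.eye(len(w)))
    return new_weights, new_covs

# Toy usage (assumed setup): scalar-output linear model with 4 weights
# split into two groups of 2, so the Jacobian is just the input row.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5, 3.0])
groups = [np.zeros(2), np.zeros(2)]
blocks = [np.eye(2), np.eye(2)]
R = np.array([[0.5]])
for _ in range(50):
    x = rng.normal(size=(1, 4))
    H_parts = [x[:, :2], x[:, 2:]]
    pred = sum(H @ w for H, w in zip(H_parts, groups))
    groups, blocks = dekf_step(groups, blocks, H_parts, x @ w_true - pred, R, 1e-4)
```

With a single group the function reduces to the ordinary (global) EKF update, so the two methods can be compared by changing only how the weights are partitioned. The benefit of decoupling is that each block costs O(n_i^2) storage instead of O(n^2) for the full covariance.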
Taxonomy
Topics: Neural Networks and Applications · Advanced Adaptive Filtering Techniques · Neural Networks Stability and Synchronization
Methods: Sigmoid Activation · Tanh Activation · Memory Network · Long Short-Term Memory
