Online Training of LSTM Networks in Distributed Systems for Variable Length Data Sequences

Tolga Ergen; Suleyman Serdar Kozat

arXiv:1710.08744·eess.SP·February 25, 2020·IEEE Trans. Neural Networks Learn. Syst.

TL;DR

This paper presents an online, distributed training method for LSTM networks based on particle filtering. Each node in the network learns from variable-length data sequences while exchanging information only with its neighbors.

Contribution

It introduces a distributed particle filtering (DPF) algorithm for online LSTM training across a network of nodes, along with a distributed extended Kalman filtering (DEKF) algorithm used for comparison.

Findings

01

Distributed particle filtering guarantees convergence to the performance of the optimal LSTM coefficients.

02

Achieves performance comparable to centralized methods with low communication overhead.

03

Demonstrates significant improvements over state-of-the-art techniques in simulations and real data.

Abstract

In this brief paper, we investigate online training of Long Short-Term Memory (LSTM) architectures in a distributed network of nodes, where each node employs an LSTM-based structure for online regression. In particular, each node sequentially receives a variable-length data sequence with its label and can only exchange information with its neighbors to train the LSTM architecture. We first provide a generic LSTM-based regression structure for each node. In order to train this structure, we put the LSTM equations in a nonlinear state-space form for each node and then introduce a highly effective and efficient Distributed Particle Filtering (DPF) based training algorithm. We also introduce a Distributed Extended Kalman Filtering (DEKF) based training algorithm for comparison. Here, our DPF-based training algorithm guarantees convergence to the performance of the optimal LSTM coefficients…
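
The idea in the abstract, casting LSTM training as state estimation and tracking the weights with a particle filter, can be illustrated with a minimal single-node sketch. This is not the authors' algorithm: it omits the exchange of information between neighboring nodes, treats only the weights as the state under an assumed random-walk model, and assumes mean pooling over the variable-length sequence and Gaussian observation noise. All names (`lstm_predict`, `pf_step`, `q_std`, `r_std`) and all sizes are hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_predict(theta, seq, m):
    """Run an LSTM with hidden size m over seq (shape T x p), mean-pool the
    hidden states, and return a scalar regression output."""
    p = seq.shape[1]
    n_gate = m * (p + m + 1)                      # input, recurrent, bias weights per gate
    gates = theta[:4 * n_gate].reshape(4, m, p + m + 1)
    w_out = theta[4 * n_gate:]                    # final linear layer, length m
    h, c, pooled = np.zeros(m), np.zeros(m), np.zeros(m)
    for x in seq:
        z = np.concatenate([x, h, [1.0]])         # input, previous hidden state, bias term
        i, f, o = (sigmoid(gates[k] @ z) for k in range(3))
        g = np.tanh(gates[3] @ z)
        c = f * c + i * g
        h = o * np.tanh(c)
        pooled += h
    pooled /= len(seq)                            # assumed mean pooling over the sequence
    return float(w_out @ pooled)

def pf_step(particles, weights, seq, label, m, rng, q_std=0.01, r_std=0.1):
    """One particle filter update: weights are the state, the label is the observation."""
    # Assumed random-walk state model for the LSTM weights.
    particles = particles + q_std * rng.standard_normal(particles.shape)
    preds = np.array([lstm_predict(th, seq, m) for th in particles])
    # Gaussian log-likelihood of the label, accumulated in the log domain.
    log_w = np.log(weights + 1e-300) - 0.5 * ((label - preds) / r_std) ** 2
    log_w -= log_w.max()
    weights = np.exp(log_w)
    weights /= weights.sum()
    # Resample only when the effective sample size drops below half.
    ess = 1.0 / np.sum(weights ** 2)
    if ess < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights

rng = np.random.default_rng(0)
m, p, n_particles = 2, 3, 200
dim = 4 * m * (p + m + 1) + m
particles = 0.5 * rng.standard_normal((n_particles, dim))
weights = np.full(n_particles, 1.0 / n_particles)
true_theta = 0.5 * rng.standard_normal(dim)       # synthetic "teacher" weights

for _ in range(50):
    T = int(rng.integers(2, 8))                   # variable sequence length
    seq = rng.standard_normal((T, p))
    label = lstm_predict(true_theta, seq, m)
    particles, weights = pf_step(particles, weights, seq, label, m, rng)

theta_hat = weights @ particles                   # weighted-mean estimate of the weights
```

Resampling is triggered by the effective sample size rather than at every step, a common way to limit weight degeneracy without discarding particle diversity too early. A distributed version would additionally need a rule for combining estimates across neighboring nodes, which this sketch does not attempt.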
