LiteLSTM Architecture for Deep Recurrent Neural Networks

Nelly Elsayed; Zag ElSayed; Anthony S. Maida

arXiv:2201.11624 · cs.LG · October 26, 2022 · 1 citation

Open Access

TL;DR

This paper introduces LiteLSTM, a computationally efficient variant of the traditional LSTM that reduces computational cost through weight sharing, making it suitable for big data applications such as IoT security and medical data analysis.

Contribution

It presents a novel LiteLSTM architecture that maintains performance while significantly reducing computational costs using weight sharing techniques.

Findings

1. Reduces computation and energy consumption compared to standard LSTM

2. Maintains comparable performance on vision and cybersecurity datasets

3. Potentially lowers CO2 footprint of deep learning models

Abstract

Long short-term memory (LSTM) is a robust recurrent neural network architecture for learning spatiotemporal sequential data. However, it requires significant computational power to train and to implement, in both software and hardware. This paper proposes a novel LiteLSTM architecture that reduces the LSTM's computational components through weight sharing, lowering the overall cost of the architecture while maintaining its performance. The proposed LiteLSTM can be significant for learning from big data where processing time is crucial, such as in the security of IoT devices and in medical data. Moreover, it helps to reduce the CO2 footprint. The proposed model was evaluated empirically on two datasets from the computer vision and cybersecurity domains.
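
The abstract does not say which weights are shared, so the following is only a rough sketch of the general idea, not the authors' formulation. It assumes a single sigmoid gate reused wherever a standard LSTM would apply separate input, forget, and output gates, leaving two weight blocks (gate and candidate) instead of four. The names `init_block`, `shared_gate_step`, and `count_params` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Return W @ v + b, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def init_block(hidden, inputs, rng):
    """One weight block: a hidden x (inputs + hidden) matrix plus a bias vector."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(inputs + hidden)]
         for _ in range(hidden)]
    b = [0.0] * hidden
    return W, b


def shared_gate_step(x, h_prev, c_prev, gate, cand):
    """One time step of a hypothetical shared-gate recurrent cell.

    Assumption: the same gate activation g stands in for the input,
    forget, and output gates of a standard LSTM.
    """
    z = x + h_prev                                   # concatenate input and previous state
    g = [sigmoid(a) for a in affine(*gate, z)]       # the single shared gate
    u = [math.tanh(a) for a in affine(*cand, z)]     # candidate memory
    c = [gi * ci + gi * ui for gi, ci, ui in zip(g, c_prev, u)]
    h = [gi * math.tanh(ci) for gi, ci in zip(g, c)]
    return h, c


def count_params(hidden, inputs, blocks):
    """Trainable parameters for a cell built from `blocks` weight blocks."""
    return blocks * (hidden * (inputs + hidden) + hidden)


rng = random.Random(0)
hidden, inputs = 4, 3
gate = init_block(hidden, inputs, rng)
cand = init_block(hidden, inputs, rng)
h, c = shared_gate_step([0.5, -0.2, 0.1], [0.0] * hidden, [0.0] * hidden, gate, cand)

standard = count_params(hidden, inputs, blocks=4)    # three gates + candidate
shared = count_params(hidden, inputs, blocks=2)      # one gate + candidate
print(standard, shared)
```

Under this assumption the shared-gate cell carries half the trainable parameters of a standard LSTM cell of the same size, which is the kind of saving the abstract attributes to weight sharing. The actual reduction in LiteLSTM depends on details not given here.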

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Anomaly Detection Techniques and Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory