A Video Recognition Method by using Adaptive Structural Learning of Long Short Term Memory based Deep Belief Network
Shin Kamada, Takumi Ichimura

TL;DR
This paper introduces an adaptive structural learning approach for Deep Belief Networks combined with LSTM ideas, achieving high accuracy in video recognition tasks like Moving MNIST.
Contribution
It proposes a novel adaptive learning method for DBNs with LSTM extensions, optimizing network structure during training for improved video recognition.
Findings
Achieved over 90% prediction accuracy on Moving MNIST.
Demonstrated superior performance compared to standard LSTM models.
Extended adaptive RBM and DBN algorithms for time-series analysis.
Abstract
Deep learning builds deep architectures such as multi-layered artificial neural networks to effectively represent multiple features of input patterns. The adaptive structural learning method of Deep Belief Network (DBN) can achieve a high classification capability while searching for the optimal network structure during training. The method finds the optimal number of hidden neurons of a Restricted Boltzmann Machine (RBM) for the given input data by the neuron generation-annihilation algorithm, and then adds a new layer to the DBN by the layer generation algorithm to realize a deep data representation. Moreover, the learning algorithms of Adaptive RBM and Adaptive DBN were extended to time-series analysis by using the idea of LSTM (Long Short Term Memory). In this paper, our proposed prediction method was applied to Moving MNIST, which is a benchmark data set for video…
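To make the neuron generation-annihilation idea concrete, the following is a minimal sketch of an RBM whose hidden layer can grow and shrink during training. It is not the authors' implementation. The class name `GrowingRBM`, the `drift` statistic used to trigger generation, both thresholds, and the choice to initialize a new neuron as a noisy copy of its parent are all assumptions made for illustration; the abstract does not specify the actual conditions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GrowingRBM:
    """Binary RBM trained with CD-1 whose hidden layer can grow or shrink.

    Illustrative only: the generation and annihilation criteria below are
    assumed stand-ins, not the conditions used in the paper.
    """

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        # Running magnitude of recent weight updates per hidden neuron
        # (hypothetical signal that a neuron has not settled).
        self.drift = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def cd1_step(self, v0, lr=0.1):
        """One contrastive-divergence (CD-1) update on a batch v0."""
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_sample @ self.W.T + self.b)
        h1 = self.hidden_probs(v1)
        dW = (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
        self.W += lr * dW
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        self.drift = 0.9 * self.drift + 0.1 * np.abs(dW).mean(axis=0)

    def generate(self, threshold):
        """Add a noisy copy of each hidden neuron whose drift exceeds threshold."""
        idx = np.where(self.drift > threshold)[0]
        if idx.size:
            noise = 0.01 * self.rng.standard_normal((self.W.shape[0], idx.size))
            self.W = np.hstack([self.W, self.W[:, idx] + noise])
            self.c = np.concatenate([self.c, self.c[idx]])
            self.drift = np.concatenate([self.drift, np.zeros(idx.size)])
        return int(idx.size)

    def annihilate(self, v, threshold):
        """Remove hidden neurons whose mean activation on v is below threshold."""
        mean_act = self.hidden_probs(v).mean(axis=0)
        keep = mean_act >= threshold
        if not keep.any():
            # Always retain at least the most active neuron.
            keep[np.argmax(mean_act)] = True
        removed = int((~keep).sum())
        self.W, self.c, self.drift = self.W[:, keep], self.c[keep], self.drift[keep]
        return removed
```

In a training loop, one would call `cd1_step` on each batch and periodically call `generate` and `annihilate`. A layer generation step could then stack a new `GrowingRBM` on the hidden activations of the current one; that part is not shown here.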
Taxonomy
Methods
Sigmoid Activation · Tanh Activation · Deep Belief Network · Restricted Boltzmann Machine · Long Short-Term Memory
