Convolutional Tensor-Train LSTM for Spatio-temporal Learning
Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar

TL;DR
This paper introduces a convolutional tensor-train LSTM model that efficiently captures long-term spatio-temporal correlations in video data, achieving state-of-the-art results with fewer parameters.
Contribution
It proposes a novel convolutional tensor-train decomposition for higher-order convolutional LSTMs, enabling efficient learning of long-term dependencies in video sequences.
Findings
Outperforms existing methods on video prediction tasks
Uses significantly fewer parameters than baseline models
Achieves state-of-the-art results on multiple datasets
Abstract
Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forecasting. This is because these kinds of challenging tasks require learning long-term spatio-temporal correlations in the video sequence. In this paper, we propose a higher-order convolutional LSTM model that can efficiently learn these correlations, along with a succinct representation of the history. This is accomplished through a novel tensor-train module that performs prediction by combining convolutional features across time. To make this feasible in terms of computation and memory requirements, we propose a novel convolutional tensor-train decomposition of the higher-order model. This decomposition reduces the model complexity by…
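To give a rough sense of why a tensor-train factorization saves parameters, here is a minimal sketch of the standard (non-convolutional) tensor-train format. It is not the paper's convolutional variant. The helper names (tt_ranks, tt_param_count, tt_entry), the mode sizes, and the rank are hypothetical choices made only for illustration.

```python
from math import prod


def tt_ranks(num_dims, rank):
    """Rank profile [1, r, ..., r, 1] for a tensor-train with uniform inner rank."""
    return [1] + [rank] * (num_dims - 1) + [1]


def tt_param_count(dims, rank):
    """Number of stored values when each core k has shape (r_k, d_k, r_{k+1})."""
    ranks = tt_ranks(len(dims), rank)
    return sum(ranks[k] * dims[k] * ranks[k + 1] for k in range(len(dims)))


def tt_entry(cores, index):
    """Evaluate one entry of the full tensor as a chain of core-slice products.

    cores[k][i] is an (r_k x r_{k+1}) matrix stored as a list of rows.
    """
    vec = [1.0]  # 1 x r_0 row vector, with r_0 = 1
    for core, i in zip(cores, index):
        mat = core[i]
        vec = [
            sum(vec[a] * mat[a][b] for a in range(len(vec)))
            for b in range(len(mat[0]))
        ]
    return vec[0]  # r_N = 1, so a single value remains


# Illustrative sizes (assumed, not taken from the paper).
dims = [8, 8, 8, 8]
rank = 3
ranks = tt_ranks(len(dims), rank)

# All-ones cores keep the example deterministic.
cores = [
    [[[1.0] * ranks[k + 1] for _ in range(ranks[k])] for _ in range(dims[k])]
    for k in range(len(dims))
]

full_params = prod(dims)              # dense storage of the full tensor
tt_params = tt_param_count(dims, rank)  # storage of the four cores
entry = tt_entry(cores, (0, 1, 2, 3))
print(full_params, tt_params, entry)
```

With these assumed sizes, the dense tensor holds 4096 values while the four cores hold 192, and the gap widens as the order of the tensor grows, since dense storage grows multiplicatively and core storage grows additively.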
Taxonomy
Topics: Tensor decomposition and applications · Advanced Neuroimaging Techniques and Applications · Human Pose and Action Recognition
Methods: Convolution · ConvLSTM · Long Short-Term Memory
