Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira

arXiv:2106.08318·cs.CV·July 13, 2021

TL;DR

This paper introduces Skip-Sideways, a method for training neural networks on large-scale temporal video data. Instead of blocking on full forward and backward passes, it propagates approximate gradients forward in time, which enables low-latency, distributed, and parallel training and improves action recognition and future frame prediction.

Contribution

The paper extends Sideways with skip-connection variants that integrate information over time, and it decouples computation so that individual neural modules can be delegated to different devices for distributed and parallel training.

Findings

01

Achieves low-latency training and model parallelism.

02

Improves accuracy on the HMDB51, UCF101, and Kinetics-600 action recognition datasets.

03

Models trained with Skip-Sideways generate better future frames than Sideways models, suggesting that they make better use of motion cues.

Abstract

How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. For temporal signals, this introduces high latency and hinders real-time learning. It also creates a coupling between consecutive layers, which limits model parallelism and increases memory consumption. In this paper, we build upon Sideways, which avoids blocking by propagating approximate gradients forward in time, and we propose mechanisms for temporal integration of information based on different variants of skip connections. We also show how to decouple computation and delegate individual neural modules to different devices, allowing distributed and parallel training. The proposed Skip-Sideways achieves low latency training, model parallelism, and, importantly, is…
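The idea of propagating approximate gradients forward in time can be illustrated with a toy schedule. The sketch below is not the paper's algorithm. It assumes a chain of `depth` layers in which activations move up one layer per time step and gradients move down one layer per time step, with the loss applied at the top layer. The function names `sideways_schedule` and `staleness` are hypothetical. Under these assumptions, every layer is busy at every step, and each layer pairs its current activation with a gradient that originated from an older frame, which is why the gradients are only approximate.

```python
def sideways_schedule(depth, step):
    """For each layer at a given time step, return a pair
    (frame behind the current activation, frame behind the arriving gradient).

    Assumed toy timing: activations advance one layer per step and
    gradients retreat one layer per step. None means the pipeline has not
    filled far enough for that signal to exist yet.
    """
    rows = []
    for layer in range(depth):
        # The activation at this layer started at layer 0 `layer` steps ago.
        act_frame = step - layer
        # The gradient left the top layer (depth - 1 - layer) steps ago,
        # and the top layer was then looking at a frame (depth - 1) steps older.
        grad_frame = step - 2 * (depth - 1) + layer
        rows.append((
            act_frame if act_frame >= 0 else None,
            grad_frame if grad_frame >= 0 else None,
        ))
    return rows


def staleness(depth):
    """Frames of mismatch between activation and gradient at each layer."""
    return [2 * (depth - 1 - layer) for layer in range(depth)]


if __name__ == "__main__":
    depth, step = 4, 6
    for layer, (act, grad) in enumerate(sideways_schedule(depth, step)):
        print(f"layer {layer}: activation from frame {act}, gradient from frame {grad}")
    print("mismatch per layer:", staleness(depth))
```

In this toy model the top layer sees an exact gradient (zero mismatch) while the bottom layer sees the largest mismatch. By contrast, standard backpropagation would keep the mismatch at zero everywhere, but only by making every layer wait for the full forward and backward pass on each frame.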
