Long-term Leap Attention, Short-term Periodic Shift for Video Classification

Hao Zhang; Lechao Cheng; Yanbin Hao; Chong-Wah Ngo

arXiv:2207.05526 · cs.CV · July 26, 2022

TL;DR

This paper introduces LAPS, a video transformer module that combines long-term Leap Attention with short-term Periodic Shift, cutting attention complexity from $\mathcal{O}(T^{2}N^{2})$ to $\mathcal{O}(2TN^{2})$ while maintaining competitive accuracy on video classification.

Contribution

The paper proposes LAPS, a new attention mechanism for video transformers that efficiently captures long-term and short-term temporal information with minimal additional computation.

Findings

01

Achieves competitive accuracy on the Kinetics-400 benchmark.

02

Adds only about 2.6% computation overhead when LAPS replaces vanilla 2D attention.

03

Maintains performance with zero extra parameters.

Abstract

A video transformer naturally incurs a heavier computational burden than a static vision transformer, because it processes a sequence $T$ times longer under the current attention of quadratic complexity $\mathcal{O}(T^{2}N^{2})$. Existing works treat the temporal axis as a simple extension of the spatial axes, focusing on shortening the spatio-temporal sequence by either generic pooling or local windowing without utilizing temporal redundancy. However, videos naturally contain redundant information between neighboring frames; we could therefore potentially suppress attention on visually similar frames in a dilated manner. Based on this hypothesis, we propose LAPS, a long-term "Leap Attention" (LA), short-term "Periodic Shift" (P-Shift) module for video transformers, with $\mathcal{O}(2TN^{2})$ complexity. Specifically, the "LA" groups long-term…
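
The two complexity figures in the abstract can be checked with a small counting sketch. It assumes, purely for illustration, that leap attention pairs frame t with frame t + T/2 and attends only within each pair; the pairing rule, the function names, and the values T = 8, N = 196 are hypothetical and are not taken from the paper or its repository. Under that assumption each pair holds 2N tokens, so T/2 pairs cost (T/2)(2N)^2 = 2TN^2 token-pair interactions, against (TN)^2 for full spatio-temporal attention.

```python
from __future__ import annotations


def full_attention_pairs(T: int, N: int) -> int:
    """Token-pair interactions when all T*N tokens attend to each other."""
    return (T * N) ** 2


def leap_pairs(T: int) -> list[tuple[int, int]]:
    """Pair frame t with frame t + T//2 (assumed dilation; T must be even)."""
    if T % 2:
        raise ValueError("T must be even for this illustrative pairing")
    half = T // 2
    return [(t, t + half) for t in range(half)]


def leap_attention_pairs(T: int, N: int) -> int:
    """Token-pair interactions when attention runs only inside each frame pair."""
    # Each pair contributes (2N)^2 interactions: two frames of N tokens each.
    return sum((2 * N) ** 2 for _ in leap_pairs(T))


if __name__ == "__main__":
    T, N = 8, 196  # hypothetical: 8 frames, 14 x 14 patches per frame
    full = full_attention_pairs(T, N)
    leap = leap_attention_pairs(T, N)
    print("frame pairs:", leap_pairs(T))
    print("full:", full, "leap:", leap, "ratio:", full / leap)
```

In this sketch the ratio between the two counts is T/2, so the saving grows linearly with the number of frames.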

Code & Models

Repositories

videonetworks/laps-transformer (PyTorch, official)
