Motion meets Attention: Video Motion Prompts

Qixiang Chen; Lei Wang; Piotr Koniusz; Tom Gedeon

arXiv:2407.03179 · cs.CV · October 3, 2024 · 2 cites

Open Access · 1 Repo

TL;DR

This paper introduces a novel motion prompt layer that uses attention mechanisms and regularization to extract and highlight relevant motion cues in videos, improving action recognition performance.

Contribution

It proposes a learnable attention-based motion prompt layer with regularization, enabling better motion feature extraction in video models.

Findings

01 Enhanced action recognition accuracy on benchmarks

02 Seamless integration with existing models like SlowFast and TimeSformer

03 Effective suppression of noise in motion signals

Abstract

Videos contain rich spatio-temporal information. Traditional methods for extracting motion, used in tasks such as action recognition, often rely on visual contents rather than precise motion features. This phenomenon is referred to as 'blind motion extraction' behavior, which proves inefficient in capturing motions of interest due to a lack of motion-guided cues. Recently, attention mechanisms have enhanced many computer vision tasks by effectively highlighting salient visual areas. Inspired by this, we propose a modified Sigmoid function with learnable slope and shift parameters as an attention mechanism to modulate motion signals from frame differencing maps. This approach generates a sequence of attention maps that enhance the processing of motion-related video content. To ensure temporal continuity and smoothness of the attention maps, we apply pair-wise temporal attention variation…
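The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `motion_attention` and `temporal_variation`, the fixed `slope` and `shift` values (learnable parameters in the paper, held constant here), the mean-squared form of the smoothness penalty, and the final element-wise multiplication with the frames are all assumptions made for illustration.

```python
import numpy as np


def motion_attention(frames, slope=5.0, shift=0.1):
    """Sigmoid attention over frame differencing maps.

    frames: array of shape (T, H, W), grayscale values in [0, 1].
    slope, shift: stand-ins for the learnable parameters (fixed here).
    Returns attention maps of shape (T - 1, H, W) with values in (0, 1).
    """
    # Absolute difference between consecutive frames (frame differencing).
    diffs = np.abs(np.diff(frames, axis=0))
    # Sigmoid with adjustable slope and shift, applied element-wise.
    return 1.0 / (1.0 + np.exp(-slope * (diffs - shift)))


def temporal_variation(attn):
    """Assumed smoothness penalty: mean squared change between
    consecutive attention maps. Returns 0.0 for fewer than two maps."""
    if attn.shape[0] < 2:
        return 0.0
    return float(np.mean(np.diff(attn, axis=0) ** 2))


# Toy clip: a bright 2x2 square moving one pixel to the right per frame.
frames = np.zeros((4, 8, 8))
for t in range(4):
    frames[t, 3:5, t:t + 2] = 1.0

attn = motion_attention(frames)
penalty = temporal_variation(attn)

# Assumption: attention modulates the frames by element-wise multiplication.
prompts = attn * frames[1:]
```

In this toy clip, pixels whose intensity changes between frames receive attention close to 1, while static pixels fall below 0.5. In a training setup, `slope` and `shift` would be optimized jointly with the downstream model, and the penalty would be added to the loss with some weight.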

Code & Models

Repositories

q1xiangchen/vmps (PyTorch, official)

Taxonomy

Topics: Cinema and Media Studies · Data Visualization and Analytics

Methods: Softmax · Attention Is All You Need · TimeSformer · Adapter