STAU: A SpatioTemporal-Aware Unit for Video Prediction and Beyond
Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, and Wen Gao

TL;DR
This paper introduces STAU, a novel unit that models the complex correlations between spatial and temporal information in videos, significantly improving video prediction, early action recognition, and object detection tasks.
Contribution
The paper proposes a SpatioTemporal-Aware Unit (STAU) that explores and models the correlations between spatial and temporal features for enhanced video analysis.
Findings
Outperforms existing methods in video prediction tasks.
Improves accuracy in early action recognition and object detection.
Enhances computational efficiency across tasks.
Abstract
Video prediction aims to predict future frames by modeling the complex spatiotemporal dynamics in videos. However, most existing methods model the temporal and spatial information in videos independently and have not fully explored the correlations between the two. In this paper, we propose a SpatioTemporal-Aware Unit (STAU) for video prediction and beyond by exploring the significant spatiotemporal correlations in videos. On the one hand, motion-aware attention weights are learned from the spatial states to help aggregate the temporal states in the temporal domain. On the other hand, appearance-aware attention weights are learned from the temporal states to help aggregate the spatial states in the spatial domain. In this way, the temporal information and the spatial information become aware of each other in both domains, during…
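The two attention paths described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the scoring function, so scaled dot-product similarity is assumed here, the states are plain vectors rather than feature maps, and there are no learned parameters. The names `cross_aggregate`, `temporal_history`, and `spatial_history` are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_aggregate(query_state, history):
    """Aggregate `history` using weights derived from `query_state`.

    Assumption: weights come from scaled dot-product similarity; the
    paper learns its weights, and its exact form is not given here.
    """
    d = len(query_state)
    weights = softmax([dot(query_state, h) / math.sqrt(d) for h in history])
    aggregated = [sum(w * h[i] for w, h in zip(weights, history)) for i in range(d)]
    return aggregated, weights

# Toy states (hypothetical values): three past steps, two features each.
temporal_history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
spatial_history = [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]
current_spatial = [1.0, 0.0]
current_temporal = [0.0, 1.0]

# Motion-aware path: the spatial state weights the temporal states.
temporal_agg, motion_w = cross_aggregate(current_spatial, temporal_history)

# Appearance-aware path: the temporal state weights the spatial states.
spatial_agg, appearance_w = cross_aggregate(current_temporal, spatial_history)
```

In this sketch each path uses the state from one domain only to decide how much of each past state from the other domain to keep, which is the mutual awareness the abstract refers to.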
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
Methods: Attentive Walk-Aggregating Graph Neural Network
