TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang, Zhan Tong, Bin Ji, Gangshan Wu

TL;DR
The paper introduces Temporal Difference Networks (TDN), a novel architecture that efficiently models multi-scale temporal information for action recognition in videos, achieving state-of-the-art results on Something-Something V1 and V2 and competitive results on Kinetics-400.
Contribution
The paper proposes a new temporal difference module and a two-level difference modeling paradigm to improve motion modeling in video action recognition.
Findings
State-of-the-art performance on Something-Something V1 & V2 datasets
Competitive results on Kinetics-400 dataset
Provides detailed ablation studies and visualization insights
Abstract
Temporal modeling still remains challenging for action recognition in videos. To mitigate this issue, this paper presents a new video architecture, termed as Temporal Difference Network (TDN), with a focus on capturing multi-scale temporal information for efficient action recognition. The core of our TDN is to devise an efficient temporal module (TDM) by explicitly leveraging a temporal difference operator, and systematically assess its effect on short-term and long-term motion modeling. To fully capture temporal information over the entire video, our TDN is established with a two-level difference modeling paradigm. Specifically, for local motion modeling, temporal difference over consecutive frames is used to supply 2D CNNs with finer motion pattern, while for global motion modeling, temporal difference across segments is incorporated to capture long-range structure for motion feature…
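As a rough illustration of the two difference levels the abstract describes, the sketch below computes a difference over consecutive frames and a difference across segments on a raw video array. It is a minimal NumPy sketch and not the paper's TDM: the function names, the `(T, H, W, C)` layout, and the use of a per-segment mean frame as the segment summary are all assumptions made for illustration, and the actual module operates inside a CNN rather than on raw pixels.

```python
import numpy as np


def short_term_difference(frames):
    """Difference between consecutive frames (hypothetical helper).

    frames: array of shape (T, H, W, C). Returns shape (T - 1, H, W, C).
    """
    frames = np.asarray(frames, dtype=np.float32)
    return frames[1:] - frames[:-1]


def cross_segment_difference(frames, num_segments):
    """Difference between adjacent segment summaries (hypothetical helper).

    Assumption: each segment is summarized by its mean frame. This is an
    illustrative stand-in, not the segment feature used in the paper.
    Returns shape (num_segments - 1, H, W, C).
    """
    frames = np.asarray(frames, dtype=np.float32)
    segments = np.array_split(frames, num_segments, axis=0)
    summaries = np.stack([segment.mean(axis=0) for segment in segments])
    return summaries[1:] - summaries[:-1]


# Toy video: 8 frames of size 2x2 with 3 channels; every pixel of frame t equals t.
video = np.broadcast_to(
    np.arange(8, dtype=np.float32).reshape(8, 1, 1, 1), (8, 2, 2, 3)
).copy()

local_motion = short_term_difference(video)          # shape (7, 2, 2, 3)
global_motion = cross_segment_difference(video, 4)   # shape (3, 2, 2, 3)
```

On this toy input every consecutive-frame difference is 1, while differences between the four segment means are 2, which shows how the segment-level operator responds to change over a longer time span.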
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation
Methods: Temporal Difference Network
