S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks
Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang

TL;DR
This paper introduces S3D, a fully 3D convolutional network for single-shot, end-to-end temporal activity detection in long, untrimmed videos. It reports state-of-the-art accuracy on the THUMOS'14 detection benchmark while running at 1271 FPS.
Contribution
The paper proposes a single-shot, fully 3D convolutional architecture for temporal activity detection. It removes the separate proposal and classification stages used by many prior systems, which simplifies the detection pipeline and improves speed and accuracy.
Findings
Achieves state-of-the-art performance on the THUMOS'14 detection benchmark.
Operates at 1271 FPS, demonstrating high efficiency.
Provides an end-to-end detection framework without separate proposal stages.
Abstract
In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network. Our architecture, named S3D, encodes the entire video stream and discretizes the output space of temporal activity spans into a set of default spans over different temporal locations and scales. At prediction time, S3D predicts scores for the presence of activity categories in each default span and produces temporal adjustments relative to the span location to predict the precise activity duration. Unlike many state-of-the-art systems that require separate proposal and classification stages, S3D is intrinsically simple and designed specifically for single-shot, end-to-end temporal activity detection. When evaluated on the THUMOS'14 detection benchmark, S3D achieves state-of-the-art…
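The default-span idea in the abstract can be illustrated with a small sketch. The exact parameterization used by S3D is not given here, so the code below assumes an SSD-style encoding: spans are (center, width) pairs normalized to the video length, the center offset is scaled by the span width, and the width offset is applied in log space. The function names `default_spans` and `decode_span` and the example scales are hypothetical and chosen for illustration only.

```python
import math


def default_spans(num_cells, scales):
    """Tile default spans over a temporal feature map.

    Each of the `num_cells` temporal locations gets one default span per
    scale. Spans are (center, width) pairs normalized to [0, 1].
    """
    spans = []
    for i in range(num_cells):
        center = (i + 0.5) / num_cells  # middle of the i-th temporal cell
        for width in scales:
            spans.append((center, width))
    return spans


def decode_span(span, offsets):
    """Apply predicted adjustments to one default span.

    Assumed encoding (not confirmed by the source): the center shifts by
    d_center * width, and the width is multiplied by exp(d_width).
    Returns (start, end) clipped to the normalized video range [0, 1].
    """
    center, width = span
    d_center, d_width = offsets
    new_center = center + d_center * width
    new_width = width * math.exp(d_width)
    start = max(0.0, new_center - new_width / 2)
    end = min(1.0, new_center + new_width / 2)
    return start, end


if __name__ == "__main__":
    spans = default_spans(num_cells=4, scales=[0.25, 0.5])
    print(len(spans))                           # 4 locations x 2 scales
    print(decode_span((0.5, 0.2), (0.0, 0.0)))  # zero offsets keep the default span
```

In a full detector, each default span would also carry per-class scores, and overlapping decoded spans would typically be filtered (for example with non-maximum suppression); both steps are omitted from this sketch.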
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
