Online Detection of Action Start in Untrimmed, Streaming Videos
Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giro-i-Nieto, Shih-Fu Chang

TL;DR
This paper introduces the task of Online Detection of Action Start (ODAS) in untrimmed, streaming videos and proposes three training methods aimed at higher categorization accuracy and lower detection latency, evaluated on THUMOS'14 and ActivityNet.
Contribution
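It presents three techniques for training ODAS models: GAN-based generation of hard negative samples to separate ambiguous background from action, explicit modeling of temporal consistency between the data around an action start and the data that follows it, and an adaptive sampling strategy to counter the scarcity of training data.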
Findings
Significant performance improvements over existing methods.
Effective handling of ambiguous backgrounds and data scarcity.
Validated on THUMOS'14 and ActivityNet datasets.
Abstract
We aim to tackle a novel task in action detection - Online Detection of Action Start (ODAS) in untrimmed, streaming videos. The goal of ODAS is to detect the start of an action instance, with high categorization accuracy and low detection latency. ODAS is important in many applications such as early alert generation to allow timely security or emergency response. We propose three novel methods to specifically address the challenges in training ODAS models: (1) hard negative samples generation based on Generative Adversarial Network (GAN) to distinguish ambiguous background, (2) explicitly modeling the temporal consistency between data around action start and data succeeding action start, and (3) adaptive sampling strategy to handle the scarcity of training data. We conduct extensive experiments using THUMOS'14 and ActivityNet. We show that our proposed methods lead to significant…
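To make the task definition concrete, here is a minimal sketch of how action starts might be read off a stream of per-frame predictions. It is an illustration under stated assumptions, not the authors' method: the function name `detect_action_starts`, the `BACKGROUND` label id, and the confidence `threshold` are all hypothetical, and the per-frame labels and scores are assumed to come from some classifier that sees only past and current frames.

```python
from typing import List, Tuple

BACKGROUND = 0  # assumed label id meaning "no action"; not from the paper


def detect_action_starts(
    labels: List[int], scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return (frame_index, action_label) pairs where an action appears to start.

    A start is reported at frame t when the predicted label is an action,
    differs from the label at frame t-1, and its score reaches `threshold`.
    Frames are consumed strictly in order, so no future frames are used.
    """
    starts = []
    prev = BACKGROUND  # the stream is assumed to begin in background
    for t, (label, score) in enumerate(zip(labels, scores)):
        if label != BACKGROUND and label != prev and score >= threshold:
            starts.append((t, label))
        prev = label
    return starts


# Hypothetical stream: action 2 starts at frame 2; action 3 first appears
# at frame 5 with a low score, so no start is reported for it.
labels = [0, 0, 2, 2, 0, 3, 3]
scores = [0.9, 0.8, 0.7, 0.9, 0.9, 0.3, 0.8]
print(detect_action_starts(labels, scores))
```

In this sketch a low-confidence first frame suppresses the start altogether, which trades missed detections for fewer false alerts; a real system would have to balance that against detection latency.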
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
