Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
Le Yang, Junwei Han, Dingwen Zhang

TL;DR
Colar introduces an exemplar-consultation mechanism for online action detection that efficiently models long-term dependencies and category-level features, achieving high accuracy with a lightweight architecture.
Contribution
The paper proposes a novel exemplar-consultation mechanism that enhances online action detection by combining similarity-based feature aggregation with category-level modeling, improving efficiency and accuracy.
Findings
Achieves state-of-the-art performance on three benchmarks.
Balances effectiveness and efficiency with a lightweight model.
Utilizes exemplar similarity to capture long-term dependencies.
Abstract
Online action detection has attracted increasing research interest in recent years. Current works model historical dependencies and anticipate the future to perceive the action evolution within a video segment and improve the detection accuracy. However, the existing paradigm ignores category-level modeling and does not pay sufficient attention to efficiency. Considering a category, its representative frames exhibit various characteristics. Thus, category-level modeling can provide complementary guidance to temporal dependency modeling. This paper develops an effective exemplar-consultation mechanism that first measures the similarity between a frame and exemplary frames, and then aggregates exemplary features based on the similarity weights. This is also an efficient mechanism, as both similarity measurement and feature aggregation require limited computations. Based on the…
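The two steps the abstract names, similarity measurement followed by similarity-weighted aggregation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `consult_exemplars`, the use of cosine similarity, the softmax normalization, and the `temperature` parameter are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def consult_exemplars(frame, exemplars, temperature=1.0):
    """Aggregate exemplar features weighted by their similarity to a frame.

    frame:     (d,) feature vector of the current frame.
    exemplars: (k, d) feature vectors of k exemplary frames.
    Cosine similarity and softmax weighting are illustrative assumptions,
    not details taken from the paper.
    """
    eps = 1e-8
    # Step 1: measure similarity between the frame and each exemplar.
    f = frame / (np.linalg.norm(frame) + eps)
    e = exemplars / (np.linalg.norm(exemplars, axis=1, keepdims=True) + eps)
    sims = e @ f                                  # shape (k,)

    # Turn similarities into weights that sum to 1 (numerically stable softmax).
    logits = sims / temperature
    logits = logits - logits.max()
    weights = np.exp(logits)
    weights = weights / weights.sum()

    # Step 2: aggregate exemplar features using the similarity weights.
    aggregated = weights @ exemplars              # shape (d,)
    return aggregated, weights

# Usage with random stand-in features (hypothetical sizes).
rng = np.random.default_rng(0)
exemplars = rng.normal(size=(5, 16))
frame = exemplars[2].copy()                       # a frame identical to exemplar 2
aggregated, weights = consult_exemplars(frame, exemplars, temperature=0.1)
```

In this sketch both steps reduce to one matrix-vector product each, so the cost per frame grows linearly with the number of exemplars and the feature dimension, which is consistent with the abstract's remark that the mechanism requires limited computation.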
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
