TL;DR
JOADAA is a unified model that jointly performs online action detection and action anticipation, using past, present, and future context to improve accuracy on THUMOS'14, CHARADES, and Multi-THUMOS.
Contribution
The paper introduces JOADAA, the first unified architecture that combines online action detection and action anticipation to utilize past, present, and future information.
Findings
Achieves state-of-the-art results on THUMOS'14, CHARADES, and Multi-THUMOS datasets.
Effectively models dependencies across past, present, and future for improved action understanding.
Outperforms existing methods in both online detection and anticipation tasks.
Abstract
Action anticipation involves forecasting future actions by connecting past events to future ones. However, this reasoning ignores the real-life hierarchy of events, which is considered to be composed of three main parts: past, present, and future. We argue that considering these three main parts and their dependencies could improve performance. On the other hand, online action detection is the task of predicting actions in a streaming manner. In this case, one has access only to past and present information. Therefore, in online action detection (OAD), existing approaches miss semantics or future information, which limits their performance. To sum up, for both of these tasks, the complete set of knowledge (past-present-future) is missing, which makes it challenging to infer action dependencies and therefore leads to low performance. To address this limitation, we propose to fuse both…
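The difference between the two tasks described in the abstract can be made concrete with a small indexing sketch. This is not the JOADAA model, only an illustration of which frames each task may look at and which time step it must label. The function names and the `horizon` parameter are hypothetical, introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def online_detection_pairs(labels: Sequence[str]) -> List[Tuple[range, int]]:
    """Online action detection: at step t, frames 0..t are visible
    (past and present) and the target is the label at step t itself."""
    return [(range(0, t + 1), t) for t in range(len(labels))]


def anticipation_pairs(labels: Sequence[str], horizon: int) -> List[Tuple[range, int]]:
    """Action anticipation: at step t, frames 0..t are visible and the
    target is the label `horizon` steps ahead (hypothetical parameter)."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # Steps whose target would fall past the end of the video are dropped.
    return [(range(0, t + 1), t + horizon) for t in range(len(labels) - horizon)]


if __name__ == "__main__":
    # Toy per-frame labels, invented for this example.
    labels = ["background", "run", "run", "jump", "background"]
    for visible, target in anticipation_pairs(labels, horizon=2):
        print(f"sees frames {list(visible)} -> predicts '{labels[target]}' at step {target}")
```

In both functions the visible window stops at the current step, which is the streaming constraint the abstract refers to; only the target index differs.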
Videos
JOADAA: Joint Online Action Detection and Action Anticipation (YouTube)
