Delving into 3D Action Anticipation from Streaming Videos
Hongsong Wang, Jiashi Feng

TL;DR
This paper investigates 3D action anticipation from streaming videos, proposing best practices and a novel multi-task learning method that achieves state-of-the-art results on benchmarks.
Contribution
It introduces a comprehensive evaluation framework and a new multi-task loss-based method for improved 3D action anticipation performance.
Findings
Multi-task learning enhances anticipation accuracy.
The length of the training clip and the clip sampling method both have an important effect on performance.
Proposed method outperforms existing approaches.
Abstract
Action anticipation, which aims to recognize an action from a partial observation, has become increasingly popular due to its wide range of applications. In this paper, we investigate the problem of 3D action anticipation from streaming videos, with the goal of understanding best practices for solving this problem. We first introduce several complementary evaluation metrics and present a basic model based on frame-wise action classification. To achieve better performance, we then investigate two important factors, i.e., the length of the training clip and the clip sampling method. We also explore multi-task learning strategies by incorporating auxiliary information from two aspects: the full action representation and the class-agnostic action label. Our comprehensive experiments uncover the best practices for 3D action anticipation, and accordingly we propose a novel method with a multi-task…
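The multi-task idea in the abstract can be illustrated with a small sketch: a frame-wise classification loss combined with two auxiliary terms, one tied to the full action representation and one to a class-agnostic action label. The abstract does not give the loss definitions, so everything below is an assumption made for illustration: the use of mean squared error for the full-representation term, the reading of the class-agnostic label as a per-frame binary flag, the weights `w_full` and `w_agnostic`, and all function names. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def framewise_cross_entropy(frame_logits, label):
    """Mean cross-entropy over frames; every frame shares the clip's class label."""
    losses = [-math.log(softmax(logits)[label]) for logits in frame_logits]
    return sum(losses) / len(losses)


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    losses = [
        -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
        for p, t in zip(probs, targets)
    ]
    return sum(losses) / len(losses)


def mean_squared_error(a, b):
    """Mean squared difference between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_task_loss(frame_logits, label, partial_feat, full_feat,
                    action_probs, action_flags, w_full=0.5, w_agnostic=0.5):
    """Weighted sum of the main loss and two auxiliary losses.

    Assumed (hypothetical) formulation:
      - main term: frame-wise classification loss
      - auxiliary 1: pull the partial-observation feature toward the
        feature of the fully observed action (MSE is an assumption)
      - auxiliary 2: class-agnostic label, read here as a per-frame
        "is an action happening" flag (an assumption)
    """
    cls_loss = framewise_cross_entropy(frame_logits, label)
    full_loss = mean_squared_error(partial_feat, full_feat)
    agnostic_loss = binary_cross_entropy(action_probs, action_flags)
    return cls_loss + w_full * full_loss + w_agnostic * agnostic_loss


# Toy inputs: two frames, three action classes, two-dimensional features.
example_loss = multi_task_loss(
    frame_logits=[[2.0, 0.5, 0.1], [1.5, 0.2, 0.3]],
    label=0,
    partial_feat=[0.2, 0.4],
    full_feat=[0.3, 0.5],
    action_probs=[0.9, 0.8],
    action_flags=[1, 1],
)
```

In a real model the auxiliary weights would be tuned on validation data, and the auxiliary terms would only be used during training, since the full action is not available at anticipation time.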
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Diabetic Foot Ulcer Assessment and Management
