From Detection to Anticipation: Online Understanding of Struggles across Various Tasks and Activities
Shijia Feng, Michael Wray, Walterio Mayol-Cuevas

TL;DR
This paper introduces online models that detect and anticipate human struggle in real time across various tasks and activities. Predicting difficulties before they occur enables proactive assistance; the models run fast enough for real-time use and are evaluated for generalization across tasks and activities.
Contribution
It reformulates struggle localization as an online detection and anticipation task, adapting existing models for real-time performance and analyzing their generalization across tasks and activities.
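To make the difference between online detection and anticipation concrete, here is a minimal sketch, not the authors' implementation. It assumes binary per-frame struggle labels and an anticipation horizon expressed in frames; the function names `anticipation_targets` and `causal_window` are hypothetical. Detection is the special case where the horizon is zero, and in both settings the model may only look at frames up to the current one.

```python
from typing import List, Optional, Sequence


def anticipation_targets(labels: Sequence[int], horizon_frames: int) -> List[Optional[int]]:
    """Shift per-frame labels so that frame t is paired with the label at t + horizon.

    Frames whose future label falls beyond the end of the video get None,
    so they can be masked out of training and evaluation.
    horizon_frames = 0 reproduces plain online detection targets.
    """
    if horizon_frames < 0:
        raise ValueError("horizon_frames must be non-negative")
    n = len(labels)
    return [
        labels[t + horizon_frames] if t + horizon_frames < n else None
        for t in range(n)
    ]


def causal_window(frames: Sequence, t: int, window: int) -> list:
    """Return at most `window` frames ending at frame t (inclusive).

    Only past and current frames are visible, which is what makes the
    setting online rather than offline.
    """
    start = max(0, t - window + 1)
    return list(frames[start : t + 1])


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 0]
    print(anticipation_targets(labels, horizon_frames=2))
    print(causal_window(list(range(10)), t=4, window=3))
```

With a hypothetical frame rate of `fps` frames per second, a 2-second horizon corresponds to `horizon_frames = 2 * fps`.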
Findings
Online struggle detection achieves 70-80% per-frame mAP.
Anticipation up to 2 seconds ahead yields comparable performance, with slight drops.
Models operate at up to 143 FPS, suitable for real-time applications.
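The metrics above can be illustrated with a short sketch. It assumes the common definition of per-frame average precision (frames ranked by predicted score, precision averaged over the ranks of the positive frames) and a mean over classes; the paper's exact evaluation protocol is not given here, so treat this as an illustration. The helper names, including `frame_budget_ms`, are hypothetical.

```python
import math
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Per-frame AP for one class: mean of precision at each positive frame's rank."""
    num_pos = sum(labels)
    if num_pos == 0:
        return math.nan  # AP is undefined without any positive frame
    # Rank frames from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / num_pos


def mean_average_precision(
    scores_by_class: List[Sequence[float]], labels_by_class: List[Sequence[int]]
) -> float:
    """Mean of per-class APs, skipping classes with no positive frames."""
    aps = [average_precision(s, l) for s, l in zip(scores_by_class, labels_by_class)]
    valid = [ap for ap in aps if not math.isnan(ap)]
    return sum(valid) / len(valid) if valid else math.nan


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given throughput."""
    return 1000.0 / fps


if __name__ == "__main__":
    print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
    print(frame_budget_ms(143))
```

At 143 FPS the per-frame budget is roughly 7 ms, well under the 33 ms frame interval of a 30 FPS video stream.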
Abstract
Understanding human skill performance is essential for intelligent assistive systems, with struggle recognition offering a natural cue for identifying user difficulties. While prior work focuses on offline struggle classification and localization, real-time applications require models capable of detecting and anticipating struggle online. We reformulate struggle localization as an online detection task and further extend it to anticipation, predicting struggle moments before they occur. We adapt two off-the-shelf models as baselines for online struggle detection and anticipation. Online struggle detection achieves 70-80% per-frame mAP, while struggle anticipation up to 2 seconds ahead yields comparable performance with slight drops. We further examine generalization across tasks and activities and analyse the impact of skill evolution. Despite larger domain gaps in activity-level…
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Robot Manipulation and Learning · Human Pose and Action Recognition
