Vision-Based Mistake Analysis in Procedural Activities: A Review of Advances and Challenges
Konstantinos Bacharidis, Antonis A. Argyros

TL;DR
This review paper discusses recent advances in vision-based methods for mistake detection in procedural activities, highlighting challenges, datasets, and future research directions to improve safety and efficiency.
Contribution
The paper provides a comprehensive overview of current vision-based mistake analysis techniques, categorizes existing approaches, and discusses open challenges and future directions in the field.
Findings
Vision-based systems can identify deviations in task execution such as incorrect sequencing, improper technique, and timing errors.
Mistake detection is complicated by intra-class variability, viewpoint differences, and compositional activity structures.
Future directions include neuro-symbolic reasoning and counterfactual modeling.
Abstract
Mistake analysis in procedural activities is a critical area of research with applications spanning industrial automation, physical rehabilitation, education and human-robot collaboration. This paper reviews vision-based methods for detecting and predicting mistakes in structured tasks, focusing on procedural and executional errors. By leveraging advancements in computer vision, including action recognition, anticipation and activity understanding, vision-based systems can identify deviations in task execution, such as incorrect sequencing, use of improper techniques, or timing errors. We explore the challenges posed by intra-class variability, viewpoint differences and compositional activity structures, which complicate mistake detection. Additionally, we provide a comprehensive overview of existing datasets, evaluation metrics and state-of-the-art methods, categorizing approaches…
Taxonomy
Topics: Robot Manipulation and Learning · Human-Automation Interaction and Safety · Human Pose and Action Recognition
