Multiview Progress Prediction of Robot Activities
Elena Zoppellari, Federico Becattini, Marco Fiorucci, Lamberto Ballan

TL;DR
This paper introduces a multi-view architecture for predicting the progress of robot actions, addressing challenges like self-occlusion, to improve robot understanding and interaction with humans.
Contribution
It presents a novel multi-view approach for action progress prediction in robotics, enhancing perception accuracy over single-view methods.
Findings
Effective prediction of robot action progress demonstrated on Mobile ALOHA.
Multi-view architecture outperforms single-view models.
Positions progress prediction as a building block for safer robot operation and closer human-robot collaboration.
Abstract
For robots to operate effectively and safely alongside humans, they must be able to understand the progress of ongoing actions. This ability, known as action progress prediction, is critical for tasks ranging from timely assistance to autonomous decision-making. However, modeling action progression in robotics has often been overlooked. Moreover, a single camera may be insufficient for understanding a robot's ego-actions, as self-occlusion can significantly hinder perception and model performance. In this paper, we propose a multi-view architecture for action progress prediction in robot manipulation tasks. Experiments on Mobile ALOHA demonstrate the effectiveness of the proposed approach.
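The abstract does not spell out how progress is defined or how the views are combined, so the following is only an illustrative sketch under two assumptions: progress at frame t of a T-frame action is taken to be t / T, and per-view estimates are merged by simple averaging. The function names `linear_progress_targets` and `fuse_view_predictions` are hypothetical and are not taken from the paper, whose architecture may differ substantially.

```python
import numpy as np


def linear_progress_targets(num_frames: int) -> np.ndarray:
    """Assumed label definition: progress at frame t is t / T, for t = 1..T."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return np.arange(1, num_frames + 1) / num_frames


def fuse_view_predictions(per_view: np.ndarray) -> np.ndarray:
    """Assumed late fusion: average progress estimates across camera views.

    per_view has shape (num_views, num_frames), with values expected in [0, 1].
    """
    fused = per_view.mean(axis=0)
    # Keep the fused estimate inside the valid progress range.
    return np.clip(fused, 0.0, 1.0)


# Example: two hypothetical camera views of a 4-frame action.
targets = linear_progress_targets(4)
views = np.array([[0.2, 0.4, 0.7, 0.9],
                  [0.3, 0.6, 0.8, 1.0]])
fused = fuse_view_predictions(views)
mean_abs_error = float(np.abs(fused - targets).mean())
```

Averaging is the simplest way to let one view compensate when another is occluded; a learned fusion module would be the natural replacement in a real model.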
Taxonomy
Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Social Robot Interaction and HRI
