Multiview Progress Prediction of Robot Activities

Elena Zoppellari; Federico Becattini; Marco Fiorucci; Lamberto Ballan

arXiv:2603.00151 · cs.RO · March 3, 2026

Open Access

TL;DR

This paper introduces a multi-view architecture for predicting the progress of robot actions, addressing challenges like self-occlusion, to improve robot understanding and interaction with humans.

Contribution

It presents a novel multi-view approach for action progress prediction in robotics, enhancing perception accuracy over single-view methods.

Findings

01

Effective prediction of robot action progress demonstrated on Mobile ALOHA.

02

Multi-view architecture outperforms single-view models.

03

Progress awareness is motivated as a basis for safer robot operation and collaboration alongside humans.

Abstract

For robots to operate effectively and safely alongside humans, they must be able to understand the progress of ongoing actions. This ability, known as action progress prediction, is critical for tasks ranging from timely assistance to autonomous decision-making. However, modeling action progression in robotics has often been overlooked. Moreover, a single camera may be insufficient for understanding a robot's ego-actions, as self-occlusion can significantly hinder perception and model performance. In this paper, we propose a multi-view architecture for action progress prediction in robot manipulation tasks. Experiments on Mobile ALOHA demonstrate the effectiveness of the proposed approach.
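
To make the task concrete, here is a minimal sketch of what a progress target and a multi-view combination could look like. It is not the paper's architecture, which this page does not describe. It assumes a linear progress target (frame t of T is labeled t/T), a convention often used for action progress prediction, and it uses a plain average of per-view estimates as a stand-in for whatever fusion the authors use. All function names are hypothetical.

```python
from typing import List, Sequence


def linear_progress_labels(num_frames: int) -> List[float]:
    """Assumed target: frame t (1-indexed) of T frames has progress t / T."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return [(t + 1) / num_frames for t in range(num_frames)]


def fuse_views(per_view_predictions: Sequence[Sequence[float]]) -> List[float]:
    """Illustrative late fusion: average per-frame estimates across views."""
    if not per_view_predictions:
        raise ValueError("need at least one view")
    length = len(per_view_predictions[0])
    if any(len(view) != length for view in per_view_predictions):
        raise ValueError("all views must cover the same frames")
    num_views = len(per_view_predictions)
    return [
        sum(view[i] for view in per_view_predictions) / num_views
        for i in range(length)
    ]


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Average absolute gap between predicted and target progress."""
    if len(pred) != len(target) or not target:
        raise ValueError("pred and target must be non-empty and equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


# Made-up per-frame estimates from two hypothetical cameras on a 4-frame clip.
labels = linear_progress_labels(4)
view_a = [0.2, 0.5, 0.8, 1.0]
view_b = [0.3, 0.5, 0.7, 1.0]
fused = fuse_views([view_a, view_b])
print(mean_absolute_error(view_a, labels), mean_absolute_error(fused, labels))
```

In this toy case the two views err in opposite directions, so their average sits closer to the target than either view alone. Real views can share errors, and a learned fusion may weight an occluded view differently, so the example only illustrates the motivation for using more than one camera.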

Taxonomy

Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · Social Robot Interaction and HRI