VP-GO: a "light" action-conditioned visual prediction model

Anji Ma; Yoann Fleytoux; Jean-Baptiste Mouret; Serena Ivaldi

arXiv:2109.12694 · cs.RO · September 28, 2021

TL;DR

VP-GO is a lightweight, stochastic, action-conditioned visual prediction model designed for robotic grasping, offering improved qualitative predictions of complex grasps while maintaining computational efficiency.

Contribution

The paper introduces VP-GO, a novel lightweight stochastic visual prediction model with hierarchical action decomposition, and releases a new dataset for robotic grasp prediction.

Findings

01 Performs comparably to complex models on signal prediction metrics.

02 Outperforms those models in qualitative prediction of complex robotic grasps.

03 Compatible with existing datasets like RoboNet and PandaGrasp.

Abstract

Visual prediction models are a promising solution for visual-based robotic grasping of cluttered, unknown soft objects. Previous models from the literature are computationally greedy, which limits reproducibility; although some consider stochasticity in the prediction model, it is often too weak to catch the reality of robotics experiments involving grasping such objects. Furthermore, previous work focused on elementary movements that are not efficient to reason in terms of more complex semantic actions. To address these limitations, we propose VP-GO, a "light" stochastic action-conditioned visual prediction model. We propose a hierarchical decomposition of semantic grasping and manipulation actions into elementary end-effector movements, to ensure compatibility with existing models and datasets for visual prediction of robotic actions such as RoboNet. We also record and release a new…
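The hierarchical decomposition mentioned in the abstract can be pictured with a small sketch. This is not the authors' implementation: the `ElementaryMove` type, the `decompose_grasp` and `straight_line` helpers, the four-phase approach/descend/close/lift scheme, and the `hover_height` and `max_step` parameters are all assumptions made for illustration. It only shows how one semantic action ("grasp the object at this position") might expand into a sequence of elementary end-effector movements of the kind a low-level action-conditioned model consumes.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class ElementaryMove:
    """Hypothetical low-level command: Cartesian displacement plus gripper state."""
    dx: float
    dy: float
    dz: float
    gripper_closed: bool


def straight_line(start: Point, goal: Point, max_step: float,
                  gripper_closed: bool) -> List[ElementaryMove]:
    """Split the displacement start -> goal into equal moves whose largest
    per-axis component does not exceed max_step (assumed > 0)."""
    deltas = [g - s for s, g in zip(start, goal)]
    largest = max(abs(d) for d in deltas)
    if largest == 0.0:
        return []  # already at the goal, nothing to do
    n_steps = math.ceil(largest / max_step)
    dx, dy, dz = (d / n_steps for d in deltas)
    return [ElementaryMove(dx, dy, dz, gripper_closed) for _ in range(n_steps)]


def decompose_grasp(start: Point, target: Point, hover_height: float,
                    max_step: float) -> List[ElementaryMove]:
    """Expand the semantic action 'grasp the object at target' into
    elementary moves. The four phases are an illustrative assumption."""
    hover = (target[0], target[1], target[2] + hover_height)
    # Phase 1: approach a point above the object with the gripper open.
    moves = straight_line(start, hover, max_step, gripper_closed=False)
    # Phase 2: descend onto the object, still open.
    moves += straight_line(hover, target, max_step, gripper_closed=False)
    # Phase 3: close the gripper without moving.
    moves.append(ElementaryMove(0.0, 0.0, 0.0, gripper_closed=True))
    # Phase 4: lift back to the hover point while holding the object.
    moves += straight_line(target, hover, max_step, gripper_closed=True)
    return moves
```

Under these assumptions, `decompose_grasp((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), hover_height=1.0, max_step=0.5)` yields seven elementary moves: two to approach, two to descend, one to close the gripper, and two to lift. Keeping the low-level interface as plain displacements is what would let such a planner sit on top of datasets recorded with elementary actions.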

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Human Pose and Action Recognition