VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects
Ruihai Wu, Yan Zhao, Kaichun Mo, Zizheng Guo, Yian Wang, Tianhao Wu, Qingnan Fan, Xuelin Chen, Leonidas Guibas, Hao Dong

TL;DR
VAT-Mart introduces a novel perception-interaction framework that predicts dense, geometry-aware visual action proposals for manipulating 3D articulated objects, enhancing robot interaction capabilities in complex environments.
Contribution
The paper proposes object-centric visual priors and an interaction-for-perception framework that outperform traditional kinematic-based methods in manipulating 3D articulated objects.
Findings
Predicts dense action affordances effectively across diverse shapes.
Generalizes well to unseen object categories and real-world data.
Improves manipulation guidance over existing kinematic estimation methods.
Abstract
Perceiving and manipulating 3D articulated objects (e.g., cabinets, doors) in human environments is an important yet challenging task for future home-assistant robots. The space of 3D articulated objects is exceptionally rich in their myriad semantic categories, diverse shape geometry, and complicated part functionality. Previous works mostly abstract kinematic structure with estimated joint parameters and part poses as the visual representations for manipulating 3D articulated objects. In this paper, we propose object-centric actionable visual priors as a novel perception-interaction handshaking point that the perception system outputs more actionable guidance than kinematic structure estimation, by predicting dense geometry-aware, interaction-aware, and task-aware visual action affordance and trajectory proposals. We design an interaction-for-perception framework VAT-Mart to learn…
Taxonomy
Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization
