ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

TL;DR
ACID introduces a novel action-conditional implicit visual dynamics model for deformable object manipulation, leveraging implicit representations and geodesics-based contrastive learning to improve prediction accuracy and task success in complex, real-world scenarios.
Contribution
The paper presents a new implicit neural representation framework for deformable object dynamics, incorporating geodesics-based contrastive learning for better state correspondence understanding.
Findings
Outperforms baselines on geometry, correspondence, and dynamics prediction.
Increases task success rate by 30% over baselines.
Successfully manipulates real-world deformable objects using simulation-trained models.
Abstract
Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit representations for action-conditional dynamics and geodesics-based contrastive learning. To represent deformable dynamics from partial RGB-D observations, we learn implicit representations of occupancy and flow-based forward dynamics. To accurately identify state change under large non-rigid deformations, we learn a correspondence embedding field through a novel geodesics-based contrastive loss. To evaluate our approach, we develop a simulation framework for manipulating complex deformable…
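The abstract names a geodesics-based contrastive loss but does not give its form here. The following is only a minimal sketch of one plausible variant, not the authors' formulation: point pairs whose geodesic distance on the object falls below a threshold are pulled together in embedding space, and all other pairs are pushed apart by a margin. The function names, the k-nearest-neighbour graph approximation of geodesics, and the parameters `k`, `pos_thresh`, and `margin` are all assumptions made for illustration.

```python
import numpy as np


def geodesic_distances(points, k=2):
    """Approximate geodesics as shortest paths on a symmetric k-NN graph (assumption)."""
    n = len(points)
    euclid = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    # Column 0 of the argsort is the point itself, so skip it.
    neighbours = np.argsort(euclid, axis=1)[:, 1:k + 1]
    for i in range(n):
        graph[i, neighbours[i]] = euclid[i, neighbours[i]]
        graph[neighbours[i], i] = euclid[i, neighbours[i]]
    # Floyd-Warshall all-pairs shortest paths, one intermediate node per step.
    for m in range(n):
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    return graph


def geodesic_contrastive_loss(emb_a, emb_b, geo, pos_thresh=0.3, margin=1.0):
    """Hypothetical margin-based contrastive loss with geodesic-defined positives.

    emb_a[i] and emb_b[j] are embeddings of points i and j in two states of
    the same object; geo[i, j] is their geodesic distance on the object.
    """
    d = np.linalg.norm(emb_a[:, None, :] - emb_b[None, :, :], axis=-1)
    positive = geo <= pos_thresh
    pull = np.where(positive, d ** 2, 0.0)
    push = np.where(~positive, np.maximum(0.0, margin - d) ** 2, 0.0)
    return (pull.sum() + push.sum()) / geo.size


# Toy object: 20 points on three quarters of a unit circle. The two ends are
# fairly close in Euclidean space but far apart along the curve.
angles = np.linspace(0.0, 1.5 * np.pi, 20)
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)
geo = geodesic_distances(points, k=2)

# An embedding that tracks position along the curve versus a collapsed one.
along_curve = geo[:, :1]
collapsed = np.zeros((20, 1))
loss_good = geodesic_contrastive_loss(along_curve, along_curve, geo)
loss_collapsed = geodesic_contrastive_loss(collapsed, collapsed, geo)
```

In this toy setup the embedding that follows the curve should score a lower loss than the collapsed one, which is the behaviour such a loss is meant to reward. A Euclidean threshold would instead treat nearby but topologically distant points as positives, which is presumably why geodesics are preferred under large non-rigid deformation.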
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Human Motion and Animation
