TL;DR
This paper introduces Visual Foresight Trees (VFT), a method that uses a learned push-prediction network inside a tree search to decide how to push clutter away from a target object so that a robot can grasp it, improving retrieval success rates in densely packed scenes.
Contribution
The paper presents VFT, a novel approach combining neural network predictions with tree search for nonprehensile rearrangement in cluttered object retrieval tasks.
Findings
VFT outperforms model-free and myopic methods in success rate.
VFT reduces the number of actions needed for successful retrieval.
The approach is effective in both simulation and real robot experiments.
Abstract
This paper considers the problem of retrieving an object from many tightly packed objects using a combination of robotic pushing and grasping actions. Object retrieval in dense clutter is an important skill for robots to operate in households and everyday environments effectively. The proposed solution, Visual Foresight Trees (VFT), intelligently rearranges the clutter surrounding a target object so that it can be grasped easily. Rearrangement with nested nonprehensile actions is challenging as it requires predicting complex object interactions in a combinatorially large configuration space of multiple objects. We first show that a deep neural network can be trained to accurately predict the poses of the packed objects when the robot pushes one of them. The predictive network provides visual foresight and is used in a tree search as a state transition function in the space of scene…
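The idea of using a predictive model as the state transition function of a tree search can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scene is a one-dimensional shelf of cells, `predict_push` is a hand-written stand-in for the learned network, `graspable` is a made-up success test, and the search is a plain breadth-first search with a depth limit rather than whatever search procedure the authors actually use.

```python
from collections import deque
from itertools import product
from typing import Callable, List, Optional, Tuple

# Toy scene (hypothetical): clutter objects occupy cells on a 1-D shelf.
State = Tuple[int, ...]    # sorted cell indices of the clutter objects
Action = Tuple[int, int]   # (cell of the object to push, direction -1 or +1)

TARGET = 3                 # cell of the target object; it never moves here
WIDTH = 7                  # number of cells on the shelf


def predict_push(state: State, action: Action) -> State:
    """Hand-written stand-in for a learned push-prediction model.

    The pushed object moves one cell unless the destination is off the
    shelf, occupied by the target, or occupied by another object.
    """
    cell, direction = action
    if cell not in state:
        return state
    new_cell = cell + direction
    blocked = (
        new_cell < 0
        or new_cell >= WIDTH
        or new_cell == TARGET
        or new_cell in state
    )
    if blocked:
        return state
    return tuple(sorted(c for c in state if c != cell) + [new_cell]) \
        if False else tuple(sorted([c for c in state if c != cell] + [new_cell]))


def graspable(state: State) -> bool:
    """Made-up success test: both cells next to the target are free."""
    return (TARGET - 1) not in state and (TARGET + 1) not in state


def plan_pushes(
    start: State,
    transition: Callable[[State, Action], State],
    max_depth: int,
) -> Optional[List[Action]]:
    """Breadth-first search over predicted states.

    Returns a shortest push sequence (at most max_depth pushes) after
    which the target is graspable, or None if none is found.
    """
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if graspable(state):
            return plan
        if len(plan) == max_depth:
            continue
        for cell, direction in product(state, (-1, 1)):
            nxt = transition(state, (cell, direction))
            if nxt not in seen:  # skip no-op pushes and revisited scenes
                seen.add(nxt)
                frontier.append((nxt, plan + [(cell, direction)]))
    return None


if __name__ == "__main__":
    # Two objects flank the target at cell 3.
    print(plan_pushes((2, 4), predict_push, max_depth=3))
```

The planner only ever touches the scene through the `transition` argument, so the toy `predict_push` could in principle be swapped for a learned model that maps an observed scene and a candidate push to a predicted scene. In that setting the exhaustive expansion shown here would quickly become too expensive, which is one reason a more selective search strategy is attractive.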
