Deep Visual Constraints: Neural Implicit Models for Manipulation Planning from Visual Input
Jung-Su Ha, Danny Driess, Marc Toussaint

TL;DR
This paper introduces a neural implicit model that represents objects as continuous functions derived from images, enabling manipulation planning directly from visual input without manual object modeling.
Contribution
It proposes a novel pixel-aligned neural implicit representation for objects, facilitating manipulation planning solely from visual data with known camera geometry.
Findings
Enables long-horizon manipulation planning from images
Reduces manual engineering of object representations
Integrates perception and planning in a unified framework
Abstract
Manipulation planning is the problem of finding a sequence of robot configurations that involves interactions with objects in the scene, e.g., grasping and placing an object, or more general tool-use. To achieve such interactions, traditional approaches require hand-engineering of object representations and interaction constraints, which easily becomes tedious when complex objects/interactions are considered. Inspired by recent advances in 3D modeling, e.g., NeRF, we propose a method to represent objects as continuous functions upon which constraint features are defined and jointly trained. In particular, the proposed pixel-aligned representation is directly inferred from images with known camera geometry and naturally acts as a perception component in the whole manipulation pipeline, thereby enabling long-horizon planning only from visual input. Project page:…
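To make "pixel-aligned" and "known camera geometry" concrete, here is a minimal sketch of the general idea: a 3D query point is projected into the image with a pinhole camera model, and an image feature is read off at the resulting pixel by bilinear interpolation. This is not the authors' implementation. The function names, the array layouts, and the choice to append the point's camera-frame depth to the sampled feature are all assumptions made for illustration, and the feature map stands in for the output of some image encoder.

```python
import numpy as np


def project_points(points_world, K, T_cam_world):
    """Project (N, 3) world points to pixel coordinates with a pinhole model.

    K is the 3x3 intrinsic matrix, T_cam_world the 4x4 world-to-camera transform
    (both assumed known). Returns (N, 2) pixel coordinates and (N,) depths.
    """
    ones = np.ones((points_world.shape[0], 1))
    homogeneous = np.concatenate([points_world, ones], axis=1)
    points_cam = (T_cam_world @ homogeneous.T).T[:, :3]
    uvw = (K @ points_cam.T).T
    # Assumes all points lie in front of the camera (positive depth).
    return uvw[:, :2] / uvw[:, 2:3], points_cam[:, 2]


def bilinear_sample(feature_map, uv):
    """Sample an (H, W, C) feature map at (N, 2) pixel coords (u = column, v = row)."""
    height, width, _ = feature_map.shape
    u = np.clip(uv[:, 0], 0, width - 1)
    v = np.clip(uv[:, 1], 0, height - 1)
    u0 = np.floor(u).astype(int)
    v0 = np.floor(v).astype(int)
    u1 = np.minimum(u0 + 1, width - 1)
    v1 = np.minimum(v0 + 1, height - 1)
    du = (u - u0)[:, None]
    dv = (v - v0)[:, None]
    top = (1 - du) * feature_map[v0, u0] + du * feature_map[v0, u1]
    bottom = (1 - du) * feature_map[v1, u0] + du * feature_map[v1, u1]
    return (1 - dv) * top + dv * bottom


def pixel_aligned_features(points_world, feature_map, K, T_cam_world):
    """Hypothetical helper: image feature at each point's projection, plus depth."""
    uv, depth = project_points(points_world, K, T_cam_world)
    features = bilinear_sample(feature_map, uv)
    return np.concatenate([features, depth[:, None]], axis=1)
```

In a full pipeline, a vector like this would be passed to a learned function that evaluates the object representation or a constraint feature at the query point. That downstream network is beyond what the abstract specifies and is left out here.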
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Robotics and Sensor-Based Localization
