Visually Grounded Task and Motion Planning for Mobile Manipulation
Xiaohan Zhang, Yifeng Zhu, Yan Ding, Yuke Zhu, Peter Stone, Shiqi Zhang

TL;DR
This paper introduces GROP, a task and motion planning (TAMP) algorithm based on visual grounding for mobile manipulation. GROP improves task success rates and efficiency, and is validated through extensive simulation and real-world experiments.
Contribution
The paper presents a novel TAMP algorithm, GROP, that probabilistically evaluates action feasibility using visual grounding. The grounding of symbolic spatial relationships is learned from a dataset of 96,000 simulated mobile manipulation trials.
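To illustrate what trading off feasibility against efficiency can look like, here is a minimal sketch, not the paper's implementation. It assumes each candidate plan already has an estimated feasibility probability (for example, from a learned visual model) and an estimated action cost, and it scores plans with a simple expected-utility rule. The names `CandidatePlan`, `expected_utility`, `select_plan`, and the `success_reward` parameter are hypothetical, and the scoring formula is an assumption rather than the objective used by GROP.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CandidatePlan:
    """Hypothetical container for one candidate task-motion plan."""
    name: str
    feasibility: float  # estimated probability that the plan succeeds, in [0, 1]
    cost: float         # estimated execution cost, e.g. seconds of navigation


def expected_utility(plan: CandidatePlan, success_reward: float) -> float:
    # Assumed scoring rule: reward weighted by success probability, minus cost.
    return plan.feasibility * success_reward - plan.cost


def select_plan(plans: Sequence[CandidatePlan],
                success_reward: float = 100.0) -> CandidatePlan:
    """Return the plan with the highest expected utility."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    return max(plans, key=lambda p: expected_utility(p, success_reward))


if __name__ == "__main__":
    candidates = [
        # Short drive, but the target object is hard to reach from there.
        CandidatePlan(name="near_side", feasibility=0.5, cost=10.0),
        # Longer drive to a pose where the grasp is much more likely to work.
        CandidatePlan(name="far_side", feasibility=0.9, cost=30.0),
    ]
    best = select_plan(candidates)
    print(best.name)
```

Under these made-up numbers, the longer route wins because its higher feasibility outweighs the extra navigation cost. Lowering `success_reward` shifts the choice toward the cheaper plan, which is one simple way to expose the trade-off as a tunable parameter.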
Findings
GROP achieved higher task completion rates than baselines.
GROP maintained lower or comparable action costs relative to the same baselines.
The approach was successfully tested on real robots.
Abstract
Task and motion planning (TAMP) algorithms aim to help robots achieve task-level goals, while maintaining motion-level feasibility. This paper focuses on TAMP domains that involve robot behaviors that take extended periods of time (e.g., long-distance navigation). In this paper, we develop a visual grounding approach to help robots probabilistically evaluate action feasibility, and introduce a TAMP algorithm, called GROP, that optimizes both feasibility and efficiency. We have collected a dataset that includes 96,000 simulated trials of a robot conducting mobile manipulation tasks, and then used the dataset to learn to ground symbolic spatial relationships for action feasibility evaluation. Compared with competitive TAMP baselines, GROP exhibited a higher task-completion rate while maintaining lower or comparable action costs. In addition to these extensive experiments in simulation,…
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Robotic Path Planning Algorithms
