STAP: Sequencing Task-Agnostic Policies
Christopher Agia, Toki Migimatsu, Jiajun Wu, Jeannette Bohg

TL;DR
STAP is a scalable framework that trains manipulation skills and sequences them at planning time by maximizing the estimated joint feasibility of all skills in the plan, enabling robots to solve complex long-horizon tasks unseen during training.
Contribution
The paper proposes a method for sequencing manipulation skills that treats each skill's Q-function as a feasibility estimate and optimizes the plan to maximize the product of the Q-values, improving success on long-horizon robotic tasks.
Findings
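The product of Q-values used as the planning objective approximates ground-truth plan feasibility.
Planning with this objective reduces myopic behavior, where an early skill is executed in a way that makes later skills infeasible.
The approach is effective in both simulation and real-robot experiments.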
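The point about myopic planning can be illustrated with a toy comparison. This is a minimal sketch, not the paper's setup: `q_place` and `q_reach` are hypothetical hand-written stand-ins for learned Q-functions, and the 1-D placement position `x` is an assumed action parameter shared by both skills.

```python
import numpy as np

# Hypothetical stand-ins for learned Q-functions (not from the paper).
# Each returns a value in [0, 1] that is read as a success probability.
def q_place(x):
    """Estimated success of placing an object at position x in [0, 1]."""
    return 1.0 - 0.5 * abs(x - 0.2)

def q_reach(x):
    """Estimated success of a later skill that must reach the object at x."""
    return 1.0 - abs(x - 0.8)

# Candidate placement positions on a regular grid.
grid = np.linspace(0.0, 1.0, 101)

# Myopic choice: maximize only the first skill's Q-value.
x_greedy = float(max(grid, key=q_place))
# Joint choice: maximize the product of both skills' Q-values.
x_joint = float(max(grid, key=lambda x: q_place(x) * q_reach(x)))

greedy_score = q_place(x_greedy) * q_reach(x_greedy)
joint_score = q_place(x_joint) * q_reach(x_joint)
```

In this toy, the myopic choice places the object where placing is easiest (`x = 0.2`) and reaches a plan score of 0.4, while the joint choice accepts a slightly harder placement (`x = 0.8`) and reaches 0.7.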
Abstract
Advances in robotic skill acquisition have made it possible to build general-purpose libraries of learned skills for downstream manipulation tasks. However, naively executing these skills one after the other is unlikely to succeed without accounting for dependencies between actions prevalent in long-horizon plans. We present Sequencing Task-Agnostic Policies (STAP), a scalable framework for training manipulation skills and coordinating their geometric dependencies at planning time to solve long-horizon tasks never seen by any skill during training. Given that Q-functions encode a measure of skill feasibility, we formulate an optimization problem to maximize the joint success of all skills sequenced in a plan, which we estimate by the product of their Q-values. Our experiments indicate that this objective function approximates ground truth plan feasibility and, when used as a planning…
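The objective described in the abstract, the product of the Q-values of all skills in a plan, can be sketched as follows. This is a sketch under stated assumptions rather than the authors' implementation: the toy `push` and `pick` Q-functions, the per-skill next-state functions, the scalar state, and the action range `[0, 1]` are all hypothetical, and random shooting is swapped in as a simple optimizer because this excerpt does not say which optimizer or state-prediction model the paper uses.

```python
import numpy as np

def plan_feasibility(q_fns, dynamics_fns, s0, actions):
    """Product of per-skill Q-values along a rollout of the plan."""
    s, score = s0, 1.0
    for q, f, a in zip(q_fns, dynamics_fns, actions):
        score *= q(s, a)   # estimated success of this skill from state s
        s = f(s, a)        # assumed model of the state after the skill
    return score

def random_shooting(q_fns, dynamics_fns, s0, num_samples=2000, seed=0):
    """Sample action sequences uniformly in [0, 1] and keep the best one."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.0, 1.0, size=(num_samples, len(q_fns)))
    scores = np.array(
        [plan_feasibility(q_fns, dynamics_fns, s0, a) for a in candidates]
    )
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])

# Hypothetical skills on a scalar object position s (not from the paper).
def push_q(s, a):
    return 1.0 - 0.5 * a                      # shorter pushes are more reliable

def push_dyn(s, a):
    return s + a                              # the push moves the object by a

def pick_q(s, a):
    reachable = max(0.0, 1.0 - 2.0 * abs(s - 0.6))   # object must be near 0.6
    grasp = 1.0 - 0.5 * abs(a - 0.5)                 # preferred grasp parameter
    return reachable * grasp

def pick_dyn(s, a):
    return s                                  # picking leaves the position as is

q_fns = [push_q, pick_q]
dynamics_fns = [push_dyn, pick_dyn]

best_actions, best_score = random_shooting(q_fns, dynamics_fns, s0=0.2)
reference_score = plan_feasibility(q_fns, dynamics_fns, 0.2, [0.4, 0.5])
```

Multiplying the Q-values means a single near-zero skill drives the whole plan's score toward zero, so the optimizer is pushed to choose early actions (here, a push of about 0.4) that keep later skills feasible.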
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
