Reactive Long Horizon Task Execution via Visual Skill and Precondition Models
Shohin Mukherjee, Chris Paxton, Arsalan Mousavian, Adam Fishman, Maxim Likhachev, Dieter Fox

TL;DR
This paper presents a sim-to-real approach for zero-shot robotic task execution by learning a library of skills and preconditions in simulation, enabling robots to perform complex, long-horizon tasks in real-world environments without fine-tuning.
Contribution
It introduces a method to transfer simulation-trained skill and precondition models to real robots for unseen tasks, raising the reported success rate from 91.6% to 98% in simulation and from 10% to 80% in the real world.
Findings
Success rate increased from 91.6% to 98% in simulation.
Success rate increased from 10% to 80% in the real world.
The authors describe the method as applicable to tasks beyond block stacking.
Abstract
Zero-shot execution of unseen robotic tasks is important for allowing robots to perform a wide variety of tasks in human environments, but collecting the amounts of data necessary to train end-to-end policies in the real world is often infeasible. We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner. We learn a library of parameterized skills, along with a set of predicate-based preconditions and termination conditions, entirely in simulation. We explore a block-stacking task because it has a clear structure, where multiple skills must be chained together, but our methods are applicable to a wide range of other problems and domains, and can transfer from simulation to the real world with no fine-tuning. The system is able to recognize failures and accomplish long-horizon…
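The idea of chaining skills through preconditions and termination conditions, and of recovering from failures, can be illustrated with a minimal Python sketch. This is not the authors' implementation. In the paper the skills and condition models are learned from visual input in simulation, whereas here they are hand-written Boolean functions over a toy symbolic state. All names (`Skill`, `select_skill`, `run_reactive`, the `holding` and `stacked` predicates, and the deliberately failing first placement) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical symbolic state: predicate name -> truth value.
State = Dict[str, bool]


@dataclass
class Skill:
    name: str
    precondition: Callable[[State], bool]  # may the skill run in this state?
    termination: Callable[[State], bool]   # has the skill's effect been achieved?
    step: Callable[[State], State]         # stand-in for executing a learned policy


def select_skill(plan: List[Skill], state: State) -> Optional[Skill]:
    """Pick the latest skill in the plan that is applicable and not yet finished."""
    for skill in reversed(plan):
        if skill.precondition(state) and not skill.termination(state):
            return skill
    return None


def run_reactive(plan: List[Skill], state: State,
                 max_steps: int = 20) -> Tuple[List[str], State]:
    """Re-select a skill after every step, so a failed skill is retried."""
    trace: List[str] = []
    for _ in range(max_steps):
        if plan[-1].termination(state):  # goal: the last skill's effect holds
            break
        skill = select_skill(plan, state)
        if skill is None:                # no applicable skill: give up
            break
        state = skill.step(state)
        trace.append(skill.name)
    return trace, state


def make_flaky_place() -> Callable[[State], State]:
    """Placement that drops the block on the first attempt (simulated failure)."""
    attempts = {"n": 0}

    def step(state: State) -> State:
        attempts["n"] += 1
        if attempts["n"] == 1:
            return {**state, "holding": False}  # block dropped, not stacked
        return {**state, "holding": False, "stacked": True}

    return step


grasp = Skill(
    name="grasp",
    precondition=lambda s: not s["holding"] and not s["stacked"],
    termination=lambda s: s["holding"],
    step=lambda s: {**s, "holding": True},
)
place = Skill(
    name="place",
    precondition=lambda s: s["holding"],
    termination=lambda s: s["stacked"],
    step=make_flaky_place(),
)

trace, final_state = run_reactive([grasp, place], {"holding": False, "stacked": False})
print(trace)  # ['grasp', 'place', 'grasp', 'place']
```

Because the skill is re-selected from the current state after every step, the dropped block in this toy run leads to a second grasp without any explicit recovery logic. That is the sense in which such an executor is reactive.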
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Multimodal Machine Learning Applications
