Build on Priors: Vision-Language-Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation
Pierrick Lorang, Johannes Huemer, Timothy Duggan, Kai Goebel, Patrik Zips, Matthias Scheutz

TL;DR
This paper introduces a scalable neuro-symbolic framework that lets robots learn long-horizon manipulation tasks from as few as one to thirty unannotated skill demonstrations by automatically constructing symbolic planning domains and control policies.
Contribution
The paper presents a method that autonomously builds symbolic planning domains and control policies from a few unannotated demonstrations, without manual domain engineering.
Findings
The framework was applied to a real industrial forklift and evaluated in statistically rigorous trials.
Cross-platform generality was demonstrated on a Kinova Gen3 robotic arm.
Learning was data-efficient, requiring only a small number of unannotated demonstrations and reducing reliance on manual annotation.
Abstract
Enabling robots to learn long-horizon manipulation tasks from a handful of demonstrations remains a central challenge in robotics. Existing neuro-symbolic approaches often rely on hand-crafted symbolic abstractions, semantically labeled trajectories or large demonstration datasets, limiting their scalability and real-world applicability. We present a scalable neuro-symbolic framework that autonomously constructs symbolic planning domains and data-efficient control policies from as few as one to thirty unannotated skill demonstrations, without requiring manual domain engineering. Our method segments demonstrations into skills and employs a Vision-Language Model (VLM) to classify skills and identify equivalent high-level states, enabling automatic construction of a state-transition graph. This graph is processed by an Answer Set Programming solver to synthesize a PDDL planning domain,…
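The graph-construction step described in the abstract can be illustrated with a minimal sketch. It assumes each demonstration has already been segmented and that every segment carries a skill label and high-level state labels (in the paper these come from a VLM; here they are hand-written, hypothetical strings). The breadth-first search is only a stand-in for the planning stage and is not the paper's ASP-to-PDDL synthesis; the function names `build_transition_graph` and `find_skill_sequence` are illustrative.

```python
from collections import defaultdict, deque


def build_transition_graph(demos):
    """Merge labeled skill segments into one state-transition graph.

    Each demo is a list of (state, skill, next_state) triples. Segments that
    share a state label are treated as the same high-level state, so separate
    demonstrations are stitched together automatically.
    """
    graph = defaultdict(dict)
    for demo in demos:
        for state, skill, next_state in demo:
            graph[state][skill] = next_state
    return dict(graph)


def find_skill_sequence(graph, start, goal):
    """Breadth-first search for a shortest skill sequence from start to goal."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for skill, next_state in graph.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, plan + [skill]))
    return None  # goal not reachable from start


# Hypothetical labels for two short demonstrations of a pallet-moving task.
demos = [
    [("pallet_on_ground", "approach", "forks_aligned"),
     ("forks_aligned", "lift", "pallet_lifted")],
    [("pallet_lifted", "transport", "pallet_at_goal"),
     ("pallet_at_goal", "lower", "pallet_placed")],
]

graph = build_transition_graph(demos)
plan = find_skill_sequence(graph, "pallet_on_ground", "pallet_placed")
print(plan)
```

Because the two demonstrations share the state label `pallet_lifted`, the merged graph supports a plan that spans both, even though no single demonstration covers the whole task.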
