SDS -- See it, Do it, Sorted: Quadruped Skill Synthesis from Single Video Demonstration

Maria Stamatopoulou, Jeffrey Li, and Dimitrios Kanoulas

arXiv:2410.11571·cs.RO·August 21, 2025·2 cites

Open Access

TL;DR

SDS enables quadruped robots to learn multiple gaits from a single unstructured video demonstration by using GPT-4o to generate reward functions, reporting 100% gait matching fidelity and stable locomotion in simulation and on real hardware.

Contribution

The paper introduces SDS, a pipeline that synthesizes quadruped locomotion skills from a single unlabeled video, using GPT-4o to generate reward functions and refining them with self-supervised feedback from training.

Findings

01  Achieves 100% gait matching fidelity in simulation and in the real world

02  Generalizes to other quadruped morphologies, such as ANYmal

03  Outperforms prior methods in data efficiency and training speed

Abstract

Imagine a robot learning locomotion skills from any single video, without labels or reward engineering. We introduce SDS ("See it. Do it. Sorted."), an automated pipeline for skill acquisition from unstructured demonstrations. Using GPT-4o, SDS applies novel prompting techniques, in the form of spatio-temporal grid-based visual encoding ($G_{v}$) and structured input decomposition (SUS). These produce executable reward functions (RF) from the raw input videos. The RFs are used to train PPO policies and are optimized through closed-loop evolution, using training footage and performance metrics as self-supervised signals. SDS allows quadrupeds (e.g. Unitree Go1) to learn four gaits -- trot, bound, pace, and hop -- achieving 100% gait matching fidelity, Dynamic Time Warping (DTW) distance in the order of $10^{-6}$, and stable locomotion with zero failures, both in simulation and the real…
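The abstract reports a Dynamic Time Warping (DTW) distance as one of its similarity measures. As a reminder of what that quantity is, here is a minimal sketch of the textbook DTW recurrence on two one-dimensional sequences. The function name `dtw_distance`, the absolute-difference local cost, and the toy sequences are assumptions made for illustration; the excerpt does not say which signals, cost function, or normalization the authors use, so this should not be read as their evaluation code.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping between two 1-D numeric sequences.

    Assumption: local cost is |a[i] - b[j]| and no path normalization
    is applied. Runs in O(len(a) * len(b)) time, O(len(b)) memory.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing

    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr

    return prev[m]


# Hypothetical toy signals: the second repeats one sample, which DTW
# absorbs by warping the time axis.
reference = [0.0, 1.0, 0.0]
stretched = [0.0, 1.0, 1.0, 0.0]
print(dtw_distance(reference, stretched))
```

A distance near zero means the two sequences have the same shape up to local stretching or compression in time, which is why DTW suits comparing a demonstrated motion with a learned one that runs at a slightly different speed.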

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Assessment and Pedagogy · Multimodal Machine Learning Applications