QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control

Li Gao; Fuzhi Yang; Jianhui Chen; Liu Liu; Yao Zheng; Yang Cai; Ziqiao Li

arXiv:2603.24021 · cs.RO · March 26, 2026

TL;DR

QuadFM is a comprehensive, high-fidelity dataset of quadruped motions with rich language annotations, enabling advanced text-to-motion generation and control for more intuitive human-robot interactions.

Contribution

The paper introduces QuadFM, the first large-scale dataset combining diverse quadruped motions with detailed language grounding, and proposes Gen2Control RL for real-time, end-to-end motion synthesis on edge hardware.

Findings

01. Achieves real-time motion synthesis with <500 ms latency.

02. Supports diverse, realistic quadruped behaviors including emotional and interactive motions.

03. Demonstrates effective transfer from simulation to real-world robot control.

Abstract

Despite significant advances in quadrupedal robotics, a critical gap persists in foundational motion resources that holistically integrate diverse locomotion, emotionally expressive behaviors, and rich language semantics, all of which are essential for agile, intuitive human-robot interaction. Current quadruped motion datasets are limited to a few mocap primitives (e.g., walk, trot, sit) and lack diverse behaviors with rich language grounding. To bridge this gap, we introduce Quadruped Foundational Motion (QuadFM), the first large-scale, ultra-high-fidelity dataset designed for text-to-motion generation and general motion control. QuadFM contains 11,784 curated motion clips spanning locomotion, interactive, and emotion-expressive behaviors (e.g., dancing, stretching, peeing), each with three-layer annotation (fine-grained action labels, interaction scenarios, and natural language commands), totaling 35,352…

Taxonomy

Topics: Human Motion and Animation · Social Robot Interaction and HRI · Human Pose and Action Recognition