Hierarchical Intention-Aware Expressive Motion Generation for Humanoid Robots
Lingfan Bao, Yan Pan, Tianhu Peng, Dimitrios Kanoulas, Chengxu Zhou

TL;DR
This paper presents a hierarchical framework that combines intention-aware reasoning with diffusion models to generate expressive, socially appropriate humanoid robot motions in real time, with the aim of improving human-robot interaction.
Contribution
It introduces a novel hierarchical system integrating intention reasoning via in-context learning with diffusion-based motion synthesis for humanoid robots.
Findings
Robust real-time motion generation in dynamic scenarios
High social appropriateness of generated gestures
Effective intention refinement and adaptive responses
Abstract
Effective human-robot interaction requires robots to identify human intentions and generate expressive, socially appropriate motions in real-time. Existing approaches often rely on fixed motion libraries or computationally expensive generative models. We propose a hierarchical framework that combines intention-aware reasoning via in-context learning (ICL) with real-time motion generation using diffusion models. Our system introduces structured prompting with confidence scoring, fallback behaviors, and social context awareness to enable intention refinement and adaptive response. Leveraging large-scale motion datasets and efficient latent-space denoising, the framework generates diverse, physically plausible gestures suitable for dynamic humanoid interactions. Experimental validation on a physical platform demonstrates the robustness and social alignment of our method in realistic…
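The abstract mentions structured prompting with confidence scoring and fallback behaviors but does not say how they are implemented. The following is a minimal sketch of one way such a gate could work, assuming the reasoning module returns a JSON object with `intent` and `confidence` fields. The field names, the threshold value, the fallback label, and the function `select_intent` are all hypothetical illustrations, not the authors' design.

```python
import json
from dataclasses import dataclass

# Hypothetical values chosen for illustration; the paper does not specify them.
FALLBACK_INTENT = "idle_attentive"
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class IntentDecision:
    intent: str
    confidence: float
    used_fallback: bool


def select_intent(raw_response: str,
                  threshold: float = CONFIDENCE_THRESHOLD) -> IntentDecision:
    """Pick an intent from a structured model response, or fall back.

    Assumes (hypothetically) that the response is a JSON object such as
    {"intent": "wave", "confidence": 0.9}.
    """
    try:
        parsed = json.loads(raw_response)
        intent = str(parsed["intent"])
        confidence = float(parsed["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Malformed or incomplete output: use the safe default behavior.
        return IntentDecision(FALLBACK_INTENT, 0.0, True)

    # Out-of-range (including NaN) or low-confidence scores trigger the fallback.
    if not 0.0 <= confidence <= 1.0 or confidence < threshold:
        return IntentDecision(FALLBACK_INTENT, confidence, True)

    return IntentDecision(intent, confidence, False)


if __name__ == "__main__":
    print(select_intent('{"intent": "wave", "confidence": 0.9}'))
    print(select_intent('{"intent": "wave", "confidence": 0.3}'))
    print(select_intent("not json"))
```

In a design like this, the selected intent label would then condition the downstream motion generator, and the fallback label would map to a neutral gesture so the robot never acts on an uncertain reading.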
Taxonomy
Topics: Human Motion and Animation · Social Robot Interaction and HRI · Human Pose and Action Recognition
