ActionPlan: Future-Aware Streaming Motion Synthesis via Frame-Level Action Planning
Eric Nazarenus, Chuqiao Li, Yannan He, Xianghui Xie, Jan Eric Lenssen, Gerard Pons-Moll

TL;DR
ActionPlan is a unified motion diffusion framework that enables real-time streaming and high-quality offline motion synthesis through frame-level action planning, improving speed and quality while supporting editing and in-betweening.
Contribution
It introduces a novel per-frame action plan and latent-specific diffusion steps, unifying real-time streaming with offline high-quality motion generation in a single model.
Findings
Real-time streaming is 5.25x faster than previous methods.
Achieves an 18% improvement in motion quality (FID) over prior work.
Supports zero-shot motion editing and in-betweening without extra models.
Abstract
We present ActionPlan, a unified motion diffusion framework that bridges real-time streaming with high-quality offline generation within a single model. The core idea is to introduce a per-frame action plan: the model predicts frame-level text latents that act as dense semantic anchors throughout denoising, and uses them to denoise the full motion sequence with combined semantic and motion cues. To support this structured workflow, we design latent-specific diffusion steps, allowing each motion latent to be denoised independently and sampled in flexible orders at inference. As a result, ActionPlan can run in a history-conditioned, future-aware mode for real-time streaming, while also supporting high-quality offline generation. The same mechanism further enables zero-shot motion editing and in-betweening without additional models. Experiments demonstrate that our real-time streaming is…
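One way to picture latent-specific diffusion steps is as a table of noise levels, with one row per sampling iteration and one column per motion latent. The sketch below is illustrative only and is not taken from the paper: the function names `offline_schedule` and `streaming_schedule`, the `delay` parameter, and the staggered layout are all assumptions chosen to show how per-latent steps could allow both a shared schedule (offline) and a history-first schedule (streaming).

```python
def offline_schedule(num_latents, num_steps):
    """Every latent shares one noise level per iteration (hypothetical offline mode)."""
    # Rows run from the noisiest level (num_steps) down to clean (0).
    return [[t] * num_latents for t in range(num_steps, -1, -1)]


def streaming_schedule(num_latents, num_steps, delay):
    """Staggered noise levels: latent i starts denoising `delay * i` iterations late.

    This is an assumed schedule for illustration, not the authors' sampler.
    Earlier latents are never noisier than later ones at any iteration, so
    earlier frames finish first while later frames are still partly noisy.
    """
    total_iterations = num_steps + delay * (num_latents - 1) + 1
    schedule = []
    for it in range(total_iterations):
        row = []
        for i in range(num_latents):
            # Level counts down once this latent's start offset is reached,
            # and is clamped to the valid range [0, num_steps].
            level = num_steps - (it - delay * i)
            row.append(min(num_steps, max(0, level)))
        schedule.append(row)
    return schedule


if __name__ == "__main__":
    for row in streaming_schedule(num_latents=4, num_steps=3, delay=1):
        print(row)
```

In this sketch, the first latent reaches noise level 0 after `num_steps` iterations, while later latents are still partly noisy. That ordering loosely mirrors the history-conditioned, future-aware mode the abstract describes. A real sampler would pass each row to the denoising network as a vector of per-latent timesteps.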
Taxonomy
Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
