MotionCFG: Boosting Motion Dynamics via Stochastic Concept Perturbation
Byungjun Kim, Soobin Um, Jong Chul Ye

TL;DR
MotionCFG introduces a noise-contrastive approach to enhance motion dynamics in text-to-video synthesis, avoiding semantic bias from explicit negative prompts and improving temporal detail refinement.
Contribution
The paper presents MotionCFG, a novel noise-perturbed contrastive guidance method that improves motion quality without semantic distortion in T2V models.
Findings
Enhanced motion dynamics across state-of-the-art frameworks
Effective steering of complex concepts like object numerosity
Minimal additional computational overhead
Abstract
Despite recent advances in Text-to-Video (T2V) synthesis, generating high-fidelity and dynamic motion remains a significant challenge. Existing methods primarily rely on Classifier-Free Guidance (CFG), often with explicit negative prompts (e.g., "static", "blurry"), to suppress undesired artifacts. However, such explicit negations frequently introduce unintended semantic bias and distort object integrity, a phenomenon we define as Content-Motion Drift. To address this, we propose MotionCFG, a framework that enhances motion dynamics by contrasting a target concept with its noise-perturbed counterparts. Specifically, by injecting Gaussian noise into the concept embeddings, MotionCFG creates localized negative anchors that encapsulate a broad complementary space of sub-optimal motion variations. Unlike explicit negations, this approach facilitates implicit hard negative mining without…
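To make the contrast described in the abstract concrete, here is a minimal sketch of one guidance step in which the negative branch is conditioned on Gaussian-perturbed copies of the concept embedding rather than on an explicit negative prompt. It is not the authors' implementation. The abstract is truncated and does not state the guidance formula, so the CFG-style extrapolation, the averaging over several negatives, and every name here (perturb_embedding, noise_contrastive_guidance, toy_denoiser, sigma, scale, num_negatives) are assumptions made for illustration.

```python
import numpy as np


def perturb_embedding(concept_emb, sigma, rng):
    """Return the concept embedding plus isotropic Gaussian noise of std sigma."""
    return concept_emb + sigma * rng.standard_normal(concept_emb.shape)


def noise_contrastive_guidance(denoiser, x_t, t, concept_emb,
                               scale=7.5, sigma=0.1, num_negatives=4, seed=0):
    """One guidance step with noise-perturbed negatives (assumed form).

    The positive branch uses the clean concept embedding. The negative branch
    averages denoiser outputs over `num_negatives` perturbed embeddings, and
    the result extrapolates from the negative toward the positive, as in
    standard classifier-free guidance.
    """
    rng = np.random.default_rng(seed)
    eps_pos = denoiser(x_t, t, concept_emb)
    eps_neg = np.mean(
        [denoiser(x_t, t, perturb_embedding(concept_emb, sigma, rng))
         for _ in range(num_negatives)],
        axis=0,
    )
    return eps_neg + scale * (eps_pos - eps_neg)


def toy_denoiser(x_t, t, emb):
    """Hypothetical stand-in for a T2V denoiser, used only to run the sketch."""
    return 0.5 * x_t + emb.mean()


if __name__ == "__main__":
    x_t = np.ones((2, 3))        # placeholder latent
    emb = np.arange(4.0)         # placeholder concept embedding
    guided = noise_contrastive_guidance(toy_denoiser, x_t, t=0, concept_emb=emb)
    print(guided.shape)
```

Under these assumptions, setting sigma to 0 makes the negative branch identical to the positive one, so the guided output reduces to the plain conditional prediction. That gives a quick sanity check when swapping in a real denoiser.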
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Motion and Animation · Multimodal Machine Learning Applications
