Move-in-2D: 2D-Conditioned Human Motion Generation

Hsin-Ping Huang; Yang Zhou; Jui-Hsien Wang; Difan Liu; Feng Liu; Ming-Hsuan Yang; Zhan Xu

arXiv:2412.13185·cs.CV·December 18, 2024

Open Access

TL;DR

Move-in-2D introduces a diffusion-based method for generating human motion sequences conditioned on scene images and text prompts, enabling diverse and scene-adaptive human motion synthesis for improved video quality.

Contribution

It presents a novel diffusion model that generates scene-conditioned human motion sequences using scene images and text prompts, trained on a large-scale annotated video dataset.

Findings

01. Effective scene-aligned human motion prediction

02. Enhanced motion diversity and scene adaptation

03. Improved human motion quality in video synthesis

Abstract

Generating realistic human videos remains a challenging task, with the most effective methods currently relying on a human motion sequence as a control signal. Existing approaches often use existing motion extracted from other videos, which restricts applications to specific motion types and global scene matching. We propose Move-in-2D, a novel approach to generate human motion sequences conditioned on a scene image, allowing for diverse motion that adapts to different scenes. Our approach utilizes a diffusion model that accepts both a scene image and text prompt as inputs, producing a motion sequence tailored to the scene. To train this model, we collect a large-scale video dataset featuring single-human activities, annotating each video with the corresponding human motion as the target output. Experiments demonstrate that our method effectively predicts human motion that aligns with…
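
To make the conditioning idea in the abstract concrete, here is a minimal sketch of how a diffusion sampler could turn a scene-image embedding and a text embedding into a motion sequence. It is not the authors' implementation: the `ToyDenoiser` class, the `sample_motion` function, the embedding sizes, the 34-dimensional pose (17 joints with 2D coordinates), and the linear noise schedule are all assumptions chosen for illustration, and the denoiser uses random weights rather than a trained network. The loop itself is standard DDPM ancestral sampling.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption; the paper's schedule is not given here)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


class ToyDenoiser:
    """Hypothetical stand-in for a learned noise predictor eps(x_t, t, cond)."""

    def __init__(self, motion_dim, cond_dim, rng):
        self.w_x = rng.normal(scale=0.01, size=(motion_dim, motion_dim))
        self.w_c = rng.normal(scale=0.01, size=(cond_dim, motion_dim))

    def __call__(self, x_t, t, cond):
        # x_t: (frames, motion_dim); cond: (cond_dim,).
        # The timestep t is ignored in this toy model; a real network would embed it.
        return x_t @ self.w_x + cond @ self.w_c  # cond term broadcasts over frames


def sample_motion(denoiser, scene_emb, text_emb, num_frames, motion_dim,
                  num_steps=50, seed=0):
    """DDPM ancestral sampling of a motion sequence given scene and text embeddings."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    # Assumption: the two conditions are fused by simple concatenation.
    cond = np.concatenate([scene_emb, text_emb])
    x = rng.standard_normal((num_frames, motion_dim))  # start from pure noise
    for t in reversed(range(num_steps)):
        eps = denoiser(x, t, cond)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x
```

Calling `sample_motion` with a scene embedding and a text embedding returns an array of shape `(num_frames, motion_dim)`, one pose vector per frame. In a real system the embeddings would come from image and text encoders and the denoiser would be trained on annotated video motion, as the abstract describes.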

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · 3D Shape Modeling and Analysis

Methods: Diffusion