Do As I Do: Pose Guided Human Motion Copy
Sifan Wu, Zhenguang Liu, Beibei Zhang, Roger Zimmermann, Zhongjie Ba, Xiaosong Zhang, Kui Ren

TL;DR
This paper introduces a novel pose-guided human motion copying method that enhances realism and temporal consistency in generated videos by combining perceptual and Gromov-Wasserstein losses, episodic memory, and sequence-based foreground generation.
Contribution
The paper proposes a new approach integrating perceptual and Gromov-Wasserstein losses, episodic memory, and sequence-to-sequence foreground generation for improved human motion copying.
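To give a sense of what a Gromov-Wasserstein term measures, here is a minimal sketch in Python. It is not the paper's implementation: the function names, the use of squared Euclidean distances, and the fixed (non-optimized) coupling are all assumptions made for illustration. The idea is to compare two point sets only through their internal pairwise-distance structure, so the sets may live in different spaces (for example pose features and appearance features).

```python
import numpy as np


def pairwise_sq_dists(x):
    """Squared Euclidean distances between all rows of x (hypothetical helper)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def gw_discrepancy(c1, c2, coupling):
    """Gromov-Wasserstein objective for a FIXED coupling (illustrative only).

    Computes sum_{i,j,k,l} (c1[i,k] - c2[j,l])**2 * T[i,j] * T[k,l],
    where c1 is (n, n), c2 is (m, m) and T = coupling is (n, m).
    A full GW distance would also minimize over T; that step is omitted.
    """
    # Axes of `diff` are ordered (i, j, k, l).
    diff = c1[:, None, :, None] - c2[None, :, None, :]
    return float(np.einsum("ijkl,ij,kl->", diff ** 2, coupling, coupling))


# Assumed toy data: y is a rotated and shifted copy of x.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
y = x @ rotation + 3.0

identity_coupling = np.eye(5) / 5.0  # match point i to point i
gw_same_structure = gw_discrepancy(
    pairwise_sq_dists(x), pairwise_sq_dists(y), identity_coupling
)
gw_distorted = gw_discrepancy(
    pairwise_sq_dists(x), pairwise_sq_dists(2.0 * y), identity_coupling
)
```

Because rotation and translation preserve pairwise distances, `gw_same_structure` is zero up to floating-point error, while scaling one set changes its internal structure and yields a positive value.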
Findings
Outperforms state-of-the-art methods in PSNR and FID metrics.
Achieves more realistic and temporally consistent human motion videos.
Demonstrates effectiveness across five diverse datasets.
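For readers unfamiliar with PSNR, one of the two metrics named above, the following Python sketch shows the standard definition. It assumes 8-bit images with a peak value of 255; the function name and the toy arrays are illustrative and do not come from the paper's evaluation code.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    mse = np.mean((reference - generated) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Assumed toy frames: a flat gray image and a copy with every pixel off by one.
frame = np.full((4, 4), 100, dtype=np.uint8)
off_by_one = frame + 1
psnr_value = psnr(frame, off_by_one)  # MSE is 1, so this equals 20 * log10(255)
```

Higher PSNR means the generated frame is closer to the reference pixel by pixel. FID, by contrast, compares feature statistics of whole image sets and is lower when the sets are more alike.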
Abstract
Human motion copy is an intriguing yet challenging task in artificial intelligence and computer vision, which strives to generate a fake video of a target person performing the motion of a source person. The problem is inherently challenging due to the subtle human-body texture details to be generated and the temporal consistency to be considered. Existing approaches typically adopt a conventional GAN with an L1 or L2 loss to produce the target fake video, which intrinsically necessitates a large number of training samples that are challenging to acquire. Meanwhile, current methods still have difficulties in attaining realistic image details and temporal consistency, which unfortunately can be easily perceived by human observers. Motivated by this, we try to tackle the issues from three aspects: (1) We constrain pose-to-appearance generation with a perceptual loss and a theoretically…
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Human Motion and Animation
