GUNet: A Graph Convolutional Network United Diffusion Model for Stable and Diversity Pose Generation
Shuowen Liang, Sisi Li, Qingyun Wang, Cen Zhang, Kaiquan Zhu, Tian Yang

TL;DR
This paper introduces PoseDiffusion, a diffusion-based framework built around the GUNet denoising model, for generating diverse, structurally correct, and aesthetically pleasing human pose skeletons from text descriptions. The authors report that it outperforms existing GAN-based methods.
Contribution
The paper presents what the authors describe as the first diffusion-based framework for text-driven pose skeleton generation. Its denoising model, GUNet, incorporates graph convolutional networks to improve structural accuracy and diversity.
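To make the diffusion side concrete, here is a minimal sketch of the standard DDPM forward (noising) step applied to a pose. It assumes the pose is stored as an array of 2D joint coordinates and uses a linear noise schedule. Neither detail is stated in this summary, and the names `linear_beta_schedule` and `q_sample`, the joint count, and the step count are all hypothetical choices for illustration.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed choice, not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)       # cumulative product of (1 - beta_t)

# Hypothetical pose: 17 joints, each an (x, y) coordinate scaled to [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=(17, 2))
t = 500
x_t, noise = q_sample(x0, t, alpha_bar, rng)
```

A denoising model is then trained to predict `noise` from `x_t`, the timestep `t`, and the text condition. In PoseDiffusion that role is played by GUNet.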
Findings
Outperforms state-of-the-art GAN-based methods in stability and diversity
Incorporates skeletal information via graph convolutional networks for better structural learning
Demonstrates superior controllability and aesthetic quality in pose generation
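As an illustration of how skeletal structure can enter a network, the sketch below applies one generic graph convolution layer (symmetric-normalized adjacency with self-loops) to joint features. It is not GUNet's actual layer: the 5-joint chain skeleton, the feature sizes, and the function names `normalized_adjacency` and `graph_conv` are assumptions made for the example.

```python
import numpy as np

def normalized_adjacency(num_joints, edges):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    A = np.eye(num_joints)                 # self-loops
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(X, A_hat, W):
    """One layer: aggregate features from neighboring joints, project, apply ReLU."""
    return np.maximum(A_hat @ X @ W, 0.0)

# Hypothetical toy skeleton: 5 joints connected in a chain.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
A_hat = normalized_adjacency(5, edges)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))            # per-joint input features, e.g. (x, y)
W = rng.standard_normal((2, 8))            # projection weights (random here, learned in practice)
H = graph_conv(X, A_hat, W)                # per-joint output features, shape (5, 8)
```

Because `A_hat` is zero between joints that share no bone, each joint's output depends only on itself and its direct neighbors. Stacking layers widens that neighborhood along the skeleton.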
Abstract
Pose skeleton images are an important reference in pose-controllable image generation. To enrich the supply of skeleton images, recent works have investigated generating pose skeletons from natural language. These methods are based on GANs. However, it remains challenging to generate diverse, structurally correct, and aesthetically pleasing human pose skeletons from varied textual inputs. To address this problem, we propose PoseDiffusion, a framework with GUNet as its main model. It is the first generative framework based on a diffusion model and also contains a series of variants fine-tuned from a Stable Diffusion model. PoseDiffusion demonstrates several desirable properties that existing methods lack. 1) Correct Skeletons. GUNet, the denoising model of PoseDiffusion, is designed to incorporate graph convolutional neural networks. It is able to…
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Robot Manipulation and Learning
Methods: Diffusion
