GUNet: A Graph Convolutional Network United Diffusion Model for Stable and Diversity Pose Generation

Shuowen Liang, Sisi Li, Qingyun Wang, Cen Zhang, Kaiquan Zhu, Tian Yang

arXiv:2409.11689·cs.CV·September 19, 2024

TL;DR

This paper introduces GUNet and PoseDiffusion, a novel diffusion-based framework for generating diverse, accurate, and aesthetically pleasing human pose skeletons from textual descriptions, outperforming existing GAN-based methods.

Contribution

It presents the first diffusion model-based framework for text-driven pose skeleton generation, incorporating graph convolutional networks for improved structural accuracy and diversity.

Findings

01

Outperforms state-of-the-art GAN-based methods in stability and diversity

02

Incorporates skeletal information via graph convolutional networks for better structural learning

03

Demonstrates superior controllability and aesthetic quality in pose generation
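
The second finding refers to graph convolution over the skeleton. As a rough illustration only, the sketch below applies one layer of the standard graph convolution rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), to a toy skeleton. The five-joint layout, the `EDGES` list, and all function names are assumptions made for this example; this is not GUNet's actual layer, whose details are not given here.

```python
import math

# Hypothetical 5-joint toy skeleton (NOT the paper's joint layout):
# 0 = head, 1 = neck, 2 = left hand, 3 = right hand, 4 = hip
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_joints, edges):
    """Return D^-1/2 (A + I) D^-1/2 as a nested list."""
    # Start from the identity so every joint has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0  # bones are undirected
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def graph_conv(features, a_hat, weight):
    """One graph convolution: ReLU(a_hat @ features @ weight)."""
    mixed = matmul(a_hat, features)   # each joint averages over its neighbours
    out = matmul(mixed, weight)       # shared linear map applied per joint
    return [[max(0.0, v) for v in row] for row in out]


A_HAT = normalized_adjacency(NUM_JOINTS, EDGES)
# 2D joint coordinates used as input features (made-up values).
coords = [[0.5, 0.1], [0.5, 0.3], [0.2, 0.5], [0.8, 0.5], [0.5, 0.7]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix
smoothed = graph_conv(coords, A_HAT, identity)
```

Because each joint's output mixes in only its bone-connected neighbours, the two hands (joints 2 and 3) do not influence each other directly in a single layer; that locality is what lets such layers encode skeletal structure.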

Abstract

Pose skeleton images are an important reference in pose-controllable image generation. To enrich the supply of skeleton images, recent works have investigated generating pose skeletons from natural language. These methods are based on GANs. However, generating diverse, structurally correct, and aesthetically pleasing human pose skeletons from varied textual inputs remains challenging. To address this problem, we propose PoseDiffusion, a framework with GUNet as its main model. It is the first generative framework for this task based on a diffusion model, and it also includes a series of variants fine-tuned from a Stable Diffusion model. PoseDiffusion demonstrates several desirable properties and outperforms existing methods. 1) Correct Skeletons. GUNet, the denoising model of PoseDiffusion, is designed to incorporate graph convolutional neural networks. It is able to…

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · Robot Manipulation and Learning

Methods: Diffusion