KB-DMGen: Knowledge-Based Global Guidance and Dynamic Pose Masking for Human Image Generation

Shibang Liu; Xuemei Xie; Guangming Shi

arXiv:2507.20083·cs.CV·September 16, 2025

TL;DR

KB-DMGen introduces a novel framework combining a visual codebook and dynamic pose masking to improve human image generation, achieving state-of-the-art results in pose accuracy and image quality.

Contribution

The paper proposes an approach that combines knowledge-based global guidance with dynamic pose masking, improving pose accuracy without sacrificing image quality in human image generation.

Findings

01. Achieves new state-of-the-art AP and CAP scores on the HumanArt dataset.

02. Effectively balances pose accuracy and image quality.

03. Demonstrates superior performance over existing methods.

Abstract

Recent methods using diffusion models have made significant progress in Human Image Generation (HIG) with various control signals such as pose priors. In HIG, both accurate human poses and coherent visual quality are crucial for image generation. However, most existing methods mainly focus on pose accuracy while neglecting overall image quality, often improving pose alignment at the cost of image quality. To address this, we propose Knowledge-Based Global Guidance and Dynamic pose Masking for human image Generation (KB-DMGen). The Knowledge Base (KB), implemented as a visual codebook, provides coarse, global guidance based on input text-related visual features, improving pose accuracy while maintaining image quality, while the Dynamic pose Mask (DM) offers fine-grained local control to enhance precise pose accuracy. By injecting KB and DM at different stages of the diffusion process,…
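The two components named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names `codebook_guidance` and `pose_mask`, the softmax-weighted lookup, the dot-product similarity, and the circular mask shape are all hypothetical. The abstract does not say how the codebook is queried or what makes the mask dynamic, so the sketch only shows a generic soft codebook lookup (coarse, global signal) and a keypoint-centered binary mask (fine, local signal) with a caller-supplied radius.

```python
import math


def codebook_guidance(query, codebook, temperature=1.0):
    """Soft codebook lookup (hypothetical): return the softmax-weighted
    average of codebook entries, weighted by dot-product similarity to
    the query feature."""
    scores = [
        sum(q * c for q, c in zip(query, entry)) / temperature
        for entry in codebook
    ]
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(codebook[0])
    return [
        sum(w * entry[d] for w, entry in zip(weights, codebook))
        for d in range(dim)
    ]


def pose_mask(keypoints, height, width, radius):
    """Binary mask (hypothetical): 1 for pixels within `radius` of any
    (row, col) keypoint, 0 elsewhere. How the mask changes over the
    diffusion process is not modeled here."""
    mask = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if any(
                (r - kr) ** 2 + (c - kc) ** 2 <= radius ** 2
                for kr, kc in keypoints
            ):
                mask[r][c] = 1
    return mask


if __name__ == "__main__":
    # Toy text-related feature and a two-entry codebook (made-up values).
    guidance = codebook_guidance([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(guidance)
    for row in pose_mask([(1, 1)], 3, 3, 1):
        print(row)
```

In this toy setup, a lower `temperature` makes the lookup behave more like a hard nearest-entry selection, while a larger `radius` widens the region around each keypoint that the local control would affect.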
