# FreqPose: Frequency-Aware Diffusion with Fractional Gabor Filters and Global Pose–Semantic Alignment

**Authors:** Meng Wang, Bing Wang, Huiling Chen, Jing Ren, Xueping Tang

PMC · DOI: 10.3390/s26041334 · 2026-02-19

## TL;DR

FreqPose is a diffusion-based method for pose-guided person image generation that preserves high-frequency texture detail and keeps the person's identity consistent across pose changes.

## Contribution

The framework pairs a multi-level fractional-order Gabor frequency-aware network with a global semantic-pose alignment module built on cross-modal attention, both inside a diffusion-based generator.

## Key findings

- FreqPose outperforms existing state-of-the-art methods on SSIM, FID, and perceptual quality on the DeepFashion and Market1501 datasets.
- The method preserves high-frequency textures like hair and fabric under complex pose changes.
- Global semantic alignment ensures consistent appearance and identity during pose variations.
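
The paper attributes this alignment to a cross-modal attention mechanism between pose features and appearance semantics. The NumPy sketch below illustrates generic single-head cross-attention only. It is not the authors' module: the choice of pose tokens as queries and appearance tokens as keys and values, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(pose_tokens, appearance_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (illustrative assumption, not the paper's design).

    pose_tokens:       (n_pose, d_pose)  queries come from the pose branch
    appearance_tokens: (n_app, d_app)    keys and values come from the appearance branch
    """
    q = pose_tokens @ w_q            # (n_pose, d)
    k = appearance_tokens @ w_k      # (n_app, d)
    v = appearance_tokens @ w_v      # (n_app, d)
    # Every pose token scores every appearance token, so the mapping is global.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ v, weights
```

Because each pose token attends over all appearance tokens, the output for a given body location can draw on appearance evidence from anywhere in the source image, which is one plausible reading of "global" alignment.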

## Abstract

Pose-guided person image generation has long faced two major challenges: high-frequency texture details tend to blur or be lost during appearance transfer, and the semantic identity of the person is difficult to keep consistent across pose changes. To address these issues, this paper proposes a diffusion-based generative framework that integrates frequency awareness and global semantic alignment. The framework consists of two core modules. The first is a multi-level fractional-order Gabor frequency-aware network, which extracts and reconstructs high-frequency texture features such as hair strands and fabric wrinkles and enhances image detail fidelity through fractional-order filtering and complex-domain modeling. The second is a global semantic-pose alignment module, which uses a cross-modal attention mechanism to establish a global mapping between pose features and appearance semantics, ensuring pose-driven semantic alignment and appearance consistency. Working together, the two modules allow the generated results to maintain structural integrity and natural textures even under complex pose variations and large-angle rotations. Experimental results on the DeepFashion and Market1501 datasets show that the proposed method outperforms existing state-of-the-art approaches in SSIM, FID, and perceptual quality, validating its effectiveness in enhancing texture fidelity and semantic consistency.
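
The frequency-aware network builds on Gabor filtering with complex-domain modeling. As a rough illustration, the sketch below constructs a standard complex Gabor kernel and applies it through the FFT. The abstract does not define the fractional-order extension, so it is deliberately omitted here. The function names and every parameter value (`size`, `wavelength`, `theta`, `sigma`, `gamma`) are assumptions for illustration, not values from the paper.

```python
import numpy as np

def complex_gabor_kernel(size=15, wavelength=4.0, theta=0.0, sigma=3.0, gamma=0.5):
    """Standard complex 2-D Gabor kernel; `size` is assumed to be odd.

    The real part is the even (cosine) filter, the imaginary part the odd (sine) filter.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate coordinates so the carrier wave oscillates along direction `theta`.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * x_r / wavelength)
    return envelope * carrier

def gabor_magnitude(image, kernel):
    """Circular convolution via the FFT; returns the magnitude of the complex response."""
    h, w = image.shape
    kh, kw = kernel.shape
    padded = np.zeros((h, w), dtype=np.complex128)
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to the origin so the response is not translated.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    response = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded))
    return np.abs(response)
```

Applied to a striped texture, a kernel whose orientation and wavelength match the stripes responds strongly, while an orthogonal kernel responds weakly. That orientation and frequency selectivity is the usual reason Gabor filters are chosen for fine textures such as hair strands and fabric wrinkles.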

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12944463/full.md

---
Source: https://tomesphere.com/paper/PMC12944463