ASemConsist: Adaptive Semantic Feature Control for Training-Free Identity-Consistent Generation

Shin Seong Kim; Minjung Shin; Hyunin Cho; Youngjung Uh

arXiv:2512.23245 · cs.CV · January 5, 2026

TL;DR

ASemConsist introduces a novel, training-free method for maintaining consistent character identity across generated images in text-to-image diffusion models by selectively modifying semantic features and evaluating identity preservation.

Contribution

The paper proposes a new semantic control framework that improves identity consistency without sacrificing prompt alignment, using adaptive feature sharing and a unified evaluation protocol.

Findings

01. Achieves state-of-the-art identity consistency in image sequences.
02. Effectively balances identity preservation and prompt alignment.
03. Introduces the CQS metric for comprehensive evaluation.

Abstract

Recent text-to-image diffusion models have significantly improved visual quality and text alignment. However, generating a sequence of images while preserving consistent character identity across diverse scene descriptions remains a challenging task. Existing methods often struggle with a trade-off between maintaining identity consistency and ensuring per-image prompt alignment. In this paper, we introduce a novel framework, ASemConsist, that addresses this challenge through selective text embedding modification, enabling explicit semantic control over character identity without sacrificing prompt alignment. Furthermore, based on our analysis of padding embeddings in FLUX, we propose a semantic control strategy that repurposes padding embeddings as semantic containers. Additionally, we introduce an adaptive feature-sharing strategy that automatically evaluates textual ambiguity and…
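
The idea of repurposing padding embeddings as semantic containers can be pictured with a toy sketch. Everything here is an assumption made for illustration: the function name `fill_padding_slots`, the list-of-lists embedding format, and the mask convention (0 marks a padding position). It is not the authors' implementation, and it does not call FLUX or any real text encoder; it only shows the general pattern of writing identity-related vectors into otherwise unused padding positions while leaving the prompt tokens untouched.

```python
from typing import List

Vector = List[float]


def fill_padding_slots(
    token_embeddings: List[Vector],
    attention_mask: List[int],
    identity_embeddings: List[Vector],
) -> List[Vector]:
    """Hypothetical helper: overwrite padding positions with identity vectors.

    Positions where attention_mask is 0 are treated as padding (an assumed
    convention). Identity vectors are written into those positions in order;
    any remaining padding positions are left unchanged, and real prompt
    tokens are never modified.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("embeddings and mask must have the same length")

    # Copy so the caller's embeddings are not mutated.
    result = [list(vec) for vec in token_embeddings]

    # Indices of padding tokens, in sequence order.
    pad_positions = [i for i, m in enumerate(attention_mask) if m == 0]

    for pos, ident in zip(pad_positions, identity_embeddings):
        if len(ident) != len(result[pos]):
            raise ValueError("identity vector has the wrong dimension")
        result[pos] = list(ident)

    return result


# Toy example: two prompt tokens followed by two padding slots.
prompt = [[1.0, 1.0], [2.0, 2.0], [0.0, 0.0], [0.0, 0.0]]
mask = [1, 1, 0, 0]
identity = [[9.0, 9.0]]

edited = fill_padding_slots(prompt, mask, identity)
```

In this toy run only the first padding slot receives the identity vector; the prompt tokens and the last padding slot keep their original values. How the paper actually selects and constructs the vectors placed in padding positions is not described in the text available here.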

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Face recognition and analysis