CharaConsist: Fine-Grained Consistent Character Generation
Mengyu Wang, Henghui Ding, Jianing Peng, Yao Zhao, Yunpeng Chen, Yunchao Wei

TL;DR
CharaConsist introduces a novel method for fine-grained consistent character generation in text-to-image models, effectively maintaining identity, clothing, and background details across varied scenes and motions.
Contribution
CharaConsist presents the first tailored approach for consistent character generation in text-to-image diffusion models, combining point-tracking attention and adaptive token merge with decoupled control of the foreground and background.
Findings
Achieves high-fidelity, consistent character generation across scenes.
Supports both continuous and discrete shot consistency.
Outperforms existing methods in maintaining identity and background details.
Abstract
In text-to-image generation, producing a series of consistent contents that preserve the same identity is highly valuable for real-world applications. Although a few works have explored training-free methods to enhance the consistency of generated subjects, we observe that they suffer from the following problems. First, they fail to maintain consistent background details, which limits their applicability. Furthermore, when the foreground character undergoes large motion variations, inconsistencies in identity and clothing details become evident. To address these problems, we propose CharaConsist, which employs point-tracking attention and adaptive token merge along with decoupled control of the foreground and background. CharaConsist enables fine-grained consistency for both foreground and background, supporting the generation of one character in continuous shots within a fixed scene or…
Taxonomy
Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques
Methods: Balanced Selection
