StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
Zhengguang Zhou, Jing Li, Huaxia Li, Nemo Chen, Xu Tang

TL;DR
StoryMaker is a personalization method for text-to-image generation that keeps multiple characters holistically consistent, preserving facial features, clothing, hairstyles, and body details so that a series of images can tell a cohesive story.
Contribution
It introduces a comprehensive personalization approach combining face identities, character features, and pose conditioning, with techniques to prevent inter-character interference and improve image fidelity.
Findings
Effective in maintaining multi-character scene consistency
Supports diverse applications and societal plug-ins
Improves image fidelity with LoRA enhancement
Abstract
Tuning-free personalized image generation methods have achieved significant success in maintaining facial consistency, i.e., identities, even with multiple characters. However, the lack of holistic consistency in scenes with multiple characters hampers these methods' ability to create a cohesive narrative. In this paper, we introduce StoryMaker, a personalization solution that preserves not only facial consistency but also clothing, hairstyles, and body consistency, thus facilitating the creation of a story through a series of images. StoryMaker incorporates conditions based on face identities and cropped character images, which include clothing, hairstyles, and bodies. Specifically, we integrate the facial identity information with the cropped character images using the Positional-aware Perceiver Resampler (PPR) to obtain distinct character features. To prevent intermingling of…
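As a rough illustration of how face identity tokens and cropped-character tokens might be fused into one set of character features, the sketch below runs a single Perceiver-style cross-attention step in NumPy. It is not the authors' implementation: the function `resample`, the token counts, the random weights, and especially the per-character embedding `char_pos` (one guess at what "positional-aware" could mean) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def resample(face_tokens, char_tokens, latents, pos_embed, w_q, w_k, w_v):
    """One cross-attention step of a Perceiver-style resampler (hypothetical sketch)."""
    # Context = face identity tokens followed by cropped-character tokens
    context = np.concatenate([face_tokens, char_tokens], axis=0)
    q = latents @ w_q            # (n_latents, d)
    k = context @ w_k            # (n_context, d)
    v = context @ w_v            # (n_context, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (n_latents, n_context)
    # Residual update, then tag the result with a per-character embedding
    # (ASSUMPTION: stands in for the "positional-aware" part)
    return latents + attn @ v + pos_embed, attn

rng = np.random.default_rng(0)
d, n_latents, n_face, n_char = 16, 4, 1, 8   # illustrative sizes only
latents = rng.normal(size=(n_latents, d))    # learned queries in a real model
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
char_pos = rng.normal(size=(2, d))           # one embedding per character

outputs = []
for i in range(2):                           # two characters in the scene
    face = rng.normal(size=(n_face, d))      # placeholder face-ID embedding
    char = rng.normal(size=(n_char, d))      # placeholder character-crop tokens
    feats, attn = resample(face, char, latents, char_pos[i], w_q, w_k, w_v)
    outputs.append(feats)

features = np.stack(outputs)
print(features.shape)  # (2, 4, 16)
```

Each character ends up with a fixed number of feature tokens regardless of how many context tokens went in, which is the usual reason to use a resampler before injecting conditions into a diffusion model's attention layers.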
Taxonomy
Topics: Digital Storytelling and Education · Digital Humanities and Scholarship · Topic Modeling
