Consistent Story Generation: Unlocking the Potential of Zigzag Sampling
Mingxiao Li, Mang Ning, Marie-Francine Moens

TL;DR
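This paper introduces Zigzag Sampling with Asymmetric Prompts and Visual Sharing, a training-free sampling strategy that improves subject consistency in visual story generation by alternating between asymmetric prompts and transferring visual cues through a visual sharing module, outperforming previous techniques.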
Contribution
The paper presents a novel training-free sampling strategy combining asymmetric prompts and visual sharing to enhance subject consistency in visual storytelling.
Findings
Significantly better subject consistency in generated stories.
Outperforms previous methods in coherence and visual quality.
Effective without additional training or fine-tuning.
Abstract
Text-to-image generation models have made significant progress in producing high-quality images from textual descriptions, yet they continue to struggle with maintaining subject consistency across multiple images, a fundamental requirement for visual storytelling. Existing methods attempt to address this by either fine-tuning models on large-scale story visualization datasets, which is resource-intensive, or by using training-free techniques that share information across generations, which still yield limited success. In this paper, we introduce a novel training-free sampling strategy called Zigzag Sampling with Asymmetric Prompts and Visual Sharing to enhance subject consistency in visual story generation. Our approach proposes a zigzag sampling mechanism that alternates between asymmetric prompting to retain subject characteristics, while a visual sharing module transfers visual cues…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Data Visualization and Analytics
