Retrieval Augmented Comic Image Generation

Yunhao Shui; Xuekuan Wang; Feng Qiu; Yuqiu Huang; Jinzhu Li; Haoyu Zheng; Jinru Han; Zhuo Zeng; Pengpeng Zhang; Jiarui Han; Keqiang Sun

arXiv:2506.12517 · cs.CV · June 17, 2025

TL;DR

RaCig is a system that generates comic-style image sequences with consistent characters and expressive gestures by combining retrieval-based character assignment with regional character injection.

Contribution

The paper introduces a retrieval-based character assignment module and a regional character injection mechanism for coherent and expressive comic image generation.
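To make the idea of retrieval-based character assignment concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes that each character mention in a prompt and each reference image has already been encoded as a vector, and it simply picks the reference with the highest cosine similarity. The function names, the toy embeddings, and the file names are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_characters(mention_embeddings, reference_embeddings):
    """Map each character mention to its most similar reference image.

    Both arguments are dicts of name -> embedding vector. How the
    embeddings are produced is an assumption left outside this sketch.
    """
    assignment = {}
    for mention, m_vec in mention_embeddings.items():
        best = max(
            reference_embeddings,
            key=lambda name: cosine(m_vec, reference_embeddings[name]),
        )
        assignment[mention] = best
    return assignment


# Toy, hand-written embeddings (hypothetical values for illustration only).
mentions = {"the knight": [0.9, 0.1, 0.0], "the wizard": [0.1, 0.8, 0.3]}
references = {"knight_ref.png": [1.0, 0.0, 0.0], "wizard_ref.png": [0.0, 1.0, 0.2]}
result = assign_characters(mentions, references)
```

In this toy setup each mention is matched independently, so two mentions could in principle map to the same reference; a real system would likely need to handle that case.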

Findings

01. Effective generation of comic narratives with consistent characters

02. Maintains character identity and costume across frames

03. Produces diverse, vivid character gestures

Abstract

We present RaCig, a novel system for generating comic-style image sequences with consistent characters and expressive gestures. RaCig addresses two key challenges: (1) maintaining character identity and costume consistency across frames, and (2) producing diverse and vivid character gestures. Our approach integrates a retrieval-based character assignment module, which aligns characters in textual prompts with reference images, and a regional character injection mechanism that embeds character features into specified image regions. Experimental results demonstrate that RaCig effectively generates engaging comic narratives with coherent characters and dynamic interactions. The source code will be publicly available to support further research in this area.
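The abstract does not spell out how character features are embedded into image regions, so the following Python sketch only illustrates the general idea of region-restricted injection. It assumes a feature map stored as nested lists, a rectangular region given as a hypothetical `(top, left, bottom, right)` box, and a simple linear blend controlled by a `strength` parameter. None of these choices are taken from the paper.

```python
def inject_region(feature_map, char_feature, box, strength=1.0):
    """Blend a character feature vector into one rectangular region.

    feature_map: H x W grid of per-cell feature vectors (nested lists).
    char_feature: vector with the same length as each cell's vector.
    box: (top, left, bottom, right), with bottom and right exclusive.
    strength: 0.0 keeps the original features, 1.0 replaces them.
    Returns a new grid; the input grid is not modified.
    """
    top, left, bottom, right = box
    out = [[list(cell) for cell in row] for row in feature_map]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = [
                (1 - strength) * f + strength * g
                for f, g in zip(out[r][c], char_feature)
            ]
    return out


# A 2 x 4 grid of two-channel zero features (toy values).
grid = [[[0.0, 0.0] for _ in range(4)] for _ in range(2)]
# Inject a hypothetical character feature into the left half only.
blended = inject_region(grid, [1.0, 2.0], box=(0, 0, 2, 2), strength=0.5)
```

Cells inside the box move toward the character feature while cells outside it are left untouched, which is the property that lets several characters occupy separate regions of one frame.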

Taxonomy

Topics: Human Motion and Animation · Artificial Intelligence in Games · Handwritten Text Recognition Techniques