CalliffusionV2: Personalized Natural Calligraphy Generation with Flexible Multi-modal Control
Qisheng Liao, Liang Li, Yulang Fei, Gus Xia

TL;DR
CalliffusionV2 is a versatile system for generating natural Chinese calligraphy with multi-modal control, enabling style customization, quick style learning, and support for non-Chinese characters, validated by both neural and human assessments.
Contribution
It introduces a multi-modal controlled calligraphy generation system that allows fine-grained style customization and rapid style adaptation with minimal data.
Findings
Produces stylistically accurate calligraphy recognized by classifiers and humans
Supports quick learning of new styles with few-shot training
Generates non-Chinese characters without prior training
Abstract
In this paper, we introduce CalliffusionV2, a novel system designed to produce natural Chinese calligraphy with flexible multi-modal control. Unlike previous approaches that rely solely on image or text inputs and lack fine-grained control, our system uses both images, which guide generation at a fine-grained level, and natural-language text, which describes the desired features of the output. CalliffusionV2 excels at creating a broad range of characters and can quickly learn new styles through a few-shot learning approach. It is also capable of generating non-Chinese characters without prior training. Comprehensive tests confirm that our system produces calligraphy that is both stylistically accurate and recognizable by neural network classifiers and human evaluators.
Taxonomy
Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
