Illustrious: an Open Advanced Illustration Model
Sang Hyun Park, Jun Young Koh, Junha Lee, Joy Song, Dongha Kim, Hoyeon Moon, Hyunju Lee, Min Song

TL;DR
Illustrious is an open text-to-image model for anime illustration. It targets high resolution, a dynamic color range, and accurate character depiction through batch size and dropout control, higher training resolution, and multi-level captioning.
Contribution
The paper describes three approaches to improving anime image generation: batch size and dropout control for faster learning of token-based concepts, a higher training resolution, and refined multi-level captions that combine tags with natural language descriptions.
Findings
Achieves state-of-the-art anime image quality
Handles images over 20MP with high detail
Outperforms existing illustration models
Abstract
In this work, we share the insights for achieving state-of-the-art quality in our text-to-image anime image generative model, called Illustrious. To achieve high resolution, dynamic color range images, and high restoration ability, we focus on three critical approaches for model improvement. First, we delve into the significance of the batch size and dropout control, which enables faster learning of controllable token based concept activations. Second, we increase the training resolution of images, affecting the accurate depiction of character anatomy in much higher resolution, extending its generation capability over 20MP with proper methods. Finally, we propose the refined multi-level captions, covering all tags and various natural language captions as a critical factor for model development. Through extensive analysis and experiments, Illustrious demonstrates state-of-the-art…
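To give a sense of scale for the "over 20MP" figure, the sketch below computes image dimensions that reach a given megapixel count at a chosen aspect ratio. It is only an illustration of the arithmetic, not the authors' method: the function name `dimensions_for_megapixels` is hypothetical, and rounding each side up to a multiple of 64 is an assumption made for the example, not something stated in the abstract.

```python
import math


def dimensions_for_megapixels(megapixels, aspect_w=1, aspect_h=1, multiple=64):
    """Return (width, height) holding at least `megapixels` million pixels.

    Both sides are rounded UP to a multiple of `multiple` (an assumed
    constraint for this sketch), so the result never falls below the target.
    """
    target_pixels = megapixels * 1_000_000  # 1 MP = 10^6 pixels
    # Scale factor s such that (aspect_w * s) * (aspect_h * s) == target_pixels.
    scale = math.sqrt(target_pixels / (aspect_w * aspect_h))
    width = math.ceil(aspect_w * scale / multiple) * multiple
    height = math.ceil(aspect_h * scale / multiple) * multiple
    return width, height


if __name__ == "__main__":
    for ratio in [(1, 1), (16, 9)]:
        w, h = dimensions_for_megapixels(20, *ratio)
        print(f"{ratio[0]}:{ratio[1]} -> {w} x {h} = {w * h / 1e6:.2f} MP")
```

For a square image, 20MP corresponds to roughly 4,472 pixels per side before rounding, which is far larger than the resolutions most diffusion models are trained at.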
Taxonomy
Topics: Digital Humanities and Scholarship · Digital Games and Media
Methods: Dropout · Focus
