Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion
Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu

TL;DR
This paper introduces MGAD, a diffusion-based digital art synthesis model that uses multimodal prompts and CLIP to enhance diversity and expressiveness in generated artworks, validated by extensive experiments.
Contribution
The paper presents a novel multimodal guided diffusion model for digital art synthesis, integrating CLIP for unified text-image guidance, improving diversity and quality of generated art.
Findings
Enhanced diversity in generated artworks.
Improved quality of digital paintings.
Effective multimodal guidance demonstrated.
Abstract
Digital art synthesis is receiving increasing attention in the multimedia community because it engages the public with art effectively. Current digital art synthesis methods usually use single-modality inputs as guidance, which limits the expressiveness of the model and the diversity of the generated results. To solve this problem, we propose the multimodal guided artwork diffusion (MGAD) model, a diffusion-based digital artwork generation approach that uses multimodal prompts as guidance to control the classifier-free diffusion model. Additionally, the contrastive language-image pretraining (CLIP) model is used to unify the text and image modalities. Extensive experimental results on the quality and quantity of the generated digital art paintings confirm the effectiveness of combining the diffusion model with multimodal guidance. Code is available at…
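The abstract describes guiding generation with both a text prompt and an image prompt embedded in a shared CLIP space. The sketch below is only an illustration of that idea, not the authors' implementation: it assumes the embeddings are already available as plain vectors (random stand-ins here, no real CLIP model is loaded), and the function names, the cosine-distance choice, and the weights w_text and w_image are all hypothetical.

```python
import numpy as np


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 for aligned vectors, up to 2 for opposite ones."""
    return 1.0 - float(np.dot(normalize(a), normalize(b)))


def multimodal_guidance_loss(gen_emb, text_emb, image_emb, w_text=0.5, w_image=0.5):
    """Weighted sum of distances from the generated image's embedding to each prompt.

    Assumption: all three embeddings live in one shared space (the role CLIP
    plays in the abstract). The weights are hypothetical knobs, not values
    taken from the paper.
    """
    text_term = cosine_distance(gen_emb, text_emb)
    image_term = cosine_distance(gen_emb, image_emb)
    return w_text * text_term + w_image * image_term


if __name__ == "__main__":
    # Random stand-ins for CLIP embeddings; a real pipeline would encode
    # the text prompt, the image prompt, and the current sample instead.
    rng = np.random.default_rng(0)
    text_emb = rng.normal(size=512)
    image_emb = rng.normal(size=512)
    gen_emb = rng.normal(size=512)
    print(multimodal_guidance_loss(gen_emb, text_emb, image_emb))
```

In a guided diffusion sampler, the gradient of a loss like this with respect to the current sample would steer each denoising step toward both prompts; that step needs an automatic-differentiation framework and is omitted here.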
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Aesthetic Perception and Analysis
Methods: Diffusion
