Enhancing Image Aesthetics with Dual-Conditioned Diffusion Models Guided by Multimodal Perception
Xinyu Nan, Ning Wang, Yuyao Zhai, Mei Yang

TL;DR
This paper introduces DIAE, a diffusion-based model that enhances image aesthetics by integrating multimodal perception and weak supervision, effectively addressing challenges in aesthetic editing and data scarcity.
Contribution
The paper proposes a novel dual-supervised diffusion model with multimodal aesthetic perception, together with a new dataset, to improve the aesthetic enhancement of images.
Findings
DIAE outperforms baseline models in aesthetic score metrics.
The model maintains high content consistency in edited images.
Experimental results validate the effectiveness of multimodal guidance.
Abstract
Image aesthetic enhancement aims to perceive aesthetic deficiencies in images and perform corresponding editing operations, which is highly challenging and requires the model to possess creativity and aesthetic perception capabilities. Although recent advancements in image editing models have significantly enhanced their controllability and flexibility, they struggle with enhancing image aesthetics. The primary challenges are twofold: first, following editing instructions with aesthetic perception is difficult, and second, there is a scarcity of "perfectly-paired" images that have consistent content but distinct aesthetic qualities. In this paper, we propose Dual-supervised Image Aesthetic Enhancement (DIAE), a diffusion-based generative model with multimodal aesthetic perception. First, DIAE incorporates Multimodal Aesthetic Perception (MAP) to convert the ambiguous aesthetic…
Taxonomy
Topics: Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis
