Exploring CNN-based models for image's aesthetic score prediction with using ensemble

Ying Dai

arXiv:2210.05119·cs.CV·December 6, 2024

Open Access

TL;DR

This paper presents an ensemble CNN framework for automatic image aesthetic assessment, enhancing prediction accuracy and analyzing attention regions to understand model focus, with experiments confirming its effectiveness.

Contribution

Introduces an ensemble CNN approach for image aesthetic scoring and analyzes attention regions to interpret model focus, improving prediction performance.

Findings

01

Ensemble models outperform individual CNN architectures in aesthetic score prediction.

02

Attention regions align with subject areas, indicating model focus on relevant image parts.

03

Models trained on the XiheAA dataset appear to capture latent photography principles, though this does not show that they learn an aesthetic sense.

Abstract

In this paper, we propose a framework for constructing two types of automatic image aesthetics assessment models with different CNN architectures and for improving the performance of image aesthetic score (AS) prediction through ensembling. Moreover, the regions of each image that the models attend to are extracted to analyze their consistency with the subjects in the images. The experimental results verify that the proposed method is effective in improving AS prediction. It is also found that the AS classification models trained on the XiheAA dataset seem to learn latent photography principles, although it cannot be said that they learn an aesthetic sense.
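
The abstract does not state how the individual models' outputs are combined, so the following is only a minimal sketch of one common ensembling rule, soft voting, in which the class probabilities of several classifiers are averaged. The function names, the three-class setup, the logits, and the per-class score values are all hypothetical and are not taken from the paper.

```python
from math import exp


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(per_model_logits):
    """Soft voting: average the class probabilities of several models."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_models for c in range(n_classes)]


def expected_score(probs, class_scores):
    """Probability-weighted mean of the score assigned to each class."""
    return sum(p * s for p, s in zip(probs, class_scores))


# Hypothetical logits from two CNNs over three aesthetic-score classes.
model_a = [0.2, 1.5, 0.3]
model_b = [0.1, 0.9, 1.2]

avg = ensemble_probs([model_a, model_b])
predicted_class = max(range(len(avg)), key=avg.__getitem__)
# Assumed representative score per class (illustrative values only).
score = expected_score(avg, [1.0, 2.0, 3.0])
```

Here the two models disagree on the most likely class, and averaging their probabilities yields a single prediction; other combination rules (majority voting, weighted averaging, stacking) would fit the same interface.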

Taxonomy

Topics: Visual Attention and Saliency Detection · Aesthetic Perception and Analysis · Image and Video Quality Assessment