TL;DR
This paper introduces a multitask deep CNN that predicts aesthetic scores and attributes of photos, achieving near-human accuracy and providing interpretability through visualization of attribute-related image regions.
Contribution
It presents a novel multitask deep learning model that jointly predicts aesthetic attributes and overall scores, enhancing interpretability and performance in photo aesthetic assessment.
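The joint prediction described above can be sketched as a set of output heads that all read the same shared feature vector, trained with a single combined loss. This is a minimal illustration and not the authors' implementation. The linear heads, the unweighted mean-squared-error loss, and the names `make_heads`, `predict`, and `multitask_loss` are assumptions made for this example; only the count of eight attributes plus one overall score comes from the abstract.

```python
import random

NUM_ATTRIBUTES = 8  # attribute count stated in the abstract


def make_heads(feature_dim, seed=0):
    """Create one hypothetical linear head per task: 8 attributes + 1 overall score."""
    rng = random.Random(seed)
    names = [f"attribute_{i}" for i in range(NUM_ATTRIBUTES)] + ["overall_score"]
    return {
        name: ([rng.uniform(-0.1, 0.1) for _ in range(feature_dim)], 0.0)
        for name in names
    }


def predict(features, heads):
    """Apply every task head to the same shared feature vector."""
    return {
        name: sum(f * w for f, w in zip(features, weights)) + bias
        for name, (weights, bias) in heads.items()
    }


def multitask_loss(predictions, targets):
    """Unweighted mean of per-task squared errors (an assumed loss, not the paper's)."""
    errors = [(predictions[name] - targets[name]) ** 2 for name in targets]
    return sum(errors) / len(errors)
```

In a real model the shared features would come from a convolutional backbone, so the gradient of the combined loss updates that backbone for all nine tasks at once; here a plain list of floats stands in for those features.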
Findings
Near-human performance in aesthetic score prediction
Visualization of important image regions for attributes
Analysis of attribute diversity and complexity
Abstract
Automatic photo aesthetic assessment is a challenging artificial intelligence task. Existing computational approaches have focused on modeling a single aesthetic score or a class (good or bad). However, these do not provide any details on why the photograph is good or bad, or which attributes contribute to the quality of the photograph. To obtain both accuracy and human interpretation of the score, we advocate learning the aesthetic attributes along with the prediction of the overall score. For this purpose, we propose a novel multitask deep convolutional neural network, which jointly learns eight aesthetic attributes along with the overall aesthetic score. We report near-human performance in the prediction of the overall aesthetic score. To understand the internal representation of these attributes in the learned model, we also develop a visualization technique using back propagation of…
Taxonomy
Methods: Convolution
