Deep Cropping via Attention Box Prediction and Aesthetics Assessment
Wenguan Wang, Jianbing Shen

TL;DR
This paper introduces a deep learning approach to photo cropping that cascades attention box prediction and aesthetics assessment, drawing on rich attention and aesthetics data to produce high-quality crops efficiently.
Contribution
It proposes a neural network with two branches, one predicting an attention bounding box and one assessing aesthetics. The attention box seeds a set of cropping candidates, and the aesthetics branch selects the best one.
Findings
Achieves high-quality cropping results.
Operates at 5 frames per second.
Cropping candidates share full-image convolutional feature maps, which avoids repeated feature computation.
Abstract
We model the photo cropping problem as a cascade of attention box regression and aesthetic quality classification, based on deep learning. A neural network is designed that has two branches for predicting attention bounding box and analyzing aesthetics, respectively. The predicted attention box is treated as an initial crop window where a set of cropping candidates are generated around it, without missing important information. Then, aesthetics assessment is employed to select the final crop as the one with the best aesthetic quality. With our network, cropping candidates share features within full-image convolutional feature maps, thus avoiding repeated feature computation and leading to higher computation efficiency. Via leveraging rich data for attention prediction and aesthetics assessment, the proposed method produces high-quality cropping results, even with the limited…
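The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the margin-based candidate scheme, the function names `generate_candidates` and `select_crop`, and the pluggable `score_fn` are all assumptions made for illustration. In the paper, the attention box and the aesthetic scores come from the two branches of the network; here both are supplied by the caller.

```python
from typing import Callable, List, Sequence, Tuple

# A box is (x0, y0, x1, y1) in pixel coordinates.
Box = Tuple[int, int, int, int]


def generate_candidates(
    attention_box: Box,
    image_size: Tuple[int, int],
    margins: Sequence[int] = (0, 20, 40),
) -> List[Box]:
    """Grow the attention box outward by each pair of margins (hypothetical scheme).

    Every candidate contains the attention box, so the attended region
    is never cut off. Candidates are clipped to the image bounds.
    """
    width, height = image_size
    x0, y0, x1, y1 = attention_box
    candidates: List[Box] = []
    for mx in margins:
        for my in margins:
            box = (
                max(0, x0 - mx),
                max(0, y0 - my),
                min(width, x1 + mx),
                min(height, y1 + my),
            )
            if box not in candidates:  # clipping can produce duplicates
                candidates.append(box)
    return candidates


def select_crop(candidates: List[Box], score_fn: Callable[[Box], float]) -> Box:
    """Return the candidate with the highest score.

    score_fn stands in for the aesthetics branch; any callable that
    maps a box to a number works here.
    """
    return max(candidates, key=score_fn)


def box_area(box: Box) -> float:
    """Toy scorer used only for demonstration: prefers larger crops."""
    x0, y0, x1, y1 = box
    return float((x1 - x0) * (y1 - y0))


if __name__ == "__main__":
    attention = (100, 80, 300, 240)
    crops = generate_candidates(attention, image_size=(400, 300))
    print(select_crop(crops, box_area))
```

In a real system `score_fn` would evaluate each candidate from features pooled out of the shared full-image feature map, so the convolutional layers run once per image rather than once per candidate.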
Taxonomy
Topics: Visual Attention and Saliency Detection · Olfactory and Sensory Function Studies · Aesthetic Perception and Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
