ProCrop: Learning Aesthetic Image Cropping from Professional Compositions
Ke Zhang, Tianyu Ding, Jiachen Jiang, Tianyi Chen, Ilya Zharkov, Vishal M. Patel, Luming Liang

TL;DR
ProCrop is a retrieval-based image cropping method that learns from professional photographs and a large-scale weakly-annotated dataset, significantly improving cropping performance and aesthetic quality over existing approaches.
Contribution
The paper introduces ProCrop, a novel retrieval-based approach leveraging professional images and a new large-scale dataset for improved aesthetic image cropping.
Findings
ProCrop outperforms existing methods in both supervised and weakly-supervised settings.
Trained on the new dataset, ProCrop surpasses previous weakly-supervised methods.
In that weakly-supervised setting, it also matches the performance of fully supervised approaches.
Abstract
Image cropping is crucial for enhancing the visual appeal and narrative impact of photographs, yet existing rule-based and data-driven approaches often lack diversity or require annotated training data. We introduce ProCrop, a retrieval-based method that leverages professional photography to guide cropping decisions. By fusing features from professional photographs with those of the query image, ProCrop learns from professional compositions, significantly boosting performance. Additionally, we present a large-scale dataset of 242K weakly-annotated images, generated by out-painting professional images and iteratively refining diverse crop proposals. This composition-aware dataset generation offers diverse high-quality crop proposals guided by aesthetic principles and becomes the largest publicly available dataset for image cropping. Extensive experiments show that ProCrop significantly…
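The retrieve-then-fuse idea in the abstract can be illustrated with a toy sketch. The abstract does not say how retrieval or fusion is implemented, so everything below is an assumption made for illustration: cosine similarity as the retrieval metric, top-k selection, and fusion by concatenating the query feature with the mean of the retrieved features. The names `cosine`, `retrieve_top_k`, and `fuse` are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query, gallery, k):
    """Return indices of the k gallery features most similar to the query.

    Assumption: similarity is cosine; the paper may use a different metric.
    """
    ranked = sorted(
        range(len(gallery)),
        key=lambda i: cosine(query, gallery[i]),
        reverse=True,
    )
    return ranked[:k]


def fuse(query, retrieved):
    """Fuse by concatenating the query with the mean retrieved feature.

    Assumption: this is a stand-in for whatever learned fusion the paper uses.
    """
    dim = len(query)
    mean = [sum(v[d] for v in retrieved) / len(retrieved) for d in range(dim)]
    return list(query) + mean


# Toy 2-D features standing in for embeddings of professional photographs.
gallery = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
query = [1.0, 0.2]

idx = retrieve_top_k(query, gallery, k=2)
fused = fuse(query, [gallery[i] for i in idx])
```

In a real system the fused vector would feed a crop-prediction head; here it only shows how retrieved professional features could be combined with the query feature before any prediction is made.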
Taxonomy
Topics: Visual Attention and Saliency Detection · Aesthetic Perception and Analysis · Multimodal Machine Learning Applications
