ProCrop: Learning Aesthetic Image Cropping from Professional Compositions

Ke Zhang; Tianyu Ding; Jiachen Jiang; Tianyi Chen; Ilya Zharkov; Vishal M. Patel; Luming Liang

arXiv:2505.22490 · cs.CV · May 29, 2025

TL;DR

ProCrop is a retrieval-based image cropping method that learns from professional photographs and a large-scale weakly-annotated dataset, significantly improving cropping performance and aesthetic quality over existing approaches.

Contribution

The paper introduces ProCrop, a novel retrieval-based approach leveraging professional images and a new large-scale dataset for improved aesthetic image cropping.

Findings

01. ProCrop outperforms existing methods in both supervised and weakly-supervised settings.

02. When trained on the new weakly-annotated dataset, ProCrop surpasses previous weakly-supervised methods.

03. Trained on that same dataset, ProCrop matches the performance of fully supervised approaches.

Abstract

Image cropping is crucial for enhancing the visual appeal and narrative impact of photographs, yet existing rule-based and data-driven approaches often lack diversity or require annotated training data. We introduce ProCrop, a retrieval-based method that leverages professional photography to guide cropping decisions. By fusing features from professional photographs with those of the query image, ProCrop learns from professional compositions, significantly boosting performance. Additionally, we present a large-scale dataset of 242K weakly-annotated images, generated by out-painting professional images and iteratively refining diverse crop proposals. This composition-aware dataset generation offers diverse high-quality crop proposals guided by aesthetic principles and becomes the largest publicly available dataset for image cropping. Extensive experiments show that ProCrop significantly…
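The retrieval-and-fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the feature extractor, the similarity measure, or the fusion operator. The sketch assumes cosine similarity for retrieval and a softmax-weighted average concatenated with the query feature for fusion, and the names `retrieve_top_k`, `fuse`, and `temperature` are hypothetical. Random vectors stand in for image features.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def retrieve_top_k(query_feat, gallery_feats, k=3):
    """Return indices and cosine similarities of the k gallery features
    closest to the query (assumed retrieval rule, not from the paper)."""
    q = l2_normalize(query_feat)
    g = l2_normalize(gallery_feats)
    sims = g @ q                      # one cosine similarity per gallery item
    top = np.argsort(-sims)[:k]       # highest similarity first
    return top, sims[top]


def fuse(query_feat, retrieved_feats, sims, temperature=0.1):
    """Concatenate the query feature with a similarity-weighted average of
    the retrieved features (assumed fusion operator, not from the paper)."""
    w = np.exp((sims - sims.max()) / temperature)  # stable softmax weights
    w = w / w.sum()
    context = (w[:, None] * retrieved_feats).sum(axis=0)
    return np.concatenate([query_feat, context])


# Stand-in data: 100 "professional photograph" features of dimension 16.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 16))
# A query that is a slightly perturbed copy of gallery item 7.
query = gallery[7] + 0.01 * rng.normal(size=16)

idx, sims = retrieve_top_k(query, gallery, k=3)
fused = fuse(query, gallery[idx], sims)
```

In a real pipeline the fused vector would be passed to a crop-prediction head; here it only shows how retrieved professional features could condition the query representation.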

Taxonomy

Topics: Visual Attention and Saliency Detection · Aesthetic Perception and Analysis · Multimodal Machine Learning Applications