Learning Subject-Aware Cropping by Outpainting Professional Photos
James Hong, Lu Yuan, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian

TL;DR
This paper introduces GenCrop, a weakly-supervised method that learns subject-aware image cropping from professional photos by combining stock images and a diffusion model to generate training data without manual annotations.
Contribution
GenCrop is the first weakly-supervised approach to subject-aware cropping that leverages existing stock images and a diffusion model to generate training pairs without manual labels.
Findings
GenCrop performs competitively with state-of-the-art supervised methods.
It significantly outperforms other weakly-supervised baselines.
The method effectively learns subject-aware cropping without manual annotations.
Abstract
How to frame (or crop) a photo often depends on the image subject and its context; e.g., a human portrait. Recent works have defined the subject-aware image cropping task as a nuanced and practical version of image cropping. We propose a weakly-supervised approach (GenCrop) to learn what makes a high-quality, subject-aware crop from professional stock images. Unlike supervised prior work, GenCrop requires no new manual annotations beyond the existing stock image collection. The key challenge in learning from this data, however, is that the images are already cropped and we do not know what regions were removed. Our insight is to combine a library of stock images with a modern, pre-trained text-to-image diffusion model. The stock image collection provides diversity and its images serve as pseudo-labels for a good crop, while the text-to-image diffusion model is used to out-paint (i.e.,…
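The idea described in the abstract, that a professionally cropped stock image can act as the pseudo-label inside a larger outpainted image, can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the function name make_outpaint_inputs, the scale parameter, and the uniform random placement are assumptions introduced here, and the call to the diffusion model itself is deliberately left out. The sketch builds the enlarged canvas, the mask of pixels an outpainting model would need to synthesize, and the crop box that would serve as the training label.

```python
import numpy as np


def make_outpaint_inputs(image, scale=1.5, rng=None):
    """Place a stock image on a larger blank canvas (hypothetical helper).

    Returns (canvas, mask, crop_box):
      canvas   - enlarged array with the original image pasted in
      mask     - boolean array, True where pixels would need to be outpainted
      crop_box - (x0, y0, x1, y1) location of the original image, usable as
                 the pseudo-label for a good crop of the enlarged image

    Assumes scale >= 1. The scale value and the random placement are
    illustrative choices, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    H, W = int(round(h * scale)), int(round(w * scale))

    # Pick where the original (already cropped) photo sits in the canvas.
    top = int(rng.integers(0, H - h + 1))
    left = int(rng.integers(0, W - w + 1))

    canvas = np.zeros((H, W) + image.shape[2:], dtype=image.dtype)
    canvas[top:top + h, left:left + w] = image

    # True marks the unknown border region an outpainting model would fill.
    mask = np.ones((H, W), dtype=bool)
    mask[top:top + h, left:left + w] = False

    crop_box = (left, top, left + w, top + h)
    return canvas, mask, crop_box
```

In a full pipeline, the canvas and mask would be handed to a pre-trained outpainting model, and the resulting uncropped image paired with crop_box would form one training example for the cropping model.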
Taxonomy
Topics: Media, Gender, and Advertising
Methods: Lib · Diffusion
