An Experience-based Direct Generation approach to Automatic Image Cropping
Casper Christensen, Aneesh Vartakavi

TL;DR
This paper introduces a CNN-based method for automatic image cropping that directly predicts crop bounding boxes without explicitly modeling aesthetics or saliency, achieving competitive results and faster inference.
Contribution
A novel one-stage CNN approach for image cropping that bypasses explicit aesthetic or saliency modeling, trained on a large editor-cropped dataset, and capable of handling multiple aspect ratios.
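To make "directly predicts crop bounding boxes" for fixed aspect ratios concrete, the following is a minimal sketch of how a network output could be decoded into a pixel crop. The parameterization (a normalized center `cx`, `cy` plus a `scale` fraction per aspect ratio) and the name `decode_crop` are assumptions made for illustration; the summary does not specify how the paper's model encodes its boxes.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def decode_crop(cx: float, cy: float, scale: float,
                aspect: float, img_w: int, img_h: int) -> Box:
    """Turn a hypothetical normalized prediction into a pixel crop box.

    cx, cy: predicted crop center in [0, 1] (assumed parameterization)
    scale:  fraction in (0, 1] of the largest crop of this aspect ratio
            that fits inside the image (assumed parameterization)
    aspect: target width / height, e.g. 16 / 9
    """
    # Largest box with the requested aspect ratio that fits in the image.
    max_w = min(img_w, img_h * aspect)
    max_h = max_w / aspect
    w, h = max_w * scale, max_h * scale

    # Clamp the center so the box never leaves the image.
    x = min(max(cx * img_w, w / 2), img_w - w / 2)
    y = min(max(cy * img_h, h / 2), img_h - h / 2)

    return (round(x - w / 2), round(y - h / 2),
            round(x + w / 2), round(y + h / 2))
```

Because the aspect ratio is fixed by construction, a model under this assumed scheme would only need to regress three numbers per ratio, and boxes for several ratios could be produced in one forward pass by calling the decoder once per ratio.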
Findings
Performs better than or comparably to existing methods on public datasets.
Offers faster inference and easier training than multi-stage approaches.
Generalizes well to unseen datasets and preserves image composition.
Abstract
Automatic Image Cropping is a challenging task with many practical downstream applications. The task is often divided into sub-problems: generating cropping candidates, finding the visually important regions, and determining aesthetics to select the most appealing candidate. Prior approaches model one or more of these sub-problems separately, and often combine them sequentially. We propose a novel convolutional neural network (CNN) based method to crop images directly, without explicitly modeling image aesthetics, evaluating multiple crop candidates, or detecting visually salient regions. Our model is trained on a large dataset of images cropped by experienced editors and can simultaneously predict bounding boxes for multiple fixed aspect ratios. We consider the aspect ratio of the cropped image to be a critical factor that influences aesthetics. Prior approaches for automatic image…
