TL;DR
This paper introduces a data augmentation method based on style transfer to improve neural network object detection in art images. It addresses the cross-depiction problem by generating stylized training data from existing photographs.
Contribution
The paper uses style transfer to generate training data, which the authors report significantly improves object detection accuracy in art images.
Findings
Improved detection accuracy on the People-Art dataset.
Significant performance gain over previous methods.
Demonstrated effectiveness of style transfer for dataset augmentation.
Abstract
Despite recent advances in object detection using deep neural networks, these networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross-depiction problem, and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects (specifically people) in art images. We generate a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer. This dataset is used to fine-tune a Faster R-CNN object detection network, which is then tested on the existing People-Art testing dataset. The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural…
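The abstract names AdaIN (adaptive instance normalization) as the style transfer step. As a rough illustration of the core operation only, the sketch below re-normalizes each channel of a content array so that its mean and standard deviation match those of a style array. In the usual AdaIN setup this is applied to encoder feature maps and followed by a decoder; here it is applied to random arrays standing in for features. The function name `adain`, the `(C, H, W)` layout, and the `eps` value are assumptions made for this example, not details from the paper.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Align per-channel mean/std of `content` to those of `style`.

    Both inputs are assumed to be arrays of shape (C, H, W).
    This is an illustrative sketch, not the paper's implementation.
    """
    # Per-channel statistics over the spatial dimensions.
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True) + eps

    # Normalize the content, then rescale and shift with the style statistics.
    normalized = (content - c_mean) / c_std
    return s_std * normalized + s_mean


# Hypothetical stand-ins for content and style feature maps.
rng = np.random.default_rng(0)
content_features = rng.normal(size=(3, 8, 8))
style_features = rng.normal(loc=2.0, scale=3.0, size=(3, 8, 8))

stylized = adain(content_features, style_features)
```

After the call, each channel of `stylized` has approximately the mean and standard deviation of the matching channel in `style_features`, while the spatial pattern of `content_features` is preserved. This preservation is why the bounding-box annotations of the source images can plausibly be reused for the stylized copies.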
Taxonomy
Methods: Softmax · RoIPool · Region Proposal Network · Convolution · Faster R-CNN
