Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings
Jaskirat Singh, Liang Zheng

TL;DR
This paper introduces a novel semantic guidance pipeline combined with deep reinforcement learning to generate high-quality, diverse, and human-like paintings from complex datasets without requiring stroke supervision.
Contribution
It presents a bi-level painting procedure, neural alignment for invariance, and a focus reward, enabling the generation of detailed paintings with varied foreground objects without supervised stroke data.
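To make the idea of invariance to the position and scale of the foreground object concrete, the following is a minimal sketch, not the authors' neural alignment model. It assumes a bounding box is already given (in the paper it would come from a learned localization step) and uses plain nearest-neighbour resampling in NumPy. The function name `align_to_box` and the parameter `out_size` are hypothetical and chosen only for illustration.

```python
import numpy as np


def align_to_box(image, box, out_size=32):
    """Resample the region inside `box` onto a fixed out_size x out_size grid.

    Hypothetical helper, for illustration only (not the paper's model).
    image: 2-D array of shape (H, W).
    box:   (x0, y0, x1, y1) in pixel coordinates, with x1 > x0 and y1 > y0.
    """
    x0, y0, x1, y1 = box
    # Normalised centres of out_size equally spaced cells in [0, 1).
    u = (np.arange(out_size) + 0.5) / out_size
    # Map the cell centres into the box, then truncate to pixel indices
    # (nearest-neighbour sampling), clipping to stay inside the image.
    xs = np.clip((x0 + u * (x1 - x0)).astype(int), 0, image.shape[1] - 1)
    ys = np.clip((y0 + u * (y1 - y0)).astype(int), 0, image.shape[0] - 1)
    # Gather the selected rows and columns as a dense out_size x out_size patch.
    return image[np.ix_(ys, xs)]
```

Under these assumptions, the same object drawn at two different positions and scales maps to the same fixed-size patch once each is aligned with its own box, which is the property a painting agent can exploit to see the foreground object in a canonical frame.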
Findings
Successfully handles variations in position, scale, and saliency.
Produces higher quality canvases for complex datasets.
Effective on datasets with multiple foreground objects.
Abstract
Generation of stroke-based non-photorealistic imagery is an important problem in the computer vision community. As an endeavor in this direction, substantial recent research efforts have focused on teaching machines "how to paint" in a manner similar to a human painter. However, the applicability of previous methods has been limited to datasets with little variation in position, scale and saliency of the foreground object. As a consequence, we find that these methods struggle to cover the granularity and diversity possessed by real-world images. To this end, we propose a Semantic Guidance pipeline with 1) a bi-level painting procedure for learning the distinction between foreground and background brush strokes at training time. 2) We also introduce invariance to the position and scale of the foreground object through a neural alignment model, which combines object localization…
Taxonomy
Topics: Visual Attention and Saliency Detection · Image Enhancement Techniques · Advanced Vision and Imaging
Methods: Spatial Transformer
