Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

Songhua Liu; Tianwei Lin; Dongliang He; Fu Li; Ruifeng Deng; Xin Li; Errui Ding; Hao Wang

arXiv:2108.03798 · cs.CV · August 12, 2021 · 1 citation

TL;DR

The paper introduces Paint Transformer, a novel neural network that predicts stroke sets for image painting in parallel, enabling near real-time non-photo-realistic image recreation without requiring pre-existing datasets.

Contribution

It formulates neural painting as a set prediction problem and proposes a Transformer-based feed forward model trained via self-supervision, improving efficiency and generalization.
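
To make the set prediction framing concrete, here is a minimal sketch of an order-invariant comparison between a predicted stroke set and a target stroke set. It is an illustration under assumptions, not the paper's loss: the stroke vectors, the L1 distance, and the function names `stroke_distance` and `match_stroke_sets` are all hypothetical, and the brute-force search over permutations is only practical for very small sets (a polynomial-time assignment solver such as the Hungarian algorithm is the usual choice for larger ones).

```python
from itertools import permutations


def stroke_distance(a, b):
    """L1 distance between two stroke parameter vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_stroke_sets(predicted, target):
    """Find the one-to-one matching with the lowest total distance.

    Returns (best_cost, assignment), where assignment[i] is the index of the
    target stroke matched to predicted[i]. Because every ordering of the
    target is considered, the cost does not depend on the order in which
    strokes are listed. Brute force: only suitable for tiny sets.
    """
    if len(predicted) != len(target):
        raise ValueError("this sketch assumes sets of equal size")
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(target))):
        cost = sum(stroke_distance(predicted[i], target[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(best_perm)


# Hypothetical two-parameter strokes (for example, a normalized centre x, y).
predicted = [(0.1, 0.1), (0.9, 0.9)]
target = [(0.9, 0.8), (0.1, 0.2)]
cost, assignment = match_stroke_sets(predicted, target)
```

In this example the first predicted stroke is paired with the second target stroke and vice versa, so listing the target strokes in a different order leaves the cost unchanged. That order invariance is the property that distinguishes set prediction from step-by-step sequence prediction.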

Findings

01. Achieves better painting quality than previous methods.

02. Operates in near real-time for 512x512 images.

03. Does not require external datasets for training.
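
The third finding can be illustrated with a sketch of how training pairs might be synthesized from random strokes alone. This page does not describe the actual procedure, so everything here is an assumption made for illustration: the five-value stroke format, the axis-aligned rectangular renderer, and the names `sample_strokes`, `render`, and `make_training_sample` are hypothetical. The idea shown is only that a canvas, a target image, and the ground-truth strokes separating them can all be generated without any external image dataset.

```python
import random


def sample_strokes(n, rng):
    """Sample n random strokes as (cx, cy, w, h, gray), each value in [0, 1)."""
    return [tuple(rng.random() for _ in range(5)) for _ in range(n)]


def render(strokes, size=16, canvas=None):
    """Paint axis-aligned rectangles, in order, onto a size x size gray grid.

    If a canvas is given, it is copied first so the input is left unchanged.
    """
    if canvas is None:
        grid = [[0.0] * size for _ in range(size)]
    else:
        grid = [row[:] for row in canvas]
    for cx, cy, w, h, gray in strokes:
        x0 = max(0, int((cx - w / 2) * size))
        x1 = min(size, int((cx + w / 2) * size) + 1)
        y0 = max(0, int((cy - h / 2) * size))
        y1 = min(size, int((cy + h / 2) * size) + 1)
        for y in range(y0, y1):
            for x in range(x0, x1):
                grid[y][x] = gray
    return grid


def make_training_sample(n_background, n_foreground, seed=0, size=16):
    """Build (canvas, target, foreground_strokes) from random strokes only.

    The canvas holds the background strokes; the target is the canvas with
    the foreground strokes painted on top. A model could then be trained to
    predict the foreground strokes from the (canvas, target) pair.
    """
    rng = random.Random(seed)
    background = sample_strokes(n_background, rng)
    foreground = sample_strokes(n_foreground, rng)
    canvas = render(background, size)
    target = render(foreground, size, canvas=canvas)
    return canvas, target, foreground


canvas, target, foreground = make_training_sample(3, 2, seed=1)
```

Because the foreground strokes are known by construction, they serve as labels for free, which is one way a painter could be trained without collecting paintings or photographs.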

Abstract

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks. While reinforcement learning (RL) based agents can generate a stroke sequence step by step for this task, it is not easy to train a stable RL agent. On the other hand, stroke optimization methods search for a set of stroke parameters iteratively in a large search space; such low efficiency significantly limits their prevalence and practicality. Different from previous methods, in this paper, we formulate the task as a set prediction problem and propose a novel Transformer-based framework, dubbed Paint Transformer, to predict the parameters of a stroke set with a feed forward network. This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 × 512 in near real time. More importantly, since…

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Dropout · Label Smoothing · Residual Connection · Byte Pair Encoding