Sketch-pix2seq: a Model to Generate Sketches of Multiple Categories

Yajing Chen; Shikui Tu; Yuqi Yi; Lei Xu

arXiv:1709.04121 · cs.CV · September 14, 2017 · 45 cites

TL;DR

This paper introduces sketch-pix2seq, a model that learns and generates sketches across multiple categories. It modifies sketch-rnn in two ways: a CNN encoder replaces the RNN encoder, and the KL-divergence term is removed.

Contribution

The paper proposes sketch-pix2seq, a new multi-category sketch generation model that improves upon sketch-rnn by using CNN encoders and removing KL-divergence, enabling better multi-category learning and generation.
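To make the effect of removing the KL-divergence term concrete, here is a minimal sketch of a VAE-style objective in which the KL term can be switched off. It assumes a diagonal Gaussian posterior and a standard normal prior, which is the usual VAE setup; the function names, the `kl_weight` parameter, and the scalar `recon_loss` placeholder are illustrative assumptions, not the authors' code. The actual reconstruction loss of sketch-rnn and sketch-pix2seq is not reproduced here.

```python
import numpy as np

def kl_divergence(mu, log_var):
    """KL(N(mu, exp(log_var)) || N(0, I)), summed over latent
    dimensions and averaged over the batch (standard closed form)."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    per_example = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)
    return float(np.mean(per_example))

def objective(recon_loss, mu, log_var, kl_weight=1.0):
    """Reconstruction loss plus a weighted KL term.
    kl_weight is a hypothetical knob: 0.0 drops the KL term entirely,
    which is the situation the summary describes as 'removing KL-divergence'."""
    return float(recon_loss) + kl_weight * kl_divergence(mu, log_var)

# Toy values (assumed, for illustration only): one example, two latent dims.
mu = [[0.5, -0.5]]
log_var = [[0.0, 0.0]]
recon = 2.0  # placeholder reconstruction loss

print(objective(recon, mu, log_var, kl_weight=1.0))  # 2.25 (with KL term)
print(objective(recon, mu, log_var, kl_weight=0.0))  # 2.0  (KL term removed)
```

With the KL term present, the encoder is penalized for moving latent codes away from the prior; with it removed, only reconstruction quality drives training, which leaves the encoder free to place different categories in different regions of the latent space.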

Findings

01. CNN encoders outperform RNN encoders in sketch generation.

02. Removing KL-divergence improves multi-category sketch learning.

03. Sketch-pix2seq shows promising creativity in sketch synthesis.

Abstract

Sketching is an important medium for humans to communicate ideas, and it reflects the superiority of human intelligence. Studies on sketches can be roughly divided into recognition and generation. Existing image recognition models have failed to achieve satisfactory performance on sketch classification. For sketch generation, however, a recent study proposed a sequence-to-sequence variational autoencoder (VAE) model called sketch-rnn, which is able to generate sketches based on human inputs. The model achieved impressive results when asked to learn one category of object, such as an animal or a vehicle. However, its performance dropped when multiple categories were fed into the model. Here, we propose a model called sketch-pix2seq, which can learn and draw multiple categories of sketches. Two modifications are made to improve the sketch-rnn model: one is to replace the bidirectional recurrent neural…

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection
