Sketch-a-Net that Beats Humans

Qian Yu; Yongxin Yang; Yi-Zhe Song; Tao Xiang; Timothy Hospedales

arXiv:1501.07873 · cs.CV · July 22, 2015 · 38 cites

TL;DR

This paper introduces a deep neural network designed specifically for sketch recognition. It exceeds human accuracy by combining a sketch-specific architecture, a multi-channel encoding of stroke order, and a multi-scale network ensemble.

Contribution

The authors develop a novel multi-scale, multi-channel deep network architecture tailored for sketches, surpassing human performance and existing photo-based models in sketch recognition.
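As a rough illustration of the multi-scale idea, the sketch below builds a small image pyramid from a binary sketch and fuses per-scale class scores. It is not the authors' implementation: the names `downsample`, `pyramid`, and `average_fusion`, the max-pooling choice, and the scale factors are all assumptions made for this example. The abstract describes joint Bayesian fusion; plain score averaging is swapped in here only because it is the simplest stand-in.

```python
from typing import List, Sequence

Image = List[List[int]]  # square binary image, 1 = ink (assumed representation)


def downsample(img: Image, factor: int) -> Image:
    """Max-pool a square binary image by `factor` so thin strokes are not lost."""
    size = len(img) // factor
    return [
        [
            max(
                img[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(size)
        ]
        for r in range(size)
    ]


def pyramid(img: Image, factors: Sequence[int] = (1, 2, 4)) -> List[Image]:
    """Return one image per scale factor (the factors are illustrative)."""
    return [downsample(img, f) for f in factors]


def average_fusion(per_scale_scores: List[List[float]]) -> List[float]:
    """Average class scores across scales.

    A simple stand-in for the joint Bayesian fusion named in the abstract.
    """
    n = len(per_scale_scores)
    return [sum(col) / n for col in zip(*per_scale_scores)]
```

In an ensemble of this kind, each image in the pyramid would be fed to its own network, and the fused scores would decide the predicted class.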

Findings

01. Achieves state-of-the-art accuracy on a large sketch dataset

02. Outperforms humans in sketch recognition tasks

03. Efficient training with CPU-based implementation

Abstract

We propose a multi-scale multi-channel deep neural network framework that, for the first time, yields sketch recognition performance surpassing that of humans. Our superior performance is a result of explicitly embedding the unique characteristics of sketches in our model: (i) a network architecture designed for sketch rather than natural photo statistics, (ii) a multi-channel generalisation that encodes sequential ordering in the sketching process, and (iii) a multi-scale network ensemble with joint Bayesian fusion that accounts for the different levels of abstraction exhibited in free-hand sketches. We show that state-of-the-art deep networks specifically engineered for photos of natural objects fail to perform well on sketch recognition, regardless of whether they are trained using photos or sketches. Our network, on the other hand, not only delivers the best performance on the largest human…
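Point (ii) of the abstract, encoding the order in which strokes were drawn as separate input channels, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: strokes are assumed to arrive in drawing order as lists of pixel coordinates, the split into three consecutive groups is an illustrative choice, and the names `split_by_order`, `render`, and `encode_channels` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]   # (row, col) pixel coordinate (assumed representation)
Stroke = List[Point]      # one pen stroke, already rasterised to pixels
Image = List[List[int]]   # binary image, 1 = ink


def split_by_order(strokes: List[Stroke], parts: int = 3) -> List[List[Stroke]]:
    """Split strokes, kept in drawing order, into `parts` consecutive groups."""
    n = len(strokes)
    bounds = [i * n // parts for i in range(parts + 1)]
    return [strokes[bounds[i]:bounds[i + 1]] for i in range(parts)]


def render(strokes: List[Stroke], size: int) -> Image:
    """Draw the given strokes onto a blank size x size binary image."""
    img = [[0] * size for _ in range(size)]
    for stroke in strokes:
        for r, c in stroke:
            if 0 <= r < size and 0 <= c < size:
                img[r][c] = 1
    return img


def encode_channels(strokes: List[Stroke], size: int, parts: int = 3) -> List[Image]:
    """Return one channel per stroke group: early, middle, and late strokes."""
    return [render(group, size) for group in split_by_order(strokes, parts)]
```

A network that receives these channels can, in principle, tell early outline strokes apart from later detail strokes, which a single flattened image cannot convey.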

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization