COCO-Stuff: Thing and Stuff Classes in Context

Holger Caesar; Jasper Uijlings; Vittorio Ferrari

arXiv:1612.03716 · cs.CV · March 29, 2018 · 38 cites

TL;DR

COCO-Stuff augments the COCO dataset with pixel-wise annotations for 91 stuff classes, supporting the study of scene context, of spatial relations between stuff and things, and of segmentation performance.

Contribution

The paper introduces COCO-Stuff, a large-scale dataset with pixel-wise annotations for stuff classes, and presents an annotation protocol and analysis of scene context and segmentation.

Findings

01. Stuff classes cover significant image surface area.

02. Rich spatial relations between stuff and things are identified.

03. Segmentation performance varies between stuff and thing classes.
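The surface-area finding can be made concrete with a short sketch that measures what fraction of a label map belongs to stuff, to things, or to neither. The function name `surface_coverage`, the toy label map, and the class ids used here (1 for a thing, 100 for a stuff class, 0 for unlabeled) are hypothetical and chosen only for illustration; the real id layout depends on the annotation format you load.

```python
import numpy as np

def surface_coverage(label_map, stuff_ids, thing_ids):
    """Return the fraction of pixels labeled as stuff, thing, or neither.

    stuff_ids and thing_ids are supplied by the caller; nothing here
    assumes a particular dataset's id layout.
    """
    labels = np.asarray(label_map)
    total = labels.size
    stuff = int(np.isin(labels, list(stuff_ids)).sum())
    thing = int(np.isin(labels, list(thing_ids)).sum())
    return {
        "stuff": stuff / total,
        "thing": thing / total,
        "unlabeled": (total - stuff - thing) / total,
    }

# Toy 4x4 label map with hypothetical ids:
# 100 = a stuff class, 1 = a thing class, 0 = unlabeled.
toy = [
    [100, 100, 100, 100],
    [100, 100,   1,   1],
    [100, 100,   1,   1],
    [  0,   0,   0,   0],
]
cov = surface_coverage(toy, stuff_ids=[100], thing_ids=[1])
print(cov)
```

Averaging these fractions over a dataset's label maps would give a per-dataset coverage figure comparable in spirit to the finding above.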

Abstract

Semantic classes can be either things (objects with a well-defined shape, e.g. car, person) or stuff (amorphous background regions, e.g. grass, sky). While many classification and detection works focus on thing classes, less attention has been given to stuff classes. Nonetheless, stuff classes are important as they help explain important aspects of an image, including (1) scene type; (2) which thing classes are likely to be present and their location (through contextual reasoning); (3) physical attributes, material types and geometric properties of the scene. To understand stuff and things in context we introduce COCO-Stuff, which augments all 164K images of the COCO 2017 dataset with pixel-wise annotations for 91 stuff classes. We introduce an efficient stuff annotation protocol based on superpixels, which leverages the original thing annotations. We quantify the speed versus…
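To illustrate the idea of superpixel-based annotation that reuses existing thing masks, here is a minimal sketch. It is not the authors' protocol: the function `label_superpixel`, its parameters, and the toy arrays are hypothetical, and the superpixel map is hard-coded rather than computed by a segmentation algorithm. The only point shown is that one click can label a whole superpixel while pixels already covered by a thing annotation are left untouched.

```python
import numpy as np

def label_superpixel(stuff_labels, superpixels, thing_mask, clicked_sp, stuff_id):
    """Assign stuff_id to every pixel of the clicked superpixel that is
    not already covered by a thing annotation. Returns a new array."""
    region = (superpixels == clicked_sp) & ~thing_mask
    out = stuff_labels.copy()
    out[region] = stuff_id
    return out

# Hypothetical 4x4 image split into four 2x2 superpixels (ids 0..3).
superpixels = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
])

# Hypothetical thing mask: an object occupying the central 2x2 block.
thing_mask = np.zeros((4, 4), dtype=bool)
thing_mask[1:3, 1:3] = True

# Start with no stuff labels (0 = unlabeled in this sketch).
stuff_labels = np.zeros((4, 4), dtype=int)

# The annotator clicks superpixel 0 and picks hypothetical stuff id 7.
labeled = label_superpixel(stuff_labels, superpixels, thing_mask,
                           clicked_sp=0, stuff_id=7)
print(labeled)
```

Masking out thing pixels first means the annotator never has to trace the boundary between a stuff region and an already annotated object, which is one plausible reason such a protocol saves time.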

Code & Models

Datasets

shunk031/cocostuff
dataset · 64 dl

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings