Semantic Understanding of Scenes through the ADE20K Dataset

Bolei Zhou; Hang Zhao; Xavier Puig; Tete Xiao; Sanja Fidler; Adela Barriuso; Antonio Torralba

arXiv:1608.05442 · cs.CV · October 17, 2018 · 189 cites

TL;DR

This paper introduces the ADE20K dataset for comprehensive scene parsing and proposes a Cascade Segmentation Module that improves segmentation performance across diverse scenes and objects.

Contribution

The paper provides a new, richly annotated dataset for scene parsing and introduces a novel cascade segmentation approach that enhances segmentation accuracy.

Findings

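01. Integrating the Cascade Segmentation Module into two existing semantic segmentation networks yields significant improvements in scene parsing.

02. ADE20K covers a wide range of scenes with dense, detailed annotations of objects, object parts, and in some cases parts of parts.

03. Models trained on ADE20K generalize well to various scene types.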

Abstract

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the community's efforts in data collection, there are still few image datasets covering a wide range of scenes and object categories with dense and detailed annotations for scene parsing. In this paper, we introduce and analyze the ADE20K dataset, spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. A generic network design called Cascade Segmentation Module is then proposed to enable the segmentation networks to parse a scene into stuff, objects, and object parts in a cascade. We evaluate the proposed module integrated within two existing semantic segmentation networks, yielding significant improvements for scene parsing. We further show that the scene parsing networks trained on ADE20K can be applied to…
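To make the cascade idea in the abstract concrete, here is a minimal sketch of how predictions from two stages might be merged per pixel. It is an illustration under stated assumptions, not the authors' implementation: the function names, the "object" placeholder label, and the toy scores are all hypothetical, and the object-part stage mentioned in the abstract is left out for brevity. The assumption is that a first stage scores each pixel over stuff classes plus one generic "object" class, and a second stage scores object classes, which are consulted only where the first stage predicts "object".

```python
# Hypothetical placeholder label emitted by the first (stuff) stage
# for any pixel that belongs to a discrete object.
OBJECT_PLACEHOLDER = "object"


def argmax_label(scores):
    """Return the label with the highest score in a {label: score} dict."""
    return max(scores, key=scores.get)


def cascade_parse(stuff_scores, object_scores):
    """Merge per-pixel predictions from a stuff stage and an object stage.

    stuff_scores:  one dict per pixel over stuff labels plus "object".
    object_scores: one dict per pixel over object labels.
    The object stage is consulted only where the stuff stage picks "object".
    """
    merged = []
    for stuff, obj in zip(stuff_scores, object_scores):
        label = argmax_label(stuff)
        if label == OBJECT_PLACEHOLDER:
            label = argmax_label(obj)
        merged.append(label)
    return merged


# Toy scores for three pixels (made-up numbers, for illustration only).
stuff = [
    {"sky": 0.90, "floor": 0.05, "object": 0.05},
    {"sky": 0.10, "floor": 0.20, "object": 0.70},
    {"sky": 0.20, "floor": 0.60, "object": 0.20},
]
obj = [
    {"chair": 0.6, "lamp": 0.4},
    {"chair": 0.3, "lamp": 0.7},
    {"chair": 0.5, "lamp": 0.5},
]

print(cascade_parse(stuff, obj))  # ['sky', 'lamp', 'floor']
```

In this sketch the second stage never overrides a stuff decision, so it only has to tell object classes apart. A part stage could be chained in the same way, refining only pixels already assigned to an object.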

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques