OCNet: Object Context Network for Scene Parsing
Yuhui Yuan, Lang Huang, Jianyuan Guo, Chao Zhang, Xilin Chen, Jingdong Wang

TL;DR
This paper introduces OCNet, a novel scene parsing model that leverages object context via an efficient sparse self-attention mechanism, improving semantic segmentation accuracy across multiple benchmarks.
Contribution
The paper proposes an object context scheme that uses interlaced sparse self-attention to model pixel relations efficiently, improving scene parsing performance.
Findings
Achieves competitive results on five benchmarks.
Demonstrates the effectiveness of object context modeling.
Outperforms previous methods in accuracy.
Abstract
In this paper, we address the semantic segmentation task with a new context aggregation scheme named object context, which focuses on enhancing the role of object information. Motivated by the fact that the category of each pixel is inherited from the object it belongs to, we define the object context for each pixel as the set of pixels that belong to the same category as the given pixel in the image. We use a binary relation matrix to represent the relationship between all pixels, where the value one indicates the two selected pixels belong to the same category and zero otherwise. We propose to use a dense relation matrix to serve as a surrogate for the binary relation matrix. The dense relation matrix is capable of emphasizing the contribution of object information, as the relation scores tend to be larger on the object pixels than on the other pixels. Considering that the dense…
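The two relation matrices described in the abstract can be illustrated with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the dense matrix is computed here as a softmax over scaled dot-product similarities in the style of self-attention, which is one plausible reading since the abstract is cut off before it gives the exact formulation.

```python
import numpy as np


def binary_relation_matrix(labels):
    """Ground-truth relation: entry (i, j) is 1 if pixels i and j share a category.

    labels: (H, W) integer array of category ids.
    Returns an (N, N) float array with N = H * W.
    """
    flat = labels.reshape(-1)
    return (flat[:, None] == flat[None, :]).astype(np.float32)


def dense_relation_matrix(features):
    """Assumed surrogate: row-wise softmax over scaled dot-product similarities.

    features: (N, C) array holding one feature vector per pixel.
    Returns an (N, N) array whose rows each sum to 1.
    """
    features = np.asarray(features, dtype=np.float64)
    scores = features @ features.T / np.sqrt(features.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def aggregate_context(relation, features):
    """Context for each pixel: relation-weighted sum of all pixel features."""
    return relation @ np.asarray(features, dtype=np.float64)


# Toy 2x2 image: top row is category 0, bottom row is category 1.
labels = np.array([[0, 0], [1, 1]])
B = binary_relation_matrix(labels)

# Hypothetical per-pixel features that separate the two categories.
features = np.array([[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [0.0, 4.0]])
W = dense_relation_matrix(features)
context = aggregate_context(W, features)
```

In this toy case the dense scores are larger between pixels of the same category than between pixels of different categories, which is the behaviour the abstract attributes to the dense relation matrix. Both matrices have N × N entries for N pixels, so their cost grows quadratically with image size.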
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
Methods: Batch Normalization · Convolution · Average Pooling · Pyramid Pooling Module · Auxiliary Classifier · Dilated Convolution · PSPNet
