Framework-agnostic Semantically-aware Global Reasoning for Segmentation

Mir Rayat Imtiaz Hossain; Leonid Sigal; James J. Little

arXiv:2212.03338 · cs.CV · April 19, 2024 · 1 citation

TL;DR

This paper introduces a flexible, scene-semantic global reasoning module that enhances segmentation by learning to project features into interpretable latent regions and reasoning over them with transformers, improving performance across various models.

Contribution

It proposes a novel semantic global reasoning component that can be integrated into different segmentation architectures, enabling scene-aware reasoning and improved results.

Findings

1. Improved segmentation accuracy across multiple datasets.
2. Latent tokens are semantically interpretable and diverse.
3. Enhanced downstream task performance, such as object detection.

Abstract

Recent advances in pixel-level tasks (e.g., segmentation) illustrate the benefit of long-range interactions between aggregated region-based representations that can enhance local features. However, such aggregated representations, often in the form of attention, fail to model the underlying semantics of the scene (e.g., individual objects and, by extension, their interactions). In this work, we address the issue by proposing a component that learns to project image features into latent representations and reason between them using a transformer encoder to generate contextualized and scene-consistent representations, which are fused with the original image features. Our design encourages the latent regions to represent semantic concepts by ensuring that the activated regions are spatially disjoint and the union of such regions corresponds to a connected object segment. The proposed semantic…
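
The project, reason, and fuse pipeline described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `global_reasoning`, the weight matrices, and all shapes are hypothetical; a single self-attention step stands in for the transformer encoder; and a softmax over latent regions is used here only as a simple proxy for spatially disjoint regions (the paper's actual training objectives are not reproduced).

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_reasoning(features, w_assign, w_q, w_k, w_v):
    """Hypothetical sketch. features: (N, C) flattened pixel features;
    w_assign: (C, K) projection to K latent regions; w_q/w_k/w_v: (C, C)."""
    # 1. Project: softly assign every pixel to one of K latent regions.
    #    Softmax over regions is an assumption made for this sketch.
    assign = softmax(features @ w_assign, axis=1)            # (N, K)
    # 2. Aggregate: weighted average of pixel features per region.
    weights = assign / (assign.sum(axis=0, keepdims=True) + 1e-8)
    tokens = weights.T @ features                            # (K, C)
    # 3. Reason: one single-head self-attention step over the tokens
    #    (a stand-in for a full transformer encoder).
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)    # (K, K)
    tokens = tokens + attn @ v                               # residual update
    # 4. Fuse: re-project tokens to pixels and add to the original features.
    fused = features + assign @ tokens                       # (N, C)
    return fused, assign, tokens

# Toy usage with random inputs: 64 pixels, 8 channels, 4 latent regions.
rng = np.random.default_rng(0)
N, C, K = 64, 8, 4
features = rng.normal(size=(N, C))
fused, assign, tokens = global_reasoning(
    features,
    rng.normal(size=(C, K)),
    rng.normal(size=(C, C)),
    rng.normal(size=(C, C)),
    rng.normal(size=(C, C)),
)
print(fused.shape, assign.shape, tokens.shape)
```

Because the component takes features in and returns features of the same shape, a block like this could in principle be dropped between a backbone and a segmentation head, which is the sense in which the abstract's design is framework-agnostic.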

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Spatial Pyramid Pooling · Batch Normalization · Dilated Convolution · Atrous Spatial Pyramid Pooling · 1x1 Convolution · DeepLabv3