Category Feature Transformer for Semantic Segmentation

Quan Tang; Chuanjian Liu; Fagui Liu; Yifan Liu; Jun Jiang; Bowen Zhang; Kai Han; Yunhe Wang

arXiv:2308.05581·cs.CV·August 11, 2023·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces the Category Feature Transformer (CFT), an attention-based module for semantic segmentation that improves multi-stage feature aggregation by learning category embeddings from high-level features and broadcasting them to high-resolution features, reaching 55.1% mIoU on ADE20K.

Contribution

The paper proposes CFT, a new attention-based feature aggregation method that learns category embeddings and enhances semantic segmentation performance.

Findings

01. Achieves 55.1% mIoU on the ADE20K dataset.

02. Reduces model parameters and computations.

03. Outperforms previous methods on benchmarks.
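
The mIoU figure in finding 01 is the mean, over classes, of intersection-over-union between predicted and ground-truth label maps. The following is a minimal sketch of that metric on a toy flattened label list; the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration, not the paper's evaluation code. Benchmark toolkits usually accumulate intersections and unions over the whole dataset before averaging.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes (illustrative helper)."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.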

Abstract

Aggregation of multi-stage features has been revealed to play a significant role in semantic segmentation. Unlike previous methods employing point-wise summation or concatenation for feature aggregation, this study proposes the Category Feature Transformer (CFT) that explores the flow of category embedding and transformation among multi-stage features through the prevalent multi-head attention mechanism. CFT learns unified feature embeddings for individual semantic categories from high-level features during each aggregation process and dynamically broadcasts them to high-resolution features. Integrating the proposed CFT into a typical feature pyramid structure exhibits superior performance over a broad range of backbone networks. We conduct extensive experiments on popular semantic segmentation benchmarks. Specifically, the proposed CFT obtains a compelling 55.1% mIoU with greatly…
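
The two-step flow the abstract describes (category embeddings learned from high-level features, then broadcast to high-resolution features) can be sketched with two cross-attention passes. This is a rough NumPy illustration under stated assumptions, not the authors' implementation: the function names `multi_head_attention` and `cft_like_aggregation`, the random `category_queries`, the omission of learned projections and normalization, and the residual addition at the end are all hypothetical choices made to keep the example short.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(query, key, value, num_heads):
    """Scaled dot-product attention over channel groups (no learned projections)."""
    n_q, d = query.shape
    n_k = key.shape[0]
    assert d % num_heads == 0
    d_head = d // num_heads
    # Split channels into heads: (heads, tokens, d_head).
    q = query.reshape(n_q, num_heads, d_head).transpose(1, 0, 2)
    k = key.reshape(n_k, num_heads, d_head).transpose(1, 0, 2)
    v = value.reshape(n_k, num_heads, d_head).transpose(1, 0, 2)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head), axis=-1)
    out = weights @ v  # (heads, n_q, d_head)
    return out.transpose(1, 0, 2).reshape(n_q, d), weights


def cft_like_aggregation(high_level, high_res, category_queries, num_heads=4):
    # Step 1 (assumed): one query per category attends to the high-level
    # features, yielding one embedding per semantic category.
    category_emb, _ = multi_head_attention(
        category_queries, high_level, high_level, num_heads)
    # Step 2 (assumed): every high-resolution position attends to the category
    # embeddings, so category information is broadcast per pixel.
    broadcast, weights = multi_head_attention(
        high_res, category_emb, category_emb, num_heads)
    return high_res + broadcast, category_emb, weights


rng = np.random.default_rng(0)
num_categories, dim = 5, 16
high_level = rng.standard_normal((8 * 8, dim))    # flattened low-resolution map
high_res = rng.standard_normal((16 * 16, dim))    # flattened high-resolution map
category_queries = rng.standard_normal((num_categories, dim))

fused, category_emb, weights = cft_like_aggregation(
    high_level, high_res, category_queries)
print(fused.shape, category_emb.shape, weights.shape)
```

In this sketch the second attention pass has only as many keys as there are categories, so its cost scales with the number of categories rather than with the number of high-level positions.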

Code & Models

Repositories

BebDong/EMOSeg · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · COVID-19 diagnosis using AI

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Label Smoothing · Linear Layer · Adam · Residual Connection · Dense Connections · Dropout · Absolute Position Encodings