ISDA: Position-Aware Instance Segmentation with Deformable Attention

Kaining Ying; Zhenhua Wang; Cong Bai; Pengfei Zhou

arXiv:2202.12251·cs.CV·February 25, 2022

TL;DR

ISDA is an end-to-end, NMS-free instance segmentation method built on deformable attention and position-aware kernels. It outperforms Mask R-CNN by 2.6 points on MS-COCO.

Contribution

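The paper presents a set-prediction-based instance segmentation method that is end-to-end trainable and requires no NMS. Object masks are produced by convolving learned position-aware kernels with object features, and both are learned by a deformable attention network with multi-scale representation.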
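To illustrate why set prediction can remove the need for NMS, here is a minimal sketch of one-to-one matching between predicted and ground-truth masks. It is an illustration only: the Dice-based cost, the brute-force search, and the names `dice` and `match_predictions` are assumptions made for this example and are not taken from the paper, whose actual matching cost is not given on this page. Real set-prediction models typically use the Hungarian algorithm instead of enumerating permutations.

```python
from itertools import permutations


def dice(a, b):
    """Dice overlap between two flat binary masks (lists of 0/1)."""
    inter = sum(x * y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2.0 * inter / total if total else 1.0


def match_predictions(pred_masks, gt_masks):
    """Find the one-to-one assignment with the lowest total (1 - Dice) cost.

    Hypothetical helper: brute force over permutations, suitable only for
    tiny toy inputs. Returns a list of (pred_index, gt_index) pairs.
    """
    n_pred, n_gt = len(pred_masks), len(gt_masks)
    assert n_pred >= n_gt, "need at least as many predictions as targets"
    best_cost, best = float("inf"), ()
    for perm in permutations(range(n_pred), n_gt):
        cost = sum(1.0 - dice(pred_masks[p], gt_masks[g])
                   for g, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return [(p, g) for g, p in enumerate(best)]


# Toy example: two ground-truth objects, three predictions on a 4-pixel image.
gt = [[1, 1, 0, 0], [0, 0, 1, 1]]
preds = [[0, 0, 1, 1], [1, 1, 0, 0], [1, 0, 0, 0]]
matches = match_predictions(preds, gt)
unmatched = sorted(set(range(len(preds))) - {p for p, _ in matches})
```

Because each ground-truth object is assigned exactly one prediction during training, the leftover prediction (index 2 here) would be supervised as "no object" rather than suppressed afterwards, which is the usual argument for why such models do not need NMS at inference time.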

Findings

01 Outperforms Mask R-CNN by 2.6 points on MS-COCO
02 NMS-free and end-to-end trainable architecture
03 Leverages deformable attention with multi-scale features

Abstract

Most instance segmentation models are not end-to-end trainable because they rely on either proposal estimation (RPN) as a pre-processing step or non-maximum suppression (NMS) as a post-processing step. Here we propose a novel end-to-end instance segmentation method termed ISDA. It recasts the task as predicting a set of object masks, which are generated by a conventional convolution operation that applies learned position-aware kernels to object features. These kernels and features are learned by a deformable attention network with multi-scale representation. Thanks to the set-prediction mechanism, the proposed method is NMS-free. Empirically, ISDA outperforms Mask R-CNN (a strong baseline) by 2.6 points on MS-COCO and achieves leading performance compared with recent models. Code will be available soon.
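The mask-generation step described in the abstract, convolving per-instance kernels with a feature map, can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the kernels are taken to be 1x1, position awareness is approximated by appending normalized x/y coordinate channels to the features, and the names `coord_channels` and `predict_masks` are hypothetical. In ISDA the kernels and features would come from the deformable attention network; here they are set by hand.

```python
import numpy as np


def coord_channels(h, w):
    """Normalized x and y coordinate maps in [-1, 1], shape (2, h, w)."""
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, h),
                         np.linspace(-1.0, 1.0, w), indexing="ij")
    return np.stack([xs, ys])


def predict_masks(features, kernels):
    """Apply one 1x1 kernel per instance to coordinate-augmented features.

    features: (C, H, W) feature map.
    kernels:  (N, C + 2) array, one kernel per predicted instance
              (assumed layout: C feature weights, then x and y weights).
    Returns (N, H, W) mask probabilities.
    """
    c, h, w = features.shape
    feats = np.concatenate([features, coord_channels(h, w)], axis=0)
    # A 1x1 convolution is a per-pixel dot product over channels.
    logits = np.einsum("nc,chw->nhw", kernels, feats)
    return 1.0 / (1.0 + np.exp(-logits))


# Toy example: zero features, so only the coordinate channels matter.
features = np.zeros((3, 4, 6))
kernels = np.array([
    [0.0, 0.0, 0.0, 5.0, 0.0],   # responds to the right half (x > 0)
    [0.0, 0.0, 0.0, 0.0, -5.0],  # responds to the top half (y < 0)
])
masks = predict_masks(features, kernels)
```

With these hand-picked kernels, the first mask is high on the right side of the image and the second is high near the top, which shows how a kernel with access to position information can separate instances that have identical appearance features.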

Code & Models

Repositories

yingkaining/isda (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Region Proposal Network · Softmax · Convolution · RoIAlign · Mask R-CNN