Task Specific Attention is one more thing you need for object detection

Sang Yon Lee

arXiv:2202.09048·cs.CV·June 16, 2022·1 cites

TL;DR

This paper introduces a novel attention-based model called Task Specific Split Transformer (TSST) that achieves state-of-the-art object detection performance on COCO without relying on traditional hand-designed components like anchors and NMS.

Contribution

The paper proposes TSST, a new attention module that splits general-purpose attention into goal-specific parts, enabling simpler and more effective end-to-end object detection models.

Findings

01. TSST achieves state-of-the-art results on COCO.

02. The approach eliminates the need for anchors and NMS.

03. Extensive experiments validate the effectiveness of the method.

Abstract

Various models have been proposed to perform object detection. However, most require many hand-designed components, such as anchors and non-maximum suppression (NMS), to perform well. To mitigate these issues, the Transformer-based DETR and its variant, Deformable DETR, were proposed. These resolved much of the complexity in designing a head for object detection models; however, doubts remain about whether Transformer-based models can be considered state-of-the-art in object detection, because other models that depend on anchors and NMS have shown better results. Furthermore, it has been unclear whether an end-to-end pipeline could be built from attention modules alone, because the DETR-adapted Transformer method used a convolutional neural network (CNN) for the backbone body. In this study, we propose that combining several…

Code & Models

Repositories

navervision/tsst (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Adam · Dropout · Absolute Position Encodings · Convolution · Softmax