Towards Data-Efficient Detection Transformers

Wen Wang; Jing Zhang; Yang Cao; Yongliang Shen; Dacheng Tao

arXiv:2203.09507 · cs.CV · August 26, 2022 · 1 citation

TL;DR

This paper shows that detection transformers are data-inefficient on small datasets and improves their performance with two simple changes: sparse feature sampling from local image areas in the cross-attention layer, and label augmentation.

Contribution

It introduces a minimal modification to detection transformers' cross-attention mechanism and a label augmentation method to enhance data efficiency on small datasets.

Findings

01. Improved detection transformer performance on small datasets.

02. Effective simple modifications applicable to various models.

03. Enhanced data efficiency with minimal changes.

Abstract

Detection transformers have achieved competitive performance on the sample-rich COCO dataset. However, we show that most of them suffer significant performance drops on small datasets such as Cityscapes. In other words, detection transformers are generally data-hungry. To tackle this problem, we empirically analyze the factors that affect data efficiency through a step-by-step transition from a data-efficient RCNN variant to the representative DETR. The empirical results suggest that sparse feature sampling from local image areas holds the key. Based on this observation, we alleviate the data-hungry issue of existing detection transformers by simply altering how key and value sequences are constructed in the cross-attention layer, with minimal modifications to the original models. Besides, we introduce a simple yet effective label augmentation method to provide richer…
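To make the idea of building keys and values from a local image area more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the box format, the `sample_local_keys` helper, the nearest-neighbour lookup, and the toy feature map are all assumptions chosen for illustration. It contrasts cross-attention over a few samples inside a box with cross-attention over every location of the feature map.

```python
import math


def sample_local_keys(feature_map, box, grid=2):
    """Pick grid*grid feature vectors at evenly spaced points inside box.

    feature_map: H x W list of C-dim feature vectors (lists of floats).
    box: (x0, y0, x1, y1) in feature-map coordinates (hypothetical format).
    """
    height, width = len(feature_map), len(feature_map[0])
    x0, y0, x1, y1 = box
    samples = []
    for i in range(grid):
        for j in range(grid):
            # Centre of cell (i, j) in a grid x grid partition of the box.
            y = y0 + (i + 0.5) * (y1 - y0) / grid
            x = x0 + (j + 0.5) * (x1 - x0) / grid
            # Nearest-neighbour lookup, clamped to the map (a simplification).
            row = min(max(int(y), 0), height - 1)
            col = min(max(int(x), 0), width - 1)
            samples.append(feature_map[row][col])
    return samples


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Toy 4 x 4 feature map with 2-dim features (assumed data).
feature_map = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
query = [1.0, 0.0]
box = (0.0, 0.0, 2.0, 2.0)

# Sparse, local keys/values: only a few samples inside the box.
local = sample_local_keys(feature_map, box, grid=2)
# Dense keys/values: every location of the feature map.
dense = [vec for row in feature_map for vec in row]

local_out = cross_attention(query, local, local)
dense_out = cross_attention(query, dense, dense)
print(len(local), len(dense))
```

In this toy setting the query attends to 4 local samples instead of all 16 locations, which is the kind of sparsity and locality the abstract points to. How the boxes are obtained and how features are actually sampled in the paper is not specified in this excerpt.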

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Remote-Sensing Image Classification