DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
Feng Li, Hao Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang

TL;DR
This paper introduces DN-DETR, a denoising training method that accelerates DETR training by reducing the difficulty of bipartite graph matching, leading to faster convergence and improved detection performance.
Contribution
The paper proposes a universal denoising training approach for DETR-like models that significantly speeds up training and enhances accuracy.
Findings
DN-DETR improves AP by 1.9 points over the baseline.
Achieves top performance among DETR-like methods with ResNet-50.
Reduces training epochs by 50% to reach comparable performance.
Abstract
In this paper we present a novel denoising training method to speed up DETR (DEtection TRansformer) training and offer a deeper understanding of the slow convergence of DETR-like methods. We show that the slow convergence results from the instability of bipartite graph matching, which causes inconsistent optimization goals in early training stages. To address this issue, in addition to the Hungarian loss, our method feeds noised ground-truth bounding boxes into the Transformer decoder and trains the model to reconstruct the original boxes, which effectively reduces the bipartite graph matching difficulty and leads to faster convergence. Our method is universal and can be easily plugged into any DETR-like method by adding dozens of lines of code to achieve a remarkable improvement. As a result, DN-DETR achieves a remarkable improvement (AP) under the same…
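The denoising idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The (cx, cy, w, h) box format in normalized coordinates, the function names `add_box_noise` and `l1_reconstruction_loss`, the `noise_scale` parameter, and the specific jitter scheme (shift the center, rescale width and height) are all assumptions made for illustration. The point it tries to show is that each noised box already knows which ground-truth box it came from, so the reconstruction loss can be computed by index without any bipartite matching.

```python
import random


def add_box_noise(boxes, noise_scale=0.4, rng=None):
    """Return jittered copies of (cx, cy, w, h) boxes in [0, 1] coordinates.

    Hypothetical noise scheme (an assumption, not taken from the source):
    - the center moves by at most noise_scale * (w / 2, h / 2);
    - width and height are scaled by a factor in
      [1 - noise_scale, 1 + noise_scale].
    """
    rng = rng or random.Random()
    noised = []
    for cx, cy, w, h in boxes:
        dx = rng.uniform(-1.0, 1.0) * noise_scale * w / 2
        dy = rng.uniform(-1.0, 1.0) * noise_scale * h / 2
        sw = 1.0 + rng.uniform(-1.0, 1.0) * noise_scale
        sh = 1.0 + rng.uniform(-1.0, 1.0) * noise_scale
        box = (cx + dx, cy + dy, w * sw, h * sh)
        # Keep every coordinate inside the normalized image range.
        noised.append(tuple(min(max(v, 0.0), 1.0) for v in box))
    return noised


def l1_reconstruction_loss(pred_boxes, gt_boxes):
    """Mean per-box L1 distance between predictions and their targets.

    pred_boxes[i] is compared with gt_boxes[i] directly: the pairing is
    fixed by construction, so no Hungarian matching step is involved.
    """
    total = sum(
        abs(p - t)
        for pred, gt in zip(pred_boxes, gt_boxes)
        for p, t in zip(pred, gt)
    )
    return total / max(len(gt_boxes), 1)


if __name__ == "__main__":
    gt = [(0.5, 0.5, 0.2, 0.4), (0.3, 0.6, 0.1, 0.1)]
    noised = add_box_noise(gt, noise_scale=0.4, rng=random.Random(0))
    # In a real model the decoder would take `noised` as extra queries and
    # predict boxes; here the noised boxes stand in for an untrained output.
    print("noised queries:", noised)
    print("reconstruction loss:", l1_reconstruction_loss(noised, gt))
```

In an actual DETR-like model, a loss of this kind would be added to the usual Hungarian loss during training only; how the denoising queries are kept separate from the regular matching queries is outside the scope of this sketch.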
Taxonomy
Topics: Network Packet Processing and Optimization · Advanced Neural Network Applications · Advanced Graph Neural Networks
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Feedforward Network · Dense Connections · Softmax · Absolute Position Encodings · Convolution · Byte Pair Encoding · Position-Wise Feed-Forward Layer
