What Makes for End-to-End Object Detection?
Peize Sun, Yi Jiang, Enze Xie, Wenqi Shao, Zehuan Yuan, Changhu Wang, Ping Luo

TL;DR
This paper investigates the key factors enabling end-to-end object detection, emphasizing the importance of one-to-one positive sample assignment and the role of classification cost in reducing redundant predictions.
Contribution
It reveals that incorporating classification cost in matching is crucial for achieving true one-to-one predictions in end-to-end object detection.
Findings
Classification cost enlarges the score gap, reducing noisy positives.
Introducing classification cost leads to immediate one-to-one predictions.
End-to-end detection shows advantages in crowded scenes.
Abstract
Object detection has recently achieved a breakthrough by removing the last non-differentiable component in the pipeline, Non-Maximum Suppression (NMS), and building an end-to-end system. However, what makes its one-to-one prediction possible has not been well understood. In this paper, we first point out that one-to-one positive sample assignment is the key factor, while one-to-many assignment in previous detectors causes redundant predictions at inference. Second, we surprisingly find that even when trained with one-to-one assignment, previous detectors still produce redundant predictions. We identify classification cost in the matching cost as the main ingredient: (1) previous detectors only consider location cost; (2) by additionally introducing classification cost, previous detectors immediately produce one-to-one predictions during inference. We introduce the concept of score gap…
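The difference between a location-only matching cost and one that also includes a classification cost can be illustrated with a small sketch. This is not the authors' implementation: the L1 box distance, the `1 - score` classification term, the weights `w_cls` and `w_loc`, and all function names are assumptions chosen for illustration. Each ground-truth box simply takes the prediction with the lowest cost as its single positive sample; the sketch does not resolve the case where two ground truths pick the same prediction (a Hungarian solver would).

```python
def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def matching_cost(score, pred_box, gt_box, w_cls=1.0, w_loc=1.0, use_cls=True):
    """Hypothetical matching cost: location term plus optional classification term."""
    loc_cost = w_loc * l1_distance(pred_box, gt_box)
    # Assumed classification cost: low when the predicted score for the
    # ground-truth class is high.
    cls_cost = w_cls * (1.0 - score) if use_cls else 0.0
    return loc_cost + cls_cost


def assign_positives(pred_scores, pred_boxes, gt_boxes, gt_labels, use_cls=True):
    """For each ground truth, return the index of its single lowest-cost prediction."""
    assignment = []
    for gt_box, label in zip(gt_boxes, gt_labels):
        costs = [
            matching_cost(scores[label], box, gt_box, use_cls=use_cls)
            for scores, box in zip(pred_scores, pred_boxes)
        ]
        assignment.append(min(range(len(costs)), key=costs.__getitem__))
    return assignment


# Toy data: one ground-truth box of class 0 and two candidate predictions.
gt_boxes = [(0.50, 0.50, 0.20, 0.20)]
gt_labels = [0]
pred_boxes = [(0.50, 0.50, 0.20, 0.20),   # perfectly located
              (0.52, 0.50, 0.20, 0.20)]   # slightly offset
pred_scores = [[0.1],                      # but low class-0 confidence
               [0.9]]                      # high class-0 confidence

loc_only = assign_positives(pred_scores, pred_boxes, gt_boxes, gt_labels, use_cls=False)
with_cls = assign_positives(pred_scores, pred_boxes, gt_boxes, gt_labels, use_cls=True)
print("location cost only:", loc_only)
print("with classification cost:", with_cls)
```

In this toy case the location-only cost picks the low-confidence prediction as the positive, while adding the classification term picks the prediction the classifier already scores highly, so the training target and the confidence ranking agree.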
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
