DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen, Peize Sun, Yibing Song, Ping Luo

TL;DR
DiffusionDet frames object detection as a denoising diffusion process that progressively refines noisy boxes into object boxes, so the number of boxes and the number of refinement steps can be chosen at evaluation time.
Contribution
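The paper formulates object detection as a denoising diffusion process over bounding boxes: ground-truth boxes are noised during training, and the model learns to reverse that process. Because inference starts from randomly generated boxes, the number of boxes can vary and the refinement can be run iteratively.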
Findings
Gains 5.3 AP and 4.8 AP when evaluated with more boxes and iteration steps, under a zero-shot transfer setting from COCO to CrowdHuman.
Achieves favorable performance compared to previous well-established detectors on standard benchmarks.
Enables flexible, iterative object detection with a diffusion model.
Abstract
We propose DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to random distribution, and the model learns to reverse this noising process. In inference, the model refines a set of randomly generated boxes to the output results in a progressive way. Our work possesses an appealing property of flexibility, which enables the dynamic number of boxes and iterative evaluation. The extensive experiments on the standard benchmarks show that DiffusionDet achieves favorable performance compared to previous well-established detectors. For example, DiffusionDet achieves 5.3 AP and 4.8 AP gains when evaluated with more boxes and iteration steps, under a zero-shot transfer setting from COCO to CrowdHuman. Our code is available at…
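The training-time noising described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Gaussian forward process x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps, a cosine-style schedule, and boxes stored as normalized (cx, cy, w, h). The names alpha_bar and noise_boxes are hypothetical.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Cumulative signal rate in [0, 1]; the cosine schedule is an assumption."""
    s = 0.008

    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def noise_boxes(gt_boxes, t, num_steps, rng):
    """Sample noisy boxes at step t: sqrt(a) * x0 + sqrt(1 - a) * eps per coordinate."""
    a = alpha_bar(t, num_steps)
    return [
        [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in box]
        for box in gt_boxes
    ]


rng = random.Random(0)
# Two ground-truth boxes as (cx, cy, w, h), normalized to [0, 1] (assumed format).
gt = [[0.5, 0.5, 0.2, 0.3], [0.25, 0.7, 0.1, 0.1]]

unchanged = noise_boxes(gt, t=0, num_steps=1000, rng=rng)        # no noise at t = 0
slightly_noisy = noise_boxes(gt, t=1, num_steps=1000, rng=rng)   # close to ground truth
almost_random = noise_boxes(gt, t=999, num_steps=1000, rng=rng)  # nearly pure noise
```

At small t the boxes stay near the ground truth, and at large t they approach a random distribution. A model trained to undo this noising can then start from any number of random boxes at inference and refine them over one or more steps.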
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
Methods: Diffusion
