YOLO-IOD: Towards Real Time Incremental Object Detection
Shizhou Zhang, Xueqiang Lv, Yinghui Xing, Qirui Wu, Di Xu, Chen Zhao, Yanning Zhang

TL;DR
YOLO-IOD is a real-time incremental object detection framework built on YOLO, addressing catastrophic forgetting through conflict-aware pseudo-label refinement, importance-based kernel selection, and asymmetric knowledge distillation, with improved performance on realistic benchmarks.
Contribution
The paper introduces YOLO-IOD, a novel real-time incremental object detection method that effectively mitigates knowledge conflicts and catastrophic forgetting in YOLO-based detectors.
Findings
YOLO-IOD outperforms existing methods on standard and realistic benchmarks.
It effectively reduces catastrophic forgetting in incremental detection.
The framework maintains real-time performance while improving accuracy.
Abstract
Current methods for incremental object detection (IOD) primarily rely on Faster R-CNN or DETR series detectors; however, these approaches do not accommodate the real-time YOLO detection frameworks. In this paper, we first identify three primary types of knowledge conflicts that contribute to catastrophic forgetting in YOLO-based incremental detectors: foreground-background confusion, parameter interference, and misaligned knowledge distillation. Subsequently, we introduce YOLO-IOD, a real-time Incremental Object Detection (IOD) framework that is constructed upon the pretrained YOLO-World model, facilitating incremental learning via a stage-wise parameter-efficient fine-tuning process. Specifically, YOLO-IOD encompasses three principal components: 1) Conflict-Aware Pseudo-Label Refinement (CPR), which mitigates the foreground-background confusion by leveraging the confidence levels of…
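The abstract is cut off before it explains how CPR uses pseudo-label confidence, so the following is only a generic illustration of confidence-based pseudo-labeling as commonly used in incremental detection, not the paper's CPR procedure. The names `Detection` and `split_pseudo_labels`, the two thresholds `high` and `low`, and the IoU cutoff are all assumptions made for this sketch. The idea shown: detections from the old model on a new-task image are kept as pseudo-labels when confident, marked as "ignore" regions when uncertain (so they are not trained as background), and dropped when they overlap a ground-truth box of a new class.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    box: Box
    label: str
    score: float  # confidence from the old (previous-stage) detector


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def split_pseudo_labels(
    old_dets: List[Detection],
    new_gt_boxes: List[Box],
    high: float = 0.7,     # hypothetical threshold: keep as pseudo-label
    low: float = 0.3,      # hypothetical threshold: below this, discard
    iou_thr: float = 0.5,  # hypothetical overlap cutoff against new-class boxes
) -> Tuple[List[Detection], List[Detection]]:
    """Return (keep, ignore) lists of old-class detections.

    keep:   confident detections, used as pseudo ground truth for old classes.
    ignore: uncertain detections, excluded from the background loss.
    """
    keep: List[Detection] = []
    ignore: List[Detection] = []
    for det in old_dets:
        # Skip detections that coincide with an annotated new-class object.
        if any(iou(det.box, gt) >= iou_thr for gt in new_gt_boxes):
            continue
        if det.score >= high:
            keep.append(det)
        elif det.score >= low:
            ignore.append(det)
        # Detections with score < low are treated as noise and dropped.
    return keep, ignore
```

In a training loop, the `keep` boxes would be merged with the new-class annotations, while the `ignore` boxes would mask out the corresponding regions from the background term of the loss. How YOLO-IOD actually resolves such conflicts is described in the full paper.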
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
