Does YOLO Really Need to See Every Training Image in Every Epoch?
Xingxing Xie, Jiahua Dong, Junwei Han, Gong Cheng

TL;DR
This paper introduces AFSS, a dynamic sampling strategy for YOLO training that skips images the detector has already learned sufficiently in each epoch, speeding up training by more than 1.43x while improving detection accuracy.
Contribution
The paper proposes AFSS, a novel adaptive sampling method that dynamically categorizes images by learning sufficiency, enabling faster and more effective YOLO training.
Findings
Over 1.43x training speedup on multiple datasets
Improved detection accuracy with AFSS
Effective reduction of redundant training images
Abstract
YOLO detectors are known for their fast inference speed, yet training them remains unexpectedly time-consuming due to their exhaustive pipeline that processes every training image in every epoch, even when many images have already been sufficiently learned. This stands in clear contrast to the efficiency suggested by the "You Only Look Once" philosophy. This naturally raises an important question: Does YOLO really need to see every training image in every epoch? To explore this, we propose an Anti-Forgetting Sampling Strategy (AFSS) that dynamically determines which images should be used and which can be skipped during each epoch, allowing the detector to learn more effectively and efficiently. Specifically, AFSS measures the learning sufficiency of each training image as the minimum of its detection recall and precision, and dynamically categorizes training images into easy,…
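The learning-sufficiency measure described in the abstract (the minimum of an image's detection recall and precision) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `learning_sufficiency`, the per-image true-positive/false-positive/false-negative counts, the example file names, and the choice to score empty denominators as 0.0 are all assumptions made for illustration. How predictions are matched to ground truth, and the thresholds that separate the image categories, are not given in the text above and are deliberately left out.

```python
def learning_sufficiency(tp: int, fp: int, fn: int) -> float:
    """Return min(precision, recall) for a single training image.

    tp, fp, fn are per-image detection counts. The matching rule that
    produces them is outside this sketch (assumption: it is done upstream).
    """
    n_pred = tp + fp  # number of predicted boxes
    n_gt = tp + fn    # number of ground-truth boxes
    # Assumption: an empty denominator is scored as 0.0 rather than undefined.
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gt if n_gt else 0.0
    return min(precision, recall)


# Hypothetical per-image counts: (true positives, false positives, false negatives).
counts = {
    "img_a.jpg": (8, 2, 0),
    "img_b.jpg": (3, 1, 5),
    "img_c.jpg": (0, 4, 2),
}

scores = {name: learning_sufficiency(*c) for name, c in counts.items()}

# Images ordered from least to most sufficiently learned.
ranked = sorted(scores, key=scores.get)

for name in ranked:
    print(f"{name}: {scores[name]:.3f}")
```

Taking the minimum means an image only counts as well learned when both precision and recall are high: an image with many missed objects or many spurious detections scores low either way.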
Taxonomy
Topics: Remote-Sensing Image Classification · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
