Does YOLO Really Need to See Every Training Image in Every Epoch?

Xingxing Xie; Jiahua Dong; Junwei Han; Gong Cheng

arXiv:2603.17684·cs.CV·March 19, 2026

Open Access

TL;DR

This paper introduces AFSS, a dynamic sampling strategy for YOLO training that skips redundant images, speeds up training by more than 1.43×, and improves detection accuracy by focusing each epoch on informative images.

Contribution

The paper proposes AFSS, a novel adaptive sampling method that dynamically categorizes images by learning sufficiency, enabling faster and more effective YOLO training.

Findings

01. Over 1.43x training speedup on multiple datasets

02. Improved detection accuracy with AFSS

03. Effective reduction of redundant training images

Abstract

YOLO detectors are known for their fast inference speed, yet training them remains unexpectedly time-consuming due to their exhaustive pipeline that processes every training image in every epoch, even when many images have already been sufficiently learned. This stands in clear contrast to the efficiency suggested by the "You Only Look Once" philosophy. This naturally raises an important question: Does YOLO really need to see every training image in every epoch? To explore this, we propose an Anti-Forgetting Sampling Strategy (AFSS) that dynamically determines which images should be used and which can be skipped during each epoch, allowing the detector to learn more effectively and efficiently. Specifically, AFSS measures the learning sufficiency of each training image as the minimum of its detection recall and precision, and dynamically categorizes training images into easy,…
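The learning-sufficiency score described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ImageStats` container, the `learning_sufficiency` function name, and the choice to treat an undefined precision or recall as 0.0 are all assumptions made here for illustration. How the per-image matches are counted, and how the scores are then turned into categories, is not specified in the text above and is left out.

```python
from dataclasses import dataclass


@dataclass
class ImageStats:
    """Hypothetical per-image detection counts from one evaluation pass."""
    true_positives: int
    false_positives: int
    false_negatives: int


def learning_sufficiency(stats: ImageStats) -> float:
    """Return min(precision, recall) for a single training image.

    Assumption: when a ratio is undefined (zero denominator), it is
    treated as 0.0. The source does not say how such cases are handled.
    """
    tp = stats.true_positives
    predicted = tp + stats.false_positives   # all detections on the image
    actual = tp + stats.false_negatives      # all ground-truth objects

    precision = tp / predicted if predicted > 0 else 0.0
    recall = tp / actual if actual > 0 else 0.0
    return min(precision, recall)
```

For example, an image with 8 correct detections, 2 false alarms, and no missed objects has precision 0.8 and recall 1.0, so its score under this sketch is 0.8. Taking the minimum means an image only counts as well learned when both metrics are high.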

Taxonomy

Topics: Remote-Sensing Image Classification · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques