YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen, Xinbin Yuan, Jiabao Wang, Ruiqi Wu, Xiang Li, Qibin Hou, Ming-Ming Cheng

TL;DR
YOLO-MS introduces a new multi-scale feature representation strategy that significantly improves real-time object detection performance, outperforming recent state-of-the-art detectors on MS COCO without relying on pre-trained models.
Contribution
The paper proposes a novel multi-scale feature learning approach for YOLO detectors, enhancing detection accuracy and efficiency without external pre-training.
Findings
Trained from scratch on MS COCO, YOLO-MS achieves higher AP than recent state-of-the-art real-time detectors, including YOLO-v7, RTMDet, and YOLO-v8.
The method improves detection of objects at various scales.
YOLO-MS can serve as a plug-and-play module for other YOLO models.
Abstract
We aim at providing the object detection community with an efficient and performant object detector, termed YOLO-MS. The core design is based on a series of investigations on how multi-branch features of the basic block and convolutions with different kernel sizes affect the detection performance of objects at different scales. The outcome is a new strategy that can significantly enhance multi-scale feature representations of real-time object detectors. To verify the effectiveness of our work, we train our YOLO-MS on the MS COCO dataset from scratch without relying on any other large-scale datasets, like ImageNet or pre-trained weights. Without bells and whistles, our YOLO-MS outperforms the recent state-of-the-art real-time object detectors, including YOLO-v7, RTMDet, and YOLO-v8. Taking the XS version of YOLO-MS as an example, it can achieve an AP score of 42+% on MS COCO, which is…
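The abstract links detection performance at different object scales to the kernel sizes of the convolutions in the basic block. One way to see why kernel size matters is to compute the theoretical receptive field of a stack of convolutions, as in the minimal sketch below. The function name `receptive_field` and the kernel sizes and strides are illustrative assumptions, not the configuration used in YOLO-MS.

```python
def receptive_field(layers):
    """Theoretical receptive field, in input pixels, of stacked convolutions.

    `layers` is a list of (kernel_size, stride) pairs applied in order.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance between neighbouring outputs, measured in input pixels
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical branches: two stacked convolutions per branch, stride 1.
# These kernel sizes are assumptions chosen only for illustration.
branches = {
    "3x3": [(3, 1), (3, 1)],
    "5x5": [(5, 1), (5, 1)],
    "7x7": [(7, 1), (7, 1)],
}

for name, layers in branches.items():
    print(name, receptive_field(layers))
```

Under these assumptions, larger kernels cover more of the input per output location, which is the kind of trade-off the abstract says the authors investigated across object scales.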
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: You Only Look Once · RTMDet: An Empirical Study of Designing Real-Time Object Detectors
