CST-YOLO: A Novel Method for Blood Cell Detection Based on Improved YOLOv7 and CNN-Swin Transformer

Ming Kang; Chee-Ming Ting; Fung Fung Ting; Raphaël Phan

arXiv:2306.14590 · cs.CV · October 8, 2024

TL;DR

This paper introduces CST-YOLO, a blood cell detection model that augments YOLOv7 with a CNN-Swin Transformer (CST) module and three further modules (W-ELAN, MCS, and CatConv), reaching 92.7%, 95.6%, and 91.1% mAP@0.5 on three blood cell datasets.

Contribution

The paper fuses a CNN with a Swin Transformer inside YOLOv7 for small-scale object detection and adds three modules (W-ELAN, MCS, and CatConv) to improve detection precision.

Findings

1. Achieved 92.7%, 95.6%, and 91.1% mAP@0.5 on three blood cell datasets.

2. Outperformed RT-DETR, YOLOv5, and YOLOv7 in the reported experiments.

3. Demonstrated the effectiveness of CNN-Transformer fusion and of the W-ELAN, MCS, and CatConv modules.

Abstract

Blood cell detection is a typical small-scale object detection problem in computer vision. In this paper, we propose CST-YOLO, a model for blood cell detection based on the YOLOv7 architecture and enhanced with the CNN-Swin Transformer (CST), which is a new attempt at CNN-Transformer fusion. We also introduce three other useful modules in CST-YOLO to improve small-scale object detection precision: Weighted Efficient Layer Aggregation Networks (W-ELAN), Multiscale Channel Split (MCS), and Concatenate Convolutional Layers (CatConv). Experimental results show that the proposed CST-YOLO achieves 92.7%, 95.6%, and 91.1% mAP@0.5, respectively, on three blood cell datasets, outperforming state-of-the-art object detectors, e.g., RT-DETR, YOLOv5, and YOLOv7. Our code is available at https://github.com/mkang315/CST-YOLO.
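
The results above are reported as mAP@0.5, where a predicted box counts as a match only if its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching criterion, not code from the CST-YOLO repository; the function names `iou` and `is_match` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    The corner format is an assumption for this sketch.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth at IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    print(is_match((2, 0, 12, 10), gt))  # overlap 80 / union 120
    print(is_match((5, 0, 15, 10), gt))  # overlap 50 / union 150
```

Small objects such as blood cells are sensitive to this criterion, since a shift of a few pixels changes the IoU of a small box far more than that of a large one. The full metric additionally averages precision over recall levels and classes, which this sketch does not cover.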

Code & Models

Repositories

mkang315/CST-YOLO (PyTorch, official)

Taxonomy

Topics: Digital Imaging for Blood Diseases · AI in cancer detection · COVID-19 diagnosis using AI

Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Linear Layer · Layer Normalization · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Adam · Byte Pair Encoding