A Curated and Re-annotated Peripheral Blood Cell Dataset Integrating Four Public Resources
Lu Gan, Xi Li, Xichun Wang

TL;DR
This paper introduces TXL-PBC, a curated and re-annotated peripheral blood cell dataset that integrates four public sources. It provides bounding box annotations for three cell types and baseline results from several object detection models to support blood cell detection research.
Contribution
The authors construct the TXL-PBC dataset through rigorous sample selection, semi-automatic annotation, and manual review, and report baseline detection results, providing a resource for blood cell analysis and for benchmarking detection models.
Findings
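Semi-automatic annotation with YOLOv8n followed by comprehensive manual review yields accurate and consistent annotations.
Visual analyses of the data distribution demonstrate the dataset's diversity and balance.
Baseline object detection models trained on TXL-PBC show promising performance.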
Abstract
We present TXL-PBC, a curated and re-annotated peripheral blood cell dataset constructed by integrating four publicly available resources: Blood Cell Count and Detection (BCCD), Blood Cell Detection Dataset (BCDD), Peripheral Blood Cells (PBC), and Raabin White Blood Cell (Raabin-WBC). Through rigorous sample selection, semi-automatic annotation using the YOLOv8n model, and comprehensive manual review, we ensured high annotation accuracy and consistency. The final dataset contains 1,260 images and 18,143 bounding box annotations for three major blood cell types: white blood cells (WBC), red blood cells (RBC), and platelets. We provide detailed visual analyses of the data distribution, demonstrating the diversity and balance of the dataset. To further validate the quality and utility of TXL-PBC, we trained several mainstream object detection models, including YOLOv5s, YOLOv8s, YOLOv11s,…
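The per-class distribution described in the abstract can be checked with a short script. The following is a minimal sketch, not the authors' tooling: it assumes the annotations are stored as YOLO-style text files (one `class x_center y_center width height` line per box, coordinates normalized), and the class-id mapping in `CLASS_NAMES` and the file names in `demo` are hypothetical.

```python
from collections import Counter
from pathlib import Path
import tempfile

# Hypothetical class-id mapping; the dataset's actual ids may differ.
CLASS_NAMES = {0: "WBC", 1: "RBC", 2: "Platelet"}


def parse_label_file(path):
    """Return a list of (class_id, x_center, y_center, width, height) tuples."""
    boxes = []
    for line in Path(path).read_text().splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        x, y, w, h = (float(v) for v in parts[1:])
        boxes.append((class_id, x, y, w, h))
    return boxes


def class_distribution(label_dir):
    """Count boxes per class name and the number of label files in label_dir."""
    counts = Counter()
    n_files = 0
    for label_path in sorted(Path(label_dir).glob("*.txt")):
        n_files += 1
        for class_id, *_ in parse_label_file(label_path):
            counts[CLASS_NAMES.get(class_id, f"unknown_{class_id}")] += 1
    return counts, n_files


def demo():
    """Build two tiny synthetic label files and summarize them."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "img_001.txt").write_text(
            "0 0.5 0.5 0.2 0.2\n1 0.3 0.3 0.1 0.1\n1 0.7 0.6 0.1 0.1\n"
        )
        (Path(tmp) / "img_002.txt").write_text(
            "1 0.4 0.4 0.1 0.1\n2 0.8 0.2 0.03 0.03\n"
        )
        return class_distribution(tmp)


if __name__ == "__main__":
    counts, n_files = demo()
    total = sum(counts.values())
    print(dict(counts), f"{total / n_files:.1f} boxes per image")
```

Applied to the totals reported in the abstract, the same ratio gives 18,143 / 1,260 ≈ 14.4 bounding boxes per image on average.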
Taxonomy
Topics: Single-cell and spatial transcriptomics · Cancer Genomics and Diagnostics · Blood groups and transfusion
