ActiveSSF: An Active-Learning-Guided Self-Supervised Framework for Long-Tailed Megakaryocyte Classification
Linghao Zhuang, Ying Zhang, Gege Yuan, Xingyue Zhao, Zhiping Jiang

TL;DR
ActiveSSF is a novel framework combining active learning and self-supervised pretraining to improve classification of megakaryocytes, especially rare subtypes, in stained slides with complex backgrounds and morphological variability.
Contribution
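This work introduces ActiveSSF, which integrates region-of-interest extraction (Gaussian filtering, K-means clustering, and HSV analysis), adaptive sample selection, and prototype clustering to address background noise, long-tailed class distribution, and morphological variability in megakaryocyte classification.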
Findings
Achieves state-of-the-art accuracy on clinical datasets.
Significantly improves recognition of rare megakaryocyte subtypes.
Demonstrates practical potential for clinical diagnosis.
Abstract
Precise classification of megakaryocytes is crucial for diagnosing myelodysplastic syndromes. Although self-supervised learning has shown promise in medical image analysis, its application to classifying megakaryocytes in stained slides faces three main challenges: (1) pervasive background noise that obscures cellular details, (2) a long-tailed distribution that limits data for rare subtypes, and (3) complex morphological variations leading to high intra-class variability. To address these issues, we propose the ActiveSSF framework, which integrates active learning with self-supervised pretraining. Specifically, our approach employs Gaussian filtering combined with K-means clustering and HSV analysis (augmented by clinical prior knowledge) for accurate region-of-interest extraction; an adaptive sample selection mechanism that dynamically adjusts similarity thresholds to mitigate class…
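As an illustration of the region-of-interest step named in the abstract (Gaussian filtering, then K-means clustering on HSV values), the following is a minimal sketch and not the authors' implementation. The function names `gaussian_blur`, `kmeans_1d`, and `roi_mask` are hypothetical, and the 3x3 kernel, the use of saturation as the only clustering feature, and the two-cluster setting are all assumptions made for the example. The clinical prior knowledge mentioned in the abstract is not modeled.

```python
import colorsys


def gaussian_blur(img):
    """Smooth an RGB image (rows of (r, g, b) floats in [0, 1]) with a 3x3 kernel."""
    h, w = len(img), len(img[0])
    weights = (1, 2, 1)  # binomial approximation of a Gaussian (assumed kernel size)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = [0.0, 0.0, 0.0]
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp coordinates at the borders
                    xx = min(max(x + dx, 0), w - 1)
                    wgt = weights[dy + 1] * weights[dx + 1]
                    total += wgt
                    for c in range(3):
                        acc[c] += wgt * img[yy][xx][c]
            row.append(tuple(a / total for a in acc))
        out.append(row)
    return out


def kmeans_1d(values, iters=20):
    """Two-cluster k-means on scalars, with centers initialised at the min and max."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(near_lo) / len(near_lo) if near_lo else lo
        new_hi = sum(near_hi) / len(near_hi) if near_hi else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi


def roi_mask(img):
    """Return a boolean mask that is True where a pixel falls in the high-saturation cluster."""
    blurred = gaussian_blur(img)
    # Assumption: stained cells are more saturated than the pale slide background.
    sat = [[colorsys.rgb_to_hsv(*px)[1] for px in row] for row in blurred]
    lo, hi = kmeans_1d([s for row in sat for s in row])
    return [[abs(s - hi) < abs(s - lo) for s in row] for row in sat]
```

Calling `roi_mask` on a small synthetic image with a saturated purple patch on a near-white background marks the interior of the patch as foreground and the corners as background. A real pipeline would work on full-resolution slides with an array library and would likely cluster on more than one channel.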
Taxonomy
Topics: Caveolin-1 and cellular processes · Glycosylation and Glycoproteins Research · Machine Learning in Bioinformatics
Methods: k-Means Clustering
