Data Efficiency and Transfer Robustness in Biomedical Image Segmentation: A Study of Redundancy and Forgetting with Cellpose
Shuo Zhao, Jianxu Chen

TL;DR
This paper systematically analyzes data redundancy and transfer robustness in biomedical image segmentation using Cellpose, showing that segmentation performance on the Cyto dataset saturates with only 10% of the training data and that strategic data replay mitigates forgetting during cross-domain transfer.
Contribution
It introduces a dataset quantization (DQ) strategy for efficient training and demonstrates effective methods to reduce catastrophic forgetting in cross-domain transfer for biomedical segmentation.
Findings
Segmentation performance on the Cyto dataset saturates with only 10% of the training data, indicating high redundancy.
Selective data replay effectively restores source-domain performance.
Training domain sequencing improves generalization and reduces forgetting.
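The replay finding can be illustrated with a minimal sketch of how a fine-tuning set might be assembled: all target-domain samples plus a small replayed share of source-domain samples. This is not the authors' procedure. The function name `build_replay_mix`, the `replay_fraction` parameter, and its default are hypothetical, and uniform random sampling stands in for whatever selection rule the paper uses.

```python
import random


def build_replay_mix(source_items, target_items, replay_fraction=0.1, seed=0):
    """Return all target items plus a random replay subset of source items.

    Assumption: replayed samples are drawn uniformly at random; a selective
    strategy would replace the rng.sample call below.
    """
    if not 0.0 <= replay_fraction <= 1.0:
        raise ValueError("replay_fraction must be in [0, 1]")
    rng = random.Random(seed)
    # Number of source samples to replay alongside the target domain.
    n_replay = round(replay_fraction * len(source_items))
    replayed = rng.sample(list(source_items), n_replay)
    mixed = list(target_items) + replayed
    rng.shuffle(mixed)  # interleave source and target samples
    return mixed


source = [f"src_{i}" for i in range(50)]
target = [f"tgt_{i}" for i in range(20)]
mix = build_replay_mix(source, target, replay_fraction=0.1)
print(len(mix))
```

Keeping the replay share as an explicit parameter makes it easy to study how much source data is needed before source-domain performance recovers.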
Abstract
Generalist biomedical image segmentation models such as Cellpose are increasingly applied across diverse imaging modalities and cell types. However, two critical challenges remain underexplored: (1) the extent of training data redundancy and (2) the impact of cross-domain transfer on model retention. In this study, we conduct a systematic empirical analysis of these challenges using Cellpose as a case study. First, to assess data redundancy, we propose a simple dataset quantization (DQ) strategy for constructing compact yet diverse training subsets. Experiments on the Cyto dataset show that image segmentation performance saturates with only 10% of the data, revealing substantial redundancy and potential for training with minimal annotations. Latent space analysis using MAE embeddings and t-SNE confirms that DQ-selected patches capture greater feature diversity than random sampling.…
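To make the idea of a "compact yet diverse" subset concrete, the sketch below selects samples from a set of embedding vectors by greedy farthest-point (k-center) selection. This is a generic diversity heuristic used as a stand-in, not the DQ strategy from the paper, whose details are not given in this summary. The function name `farthest_point_subset` and the toy 2-D embeddings are hypothetical; in practice the vectors would be per-patch features such as the MAE embeddings mentioned in the abstract.

```python
import math


def farthest_point_subset(embeddings, k, start=0):
    """Greedy k-center selection over embedding vectors.

    Repeatedly adds the sample that is farthest from everything already
    chosen, so the subset spreads across the feature space.
    Returns the indices of the selected samples in selection order.
    """
    n = len(embeddings)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of samples")
    chosen = [start]
    # Distance from each sample to its nearest chosen sample.
    min_dist = [math.dist(e, embeddings[start]) for e in embeddings]
    while len(chosen) < k:
        nxt = max(range(n), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i in range(n):
            d = math.dist(embeddings[i], embeddings[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


# Toy embeddings: two tight clusters and one outlier.
points = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (-10, 5)]
print(farthest_point_subset(points, k=3))
```

On the toy data, the selection picks one sample from each cluster and the outlier rather than several near-duplicates from a single cluster, which is the behavior a redundancy-reducing subset is meant to have.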
Taxonomy
Topics: Cell Image Analysis Techniques · AI in cancer detection · Domain Adaptation and Few-Shot Learning
