TL;DR
This paper introduces ConvNeXt Masked-Diffusion (CMD), a convolutional foundation model for cell-level dense prediction in pathology that outperforms ViT-based models, especially when annotations are limited.
Contribution
It proposes a fully convolutional, self-supervised pretraining framework based on masked diffusion, and reports stronger performance and robustness than ViT-based models on pathology tasks.
Findings
CMD outperforms existing ViT-based models in dense prediction tasks.
CMD surpasses state-of-the-art segmentation methods with fewer task-specific parameters.
CMD shows stronger robustness and generalization under limited annotations.
Abstract
Cell-level dense prediction is central to computational pathology, but remains challenging due to fine-grained histological structures, strong domain shifts, and costly dense annotations. Existing ViT-based pathology foundation models rely on patch tokenization, which can disrupt spatial continuity and weaken local morphological details needed for cell-level prediction. To address this, we propose Masked-Diffusion Convolutional Foundation Models, termed ConvNeXt Masked-Diffusion (CMD), a self-supervised convolutional generative pretraining framework for dense pathology representation learning. CMD uses a fully convolutional ConvNeXt-UNet backbone, performs masked-diffusion pretraining in pixel space, and incorporates frozen pathology foundation model features through adaptive normalization. Experimental results demonstrate that CMD consistently outperforms existing ViT-based pathology…
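The abstract names two mechanisms, masked-diffusion pretraining in pixel space and adaptive normalization driven by frozen foundation-model features, without giving their exact form. The following is a minimal NumPy sketch of what such components could look like, not the authors' implementation. It assumes a square block mask, a DDPM-style forward noising step applied only to masked pixels, and a scale-and-shift modulation of normalized features. All function names, the mask layout, the noise level `alpha_bar_t`, and the projection matrices `w_scale` and `w_shift` are hypothetical.

```python
import numpy as np


def block_mask(height, width, block, mask_ratio, rng):
    """Return an (H, W) mask of square blocks, 1 = masked (layout is an assumption)."""
    gh, gw = height // block, width // block
    n_blocks = gh * gw
    n_masked = int(round(mask_ratio * n_blocks))
    flat = np.zeros(n_blocks, dtype=np.float32)
    flat[rng.choice(n_blocks, size=n_masked, replace=False)] = 1.0
    grid = flat.reshape(gh, gw)
    # Expand each grid cell to a block x block patch of pixels.
    return np.kron(grid, np.ones((block, block), dtype=np.float32))


def masked_forward_diffusion(x0, mask, alpha_bar_t, rng):
    """Noise only masked pixels: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps (assumed form)."""
    eps = rng.standard_normal(x0.shape).astype(np.float32)
    noisy = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    m = mask[None, :, :]  # broadcast the (H, W) mask over channels
    return m * noisy + (1.0 - m) * x0, eps


def adaptive_norm(feat, cond, w_scale, w_shift, eps=1e-5):
    """Normalize each channel of feat (C, H, W), then modulate it using cond (D,)."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    var = feat.var(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / np.sqrt(var + eps)
    scale = (cond @ w_scale)[:, None, None]  # (C, 1, 1)
    shift = (cond @ w_shift)[:, None, None]  # (C, 1, 1)
    return normed * (1.0 + scale) + shift


rng = np.random.default_rng(0)
x0 = rng.random((3, 32, 32), dtype=np.float32)      # stand-in for an image tile
mask = block_mask(32, 32, block=8, mask_ratio=0.5, rng=rng)
x_t, eps = masked_forward_diffusion(x0, mask, alpha_bar_t=0.5, rng=rng)

cond = rng.standard_normal(16).astype(np.float32)    # stand-in for frozen features
w_scale = np.zeros((16, 3), dtype=np.float32)        # zero init: plain normalization
w_shift = np.zeros((16, 3), dtype=np.float32)
out = adaptive_norm(x_t, cond, w_scale, w_shift)
```

In this sketch a denoising network would receive `x_t` and be trained to recover the clean content of the masked region, with the conditioning vector entering only through the normalization layers. Zero-initialized projections make the layer start as plain per-channel normalization, a common choice for conditioning layers that the source does not confirm for CMD.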
