Using Deep Learning Models Pretrained by Self-Supervised Learning for Protein Localization

Ben Isselmann; Dilara Göksu; Heinz Neumann; Andreas Weinmann

arXiv:2604.10970·cs.CV·April 14, 2026

TL;DR

This study shows that DINO-based ViT models pretrained with self-supervised learning on ImageNet-1k or HPA FOV transfer well to protein localization on the OpenCell microscopy dataset, even without fine-tuning.

Contribution

It evaluates how well SSL models pretrained on ImageNet-1k and HPA FOV generalize to OpenCell, covering use with and without fine-tuning, two channel-mismatch strategies, and varying fine-tuning data fractions.

Findings

01

The HPA FOV-pretrained DINO ViT achieves the highest zero-shot macro F1 on OpenCell, 0.822.

02

Fine-tuning raises macro F1 to 0.860.

03

Single-cell embeddings from HPA pretraining outperform others in k-NN tasks.

Abstract

Background: Task-specific microscopy datasets are often small, making it difficult to train deep learning models that learn robust features. While self-supervised learning (SSL) has shown promise through pretraining on large, domain-specific datasets, generalizability across datasets with differing staining protocols and channel configurations remains underexplored. We investigated the generalizability of SSL models pretrained on ImageNet-1k and HPA FOV, evaluating their embeddings on OpenCell with and without fine-tuning, two channel-mismatch strategies, and varying fine-tuning data fractions. We additionally analyzed single-cell embeddings on a labeled OpenCell subset. Result: DINO-based ViT backbones pretrained on HPA FOV or ImageNet-1k transfer well to OpenCell even without fine-tuning. The HPA FOV-pretrained model achieved the highest zero-shot performance (macro $F_{1}$ 0.822…
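
The scores quoted above are macro $F_{1}$ values. As a minimal sketch of how that metric is usually defined (the unweighted mean of per-class $F_{1}$), the following pure-Python function may help; it is not the authors' evaluation code, and the function name and the localization labels in the example are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred.

    Illustrative helper only; the paper's exact evaluation setup is not
    described in the text above.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[t] += 1   # class t was missed

    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: one frequent class and two rare ones.
y_true = ["nucleus"] * 4 + ["cytoplasm", "er"]
y_pred = ["nucleus"] * 4 + ["cytoplasm", "cytoplasm"]
print(round(macro_f1(y_true, y_pred), 3))
```

In this toy example five of six predictions are correct, yet the macro average is pulled down because the rare `er` class is missed entirely. That sensitivity to rare classes is why macro $F_{1}$ is a common choice for imbalanced localization labels.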
