Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study
A. Nieto Juscafresa, Á. Mazcuñán Herreros, J. Sullivan

TL;DR
This paper explores the use of diffusion models as feature extractors for fine-grained image classification, demonstrating their effectiveness in plankton monitoring and robustness under distribution shifts.
Contribution
It introduces a method to use frozen diffusion models as feature encoders for fine-grained recognition, outperforming other self-supervised methods in real-world plankton classification tasks.
Findings
Diffusion features are competitive with supervised baselines.
They outperform other self-supervised methods in balanced and long-tailed settings.
Features maintain accuracy under distribution shifts.
Abstract
Diffusion models have emerged as state-of-the-art generative methods for image synthesis, yet their potential as general-purpose feature encoders remains underexplored. Trained for denoising and generation without labels, they can be interpreted as self-supervised learners that capture both low- and high-level structure. We show that a frozen diffusion backbone enables strong fine-grained recognition by probing intermediate denoising features across layers and timesteps and training a linear classifier for each pair. We evaluate this in a real-world plankton-monitoring setting with practical impact, using controlled and comparable training setups against established supervised and self-supervised baselines. Frozen diffusion features are competitive with supervised baselines and outperform other self-supervised methods in both balanced and naturally long-tailed settings.…
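The probing protocol described in the abstract (one linear classifier per layer and timestep pair, trained on frozen features) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: `frozen_features` is a hypothetical stand-in for a real diffusion backbone, the "images" are toy Gaussian blobs, the timestep `t` is treated as a noise fraction in [0, 1), and the linear classifier is a plain softmax regression trained with SGD.

```python
import math
import random

def make_data(n_per_class, n_classes, dim, rng):
    """Toy 'images': one Gaussian blob per class (assumed stand-in for real crops)."""
    centers = [[rng.gauss(0, 2) for _ in range(dim)] for _ in range(n_classes)]
    xs, ys = [], []
    for c, center in enumerate(centers):
        for _ in range(n_per_class):
            xs.append([m + rng.gauss(0, 1) for m in center])
            ys.append(c)
    return xs, ys

def frozen_features(x, layer, t, weights, rng):
    """Hypothetical frozen backbone: noise the input to level t, then apply
    the fixed (never trained) projection associated with this layer."""
    noisy = [math.sqrt(1 - t) * v + math.sqrt(t) * rng.gauss(0, 1) for v in x]
    return [math.tanh(sum(w * v for w, v in zip(row, noisy)))
            for row in weights[layer]]

def scores(W, f):
    # Last entry of each row is the bias term.
    return [sum(w * v for w, v in zip(row[:-1], f)) + row[-1] for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train_linear_probe(feats, labels, n_classes, epochs=30, lr=0.1):
    """Softmax regression trained with plain SGD; only the probe is updated."""
    dim = len(feats[0])
    W = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            probs = softmax(scores(W, f))
            for c in range(n_classes):
                g = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    W[c][j] -= lr * g * f[j]
                W[c][dim] -= lr * g
    return W

def accuracy(W, feats, labels):
    hits = 0
    for f, y in zip(feats, labels):
        s = scores(W, f)
        hits += int(s.index(max(s)) == y)
    return hits / len(labels)

def probe_grid(layers=(0, 1, 2), timesteps=(0.05, 0.3, 0.7), seed=0):
    """Train one linear probe per (layer, timestep) pair and report
    held-out accuracy for each pair, plus the best-scoring pair."""
    rng = random.Random(seed)
    n_classes, dim, feat_dim = 3, 8, 6
    xs, ys = make_data(40, n_classes, dim, rng)
    order = list(range(len(xs)))
    rng.shuffle(order)
    split = int(0.75 * len(order))
    train_idx, val_idx = order[:split], order[split:]
    # One fixed random projection per layer plays the role of frozen weights.
    weights = {l: [[rng.gauss(0, 0.5) for _ in range(dim)]
                   for _ in range(feat_dim)] for l in layers}
    results = {}
    for layer in layers:
        for t in timesteps:
            feats = [frozen_features(x, layer, t, weights, rng) for x in xs]
            W = train_linear_probe([feats[i] for i in train_idx],
                                   [ys[i] for i in train_idx], n_classes)
            results[(layer, t)] = accuracy(W, [feats[i] for i in val_idx],
                                           [ys[i] for i in val_idx])
    best = max(results, key=results.get)
    return results, best

if __name__ == "__main__":
    results, best = probe_grid()
    for pair, acc in sorted(results.items()):
        print(pair, round(acc, 3))
    print("best (layer, timestep):", best)
```

In a real setup, the stand-in extractor would be replaced by activations read from a pretrained denoising network at the chosen layer and timestep. The grid loop and the per-pair probe would stay the same.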
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Single-cell and spatial transcriptomics
