Multi-Label Plant Species Classification with Self-Supervised Vision Transformers
Murilo Gustineli, Anthony Miyaguchi, and Ian Stalter

TL;DR
This paper introduces a transfer learning approach for multi-label plant species classification that pairs feature embeddings from a self-supervised Vision Transformer (DINOv2) with Spark-based distributed data processing to handle a large-scale dataset.
Contribution
The paper presents a method that combines self-supervised Vision Transformer embeddings with distributed processing for large-scale multi-label plant classification tasks.
Findings
Effective multi-label classification achieved using DINOv2 embeddings.
Distributed data processing with Spark enhances scalability and efficiency.
Code availability facilitates reproducibility and further research.
Abstract
We present a transfer learning approach using a self-supervised Vision Transformer (DINOv2) for the PlantCLEF 2024 competition, focusing on multi-label plant species classification. Our method leverages both base and fine-tuned DINOv2 models to extract generalized feature embeddings. We train classifiers on these embeddings to predict multiple plant species within a single image. To address the computational challenges of the large-scale dataset, we employ Spark for distributed data processing, ensuring efficient memory management and processing across a cluster of workers. Our data processing pipeline transforms images into grids of tiles, classifies each tile, and aggregates these predictions into a consolidated set of probabilities. Our results demonstrate the efficacy of combining transfer learning with advanced data processing techniques for multi-label image…
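The tile, classify, and aggregate steps described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the nested-list image, the `toy_classifier` stand-in (the paper uses classifiers trained on DINOv2 embeddings), the max-over-tiles aggregation rule, and the 0.5 threshold. The abstract does not say which aggregation rule the authors use.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for an image: a 2D grid of pixel values in [0, 1].
Image = List[List[float]]


def split_into_tiles(image: Image, grid_size: int) -> List[Image]:
    """Cut an image into grid_size x grid_size tiles, in row-major order.

    Trailing rows or columns that do not divide evenly are dropped.
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // grid_size, width // grid_size
    tiles = []
    for r in range(grid_size):
        for c in range(grid_size):
            rows = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in rows])
    return tiles


def aggregate_max(tile_probs: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Assumed rule: keep each species' highest probability across tiles."""
    combined: Dict[str, float] = {}
    for probs in tile_probs:
        for species, p in probs.items():
            combined[species] = max(p, combined.get(species, 0.0))
    return combined


def predict_image(
    image: Image,
    grid_size: int,
    classify_tile: Callable[[Image], Dict[str, float]],
    threshold: float = 0.5,
) -> List[str]:
    """Tile the image, classify each tile, and return species above threshold."""
    tiles = split_into_tiles(image, grid_size)
    combined = aggregate_max([classify_tile(tile) for tile in tiles])
    return sorted(s for s, p in combined.items() if p >= threshold)


def toy_classifier(tile: Image) -> Dict[str, float]:
    """Hypothetical classifier: bright tiles are species_a, dark are species_b."""
    pixels = [value for row in tile for value in row]
    brightness = sum(pixels) / len(pixels)
    return {"species_a": brightness, "species_b": 1.0 - brightness}


# A 4x4 image whose top-left quadrant is bright and the rest dark.
demo_image: Image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

if __name__ == "__main__":
    print(predict_image(demo_image, grid_size=2, classify_tile=toy_classifier))
```

With a 2x2 grid, the bright quadrant votes for one species and the dark quadrants for the other, so both labels survive aggregation. Classifying the whole image at once would blur the two together, which is one plausible motivation for tiling in a multi-label setting.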
Taxonomy
Topics: Spectroscopy and Chemometric Analyses · Smart Agriculture and AI · Advanced Chemical Sensor Technologies
Methods: Sparse Evolutionary Training · Attention Is All You Need · Softmax · Byte Pair Encoding · Layer Normalization · Linear Layer · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam
