Cross-modal learning for plankton recognition

Joona Kareinen; Veikka Immonen; Tuomas Eerola; Lumi Haraguchi; Lasse Lensu; Kaisa Kraft; Sanna Suikkanen; Heikki Kälviäinen

arXiv:2603.16427·cs.CV·April 20, 2026

TL;DR

This paper introduces a self-supervised cross-modal learning approach for plankton recognition. It trains on unlabeled pairs of images and optical measurement data (such as scatter and fluorescence profiles), which reduces the need for manual labeling and improves accuracy over image-only self-supervised baselines.

Contribution

The paper presents a multimodal training method inspired by CLIP. The method enables recognition with few labeled samples and outperforms image-only baselines.

Findings

01. Achieves high recognition accuracy with few labeled images.

02. Outperforms image-only self-supervised methods.

03. Utilizes both image and profile data for recognition.
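
To make the "few labeled images" finding concrete, here is a minimal sketch of one common way to use a pretrained encoder with a small labeled set: nearest-class-centroid classification on frozen embeddings. This listing does not say which classifier the paper uses, so this is an assumption for illustration only. The function names, the two-dimensional toy embeddings, and the class names are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def class_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per class."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(embedding, centroids):
    """Return the class whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))


# Hypothetical embeddings, standing in for the output of a frozen encoder.
labeled = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
labels = ["diatom", "diatom", "cyanobacteria", "cyanobacteria"]
centroids = class_centroids(labeled, labels)
prediction = predict([0.8, 0.2], centroids)
```

With only two labeled examples per class, the quality of the prediction depends almost entirely on how well the pretrained embedding space separates the classes, which is why the self-supervised stage matters in a low-label setting.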

Abstract

This paper considers self-supervised cross-modal coordination as a strategy enabling utilization of multiple modalities and large volumes of unlabeled plankton data to build models for plankton recognition. Automated imaging instruments facilitate the continuous collection of plankton image data on a large scale. Current methods for automatic plankton image recognition rely primarily on supervised approaches, which require labeled training sets that are labor-intensive to collect. On the other hand, some modern plankton imaging instruments complement image information with optical measurement data, such as scatter and fluorescence profiles, which currently are not widely utilized in plankton recognition. In this work, we explore the possibility of using such measurement data to guide the learning process without requiring manual labeling. Inspired by the concepts behind Contrastive…
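
The abstract is cut off here, so the exact training objective is not shown. As a rough illustration of what a CLIP-style objective usually looks like, the sketch below computes a symmetric contrastive loss over a batch of paired embeddings, where row k of each input comes from the same plankton particle. It assumes the image and optical-profile encoders have already produced fixed-length vectors. The function names and the temperature value of 0.07 are assumptions, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    peak = max(row)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_z for x in row]


def symmetric_contrastive_loss(image_emb, profile_emb, temperature=0.07):
    """Average of image-to-profile and profile-to-image cross-entropy.

    Matching pairs share a row index, so the target for row k is column k.
    """
    imgs = [l2_normalize(v) for v in image_emb]
    profs = [l2_normalize(v) for v in profile_emb]
    n = len(imgs)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, prof)) / temperature for prof in profs]
        for img in imgs
    ]
    loss_i2p = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_p2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2p + loss_p2i)


# Toy batch: correctly paired embeddings versus deliberately mismatched ones.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
mismatched = symmetric_contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each image embedding is closest to the profile embedding of the same particle and large otherwise, which is how paired measurements can act as a training signal without manual labels.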

Code & Models

Repositories

Jookare/cross-modal-plankton (GitHub)
