Zero-shot Shark Tracking and Biometrics from Aerial Imagery
Chinmay K Lalgudi, Mark E Leone, Jaden V Clark, Sergio Madrigal-Mora, Mario Espinoza

TL;DR
This paper introduces FLAIR, a zero-shot video analysis method leveraging SAM2 and CLIP for marine animal detection in drone imagery, outperforming traditional models and reducing human effort.
Contribution
The paper presents FLAIR, a novel zero-shot approach combining SAM2 and CLIP for marine animal segmentation in drone videos without requiring labeled data or model training.
Findings
FLAIR achieves a Dice score of 0.81 on shark segmentation.
Outperforms state-of-the-art object detection models.
Generalizes to different shark species without retraining.
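For readers unfamiliar with the metric behind the 0.81 figure, the Dice score compares a predicted mask with a ground-truth mask as twice the overlap divided by the sum of the two mask areas. The following is a minimal sketch, not the paper's evaluation code: the function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Illustrative helper only; the mask format and the empty-mask convention
    are assumptions, not taken from the paper.
    """
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    # Count pixels set in both masks, and pixels set in each mask separately.
    intersection = sum(
        1 for p_row, t_row in zip(pred, truth) for p, t in zip(p_row, t_row) if p and t
    )
    pred_area = sum(1 for row in pred for p in row if p)
    truth_area = sum(1 for row in truth for t in row if t)

    if pred_area + truth_area == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / (pred_area + truth_area)


# Toy 2x3 masks: 2 overlapping pixels, 3 pixels set in each mask.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
score = dice_score(predicted, ground_truth)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no pixels, so 0.81 indicates substantial but imperfect overlap.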
Abstract
The recent widespread adoption of drones for studying marine animals provides opportunities for deriving biological information from aerial imagery. The large scale of imagery data acquired from drones is well suited for machine learning (ML) analysis. Development of ML models for analyzing marine animal aerial imagery has followed the classical paradigm of training, testing, and deploying a new model for each dataset, requiring significant time, human effort, and ML expertise. We introduce Frame Level ALIgnment and tRacking (FLAIR), which leverages the video understanding of Segment Anything Model 2 (SAM2) and the vision-language capabilities of Contrastive Language-Image Pre-training (CLIP). FLAIR takes a drone video as input and outputs segmentation masks of the species of interest across the video. Notably, FLAIR leverages a zero-shot approach, eliminating the need for labeled data,…
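The abstract does not spell out how FLAIR combines the two models, so the sketch below is only a generic illustration of CLIP-style zero-shot matching, not the authors' method. It assumes that each candidate mask has already been turned into an image embedding and that the text prompt (for example, a species name) has been turned into a text embedding; the toy vectors, the function names, and the `threshold` parameter are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_matching_masks(mask_embeddings, text_embedding, threshold=0.5):
    """Return indices of candidate masks whose embedding aligns with the text prompt.

    Hypothetical helper: the threshold and the embedding inputs are assumptions
    for illustration, not values or interfaces from the paper.
    """
    return [
        i
        for i, embedding in enumerate(mask_embeddings)
        if cosine_similarity(embedding, text_embedding) >= threshold
    ]


# Toy 2-D embeddings standing in for real image and text encoder outputs.
prompt_embedding = [1.0, 0.0]
candidate_embeddings = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
strict_matches = select_matching_masks(candidate_embeddings, prompt_embedding, threshold=0.8)
loose_matches = select_matching_masks(candidate_embeddings, prompt_embedding, threshold=0.5)
```

In a zero-shot setting of this kind, no labels are needed because the text prompt alone decides which candidates count as the species of interest; the retained masks could then be handed to a video segmentation model for tracking across frames.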
Taxonomy
Topics: Ichthyology and Marine Biology · Identification and Quantification in Food
