Detection and Identification of Penguins Using Appearance and Motion Features
Kasumi Seko, Hiroki Kinoshita, Raj Rajeshwar Malinda, Hiroaki Kawashima

TL;DR
This paper presents a framework that combines appearance and motion features to improve penguin detection and identification in animal facilities, using a YOLO11 detector adapted to consecutive-frame input and tracklet-based contrastive learning to identify individuals.
Contribution
It introduces a two-frame input adaptation of YOLO for detection and a tracklet-based contrastive learning method for identification, enhancing performance in complex scenarios.
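To make the two-frame idea concrete, here is a minimal sketch of one way to build such inputs: each frame is concatenated with its predecessor along the channel axis, so a detector would see six channels instead of three. The summary does not say how the authors fuse the two frames, so channel concatenation is an assumption, and the names `stack_pair`, `two_frame_inputs`, and `make_frame` are hypothetical helpers rather than part of YOLO11 or any library.

```python
from typing import List

# Toy frame format (an assumption for illustration): H x W x C nested lists.
Frame = List[List[List[float]]]


def stack_pair(prev: Frame, curr: Frame) -> Frame:
    """Concatenate two frames along the channel axis, giving H x W x 2C."""
    same_height = len(prev) == len(curr)
    same_width = all(len(rp) == len(rc) for rp, rc in zip(prev, curr))
    if not (same_height and same_width):
        raise ValueError("frames must share height and width")
    return [
        [px_prev + px_curr for px_prev, px_curr in zip(row_prev, row_curr)]
        for row_prev, row_curr in zip(prev, curr)
    ]


def two_frame_inputs(frames: List[Frame]) -> List[Frame]:
    """Pair every frame with its predecessor.

    The first frame has no predecessor, so it is paired with itself
    (an assumed convention, not taken from the paper).
    """
    inputs = []
    for t, curr in enumerate(frames):
        prev = frames[t - 1] if t > 0 else curr
        inputs.append(stack_pair(prev, curr))
    return inputs


def make_frame(height: int, width: int, value: float) -> Frame:
    """Build a constant-valued RGB frame for the demo."""
    return [[[value] * 3 for _ in range(width)] for _ in range(height)]


# Three tiny 2x2 RGB frames with increasing brightness.
video = [make_frame(2, 2, v) for v in (0.0, 0.5, 1.0)]
inputs = two_frame_inputs(video)
```

In this sketch a stationary pixel has identical values in both halves of its six channels, while a moving one does not, which is the kind of motion cue a detector could learn to exploit.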
Findings
Detection mAP@0.5 improved from 0.922 to 0.933 with two-frame input
Motion cues help detect obscured targets better than static features
Feature embeddings of the same individual cluster coherently
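As a rough illustration of how tracklet labels can drive contrastive learning, the sketch below computes a supervised-contrastive-style loss in which crops from the same tracklet are positives and all other crops are negatives. The summary does not specify the authors' loss, so this formulation, the function name `tracklet_contrastive_loss`, and the `temperature` default are assumptions for illustration only.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def tracklet_contrastive_loss(
    embeddings: Sequence[Sequence[float]],
    tracklet_ids: Sequence[int],
    temperature: float = 0.1,  # assumed value, not from the paper
) -> float:
    """Average contrastive loss over anchors that have a same-tracklet positive."""
    if len(embeddings) != len(tracklet_ids):
        raise ValueError("one tracklet id is required per embedding")
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [
            j for j in range(n) if j != i and tracklet_ids[j] == tracklet_ids[i]
        ]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other crop.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor crops.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    if anchors == 0:
        raise ValueError("need at least one tracklet with two or more crops")
    return total / anchors


# Two tight clusters in 2-D; labels either agree with the clusters or cut across them.
crops = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_consistent = tracklet_contrastive_loss(crops, [0, 0, 1, 1])
loss_mixed = tracklet_contrastive_loss(crops, [0, 1, 0, 1])
```

The loss is low when embeddings from one tracklet sit close together and far from the others, which matches the clustering behaviour reported in the findings. One caveat of this simplification is that two tracklets of the same individual would be treated as negatives.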
Abstract
In animal facilities, continuous surveillance of penguins is essential yet technically challenging due to their homogeneous visual characteristics, rapid and frequent posture changes, and substantial environmental noise such as water reflections. In this study, we propose a framework that enhances both detection and identification performance by integrating appearance and motion features. For detection, we adapted YOLO11 to process consecutive frames to overcome the lack of temporal consistency in single-frame detectors. This approach leverages motion cues to detect targets even when distinct visual features are obscured. Our evaluation shows that fine-tuning the model with two-frame inputs improves mAP@0.5 from 0.922 to 0.933, outperforming the baseline, and successfully recovers individuals that are indistinguishable in static images. For identification, we introduce a tracklet-based…
Taxonomy
Topics: Animal Vocal Communication and Behavior · Face recognition and analysis · Wildlife Ecology and Conservation
