Zero-Shot Multi-Animal Tracking in the Wild
Jan Frederik Meier, Timo Lüddecke

TL;DR
This paper presents a zero-shot multi-animal tracking framework leveraging vision foundation models, enabling effective tracking across diverse species and environments without retraining or hyperparameter tuning.
Contribution
The paper introduces a zero-shot tracking approach that combines the Grounding DINO detector with the SAM 2 tracker, removing the need for dataset-specific fine-tuning.
Findings
Strong, consistent performance on ChimpAct, Bird Flock Tracking, AnimalTrack, and a subset of GMOT-40
Effective across diverse species and habitats
No retraining required
Abstract
Multi-animal tracking is crucial for understanding animal ecology and behavior. However, it remains a challenging task due to variations in habitat, motion patterns, and species appearance. Traditional approaches typically require extensive model fine-tuning and heuristic design for each application scenario. In this work, we explore the potential of recent vision foundation models for zero-shot multi-animal tracking. By combining a Grounding DINO object detector with the Segment Anything Model 2 (SAM 2) tracker and carefully designed heuristics, we develop a tracking framework that can be applied to new datasets without any retraining or hyperparameter adaptation. Evaluations on ChimpAct, Bird Flock Tracking, AnimalTrack, and a subset of GMOT-40 demonstrate strong and consistent performance across diverse species and environments. The code is available at…
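The abstract does not spell out the heuristics that connect the detector to the tracker. As a rough illustration only, the sketch below shows one common way such a coupling could work: detections that do not overlap any currently tracked box are treated as candidates for new tracks. The function names, the box format, and the 0.5 IoU threshold are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2); this format is an assumption of the sketch.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unmatched_detections(
    detections: List[Box],
    tracked: List[Box],
    iou_threshold: float = 0.5,  # hypothetical value, not from the paper
) -> List[Box]:
    """Return detections that overlap no tracked box above the threshold.

    In a detector-plus-tracker pipeline, these would be the boxes used to
    prompt the tracker to start following a new animal.
    """
    return [
        d for d in detections
        if all(iou(d, t) < iou_threshold for t in tracked)
    ]


if __name__ == "__main__":
    tracked_boxes = [(0.0, 0.0, 10.0, 10.0)]
    detected_boxes = [(1.0, 1.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
    print(unmatched_detections(detected_boxes, tracked_boxes))
```

In this toy setup, the first detection overlaps the existing track heavily and is ignored, while the second is far from any track and would seed a new one. A real pipeline would obtain `detections` from the object detector and `tracked` from the boxes around the tracker's current masks.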
Taxonomy
Topics: Video Surveillance and Tracking Methods · UAV Applications and Optimization · Wildlife Ecology and Conservation
