WildlifeReID-10k: Wildlife re-identification dataset with 10k individual animals
Lukáš Adam, Vojtěch Čermák, Kostas Papafitsoros, Lukas Picek

TL;DR
WildlifeReID-10k is a comprehensive large-scale dataset for animal re-identification, designed to facilitate fair evaluation of models across diverse species with robust protocols to prevent data leakage.
Contribution
The paper introduces a large-scale wildlife re-identification dataset together with a time-aware and similarity-aware split protocol intended to make evaluation fairer and more robust.
Findings
Provides over 140k images of 10k+ animals across around 33 species, re-sampled from 37 existing datasets.
Includes strong baseline models for evaluation.
Ensures fair comparison through time-aware and similarity-aware splits.
Abstract
This paper introduces WildlifeReID-10k, a new large-scale re-identification benchmark with more than 10k animal identities of around 33 species across more than 140k images, re-sampled from 37 existing datasets. WildlifeReID-10k covers diverse animal species and poses significant challenges for SoTA methods, ensuring fair and robust evaluation through its time-aware and similarity-aware split protocol. The latter is designed to address the common issue of training-to-test data leakage caused by visually similar images appearing in both training and test sets. The WildlifeReID-10k dataset and benchmark are publicly available on Kaggle, along with strong baselines for both closed-set and open-set evaluation, enabling fair, transparent, and standardized evaluation of not just multi-species animal re-identification models.
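The abstract does not spell out how the similarity-aware split is computed, so the following is only a minimal sketch of the general idea, not the authors' protocol. It assumes each image already has a feature vector, groups images whose cosine similarity exceeds a threshold, and then assigns whole groups to either train or test so that near-duplicates never end up on both sides. The function names, the `threshold` value, and the `test_fraction` parameter are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_groups(features, threshold=0.9):
    """Group image indices whose similarity is >= threshold (union-find).

    The threshold is an illustrative assumption, not a value from the paper.
    """
    parent = list(range(len(features)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine(features[i], features[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(features)):
        groups[find(i)].append(i)
    return list(groups.values())


def split_by_groups(groups, test_fraction=0.2):
    """Assign whole groups to test until about test_fraction of images is used."""
    total = sum(len(g) for g in groups)
    train, test = [], []
    for group in sorted(groups, key=len):  # smallest groups first
        if len(test) + len(group) <= test_fraction * total:
            test.extend(group)
        else:
            train.extend(group)
    return sorted(train), sorted(test)


# Toy features: images 0/1 and 2/3 are near-duplicates, image 4 stands alone.
features = [(1.0, 0.0), (0.99, 0.05), (0.0, 1.0), (0.05, 0.99), (1.0, 1.0)]
groups = similarity_groups(features)
train_idx, test_idx = split_by_groups(groups, test_fraction=0.4)
```

In this toy run the two near-duplicate pairs stay together on one side of the split, which is the property that prevents training-to-test leakage. A time-aware variant would group images by capture date rather than by visual similarity; it is not shown here.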
Taxonomy
Topics: Wildlife Ecology and Conservation · Environmental DNA in Biodiversity Studies · Species Distribution and Climate Change
