Sifting through the haystack -- efficiently finding rare animal behaviors in large-scale datasets
Shir Bar, Or Hirschorn, Roi Holzman, Shai Avidan

TL;DR
This paper introduces an efficient pipeline for detecting and sampling rare animal behaviors from large-scale unlabeled video datasets, significantly reducing manual annotation effort and improving classifier training.
Contribution
It adapts a graph-based anomaly detection model to animal behavior data, enabling focused labeling of rare behaviors without prior assumptions about their nature.
Findings
Outperforms random sampling by 70% on average.
Creates effective datasets with behaviors as rare as 0.02%.
Reduces annotation effort by half even when behaviors are not rare.
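To put the 0.02% figure in perspective, here is a back-of-the-envelope sketch. It is not taken from the paper and assumes clips are drawn uniformly at random and independently. Under that assumption, the number of clips an annotator reviews per rare example follows a geometric distribution with mean 1/p. The function name is hypothetical.

```python
def expected_reviews_per_hit(prevalence: float) -> float:
    """Expected number of uniformly sampled clips reviewed per rare example.

    Assumes independent uniform sampling, so the count is geometric with mean 1/p.
    """
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    return 1.0 / prevalence


# 0.02% prevalence, written as a fraction
reviews_needed = expected_reviews_per_hit(0.0002)
print(f"about {reviews_needed:.0f} clips per rare example")
```

At this prevalence, random sampling costs roughly 5000 reviewed clips per rare example, which is the baseline any smarter sampling scheme has to beat.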
Abstract
In the study of animal behavior, researchers often record long continuous videos, accumulating into large-scale datasets. However, the behaviors of interest are often rare compared to routine behaviors. This incurs a heavy cost on manual annotation, forcing users to sift through many samples before finding their needles. We propose a pipeline to efficiently sample rare behaviors from large datasets, enabling the creation of training datasets for rare behavior classifiers. Our method only needs an unlabeled animal pose or acceleration dataset as input and makes no assumptions regarding the type, number, or characteristics of the rare behaviors. Our pipeline is based on a recent graph-based anomaly detection model for human behavior, which we apply to this new data domain. It leverages anomaly scores to automatically label normal samples while directing human annotation efforts toward…
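The idea of using anomaly scores to split the workload can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract is truncated here and does not state how the split is made, so the fixed review budget, the function name, and the example scores are all assumptions.

```python
from typing import List, Sequence, Tuple


def split_by_anomaly_score(
    scores: Sequence[float], review_budget: int
) -> Tuple[List[int], List[int]]:
    """Split sample indices into (send to human review, auto-label as normal).

    Hypothetical rule: the `review_budget` highest-scoring samples go to
    annotators; everything else is provisionally labeled normal.
    """
    if review_budget < 0:
        raise ValueError("review_budget must be non-negative")
    # Rank indices from most to least anomalous.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    to_review = order[:review_budget]
    auto_normal = sorted(order[review_budget:])
    return to_review, auto_normal


# Made-up anomaly scores for five clips (higher = more anomalous).
scores = [0.1, 0.9, 0.2, 0.8, 0.05]
to_review, auto_normal = split_by_anomaly_score(scores, review_budget=2)
print(to_review)    # [1, 3]
print(auto_normal)  # [0, 2, 4]
```

A fixed budget is only one possible rule; a score threshold or a percentile cut-off would fit the same description.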
Taxonomy
Topics: Genetic and phenotypic traits in livestock
