TL;DR
This paper presents a new dataset, model, and method for accurately estimating the pose and shape of birds from a single camera view, addressing occlusion challenges in social animal studies.
Contribution
It introduces a shape and pose model for live birds fitted with multi-view optimization, a single-view reconstruction pipeline, and a dataset of multi-view keypoint and mask annotations for a group of social birds.
Findings
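The pipeline recovers accurate bird pose and shape from a single view.
The new dataset provides multi-view keypoint and mask annotations for 15 social birds housed together in an outdoor aviary.
Regression methods for keypoints, masks, pose, and shape are developed and evaluated experimentally.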
Abstract
Automated capture of animal pose is transforming how we study neuroscience and social behavior. Movements carry important social cues, but current methods are not able to robustly estimate pose and shape of animals, particularly for social animals such as birds, which are often occluded by each other and objects in the environment. To address this problem, we first introduce a model and multi-view optimization approach, which we use to capture the unique shape and pose space displayed by live birds. We then introduce a pipeline and experiments for keypoint, mask, pose, and shape regression that recovers accurate avian postures from single views. Finally, we provide extensive multi-view keypoint and mask annotations collected from a group of 15 social birds housed together in an outdoor aviary. The project website with videos, results, code, mesh model, and the Penn Aviary Dataset can be…
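The abstract mentions a multi-view optimization approach but does not state its objective. As a rough illustration only, a common ingredient of such approaches is a keypoint reprojection error summed over camera views, with occluded keypoints skipped. The sketch below assumes a simplified pinhole camera with translation only; the function names, the camera dictionary layout, and the loss itself are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def project(point: Point3, focal: float, center: Point2, offset: Point3) -> Point2:
    """Pinhole projection for a camera translated by `offset` (no rotation, an assumption)."""
    x = point[0] - offset[0]
    y = point[1] - offset[1]
    z = point[2] - offset[2]
    return (focal * x / z + center[0], focal * y / z + center[1])


def multiview_keypoint_loss(
    keypoints_3d: List[Point3],
    observations: List[List[Optional[Point2]]],
    cameras: List[Dict],
) -> float:
    """Mean squared reprojection error over all visible keypoints in all views.

    observations[v][k] is the 2D annotation of keypoint k in view v,
    or None when that keypoint is occluded in that view.
    """
    total, count = 0.0, 0
    for cam, view_obs in zip(cameras, observations):
        for kp, uv in zip(keypoints_3d, view_obs):
            if uv is None:  # occluded in this view: contributes nothing
                continue
            u, v = project(kp, cam["focal"], cam["center"], cam["offset"])
            total += (u - uv[0]) ** 2 + (v - uv[1]) ** 2
            count += 1
    return total / count if count else 0.0


# Hypothetical two-camera setup with one keypoint occluded in the second view.
cameras = [
    {"focal": 100.0, "center": (50.0, 50.0), "offset": (0.0, 0.0, 0.0)},
    {"focal": 100.0, "center": (50.0, 50.0), "offset": (1.0, 0.0, 0.0)},
]
keypoints = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
observations = [
    [(50.0, 50.0), (100.0, 50.0)],
    [(0.0, 50.0), None],
]
loss = multiview_keypoint_loss(keypoints, observations, cameras)
```

Because occluded keypoints are simply dropped from the sum, a keypoint hidden in one view can still be constrained by the views where it is visible, which is the general motivation for multi-view fitting in cluttered scenes.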
