AniMer+: Unified Pose and Shape Estimation Across Mammalia and Aves via Family-Aware Transformer
Liang An, Jin Lyu, Li Lin, Pujin Cheng, Yebin Liu, Xiaoying Tang

TL;DR
AniMer+ is a unified, high-capacity transformer framework that accurately estimates pose and shape across mammals and birds, leveraging synthetic datasets and a family-aware design for improved biological analysis.
Contribution
Introduces AniMer+, a novel family-aware Vision Transformer with Mixture-of-Experts, and large-scale synthetic datasets for multi-species animal pose and shape estimation.
Findings
Outperforms existing methods on multiple benchmarks.
Effective in zero-shot and out-of-domain scenarios.
Synthetic datasets significantly improve real-world performance.
Abstract
In the era of foundation models, achieving a unified understanding of different dynamic objects through a single network has the potential to empower stronger spatial intelligence. Moreover, accurate estimation of animal pose and shape across diverse species is essential for quantitative analysis in biological research. However, this topic remains underexplored due to the limited network capacity of previous methods and the scarcity of comprehensive multi-species datasets. To address these limitations, we introduce AniMer+, an extended version of our scalable AniMer framework. In this paper, we focus on a unified approach for reconstructing mammals (Mammalia) and birds (Aves). A key innovation of AniMer+ is its high-capacity, family-aware Vision Transformer (ViT) incorporating a Mixture-of-Experts (MoE) design. Its architecture partitions network layers into taxa-specific components…
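As a rough illustration of what "taxa-specific components" could look like, the sketch below routes a token through a feed-forward expert chosen by its family label. This is not the authors' implementation: the class name `FamilyRoutedFFN`, the hard routing by a known label, the layer sizes, and the random weights are all assumptions made for illustration, and the actual AniMer+ design may combine shared and taxa-specific parts or use learned gating.

```python
import random

# The two taxa named in the abstract; the lowercase keys are illustrative.
FAMILIES = ("mammalia", "aves")


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def relu(vec):
    return [max(0.0, x) for x in vec]


class FamilyRoutedFFN:
    """Hypothetical feed-forward block holding one expert per taxon."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # Each expert is a pair of weight matrices (input and output projection).
        self.experts = {
            fam: (make_matrix(hidden, dim, rng), make_matrix(dim, hidden, rng))
            for fam in FAMILIES
        }

    def __call__(self, token, family):
        if family not in self.experts:
            raise ValueError(f"unknown family: {family!r}")
        w_in, w_out = self.experts[family]
        update = matvec(w_out, relu(matvec(w_in, token)))
        # Residual connection, as in a standard transformer block.
        return [t + u for t, u in zip(token, update)]


ffn = FamilyRoutedFFN(dim=4, hidden=8)
token = [1.0, -0.5, 0.25, 2.0]
out_mammal = ffn(token, "mammalia")  # uses the mammal expert only
out_bird = ffn(token, "aves")        # same input, bird expert
```

The same input token yields different outputs depending on the family label, which is the basic property a family-aware layer needs; components shared across taxa (for example attention) would sit alongside such a block.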
Taxonomy
Topics: Face recognition and analysis · Cell Image Analysis Techniques · Advanced Neural Network Applications
