Towards Multi-Modal Animal Pose Estimation: A Survey and In-Depth Analysis
Qianyi Deng, Oishi Deb, Amir Patel, Christian Rupprecht, Philip Torr, Niki Trigoni, and Andrew Markham

TL;DR
This paper surveys multi-modal animal pose estimation, analyzing 176 studies to identify trends, challenges, and future directions across various sensor modalities, and discusses how innovations can benefit both animal and human pose estimation fields.
Contribution
It provides a comprehensive categorization and analysis of multi-modal APE methods, datasets, and evaluation metrics, highlighting current challenges and future research directions.
Findings
Multi-modal APE draws on diverse sensors such as RGB cameras, LiDAR, and IMUs.
Current trends show a shift towards multi-sensor integration.
Key challenges include data diversity and modality fusion.
Abstract
Animal pose estimation (APE) aims to locate animal body parts using a diverse array of sensor and modality inputs (e.g. RGB cameras, LiDAR, infrared, IMU, acoustic and language cues), which is crucial for research across neuroscience, biomechanics, and veterinary medicine. Drawing on 176 papers published since 2011, this survey categorises APE methods by their input sensor and modality types, output forms, learning paradigms, experimental setup, and application domains, and presents detailed analyses of current trends, challenges, and future directions in single- and multi-modality APE systems. The analysis also highlights the transition between human and animal pose estimation, and how innovations in APE can reciprocally enrich human pose estimation and the broader machine learning paradigm. Additionally, 2D and 3D APE datasets and evaluation metrics based on different sensors and modalities are…
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Human Motion and Animation
