BigMaQ: A Big Macaque Motion and Animation Dataset Bridging Image and 3D Pose Representations
Lucas Martini, Alexander Lappe, Anna Bognár, Rufin Vogels, Martin A. Giese

TL;DR
BigMaQ is a comprehensive dataset of rhesus macaque videos with detailed 3D pose and shape annotations, enabling improved animal behavior recognition and bridging the gap between 2D image data and 3D motion understanding.
Contribution
The paper introduces the first large-scale dataset with integrated 3D pose-shape representations for non-human primates, enhancing action recognition accuracy and advancing ethology research.
Findings
Pose features improve action recognition performance.
The dataset enables more accurate 3D animal motion analysis.
Code and data are publicly available for further research.
Abstract
The recognition of dynamic and social behavior in animals is fundamental for advancing ethology, ecology, medicine and neuroscience. Recent progress in deep learning has enabled automated behavior recognition from video, yet an accurate reconstruction of the three-dimensional (3D) pose and shape has not been integrated into this process. Especially for non-human primates, mesh-based tracking efforts lag behind those for other species, leaving pose descriptions restricted to sparse keypoints that are unable to fully capture the richness of action dynamics. To address this gap, we introduce the Big Macaque 3D Motion and Animation Dataset (BigMaQ), a large-scale dataset comprising more than 750 scenes of interacting rhesus macaques with detailed 3D pose descriptions. Extending previous surface-based animal tracking methods, we construct subject-specific…
Peer Reviews
Decision·ICLR 2026 Poster
Impactful dataset: fills an important gap for non-human primate research by linking dense 3D surface reconstructions with behavior labels. Practical pipeline: believable and reproducible-in-principle mesh-fitting and rendering pipeline that scales to hundreds of scenes. Demonstrated utility: empirical evidence that 3D pose/shape descriptors can improve action recognition across multiple visual backbones. Rich annotation set: identities, segmentation masks, 2D keypoints, calibrated views, per-
Ethical / animal welfare documentation (major). The recordings come from a neuroscientific laboratory and involve captive macaques. I cannot find any explicit statement of oversight, animal-use committee (e.g., IACUC or institutional equivalent) approval, or details of welfare protocols. The paper says recordings were “not induced by humans except feeding,” but ethical approval and animal care protocols must be stated explicitly. Statistical significance and variance. Report standard deviation
### **Strengths**
* The paper introduces a large-scale dataset that integrates detailed 3D pose–shape representations with action labels for macaques, addressing an important gap in current non-human primate research.
* The proposed annotation and reconstruction pipeline is innovative, combining markerless motion capture, subject-specific mesh tracking, and photometric texturing in a scalable way.
* The BigMac500 benchmark is a valuable contribution, demonstrating quantifiable improvements in a
### **Main Weaknesses**
The paper currently lacks sufficient detail about the dataset annotation pipeline. A clear, visual overview of the entire annotation process would be highly valuable, ideally presented as a dedicated figure (either in the main paper or, at minimum, in the appendix). Secondly, the loss functions introduced for optimizing the annotations are not supported by ablation studies. Without these, it remains unclear which losses meaningfully contribute to the final annotation qua
1. The dataset is unprecedented in scale and annotation richness for macaques, combining high-fidelity 3D surface tracking with ethogram-based behavioral labels across multi-individual interactions.
2. The work builds on recent multi-view surface tracking advances (MAMMAL, AniMer+) but contributes practical improvements, such as temporal regularization, cropped differential rendering, and texture fitting, to enable efficient processing of thousands of sequences.
3. Comparisons against MAMMAL and An
1. The paper omits several critical details necessary for reproducibility, such as frame selection criteria, train/test split definitions, and subject-level separation (e.g., whether BigMac500 avoids identity leakage across splits). Although the authors describe a 70/10/20 split for the BigMac500 action recognition subset, this rule applies only to that benchmark, not the full BigMac3D dataset. There is no explicit statement on whether subjects are disjoint across splits. Furthermore, while the d
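The identity-leakage concern raised in the last review can be made concrete with a small sketch. The code below shows one possible way to build a roughly 70/10/20 split in which every subject's clips fall into exactly one partition. It is an illustration only, not the authors' procedure: the function name `subject_disjoint_split`, the `clip_subjects` mapping, and the assumption that each clip is attributed to a single subject are all hypothetical. Scenes with several interacting animals would need to be grouped by the full set of individuals they contain.

```python
import random
from collections import defaultdict


def subject_disjoint_split(clip_subjects, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split clips into train/val/test so that no subject spans two splits.

    clip_subjects: dict mapping a clip id to a single subject id
                   (hypothetical layout, assumed for this sketch).
    fractions:     target share of clips for train, val and test.
    """
    # Group clips by the subject they show.
    by_subject = defaultdict(list)
    for clip, subject in clip_subjects.items():
        by_subject[subject].append(clip)

    # Shuffle subjects reproducibly; sorting first makes the order
    # independent of dict insertion order.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    names = ["train", "val", "test"]
    targets = [f * len(clip_subjects) for f in fractions]
    counts = [0, 0, 0]
    splits = {name: [] for name in names}

    # Greedy assignment: each subject goes, as a whole, to the split that
    # is currently furthest below its target clip count.
    for subject in subjects:
        i = max(range(3), key=lambda k: targets[k] - counts[k])
        splits[names[i]].extend(by_subject[subject])
        counts[i] += len(by_subject[subject])
    return splits
```

Because whole subjects are assigned at once, the realized proportions only approximate 70/10/20 when subjects contribute unequal numbers of clips. That trade-off is the price of guaranteeing that no individual appears in more than one split.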
Taxonomy
Topics: Human Pose and Action Recognition · Zebrafish Biomedical Research Applications · Human Motion and Animation
