QuaMo: Quaternion Motions for Vision-based 3D Human Kinematics Capture
Cuong Le, Pavlo Melnyk, Urs Waldmann, Mårten Wadenbäck, and Bastian Wandt

TL;DR
QuaMo introduces a quaternion differential equations approach for vision-based 3D human motion capture, effectively addressing discontinuities and improving accuracy over traditional methods, especially in online settings.
Contribution
The paper proposes a novel quaternion-based method using QDE and a state-space model with acceleration enhancement for more stable and accurate 3D human kinematics estimation.
Findings
Outperforms state-of-the-art methods on multiple datasets.
Produces continuous, stable motion reconstructions without discontinuities.
Accurately estimates 3D human kinematics in real-time.
Abstract
Vision-based 3D human motion capture from videos remains a challenge in computer vision. Traditional 3D pose estimation approaches often ignore the temporal consistency between frames, causing implausible and jittery motion. The emerging field of kinematics-based 3D motion capture addresses these issues by estimating the temporal transitioning between poses instead. A major drawback in current kinematics approaches is their reliance on Euler angles. Despite their simplicity, Euler angles suffer from discontinuity that leads to unstable motion reconstructions, especially in online settings where trajectory refinement is unavailable. Contrarily, quaternions have no discontinuity and can produce continuous transitions between poses. In this paper, we propose QuaMo, a novel Quaternion Motions method using quaternion differential equations (QDE) for human kinematics capture. We utilize the…
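The discontinuity the abstract refers to can be illustrated with a minimal sketch, assuming a single-axis rotation about z and NumPy; the function names `wrapped_yaw` and `yaw_to_quat` are hypothetical and are not taken from the paper. As the rotation angle sweeps smoothly past π, the reported Euler angle jumps by almost 2π between consecutive frames, while the corresponding unit quaternion moves only a small step.

```python
import numpy as np

def yaw_to_quat(theta):
    """Unit quaternion (w, x, y, z) for a rotation of theta radians about z."""
    return np.array([np.cos(theta / 2), 0.0, 0.0, np.sin(theta / 2)])

def wrapped_yaw(theta):
    """Yaw as an Euler-angle extractor would report it, wrapped to [-pi, pi]."""
    return np.arctan2(np.sin(theta), np.cos(theta))

# A smooth sweep of the true rotation angle that crosses pi (step 0.01 rad).
thetas = np.linspace(3.0, 3.3, 31)

yaw = wrapped_yaw(thetas)
quats = np.stack([yaw_to_quat(t) for t in thetas])

# Largest frame-to-frame change in each representation.
max_yaw_jump = np.max(np.abs(np.diff(yaw)))
max_quat_jump = np.max(np.linalg.norm(np.diff(quats, axis=0), axis=1))

print(f"largest Euler (yaw) jump:  {max_yaw_jump:.3f} rad")
print(f"largest quaternion change: {max_quat_jump:.4f}")
```

Note that q and -q encode the same rotation, so a practical pipeline still has to keep the quaternion sign consistent between frames; this sketch avoids the issue by generating the quaternions from a continuous angle.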
Peer Reviews
Decision · ICLR 2026 Poster
* The paper targets the challenging problem of 3D human motion capture from visual input, which is of great importance to industry applications.
* The idea of using quaternion differential equations for human kinematics capture is interesting.
* It reports reasonable results on several benchmarks, including Human3.6M, Fit3D, SportsPose, and AIST.
* It should include recent references published in the last two years. Currently, there is no reference published in 2025.
* For the experiments, several related works are not compared. For example, [R1] is referenced in the paper but not compared in Table 1. Judging by the results reported in [R1], it would obviously have better results than the proposed algorithm. Please include more papers published in the last two years (2024-2025) for comparison. [R1] Jih
Clear articulation of the discontinuity/gimbal lock problem and why Euler angles are problematic in online capture. Correct quaternion-based formulation with integration respecting the unit-sphere constraint. Adaptive acceleration term adds responsiveness to the PD controller, helping in fast motion regimes. Solid empirical evaluation across multiple datasets, and thorough ablation on rotation representations and pipeline components.
Novelty: Quaternions for rotation representation are standard in robotics, graphics, and physics simulation. The specific QDE formulation and its integration here are sound, but not a fundamentally new concept. No cost analysis: The paper doesn’t compare training/inference speed, memory usage, or computational overhead with Euler/axis-angle setups. The practical trade-off is unclear. Input dependency: Performance varies greatly depending on upstream reference pose quality (TRACE vs HMR2.0). Th
- A key strength of QuaMo is its fully online state-space formulation, which does not rely on future observations, enabling real-time refinement of off-the-shelf 3D pose estimators through iterative updates of angular velocity and quaternion states.
- Unlike integration approximation methods, which risk moving the estimated quaternion outside the unit sphere $S^3$ and therefore require normalization to mitigate this issue, this work uses the Hamilton product between the quaternion representation
- As shown in Table 3, although the proposed acceleration term $\alpha$ improves the joint-based errors, it increases the acceleration error. It would be useful to clarify when to disable/attenuate the term to prevent the rise of acceleration error.
- Although the addition of Euler integration for root translation shows an improvement in GRE, it is a first-order numerical integration method, and I suspect that small errors could accumulate over time. A figure or analysis of translation error ver
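The review comment above about Hamilton-product integration versus approximate integration can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a constant body-frame angular velocity, the (w, x, y, z) component order, and the standard closed-form update q ⊗ exp(Δt·ω/2). The names `hamilton`, `step_additive`, `step_exponential`, and the values of `omega` and `dt` are hypothetical.

```python
import numpy as np

def hamilton(p, q):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])

def step_additive(q, omega, dt):
    """First-order step: q + dt * 0.5 * q * (0, omega). Drifts off the unit sphere."""
    return q + 0.5 * dt * hamilton(q, np.array([0.0, *omega]))

def step_exponential(q, omega, dt):
    """Closed-form step: q * exp(0.5 * dt * omega). Multiplies by a unit quaternion."""
    speed = np.linalg.norm(omega)
    angle = speed * dt
    if angle < 1e-12:
        return q.copy()
    axis = omega / speed
    dq = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])
    return hamilton(q, dq)

omega = np.array([0.5, -1.0, 2.0])  # hypothetical angular velocity, rad/s
dt = 1.0 / 50.0                     # hypothetical frame interval, s

q_add = np.array([1.0, 0.0, 0.0, 0.0])
q_exp = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(500):
    q_add = step_additive(q_add, omega, dt)
    q_exp = step_exponential(q_exp, omega, dt)

norm_add = np.linalg.norm(q_add)
norm_exp = np.linalg.norm(q_exp)
print(f"additive step norm after 500 frames:    {norm_add:.4f}")
print(f"exponential step norm after 500 frames: {norm_exp:.12f}")
```

In this sketch the additive update grows the quaternion norm a little at every frame, so it needs renormalization, whereas the product with a unit quaternion keeps the state on $S^3$ up to floating-point rounding.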
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Hand Gesture Recognition Systems
