EgoPoseVR: Spatiotemporal Multi-Modal Reasoning for Egocentric Full-Body Pose in Virtual Reality
Haojie Cheng, Shaun Jing Heng Ong, Shaoyu Cai, Aiden Tat Yang Koh, Fuxi Ouyang, and Eng Tat Khoo

TL;DR
EgoPoseVR is a framework that combines multi-modal data with kinematic constraints to deliver accurate, stable, real-time full-body pose estimation in virtual reality, enhancing user embodiment and interaction.
Contribution
It introduces an end-to-end spatiotemporal fusion approach with a large synthetic dataset for training and evaluation of egocentric full-body pose estimation in VR.
Findings
Outperforms existing egocentric pose estimation models.
Achieves higher accuracy and stability in real-world VR scenes.
User study shows improved embodiment and intention recognition.
Abstract
Immersive virtual reality (VR) applications demand accurate, temporally coherent full-body pose tracking. Recent head-mounted camera-based approaches show promise in egocentric pose estimation, but encounter challenges when applied to VR head-mounted displays (HMDs), including temporal instability, inaccurate lower-body estimation, and the lack of real-time performance. To address these limitations, we present EgoPoseVR, an end-to-end framework for accurate egocentric full-body pose estimation in VR that integrates headset motion cues with egocentric RGB-D observations through a dual-modality fusion pipeline. A spatiotemporal encoder extracts frame- and joint-level representations, which are fused via cross-attention to fully exploit complementary motion cues across modalities. A kinematic optimization module then imposes constraints from HMD signals, enhancing the accuracy and…
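To make the cross-attention fusion step in the abstract concrete, here is a minimal single-head sketch in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the token counts, the feature width, the choice of headset-motion tokens as queries, the residual connection, and all names (`cross_attention`, `hmd_tokens`, `rgbd_tokens`) are hypothetical, since the abstract does not specify them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    queries: (n_q, d) tokens from one modality
    context: (n_c, d) tokens from the other modality
    Returns the attended features (n_q, d) and the attention weights (n_q, n_c).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
d_model = 16  # assumed feature width

# Hypothetical encoder outputs: 8 frame-level headset-motion tokens and
# 22 joint-level tokens derived from egocentric RGB-D input.
hmd_tokens = rng.normal(size=(8, d_model))
rgbd_tokens = rng.normal(size=(22, d_model))

# Random projection matrices stand in for learned weights.
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * d_model ** -0.5
                 for _ in range(3))

attended, weights = cross_attention(hmd_tokens, rgbd_tokens, w_q, w_k, w_v)
fused = hmd_tokens + attended  # residual connection (an assumption)
```

Each row of `weights` is a probability distribution over the RGB-D tokens, so every headset-motion token receives a weighted mixture of image-derived features. A symmetric pass with the roles swapped would let the RGB-D tokens attend to headset motion in the same way.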
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Virtual Reality Applications and Impacts
