TL;DR
This paper introduces DT-Pose, a two-phase framework for WiFi-based human pose estimation that addresses domain discrepancies and structural inaccuracies, achieving superior results on benchmark datasets.
Contribution
It proposes a novel self-supervised learning strategy and a topology-constrained decoder to improve robustness and realism in WiFi-based human pose estimation.
Findings
Outperforms existing methods on benchmark datasets.
Effectively mitigates domain gap and structural fidelity issues.
Produces more realistic and accurate human skeleton predictions.
Abstract
Robust WiFi-based human pose estimation (HPE) is a challenging task that bridges discrete and subtle WiFi signals to human skeletons. We revisit this problem and reveal two critical yet overlooked issues: 1) a cross-domain gap, i.e., significant discrepancies in pose distributions between source and target domains; and 2) a structural fidelity gap, i.e., predicted skeletal poses exhibit distorted topology, usually with misplaced joints and disproportionate bone lengths. This paper fills these gaps by reformulating the task into a novel two-phase framework dubbed DT-Pose: Domain-consistent representation learning and Topology-constrained Pose decoding. Concretely, we first propose a temporal consistency contrastive learning strategy with uniformity regularization, integrated into a self-supervised masked pretraining paradigm. This design facilitates robust learning of…
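To make the "disproportionate bone lengths" symptom of the structural fidelity gap concrete, the following is a minimal sketch of how bone-length distortion between a predicted and a reference skeleton could be measured. It is not the paper's decoder or loss. The 5-joint skeleton, the `BONES` list, and the function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy skeleton: pairs of joint indices that form a bone.
# This layout is an assumption for illustration, not the paper's skeleton.
BONES = [(0, 1), (1, 2), (1, 3), (1, 4)]


def bone_lengths(joints, bones=BONES):
    """Return the Euclidean length of each bone for a list of (x, y, z) joints."""
    return [math.dist(joints[a], joints[b]) for a, b in bones]


def bone_length_error(pred, ref, bones=BONES):
    """Mean absolute difference between predicted and reference bone lengths."""
    pred_len = bone_lengths(pred, bones)
    ref_len = bone_lengths(ref, bones)
    return sum(abs(p - r) for p, r in zip(pred_len, ref_len)) / len(bones)


# Reference pose: every bone has unit length.
ref_pose = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (1, 1, 0), (-1, 1, 0)]
# Predicted pose: joint 2 is misplaced, stretching bone (1, 2) to length 2.
pred_pose = [(0, 0, 0), (0, 1, 0), (0, 3, 0), (1, 1, 0), (-1, 1, 0)]

error = bone_length_error(pred_pose, ref_pose)
```

A per-joint position error alone can miss this kind of distortion, which is why a separate bone-length measure is a common way to check whether a predicted skeleton is anatomically plausible.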
