NanoHTNet: Nano Human Topology Network for Efficient 3D Human Pose Estimation
Jialun Cai, Mengyuan Liu, Hong Liu, Shuheng Zhou, Wenhao Li

TL;DR
NanoHTNet is a compact 3D human pose estimation network that leverages explicit and implicit human body priors, achieving high efficiency suitable for resource-limited edge devices through innovative model design and pre-training techniques.
Contribution
The paper introduces NanoHTNet, a tiny 3D human pose estimation (HPE) network built from hierarchical mixers and ETST, together with PoseCLR pre-training, to improve efficiency and performance on edge devices.
Findings
NanoHTNet outperforms state-of-the-art methods in efficiency.
PoseCLR enhances the implicit understanding of human topology.
NanoHTNet is suitable for deployment on resource-constrained devices.
Abstract
The widespread application of 3D human pose estimation (HPE) is limited by resource-constrained edge devices, requiring more efficient models. A key approach to enhancing efficiency involves designing networks based on the structural characteristics of input data. However, effectively utilizing the structural priors in human skeletal inputs remains challenging. To address this, we leverage both explicit and implicit spatio-temporal priors of the human body through innovative model design and a pre-training proxy task. First, we propose a Nano Human Topology Network (NanoHTNet), a tiny 3D HPE network with stacked Hierarchical Mixers to capture explicit features. Specifically, the spatial Hierarchical Mixer efficiently learns the human physical topology across multiple semantic levels, while the temporal Hierarchical Mixer with discrete cosine transform and low-pass filtering captures…
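The abstract mentions a discrete cosine transform combined with low-pass filtering along the temporal axis. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below applies an orthonormal DCT-II over the frames of a pose sequence, zeroes the high-frequency coefficients, and inverts the transform. The array layout `(T, J, C)`, the function names, and the `keep` parameter are assumptions made for this example.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix; each row is one frequency."""
    k = np.arange(n)[:, None]  # frequency index
    t = np.arange(n)[None, :]  # time (frame) index
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)   # rescale the DC row so that basis @ basis.T == I
    return basis


def lowpass_pose_sequence(poses, keep):
    """Keep only the `keep` lowest temporal DCT coefficients of a pose sequence.

    poses: array of shape (T, J, C) = (frames, joints, coordinates).
           This layout is an assumption for the sketch.
    keep:  number of low-frequency coefficients retained (hypothetical parameter).
    """
    num_frames = poses.shape[0]
    basis = dct_matrix(num_frames)
    flat = poses.reshape(num_frames, -1)  # (T, J*C): one column per joint coordinate
    coeffs = basis @ flat                 # forward DCT along the time axis
    coeffs[keep:] = 0.0                   # low-pass: discard high-frequency terms
    smooth = basis.T @ coeffs             # inverse DCT (transpose of an orthogonal matrix)
    return smooth.reshape(poses.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a 2D keypoint sequence: 27 frames, 17 joints, (x, y).
    clean = np.linspace(0.0, 1.0, 27)[:, None, None] * np.ones((27, 17, 2))
    noisy = clean + 0.05 * rng.standard_normal(clean.shape)
    filtered = lowpass_pose_sequence(noisy, keep=6)
    print("mean abs error before filtering:", np.abs(noisy - clean).mean())
    print("mean abs error after filtering: ", np.abs(filtered - clean).mean())
```

Because the basis is orthonormal, setting `keep` equal to the number of frames reproduces the input exactly, and smaller values trade temporal detail for smoothness. The frame and joint counts in the demo are illustrative and not taken from the paper.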
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Gait Recognition and Analysis
Methods: Discrete Cosine Transform · Contrastive Learning
