AugLift: Depth-Aware Input Reparameterization Improves Domain Generalization in 2D-to-3D Pose Lifting
Nikolai Warner, Wenjin Zhang, Hamid Badiozamani, Irfan Essa, Apaar Sadhwani

TL;DR
AugLift enhances 2D-to-3D human pose lifting by reparameterizing the input with depth-aware geometric descriptors, significantly improving cross-dataset generalization without architectural changes beyond a wider input layer.
Contribution
AugLift introduces an input representation that replaces raw 2D coordinates with a 6D geometric descriptor built from monocular depth, and that can be integrated with existing lifting models to improve domain generalization.
Findings
Reduces cross-dataset MPJPE by an average of 10.1%.
Achieves state-of-the-art cross-dataset performance in 3D pose estimation.
Improves in-distribution accuracy by 4.0%.
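For readers unfamiliar with the metric, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joints. The sketch below shows how such an error and a percentage reduction could be computed; it assumes the percentages above are relative reductions, and the function names `mpjpe` and `relative_reduction` are illustrative, not taken from the paper.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and true joints.

    pred, gt: arrays of shape (..., J, 3) in the same units (e.g. mm).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Per-joint distance along the coordinate axis, then average everything.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline` (assumed convention)."""
    return 100.0 * (baseline - improved) / baseline
```

Under this convention, a baseline error of 50.0 that drops to 45.0 is a 10% reduction.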
Abstract
Lifting-based 3D human pose estimation infers 3D joints from 2D keypoints but generalizes poorly because coordinates alone are an ill-posed, sparse representation that discards geometric information modern foundation models can recover. We propose AugLift, which changes the representation format of lifting from 2D coordinates to a 6D geometric descriptor via two modules: (1) an Uncertainty-Aware Depth Descriptor (UADD), a compact tuple extracted from a confidence-scaled neighborhood of an off-the-shelf monocular depth map, and (2) a scale normalization component that handles train/test distance shifts. AugLift requires no new sensors, no new data collection, and no architectural changes beyond widening the input layer; because it operates at the representation level, it is composable with any lifting architecture or domain…
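To make the descriptor idea concrete, here is a minimal sketch of how a per-keypoint depth tuple might be read from a depth map and concatenated with the 2D coordinates to form a 6D input. The abstract does not spell out the tuple, so everything specific here is an assumption: the tuple contents (confidence, center depth, neighborhood minimum and maximum), the rule that lower confidence widens the window, and the names `uadd`, `build_lifting_input`, `base_radius`, and `max_extra`. The scale normalization module is omitted.

```python
import numpy as np

def uadd(depth_map, x, y, conf, base_radius=2, max_extra=6):
    """Hypothetical depth descriptor for one keypoint: (conf, d, d_min, d_max)."""
    h, w = depth_map.shape
    conf = float(np.clip(conf, 0.0, 1.0))
    # Assumption: a less confident keypoint gets a larger sampling window.
    radius = base_radius + int(round((1.0 - conf) * max_extra))
    # Clamp the keypoint to a valid pixel.
    cx = int(np.clip(round(x), 0, w - 1))
    cy = int(np.clip(round(y), 0, h - 1))
    # Square neighborhood, cropped at the image border.
    patch = depth_map[max(cy - radius, 0):min(cy + radius + 1, h),
                      max(cx - radius, 0):min(cx + radius + 1, w)]
    return np.array([conf, depth_map[cy, cx], patch.min(), patch.max()],
                    dtype=np.float32)

def build_lifting_input(keypoints_xy, confs, depth_map):
    """Stack (x, y) with the 4-value descriptor into a (J, 6) array."""
    feats = [np.concatenate([[x, y], uadd(depth_map, x, y, c)])
             for (x, y), c in zip(keypoints_xy, confs)]
    return np.stack(feats).astype(np.float32)
```

A lifting network that previously took a (J, 2) array would, under these assumptions, only need its first layer widened to accept (J, 6).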
