TL;DR
TempCLR is a time-coherent contrastive learning method that uses unlabelled videos to improve 3D hand reconstruction. It reaches state-of-the-art PA-V2V on HO-3D and FreiHAND, with smoother and more occlusion-robust reconstructions.
Contribution
The paper introduces a time-coherent contrastive learning framework for 3D hand reconstruction that needs no synthetic data, pseudo-labels, or specialized architectures, and that improves both accuracy and temporal smoothness.
Findings
Improves PA-V2V over fully-supervised hand reconstruction methods by 15.9% on HO-3D and 7.6% on FreiHAND.
Produces smoother and more temporally consistent hand reconstructions.
Is more robust to heavy occlusions than the previous state of the art.
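PA-V2V is the Procrustes-aligned vertex-to-vertex error. The predicted mesh is first aligned to the ground truth with a similarity transform (scale, rotation, translation), and the mean Euclidean distance between corresponding vertices is then reported. The NumPy sketch below illustrates that metric under stated assumptions: the function names `procrustes_align` and `pa_v2v` are hypothetical and not taken from the TempCLR code, and the alignment uses the standard SVD-based (Umeyama-style) solution.

```python
import numpy as np

def procrustes_align(pred, gt):
    """Similarity-align pred (N, 3) onto gt (N, 3).

    Hypothetical helper: solves for scale, rotation and translation
    with the standard SVD-based closed form.
    """
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g          # centre both vertex sets
    H = p.T @ g                            # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T                     # rotation taking p towards g
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * (R @ p.T).T + mu_g

def pa_v2v(pred, gt):
    """Mean per-vertex Euclidean distance after Procrustes alignment."""
    aligned = procrustes_align(pred, gt)
    return np.linalg.norm(aligned - gt, axis=1).mean()
```

Because the alignment removes global scale, rotation, and translation, the metric isolates errors in the shape and articulation of the hand mesh. Lower values are better, since it is an error.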
Abstract
We introduce TempCLR, a new time-coherent contrastive learning approach for the structured regression task of 3D hand reconstruction. Unlike previous time-contrastive methods for hand pose estimation, our framework considers temporal consistency in its augmentation scheme, and accounts for the differences of hand poses along the temporal direction. Our data-driven method leverages unlabelled videos and a standard CNN, without relying on synthetic data, pseudo-labels, or specialized architectures. Our approach improves the performance of fully-supervised hand reconstruction methods by 15.9% and 7.6% in PA-V2V on the HO-3D and FreiHAND datasets respectively, thus establishing new state-of-the-art performance. Finally, we demonstrate that our approach produces smoother hand reconstructions through time, and is more robust to heavy occlusions compared to the previous state-of-the-art which…
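The truncated abstract does not spell out the training objective, so the following is only a generic time-contrastive sketch and not TempCLR's actual loss. It assumes an InfoNCE-style objective in which a frame's embedding is pulled towards the embedding of a temporally nearby frame from the same video and pushed away from embeddings of other clips. The helpers `sample_temporal_pair` and `info_nce`, the `max_gap` window, and the temperature value are all hypothetical choices made for illustration.

```python
import math
import random

def sample_temporal_pair(num_frames, max_gap, rng):
    """Pick an anchor frame and a positive frame at most max_gap frames away.

    Hypothetical sampler: nearby frames are assumed to show similar hand poses.
    """
    t = rng.randrange(num_frames)
    lo, hi = max(0, t - max_gap), min(num_frames - 1, t + max_gap)
    return t, rng.randint(lo, hi)

def cosine(u, v):
    """Cosine similarity between two embedding vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

In a training loop of this kind, the embeddings would come from the CNN encoder applied to augmented frames, and the loss would be averaged over a batch of anchor frames drawn from unlabelled videos.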
Taxonomy
Methods: Contrastive Learning
