Contrastive Representation Learning for Hand Shape Estimation

Christian Zimmermann; Max Argus; Thomas Brox

arXiv:2106.04324 · cs.CV · July 5, 2021

TL;DR

This paper improves monocular hand shape estimation by extending momentum contrastive learning with background removal and multi-view data from a new dataset, HanCo, achieving a 4.7% reduction in mesh error and a 3.6% improvement in F-score over an ImageNet-pretrained baseline.

Contribution

The paper introduces HanCo, a structured collection of hand images, and shows that extending momentum contrastive learning with background removal and multi-view instance pairs improves hand shape estimation.

Findings

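01  4.7% reduction in mesh error compared to an ImageNet-pretrained baseline

02  3.6% improvement in F-score compared to the same baseline

03  A learned representation better suited to hand shape estimation than one obtained from augmentation-only contrastive pairs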
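
The summary reports an F-score improvement without defining the metric. As a rough sketch, assuming the common point-set definition (harmonic mean of precision and recall at a distance threshold), it could be computed as below. The function names, the threshold value, and the toy points are illustrative assumptions and are not taken from the paper.

```python
import math


def nearest_distance(point, points):
    """Euclidean distance from `point` to its closest neighbour in `points`."""
    return min(math.dist(point, other) for other in points)


def f_score(predicted, ground_truth, threshold):
    """F-score between two 3D point sets at a distance threshold.

    Assumed definition (not confirmed by the summary):
      precision = fraction of predicted points within `threshold` of the ground truth
      recall    = fraction of ground-truth points within `threshold` of the prediction
    """
    precision = sum(
        nearest_distance(p, ground_truth) <= threshold for p in predicted
    ) / len(predicted)
    recall = sum(
        nearest_distance(g, predicted) <= threshold for g in ground_truth
    ) / len(ground_truth)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with hypothetical vertices (units and threshold are arbitrary here).
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.001), (5.0, 0.0, 0.0)]
score = f_score(predicted, ground_truth, threshold=0.005)
```

In this toy case one of two predicted points is close to the ground truth and one of two ground-truth points is covered, so precision and recall are both one half.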

Abstract

This work presents improvements in monocular hand shape estimation by building on top of recent advances in unsupervised learning. We extend momentum contrastive learning and contribute a structured collection of hand images, well suited for visual representation learning, which we call HanCo. We find that the representation learned by established contrastive learning methods can be improved significantly by exploiting advanced background removal techniques and multi-view information. These allow us to generate more diverse instance pairs than those obtained by augmentations commonly used in exemplar-based approaches. Our method leads to a more suitable representation for the hand shape estimation task and shows a 4.7% reduction in mesh error and a 3.6% improvement in F-score compared to an ImageNet-pretrained baseline. We make our benchmark dataset publicly available, to encourage…
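
To make the idea of instance pairs concrete, here is a minimal sketch of the two generic ingredients of momentum contrastive learning: an InfoNCE-style loss and a momentum update of the key encoder. It is not the authors' implementation. The function names, the temperature of 0.07, the momentum of 0.999, and the toy embeddings are assumptions chosen for illustration. The positive key stands in for an embedding of the same hand pose seen from another camera view or with a replaced background.

```python
import math


def normalize(vector):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE loss for one query: -log softmax of the positive similarity."""
    q = normalize(query)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(k)) / temperature for k in negative_keys]
    # Numerically stable log-sum-exp over positive and negative logits.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum_exp - logits[0]


def momentum_update(key_params, query_params, momentum=0.999):
    """Move key-encoder parameters slowly toward the query-encoder parameters."""
    return [
        momentum * k + (1.0 - momentum) * q
        for k, q in zip(key_params, query_params)
    ]


# Hypothetical 2D embeddings: the positive comes from another view of the same pose.
query = [1.0, 0.0]
other_view = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

loss_matching = info_nce(query, other_view, negatives)
loss_mismatched = info_nce(query, [0.0, 1.0], [other_view, [-1.0, 0.0]])
```

The loss is small when the query is closest to its positive key and large otherwise, which is why more diverse positives (different views, different backgrounds) push the encoder toward features of the hand itself rather than of the scene.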

Taxonomy

Methods: Contrastive Learning