HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

Zicong Fan; Maria Parelli; Maria Eleni Kadoglou; Muhammed Kocabas; Xu Chen; Michael J. Black; Otmar Hilliges

arXiv:2311.18448 · cs.CV · December 1, 2023 · 1 citation

TL;DR

HOLD is a novel category-agnostic method that jointly reconstructs 3D hands and objects from monocular videos without relying on 3D annotations, outperforming supervised methods in diverse settings.

Contribution

The paper introduces HOLD, the first category-agnostic approach for joint 3D reconstruction of hands and objects from monocular videos without needing 3D annotations.

Findings

01. Outperforms supervised baselines in both lab and in-the-wild settings

02. Does not rely on 3D hand-object annotations

03. Robustly reconstructs from in-the-wild videos

Abstract

Since humans interact with diverse objects every day, the holistic 3D capture of these interactions is important to understand and model human behaviour. However, most existing methods for hand-object reconstruction from RGB either assume pre-scanned object templates or rely heavily on limited 3D hand-object data, restricting their ability to scale and generalize to more unconstrained interaction settings. To this end, we introduce HOLD, the first category-agnostic method that reconstructs an articulated hand and object jointly from a monocular interaction video. We develop a compositional articulated implicit model that can reconstruct disentangled 3D hand and object from 2D images. We further incorporate hand-object constraints to improve hand-object poses and, consequently, the reconstruction quality. Our method does not rely on 3D hand-object annotations while outperforming…
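To make the idea of a "compositional" implicit model concrete, here is a toy sketch in plain Python. It is an illustration only, not HOLD's implementation: the abstract does not specify the model's internals, so the analytic spheres, the min-based union, and the names `sphere_sdf`, `compose_sdf`, and `scene` are all assumptions chosen for clarity. It shows how two separate signed distance functions can be combined into one scene while still recording which part each query point belongs to.

```python
import math

def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius

def compose_sdf(p, sdfs):
    """Union of parts: the scene distance is the minimum over all parts.

    Also returns the name of the closest part, so the parts stay
    separable ("disentangled") after composition.
    """
    values = {name: sdf(p) for name, sdf in sdfs.items()}
    label = min(values, key=values.get)
    return values[label], label

# Hypothetical stand-ins: one sphere for the "hand", one for the "object".
scene = {
    "hand": lambda p: sphere_sdf(p, (0.0, 0.0, 0.0), 1.0),
    "object": lambda p: sphere_sdf(p, (3.0, 0.0, 0.0), 0.5),
}

# Query a point that lies inside the "object" sphere.
distance, part = compose_sdf((2.8, 0.0, 0.0), scene)
print(part, round(distance, 3))
```

In a learned setting, each analytic function would presumably be replaced by a neural field fitted to the video, but the composition step can follow the same pattern.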

Code & Models

Repositories

zc-alexfan/hold (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications