A Versatile and Differentiable Hand-Object Interaction Representation

Théo Morales; Omid Taheri; Gerard Lacey

arXiv:2409.16855·cs.CV·December 2, 2024

TL;DR

This paper introduces CHOIR, a fully differentiable and versatile hand-object interaction representation, and JointDiffusion, a diffusion model that improves HOI synthesis and refinement with superior contact accuracy and realism.

Contribution

The paper presents CHOIR, a novel continuous HOI representation, and JointDiffusion, a diffusion-based model for improved HOI synthesis and refinement.

Findings

1. Increases contact F1 score by 5% in refinement tasks.

2. Reduces simulation displacement by 46% in synthesis.

3. Outperforms state-of-the-art methods in contact accuracy and realism.

Abstract

Synthesizing accurate hand-object interactions (HOI) is critical for applications in Computer Vision, Augmented Reality (AR), and Mixed Reality (MR). Despite recent advances, the accuracy of reconstructed or generated HOI leaves room for refinement. Some techniques have improved the accuracy of dense correspondences by shifting focus from generating explicit contacts to using rich HOI fields. Still, they lack full differentiability or continuity and are tailored to specific tasks. In contrast, we present a Coarse Hand-Object Interaction Representation (CHOIR), a novel, versatile and fully differentiable field for HOI modelling. CHOIR leverages discrete unsigned distances for continuous shape and pose encoding, alongside multivariate Gaussian distributions to represent dense contact maps with few parameters. To demonstrate the versatility of CHOIR we design JointDiffusion, a diffusion…
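The idea of describing a dense contact map with a handful of Gaussian parameters can be illustrated with a small sketch. This is not the paper's implementation: the function name `gaussian_contact_map`, the sample points, and the simplification to a diagonal covariance are all assumptions made for illustration, since this summary does not give CHOIR's actual parameterisation. The sketch only shows how a mean and a few variances can produce a smooth per-point contact value in (0, 1].

```python
import math

def gaussian_contact_map(points, mean, variances):
    """Evaluate an unnormalised axis-aligned 3D Gaussian at each point.

    Assumption: diagonal covariance (one variance per axis), chosen here
    for brevity; a full covariance matrix would be more general.
    Returns one value in (0, 1] per point, with 1.0 at the mean.
    """
    values = []
    for p in points:
        # Squared Mahalanobis distance under a diagonal covariance.
        m2 = sum((x - mu) ** 2 / var for x, mu, var in zip(p, mean, variances))
        values.append(math.exp(-0.5 * m2))
    return values

# Hypothetical surface points in metres (not taken from the paper).
points = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (0.05, 0.0, 0.0)]
mean = (0.0, 0.0, 0.0)            # assumed contact centre
variances = (1e-4, 1e-4, 1e-4)    # 1 cm standard deviation per axis

contacts = gaussian_contact_map(points, mean, variances)
for p, c in zip(points, contacts):
    print(p, round(c, 6))
```

Six numbers (a mean and three variances) stand in for a per-vertex contact array here, which is the sense in which such a representation uses few parameters. The mapping is also smooth in both the points and the parameters, so it would be differentiable if written in an autodiff framework.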

Taxonomy

Topics: Hand Gesture Recognition Systems · Speech and dialogue systems · Human Pose and Action Recognition

Methods: Diffusion · Focus