InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction

Yinghao Huang (1), Omid Taheri (1), Michael J. Black (1), Dimitrios Tzionas (2). (1) Max Planck Institute for Intelligent Systems, Tübingen, Germany; (2) University of Amsterdam, Amsterdam, The Netherlands

arXiv:2209.12354 · cs.CV · October 4, 2022

TL;DR

InterCap introduces a multi-view RGB-D approach for detailed 3D reconstruction of whole-body interactions with objects, leveraging contact cues and a new dataset to advance understanding of human-object interactions.

Contribution

The paper presents a novel method combining multi-view RGB-D data and contact information to reconstruct detailed whole-body and object interactions, along with a new dataset capturing diverse human-object interactions.

Findings

01. Successfully reconstructs whole-body and object interactions in 3D.
02. Provides a large, multi-view RGB-D dataset with contact annotations.
03. Demonstrates improved pose estimation using contact cues.

Abstract

Humans constantly interact with daily objects to accomplish tasks. To understand such interactions, computers need to reconstruct these from cameras observing whole-body interaction with scenes. This is challenging due to occlusion between the body and objects, motion blur, depth/scale ambiguities, and the low image resolution of hands and graspable object parts. To make the problem tractable, the community focuses either on interacting hands, ignoring the body, or on interacting bodies, ignoring hands. The GRAB dataset addresses dexterous whole-body interaction but uses marker-based MoCap and lacks images, while BEHAVE captures video of body-object interaction but lacks hand detail. We address the limitations of prior work with InterCap, a novel method that reconstructs interacting whole bodies and objects from multi-view RGB-D data, using the parametric whole-body model SMPL-X and…

Taxonomy

Topics: Human Pose and Action Recognition · Stroke Rehabilitation and Recovery · Hand Gesture Recognition Systems