StackFLOW: Monocular Human-Object Reconstruction by Stacked Normalizing Flow with Offset

Chaofan Huo; Ye Shi; Yuexin Ma; Lan Xu; Jingyi Yu; Jingya Wang

arXiv:2407.20545·cs.CV·July 31, 2024

TL;DR

StackFLOW represents the 3D spatial relation between a human and an object as offsets between anchors sampled on the two meshes, and uses a stacked normalizing flow to infer the posterior distribution of these offsets from a monocular image.

Contribution

The paper presents a new human-object offset representation and a stacked normalizing flow model for efficient 3D spatial relation inference from monocular images.
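To make the offset representation concrete, here is a minimal sketch in NumPy. It assumes the anchors are already available as arrays of 3D points; the function name `human_object_offsets`, the sign convention (object minus human), and the flattening order are illustrative assumptions, not details taken from the paper or its repository.

```python
import numpy as np

def human_object_offsets(human_anchors, object_anchors):
    """Flattened offsets between every human/object anchor pair.

    human_anchors:  (m, 3) points sampled on the human mesh surface.
    object_anchors: (n, 3) points sampled on the object mesh surface.
    Returns a vector of length m * n * 3.
    Assumed convention: offsets[i, j] = object_anchors[j] - human_anchors[i].
    """
    offsets = object_anchors[None, :, :] - human_anchors[:, None, :]  # (m, n, 3)
    return offsets.reshape(-1)

# Toy anchors standing in for points sampled from real meshes.
rng = np.random.default_rng(0)
human_anchors = rng.normal(size=(4, 3))
object_anchors = rng.normal(size=(2, 3))
offsets = human_object_offsets(human_anchors, object_anchors)
print(offsets.shape)
```

Because the vector grows as m * n * 3, a practical system would likely keep the anchor counts small or compress the vector before modeling it; how StackFLOW handles this is not stated in the summary above.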

Findings

01 Achieves state-of-the-art results on the BEHAVE and InterCap datasets.

02 Effectively finetunes human and object poses using the proposed probabilistic framework.

03 Demonstrates superior modeling of human-object interactions compared to previous methods.

Abstract

Modeling and capturing the 3D spatial arrangement of the human and the object is key to perceiving 3D human-object interaction from monocular images. In this work, we propose to represent the human-object spatial relation with the Human-Object Offset between anchors that are densely sampled from the surfaces of the human mesh and the object mesh. Compared with previous works, which use a contact map or an implicit distance field to encode 3D human-object spatial relations, our method is a simple and efficient way to encode the highly detailed spatial correlation between the human and the object. Based on this representation, we propose Stacked Normalizing Flow (StackFLOW) to infer the posterior distribution of human-object spatial relations from the image. During the optimization stage, we finetune the human body pose and object 6D pose by maximizing the likelihood of samples based on this posterior…
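The idea of refining a pose by maximizing likelihood under a flow can be illustrated with a toy example. The sketch below is not the StackFLOW model: it assumes a single, unconditional affine flow layer (the paper's flow is stacked and inferred from the image), and it refines only a 3D object translation rather than the full human body pose and object 6D pose. All names (`affine_flow_log_prob`, `refine_translation`, `shift`, `log_scale`) are hypothetical.

```python
import numpy as np

def affine_flow_log_prob(x, shift, log_scale):
    """Log-density of x under the flow x = shift + exp(log_scale) * z, z ~ N(0, I).

    Change of variables: log p(x) = log N(z; 0, I) - sum(log_scale).
    """
    z = (x - shift) * np.exp(-log_scale)
    log_base = -0.5 * np.sum(z ** 2) - 0.5 * x.size * np.log(2.0 * np.pi)
    return log_base - np.sum(log_scale)

def refine_translation(offsets, shift, log_scale, steps=50, lr=0.5):
    """Scaled gradient ascent on the log-likelihood over an object translation t.

    Assumes offsets are (object anchor - human anchor), so moving the object
    by t adds t to every 3D offset.
    """
    pairs = offsets.reshape(-1, 3)
    mu = shift.reshape(-1, 3)
    inv_var = np.exp(-2.0 * log_scale).reshape(-1, 3)
    t = np.zeros(3)
    for _ in range(steps):
        # Analytic gradient of the log-likelihood with respect to t.
        grad = -np.sum((pairs + t - mu) * inv_var, axis=0)
        # Divide by the curvature so the step size is scale-independent.
        t = t + lr * grad / inv_var.sum(axis=0)
    return t

# Toy setup: the observed offsets are the flow's mode, displaced by -true_t.
rng = np.random.default_rng(0)
num_pairs = 8
shift = rng.normal(size=num_pairs * 3)
log_scale = np.zeros(num_pairs * 3)
true_t = np.array([0.3, -0.2, 0.1])
offsets = shift - np.tile(true_t, num_pairs)

t = refine_translation(offsets, shift, log_scale)
before = affine_flow_log_prob(offsets, shift, log_scale)
after = affine_flow_log_prob(offsets + np.tile(t, num_pairs), shift, log_scale)
print(t, before, after)
```

In this toy setting the refinement recovers the displacement and the log-likelihood increases. A real pipeline would presumably backpropagate through the body model and object pose with an autodiff framework instead of using a hand-derived gradient.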

Code & Models

Repositories

huochf/StackFLOW
PyTorch · Official
