GRACE: Estimating Geometry-level 3D Human-Scene Contact from 2D Images

Chengfeng Wang; Wei Zhai; Yuhang Yang; Yang Cao; Zhengjun Zha

arXiv:2505.06575 · cs.CV · May 13, 2025

TL;DR

GRACE introduces a novel 3D human-scene contact estimation method that effectively integrates geometric structures with 2D image semantics, achieving state-of-the-art accuracy and strong generalization across diverse human geometries.

Contribution

The paper presents GRACE, a point cloud-based framework that models geometry-level contact estimation, overcoming limitations of parametric models and enhancing generalization.

Findings

01. Achieves state-of-the-art contact estimation accuracy.
02. Demonstrates robust generalization to unstructured human point clouds.
03. Outperforms existing methods on multiple benchmarks.

Abstract

Geometry-level human-scene contact estimation aims to ground the specific contact surface points on 3D human geometry. This provides a spatial prior that bridges the interaction between human and scene, supporting applications such as human behavior analysis, embodied AI, and AR/VR. Existing approaches predominantly rely on parametric human models (e.g., SMPL), which establish correspondences between images and contact regions through fixed SMPL vertex sequences. In effect, they map image features to an ordered sequence of vertices. Because this mapping does not consider geometry, it generalizes poorly to distinct human geometries. In this paper, we introduce GRACE (Geometry-level Reasoning for 3D Human-scene Contact Estimation), a new paradigm for 3D human contact estimation. GRACE incorporates a point cloud encoder-decoder…
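The contrast the abstract draws, between mapping image features to a fixed vertex order and predicting contact on the geometry itself, can be illustrated with a toy sketch. This is not the GRACE architecture or any published baseline: the two heads, the weight names (`W`, `w_geo`, `w_img`), the sizes, and the random inputs are all hypothetical and chosen only to show the structural difference.

```python
import numpy as np

rng = np.random.default_rng(0)

N_POINTS = 8   # toy surface size (assumption; SMPL meshes have 6890 vertices)
FEAT_DIM = 4   # toy image-feature size (assumption)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fixed_order_head(image_feat, W):
    """Index-based head: output slot i always means 'vertex i'.

    The 3D geometry is never an input, so the output length and the
    meaning of each slot are tied to one fixed vertex ordering.
    """
    return sigmoid(W @ image_feat)  # shape (N_POINTS,)


def per_point_head(points, image_feat, w_geo, w_img):
    """Geometry-conditioned head: shared weights score each point from its xyz."""
    return sigmoid(points @ w_geo + image_feat @ w_img)  # shape (len(points),)


# Hypothetical random inputs and weights, for illustration only.
image_feat = rng.normal(size=FEAT_DIM)
points = rng.normal(size=(N_POINTS, 3))
W = rng.normal(size=(N_POINTS, FEAT_DIM))
w_geo = rng.normal(size=3)
w_img = rng.normal(size=FEAT_DIM)

fixed_out = fixed_order_head(image_feat, W)
point_out = per_point_head(points, image_feat, w_geo, w_img)

# Shuffling the points shuffles the per-point predictions in the same way.
perm = rng.permutation(N_POINTS)
point_out_shuffled = per_point_head(points[perm], image_feat, w_geo, w_img)
equivariant = np.allclose(point_out[perm], point_out_shuffled)

# The per-point head also accepts a point cloud of a different size,
# which the fixed-order head cannot do without retraining W.
more_points = rng.normal(size=(N_POINTS + 5, 3))
more_out = per_point_head(more_points, image_feat, w_geo, w_img)
```

In this sketch the per-point head follows the points wherever they are placed in the array and handles an arbitrary number of them, whereas the fixed-order head always returns `N_POINTS` scores bound to one vertex indexing. That is one plausible reading of why a fixed vertex sequence would limit generalization to unstructured human point clouds.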

Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Robot Manipulation and Learning