HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images

Anilkumar Swamy; Vincent Leroy; Philippe Weinzaepfel; Jean-Sébastien Franco; Grégory Rogez

arXiv:2508.16465·cs.CV·August 26, 2025

TL;DR

HOSt3R introduces a keypoint-free method for 3D hand-object reconstruction from RGB videos, improving robustness and generalization without relying on keypoint detection or object templates.

Contribution

The paper presents HOSt3R, a novel approach that eliminates keypoint detection for hand-object 3D reconstruction, enabling more scalable and generalizable applications.

Findings

01. Achieves state-of-the-art results on the SHOWMe benchmark.

02. Demonstrates effective generalization to unseen object categories.

03. Operates without requiring camera intrinsics or pre-scanned templates.

Abstract

Hand-object 3D reconstruction has become increasingly important for applications in human-robot interaction and immersive AR/VR experiences. A common approach for object-agnostic hand-object reconstruction from RGB sequences involves a two-stage pipeline: hand-object 3D tracking followed by multi-view 3D reconstruction. However, existing methods rely on keypoint detection techniques, such as Structure from Motion (SfM) and hand-keypoint optimization, which struggle with diverse object geometries, weak textures, and mutual hand-object occlusions, limiting scalability and generalization. As a key enabler to generic and seamless, non-intrusive applicability, we propose in this work a robust, keypoint detector-free approach to estimating hand-object 3D transformations from monocular motion video/images. We further integrate this with a multi-view reconstruction pipeline to accurately…
