SAM 3D Body: Robust Full-Body Human Mesh Recovery

Xitong Yang; Devansh Kukreja; Don Pinkus; Anushka Sagar; Taosha Fan; Jinhyung Park; Soyong Shin; Jinkun Cao; Jiawei Liu; Nicolas Ugrinovic; Matt Feiszli; Jitendra Malik; Piotr Dollar; Kris Kitani

arXiv:2602.15989 · cs.CV · February 19, 2026

Open Access

TL;DR

SAM 3D Body (3DB) is a novel promptable model for single-image 3D human mesh recovery that achieves state-of-the-art accuracy and generalization across diverse conditions using a new mesh representation and auxiliary prompts.

Contribution

Introduces the Momentum Human Rig (MHR) for decoupled skeletal and surface modeling, and a promptable architecture supporting auxiliary inputs for improved 3D human mesh recovery.

Findings

01 Achieves superior generalization in diverse in-the-wild conditions

02 Outperforms prior methods in qualitative and quantitative evaluations

03 Provides open-source models and datasets for further research

Abstract

We introduce SAM 3D Body (3DB), a promptable model for single-image full-body 3D human mesh recovery (HMR) that demonstrates state-of-the-art performance, with strong generalization and consistent accuracy in diverse in-the-wild conditions. 3DB estimates the human pose of the body, feet, and hands. It is the first model to use a new parametric mesh representation, Momentum Human Rig (MHR), which decouples skeletal structure and surface shape. 3DB employs an encoder-decoder architecture and supports auxiliary prompts, including 2D keypoints and masks, enabling user-guided inference similar to the SAM family of models. We derive high-quality annotations from a multi-stage annotation pipeline that uses various combinations of manual keypoint annotation, differentiable optimization, multi-view geometry, and dense keypoint detection. Our data engine efficiently selects and processes data to…
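To make the idea of decoupling skeletal structure from surface shape concrete, here is a toy sketch. It is not the MHR formulation, which the abstract does not specify; it only assumes a planar kinematic chain whose joint positions depend on skeletal parameters alone, while a separate thickness vector controls where surface points sit. The function names `forward_kinematics` and `surface_points` and every parameter are hypothetical, chosen for illustration.

```python
import numpy as np


def forward_kinematics(bone_lengths, joint_angles):
    """Toy planar chain: return joint positions with shape (n_bones + 1, 2).

    Only skeletal parameters enter here; no surface parameter is used.
    Assumes all bone lengths are positive.
    """
    positions = [np.zeros(2)]
    heading = 0.0
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle  # angles are relative to the parent bone
        step = length * np.array([np.cos(heading), np.sin(heading)])
        positions.append(positions[-1] + step)
    return np.stack(positions)


def surface_points(joints, thickness):
    """Place one surface point per bone, offset from the bone midpoint
    along the bone's left-hand normal by that bone's thickness."""
    starts, ends = joints[:-1], joints[1:]
    direction = ends - starts
    direction = direction / np.linalg.norm(direction, axis=1, keepdims=True)
    normal = np.stack([-direction[:, 1], direction[:, 0]], axis=1)
    midpoints = 0.5 * (starts + ends)
    return midpoints + thickness[:, None] * normal


# Hypothetical two-bone example: the skeleton is fixed, the surface varies.
joints = forward_kinematics(np.array([1.0, 1.0]), np.array([0.0, np.pi / 2]))
thin = surface_points(joints, np.array([0.1, 0.1]))
thick = surface_points(joints, np.array([0.3, 0.3]))
```

In this toy setup, changing `thickness` moves the surface points but cannot move the joints, because the joints are computed before any surface parameter is read. A representation that ties the two together would not offer that guarantee.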

Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Neural Network Applications