SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image

Dimitrije Antić; Georgios Paschalidis; Shashank Tripathi; Theo Gevers; Sai Kumar Dwivedi; Dimitrios Tzionas

arXiv:2409.16178 · cs.CV · August 1, 2025

TL;DR

SDFit is a novel optimization framework that accurately recovers 3D object pose and shape from a single image by fitting a morphable signed-distance-function model, robustly handling occlusions and unseen shapes without retraining.

Contribution

It introduces a category-specific morphable SDF model, an efficient shape retrieval method, and a pose initialization technique, enabling robust 3D inference from single images in the wild.

Findings

01. Performs comparably to state-of-the-art methods on unoccluded images.

02. Robustly handles occlusions and uncommon poses.

03. Requires no retraining for new images.

Abstract

Recovering 3D object pose and shape from a single image is a challenging and ill-posed problem. This is due to strong (self-)occlusions, depth ambiguities, the vast intra- and inter-class shape variance, and the lack of 3D ground truth for natural images. Existing deep-network methods are trained on synthetic datasets to predict 3D shapes, so they often struggle to generalize to real-world images. Moreover, they lack an explicit feedback loop for refining noisy estimates, and primarily focus on geometry without directly considering pixel alignment. To tackle these limitations, we develop a novel render-and-compare optimization framework, called SDFit. This has three key innovations: First, it uses a learned category-specific and morphable signed-distance-function (mSDF) model, and fits this to an image by iteratively refining both 3D pose and shape. The mSDF robustifies inference by…
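
To make the render-and-compare idea concrete, here is a minimal 2D toy sketch. It is not the SDFit implementation. It assumes a hand-written circle SDF in place of the learned mSDF, with the radius standing in for the shape code and the center standing in for the pose, and it assumes a sigmoid soft-silhouette renderer with hand-derived gradients. All function names and parameters (`circle_sdf`, `render_silhouette`, `fit_to_silhouette`, `tau`, `lr`) are hypothetical.

```python
import numpy as np


def circle_sdf(points, center, radius):
    """Signed distance to a circle: negative inside, positive outside."""
    return np.linalg.norm(points - center, axis=-1) - radius


def render_silhouette(points, center, radius, tau=0.05):
    """Soft silhouette from the SDF: close to 1 inside, close to 0 outside."""
    return 1.0 / (1.0 + np.exp(circle_sdf(points, center, radius) / tau))


def fit_to_silhouette(target, points, center, radius, tau=0.05, lr=0.1, steps=2000):
    """Refine pose (center) and shape (radius) by gradient descent on a
    pixel-wise squared error between rendered and target silhouettes."""
    center = np.asarray(center, dtype=float).copy()
    radius = float(radius)
    for _ in range(steps):
        offset = points - center
        dist = np.linalg.norm(offset, axis=-1) + 1e-9  # avoid division by zero
        sil = 1.0 / (1.0 + np.exp((dist - radius) / tau))
        # Per-pixel derivative of the squared error w.r.t. the SDF value.
        d_loss_d_sdf = 2.0 * (sil - target) * (-sil * (1.0 - sil) / tau)
        # d(sdf)/d(radius) = -1 and d(sdf)/d(center) = -(p - c) / ||p - c||.
        grad_radius = np.mean(-d_loss_d_sdf)
        grad_center = np.mean(
            d_loss_d_sdf[..., None] * (-offset / dist[..., None]), axis=(0, 1)
        )
        radius -= lr * grad_radius
        center -= lr * grad_center
    loss = float(np.mean((render_silhouette(points, center, radius, tau) - target) ** 2))
    return center, radius, loss


# Pixel grid over [-1, 1]^2, shape (64, 64, 2).
axis = np.linspace(-1.0, 1.0, 64)
grid = np.stack(np.meshgrid(axis, axis, indexing="xy"), axis=-1)

# Synthetic "observed" silhouette from known ground-truth pose and shape.
true_center, true_radius = np.array([0.2, -0.1]), 0.5
target_mask = render_silhouette(grid, true_center, true_radius)

# Start from a deliberately wrong pose and shape, then refine both.
center_hat, radius_hat, final_loss = fit_to_silhouette(
    target_mask, grid, center=[0.0, 0.0], radius=0.3
)
print("center:", center_hat, "radius:", radius_hat, "loss:", final_loss)
```

The loop is the explicit feedback loop the abstract refers to: each iteration renders the current estimate, measures the pixel-level mismatch, and updates pose and shape together. A real system would replace the circle with a learned 3D SDF decoder and use a differentiable renderer with automatic differentiation.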

Taxonomy

Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Human Motion and Animation