TL;DR
ReconPhys is a novel feedforward framework that jointly estimates physical attributes and reconstructs non-rigid objects from a single video, enabling fast, practical, and generalizable 3D reconstruction without manual labels.
Contribution
It introduces the first self-supervised, feedforward approach for joint physical attribute estimation and 3D reconstruction from monocular video, outperforming optimization-based methods.
Findings
Achieves 21.64 PSNR in future prediction on a large-scale synthetic dataset, versus 13.27 for the baselines.
Reduces Chamfer Distance from 0.349 to 0.004.
Runs inference in under 1 second, compared with the hours required by prior methods.
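For readers unfamiliar with the two metrics above, here is a minimal sketch of how PSNR and Chamfer Distance are commonly computed. It is not taken from the paper: the function names, the peak value `max_val=1.0`, and the squared-distance, mean-normalized form of the Chamfer Distance are all assumptions, and the authors' exact conventions may differ.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio (dB) between two flat sequences of pixel values.

    Assumes pixel values lie in [0, max_val]; max_val=1.0 is an assumption.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer Distance between two point sets (sequences of coordinate tuples).

    Uses the mean squared nearest-neighbor distance in each direction and sums
    the two directions. This is one common convention, assumed here.
    Brute force, O(len(a) * len(b)); fine for illustration only.
    """
    if not points_a or not points_b:
        raise ValueError("point sets must be non-empty")

    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(src, dst):
        # For each source point, squared distance to its nearest destination point.
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)
```

Higher PSNR and lower Chamfer Distance are better. Under this PSNR definition, and assuming both scores use the same peak value, a gap of 21.64 - 13.27 = 8.37 dB corresponds to roughly a 6.9x lower mean squared error; if PSNR is averaged per frame, that ratio is only approximate.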
Abstract
Reconstructing non-rigid objects with physical plausibility remains a significant challenge. Existing approaches leverage differentiable rendering for per-scene optimization, recovering geometry and dynamics but requiring expensive tuning or manual annotation, which limits practicality and generalizability. To address this, we propose ReconPhys, the first feedforward framework that jointly learns physical attribute estimation and 3D Gaussian Splatting reconstruction from a single monocular video. Our method employs a dual-branch architecture trained via a self-supervised strategy, eliminating the need for ground-truth physics labels. Given a video sequence, ReconPhys simultaneously infers geometry, appearance, and physical attributes. Experiments on a large-scale synthetic dataset demonstrate superior performance: our method achieves 21.64 PSNR in future prediction compared to 13.27 by…
