LARM: A Large Articulated-Object Reconstruction Model

Sylvia Yuan; Ruoxi Shi; Xinyue Wei; Xiaoshuai Zhang; Hao Su; Minghua Liu

arXiv:2511.11563 · cs.CV · November 17, 2025

TL;DR

LARM is a unified feedforward framework that reconstructs detailed, textured 3D articulated objects from sparse images, advancing accuracy and scalability over prior methods by jointly reasoning over geometry, textures, and joint structures.

Contribution

LARM extends LVSM, a novel view synthesis approach for static 3D objects, to articulated objects, using a transformer-based architecture to reason jointly over camera pose and articulation.

Findings

01. Outperforms state-of-the-art methods in novel view synthesis and 3D reconstruction.

02. Produces high-quality meshes that closely match the input images.

03. Supports high-fidelity reconstruction across diverse object categories.

Abstract

Modeling 3D articulated objects with realistic geometry, textures, and kinematics is essential for a wide range of applications. However, existing optimization-based reconstruction methods often require dense multi-view inputs and expensive per-instance optimization, limiting their scalability. Recent feedforward approaches offer faster alternatives but frequently produce coarse geometry, lack texture reconstruction, and rely on brittle, complex multi-stage pipelines. We introduce LARM, a unified feedforward framework that reconstructs 3D articulated objects from sparse-view images by jointly recovering detailed geometry, realistic textures, and accurate joint structures. LARM extends LVSM, a recent novel view synthesis (NVS) approach for static 3D objects, into the articulated setting by jointly reasoning over camera pose and articulation variation using a transformer-based architecture,…
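The abstract is cut off before it says how camera pose and articulation are encoded for the transformer, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the target camera is encoded as a per-pixel Plücker ray map (a common pose encoding in transformer-based view synthesis) and that the articulation is a single scalar joint state broadcast over the image. The names `plucker_rays`, `target_query_tokens`, `joint_state`, and `patch` are hypothetical.

```python
import numpy as np

def plucker_rays(K, cam_to_world, height, width):
    """Per-pixel Plücker coordinates (unit direction, moment) in world space."""
    # Pixel centers; 'xy' meshgrid yields arrays of shape (height, width).
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)       # (H, W, 3)
    dirs_cam = pix @ np.linalg.inv(K).T                    # back-project pixels
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    dirs = dirs_cam @ R.T                                  # rotate into world frame
    dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)
    moment = np.cross(np.broadcast_to(t, dirs.shape), dirs)  # origin x direction
    return np.concatenate([dirs, moment], axis=-1)         # (H, W, 6)

def target_query_tokens(K, cam_to_world, joint_state, height, width, patch):
    """Patchified conditioning tokens: 6 ray channels plus 1 joint-state channel.

    ASSUMPTION: a scalar joint state appended as an extra channel is a
    hypothetical encoding chosen for illustration only.
    """
    rays = plucker_rays(K, cam_to_world, height, width)
    state = np.full((height, width, 1), joint_state, dtype=rays.dtype)
    cond = np.concatenate([rays, state], axis=-1)          # (H, W, 7)
    gh, gw = height // patch, width // patch
    # Split the image into non-overlapping patch x patch tiles, one token each.
    tiles = cond.reshape(gh, patch, gw, patch, 7).transpose(0, 2, 1, 3, 4)
    return tiles.reshape(gh * gw, patch * patch * 7)

# Example: a 32x32 target view, camera shifted along x, joint half open.
K = np.array([[32.0, 0.0, 16.0], [0.0, 32.0, 16.0], [0.0, 0.0, 1.0]])
pose = np.eye(4)
pose[0, 3] = 1.0
tokens = target_query_tokens(K, pose, joint_state=0.5, height=32, width=32, patch=8)
```

In a design like this, the transformer would attend from such target tokens to the input-view tokens and decode the target image, so changing either the pose or `joint_state` requests a different view or a different articulation of the same object.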

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging