GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models

Hao Sun; Lei Fan; Donglin Di; Shaohui Liu

arXiv:2512.03566 · cs.CV · December 4, 2025

TL;DR

GAOT introduces a three-phase framework that generates detailed 3D articulated objects from text prompts by combining diffusion models and hypergraph learning, filling a key gap in text-conditioned 3D object synthesis.

Contribution

The paper presents a novel three-step method integrating diffusion models and hypergraph learning to generate articulated objects from text, a capability lacking in prior models.

Findings

01. Achieves superior performance on the PartNet-Mobility dataset

02. Effectively generates articulated objects from text prompts

03. Outperforms previous methods in qualitative and quantitative evaluations

Abstract

Articulated object generation has seen increasing advancements, yet existing models often lack the ability to be conditioned on text prompts. To address the significant gap between textual descriptions and 3D articulated object representations, we propose GAOT, a three-phase framework that generates articulated objects from text prompts, leveraging diffusion models and hypergraph learning in a three-step process. First, we fine-tune a point cloud generation model to produce a coarse representation of objects from text prompts. Given the inherent connection between articulated objects and graph structures, we design a hypergraph-based learning method to refine these coarse representations, representing object parts as graph vertices. Finally, leveraging a diffusion model, the joints of articulated objects (represented as graph edges) are generated based on the object parts. Extensive…
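
The graph view described in the abstract (parts as vertices, joints as edges, with hyperedges able to group more than two parts) can be illustrated with a small data-structure sketch. This is not the GAOT implementation: the class names `Part`, `Joint`, and `ArticulatedObject`, the joint labels, the helper `incidence_matrix`, and the cabinet example are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Part:
    """A rigid part of the object: one graph vertex."""
    name: str


@dataclass
class Joint:
    """A joint connecting two parts: one graph edge."""
    parent: int  # index of the parent part
    child: int   # index of the child part
    kind: str    # hypothetical label, e.g. "revolute" or "prismatic"


@dataclass
class ArticulatedObject:
    parts: List[Part] = field(default_factory=list)
    joints: List[Joint] = field(default_factory=list)

    def add_part(self, name: str) -> int:
        """Append a part and return its vertex index."""
        self.parts.append(Part(name))
        return len(self.parts) - 1

    def add_joint(self, parent: int, child: int, kind: str) -> None:
        """Add an edge between two existing, distinct parts."""
        n = len(self.parts)
        if not (0 <= parent < n and 0 <= child < n):
            raise IndexError("joint refers to a part that does not exist")
        if parent == child:
            raise ValueError("a joint must connect two different parts")
        self.joints.append(Joint(parent, child, kind))

    def adjacency(self) -> List[List[int]]:
        """Symmetric 0/1 adjacency matrix over the parts."""
        n = len(self.parts)
        adj = [[0] * n for _ in range(n)]
        for j in self.joints:
            adj[j.parent][j.child] = 1
            adj[j.child][j.parent] = 1
        return adj


def incidence_matrix(num_vertices: int, hyperedges: List[Set[int]]) -> List[List[int]]:
    """Vertex-by-hyperedge 0/1 matrix; a hyperedge may contain any number of vertices."""
    return [[1 if v in e else 0 for e in hyperedges] for v in range(num_vertices)]


# Hypothetical example: a cabinet body with a hinged door and a sliding drawer.
cabinet = ArticulatedObject()
body = cabinet.add_part("body")
door = cabinet.add_part("door")
drawer = cabinet.add_part("drawer")
cabinet.add_joint(body, door, "revolute")
cabinet.add_joint(body, drawer, "prismatic")

ADJ = cabinet.adjacency()
# One hyperedge groups all three parts, another groups only body and door.
H = incidence_matrix(3, [{body, door, drawer}, {body, door}])
```

An ordinary edge always joins exactly two parts, which suits joints; a hyperedge can relate several parts at once, which is one plausible reason a hypergraph would be used when refining part-level representations.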

Taxonomy

Topics: Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis