TL;DR
TORA is a topological representation alignment framework that improves 3D shape assembly by guiding flow-matching methods with relational structure, leading to faster convergence and better accuracy, especially in zero-shot scenarios.
Contribution
The paper introduces TORA, a topology-first representation alignment method that enhances flow matching for 3D shape assembly without additional inference overhead.
Findings
TORA achieves up to 6.9× faster convergence.
It improves accuracy in-distribution and under domain shift.
It achieves state-of-the-art results on five diverse benchmarks.
Abstract
Flow-matching methods for 3D shape assembly learn point-wise velocity fields that transport parts toward assembled configurations, yet they receive no explicit guidance about which cross-part interactions should drive the motion. We introduce TORA, a topology-first representation alignment framework that distills relational structure from a frozen pretrained 3D encoder into the flow-matching backbone during training. We first realize this via a simple instantiation, token-wise cosine matching, which injects the learned geometric descriptors from the teacher representation. We then extend it with a Centered Kernel Alignment (CKA) loss that matches the similarity structure between student and teacher representations for enhanced topological alignment. Through systematic probing of diverse 3D encoders, we show that geometry- and contact-centric teacher properties, not semantic classification…
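The two alignment objectives named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(num_tokens, dim)` feature layout, the equal student and teacher widths in the cosine loss, and the choice of the linear variant of CKA are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def cosine_alignment_loss(student, teacher, eps=1e-8):
    """Token-wise cosine matching: 1 minus the mean per-token cosine similarity.

    Assumes both inputs have shape (num_tokens, dim) with the same dim;
    a real model would likely need a learned projection to match widths.
    """
    s = student / (np.linalg.norm(student, axis=1, keepdims=True) + eps)
    t = teacher / (np.linalg.norm(teacher, axis=1, keepdims=True) + eps)
    return float(1.0 - np.mean(np.sum(s * t, axis=1)))


def linear_cka(x, y, eps=1e-12):
    """Linear CKA between two (num_tokens, dim) matrices; widths may differ."""
    # Center each feature dimension across tokens.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y + eps))


def cka_alignment_loss(student, teacher):
    """Hypothetical CKA-based loss: 1 minus linear CKA (0 when structure matches)."""
    return 1.0 - linear_cka(student, teacher)


# Toy comparison: the "student" is the teacher with its feature axes permuted.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(64, 16))    # stand-in for frozen teacher tokens
student = np.roll(teacher, 1, axis=1)  # same token geometry, different basis

print("cosine loss:", cosine_alignment_loss(student, teacher))
print("CKA loss:   ", cka_alignment_loss(student, teacher))
```

In this toy case the CKA loss is close to zero while the cosine loss is not, because linear CKA is invariant to orthogonal transformations of the feature space: it compares how tokens relate to one another, whereas token-wise cosine matching compares each token's coordinates directly.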
