MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation
Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao Guo

TL;DR
MoCA introduces a scalable compositional 3D generation model that efficiently handles numerous components by using importance-based routing and component compression, outperforming existing methods in object and scene creation.
Contribution
MoCA's novel importance-based routing and component compression enable scalable, fine-grained 3D generation with improved efficiency and quality.
Findings
MoCA outperforms baselines in compositional 3D object generation.
MoCA effectively scales to a large number of components.
MoCA achieves superior scene generation results.
Abstract
Compositionality is critical for 3D object and scene generation, but existing part-aware 3D generation methods scale poorly because the cost of global attention grows quadratically as the number of components increases. In this work, we present MoCA, a compositional 3D generative model with two key designs: (1) importance-based component routing, which selects the top-k relevant components for sparse global attention, and (2) compression of unimportant components, which preserves the contextual priors of unselected components while reducing the computational complexity of global attention. With these designs, MoCA enables efficient, fine-grained compositional 3D asset creation with a scalable number of components. Extensive experiments show that MoCA outperforms baselines on both compositional object and scene generation tasks. Project page: https://lizhiqi49.github.io/MoCA
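The two designs named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the router or the compression module, so the importance score (dot product of mean-pooled component summaries), the compression (mean pooling to one token), and the names `route_context` and `attention_pairs` are all assumptions made for illustration.

```python
def mean_pool(tokens):
    """Average a list of equal-length token vectors into one vector."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def route_context(components, query_idx, k):
    """Build the key/value context for one query component.

    components: list of components, each a list of token vectors.
    ASSUMPTION: importance is the dot product of mean-pooled summaries,
    and an unselected component is compressed to a single pooled token.
    """
    query_summary = mean_pool(components[query_idx])
    others = [i for i in range(len(components)) if i != query_idx]
    scores = {i: dot(query_summary, mean_pool(components[i])) for i in others}
    # Top-k routing: keep the k highest-scoring other components.
    selected = sorted(others, key=lambda i: scores[i], reverse=True)[:k]
    context = list(components[query_idx])  # a component always sees itself
    for i in others:
        if i in selected:
            context.extend(components[i])  # full-resolution tokens
        else:
            context.append(mean_pool(components[i]))  # one compressed token
    return context, selected


def attention_pairs(num_components, tokens_per_component, k=None, compressed_len=1):
    """Count query-key pairs for dense (k=None) or routed global attention."""
    n, t = num_components, tokens_per_component
    if k is None:
        return (n * t) ** 2  # every token attends to every token
    k = min(k, n - 1)
    per_query_context = t + k * t + (n - 1 - k) * compressed_len
    return n * t * per_query_context


# Toy usage: four components with four 2-D tokens each (hypothetical data).
components = [
    [[1.0, 0.0]] * 4,
    [[0.9, 0.1]] * 4,
    [[0.0, 1.0]] * 4,
    [[-1.0, 0.0]] * 4,
]
context, selected = route_context(components, query_idx=0, k=1)
dense_cost = attention_pairs(4, 4)
routed_cost = attention_pairs(4, 4, k=1)
```

Under these assumptions, the per-query context grows linearly in the number of unselected components (one token each) rather than by their full token count, which is the source of the savings over dense global attention.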
Peer Reviews
Decision · Submitted to ICLR 2026
- Good motivation: The most challenging aspect of part generation is handling a large number of parts. This work tries to improve performance on such an important task.
- Good results: The method can generate more parts than previous work.
- Sound technical approach: The general design of the method is sound, and the idea of using routers is interesting.
- Complicated system: The system seems very complicated, and the technical details are hard to read. The method pipeline figure is challenging to understand; a better way may be to decompose it into multiple figures so it can be understood part by part.
- Limited insight: Although the routing mechanism seems valid, it is a general method, and the authors do not further exploit properties that are unique to 3D part structures.
1. The proposed Mixture-of-Components Attention is well motivated for addressing the quadratic global attention cost.
2. Complete ablations: All design choices (compression, gating, activation, multi-head routing) are ablated.
3. Strong performance: Better experimental results are observed against baselines such as PartPacker, PartCrafter, and MIDI.
1. No appearance: It seems that all methods, including the baselines, generate only meshes without textures, which might limit real-world applications. Can the authors provide more details about this?
- Although the method itself is standard, the results appear strong, largely due to the high quality of current datasets rather than novel modeling contributions.
- The focus on structured, compositional 3D generation is a meaningful and valuable research direction that deserves further attention.
- The transformer design lacks novelty; it is widely used and studied in the literature on scene generation and part-based 3D object generation. Given the small latent space (fewer than 100 parts), any sufficiently large transformer could model the distribution, so the architectural choices do not seem critical. I would consider this a "fake" contribution.
- The reported results are not diverse, raising concerns about overfitting to the dataset.
- The paper evaluates only on synthetic data.
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
