Fused-Planes: Why Train a Thousand Tri-Planes When You Can Share?
Karim Kassab, Antoine Schnepf, Jean-Yves Franceschi, Laurent Caraffa, Flavian Vasile, Jeremie Mary, Andrew Comport, Valérie Gouet-Brunet

TL;DR
Fused-Planes introduces a shared-base plane approach for 3D object modeling that significantly reduces training time and memory usage compared to traditional Tri-Planes, while maintaining high rendering quality.
Contribution
The paper proposes Fused-Planes, a novel shared-base plane representation that captures structural similarities across objects, improving efficiency over independent Tri-Plane training.
Findings
7.2× faster training than Tri-Planes
3.2× lower memory footprint
1875× reduction in per-object memory with the ultra-lightweight (ULW) variant, with minimal quality loss
Abstract
Tri-Planar NeRFs enable the application of powerful 2D vision models for 3D tasks, by representing 3D objects using 2D planar structures. This has made them the prevailing choice to model large collections of 3D objects. However, training Tri-Planes to model such large collections is computationally intensive and remains largely inefficient. This is because the current approaches independently train one Tri-Plane per object, hence overlooking structural similarities in large classes of objects. In response to this issue, we introduce Fused-Planes, a novel object representation that improves the resource efficiency of Tri-Planes when reconstructing object classes, all while retaining the same planar structure. Our approach explicitly captures structural similarities across objects through a latent space and a set of globally shared base planes. Each individual Fused-Planes is then…
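As a rough illustration of the "globally shared base planes" idea in the abstract, here is a minimal sketch. It is not the authors' implementation. It assumes that each object keeps a small object-specific "micro" plane plus a few scalar weights, that a "macro" plane is a weighted sum of the shared base planes, and that the two are concatenated along the channel axis. All names (`make_plane`, `macro_plane`, `fused_plane`) and all sizes are hypothetical, and a single plane stands in for the three planes of a Tri-Plane.

```python
import random


def make_plane(channels, res, rng):
    """Random plane stored channel-major as a flat list of channels*res*res floats."""
    return [rng.gauss(0.0, 1.0) for _ in range(channels * res * res)]


def macro_plane(weights, base_planes):
    """Weighted sum of the shared base planes (assumed form of the macro plane)."""
    out = [0.0] * len(base_planes[0])
    for w, base in zip(weights, base_planes):
        for i, value in enumerate(base):
            out[i] += w * value
    return out


def fused_plane(micro, weights, base_planes):
    """Channel-wise concatenation of micro and macro features (assumption).

    With a channel-major flat layout, list concatenation stacks the channels.
    """
    return micro + macro_plane(weights, base_planes)


rng = random.Random(0)
# Hypothetical sizes: resolution, micro channels, macro channels,
# number of shared base planes, number of objects.
RES, C_MIC, C_MAC, K, N_OBJECTS = 8, 2, 4, 3, 5

# Shared across every object in the class; stored once.
base_planes = [make_plane(C_MAC, RES, rng) for _ in range(K)]

# Per-object state: a small micro plane and K scalar weights.
objects = [
    {"micro": make_plane(C_MIC, RES, rng),
     "weights": [rng.random() for _ in range(K)]}
    for _ in range(N_OBJECTS)
]
planes = [fused_plane(o["micro"], o["weights"], base_planes) for o in objects]

# Parameters stored per object versus one independent plane of the same size.
per_object_params = C_MIC * RES * RES + K
independent_params = (C_MIC + C_MAC) * RES * RES
```

Under these assumptions, the cost of the base planes is paid once per class, so the per-object storage shrinks as more of the channel budget moves from the micro plane to the shared macro part.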
Peer Reviews
Decision·ICLR 2026 Poster
* **Clear and practical contribution.** The paper tackles a real inefficiency in Tri-Plane training and offers an intuitive solution that effectively shares structure within a class.
* **Strong empirical gains.** Training speed, memory footprint, and quality all improve substantially. The ULW version demonstrates impressive compression with minimal degradation.
* **Elegant design.** The micro–macro split is simple yet powerful, allowing reuse of planar architectures while amortizing cost across
* **Single-class limitation.** The method assumes one class per model. Scaling to diverse datasets would require separate sets of base planes, reducing its flexibility.
* **Limited analysis of shared bases.** The learned base planes are not explored in depth. It is unclear what structures they capture or how weights vary across instances.
* **Restricted evaluation scope.** Experiments focus only on novel view synthesis of synthetic datasets. No tests on real or multi-class data, or on downstream
1. The paper introduces a 3D-aware latent space as a form of shared representation in the object reconstruction domain, enabling the model to better capture structural similarities across object classes. According to the ablation study, employing this latent representation rather than directly optimizing in RGB space leads to faster convergence while maintaining comparable rendering quality.
2. Compared to C3 NeRF, which scales only up to around 20 scenes, the proposed approach remains scalable
1. Although the paper claims that Fused-Planes remains scalable to thousands of objects, I do not observe convincing evidence of this property from the presented experimental results or supplementary videos.
2. In line 103, the paper states that TensoRF, 3DGS, and Instant-NGP cannot be reshaped into image-like tensors. However, to my knowledge, several recent works in the 3DGS domain, such as Animatable Gaussians, ASH, GaussianAvatar, and Reperformer, have successfully employed 2D UV unwrapping
The major strength of this paper is its significant resource savings (7.2× faster training and 3.2× lower memory than Tri-Planes). In addition, the ultra-lightweight variant achieves a 1875× memory reduction while maintaining rendering quality.
The method only works within a single object class. Multiple classes with large visual variations require multiple instances of Fused-Planes, which significantly limits practical applicability. For diverse datasets, the overhead of multiple base plane sets could negate the efficiency gains. Like other Tri-Plane methods, this method remains limited for open surfaces and unbounded scenes. In Table 4, Fused-Planes (Micro) (latent space without macro planes) performs worse than Tri-Planes. This sugge
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · 3D Shape Modeling and Analysis
Methods: Sparse Evolutionary Training · Focus
