On Scaling Up 3D Gaussian Splatting Training
Hexu Zhao, Haoyang Weng, Daohan Lu, Ang Li, Jinyang Li, Aurojit Panda, Saining Xie

TL;DR
This paper introduces Grendel, a distributed system that enables scalable 3D Gaussian Splatting training across multiple GPUs, significantly improving reconstruction quality and handling large-scale scenes efficiently.
Contribution
We propose Grendel, a novel distributed framework for 3D Gaussian Splatting that supports multi-GPU training with sparse communication and dynamic load balancing.
Findings
Enhanced rendering quality with larger Gaussian models.
Achieved higher PSNR on large-scale scenes using multiple GPUs.
Demonstrated effective hyperparameter scaling strategies.
Abstract
3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its superior visual quality and rendering speed. However, 3DGS training currently occurs on a single GPU, limiting its ability to handle high-resolution and large-scale 3D reconstruction tasks due to memory constraints. We introduce Grendel, a distributed system designed to partition 3DGS parameters and parallelize computation across multiple GPUs. As each Gaussian affects a small, dynamic subset of rendered pixels, Grendel employs sparse all-to-all communication to transfer the necessary Gaussians to pixel partitions and performs dynamic load balancing. Unlike existing 3DGS systems that train using one camera view image at a time, Grendel supports batched training with multiple views. We explore various optimization hyperparameter scaling strategies and find that a simple sqrt(batch size) scaling rule is…
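The abstract names a sqrt(batch size) scaling rule for optimization hyperparameters. A minimal sketch of what such a rule might look like when applied to a learning rate follows, assuming the base value was tuned for a batch size of 1. The function name `scale_lr_sqrt` and the base value `0.01` are hypothetical placeholders for illustration and are not taken from the paper.

```python
import math


def scale_lr_sqrt(base_lr: float, batch_size: int) -> float:
    """Scale a learning rate tuned at batch size 1 by sqrt(batch_size).

    Hypothetical helper: illustrates a square-root scaling rule only;
    it is not the authors' implementation.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return base_lr * math.sqrt(batch_size)


if __name__ == "__main__":
    base_lr = 0.01  # placeholder value, assumed tuned for batch size 1
    for batch_size in (1, 4, 16):
        print(batch_size, scale_lr_sqrt(base_lr, batch_size))
```

Under this rule, quadrupling the batch size doubles the learning rate, which is a gentler increase than linear scaling.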
Peer Reviews
Decision: ICLR 2025 Oral
(1) This work enables the distributed training of 3DGS, which can be very useful to accelerate the training of 3DGS. (2) The paper is well-written and easy to understand. (3) This work did exhaustive experiments on very large-scale datasets (e.g., the MatrixCity dataset).
(1) The related works are put into the last section of the paper, which is weird to me. It lacks discussion of some existing distributed methods, such as **DOGS** (Chen and Lee, NeurIPS 2024) and RetinaGS (Li et al., arXiv 2024). (2) In Sec. 6, Related Works, the discussion of VastGaussian, CityGaussian, and Hierarchical Gaussian is wrong. They do not merge the resulting images; instead, they merge the sub-models. And the authors also claimed that "None of these systems can consider a full large
**Motivation** * This paper has a strong motivation of expanding the training of 3DGS into multiple GPUs/machines. The vanilla 3DGS has a limited number of primitives supported due to the GPU memory limit, but the exceptional performance of 3DGS indicates its great usefulness in large-scale reconstruction. This paper is monumental in providing support in terms of parallel training. **Method** * This paper correctly identifies the per-Gaussian and per-pixel stages of the training pipeline of 3DG
**Method** * The paper follows the vanilla 3DGS design and does not consider the Level of Detail support. Since this paper focuses on the large-scale reconstruction, LOD is essential for downstream applications. It is not clear how the proposed method can support LOD. **Experiment** * The quality improvement is not as significant as the number of primitives or the training speed. However, I would like to argue that this is because of the lack of an appropriate dataset to evaluate a very large-s
1. Proposal of parallelization with respect to different stages of the rendering pipeline of 3DGS is interesting. 2. With the proposed method, training 3DGS on high-resolution images is possible. 3. Batch processing can significantly increase the training speed.
1. This paper lacks a conclusion and limitations, which makes its overall contribution difficult to understand and makes the paper appear incomplete. 2. Similar to the proposed learning rate scaling, RAdam [1] also suggests a method for variance rectification. It also rectifies the learning rate using variance to enable adaptive learning and, I believe, takes batch size into account. However, the advantage of the proposed method remains unclear. 3. The rendering speed is expected to be slow due to
