Feed-Forward 3D Gaussian Splatting Compression with Long-Context Modeling
Zhening Liu, Rui Song, Yushi Huang, Yingdong Hu, Xinjie Zhang, Jiawei Shao, Zehong Lin, Jun Zhang

TL;DR
This paper introduces a novel feed-forward compression framework for 3D Gaussian Splatting that models long-range dependencies, significantly reducing data size while maintaining high-quality 3D representations.
Contribution
It proposes a large-scale context structure and an attention-based transform coding model to enhance long-range correlation modeling in 3DGS compression.
Findings
Achieves a 20x compression ratio for 3DGS.
Reports state-of-the-art performance among feed-forward codecs, with about 10% BD-rate gain over FCGS according to one review.
Enables highly compact and generalizable 3D representations.
Abstract
3D Gaussian Splatting (3DGS) has emerged as a revolutionary 3D representation. However, its substantial data size poses a major barrier to widespread adoption. While feed-forward 3DGS compression offers a practical alternative to costly per-scene per-train compressors, existing methods struggle to model long-range spatial dependencies, due to the limited receptive field of transform coding networks and the inadequate context capacity in entropy models. In this work, we propose a novel feed-forward 3DGS compression framework that effectively models long-range correlations to enable highly compact and generalizable 3D representations. Central to our approach is a large-scale context structure that comprises thousands of Gaussians based on Morton serialization. We then design a fine-grained space-channel auto-regressive entropy model to fully leverage this expansive context. Furthermore,…
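The Morton-serialized context mentioned in the abstract can be illustrated with a minimal sketch. It quantizes Gaussian centers to an integer grid, interleaves the coordinate bits into a Morton (Z-order) code, sorts by that code, and cuts the sorted sequence into fixed-size windows. All names (`morton3d`, `quantize`, `morton_windows`), the 10-bit grid, the bounding-box normalization, and the window size of 1024 are assumptions for illustration, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def part1by2(v: int, bits: int = 10) -> int:
    """Spread the low `bits` bits of v so that bit i moves to bit 3*i."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out


def morton3d(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave three grid coordinates into one Morton (Z-order) code."""
    return (part1by2(ix, bits)
            | (part1by2(iy, bits) << 1)
            | (part1by2(iz, bits) << 2))


def quantize(points: Sequence[Point], bits: int = 10) -> List[Tuple[int, int, int]]:
    """Map float positions onto an integer grid spanning their bounding box."""
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    top = (1 << bits) - 1  # largest valid grid index per axis
    grid = []
    for p in points:
        cell = []
        for a in range(3):
            span = hi[a] - lo[a]
            t = 0.0 if span == 0 else (p[a] - lo[a]) / span
            cell.append(min(top, int(t * top)))
        grid.append((cell[0], cell[1], cell[2]))
    return grid


def morton_windows(points: Sequence[Point], window: int = 1024,
                   bits: int = 10) -> List[List[int]]:
    """Return point indices in Morton order, grouped into context windows.

    The window size is a hypothetical parameter; the abstract only says a
    context comprises thousands of Gaussians.
    """
    grid = quantize(points, bits)
    order = sorted(range(len(points)),
                   key=lambda i: morton3d(*grid[i], bits=bits))
    return [order[s:s + window] for s in range(0, len(order), window)]
```

Each returned window is a list of indices into the original point list, so per-Gaussian attributes can be gathered in the same order before being passed to whatever transform or entropy model consumes the context.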
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The paper effectively addresses long-range dependencies by introducing a large-scale context structure. This is a significant improvement over voxel-based local contexts.
2. The combination of Morton serialization, self-attention transform coding, and space-channel context modeling is well-motivated. The design ensures spatial coherence and enables efficient information aggregation across distant Gaussians.
3. The method demonstrates consistent improvements across multiple datasets
1. Shortcomings still exist in the experiments. They primarily include the lack of comparison on computational complexity and the lack of ablation studies on the design choice of using both lossy and lossless color compression simultaneously.
2. The paper lacks discussion on training overhead and hardware requirements
- It identifies a meaningful technical limitation of prior feed-forward 3DGS compression methods, namely the limited receptive field of the transform coding network, and addresses it by designing an architecture for modeling long contexts.
- The proposed fine-grained space-channel context model enables more accurate probability distribution estimation, resulting in improved BD-rate.
- The manuscript is well organized to follow the logical flow, while the evaluations successfully validate the contri
- Despite fast compression due to a feed-forward framework, several optimization-based compression methods achieve better BD-rate performance.
- The training process demands a large-scale 3DGS dataset, which incurs substantial computational costs associated with optimizing thousands of 3D scenes.
- This method does not achieve a noticeable improvement in compression speed, which is also an important factor for feed-forward compression.
- The feed-forward compression may require excessive GPU
1. Morton serialization: Introduces Morton-order sorting to enable long-range spatial dependency modeling among thousands of Gaussian primitives while maintaining spatial locality.
2. Space-channel autoregressive entropy model: Proposes a fine-grained space-channel context modeling strategy that jointly captures inter- and intra-correlations of Gaussian primitives.
1. Limited novelty beyond integration: The approach mainly combines Morton serialization, attention-based transform coding, and autoregressive entropy modeling, all known components in point cloud compression, with limited theoretical innovation.
2. Computational and memory overhead: While the attention-based transform coding expands the receptive field, the paper does not quantify the added GPU memory or computational cost compared to simpler MLP-based feed-forward designs.
3. No evaluation wit
1. The paper provides abundant subjective and objective evaluations and thorough ablation studies.
2. The proposed method achieves about 10% BD-rate gain compared with FCGS while maintaining excellent parallelism.
1. Clarity of the model: The proposed method exhibits a high degree of overlap with FCGS. Therefore, exact implementation differences and the precise source of the reported gains must be explicitly clarified:
(a) Context window splitting: 3D Morton code suffers from discontinuity: consecutive Morton codes can map to voxels that are far apart in Euclidean space. How does the proposed method solve this problem?
(b) DGCNN and the attention module: In Fig. 4 left and Supplement A.1, the DGCNN bl
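The Morton discontinuity raised in point (a) of this review can be checked directly with a small sketch. It decodes every Morton code on a small grid and records the largest squared Euclidean distance between the voxels of two consecutive codes. The function names and the brute-force scan are assumptions for illustration; the sketch only demonstrates the reviewer's concern and says nothing about how the paper handles it.

```python
from typing import Tuple


def demorton3d(code: int, bits: int) -> Tuple[int, int, int]:
    """Invert a 3D Morton code: de-interleave bits back into (x, y, z)."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z


def worst_jump_sq(bits: int) -> Tuple[int, int]:
    """Scan all consecutive code pairs on a (2**bits)^3 grid.

    Returns (largest squared distance, code at which that jump lands).
    Squared distances keep the arithmetic in exact integers.
    """
    best = (0, 0)
    prev = demorton3d(0, bits)
    for code in range(1, 1 << (3 * bits)):
        cur = demorton3d(code, bits)
        d2 = sum((a - b) ** 2 for a, b in zip(cur, prev))
        if d2 > best[0]:
            best = (d2, code)
        prev = cur
    return best
```

On a 4x4x4 grid (`bits=2`), codes 31 and 32 decode to voxels (3, 3, 1) and (0, 0, 2), a squared distance of 19, even though they are adjacent in the serialized order. A fixed-size window that straddles such a boundary therefore mixes spatially distant Gaussians, which is the situation the reviewer asks the authors to address.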
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Advanced Data Compression Techniques
