TL;DR
SplitGaussian introduces a novel scene representation that explicitly separates static and dynamic components, improving the quality, stability, and interpretability of dynamic scene reconstructions from monocular videos.
Contribution
It proposes a decoupled framework that disentangles static and dynamic scene elements, addressing motion leakage and geometric distortions in Gaussian Splatting-based reconstructions.
Findings
Outperforms prior methods in rendering quality.
Enhances temporal consistency and geometric stability.
Accelerates convergence in scene reconstruction.
Abstract
Reconstructing dynamic 3D scenes from monocular video remains fundamentally challenging due to the need to jointly infer motion, structure, and appearance from limited observations. Existing dynamic scene reconstruction methods based on Gaussian Splatting often entangle static and dynamic elements in a shared representation, leading to motion leakage, geometric distortions, and temporal flickering. We identify that the root cause lies in the coupled modeling of geometry and appearance across time, which hampers both stability and interpretability. To address this, we propose SplitGaussian, a novel framework that explicitly decomposes scene representations into static and dynamic components. By decoupling motion modeling from background geometry and allowing only the dynamic branch to deform over time, our method prevents motion artifacts in static regions while supporting view-…
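The decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Gaussian` class, the `is_dynamic` flag (assumed here to come from an external static/dynamic mask), and the `toy_deformation` function (a hypothetical stand-in for a learned deformation network) are all assumptions made for illustration. The sketch shows only the core idea that static Gaussians keep their centers across time while dynamic ones receive a time-dependent offset.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Gaussian:
    center: Vec3
    is_dynamic: bool  # assumption: label derived from an external static/dynamic mask


def toy_deformation(center: Vec3, t: float) -> Vec3:
    """Hypothetical stand-in for a learned deformation network.

    Returns a small time-dependent offset for a Gaussian center.
    """
    x, y, _ = center
    return (0.1 * math.sin(t + x), 0.1 * math.cos(t + y), 0.0)


def centers_at_time(gaussians: List[Gaussian], t: float) -> List[Vec3]:
    """Evaluate Gaussian centers at time t; only the dynamic branch deforms."""
    out: List[Vec3] = []
    for g in gaussians:
        if g.is_dynamic:
            dx, dy, dz = toy_deformation(g.center, t)
            out.append((g.center[0] + dx, g.center[1] + dy, g.center[2] + dz))
        else:
            # Static branch: the center is returned unchanged for every t.
            out.append(g.center)
    return out


scene = [
    Gaussian(center=(0.0, 0.0, 1.0), is_dynamic=False),  # background
    Gaussian(center=(1.0, 0.0, 2.0), is_dynamic=True),   # moving object
]
frame_a = centers_at_time(scene, 0.0)
frame_b = centers_at_time(scene, 1.0)
```

Because the static branch never passes through the deformation function, no motion can leak into background geometry in this toy setup, which is the property the abstract attributes to the split representation.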
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The paper proposes a reasonable two-branch framework that explicitly separates static and dynamic Gaussians. By integrating spatiotemporal encoding, deformation networks, and visibility-driven pruning, the approach provides a clear structure for handling motion and contributes to more stable reconstruction results, even if the overall architectural novelty is moderate.
2. The inclusion of depth-aware pretraining and visibility-based pruning improves geometric stability, accelerates convergenc
1. The proposed method heavily relies on the accuracy of the binary mask used to separate static and dynamic regions. Since this mask plays a crucial role in guiding the disentanglement process, the overall reconstruction quality may be highly sensitive to mask quality. Therefore, it is important to demonstrate the robustness of the method across different mask generation models or noise levels. An ablation or sensitivity analysis on mask reliability would strengthen the claims.
2. Lack of comp
* The paper identifies a well-defined weakness in prior dynamic 3DGS works (coupled geometry–appearance modeling) and provides a clear, theoretically sound decomposition strategy.
* The paper reports extensive quantitative and qualitative results across multiple datasets, including ablation studies that isolate each module's contribution.
* The paper is easy to follow, with well-organized sections and clear figures that effectively convey the pipeline and improvements.
While I agree that prior approaches such as Deformable 3DGS, which apply a unified deformation network to both static and dynamic regions, indeed suffer from some problems, my main concern with this paper lies in its insufficient novelty and contribution.
* The proposed solution to the coupling issue primarily relies on using external masks to separate static and dynamic regions and then applying conventional deformation modeling to the dynamic part. This idea has already been explored in many
[S1] Clearly written equations. The equations are presented in a concise and intuitive manner, making it easy for readers to follow the mathematical formulation. They effectively connect the theoretical design with implementation details. This clarity significantly enhances the overall readability and technical understanding of the paper.
[S2] Clear and informative figures. The figures visually convey the proposed method and its workflow with high clarity. They effectively complement the textua
[W1] Large overlap with RoDyGS [1]. The overall concept of SplitGaussian shows substantial overlap with RoDyGS, which also separates static and dynamic components of SfM points. Moreover, on the iPhone benchmark—a common evaluation dataset between SplitGaussian and RoDyGS—RoDyGS significantly outperforms SplitGaussian, despite being trained under a pose-free setup. The authors should clarify the key distinctions and contributions of SplitGaussian compared to RoDyGS, which was first published in
