Two-View Accumulation as the Primary Training Lever for Hybrid-Capture Gaussian Splatting: A Variance-Decomposition View of When Gradient Surgery Helps

Sungjun Cho

arXiv:2605.00052·cs.CV·May 4, 2026


TL;DR

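This paper shows that rendering two views per optimizer step is the change that closes the 1-3 dB minority-regime under-fit of standard 3D Gaussian Splatting on hybrid-capture scenes, and it explains the effect with a variance-decomposition framework. How the two views are paired (geometry-defined near/far, random, or active loss-disparity) does not change PSNR beyond seed variance.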

Contribution

The key novelty is identifying two-view accumulation as the primary training lever, supported by a variance-based explanation for its effectiveness.

Findings

01

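Rendering two views per optimizer step outperforms the compute-matched single-view alternatives tested for hybrid-capture 3DGS: vanilla 60K iterations, GradNorm magnitude corrections, near/far gradient surgery, projective preconditioning, and confidence-gated sample-level surgery.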

02

Variance decomposition explains the effectiveness of two-view training.

03

The two-view approach transfers to other Gaussian Splatting backbones.
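
As a generic illustration of the kind of variance argument involved (not the paper's own decomposition, which this summary does not reproduce), consider two per-view gradients g_1 and g_2 accumulated into one step. The symbols below are assumptions introduced for this sketch: sigma^2 is a common per-view gradient variance (per coordinate), rho is the correlation between the two views' gradients, and r is a regime label such as near or far.

```latex
% Variance of the two-view averaged gradient (standard identity).
\operatorname{Var}\!\left[\tfrac{1}{2}(g_1 + g_2)\right]
  = \tfrac{1}{4}\bigl(\operatorname{Var}[g_1] + \operatorname{Var}[g_2]
      + 2\operatorname{Cov}[g_1, g_2]\bigr)
  = \tfrac{\sigma^2}{2}\,(1 + \rho)

% Law of total variance, splitting \sigma^2 by capture regime r.
\sigma^2
  = \underbrace{\mathbb{E}_r\bigl[\operatorname{Var}[g \mid r]\bigr]}_{\text{within-regime}}
  + \underbrace{\operatorname{Var}_r\bigl[\mathbb{E}[g \mid r]\bigr]}_{\text{between-regime}}
```

For independently sampled views (rho = 0) the first identity gives half the single-view variance, whichever regime each view comes from.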

Abstract

Hybrid-capture novel view synthesis combines images at substantially different camera distances (e.g., aerial drone and ground-level views). Standard 3D Gaussian Splatting (3DGS), trained for 30K iterations with one rendered view per optimizer step, under-fits the minority regime by 1-3 dB on five hybrid-capture benchmarks. We isolate the lever that closes this gap. Among compute-matched alternatives -- vanilla 60K iterations, magnitude corrections (GradNorm), direction-aware near/far gradient surgery, projective preconditioning, confidence-gated sample-level surgery, and a random two-view-per-step control -- the simplest structural change wins: rendering two views per optimizer step. The pairing rule (geometry-defined near/far, random, or active loss-disparity) does not change PSNR beyond seed variance on any of the five scenes; the structural change of having two views per step…
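
The structural change described above, one render per optimizer step versus two, can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's training code: the scalar parameter `theta`, the quadratic per-view loss, and the helper names `view_gradient`, `train`, and `accumulated_grad_variance` are all hypothetical, and a list of scalar targets stands in for the rendered views.

```python
import random
import statistics


def view_gradient(theta, target):
    # Gradient of the toy per-view loss 0.5 * (theta - target) ** 2.
    return theta - target


def train(targets, views_per_step, total_renders, lr=0.1, seed=0):
    """Toy SGD that averages gradients over `views_per_step` random views per step.

    The number of optimizer steps is total_renders // views_per_step, so runs
    with different `views_per_step` use the same rendering budget.
    """
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(total_renders // views_per_step):
        grad = 0.0
        for _ in range(views_per_step):
            target = rng.choice(targets)  # random view selection (assumption)
            grad += view_gradient(theta, target) / views_per_step
        theta -= lr * grad  # one optimizer step per accumulated gradient
    return theta


def accumulated_grad_variance(targets, views_per_step, trials=20000, seed=0):
    """Empirical variance of the accumulated gradient at theta = 0."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        grads = [view_gradient(0.0, rng.choice(targets)) for _ in range(views_per_step)]
        samples.append(sum(grads) / views_per_step)
    return statistics.pvariance(samples)


if __name__ == "__main__":
    # Three "majority" views and one "minority" view (toy stand-ins).
    toy_targets = [0.0, 0.0, 0.0, 10.0]
    print("one view per step :", accumulated_grad_variance(toy_targets, 1))
    print("two views per step:", accumulated_grad_variance(toy_targets, 2))
```

With random pairing, the variance of the accumulated gradient in this toy is roughly half the single-view variance. The sketch says nothing about PSNR or about how 3DGS densification interacts with accumulation.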
