SA-ResGS: Self-Augmented Residual 3D Gaussian Splatting for Next Best View Selection
Kim Jun-Seong, Tae-Hyun Oh, Eduardo Pérez-Pellitero, Youngkyoon Jang

TL;DR
SA-ResGS introduces a novel framework that enhances uncertainty estimation and view selection in 3D scene reconstruction by combining residual learning, physically guided view strategies, and uncertainty-aware supervision.
Contribution
It presents a new residual learning strategy for 3D Gaussian Splatting and a physically grounded view selection method to improve active scene reconstruction.
Findings
Outperforms state-of-the-art in reconstruction quality
Improves robustness of view selection
Enhances uncertainty estimation accuracy
Abstract
We propose Self-Augmented Residual 3D Gaussian Splatting (SA-ResGS), a novel framework to stabilize uncertainty quantification and enhance uncertainty-aware supervision in next-best-view (NBV) selection for active scene reconstruction. SA-ResGS improves both the reliability of uncertainty estimates and their effectiveness for supervision by generating Self-Augmented point clouds (SA-Points) via triangulation between a training view and a rasterized extrapolated view, enabling efficient scene coverage estimation. While improving scene coverage through physically guided view selection, SA-ResGS also addresses the challenge of under-supervised Gaussians, exacerbated by sparse and wide-baseline views, by introducing the first residual learning strategy tailored for 3D Gaussian Splatting. This targeted supervision enhances gradient flow in high-uncertainty Gaussians by combining…
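The abstract describes SA-Points as triangulated between a training view and an extrapolated view. As a rough illustration of what triangulating one correspondence involves, the sketch below uses midpoint triangulation of two camera rays. This is a generic textbook technique and not necessarily the one used in the paper. The function name `triangulate_midpoint` and its parameters are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _along(o: Vec3, d: Vec3, t: float) -> Vec3:
    """Point at parameter t along the ray o + t * d."""
    return (o[0] + t * d[0], o[1] + t * d[1], o[2] + t * d[2])


def triangulate_midpoint(
    o1: Vec3, d1: Vec3, o2: Vec3, d2: Vec3, eps: float = 1e-9
) -> Optional[Vec3]:
    """Midpoint of the shortest segment between two camera rays.

    o1, o2 are camera centres and d1, d2 are ray directions through a
    matched pixel pair (hypothetical inputs, not the paper's interface).
    Returns None for near-parallel rays or points behind either camera.
    """
    w = (o1[0] - o2[0], o1[1] - o2[1], o1[2] - o2[2])
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # rays are (nearly) parallel: depth is unconstrained
    # Closed-form minimiser of |(o1 + t1*d1) - (o2 + t2*d2)|^2.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    if t1 <= 0.0 or t2 <= 0.0:
        return None  # intersection lies behind one of the cameras
    p1, p2 = _along(o1, d1, t1), _along(o2, d2, t2)
    return (0.5 * (p1[0] + p2[0]), 0.5 * (p1[1] + p2[1]), 0.5 * (p1[2] + p2[2]))
```

Rejecting near-parallel rays matters here because a small baseline between the two views makes the recovered depth ill-conditioned.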
Peer Reviews
Decision: ICLR 2026 Conference Desk Rejected Submission
1. It clearly points out three specific limitations of existing NBV methods, which is valuable and insightful for readers. 2. The paper introduces an innovative approach to decouple view selection from early-stage uncertainty estimation through Self-Augmented Points (SA-Points). This addresses a real limitation, as early-stage uncertainty estimates in 3DGS are unreliable due to sparse geometry and training instability.
1. The paper positions itself as contributing to sparse-view 3D reconstruction but compares only against view selection methods, not against specialized sparse-view reconstruction methods such as FSGS, SparseGS, DNGaussian, MVPGaussian, and RegGaussian. The authors need to clarify how useful next-best-view selection is: for example, given 20 images, how does careful view selection (SA-ResGS) with standard training compare with uniform sampling plus strong regularization (FSGS, SparseGS, ...), also n
Conceptually novel SA-Points mechanism: The idea of generating pseudo-3D points from a synthetically perturbed virtual view for coverage-aware NBV selection is original and well-motivated. It bridges geometric reasoning and learning-based active view planning in a lightweight manner. Stability improvement without heavy computation: The method improves early-stage training stability and uncertainty estimation without adding expensive optimization or extra 3D supervision. Practical training refi
ResGS innovation is incremental: While effective, the residual supervision is conceptually similar to uncertainty-weighted or hard-example reweighting schemes known in NeRF and 3DGS literature; the novelty lies more in integration than in principle. Lack of robustness and generalization analysis: No experiments test how SA-Points or ResGS behave when correspondence quality or uncertainty estimation deteriorates. Ablation depth: It remains unclear whether the gains are primarily due to SA-Point
Originality: The paper makes a notable and original contribution by integrating physical constraints and a residual learning strategy into the active 3D Gaussian Splatting (3DGS) paradigm. The introduction of Self-Augmented Points (SA-Points) for physically grounded next-best-view selection represents a creative and conceptually elegant solution to the long-standing problem of unreliable uncertainty estimation under sparse-view settings. Moreover, the residual supervision mechanism—the first of
While the proposed framework is conceptually strong and methodologically sound, the experimental results reveal certain limitations that warrant attention. Specifically, although SA-ResGS achieves clear improvements over several baselines, its quantitative performance does not consistently reach state-of-the-art levels in all metrics—particularly SSIM and LPIPS—suggesting room for further refinement in perceptual quality and structural consistency. Additionally, some experimental descriptions co
The paper’s main strength is the attempt to stabilize early NBV decisions by introducing a physically grounded coverage prefilter that does not depend on potentially unreliable uncertainty estimates. This is a clean and reusable idea that could be integrated into other active-reconstruction pipelines. The residual supervision mechanism is also practical, as it works at the level of data selection without modifying the renderer, and it is easy to plug into existing 3DGS training code. Finally, th
A major weakness of this work is that several core design choices are heuristic and insufficiently analyzed: 1. The coverage prefilter relies on a hash-encoded voxel grid that can suffer from collisions and is sensitive to voxel size, yet no robustness study of this parameter is provided. 2. The uncertainty-guided residual supervision selects Gaussians using opacity/scale; however, the authors did not discuss its advantages over Fisher-information-based uncertainty [2,3]. 3. The pip
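To make the reviewer's concern about collisions and voxel size concrete, here is a minimal sketch of a hash-encoded voxel coverage count. It assumes a simple XOR-of-primes spatial hash, which is a common choice and not necessarily the paper's. The names `voxel_hash` and `coverage_gain`, the prime constants, and the table size are all illustrative assumptions.

```python
import math
from typing import Iterable, Set, Tuple

Vec3 = Tuple[float, float, float]

# Large primes often used for spatial hashing (an assumption, not from the paper).
PRIMES = (1, 2654435761, 805459861)


def voxel_hash(point: Vec3, voxel_size: float, table_size: int) -> int:
    """Map a 3D point to a bucket index in a fixed-size hash table."""
    ix, iy, iz = (math.floor(c / voxel_size) for c in point)
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size  # Python's % keeps the result non-negative


def coverage_gain(
    seen: Set[int], points: Iterable[Vec3], voxel_size: float, table_size: int
) -> int:
    """Number of hash buckets a candidate view's points would newly occupy."""
    new = {voxel_hash(p, voxel_size, table_size) for p in points}
    return len(new - seen)
```

With a small `table_size`, distinct voxels fall into the same bucket and genuinely new regions are counted as already covered. With a large `voxel_size`, nearby but distinct surfaces merge into one cell. Both effects are why a sensitivity study over these parameters would be informative.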
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Optical Sensing Technologies
