TL;DR
AIR is a self-supervised, feed-forward framework for 2D Gaussian splatting that replaces per-image iterative optimization with a single network pass, enabling fast, high-quality image reconstruction.
Contribution
AIR combines a stage-wise residual architecture, an explicit Stage Control mechanism that activates new primitives only in under-reconstructed regions, and a Predict–Optimize–Distill training strategy that stabilizes multi-stage prediction without per-image optimization.
Findings
Achieves better reconstruction quality than Gaussian-based baselines.
Reduces encoding time to 160–300 ms.
Operates without per-image iterative optimization.
Abstract
2D Gaussian splatting provides an efficient explicit representation for image reconstruction, but existing methods still require costly per-image iterative optimization or rely on handcrafted priors for primitive allocation. We present AIR, a self-supervised feed-forward framework that amortizes iterative Gaussian fitting into a single network pass, eliminating per-image test-time optimization. AIR adopts a stage-wise residual architecture that progressively predicts additional Gaussian primitives from reconstruction residuals, together with an explicit Stage Control mechanism that activates new primitives only in under-reconstructed regions. A Predict–Optimize–Distill training strategy stabilizes multi-stage prediction by distilling short-horizon optimized Gaussian increments back into the predictor. The stabilized predictor is then jointly finetuned across stages and equipped with…
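The stage-wise residual loop described in the abstract can be illustrated with a toy NumPy sketch. This is not the AIR network: `toy_predictor` is a hypothetical greedy stand-in for the learned predictor, the Gaussians are isotropic and grayscale, and the threshold `tau` used as the stage control mask is an assumed parameter, not one taken from the paper. The sketch only shows the control flow: render the current primitives, compute the residual, mask the under-reconstructed pixels, and add primitives there.

```python
import numpy as np


def gaussian_basis(cx, cy, sigma, height, width):
    """Unit-amplitude isotropic 2D Gaussian evaluated on the pixel grid."""
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def render(primitives, height, width):
    """Additively splat (cx, cy, sigma, amplitude) primitives onto a canvas."""
    canvas = np.zeros((height, width))
    for cx, cy, sigma, amp in primitives:
        canvas += amp * gaussian_basis(cx, cy, sigma, height, width)
    return canvas


def toy_predictor(residual, mask, budget, sigma):
    """Hypothetical stand-in for the learned network (NOT the AIR predictor).

    Greedily places up to `budget` Gaussians at the largest-|residual| pixels
    inside `mask`, using least-squares amplitudes so the error never increases.
    """
    height, width = residual.shape
    residual = residual.copy()
    new = []
    for _ in range(budget):
        score = np.where(mask, np.abs(residual), 0.0)
        cy, cx = np.unravel_index(np.argmax(score), score.shape)
        if score[cy, cx] == 0.0:
            break  # nothing left to fix inside the active region
        basis = gaussian_basis(cx, cy, sigma, height, width)
        amp = float((residual * basis).sum() / (basis * basis).sum())
        new.append((float(cx), float(cy), sigma, amp))
        residual -= amp * basis
    return new


def stagewise_fit(image, num_stages=4, budget=8, sigma=2.0, tau=0.05):
    """Stage-wise residual fitting with a threshold-based stage control mask."""
    height, width = image.shape
    primitives, errors = [], []
    for _ in range(num_stages):
        residual = image - render(primitives, height, width)
        errors.append(float(np.mean(residual ** 2)))
        # Assumed stage control rule: only pixels with |residual| > tau are active.
        mask = np.abs(residual) > tau
        if not mask.any():
            break
        primitives += toy_predictor(residual, mask, budget, sigma)
    final = image - render(primitives, height, width)
    errors.append(float(np.mean(final ** 2)))
    return primitives, errors


# Usage: fit a synthetic two-blob image and inspect the per-stage error.
target = render([(8.0, 8.0, 3.0, 1.0), (22.0, 20.0, 2.0, 0.7)], 32, 32)
primitives, errors = stagewise_fit(target)
print(len(primitives), errors[0], errors[-1])
```

In AIR the greedy inner loop is replaced by a single forward pass of a trained predictor per stage; the sketch keeps only the residual-driven, mask-gated structure.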
