ReCoSplat: Autoregressive Feed-Forward Gaussian Splatting Using Render-and-Compare

Freeman Cheng; Botao Ye; Xueting Li; Junqi You; Fangneng Zhan; Ming-Hsuan Yang

arXiv:2603.09968 · cs.CV · March 11, 2026

TL;DR

ReCoSplat is an autoregressive feed-forward Gaussian Splatting model for online novel view synthesis. A Render-and-Compare module makes it robust to pose errors, and a KV cache compression strategy lets it handle long sequences.

Contribution

The paper introduces ReCoSplat, which accepts posed or unposed inputs, with or without camera intrinsics. It adds a Render-and-Compare module that compensates for errors in predicted poses, and a hybrid KV cache compression strategy for long sequences.

Findings

01. Achieves state-of-the-art performance on various benchmarks.

02. Handles both posed and unposed input scenarios.

03. Reduces KV cache size by over 90% for sequences over 100 frames.

Abstract

Online novel view synthesis remains challenging, requiring robust scene reconstruction from sequential, often unposed, observations. We present ReCoSplat, an autoregressive feed-forward Gaussian Splatting model supporting posed or unposed inputs, with or without camera intrinsics. While assembling local Gaussians using camera poses scales better than canonical-space prediction, it creates a dilemma during training: using ground-truth poses ensures stability but causes a distribution mismatch when predicted poses are used at inference. To address this, we introduce a Render-and-Compare (ReCo) module. ReCo renders the current reconstruction from the predicted viewpoint and compares it with the incoming observation, providing a stable conditioning signal that compensates for pose errors. To support long sequences, we propose a hybrid KV cache compression strategy combining early-layer…
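
The two ideas in the abstract, assembling local Gaussians with camera poses and comparing a render against the incoming observation, can be illustrated with a small sketch. This is not the paper's implementation: every function name here (`rot_y`, `local_to_world`, `mean_abs_residual`) is hypothetical, the rendering step is left out entirely, and the residual is a plain mean absolute difference chosen only for illustration. The sketch assumes a camera-to-world pose given as a 3x3 rotation and a translation, and shows how a small pose error displaces an assembled Gaussian.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix stored as nested lists."""
    return [list(row) for row in zip(*m)]


def rot_y(angle_rad):
    """Rotation matrix about the y axis (used here as a toy camera yaw)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def local_to_world(mean, cov, rotation, translation):
    """Map one Gaussian from camera space to world space.

    mean_world = R @ mean + t
    cov_world  = R @ cov @ R^T
    """
    world_mean = [sum(rotation[i][k] * mean[k] for k in range(3)) + translation[i]
                  for i in range(3)]
    world_cov = matmul(matmul(rotation, cov), transpose(rotation))
    return world_mean, world_cov


def mean_abs_residual(rendered, observed):
    """Mean absolute per-pixel difference between two equal-size grayscale images.

    Assumption: a stand-in for the 'compare' step; the paper's actual
    conditioning signal is not specified in the abstract.
    """
    total, count = 0.0, 0
    for r_row, o_row in zip(rendered, observed):
        for r, o in zip(r_row, o_row):
            total += abs(r - o)
            count += 1
    return total / count


if __name__ == "__main__":
    mean_cam = [0.0, 0.0, 2.0]  # Gaussian 2 units in front of the camera
    cov_cam = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.04]]
    t = [0.0, 0.0, 0.0]

    true_mean, _ = local_to_world(mean_cam, cov_cam, rot_y(0.0), t)
    # Hypothetical predicted pose with a 5 degree yaw error.
    pred_mean, _ = local_to_world(mean_cam, cov_cam, rot_y(math.radians(5.0)), t)
    print("displacement caused by pose error:", math.dist(true_mean, pred_mean))

    # Toy 2x2 images standing in for a render and the incoming observation.
    print("residual:", mean_abs_residual([[0.2, 0.8], [0.5, 0.5]],
                                         [[0.3, 0.6], [0.5, 0.4]]))
```

A nonzero residual between the render and the observation is the kind of evidence a model could condition on to compensate for a misplaced Gaussian; how ReCo actually encodes and consumes that signal is described in the paper, not here.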

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques