ReCoSplat: Autoregressive Feed-Forward Gaussian Splatting Using Render-and-Compare
Freeman Cheng, Botao Ye, Xueting Li, Junqi You, Fangneng Zhan, Ming-Hsuan Yang

TL;DR
ReCoSplat is an autoregressive feed-forward Gaussian Splatting model for online novel view synthesis. A Render-and-Compare module makes it robust to pose errors, and a cache compression strategy lets it handle long sequences.
Contribution
The paper introduces ReCoSplat, which supports posed and unposed inputs, and incorporates a Render-and-Compare module to handle pose inaccuracies, along with a cache compression method for long sequences.
Findings
Achieves state-of-the-art performance on various benchmarks.
Effectively handles unposed and posed input scenarios.
Reduces KV cache size by over 90% for sequences over 100 frames.
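To give a sense of scale for the last finding, here is a rough back-of-the-envelope sketch of what a reduction of 90% means for a 100-frame sequence. Every model dimension below (tokens per frame, layer count, head count, head dimension, 16-bit storage) is a hypothetical placeholder, not a figure from the paper; only the 90% reduction and the 100-frame length come from the summary above.

```python
def kv_cache_bytes(num_frames, tokens_per_frame, num_layers,
                   num_heads, head_dim, bytes_per_value=2):
    """Uncompressed KV cache size: one key and one value vector
    per token, per head, per layer."""
    tokens = num_frames * tokens_per_frame
    return 2 * num_layers * tokens * num_heads * head_dim * bytes_per_value


def remaining_after_reduction(full_bytes, percent_removed):
    """Bytes left after removing `percent_removed` percent of the cache."""
    return full_bytes * (100 - percent_removed) // 100


# Hypothetical transformer configuration (assumed for illustration only).
full = kv_cache_bytes(num_frames=100, tokens_per_frame=1024,
                      num_layers=24, num_heads=16, head_dim=64)
left = remaining_after_reduction(full, percent_removed=90)

print(f"uncompressed: {full / 2**30:.3f} GiB")
print(f"after a 90% reduction: {left / 2**30:.3f} GiB")
```

Under these assumed dimensions the uncompressed cache grows linearly with the number of frames, which is why compression matters most for long sequences.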
Abstract
Online novel view synthesis remains challenging, requiring robust scene reconstruction from sequential, often unposed, observations. We present ReCoSplat, an autoregressive feed-forward Gaussian Splatting model supporting posed or unposed inputs, with or without camera intrinsics. While assembling local Gaussians using camera poses scales better than canonical-space prediction, it creates a dilemma during training: using ground-truth poses ensures stability but causes a distribution mismatch when predicted poses are used at inference. To address this, we introduce a Render-and-Compare (ReCo) module. ReCo renders the current reconstruction from the predicted viewpoint and compares it with the incoming observation, providing a stable conditioning signal that compensates for pose errors. To support long sequences, we propose a hybrid KV cache compression strategy combining early-layer…
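The render-and-compare idea in the abstract can be illustrated with a deliberately tiny toy, offered as a sketch rather than the paper's method. It assumes a 1D "scene", an integer "pose" that selects a window of it, and a plain per-pixel difference as the comparison; the names `render` and `compare` are hypothetical. The actual ReCo module operates on rendered Gaussians and feeds its output to the network as a conditioning signal, none of which is modeled here.

```python
def render(scene, pose, width):
    """Toy renderer: the view at integer `pose` is a window of the 1D scene."""
    return [scene[(pose + i) % len(scene)] for i in range(width)]


def compare(rendered, observed):
    """Per-pixel residual between the rendering and the incoming observation."""
    return [o - r for r, o in zip(rendered, observed)]


scene = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 0.8, 0.6]  # current reconstruction (toy)
true_pose, predicted_pose, width = 2, 3, 4         # predicted pose is off by one

observed = render(scene, true_pose, width)         # stand-in for the incoming frame
rendered = render(scene, predicted_pose, width)    # rendering at the predicted pose
residual = compare(rendered, observed)

# A nonzero residual signals that the predicted pose disagrees with the observation.
print(residual)
```

With a correct pose the residual in this toy is exactly zero, so the residual carries information about the pose error that a downstream model could use to compensate.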
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques
