Towards Temporal Fusion Beyond the Field of View for Camera-based Semantic Scene Completion

Jongseong Bae; Junwoo Ha; Jinnyeong Heo; Yeongin Lee; Ha Young Kim

arXiv:2511.12498 · cs.CV · January 21, 2026

TL;DR

This paper introduces C3DFusion, a temporal fusion module that improves camera-based semantic scene completion by reconstructing out-of-view regions from aligned historical and current frame features, achieving state-of-the-art results on SemanticKITTI and SSCBench-KITTI-360.

Contribution

The paper presents C3DFusion, a temporal fusion module that explicitly aligns and enhances features from past and current frames to better reconstruct out-of-view regions in semantic scene completion (SSC).

Findings

01. C3DFusion outperforms existing methods on the SemanticKITTI and SSCBench-KITTI-360 datasets.

02. It generalizes well across different baseline architectures.

03. It yields significant accuracy improvements in the reconstruction of out-of-frame regions.

Abstract

Recent camera-based 3D semantic scene completion (SSC) methods have increasingly explored leveraging temporal cues to enrich the features of the current frame. However, while these approaches primarily focus on enhancing in-frame regions, they often struggle to reconstruct critical out-of-frame areas near the sides of the ego-vehicle, although previous frames commonly contain valuable contextual information about these unseen regions. To address this limitation, we propose the Current-Centric Contextual 3D Fusion (C3DFusion) module, which generates hidden region-aware 3D feature geometry by explicitly aligning 3D-lifted point features from both current and historical frames. C3DFusion performs enhanced temporal fusion through two complementary techniques, historical context blurring and current-centric feature densification, which suppress noise from inaccurately warped historical point…
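
The alignment idea in the abstract can be illustrated with a small geometric sketch. This is not the authors' implementation: it only shows how points seen in a historical frame can be warped into the current ego frame with known poses, and how some of them end up outside the current camera's field of view, near the sides of the vehicle. The pose convention (ego-to-world 4x4 matrices), the axis convention (x forward, y left, z up), the 90-degree horizontal field of view, and all function names are assumptions made for illustration. The blurring and densification steps are not shown.

```python
import math

def translation(tx, ty, tz):
    """Build a 4x4 homogeneous transform that only translates (hypothetical helper)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def invert_rigid(T):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[c][r] for c in range(3)] for r in range(3)]
    t = [T[r][3] for r in range(3)]
    new_t = [-sum(R_t[r][c] * t[c] for c in range(3)) for r in range(3)]
    return [R_t[r] + [new_t[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def mat_mul(A, B):
    """Multiply two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply_transform(T, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
                 for r in range(3))

def warp_to_current(points_hist, pose_hist, pose_cur):
    """Map points from the historical ego frame into the current ego frame.

    Assumes both poses are ego-to-world transforms, so the relative
    transform is inverse(pose_cur) @ pose_hist.
    """
    T = mat_mul(invert_rigid(pose_cur), pose_hist)
    return [apply_transform(T, p) for p in points_hist]

def in_horizontal_fov(p, fov_deg):
    """True if the point lies in front of a forward-facing camera and
    within its horizontal field of view (assumed x forward, y left)."""
    x, y, _ = p
    if x <= 0:
        return False
    return abs(math.degrees(math.atan2(y, x))) <= fov_deg / 2

# Toy scenario: the vehicle drives 10 m straight ahead between the frames.
pose_hist = translation(0.0, 0.0, 0.0)
pose_cur = translation(10.0, 0.0, 0.0)

# Two points that were both visible in the historical frame.
points_hist = [(12.0, 3.0, 0.0), (30.0, 2.0, 0.0)]

warped = warp_to_current(points_hist, pose_hist, pose_cur)
out_of_view = [not in_horizontal_fov(p, 90.0) for p in warped]
```

In this toy scenario the first point lands beside the vehicle after warping and falls outside the assumed field of view, while the second stays in view. Points of the first kind are the ones a current-frame-only method cannot observe directly and that historical frames can supply.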

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization