Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li; Jiajun Deng; Wenyao Zhang; Zhujin Liang; Dalong Du; Xin Jin; Wenjun Zeng

arXiv:2407.02077·cs.CV·November 7, 2024

TL;DR

This paper introduces HTCL, a hierarchical temporal context learning method that enhances camera-based 3D semantic scene completion by effectively modeling relevant temporal information, outperforming existing approaches on benchmark datasets.

Contribution

The work proposes a two-step hierarchical approach to temporal context learning: cross-frame affinity measurement followed by affinity-based dynamic refinement, which together improve scene completion accuracy.

Findings

1. Ranks 1st on SemanticKITTI benchmark.

2. Surpasses LiDAR-based methods in mIoU on OpenOccupancy.

3. Demonstrates effective temporal modeling for scene completion.
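
For readers unfamiliar with the mIoU metric cited in the findings, here is a minimal sketch of mean intersection-over-union on flattened voxel labels. The function name `mean_iou`, the `ignore_label` value of 255, and the choice to skip classes that never occur are assumptions made for illustration; the exact evaluation protocols of SemanticKITTI and OpenOccupancy are not reproduced here.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes that occur in pred or target.

    pred, target: equal-length sequences of integer class labels.
    Predictions are assumed to lie in [0, num_classes).
    Voxels whose target equals ignore_label (assumed value) are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            # Correct voxel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong voxel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


score = mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 255], num_classes=3)
```

In this toy case class 0 has IoU 1/2, class 1 has IoU 2/3, and class 2 never occurs among the evaluated voxels, so it is left out of the average.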

Abstract

Camera-based 3D semantic scene completion (SSC) is pivotal for predicting complicated 3D layouts with limited 2D image observations. The existing mainstream solutions generally leverage temporal information by roughly stacking history frames to supplement the current frame. Such straightforward temporal modeling inevitably diminishes valid clues and increases learning difficulty. To address this problem, we present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion. The primary innovation of this work involves decomposing temporal context learning into two hierarchical steps: (a) cross-frame affinity measurement and (b) affinity-based dynamic refinement. Firstly, to separate critical relevant context from redundant information, we introduce the pattern affinity with scale-aware isolation and multiple independent learners…
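
The two-step idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: cosine similarity stands in for the paper's pattern affinity, and a softmax-weighted sum stands in for the affinity-based dynamic refinement. The function names, the `(N, C)` feature layout, and the residual fusion are all assumptions chosen for illustration.

```python
import numpy as np


def cross_frame_affinity(current, history, eps=1e-8):
    """Step (a), simplified: cosine similarity per location.

    current, history: arrays of shape (N, C) holding one feature
    vector per location. Returns an array of shape (N,).
    """
    dot = np.sum(current * history, axis=1)
    norm = np.linalg.norm(current, axis=1) * np.linalg.norm(history, axis=1)
    return dot / (norm + eps)


def affinity_refine(current, history_frames):
    """Step (b), simplified: fuse history features by affinity.

    Each history frame is weighted per location by a softmax over
    its affinity to the current frame, instead of being stacked
    with equal weight.
    """
    affinities = np.stack(
        [cross_frame_affinity(current, h) for h in history_frames]
    )  # shape (T, N)
    # Numerically stable softmax over the T history frames.
    weights = np.exp(affinities - affinities.max(axis=0, keepdims=True))
    weights = weights / weights.sum(axis=0, keepdims=True)
    stacked = np.stack(history_frames)  # shape (T, N, C)
    context = np.sum(weights[..., None] * stacked, axis=0)  # shape (N, C)
    return current + context  # residual fusion (an assumption)


current = np.array([[1.0, 0.0], [0.0, 1.0]])
relevant = current.copy()   # history frame that agrees with the current one
irrelevant = -current       # history frame that contradicts it
refined = affinity_refine(current, [relevant, irrelevant])
```

With equal-weight stacking the two history frames in this toy example would cancel out. The affinity weighting instead lets the agreeing frame dominate, which is the intuition behind separating relevant context from redundant information.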

Code & Models

Repositories

arlo0o/htcl (PyTorch, official)

Taxonomy

Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Time Series Analysis and Forecasting