Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer

Hao Shi; Qi Jiang; Kailun Yang; Xiaoting Yin; Ze Wang; Kaiwei Wang

arXiv:2211.11293 · cs.CV · June 25, 2024 · 1 citation

TL;DR

This paper introduces FlowLens, a novel transformer-based architecture for online video inpainting that extends camera field-of-view in autonomous vehicles, improving scene perception and safety by reconstructing unseen areas from past video streams.

Contribution

The paper presents FlowLens, a new clip-recurrent transformer architecture with innovative attention and fusion modules for effective beyond-FoV scene reconstruction in autonomous driving.

Findings

1. FlowLens achieves state-of-the-art performance in beyond-FoV scene reconstruction.
2. Reconstructed scenes enhance perception accuracy within the camera's original field of view.
3. The method improves object detection and semantic understanding in extended-view scenarios.

Abstract

Vision sensors are widely applied in vehicles, robots, and roadside infrastructure. However, due to limitations in hardware cost and system size, camera Field-of-View (FoV) is often restricted and may not provide sufficient coverage. Nevertheless, from a spatiotemporal perspective, it is possible to obtain information beyond the camera's physical FoV from past video streams. In this paper, we propose the concept of online video inpainting for autonomous vehicles to expand the field of view, thereby enhancing scene visibility, perception, and system safety. To achieve this, we introduce the FlowLens architecture, which explicitly employs optical flow and implicitly incorporates a novel clip-recurrent transformer for feature propagation. FlowLens offers two key features: 1) FlowLens includes a newly designed Clip-Recurrent Hub with 3D-Decoupled Cross Attention (DDCA) to progressively…
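The abstract's central observation, that past frames can hold information about regions outside the current FoV, can be illustrated with a toy sketch. This is not the FlowLens architecture: it assumes a 1D scene, a camera that pans right by exactly one pixel per frame, and a known integer shift standing in for estimated optical flow. The function names `expand_with_past` and `run_online` are hypothetical and chosen only for this illustration.

```python
# Toy illustration only; this is NOT the FlowLens architecture.
# Assumptions (not from the paper): a 1D scene, a camera panning right by
# one pixel per frame, and a known integer shift standing in for optical
# flow. All function names here are hypothetical.


def expand_with_past(current_view, past_canvas, shift, pad):
    """Build an expanded canvas of width len(current_view) + 2 * pad.

    Content from the previous canvas is moved by `shift` columns, then the
    in-FoV region is overwritten with the current observation. None marks
    positions that no frame has observed so far.
    """
    width = len(current_view) + 2 * pad
    canvas = [None] * width
    if past_canvas is not None:
        for x in range(width):
            src = x - shift  # where this column sat in the previous canvas
            if 0 <= src < width:
                canvas[x] = past_canvas[src]
    for i, value in enumerate(current_view):
        canvas[pad + i] = value
    return canvas


def run_online(world, fov, pad, steps):
    """Process frames strictly in order, using only past canvases."""
    canvas = None
    for t in range(steps):
        view = world[t:t + fov]  # what the camera sees at time t
        # The camera pans right, so scene content shifts left by one column.
        canvas = expand_with_past(view, canvas, shift=-1, pad=pad)
    return canvas


if __name__ == "__main__":
    world = list(range(20))
    print(run_online(world, fov=4, pad=2, steps=4))
    # prints [1, 2, 3, 4, 5, 6, None, None]
```

In this toy setting the left border is filled from content that has already passed through the FoV, while the right border stays unknown because no past frame ever observed it. Pure propagation cannot fill such regions, which is one reason a learned model is a natural fit for this task.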

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Vision and Imaging

Methods: Inpainting